I know it’s technically possible in mothur to analyze mixed data, as I’ve already run an analysis on 16S amplicons using data from Sanger (full 16S rRNA gene) and 454 (400 bp fragment) without any issues. However, I’m not sure it is a scientifically correct way to do it, as the read lengths differ about fourfold (so their reliabilities differ too).
Would you analyze mixed data together or separately? If together, do all steps from the mothur SOP apply?
Good question. I’d say be verrry careful. You’re likely using very different primers with different biases. You also have different length sequences. Both of these factors could really skew your results. If I were to do this, it would really just be for data exploration and not an “official” analysis. My approach would be to classify everything and then to compare the output at a broad taxonomic level. Like you said, you can do the full analysis, but there are probably so many caveats that I would wonder why one would base much on it.
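For anyone wanting to try the "classify everything, then compare at a broad level" idea, here is a minimal Python sketch. It assumes a mothur-style taxonomy file (read name, a tab, then a semicolon-separated lineage with optional confidence scores in parentheses). The function name `rank_profile`, the `platform_of` mapping, and the example reads are all made up for illustration, and which depth corresponds to phylum depends on the reference taxonomy you classified against.

```python
import re
from collections import Counter, defaultdict


def rank_profile(taxonomy_lines, platform_of, level=2):
    """Relative abundance of taxa per platform at a given lineage depth.

    level=1 is the first rank in the lineage; level=2 is assumed here to be
    phylum, which holds for many reference taxonomies but not all of them.
    """
    counts = defaultdict(Counter)
    for line in taxonomy_lines:
        line = line.strip()
        if not line:
            continue
        name, lineage = line.split("\t", 1)
        # Strip confidence scores such as "(100)" and the trailing semicolon.
        ranks = [re.sub(r"\(\d+\)$", "", r) for r in lineage.strip(";").split(";")]
        taxon = ranks[level - 1] if len(ranks) >= level else "unclassified"
        counts[platform_of[name]][taxon] += 1
    # Convert raw counts to fractions within each platform.
    return {
        platform: {taxon: n / sum(c.values()) for taxon, n in c.items()}
        for platform, c in counts.items()
    }


# Hypothetical example data, not real results.
example_lines = [
    "sanger1\tBacteria(100);Firmicutes(98);Bacilli(90);",
    "sanger2\tBacteria(100);Proteobacteria(100);",
    "pyro1\tBacteria(100);Firmicutes(100);",
    "pyro2\tBacteria(100);Firmicutes(80);",
]
example_platforms = {
    "sanger1": "sanger", "sanger2": "sanger",
    "pyro1": "pyro", "pyro2": "pyro",
}

for platform, profile in rank_profile(example_lines, example_platforms).items():
    print(platform, profile)
```

Comparing fractions rather than raw counts keeps the much larger pyrosequencing set from swamping the Sanger reads, but it does nothing about primer or length bias, so the caveats above still apply.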
This is essentially also what I’m doing right now.
Most of “my” Sanger seqs are about 450 nt long and were generated with the same primers as the pyroseqs. Longer seqs were also available; I trim these down to 500 nt.
Following along the SOP (thank God for that, or something …), I merge the data after the trim.seqs command (before unique.seqs).
Now, doing the forward analysis works fine. The results are peculiar, to say the least, and careful interpretation is probably indeed needed.
The reverse pyro data is not that straightforward, however. I lose a lot of Sanger seqs after alignment (at the screening step), which I find strange because, after all, they should fall in the same alignment region (first 500 nt) and are aligned against the flipped pyroseqs. Am I overlooking something?
As this comparison should lead to a publication (hey, I’m just an employee following orders), any suggestions on processing/interpretations are welcome!
Hmmm, dunno, Kirk. You sure you’re flipping the right sequences?
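One quick way to sanity-check the flipping outside of mothur is to look at which orientation each read starts with the primer in. The Python sketch below is only a rough illustration: `reverse_complement` and `orientation` are hypothetical helper names, the primer is a placeholder string, and the match is exact (no ambiguity codes or mismatches), which real primers usually need.

```python
# Complement table for DNA (U is treated like T so RNA-style reads also work).
_COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def reverse_complement(seq):
    """Return the reverse complement of a nucleotide sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def orientation(seq, primer):
    """Report whether `seq` starts with `primer` as given or after flipping.

    Exact, case-insensitive prefix match only; this is an assumption made to
    keep the sketch short.
    """
    primer = primer.upper()
    if seq.upper().startswith(primer):
        return "forward"
    if reverse_complement(seq).upper().startswith(primer):
        return "reverse"
    return "unknown"


# Placeholder primer and reads, not real 16S sequences.
example_primer = "AACG"
example_reads = {
    "read_a": "AACGTTTT",
    "read_b": reverse_complement("AACGTTTT"),
    "read_c": "GGGG",
}

for name, seq in example_reads.items():
    print(name, orientation(seq, example_primer))
```

If the Sanger reads and the pyroseqs come out with different orientations at the point where they are merged, that alone could explain reads being dropped at the screening step.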