Why do .an.unique_list.0.03.cons.taxonomy results have replicates with the same classification?

ch3coch3 · March 17, 2016, 10:51pm

Hi,

When I check the OTU classification results in “test.trim.contigs.trim.good.unique.good.filter.unique.precluster.pick.pick.an.unique_list.0.03.cons”, some of the results are the same, for instance:

Otu0007 294 Bacteria(100);Firmicutes(100);Bacilli(100);Lactobacillales(100);Streptococcaceae(100);Streptococcus(100);
Otu0008 195 Bacteria(100);Firmicutes(100);Bacilli(100);Lactobacillales(100);Streptococcaceae(100);Streptococcus(100);
Otu0009 115 Bacteria(100);Firmicutes(100);Bacilli(100);Lactobacillales(100);Streptococcaceae(100);Streptococcus(100);
Otu0010 256 Bacteria(100);Firmicutes(100);Bacilli(100);Lactobacillales(100);Streptococcaceae(100);Streptococcus(100);

Do OTUs 7, 8, 9 and 10 belong to the same genus? If so, can I summarize them together, and why are the outputs separated? If not, why do they all have the same name?

Thanks

dwaite · March 18, 2016, 3:43am

This is because OTUs are not the same as taxonomy. OTUs are defined by the differences between sequences in your data set, whereas taxonomy is defined by the similarity of those sequences to an external reference database.

I see you used 97% similarity as your OTU definition, which is the common proxy for species-level differentiation. With that in mind, note that the results you’re showing here are all genus-level classifications. Those OTUs may all be different species (or strains), but a genus-level classification still groups them together. More realistically though, OTUs and taxonomy are simply different metrics for describing the same data. Using what I said above, OTU_7 and OTU_8 could be ~4% different from each other but, when tested against your database, still have the same database sequence as their closest match.

If you want taxonomy-based clustering, the phylotype command builds a shared table from your taxonomy/count data.
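
To illustrate what "summarizing them together" means, here is a rough Python sketch that adds up OTU sizes sharing the same classification. It is not what the phylotype command does internally, only a hand-rolled illustration. It assumes three whitespace-separated columns (OTU, size, taxonomy), as in the four lines pasted in the question; the function names are made up for this example.

```python
import re
from collections import defaultdict

# The four lines pasted in the question (columns: OTU, size, taxonomy).
CONS_TAXONOMY = """\
Otu0007 294 Bacteria(100);Firmicutes(100);Bacilli(100);Lactobacillales(100);Streptococcaceae(100);Streptococcus(100);
Otu0008 195 Bacteria(100);Firmicutes(100);Bacilli(100);Lactobacillales(100);Streptococcaceae(100);Streptococcus(100);
Otu0009 115 Bacteria(100);Firmicutes(100);Bacilli(100);Lactobacillales(100);Streptococcaceae(100);Streptococcus(100);
Otu0010 256 Bacteria(100);Firmicutes(100);Bacilli(100);Lactobacillales(100);Streptococcaceae(100);Streptococcus(100);
"""


def strip_confidence(taxonomy):
    """Drop confidence values such as '(100)' and the trailing ';'."""
    return re.sub(r"\(\d+\)", "", taxonomy).rstrip(";")


def sum_by_taxonomy(text):
    """Return {lineage: total size} over all OTUs with that lineage."""
    totals = defaultdict(int)
    for line in text.splitlines():
        fields = line.split(None, 2)
        # Skip blank lines and any header row whose size column is not a number.
        if len(fields) != 3 or not fields[1].isdigit():
            continue
        _otu, size, taxonomy = fields
        totals[strip_confidence(taxonomy.strip())] += int(size)
    return dict(totals)


totals = sum_by_taxonomy(CONS_TAXONOMY)
for lineage, size in totals.items():
    print(size, lineage)
```

For the four OTUs above this collapses everything into a single Streptococcus entry. Keep in mind that merging this way discards the sequence-level differences that made them separate OTUs in the first place.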

ch3coch3 · March 21, 2016, 8:55pm

That makes a lot of sense. Thanks very much.
