OTU classification and minimum entropy decomposition

claude · May 26, 2016, 3:19pm

I’m a new user of mothur and I’ve been reading about the methodology of OTU generation. Last year Eren et al. published a paper that generates OTUs by minimum entropy decomposition (MED), in which information-rich base positions are used to iteratively split a group of sequences into smaller groups, ultimately ending with small groups that serve as the final OTUs. The conventional approach implemented in mothur first bins sequences at a taxonomic level (e.g. Order), then does de novo clustering within each bin. (Please let me know if any of this is inaccurate.)
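A minimal Python sketch of the entropy-driven splitting described above may make the idea concrete. It assumes the reads are already aligned and of equal length. The function names, the `min_entropy` threshold and the `min_size` cutoff are hypothetical choices for illustration. This is not the MED implementation, which applies additional criteria such as abundance filtering.

```python
import math
from collections import Counter


def column_entropy(seqs, pos):
    """Shannon entropy (in bits) of the bases found at one alignment column."""
    counts = Counter(s[pos] for s in seqs)
    total = len(seqs)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def decompose(seqs, min_entropy=0.5, min_size=2):
    """Recursively split aligned reads on their highest-entropy column.

    min_entropy and min_size are illustrative stopping rules (assumptions),
    not parameters taken from the MED paper.
    """
    if len(seqs) < min_size:
        return [seqs]
    entropies = [column_entropy(seqs, i) for i in range(len(seqs[0]))]
    best = max(range(len(entropies)), key=entropies.__getitem__)
    if entropies[best] < min_entropy:
        # No column is informative enough: this group is a final node.
        return [seqs]
    # Partition the reads by the base they carry at the most variable column.
    groups = {}
    for s in seqs:
        groups.setdefault(s[best], []).append(s)
    nodes = []
    for base in sorted(groups):
        nodes.extend(decompose(groups[base], min_entropy, min_size))
    return nodes


# Toy aligned reads (made-up data).
reads = ["ACGTA", "ACGTA", "ACGTA", "ACCTA", "ACCTA", "ACCTT"]
nodes = decompose(reads)
for node in nodes:
    print(len(node), node[0])
```

On this toy input the first split happens at the third column (G vs C), and the C group is then split again at the last column, so reads that differ by a single base end up in separate groups.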

Two questions come to mind:

Is MED potentially a “better” approach than the current one? Or is MED good only in the sense that it can tease apart very similar sequences into separate OTUs?
How do you deal with multiple copies of 16S genes in one organism? I think I read once that the number of 16S gene copies within one species can be as high as tens of copies, pseudogenes included. Would this result in inaccuracy when inferring organismal abundance from 16S reads?

Ref:
Eren et al., ISME J 2015: http://www.nature.com/ismej/journal/v9/n4/full/ismej2014195a.html

dwaite · May 26, 2016, 10:04pm

I wouldn’t comment on whether it’s ‘better’ or not; it’s more just a different way to analyse the data. If I remember the manuscript correctly, the authors pull out a single genus or sequence cluster to analyse - they don’t use oligotyping across the full data set. The approach is designed to look for very subtle differentiation within a mostly identical set of sequences.

I would say that it’s inappropriate to apply oligotyping to your full data; it’s more of a downstream extension of your analysis. For example, you build up your OTU table and look at all the differences, then if you notice a particular OTU or genus is doing interesting things you could examine it further with oligotyping.

For your second question - yes, it does. But if you’re looking at 16S data alone there’s nothing you can do about it :mrgreen: It’s just one of those limitations of the method. You can kind of get around this (at least the straight copy number difference) by using presence/absence methods of comparison.
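As a rough illustration of both points, here is a small Python sketch. It assumes read counts are simply proportional to cells times 16S copies per genome, which ignores primer bias and other effects. The taxon names, cell counts and copy numbers are made-up values, and `expected_read_fractions` and `jaccard` are hypothetical helper names.

```python
def expected_read_fractions(cells, copies):
    """Expected share of 16S reads per taxon, assuming reads scale with cells * copies."""
    weighted = {taxon: cells[taxon] * copies[taxon] for taxon in cells}
    total = sum(weighted.values())
    return {taxon: w / total for taxon, w in weighted.items()}


def jaccard(sample_a, sample_b):
    """Presence/absence (Jaccard) similarity between two count tables."""
    a = {taxon for taxon, n in sample_a.items() if n > 0}
    b = {taxon for taxon, n in sample_b.items() if n > 0}
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Made-up copy numbers and cell counts for two samples.
copies = {"taxon_A": 1, "taxon_B": 7, "taxon_C": 2}
cells_1 = {"taxon_A": 100, "taxon_B": 100, "taxon_C": 0}
cells_2 = {"taxon_A": 50, "taxon_B": 0, "taxon_C": 200}

# Equal cell numbers, very unequal read shares.
fractions = expected_read_fractions(cells_1, copies)
print(fractions)

# Scaling counts by copy number changes abundances but not which taxa are present.
reads_1 = {t: cells_1[t] * copies[t] for t in cells_1}
reads_2 = {t: cells_2[t] * copies[t] for t in cells_2}
print(jaccard(cells_1, cells_2), jaccard(reads_1, reads_2))
```

In the first sample the two taxa have the same number of cells, yet the one with seven copies is expected to take seven eighths of the reads. The Jaccard value is identical whether it is computed from cells or from copy-weighted reads, which is why presence/absence comparisons sidestep the straight copy number difference. Detection of rare taxa still depends on sequencing depth.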
