Hello,
I am currently analysing pig faecal samples using Mothur, and I was wondering if somebody could summarise the advantages/disadvantages of binning into phylotypes and OTUs? I am trying to make a decision as to what would be the best choice for my project.
Cheers,
Jo
First, some definitions...
Phylotype - your sequence compared to a database and then binned into a group based on its similarity to the database
OTU - your sequence compared to the other sequences in your dataset and binned into a group based on its similarity to other sequences in the dataset
So the primary disadvantage of phylotypes is that they are database dependent: people call the same thing multiple names, some sequences aren't in the database, some genes have really bad databases (e.g. nifH), and usually you aren't able to classify all the way to the genus or species level. The advantages are that phylotyping is very fast, forgiving of sequencing errors, and you get a name directly. Names give people warm fuzzy feelings :)
The primary disadvantage of OTUs is that clustering is slow and computationally "hard". It is sensitive to sequencing error rates, so if you have a high error rate you can easily get a gigantic distance matrix that will never cluster. The advantages are that you don't have to worry about a database and you can tag names onto OTUs later. Also, you tend to get greater resolution - we frequently have many OTUs that have the same genus name because they represent some "sub-genus" taxonomic level.
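To make the resolution point concrete, here is a minimal toy sketch in Python (not mothur code). Everything in it is an assumption made for illustration: the four sequences, the single hypothetical genus label, the 0.10 cutoff chosen to suit 20-base toy reads, and the use of simple single-linkage clustering rather than whatever clustering method you would actually run in mothur.

```python
from collections import defaultdict

def mismatch_fraction(a, b):
    """Fraction of differing positions between two aligned, equal-length sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def bin_phylotypes(taxonomy):
    """Phylotype binning: group sequence names by the label a database gave them."""
    bins = defaultdict(list)
    for name, taxon in taxonomy.items():
        bins[taxon].append(name)
    return dict(bins)

def bin_otus(seqs, cutoff):
    """OTU binning (toy single-linkage): link any two sequences within the cutoff."""
    names = list(seqs)
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the representative of n's cluster.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if mismatch_fraction(seqs[a], seqs[b]) <= cutoff:
                parent[find(a)] = find(b)

    otus = defaultdict(list)
    for n in names:
        otus[find(n)].append(n)
    return sorted(sorted(members) for members in otus.values())

# Hypothetical toy reads: s1/s2 differ at one base, s3/s4 differ at one base,
# and the two pairs differ from each other at five or six bases.
seqs = {
    "s1": "ACGTACGTACGTACGTACGT",
    "s2": "ACGTACGTACGTACGTACGA",
    "s3": "TTTTACGTACGTACGTAAAA",
    "s4": "TTTTACGTACGTACGTAAAT",
}
# Hypothetical classification: the database gives all four the same genus name.
taxonomy = {name: "Lactobacillus" for name in seqs}

phylotypes = bin_phylotypes(taxonomy)
otus = bin_otus(seqs, cutoff=0.10)
print(len(phylotypes), "phylotype;", len(otus), "OTUs")
print(otus)  # [['s1', 's2'], ['s3', 's4']]
```

In this toy case the database-driven approach returns one bin, while the distance-driven approach splits the same reads into two OTUs that would both carry the same genus name.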
We're biased towards the OTU-based approaches around here.
Dear Pat,
Thank you very much for your clear reply, much appreciated.
Yes, OTU-based analysis does appear to be more advantageous. As you say, though, people like to see pretty names!
So, I guess I am getting a lot more resolution with the OTU-based analysis (2300 OTUs) rather than the phylotype-based analysis (180 OTUs).
Cheers,
Jo
Hi All,
I just want to jump on this topic, to dig a little deeper.
Patrick, you already convinced me of using OTUs. I'm not very inclined to go towards phylotyping.
But, in your 2013 paper in AEM, you propose a heuristic that is basically doing phylotyping before OTU building?
I understand this would speed up things, so it seems valid enough.
But if you are a huge fan of OTUs, can you live with basing your OTUs on a phylotype input, and thereby drag the disadvantages of phylotyping along into your further OTU analysis? I don't understand how this approach can be even remotely comparable to a "real" OTU clustering approach. This way, you can never come up with an OTU that groups sequences from different phyla that might be very much alike, which I thought was one of the strengths of OTU clustering?
Or am I reading this paper the wrong way?
I would very much appreciate some insight in this proposed analysis pathway!
Thanks,
Katinka
Great question. So if you look at the figure in the paper, you'll see that our F statistic for a 0.03 cutoff doesn't really differ between the heuristic and non-heuristic approaches. This gives me the confidence that we get a speed-up without a loss in clustering quality. To be safe, I do generally first cluster to the class or family level and then do OTUs. If it's a gnarly dataset that is huge, we'll go to genus level.
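As a rough illustration of where the speed-up comes from, the Python sketch below counts the pairwise distances needed with and without splitting by taxonomy first. The class names and read counts are made-up numbers, not results from the paper, and the sketch only counts comparisons; it says nothing about clustering quality.

```python
from collections import Counter

def n_pairs(n):
    """Number of pairwise distances among n sequences."""
    return n * (n - 1) // 2

def pairs_after_split(labels):
    """Pairwise distances needed if each taxonomic group is clustered on its own."""
    sizes = Counter(labels)
    return sum(n_pairs(count) for count in sizes.values())

# Hypothetical dataset: 10,000 reads classified into three classes.
labels = ["Bacilli"] * 4000 + ["Clostridia"] * 5000 + ["Bacteroidia"] * 1000

full = n_pairs(len(labels))        # 49995000 distances without splitting
split = pairs_after_split(labels)  # 20995000 distances after splitting by class
print(full, split, round(split / full, 2))
```

Splitting at a finer level (family or genus) produces more, smaller groups and so fewer distances, at the cost that sequences placed in different groups can never end up in the same OTU.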
Pat