I was reading the blog post about large distance matrices. I understand phylotyping is not very good but I was wondering if there’s any more information (e.g. papers?) about your other concerns:
- the problems with phylotyping are numerous
- all databases are lousy
- names have questionable meaning
Phylotyping is a good alternative when you do not have enough computational resources, although it is not as accurate. When comparing communities, it can give you very similar results. It just depends on your question and resources.
See: Bacterial community comparisons by taxonomy-supervised analysis independent of sequence alignment and clustering
It also depends on what system you're working in. In the mouse or human gut, phylotyping isn't going to be as bad as it would be if you tried it on something like soils.
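To make the community comparison concrete, here is a minimal sketch of what a phylotype-level comparison amounts to: bin reads by their assigned taxon label, then compare samples on those counts. The sample data and function names are made up for illustration, and Bray-Curtis is just one dissimilarity I picked as an assumption; this is not how any particular tool implements it.

```python
from collections import Counter

def phylotype_counts(assignments):
    """Count reads per taxon label (one label per read). Hypothetical helper."""
    return Counter(assignments)

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two count tables (Counter objects)."""
    taxa = set(a) | set(b)
    # Counter returns 0 for taxa missing from a sample.
    shared = sum(min(a[t], b[t]) for t in taxa)
    total = sum(a.values()) + sum(b.values())
    return 1.0 - 2.0 * shared / total

# Toy genus-level assignments, one entry per read (invented data).
sample_a = ["Bacteroides", "Bacteroides", "Lactobacillus", "unclassified"]
sample_b = ["Bacteroides", "Lactobacillus", "Lactobacillus", "unclassified"]

distance = bray_curtis(phylotype_counts(sample_a), phylotype_counts(sample_b))
print(distance)
```

Note that in a sketch like this every read the classifier cannot place lands in the same "unclassified" bin, which is probably part of why the approach holds up less well in poorly characterised systems.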
I wouldn’t say phylotyping is awful; it’s limited because our databases are limited. I would think of it as a different, coarser level of looking at a community. Note that phylotyping using the naive Bayesian classifier is quite a bit different from what is commonly referred to as “closed reference clustering”. In closed reference clustering, sequences are mapped to a reference, and the tag of whatever is found in the database is applied to the novel sequence. We hammered this method in a recent paper here: