Is it possible to use phylogenetic diversity measures on data that includes sequences from several regions of the 16S gene (e.g., if they’ve been summarized/combined as a single taxonomy file)? The programs for building massive trees seem to require aligned sequences, but it seems like you ought to be able to map directly from a taxonomy file to branches on an already-built tree (e.g., one derived from an external database). Has anyone done this?
Thanks!
Yeah, Rob Knight’s group has done this. I personally think this is fraught with all sorts of problems. Problem #1 is that all primer sets have different biases.
What if I wanted to do it anyway, how would I do that? Has Rob Knight published that?
Thanks.
It sounds exactly like closed-reference OTU picking in QIIME. It’s here, just use one of the ‘_ref’ methods.
As Pat says though, this isn’t a great method to use without serious justification.
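For the mechanics of the mapping asked about at the top of the thread, here is a minimal sketch of the idea behind closed-reference picking followed by a tree-based diversity measure. It is not QIIME's implementation. It assumes reads from each region have already been assigned to tip IDs on a shared reference tree, and it computes Faith's phylogenetic diversity as the total branch length connecting a sample's observed tips to the root. The tree, the tip names, the sample names, and the `faith_pd` helper are all hypothetical and exist only for illustration.

```python
# Hypothetical toy reference tree, stored as child -> (parent, branch_length).
# In practice this would come from a reference database tree; the names and
# lengths here are made up for illustration.
REFERENCE_TREE = {
    "A": ("n1", 1.0),
    "B": ("n1", 2.0),
    "C": ("n2", 3.0),
    "D": ("n2", 0.5),
    "n1": ("root", 0.5),
    "n2": ("root", 1.0),
}


def faith_pd(observed_tips, tree):
    """Sum the branch lengths on the paths from the observed tips to the root.

    Each branch is counted once, even if several observed tips share it.
    """
    visited = set()
    total = 0.0
    for tip in observed_tips:
        node = tip
        # Walk toward the root, stopping at the root (which has no entry in
        # the tree) or at a branch that has already been counted.
        while node in tree and node not in visited:
            parent, length = tree[node]
            visited.add(node)
            total += length
            node = parent
    return total


# Hypothetical samples sequenced from different 16S regions, each reduced to
# the set of reference tips its reads were assigned to.
sample_v4 = {"A", "C"}
sample_v1v3 = {"A", "B"}

print(faith_pd(sample_v4, REFERENCE_TREE))
print(faith_pd(sample_v1v3, REFERENCE_TREE))
```

Because both samples are scored on the same tree, the numbers are comparable in a mechanical sense, but nothing in this calculation corrects for the primer biases mentioned above.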
Their own paper seems to indicate that it’s pretty worthless:
“Together, these results show that cross-study comparisons of human microbiota are valuable when the studied parameter has a large effect size, but studies of more subtle effects on the human microbiota require carefully selected control populations and standardized protocols.”
In other words, why bother?