We are new to building phylogenetic trees from MiSeq-generated sequences rather than from full-length 16S rRNA genes. As we are investigating environmental samples, we are using primers for the V3-V4 region. Now we are asking ourselves some more fundamental questions…
Q1: Does it make any sense to calculate a phylogenetic tree using such a short sequence of roughly 400 bases?
Q2: What would be the minimum length to calculate a phylogenetic tree with confidence (e.g. publishing the results)?
Q3: How does the recommended depth of tree branching depend on sequence length?
We are aware that these are very basic questions. Nevertheless many thanks for reading, commenting, and your support!
We would be very grateful for any pointers to publications or literature!
Q1: Does it make any sense to calculate a phylogenetic tree using such a short sequence of roughly 400 bases?
Hard to say. I suspect it really depends on what you are trying to say with the tree. UniFrac and Phylogenetic diversity seem to be robust to having such short sequences. I wouldn’t propose new lineages with such sequences.
Q2: What would be the minimum length to calculate a phylogenetic tree with confidence (e.g. publishing the results)?
Dunno. To me, phylogenetics is about proposing new lineages and looking at how things branch from each other. You really want full-length sequences to do that.
Q3: How does the recommended depth of tree branching depend on sequence length?
I’m not sure what you mean by “depth of tree branching”.
The other thing you’ll have going against you is that your reads are not going to overlap with each other very well. We have found (and continue to find as Illumina changes their chemistry) that unless the reads fully overlap you are going to have a very high sequencing error rate. See http://blog.mothur.org/2014/09/11/Why-such-a-large-distance-matrix%3F/.
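To make the overlap point concrete, here is a minimal sketch of the arithmetic for paired-end reads. The read and amplicon lengths in the usage note are illustrative assumptions, not values from this thread or from any particular run, and the function name is hypothetical.

```python
def overlap_length(read_len: int, amplicon_len: int) -> int:
    """Number of amplicon bases covered by BOTH reads of a pair.

    Assumes the forward and reverse reads have the same length and
    start at opposite ends of the amplicon (an idealized model).
    """
    # Two reads together cover 2 * read_len bases; anything beyond the
    # amplicon length must be covered twice.
    doubly_covered = 2 * read_len - amplicon_len
    # Cannot be negative (reads do not meet) or exceed the amplicon itself.
    return min(amplicon_len, max(0, doubly_covered))


def overlap_fraction(read_len: int, amplicon_len: int) -> float:
    """Fraction of the amplicon that is read twice (1.0 = fully overlapping)."""
    return overlap_length(read_len, amplicon_len) / amplicon_len
```

For example, with hypothetical 250-base reads, a 400-base amplicon is only read twice over 100 bases (a quarter of its length), whereas a 250-base amplicon is read twice along its whole length.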
Thank you very much for taking the time to answer our questions!
…I suspect it really depends on what you are trying to say with the tree. …
On the one hand, we aim to describe the communities of extreme habitats (halophiles and acidophiles). On the other hand, it would indeed be of interest to find and describe new lineages. And that seems to be where we hit the limitations of short-read sequencing. However, we could narrow down possible cultivation strategies, as we would then know which order or family the possibly new lineages belong to.
Q3: How does the recommended depth of tree branching depend on sequence length?
I’m not sure what you mean by “depth of tree branching”.
Sorry for not being clear enough. Let us try to explain it with an example…
We are interested in the genus Acidiphilium, which contains two clusters (e.g. Ullrich et al., 2015, Fig.1, http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4571130/pdf/40793_2015_Article_40.pdf). Now we sequence the amplification products of a new AMD sample. The aim would be to tell which taxonomic cluster a sequence belongs to, so in principle to define the species. But I guess we can answer this question ourselves… as long as the amplified sequence does not contain the defining sequence differences, it is not possible. And vice versa.
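One way to check this in advance, sketched below under stated assumptions: take aligned reference sequences for the two clusters, list the alignment columns where they differ, and see how many of those fall inside the amplified region. The function names, the toy sequences, and the window coordinates are all hypothetical; real use would need aligned full-length references and the actual primer positions in that alignment.

```python
def differing_positions(ref_a: str, ref_b: str) -> list[int]:
    """0-based alignment columns where two aligned sequences differ."""
    if len(ref_a) != len(ref_b):
        raise ValueError("sequences must be aligned to the same length")
    return [i for i, (a, b) in enumerate(zip(ref_a, ref_b)) if a != b]


def differences_in_window(ref_a: str, ref_b: str, start: int, end: int) -> list[int]:
    """Differing columns inside the amplified region [start, end)."""
    return [i for i in differing_positions(ref_a, ref_b) if start <= i < end]


# Toy example (made-up sequences, not Acidiphilium data).
cluster_1 = "ACGTACGTAC"
cluster_2 = "ACGTTCGTAG"
print(differing_positions(cluster_1, cluster_2))          # all differences
print(differences_in_window(cluster_1, cluster_2, 5, 9))  # differences in the amplicon
```

If the second list comes back empty, the amplified region cannot separate the two references, however many reads are sequenced.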
If you’re looking just within one group 16S may not be the target you want. (don’t get me wrong, I love 16S but I’m interested in whole communities). Any other marker genes that people have studied for your groups that are ~250-300bp long?
I think the other thing to keep in mind is that classification != phylogenetics. If you want to see which of two groups a sequence belongs to, you can do this with classify.seqs assuming that you have representative reference reads for your different groups.
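To illustrate why classification is a different task from building a tree, here is a toy sketch that assigns a read to whichever reference group shares the most k-mers with it. This is not how classify.seqs is implemented and it does not reproduce its output; the function names, the choice of k, and the sequences are assumptions made purely for illustration.

```python
def kmers(seq: str, k: int) -> set[str]:
    """All distinct substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def classify(query: str, references: dict[str, str], k: int = 8) -> tuple[str, float]:
    """Return (best group, fraction of the query's k-mers found in that group).

    A deliberately simple nearest-reference rule: no bootstrapping, no
    confidence cutoff, one reference sequence per group.
    """
    query_kmers = kmers(query, k)
    scores = {
        name: len(query_kmers & kmers(ref, k)) / max(1, len(query_kmers))
        for name, ref in references.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Made-up reference reads for two hypothetical groups.
refs = {"cluster_1": "ACGTACGTTTGACCA", "cluster_2": "TTGGCCAATTGGCCA"}
print(classify("ACGTACGTTTG", refs, k=4))
```

No branching order is inferred at any point; the read is simply matched to the closest reference, which is why this only works when representative references for each group exist.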
Thank you very much for your input!
Yes, we are getting the feeling that the 16S rRNA gene alone might not be the amplification target that makes us happy.
As said, we got the MiSeq just recently, and I guess we are slowly coming to recognize that it is not the prime method for generating phylogenetic trees.
We will have to look for such marker genes in our target groups.
Thanks again to both of you for your thoughts!
Regards,