Tree.shared vs. Clearcut.


I’m working with sequences from several different samples that I want to compare using unifrac. I had previously done the analysis with a tree file built using the tree.shared() command, but I recently went back and built the tree file using clearcut() instead, and the subsequent unifrac gave a different result. It doesn’t bother me what the true result is (whether the communities are significantly different or not), but it worries me that I can get different results from the same data depending on how I build the tree. Reading through the mothur wiki, I’ve seen both methods used for performing unifrac, so I don’t see a clear preference between them. Can anyone enlighten me as to which results to trust?

Also, on a general note, is there a good rule of thumb for the ideal sequence length when building the distance matrix for these sorts of analyses? I ask because there seems to be a bit of a judgement call when screening the aligned sequences. I suppose it’s a choice between having more sequences in the data set or having fewer, longer sequences to analyze.

Within mothur, I would encourage you to use clearcut to build trees of sequences and tree.shared to build trees of groups. clearcut uses the neighbor joining algorithm, whereas tree.shared uses average neighbor (UPGMA). They’re pretty different algorithms, so it isn’t surprising that the trees they produce, and the unifrac results based on them, differ.
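
To make the difference between the two algorithms concrete, here is a small Python sketch. It is not mothur or clearcut code, and the distance matrix is a made-up toy example. It only shows the first merge each method would make: average neighbor joins the pair with the smallest distance, while neighbor joining picks the pair that minimizes a criterion that also accounts for how far each member is from everything else.

```python
from itertools import combinations

# Toy distance matrix (hypothetical values, not from any real dataset).
labels = ["a", "b", "c", "d", "e"]
dist = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]


def first_pair_average_neighbor(d):
    """Average neighbor (UPGMA) first merges the pair with the smallest distance."""
    pairs = combinations(range(len(d)), 2)
    return min(pairs, key=lambda p: d[p[0]][p[1]])


def first_pair_neighbor_joining(d):
    """Neighbor joining first merges the pair minimizing
    Q(i, j) = (n - 2) * d(i, j) - sum_k d(i, k) - sum_k d(j, k)."""
    n = len(d)
    row_sums = [sum(row) for row in d]

    def q(pair):
        i, j = pair
        return (n - 2) * d[i][j] - row_sums[i] - row_sums[j]

    return min(combinations(range(n), 2), key=q)


upgma_pair = tuple(labels[i] for i in first_pair_average_neighbor(dist))
nj_pair = tuple(labels[i] for i in first_pair_neighbor_joining(dist))

print("average neighbor joins first:", upgma_pair)
print("neighbor joining joins first:", nj_pair)
```

On this toy matrix the two methods already disagree at the first step: average neighbor joins d and e, while neighbor joining joins a and b. Real data will not always diverge this way, but it shows why the same distances can lead to different trees.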