Theory behind unifrac in mothur

mrm · January 16, 2015, 10:41pm

Hi,

I have some questions regarding the theory behind unifrac in mothur. For my samples, I followed the OTU-based analysis in the MiSeq SOP and ran unifrac using the tree generated by the tree.shared command as input. I understand that the distance file used as input for tree.shared is generated by calculating distances between samples based on OTU presence/absence or relative abundance (depending on the calculator you pick), and not by looking at sequence similarities.

However, when I talked to a colleague today, he seemed confused about the mothur approach. He mentioned that the original idea behind UniFrac was to take a phylogenetic tree containing all the reads, plus a group file mapping each read to a group, and determine whether the groups are evenly spread across the tree or not, taking the branching hierarchy into account. In his opinion, the tree we are reading into the unifrac command in mothur is instead a dendrogram describing similarity between samples.

I wonder if anybody has run into this question before. It might have a very simple explanation, but we can't seem to figure it out.

Thanks!

dwaite · January 19, 2015, 4:06am

I’d say your colleague is right. UniFrac distances should be calculated from a phylogenetic tree of sequence data. You can build one from your samples using the dist.seqs/clearcut combination to get a neighbor-joining tree, and then use that tree as the input for your unifrac distances. I wouldn’t use the output of tree.shared as the input for unifrac, since this function uses UPGMA to build the tree.

The unifrac command in mothur (and QIIME) just takes a valid tree file and runs the analysis on it, so you won’t see an error if you pass the output of tree.shared into the unifrac functions, but the output won’t be very informative.
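To make the sequence-tree idea concrete, here is a minimal Python sketch of unweighted UniFrac on a toy four-read tree. It is not mothur's implementation: the `branches` list, the read names, the group labels, and the `unweighted_unifrac` helper are all made up for illustration. It assumes the usual unweighted definition, i.e. the branch length leading only to reads from one group divided by the branch length leading to reads from either group.

```python
# Toy tree of reads (hypothetical): ((r1:1, r2:1):2, (r3:1, r4:1):2)
# Each branch is stored as (branch length, set of reads descending from it).
branches = [
    (1.0, {"r1"}),
    (1.0, {"r2"}),
    (2.0, {"r1", "r2"}),
    (1.0, {"r3"}),
    (1.0, {"r4"}),
    (2.0, {"r3", "r4"}),
]


def unweighted_unifrac(branches, read_to_group, group_a, group_b):
    """Fraction of observed branch length that leads to only one of two groups."""
    unique = 0.0  # branch length seen in exactly one of the two groups
    total = 0.0   # branch length seen in at least one of the two groups
    for length, reads in branches:
        groups = {read_to_group[r] for r in reads}
        in_a = group_a in groups
        in_b = group_b in groups
        if in_a or in_b:
            total += length
            if in_a != in_b:
                unique += length
    return unique / total if total else 0.0


# Groups sit on separate clades: every branch is unique to one group.
segregated = {"r1": "A", "r2": "A", "r3": "B", "r4": "B"}
# Groups are interleaved: only the tip branches are unique to one group.
mixed = {"r1": "A", "r2": "B", "r3": "A", "r4": "B"}

print(unweighted_unifrac(branches, segregated, "A", "B"))  # 1.0
print(unweighted_unifrac(branches, mixed, "A", "B"))       # 0.5
```

The point of the sketch is that the leaves are reads and the group mapping is a separate input. When the leaves are samples instead, as in a tree.shared dendrogram, each leaf carries exactly one group, so there are no shared internal branches of mixed reads for the calculation to measure.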

mrm · January 19, 2015, 7:30pm

Thank you for the clear response! It is a bit confusing, then, why the MiSeq SOP uses unifrac for OTU-based analyses…

pschloss · January 21, 2015, 2:24pm

We use unifrac in two places in the SOP. In the OTU-based analysis, it is used to analyze a tree of samples; at the end of the SOP, in the phylogenetic analysis section, it is used to analyze a tree of sequences.
