I'm a little confused about which distance measure is best/most appropriate for comparing community structure/abundance between two groups of samples.
I am following the MiSeq SOP, which calculates UniFrac but then visualizes the thetayc and jclass distance matrices.
Just wondering if someone could clarify this: is thetayc the best one to use? Most publications seem to use UniFrac.
Each measure emphasises different aspects of the community. There is no best. And I’d suggest that “most use UniFrac” has little to do with the appropriateness of the distance measure, and everything to do with the fact that QIIME spits out plots that use it. Easy and appropriate are not the same thing. If you use a distance metric, you should be able to explain why.
Jaccard (very similar to Ochiai, which QIIME uses for no apparent reason): presence/absence, upweights rares
Bray-Curtis: incorporates abundance, downweights rares
Theta YC: punishes abundant but patchy, really downweights rares
UniFrac unweighted: phylogenetic distance on presence/absence, upweights rares
UniFrac weighted: phylogenetic distance with abundance
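To make the weighting concrete, here is a minimal sketch of the three non-phylogenetic measures on made-up OTU counts. The function names and the toy samples are mine, purely for illustration; this is not mothur's or QIIME's code, and UniFrac is left out because it needs a tree.

```python
def jaccard(a, b):
    """Jaccard distance on presence/absence: 1 - shared OTUs / all observed OTUs."""
    shared = sum(1 for x, y in zip(a, b) if x > 0 and y > 0)
    union = sum(1 for x, y in zip(a, b) if x > 0 or y > 0)
    return 1 - shared / union


def bray_curtis(a, b):
    """Bray-Curtis distance on counts: 1 - 2 * sum(min) / (total_a + total_b)."""
    shared = sum(min(x, y) for x, y in zip(a, b))
    return 1 - 2 * shared / (sum(a) + sum(b))


def theta_yc(a, b):
    """Yue & Clayton theta distance, computed on relative abundances."""
    total_a, total_b = sum(a), sum(b)
    pa = [x / total_a for x in a]
    pb = [y / total_b for y in b]
    cross = sum(x * y for x, y in zip(pa, pb))
    sq_diff = sum((x - y) ** 2 for x, y in zip(pa, pb))
    return 1 - cross / (sq_diff + cross)


# Hypothetical toy counts: two abundant shared OTUs, three unshared ones.
sample_a = [50, 30, 20, 0, 0]
sample_b = [50, 30, 0, 10, 10]

# Same samples, but sample A also picks up three singleton (rare) OTUs.
sample_a_rare = sample_a + [1, 1, 1]
sample_b_rare = sample_b + [0, 0, 0]

for name, fn in [("jaccard", jaccard), ("bray-curtis", bray_curtis), ("thetayc", theta_yc)]:
    before = fn(sample_a, sample_b)
    after = fn(sample_a_rare, sample_b_rare)
    print(f"{name:12s} before={before:.3f} after adding singletons={after:.3f}")
```

With these toy numbers, adding three singletons moves Jaccard from 0.60 to 0.75, Bray-Curtis from 0.20 to about 0.21, and thetayc stays at about 0.15. That is what "upweights rares" versus "really downweights rares" looks like in practice.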
Thank you for your reply, kmitchell.
I just want to clarify: I know thetaYC is kind of promoted in the MiSeq pipeline. Could you explain what you mean when you say this metric ‘punishes abundant but patchy’ and ‘really down weights rares’? It just seems difficult to know which one is appropriate for my experiment.
Basically, I am comparing the gut microbiomes of two populations fed two different diets and want to visualize differences in community structure. Maybe I’ll just stick with thetaYC?
Thanks again for your help
Read the thetayc paper; it demonstrates very nicely what the measure does. Or work through the math of the various calculations.
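As a starting point for the math, this is the thetayc distance as it is usually written, where a_i and b_i are the relative abundances of OTU i in communities A and B (the notation is my assumption; check it against the paper):

```latex
D_{\theta_{YC}} = 1 - \frac{\sum_{i} a_i b_i}{\sum_{i} (a_i - b_i)^2 + \sum_{i} a_i b_i}
```

The differences are squared. An OTU at 20% in A and absent from B adds 0.2^2 = 0.04 to the squared-difference sum, while an OTU at 1% in A and absent from B adds only 0.01^2 = 0.0001, which is 400 times less. Under Bray-Curtis on relative abundances the same two OTUs differ by a factor of 20, because the differences enter linearly. So thetayc is driven almost entirely by abundant OTUs that differ between samples (abundant but patchy), and rare OTUs barely register.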