UniFrac Distance Matrix -> Newick using R


Ran into this today and wanted to share. We are describing beta diversity between our samples using UniFrac, as in the Costello example. However, there doesn’t appear to be an easy way in mothur to get a tree from the resulting clustering of the UniFrac distance matrix, so I hacked this together this morning to do it in R.

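library(ape)  # provides as.phylo() and write.tree()

distFile <- "/path/to/your/file.dist"
# Number of samples; in a PHYLIP-formatted file this is the first line
size <- scan(distFile, what=integer(), n=1, quiet=TRUE)
method <- "average"

# Lower-triangle matrix: skip the count line, use sample names as row names
mat <- data.matrix(read.table(distFile, fill=TRUE, row.names=1, skip=1, col.names=1:size))
d <- as.dist(mat)
hc <- hclust(d, method=method)  # named hc so it does not mask base::c
p <- as.phylo(hc)
write.tree(p, file=paste(distFile, ".", method, ".tre", sep=""))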

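Simple enough. R warns that the matrix is non-square, but that is harmless: a lower-triangle file gives one row per sample and one fewer column of distances, and as.dist only reads the entries below the diagonal. Method can be “ward”, “single”, “complete”, “average”, “mcquitty”, “median” or “centroid”. Recent versions of R (3.1.0 and later) call “ward” “ward.D” and also offer “ward.D2”.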
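
For anyone who would rather not see the non-square warning, here is a minimal sketch that pads the matrix to square before calling as.dist. The three-sample matrix and its sample names (A, B, C) are made up for illustration, and I am assuming the file is a lower-triangle PHYLIP matrix with the sample count on the first line.

```r
# Hypothetical 3-sample lower-triangle PHYLIP matrix, written to a temp file
distFile <- tempfile(fileext = ".dist")
writeLines(c("3", "A", "B\t0.2", "C\t0.5\t0.4"), distFile)

# The first line holds the number of samples
size <- scan(distFile, what = integer(), n = 1, quiet = TRUE)

# size columns in total: one for the names, size - 1 for the distances
mat <- data.matrix(read.table(distFile, fill = TRUE, row.names = 1,
                              skip = 1, col.names = 1:size))

# Append an empty last column so the matrix is size x size
mat <- cbind(mat, NA)
colnames(mat) <- rownames(mat)

d <- as.dist(mat)                     # only entries below the diagonal are used
hc <- hclust(d, method = "average")   # UPGMA
print(d)
```

The padded column is never read, since as.dist takes only the lower triangle, so the resulting distances are the same as in the version above.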


Thanks, Chris. Actually, you can generate a tree in mothur using tree.shared(phylip=yourfile.unweighted.dist)

Thanks, Pat. I admit I overlooked that tree.shared accepts a distance matrix as input. Still, it’s nice to have R available for clustering methods other than UPGMA.
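
As a rough illustration of that, the sketch below builds one Newick string per clustering method. It assumes the ape package is installed, and the distances between the three samples (A, B, C) are invented for the example, not taken from real data.

```r
library(ape)  # provides as.phylo() and write.tree()

# Made-up distances between three hypothetical samples
m <- matrix(c(0,   0.2, 0.5,
              0.2, 0,   0.4,
              0.5, 0.4, 0), nrow = 3,
            dimnames = list(c("A", "B", "C"), c("A", "B", "C")))
d <- as.dist(m)

# Cluster with each method and collect the Newick strings
methods <- c("single", "complete", "average", "mcquitty")
trees <- sapply(methods, function(method) {
  hc <- hclust(d, method = method)
  write.tree(as.phylo(hc))  # with no file argument, the string is returned
})
print(trees)
```

To write files instead, pass file=paste(distFile, ".", method, ".tre", sep="") to write.tree inside the loop, as in the original snippet.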