using unifrac after cluster.split

FM_Kerckhof · September 23, 2015, 10:58am

Hi,

In the MiSeq SOP it is suggested to use cluster.split as an alternative to dist.seqs/cluster.
I prefer this way of working because, even with very low cutoffs, the distance matrices from dist.seqs take up a lot of disk space.
However, to use unifrac you need a phylogenetic tree, and to make a phylogenetic tree you do need to run dist.seqs.
Is there no way at all to avoid this?
For instance, RAxML can take any alignment (I generally use it on mothur NAST alignments) and construct a phylogenetic tree without polluting the disk.
Or is this the reason the SOP mentions “this process gets messy as your number of sequences increases”?

Kind regards,

FM
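
To put the disk-space concern in numbers, here is a rough back-of-the-envelope sketch. It assumes the tree step needs the full lower-triangle matrix (so no cutoff applies) and that each stored distance takes about 7 bytes as text; both the function names and the bytes-per-entry figure are illustrative assumptions, not values taken from mothur.

```python
def lower_triangle_entries(n_seqs: int) -> int:
    """Number of pairwise distances among n_seqs sequences: n(n-1)/2."""
    return n_seqs * (n_seqs - 1) // 2


def estimated_size_gb(n_seqs: int, bytes_per_entry: int = 7) -> float:
    """Rough size of a text lower-triangle matrix in GB.

    bytes_per_entry is an assumption (e.g. a value like "0.0123" plus a
    separator); the real figure depends on the output format.
    """
    return lower_triangle_entries(n_seqs) * bytes_per_entry / 1e9


if __name__ == "__main__":
    # Show how quickly the matrix grows with the number of unique sequences.
    for n in (10_000, 100_000, 1_000_000):
        entries = lower_triangle_entries(n)
        size = estimated_size_gb(n)
        print(f"{n:>9,} sequences -> {entries:,} distances, ~{size:,.1f} GB")
```

The quadratic growth is the point here: ten times more unique sequences means roughly a hundred times more distances to store.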

pschloss · September 28, 2015, 12:16pm

Yeah, this gets messy with more sequences. The problem is you have to build a tree from all of those sequences unless you do some weird mapping procedure. If you can build a tree with RAxML, go for it.

Pat

FM_Kerckhof · September 29, 2015, 2:20pm

Hi, thanks for your reply. I was afraid there was indeed no other option.
I am not very familiar with the licensing of RAxML, but could RAxML be integrated into mothur?
In my humble opinion, phylogeny (not taxonomy) is an interesting measure for assessing some hypotheses on NGS data (implicitly, OTU binning is some kind of “phylogenetic” binning too).
RAxML is fast and has several HPC extensions (MPI, …). Additionally, for Illumina MiSeq read lengths, I think the EPA (evolutionary placement algorithm, http://sysbio.oxfordjournals.org/content/60/3/291.full) might be something interesting?
Or am I missing something here?

Kind regards,

FM
