Tree for Unifrac.Weighted?

katinker · November 29, 2017, 7:50pm

I would like to run unifrac.weighted to get weighted unifrac distances from my dataset. I have my count table, however I need some help figuring out what to use for my tree file.

I used Silva to align my sequences. Can I use the .arb files associated with Silva for my tree file? Or do I need to use clearcut to construct a tree from my own data? I would love some assistance understanding what I should use and why.

Thanks!

dwaite · November 30, 2017, 7:56pm

The unifrac commands just require a Newick-formatted tree file, so any software that gives you one will work for this command. You can create a neighbor-joining tree in mothur using the clearcut command. Another option would be to use FastTree to create a pseudo-ML tree.

As long as you use the fasta file that corresponds with your count table, the results of any tree builder should work fine.
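One way to check that a tree and a count table correspond is to compare their sequence names before running unifrac.weighted. The following is a minimal sketch, not part of mothur: the function names are hypothetical, the Newick parsing only handles simple unquoted tip labels, and the count table is assumed to be tab-delimited with a header line starting with `Representative_Sequence`.

```python
import re


def newick_leaf_names(newick):
    """Return the set of tip labels in a Newick string (unquoted labels only)."""
    # A tip label directly follows '(' or ',' and ends at ':', ',', ')' or ';'.
    # Internal node labels follow ')' and are therefore not captured.
    return set(re.findall(r"[(,]\s*([^\s(),:;]+)", newick))


def count_table_names(text):
    """Return the sequence names in the first column of a count table."""
    names = set()
    for line in text.splitlines():
        # Skip blank lines, comment lines, and the assumed header line.
        if not line.strip() or line.startswith("#"):
            continue
        if line.startswith("Representative_Sequence"):
            continue
        names.add(line.split("\t")[0])
    return names


def compare_names(newick, count_text):
    """Return (names only in the tree, names only in the count table)."""
    tree = newick_leaf_names(newick)
    table = count_table_names(count_text)
    return sorted(tree - table), sorted(table - tree)


# Toy data, made up for illustration.
toy_tree = "((seq1:0.1,seq2:0.2):0.05,seq3:0.3);"
toy_counts = (
    "Representative_Sequence\ttotal\tA\tB\n"
    "seq1\t5\t3\t2\n"
    "seq2\t1\t1\t0\n"
    "seq4\t2\t0\t2\n"
)
only_in_tree, only_in_table = compare_names(toy_tree, toy_counts)
print("Only in tree:", only_in_tree)
print("Only in count table:", only_in_table)
```

If either list is non-empty, the tree was probably built from a different fasta file than the one that matches the count table.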

katinker · December 4, 2017, 8:14pm

Can I use the .arb files associated with Silva for my tree file? Or do I need to use clearcut to construct a tree from my own data?

I have been trying to use clearcut to construct a tree; however, it’s been running for well over 7 days on my computer. My understanding is that FastTree is less computationally intensive, but I’m not sure if that means the quality is compromised.

I would appreciate any additional insight you can provide.

dwaite · December 4, 2017, 9:06pm

You can’t use the ARB files directly, but if you have a tree inside your ARB database you can export it (Tree -> Tree Admin -> Export) and use the exported file.

How many sequences are you trying to build your tree from? FastTree basically builds a BIONJ tree and then refines the tree using ML criteria, so I wouldn’t expect it to be any quicker than clearcut.

katinker · December 4, 2017, 9:53pm

I have quite a few sequences, and I think that is part of my problem at the moment. I was running clearcut based on the fasta file generated immediately before the dist.seqs step in the MiSeq SOP. It seems to me that instead I need to build a tree based on representative sequences for each OTU in my dataset. It looks like I can use get.oturep for this.

Does this seem reasonable to you? Any other suggestions? I appreciate the help!
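To get a rough sense of why the number of sequences matters here: a neighbor-joining tree is built from pairwise distances, and a lower-triangle matrix for n sequences holds n(n-1)/2 of them. The sketch below just does that arithmetic. Both sequence counts are hypothetical, chosen only to illustrate the scale, and are not taken from this dataset.

```python
def pairwise_distances(n_sequences):
    """Number of entries in a lower-triangle distance matrix for n sequences."""
    return n_sequences * (n_sequences - 1) // 2


unique_seqs = 100_000  # hypothetical number of unique sequences
otu_reps = 2_000       # hypothetical number of OTU representatives

print(pairwise_distances(unique_seqs))  # 4999950000
print(pairwise_distances(otu_reps))     # 1999000
```

With these made-up numbers, switching to OTU representatives cuts the number of distances by a factor of roughly 2,500.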

dwaite · December 5, 2017, 7:21pm

One thing you could try to speed the process up is to run dist.seqs before clearcut. The reason for this is that dist.seqs can be split over multiple processors but clearcut sits in a single process. I’m not sure how clearcut handles a fasta input, but given your run time I’m assuming that it uses a single thread.
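The two-step approach described above could be scripted roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, `final.fasta` is a placeholder file name, and the distance file is assumed to be named by swapping the fasta extension for `.phylip.dist` (as in the MiSeq SOP), so check mothur's log for the actual output name. The code only builds the command strings and does not run mothur.

```python
def tree_commands(fasta, processors=8):
    """Build mothur commands: dist.seqs (multi-processor), then clearcut."""
    root = fasta.rsplit(".", 1)[0]   # final.fasta -> final
    dist = root + ".phylip.dist"     # assumed dist.seqs output name
    return [
        # output=lt asks for a lower-triangle (phylip-style) matrix.
        f"dist.seqs(fasta={fasta}, output=lt, processors={processors})",
        # clearcut then reads the precomputed distances instead of the fasta.
        f"clearcut(phylip={dist})",
    ]


def mothur_command_line(commands):
    """Join commands into mothur's command-line mode: mothur "#cmd1; cmd2"."""
    return 'mothur "#' + "; ".join(commands) + '"'


commands = tree_commands("final.fasta", processors=4)
print(mothur_command_line(commands))
```

The printed line can be pasted into a shell, or the two commands can be entered one at a time in an interactive mothur session.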
