read.tree error message

Hi all,

I’m getting an error message when I try to import a tree into mothur using the “read.tree” command (for use in parsimony and unifrac analysis). I built the tree with a program called “fasttree”, which builds a tree from the alignment instead of from a distance matrix. I can’t use phylip to build my tree because my distance matrix is too large (nearly 8 GB). Unfortunately, when I try to read my tree into mothur, it errors out. The trees from fasttree are supposed to be in newick format.

Mothur closes and I have to open the log to get the error message, which reads: “Error: Expected comma in input file. Found ).” I’m not sure what this error means. I’m assuming it’s a formatting issue with my tree. A small sample run through fasttree read fine, but I got this error with a set of about 2000 samples and with my large alignment of 48000 samples. Has anyone used this program successfully with mothur, or can someone tell me what this error message means? Any help would be appreciated.

I’m using a Windows-based PC with 8 processors and 48 GB RAM, so I don’t think it’s a computer issue. Thanks!

-Damon

sounds like the tree file is corrupted. Have you tried opening it with a tree viewer, like figtree?

yeah, it’s corrupted. by the node labels that fasttree puts in and by the fact that they don’t give bifurcating trees. i don’t actually think the fasttree trees have branch lengths (so why do unifrac?), but i might be wrong.
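One way to check a tree file for the two features mentioned above (internal node labels and nodes that are not bifurcating) is to scan the newick string directly. The following is only a minimal Python sketch, not part of mothur or fasttree: the function name `newick_node_stats` is made up for illustration, and it assumes a single tree with no quoted labels and no bracketed comments.

```python
def newick_node_stats(newick):
    """Scan a newick string and describe its internal nodes.

    Returns (child_counts, internal_labels):
      child_counts    - number of children of each internal node,
                        in the order the nodes are closed
      internal_labels - any text found directly after a ')', such as
                        a node name or a support value
    Assumption: no quoted labels and no [comments] in the string.
    """
    child_counts = []
    internal_labels = []
    commas = []  # commas seen so far at each open parenthesis level
    i, n = 0, len(newick)
    while i < n:
        ch = newick[i]
        if ch == "(":
            commas.append(0)
        elif ch == "," and commas:
            commas[-1] += 1
        elif ch == ")":
            if not commas:
                raise ValueError("unbalanced ')' at position %d" % i)
            # children of this node = commas at this level + 1
            child_counts.append(commas.pop() + 1)
            # anything between ')' and the next ':', ',', '(', ')' or ';'
            # is an internal node label or support value
            j = i + 1
            while j < n and newick[j] not in ":,();":
                j += 1
            label = newick[i + 1:j].strip()
            if label:
                internal_labels.append(label)
            i = j
            continue
        i += 1
    if commas:
        raise ValueError("unbalanced '(' in newick string")
    return child_counts, internal_labels


# Toy tree (made up): one labelled internal node and a three-way split at the root.
counts, labels = newick_node_stats("((A:0.1,B:0.2)0.95:0.05,C:0.3,D:0.4);")
non_binary = [c for c in counts if c != 2]
print(counts, labels, non_binary)
```

Any entry in `non_binary` marks a node with one child or with three or more children, and a non-empty `labels` list means the file still carries internal labels or support values. Whether either of these is what actually triggers the mothur error is not established here; the sketch only shows where to look.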

I recommend clearcut for building trees, very fast even on very big trees.

Pat: I get the same error message even when I turn the internal node labels off in fasttree. I’ve heard that other researchers are using fasttree trees to do unifrac analysis, so I was under the impression there is a way.

James: Thanks for the suggestion on figtree. I wasn’t aware of that program. I’m able to open my trees in figtree without any error messages, so I don’t think my tree files are corrupted. Unfortunately, I can’t use clearcut since I’m running a Windows-based PC and it looks like clearcut only works on Linux.

It sounds like we might just have to bite the bullet and look into getting a Linux-based PC so we can use clearcut. That seems to be the standard method. Just out of curiosity, has anyone successfully run clearcut with an 8 GB distance matrix?

Thanks for all the advice.

could you email us an example fasttree tree (mothur.bugs@gmail.com)? i’ve successfully used clearcut to build nj and r-nj trees for datasets up to 14,000 sequences

better yet, get a mac!

Good luck.