I’m wondering which phylogenetic tree-building programs are compatible with MOTHUR. I would like to perform the parsimony test on a very large pyrosequencing data set (~120,000 sequences split among 5 groups). My distance matrices are upwards of 10 GB. Unfortunately, the Clearcut website seems to be down, so I cannot obtain that software, which has been integrated for use in MOTHUR. I have been looking at FastTree as an alternative but am getting errors about reading the … in my alignment files. Does anybody know if this will cause a problem downstream, either in generating the tree or in its accuracy?
I would appreciate any suggestions or comments,
Elizabeth

I am attaching source code for clearcut 1.0.9. I don’t think James will mind.
I deleted the examples to get under the Mothur forum’s 256kb upload limit.
But honestly, with the huge number of sequences you have, there is an over
9000% chance that tree-based methods will declare your samples to be
statistically significantly different.
Robin

Thanks for the software, I appreciate it. Since the tree-based parsimony test seems unlikely to be the best choice for comparing libraries, do you think that LIBSHUFF would be a better approach? We first want to see whether these libraries are significantly different (we suspect that some are not) before we go into any other in-depth analysis.
Elizabeth

Although people’s jaws may drop, if you want to do a hypothesis test, the unifrac.weighted/unweighted test might be the way to go.

Elizabeth,
I have been using FastTree with no problems for about a month on some fairly large 16S datasets (>100k reads). The warning about the “.” characters can be dismissed as long as your alignments contain only a few of them (mine have 10-20 on either end). I have run the FastTree output through unifrac within mothur on three different sample sets, and the resulting sample distance matrices are comparable to those generated with various OTU-based approaches (97% similarity clusters, phylotype levels 4-7 with SILVA). I looked into a variety of tree-building algorithms, and FastTree seems to work very well, very quickly, and with minimal headaches. Feel free to contact me if you are having issues with FastTree, though I am no expert.
Craig Nelson
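
For anyone who would rather get rid of the “.” warning than ignore it, here is a minimal Python sketch that rewrites an aligned FASTA file with every “.” in the sequence lines replaced by “-”. It assumes that treating those positions as ordinary gaps is acceptable for your analysis, which is a judgment call rather than something stated in this thread. The function names and file names are made up for illustration. The file is streamed line by line, so it should cope with alignments of 100k+ reads.

```python
def dots_to_gaps(lines):
    """Yield FASTA lines with '.' replaced by '-' in sequence lines only."""
    for line in lines:
        if line.startswith(">"):
            # Header line: leave untouched, since sequence names may contain '.'
            yield line
        else:
            # Sequence line: turn '.' padding into ordinary gap characters
            yield line.replace(".", "-")


def convert_file(in_path, out_path):
    """Stream in_path to out_path, converting '.' to '-' in sequences."""
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.writelines(dots_to_gaps(src))


if __name__ == "__main__":
    # Hypothetical file names; substitute your own alignment.
    import sys
    if len(sys.argv) == 3:
        convert_file(sys.argv[1], sys.argv[2])
```

Saved as, say, dots_to_gaps.py, it would be run as `python dots_to_gaps.py my.align my.nodots.align`, and the output file is what you would then hand to the tree builder.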