I love MOTHUR (I have thought about getting a tattoo saying exactly that). The only thing that is killing me at the moment is the clustering into OTUs. It is just not feasible to cluster >50,000 reads with the cluster command. Uclust is so much faster, but then I am stuck with its output format. Looks like a happy Perl file format conversion orgy ahead of me.
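For anyone facing the same conversion, here is a minimal sketch (in Python rather than Perl) of turning UCLUST cluster output into a single mothur-style list line. It assumes the tab-separated .uc layout in which column 1 is the record type (S for a seed, H for a hit), column 2 is the cluster number and column 9 is the read name. The function name uc_to_list_line, the label "0.03" and the toy records are made up for illustration. It does not expand dereplicated reads through a names file.

```python
from collections import defaultdict


def uc_to_list_line(uc_lines, label="0.03"):
    """Build one mothur-style list line from UCLUST .uc records.

    The label is a hypothetical placeholder for the clustering cutoff.
    """
    clusters = defaultdict(list)
    for line in uc_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        # Keep only seeds (S) and hits (H); C records merely summarise a cluster.
        if len(fields) < 9 or fields[0] not in ("S", "H"):
            continue
        clusters[int(fields[1])].append(fields[8])
    # One comma-separated OTU per cluster, ordered by cluster number.
    otus = [",".join(clusters[k]) for k in sorted(clusters)]
    return "\t".join([label, str(len(otus))] + otus)


# Toy records for illustration only, not real data.
example = [
    "# toy records",
    "S\t0\t250\t*\t*\t*\t*\t*\tseqA\t*",
    "H\t0\t250\t98.0\t+\t0\t0\t250M\tseqB\tseqA",
    "S\t1\t248\t*\t*\t*\t*\t*\tseqC\t*",
    "C\t0\t2\t98.0\t*\t*\t*\t*\tseqA\t*",
]
print(uc_to_list_line(example))
```

The result is one line of the form label, number of OTUs, then one tab-separated field per OTU with the read names joined by commas. Check the column positions against the output of your own Uclust version before relying on it.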
mothur > cluster.split(column=2.TCA.454Reads.trim.unique.pick.filter.dist,name=2.TCA.454Reads.trim.pick.names,method=average)
Splitting the file…
It took 1288 seconds to split the distance file.
Clustering 2.TCA.454Reads.trim.unique.pick.filter.dist.0.temp
It took 141830 seconds to cluster
Merging the clustered files…
It took 31 seconds to merge.
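To put those numbers in perspective, a small sketch that pulls the "It took N seconds" lines out of the log above and converts them to hours and share of the run. The regular expression and variable names are just for illustration; the log text is copied from the run above.

```python
import re

log = """Splitting the file…
It took 1288 seconds to split the distance file.
Clustering 2.TCA.454Reads.trim.unique.pick.filter.dist.0.temp
It took 141830 seconds to cluster
Merging the clustered files…
It took 31 seconds to merge."""

# Map each step name (split, cluster, merge) to its duration in seconds.
timings = {}
for match in re.finditer(r"It took (\d+) seconds to (\w+)", log):
    timings[match.group(2)] = int(match.group(1))

total = sum(timings.values())
for step, secs in timings.items():
    print(f"{step:>8}: {secs:>7} s  {secs / 3600:6.2f} h  {100 * secs / total:5.1f} %")
```

The cluster step works out to roughly 39.4 hours, about 99% of the whole run, and the log lists only a single .0.temp piece being clustered.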
Sorry, our documentation is pretty poor at the moment. The way you are running cluster.split, there isn’t actually any splitting going on. Try one of the following options…
With each of these, you will probably want to set the processors option to take advantage of the parallelization. The final option is the fastest, and it was the method that came closest to Uclust for speed in the AEM paper we published at the beginning of the summer.