Would it be possible to implement the HPC-Clust algorithm in mothur? It’s much faster than mothur’s cluster(), probably largely due to multithreading, and in a brief evaluation it doesn’t seem to suffer from a problem I’m having (failure to cluster past unique, which could get its own post, but I believe the authors are familiar with the issue).
The paper is here: http://bioinformatics.oxfordjournals.org/content/30/2/287.long
Source code is here: http://meringlab.org/software/hpc-clust/
The paper is brief and mostly focuses on performance, but there is a bit more information in the supplemental info. I’m not sure how much work it would be to implement, but the relatively slow, flaky clustering algorithm is one of mothur’s few weak spots at the moment, in my opinion.