Problems with MPI version of pairwise.seqs

kkmattil · February 28, 2014, 2:10pm

Hi

I am having some problems with the MPI version of mothur.

Here at the Finnish supercomputing center (CSC) we have been testing mothur for a researcher who would like to run the mothur command pairwise.seqs on a very large dataset.

Using the MPI version of mothur looks like the only option if we want to run the analysis in a reasonable time. As the all-against-all pairwise sequence alignment can be easily parallelized, the command actually scales nicely to over 1000 cores on our Cray XC30 system.
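To illustrate why the all-against-all step parallelizes so easily, here is a minimal sketch of one way to split the n(n-1)/2 sequence pairs evenly across workers. This is not mothur's actual MPI scheme, which may divide the work differently; the function names `pair_ranges` and `pairs_for_range` are made up for this example.

```python
from itertools import combinations, islice


def pair_ranges(n_seqs, n_workers):
    """Return one (start, stop) range of pair indices per worker."""
    total = n_seqs * (n_seqs - 1) // 2  # number of unordered pairs
    return [
        (w * total // n_workers, (w + 1) * total // n_workers)
        for w in range(n_workers)
    ]


def pairs_for_range(n_seqs, start, stop):
    """List the (i, j) index pairs, i < j, in one slice of the pair list."""
    return list(islice(combinations(range(n_seqs), 2), start, stop))


if __name__ == "__main__":
    n_seqs, n_workers = 10, 4
    for rank, (start, stop) in enumerate(pair_ranges(n_seqs, n_workers)):
        print(rank, stop - start, "pairs")
```

Each pair is independent of the others, so the ranges can be computed without any communication between workers. The only shared step is collecting the results into one output file.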

However, it looks like part of the results is lost and the distance matrix is incomplete if we use more than four cores. There are no error messages in the mothur log file, and the distances that do reach the distance matrix file are correct, but not all the distances are found in the result file.

Further, the error is not systematic: sometimes more results are missing, sometimes fewer. Also, when we increase the number of computing cores, the tendency to lose results seems to increase.
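The loss can be quantified with a check along the following lines. This is a minimal sketch that assumes the run used no cutoff and that the output is in column format, with one whitespace-separated `seqA seqB distance` line per pair; the helper name `missing_pairs` is made up for this example.

```python
from itertools import combinations


def missing_pairs(names, dist_lines):
    """Return the unordered sequence pairs that have no line in the output.

    names: unique sequence identifiers from the input file.
    dist_lines: lines of the distance file, assumed to be 'seqA seqB distance'.
    """
    expected = {frozenset(pair) for pair in combinations(names, 2)}
    seen = set()
    for line in dist_lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        seen.add(frozenset(fields[:2]))
    return expected - seen


if __name__ == "__main__":
    names = ["seq1", "seq2", "seq3"]
    lines = ["seq1 seq2 0.10", "seq1 seq3 0.25"]  # the seq2/seq3 pair is absent
    print(len(missing_pairs(names, lines)), "pair(s) missing")
```

Here `names` would be the sequence identifiers from the input FASTA file and `dist_lines` the lines of the resulting distance file. Without a cutoff, a complete run should leave the returned set empty, so comparing its size across core counts shows how the loss grows.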

We have observed the same behavior on both our Cray XC30 supercomputer and our HP SL230s G8 cluster.

Any ideas how we could fix this?

Regards,

Kimmo Mattila / CSC

westcott · February 28, 2014, 8:07pm

Are you using a cutoff with the command that would cause mothur to eliminate distances from the matrix? Have you tried our non-MPI version?

kkmattil · March 3, 2014, 7:43am

Hi.

We have been testing the command with and without a cutoff.
The thread-based mothur version works fine, but that of course limits
the computing, as we can use just one cluster node.

Regards,

Kimmo Mattila
