parallel capabilities?

Hi,

I am helping someone run Mothur on our HPC system. She is trying to get shhh.flows to run faster, and it appears to have parallel capabilities, although I found a post suggesting that MPI may not provide much speedup:

http://mothur.ltcmp.net/t/time-required-for-shhh-flows/774/1

Can you summarize the MPI capabilities of this version and what speedup to expect? I downloaded version 1.31.2 last week and compiled it with OpenMPI 1.5.4 over InfiniBand. I am running it as:

mpirun -np 10 mothur "#shhh.flows(file=061113MA_rep.flow.files)" &> run.out

where 061113MA_rep.flow.files has the contents:

061113MA_rep.50113.1A.16S_515F.flow
061113MA_rep.50113.2A.16S_515F.flow
061113MA_rep.61312.3.16S_515F.flow
061113MA_rep.61812.1.16S_515F.flow
061113MA_rep.61812.4.16S_515F.flow
061113MA_rep.61812.7.16S_515F.flow
061113MA_rep.62712.15.16S_515F.flow
061113MA_rep.62712.18.16S_515F.flow
061113MA_rep.72712.1.16S_515F.flow
061113MA_rep.72712.2.16S_515F.flow

This is being submitted through a Torque/Moab resource manager/scheduler. The output (both stdout and stderr) captured in run.out shows:

TERM environment variable not set.
TERM environment variable not set.
TERM environment variable not set.
TERM environment variable not set.
TERM environment variable not set.
TERM environment variable not set.
TERM environment variable not set.
TERM environment variable not set.
TERM environment variable not set.
TERM environment variable not set.

mothur v.1.31.2
Last updated: 6/13/2013

by
Patrick D. Schloss

Department of Microbiology & Immunology
University of Michigan
pschloss@umich.edu
http://www.mothur.org

When using, please cite:
Schloss, P.D., et al., Introducing mothur: Open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl Environ Microbiol, 2009. 75(23):7537-41.

Distributed under the GNU General Public License

Type 'help()' for information on the commands that are available

Type 'quit()' to exit program



mothur > shhh.flows(file=061113MA_rep.flow.files)

Using 1 processors.

Getting preliminary data...

>>>>>   Processing 061113MA_rep.50113.1A.16S_515F.flow (file 1 of 10)   <<<<<
Reading flowgrams...
Identifying unique flowgrams...
Calculating distances between flowgrams...
0       0       0
100     0       0.01
200     0       0.04
300     0       0.1
400     0       0.19
500     0       0.29
600     0       0.43
700     0       0.6
800     1       0.79
900     1       0.99
977     1       1.17

Clustering flowgrams...
********************#****#****#****#****#****#****#****#****#****#****#
Reading matrix:     ||||||||||||||||||||||||||||||||||||||||||||||||||||
***********************************************************************

Denoising flowgrams...
iter    maxDelta        nLL             cycletime
1       438     -62495.8        5       4.78
2       114.847 2.50974e+06     27      27.87
...

Also, another file gets created that looks similar but not identical:

[cousins@marconi 1446]$ cat mothur.1376922106.logfile
Linux version

Using ReadLine

Running 64Bit Version

mothur v.1.31.2
Last updated: 6/13/2013

by
Patrick D. Schloss

Department of Microbiology & Immunology
University of Michigan
pschloss@umich.edu
http://www.mothur.org

When using, please cite:
Schloss, P.D., et al., Introducing mothur: Open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl Environ Microbiol, 2009. 75(23):7537-41.

Distributed under the GNU General Public License

Type 'help()' for information on the commands that are available

Type 'quit()' to exit program
Using MPI       version 2.1
Script Mode


mothur > shhh.flows(file=061113MA_rep.flow.files)

Using 1 processors.

Getting preliminary data...

>>>>>   Processing 061113MA_rep.50113.1A.16S_515F.flow (file 1 of 10)   <<<<<
Reading flowgrams...
Identifying unique flowgrams...
Calculating distances between flowgrams...

Clustering flowgrams...
********************#****#****#****#****#****#****#****#****#****#****#
Reading matrix:     |||||||||||||||||||||||||||||||||||||||||||||||||||

So the logfile shows that it is the MPI version of the program. The program runs, and the processes on the nodes sit at 100% CPU, implying that they are doing something. However, the program doesn't run any faster than in serial mode.

First of all, do I have it set up correctly? My assumption is that it works on each data set sequentially, breaking that data set into pieces to distribute among the different MPI processes. Is that correct?

In the post I linked at the beginning, it sounded like I might be better off not using MPI and instead specifying "processors=10" in the shhh.flows call. Our nodes have 16 cores each. Would that be expected to run faster?

Thanks for your help,

Steve

There's no speedup for shhh.flows with MPI. The advantage is that it is possible to put different files on separate nodes - this isn't possible with MPI. She should use the non-MPI version with the processors option.

I’m not sure I understand. You say:

The advantage is that it is possible to put different files on separate nodes - this isn’t possible with MPI.

This seems to be inconsistent. Do you mean it isn’t possible with OpenMP/threads? What is the advantage of putting different files on separate nodes? Is there a document or post that details how you might benefit from using the MPI version?

I have tried just running it with processors=10 and got no speedup, but that was still with the MPI build. I am now compiling it without MPI to give it a try… It runs very differently, and I'm not sure whether it is running well or not. It forks the specified number of processes rather than using threads as I had assumed. I'll check with Jean to see if it is doing what she expects.

It has stopped sending anything to STDOUT, STDERR, or any other files. Looking at the node it is running on, there are 4 running mothur processes and 4 defunct ones (this job was started with processors=8 instead of 10). My guess is that the defunct processes have simply finished their work and are waiting for the parent process to reap them with wait(). The 4 "running" processes are at 100% CPU but still produce no output (over an hour now). I'll let it continue and see whether it proceeds after a while.
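For what it's worth, defunct entries are just ordinary zombie processes: a forked child that has exited stays in the process table until its parent calls wait() on it. A minimal sketch of the mechanism (Linux-specific, unrelated to mothur's actual code):

```python
import os
import time

pid = os.fork()
if pid == 0:
    # Child: exit immediately without doing any work.
    os._exit(0)

time.sleep(0.5)  # give the child time to exit

# In /proc/<pid>/stat, the field after the "(comm)" entry is the
# process state; 'Z' means zombie, shown as <defunct> by ps.
with open(f"/proc/{pid}/stat") as f:
    state = f.read().rsplit(")", 1)[1].split()[0]
print("state before wait():", state)

os.waitpid(pid, 0)  # parent reaps the child; the defunct entry disappears
```

So seeing <defunct> workers alongside busy ones is normal while the parent is still occupied and hasn't reaped them yet.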

If you can let me know if this is all expected behavior I’d appreciate it.

Thanks,

Steve

What is the advantage of putting different files on separate nodes?

Sorry - say you have 8 flow files in your files file and 8 processors. Using processors=8 will put one flow file on each processor, so all eight theoretically finish in the time it takes to process a single file - 1/8th of the serial time. I don't believe the MPI version of mothur is set up to do this.