Compute Canada

Hello!

Has anybody run Mothur on a Compute Canada machine?

I have a question about how to set the number of processors that Mothur can use.

In my .batch file, I have used set.current(processors=16)

In my .sh file for SLURM, I have put the following, alongside the other options necessary for the job to be submitted:

#SBATCH --mem=64G
#SBATCH --cpus-per-task=16
mothur alex.batch

Will this allow me to run Mothur with 64 GB of memory on 16 processors? Or must I launch it using mpimothur alex.batch? I am getting confused.
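For reference, here is roughly the setup I am aiming for (a minimal sketch; submit.sh is just a name I made up, and the account, time, and memory values are placeholders). The idea is that processors= in the batch file matches --cpus-per-task in the SLURM script. alex.batch would start with (followed by the rest of my mothur commands):

set.current(processors=16)

and submit.sh would look like:

#!/bin/bash
#SBATCH --account=xxxx
#SBATCH --time=0-08:00
#SBATCH --mem=64G
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=16
mothur alex.batch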

It is my first time running Mothur remotely.

Thank you!

Well, I did run it yesterday. Lots of problems to figure out.

1st: I have a weird thing happening in the output: sometimes I can see the phrase "Using 8 processors" 8 times in a row. Also, when aligning, the first time I ran out of memory: I was using 16 processors and 212 GB were needed for this step. I reduced the number of processors and it still needed 101 GB to run, exactly half the processors and half the memory. So based on this, I think that with the way I set up my files, Mothur is not using the available resources very well, even though I asked for 8 CPUs and told Mothur to use processors=8. Something is happening and I cannot tell what.

Moreover, during pre.cluster, I received a very weird message telling me that groups were missing from my count table, which is absolutely impossible since I was using "current" for all count tables… So definitely something is going on with the use of multiple processors.

And when I run the same mothur script on my local machine, everything is fine, so it is definitely something about how I assign resources on the supercomputer.

I will post back when I figure this one out!

So I am not completely satisfied with the support I get. They do not know Mothur (a shame) and therefore can give me little help. I guess it is another case of finding out about it yourself. I was told that Mothur cannot support more than 1 CPU and that I should just submit it as is. I get the feeling that I am not using the correct wording, which is leading support to believe that I am a complete noob. So, based on what I could find on the good old Internet, I am trying this, with processors=8 set in the batch file. For now, I am just trying to replicate what I can do on my local machine, on which my batch file runs just fine…

So I am now trying with the following SLURM script.


#!/bin/bash
#SBATCH --account=xxxx
#SBATCH -t 0-08:00
#SBATCH --mem 64G
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --job-name=alextest
#SBATCH --mail-user=myemail.live.ca
#SBATCH --mail-type=ALL
mothur alex.batch

I just submitted it this morning; I will post back whether or not the run was successful.

Well, that is definitely not working: I can see each sample being assembled 8 times, so it is not very useful. I am canceling it.

Now trying with the following script, with processors=8 in my batch file.

#!/bin/bash
#SBATCH --account=xxxx
#SBATCH -t 0-08:00
#SBATCH --mem 64G
#SBATCH --job-name=alextest
#SBATCH --mail-user=myemail.live.ca
#SBATCH --mail-type=ALL
mothur alex.batch

Hmm, here is what is showing in my SLURM output…
So yeah, something definitely appears off; it does not look like the output on my desktop computer, but it does look logical in the end.
So I will let it run tonight, and we will see.
…

But the Mothur log file is OK though.
A mystery! At least to me it is.

SLURM output:


It took 109 secs to assemble 126390 reads.


>>>>> Processing file pair /scratch/alexthi/nassima1/MI.M03555_0209.001.FLD0122.1-428_R1.fastq - /scratch/alexthi/nassima1/MI.M03555_0209.001.FLD0122.1-428_R2.fastq (files 26 of 96) <<<<< Making contigs... 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000 11000 12000 13000 13869

Processing file pair /scratch/alexthi/nassima1/MI.M03555_0209.001.FLD0122.1-428_R1.fastq - /scratch/alexthi/nassima1/MI.M03555_0209.001.FLD0122.1-428_R2.fastq (files 26 of 96) <<<<<
Making contigs… 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000 11000 12000 13000 13867

Processing file pair /scratch/alexthi/nassima1/MI.M03555_0209.001.FLD0122.1-428_R1.fastq - /scratch/alexthi/nassima1/MI.M03555_0209.001.FLD0122.1-428_R2.fastq (files 26 of 96) <<<<<
Making contigs… 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000 11000 12000 13000 13870

Processing file pair /scratch/alexthi/nassima1/MI.M03555_0209.001.FLD0122.1-428_R1.fastq - /scratch/alexthi/nassima1/MI.M03555_0209.001.FLD0122.1-428_R2.fastq (files 26 of 96) <<<<<
Making contigs… 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000 11000 12000 13000 13870

Processing file pair /scratch/alexthi/nassima1/MI.M03555_0209.001.FLD0122.1-428_R1.fastq - /scratch/alexthi/nassima1/MI.M03555_0209.001.FLD0122.1-428_R2.fastq (files 26 of 96) <<<<<
Making contigs… 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000 11000 12000 13000 13868

Processing file pair /scratch/alexthi/nassima1/MI.M03555_0209.001.FLD0122.1-428_R1.fastq - /scratch/alexthi/nassima1/MI.M03555_0209.001.FLD0122.1-428_R2.fastq (files 26 of 96) <<<<<
Making contigs… 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000 11000 12000 13000 13868

Processing file pair /scratch/alexthi/nassima1/MI.M03555_0209.001.FLD0122.1-428_R1.fastq - /scratch/alexthi/nassima1/MI.M03555_0209.001.FLD0122.1-428_R2.fastq (files 26 of 96) <<<<<
Making contigs… 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000 11000 12000 13000 13869

Processing file pair /scratch/alexthi/nassima1/MI.M03555_0209.001.FLD0122.1-428_R1.fastq - /scratch/alexthi/nassima1/MI.M03555_0209.001.FLD0122.1-428_R2.fastq (files 26 of 96) <<<<<
Making contigs… 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000 11000 12000 13000 13871
Done.

It took 95 secs to assemble 110952 reads.


But the mothur log is OK:

Processing file pair /scratch/alexthi/nassima1/MI.M03555_0209.001.FLD0119.5-59_R1.fastq - /scratch/alexthi/nassima1/MI.M03555_0209.001.FLD0119.5-59_R2.fastq (files 23 of 96) <<<<<
Making contigs…
Done.

It took 119 secs to assemble 138131 reads.


>>>>> Processing file pair /scratch/alexthi/nassima1/MI.M03555_0209.001.FLD0120.5-147_R1.fastq - /scratch/alexthi/nassima1/MI.M03555_0209.001.FLD0120.5-147_R2.fastq (files 24 of 96) <<<<< Making contigs... Done.

It took 81 secs to assemble 94949 reads.


>>>>> Processing file pair /scratch/alexthi/nassima1/MI.M03555_0209.001.FLD0121.1-59_R1.fastq - /scratch/alexthi/nassima1/MI.M03555_0209.001.FLD0121.1-59_R2.fastq (files 25 of 96) <<<<< Making contigs... Done.

It took 109 secs to assemble 126390 reads.


>>>>> Processing file pair /scratch/alexthi/nassima1/MI.M03555_0209.001.FLD0122.1-428_R1.fastq - /scratch/alexthi/nassima1/MI.M03555_0209.001.FLD0122.1-428_R2.fastq (files 26 of 96) <<<<< Making contigs... Done.

It took 95 secs to assemble 110952 reads.

OK, this is where I am now with this problem.

I have changed my SLURM job to at least try to have it run with processors=1. At this point, I don't really care if it takes 2 days to run; I just want a way to analyze the data somewhere other than on our already overused computer.


#!/bin/bash
#SBATCH --account=def-xxx
#SBATCH --time=0-24:00
#SBATCH --mem=64G
#SBATCH --cpus-per-task=1
#SBATCH --job-name=nassima16S
#SBATCH --mail-user=xx@live.ca
#SBATCH --mail-type=ALL

export MKL_NUM_THREADS=$SLURM_CPUS_PER_TASK
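# Note: MKL_NUM_THREADS (above) only affects libraries linked against Intel MKL;
# as far as I can tell it does not change how many threads mothur itself uses,
# which is set by processors= in the batch file.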

mothur nassima1_final.batch


And it fails at the end of pre.cluster. After all my samples have been processed, Mothur tells me that a bunch of sequences are not in my count table, which is highly improbable since I am always using "current" and my batch file works perfectly fine on my computer. The job is canceled because of an error (pre.cluster failed), not because of a time issue.
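In case it is useful to anyone hitting the same mismatch, this is the kind of sanity check I can drop into the batch file right before pre.cluster (just a sketch relying on whatever set.current is pointing at; count.groups and summary.seqs are standard mothur commands):

count.groups(count=current)
summary.seqs(fasta=current, count=current)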

I am still trying a few things, but I am starting to run out of ideas.

I will post back only if I am somewhat successful.

Cheers

Are you using mothur v.1.39.5?

Hello and thanks for the reply!

Yep, I am using 1.39.5.

The computer cluster support team tried it on another computer system.

They were able to run it completely. So I now believe the problem lies in how Mothur was actually compiled on the system I am trying to use.

Another thing that I had not noticed: "modules" need to be loaded in the cluster environment for mothur to work, like vsearch and blast, so that might also have something to do with it.
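For reference, this is how I can see what is available on the cluster before loading anything (a sketch; Compute Canada uses Lmod, and the exact module names can differ between systems):

module spider mothur
module avail vsearch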

The other cluster also runs under PBS instead of Slurm.

Still under investigation, but some progress was made last Friday.

It finally ran but on a different server.

For those interested, I used Briarée at Calcul Québec. If you have a Canadian collaborator, you should have access to some Compute Canada resources for free (a basic allocation), which can be expanded through a request or a competition.

This is the job submission script I used. I was only testing 6 samples; they ran in about 2 minutes with processors=12!

#!/bin/bash

#PBS -A xxx
#PBS -l walltime=01:00:00
#PBS -l nodes=1:ppn=12
#PBS -l mem=4gb
#PBS -o output.txt
#PBS -j oe
#PBS -r n

module purge
module load mothur/1.39.5
module load vsearch
module load usearch
module load blast
module load blast+

cd $SCRATCH/data

mothur workshop1.batch
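For anyone new to PBS, I submit the script with qsub, something like this (assuming the script above is saved as workshop1.pbs; the file name is just my choice):

qsub workshop1.pbs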

Hope it helps.
