Slurm parameters to run Mothur in batch mode

Hi Mothur community!

I’m trying to set up a Slurm .sh file to run the cluster.split command of Mothur (Linux 7, version 1.46.1) on a cluster (9 nodes, 44 CPUs, 777 GB RAM).
I have tested two kinds of scripts so far: one that assumes a distributed-memory configuration and another that assumes a shared-memory configuration for multiprocessing.

Here is a part of the distributed memory configuration I tried:
[…]
#SBATCH -n 25
#SBATCH --ntasks-per-node=25
#SBATCH --mem=190000
[…]

All 25 processors and 140 GB of RAM (of the 190 GB reserved) were used during this run, but the cluster.split command failed with the following error message: [ERROR]: Could not open stability.trim.contigs.trim.good.unique.good.filter.unique.precluster.pick.renamed.pick.0.dist (the largest file).
This error showed up in the mothur logfile several times, while the other temp .dist files were processed without problems.

The shared memory configuration was the following:
[…]
#SBATCH -n 1
#SBATCH -c 25
#SBATCH --mem=190000
[…]

In this case, the cluster.split command only used 15 processors (of the 25 reserved) and 10 GB of RAM, and appeared to run slower. It was much slower when run on a smaller cluster (20 cores, 125 GB). That leads me to think that this configuration could be optimized.

Considering all this, my question is, what would be the best way to set up the slurm parameters?

Thank you,

Guillaume

Hi,

You’ll want to use the shared memory approach on a single node.

Pat
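A minimal sketch of what the single-node, shared-memory layout could look like, reusing the CPU and memory values from the question. The helper name is_shared_memory_layout is hypothetical; it only inspects the environment variables Slurm sets for the job (SLURM_JOB_NUM_NODES, SLURM_NTASKS, SLURM_CPUS_PER_TASK) and falls back to 1 when they are unset.

```shell
#!/bin/bash
#SBATCH --nodes=1            # keep the whole job on one node (shared memory)
#SBATCH --ntasks=1           # a single mothur process
#SBATCH --cpus-per-task=25   # CPUs available to that one process
#SBATCH --mem=190000         # MB, value taken from the question

# Hypothetical helper: succeeds only when the job is one task on one node.
is_shared_memory_layout() {
    local nodes="${SLURM_JOB_NUM_NODES:-1}"
    local tasks="${SLURM_NTASKS:-1}"
    [ "$nodes" -eq 1 ] && [ "$tasks" -eq 1 ]
}

if is_shared_memory_layout; then
    echo "OK: 1 node, 1 task, ${SLURM_CPUS_PER_TASK:-1} CPUs per task"
else
    echo "Warning: job spans several nodes or tasks" >&2
fi
```

With -n 25 the job asks for 25 separate tasks, whereas -n 1 -c 25 asks for one task with 25 CPUs, which is the layout recommended above.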

This is my sh file and it works fine

#!/bin/bash

#SBATCH --time=20:00:00
#SBATCH --account=def-blablabla
#SBATCH --mem=128000M
#SBATCH --mail-user=blabla@blablabla.ca
#SBATCH --cpus-per-task=32
#SBATCH --mail-type=ALL
#SBATCH --output=blablablar.out

cd $SCRATCH/blablabla

module purge

module load gcc/9.3.0
module load mothur/1.46.1
module load vsearch/2.15.2

mothur blablabla.batch

I set processors to 32 at the beginning of my batch file.
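One way to keep the processors value from drifting away from --cpus-per-task is to write it into the batch file at submission time. This is only a sketch: the function name with_processors and the output file name are hypothetical, and it assumes the original batch file does not already set processors itself.

```shell
#!/bin/bash
# Hypothetical helper: copy a mothur batch file and prepend a
# set.current(processors=N) line. N defaults to the Slurm allocation
# (SLURM_CPUS_PER_TASK), or to 1 when run outside Slurm.
with_processors() {
    local src="$1"
    local dst="$2"
    local n="${3:-${SLURM_CPUS_PER_TASK:-1}}"
    {
        printf 'set.current(processors=%s)\n' "$n"
        cat "$src"
    } > "$dst"
}

# Example usage inside the job script (file names are placeholders):
#   with_processors blablabla.batch blablabla.slurm.batch
#   mothur blablabla.slurm.batch
```

Any command in the batch file that passes its own processors option would still use that value, so this only helps when the count is set in one place.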

#!/bin/bash
#SBATCH --cpus-per-task=25
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --time=24:00:00
#SBATCH --partition=general
#SBATCH --qos=general
#SBATCH --mem=80G
#SBATCH --mail-type=ALL
#SBATCH -o PPCESS_%j.out
#SBATCH -e PPCESS_%j.err
#SBATCH --mail-user=lyouremail

But I think your memory and processors have to go hand in hand. On a server I use, you cannot get around the limit of 4 GB per processor.

Hello!

I am working on Compute Canada, and it works pretty well!

@leocadio this is what we’re using at UMichigan. I think the differences are the cpus-per-task and mem-per-cpu settings:

#SBATCH --mail-user=xxxxxxxxx@umich.edu
#SBATCH --mail-type=BEGIN,END
#SBATCH --cpus-per-task=8
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --mem-per-cpu=4g 
#SBATCH --time=100:00:00
#SBATCH --account=xxxxxxxxx
#SBATCH --partition=standard
#SBATCH --output=%x.o%A_%a

The limit on our server is 4G per CPU (which is also what @Alexandre_Thibodeau has: 128G for 32 CPUs), but we request it in bulk (not per CPU). I'm not sure whether that is what the people handling the system prefer, or whether both forms are allowed and they simply only included that option in the examples they distribute.
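The relation between the two ways of requesting memory is simple arithmetic, sketched here with the numbers from the scripts in this thread. The function names are hypothetical and the integer division rounds down.

```shell
#!/bin/bash
# Total memory implied by a per-CPU request, in GB.
total_mem_gb() {
    local cpus="$1"
    local per_cpu_gb="$2"
    echo $(( cpus * per_cpu_gb ))
}

# Per-CPU share of a bulk --mem request, in MB (rounded down).
per_cpu_mb() {
    local total_mb="$1"
    local cpus="$2"
    echo $(( total_mb / cpus ))
}

total_mem_gb 8 4        # --cpus-per-task=8 with --mem-per-cpu=4g: prints 32
per_cpu_mb 128000 32    # --mem=128000M with --cpus-per-task=32: prints 4000
```

So a bulk request of 128000M for 32 CPUs and a per-CPU request of about 4G describe roughly the same allocation; which form a site accepts is up to its administrators.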