dist.seqs, cluster.split and split.abund taking a long time

Hello!

I have soil and leaf samples from the plant. I processed the soil and leaf sequences separately, and I was able to run all the commands to get the leaf data. However, I’m stuck at the dist.seqs command for the soil samples. I submit my commands as a Slurm job on the university's supercomputer.
This is the job script I submit:

#!/bin/bash

#SBATCH --export=NONE               # do not export current env to the job
#SBATCH --job-name=dist.seqs        # job name
#SBATCH --time=21-00:00:00          # max job run time 21 days
#SBATCH --partition=xlong           # partition used for jobs 21 days
#SBATCH --ntasks-per-node=1         # tasks (commands) per compute node
#SBATCH --cpus-per-task=48          # CPUs (threads) per command
#SBATCH --mem=360G                  # total memory per node
#SBATCH --output=dist.seqs.%x.%j       # save stdout to file
#SBATCH --error=dist.seqs.%x.%j        # save stderr to file

module load GCC/10.3.0
module load OpenMPI/4.1.1
module load Mothur/1.48.0

I submitted a batch job for dist.seqs (cutoff=0.03) with a 21-day time limit and it didn’t complete. I then submitted cluster.split (taxlevel=4, cutoff=0.03), and it didn’t complete either.
Lastly, I submitted a job for split.abund (cutoff=9) with a 21-day time limit, and it didn’t work.

Does anybody have any suggestions?
Thanks for all your help!

Hi there -

What region are you sequencing? What does your pipeline look like? I worry that you might be running into this problem…

Pat


I would bet that it’s a storage space issue. If there is not enough storage space on the computing cluster you’re using, the .dist files that mothur creates will be too large for the amount of space that you’re given.
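
One rough way to check this is to look at the free space on the filesystem where mothur writes its output, and at how large any .dist files have already grown. The sketch below uses only standard tools (df, find, du). The function name check_dist_space is made up for illustration, and the directory you pass in is whatever your job's working directory happens to be. Many clusters also enforce per-user quotas that df does not show, so your cluster's own quota tool may be worth checking too.

```shell
#!/bin/bash
# Sketch only: report free space and the size of any .dist files in a directory.
# check_dist_space is a hypothetical helper name; the directory argument is a
# placeholder for wherever mothur writes its output.
check_dist_space() {
    local dir="${1:-.}"
    # Free space on the filesystem that holds the directory
    df -h "$dir"
    # Size of each distance file in that directory, if any exist yet
    find "$dir" -maxdepth 1 -name '*.dist' -exec du -h {} +
}

# Default to the current directory when no argument is given
check_dist_space "${1:-.}"
```

If the filesystem is nearly full, or a .dist file is close to your allocation, that would point to storage rather than run time as the limit.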
