pre.cluster only uses 1 processor


I’ve been running into this problem with one particular data set for a while: the pre.cluster command has been running for more than 10 days and still has not finished, on a data set with only 12 samples. I tried both a local workstation and our university's HPC, and it is equally slow on both. I am using mothur 1.39.0.

The data set targets ITS1 for a fungal community survey. All the parameters and commands I used for this data set were the same as the ones I used before (also ITS1 for a fungal community). Previously, all the commands ran through without a problem (with mothur 1.38.1).

One possible explanation is that pre.cluster used only 1 core on the HPC node according to the HPC log file, even though mothur reported “using 12 processors” while pre.cluster was running.

What can I do to speed this up, please?

Thank you so much!

The number of processors in use can vary while the command runs: some parts of the command run on only one processor, while other parts use all 12. How many sequences are in your dataset?
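If it helps with answering that question, here is a minimal Python sketch for counting the sequences in a FASTA file. It assumes plain, uncompressed FASTA in which every record starts with a `>` header line; the function name `count_fasta_records` is made up for this example and is not part of mothur.

```python
import sys


def count_fasta_records(lines):
    """Count FASTA records by counting header lines that start with '>'."""
    return sum(1 for line in lines if line.startswith(">"))


if __name__ == "__main__":
    # Pass one or more FASTA paths on the command line (hypothetical usage).
    for path in sys.argv[1:]:
        with open(path) as handle:
            print(path, count_fasta_records(handle))
```

Running it on the fasta file from each step shows how much every step reduces the data before pre.cluster.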

It has 1,357,056 raw seqs, 930,737 seqs in xx.trim.contigs.good.fasta, and 211,068 seqs in xx.trim.contigs.good.unique.fasta.

There are 12 samples in the dataset.
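To put those counts in perspective, the following Python sketch works out how much each step reduced the data and gives a rough upper bound on the work if every unique sequence had to be compared against every other one. That all-against-all bound is an assumption made for illustration only, not a description of what pre.cluster does internally, and the helper name `worst_case_pairs` is made up for this example.

```python
def worst_case_pairs(n):
    """Number of distinct pairs among n items: n * (n - 1) / 2."""
    return n * (n - 1) // 2


# Counts reported in this thread.
raw, good, unique = 1_357_056, 930_737, 211_068

print(f"kept after screening: {good / raw:.1%}")
print(f"unique among kept:    {unique / good:.1%}")
# Upper bound only; assumes an all-against-all comparison (illustrative).
print(f"worst-case pairs:     {worst_case_pairs(unique):,}")
```

Because the bound grows with the square of the number of unique sequences, halving the unique count would cut this worst case to roughly a quarter.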