How to speed up the process using an HPC cluster for OTU allocation

macoba · November 13, 2022, 4:35am

Good evening everyone, I have a question that maybe you can help me with. I am analyzing data for the V3-V4 region. (I have already read "Why do I have such a large distance matrix", which explains why these two regions should not be sequenced; however, these are the data we have at the moment, so we are looking for some preliminary results.) I have been running the analyses on an HPC cluster, but following the OTUs section of the MiSeq tutorial with at least 2 million sequences takes much longer than I think it should on a high-performance cluster. The command I use is cluster.split, but there comes a point where it hangs and does not advance any further. I tried the alternative, dist.seqs, but this took much longer. I run the job on a node with 256GB of RAM and 256 processors. We can also reserve GPUs on the cluster, but apparently this does not speed up the process either.

I would like to know if there is any way to set this up so that this stage progresses faster. I have attached an image of the resources that we can reserve on the cluster.

pschloss · November 21, 2022, 3:57pm

Hi - the problem you are running into is why I strongly advise against sequencing the V3-V4 region. You can probably (1) wait longer (i.e., days) to see if it finishes, (2) use a larger diffs value in pre.cluster, (3) use taxlevel=6 in cluster.split, or (4) use the classification-based approach described in the SOP.
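
For reference, options (2) to (4) might look roughly like the following in a mothur batch file. This is only a sketch: the file names (mydata.fasta, mydata.count_table, final.fasta, final.count_table, final.taxonomy) are placeholders in the style of the MiSeq SOP, not files from this thread, and diffs=4 and cutoff=0.03 are illustrative values rather than recommendations from the reply above.

```
# (2) Allow more differences when pre-clustering, so fewer unique
#     sequences reach the distance calculation. diffs=4 is illustrative.
#     The downstream steps (chimera removal, classification) would need
#     to be rerun on the new output files.
pre.cluster(fasta=mydata.fasta, count=mydata.count_table, diffs=4)

# (3) Split by a finer taxonomic level before clustering, so each
#     distance matrix is smaller. File names are assumed placeholders.
cluster.split(fasta=final.fasta, count=final.count_table, taxonomy=final.taxonomy, taxlevel=6, cutoff=0.03)

# (4) Classification-based approach: bin sequences into phylotypes by
#     taxonomy, which avoids computing distances altogether.
phylotype(taxonomy=final.taxonomy)
```

These are alternatives rather than steps to run in sequence; in particular, option (4) replaces the cluster.split step instead of following it.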

Hope these suggestions help a bit
Pat
