RAM issue with clustering OTUs

Abi_Gault · January 21, 2021, 12:27pm

Hi,

Just looking for some advice.

I have 32 GB of RAM and am running mothur 1.44.3 on Windows 10 (64-bit).

I’m running 266 fastq files totalling 5 GB of data, and I don’t have much of a feel for how much RAM this should need. I’m following the MiSeq SOP, and I run out of RAM at the cluster step, after dist.seqs.

The logfile suggests mothur reaches around 31 GB about 12-18 hours in and then crashes (I imagine Windows is using the rest of the RAM). After seeing it recommended, I tried using 1 processor instead of 12; that run started at 4 GB but rapidly climbed to 31 GB and crashed as before. Does anyone have any suggestions on how to fix this?

Many thanks in advance!

pschloss · January 21, 2021, 6:58pm

Are you sequencing the V4 region or something else? You might want to consult this thread: "Why do I have such a large distance matrix"
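
As a rough illustration of why the distance matrix can outgrow RAM: the number of pairwise distances grows with the square of the number of unique sequences. The sketch below only does that arithmetic. The function names and the `bytes_per_distance` and `kept_fraction` values are assumptions made for illustration and do not describe how mothur actually stores distances.

```python
def pairwise_distance_count(n_unique: int) -> int:
    """Number of unordered pairs among n_unique sequences: n * (n - 1) / 2."""
    return n_unique * (n_unique - 1) // 2


def estimated_matrix_gb(n_unique: int,
                        bytes_per_distance: int = 16,
                        kept_fraction: float = 1.0) -> float:
    """Rough size in GB of the stored distances.

    bytes_per_distance and kept_fraction are illustrative assumptions,
    not mothur's real storage format. kept_fraction stands for the share
    of pairs that survive a distance cutoff.
    """
    pairs = pairwise_distance_count(n_unique)
    return pairs * kept_fraction * bytes_per_distance / 1e9


if __name__ == "__main__":
    # Hypothetical counts of unique sequences, chosen only to show the scaling.
    for n in (10_000, 100_000, 1_000_000):
        print(f"{n:>9} uniques -> {pairwise_distance_count(n):>15} pairs, "
              f"~{estimated_matrix_gb(n):.1f} GB if every pair were kept")
```

Ten times as many unique sequences means roughly a hundred times as many distances, which is why sequencing errors that inflate the number of uniques matter so much at this step.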

leocadio · January 26, 2021, 4:22pm

Have you tried running pre.cluster instead of dist.seqs and then cluster? The results should be very similar, and you might be able to do it with that much RAM. Part of the problem is also that Windows is not very good at handling RAM, so it ends up running out of memory.

Abi_Gault · January 27, 2021, 10:28am

Thank you both so much for your help, and apologies for my lack of knowledge. I’m still quite new to mothur.

It is V4 only, but it is already-published data that I am trying to reanalyse, so I did not generate it myself. (The plan is to combine it with similar published datasets in the future, analyse them all together, and hopefully pick out organisms of interest.)

I ran pre.cluster first, which ran fine, though it took a little while. After reading Pat’s thread I have been running the phylotype-based analysis since Monday, and it hasn’t crashed so far.

I went back to the original paper to try to work out why it might be having such issues and realised that this dataset was run on a HiSeq-optimised panel rather than on an Illumina MiSeq, so the data are paired 2 x 151 bp reads rather than 2 x 250 bp. Could this be causing the issue with RAM?
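
To make the read-length question concrete, here is a minimal sketch of how much of the amplicon would be covered by both reads of a pair. It assumes a V4 amplicon of about 253 bp and reads that run the full stated length from each end; the amplicon length and the function name are assumptions for illustration, and the real figure depends on the primers and any trimming.

```python
def expected_overlap(read_length: int, amplicon_length: int) -> int:
    """Bases covered by both the forward and reverse read.

    Assumes each read starts at one end of the amplicon and runs for
    read_length bases. Returns 0 when the two reads do not meet, and
    never more than the amplicon length itself.
    """
    overlap = 2 * read_length - amplicon_length
    return max(0, min(amplicon_length, overlap))


if __name__ == "__main__":
    V4_LENGTH = 253  # assumed amplicon length, for illustration only
    for read_length in (151, 250):
        overlap = expected_overlap(read_length, V4_LENGTH)
        share = overlap / V4_LENGTH
        print(f"2 x {read_length} bp: {overlap} bp overlap "
              f"({share:.0%} of the amplicon read twice)")
```

Under these assumptions the 2 x 151 bp reads overlap by only about 49 bp, while 2 x 250 bp reads overlap across almost the whole region. Bases read only once cannot be checked against the other read, so more sequencing errors are likely to survive, which would mean more unique sequences and a larger distance matrix.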

Also, would this mean that I can’t merge this dataset in mothur with MiSeq datasets targeting the V4 region that are 2 x 250 bp? If so, I would drop this study from the ones I plan to analyse, which would make the RAM issue moot.

Thanks
Abi

