RAM issue with clustering OTUs


Just looking for some advice.

I have 32 GB of RAM and am running mothur 1.44.3 on Windows 10 (64-bit).

I’m running 266 fastq files totalling 5 GB of data, so I don’t have much of a feel for how much RAM this should need. Following the MiSeq SOP, I run out of RAM at the cluster stage of my analysis, after dist.seqs.

The logfile suggests mothur gets to around 31 GB about 12-18 hours in and then crashes (I imagine Windows is using the rest of the RAM). After seeing it recommended, I tried using 1 processor instead of 12; that run started at 4 GB but then rapidly escalated to 31 GB and crashed like before. Does anyone have any suggestions on how to fix this?

Many thanks in advance!

Are you sequencing the V4 region or something else? You might want to consult this… Why do I have such a large distance matrix
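For a rough feel of why the cluster step can exhaust RAM, here is a minimal sketch that counts the pairwise distances among unique sequences. The number of pairs grows with the square of the number of unique sequences. The 8 bytes per stored distance and the `kept_fraction` parameter (the share of pairs that survive a distance cutoff) are illustrative assumptions, not mothur's actual storage format, and the sequence count in the example is made up.

```python
def pairwise_count(n_unique: int) -> int:
    """Number of distinct sequence pairs among n_unique sequences."""
    return n_unique * (n_unique - 1) // 2


def matrix_gib(n_unique: int, bytes_per_distance: int = 8,
               kept_fraction: float = 1.0) -> float:
    """Rough memory estimate in GiB for the stored distances.

    bytes_per_distance and kept_fraction are assumptions made for
    illustration only; they do not describe mothur's internals.
    """
    stored = pairwise_count(n_unique) * kept_fraction
    return stored * bytes_per_distance / 2**30


if __name__ == "__main__":
    # Hypothetical number of unique sequences, not taken from this dataset.
    n = 100_000
    print(f"{pairwise_count(n):,} pairs, about {matrix_gib(n):.1f} GiB if all are kept")
```

Doubling the number of unique sequences roughly quadruples the number of pairs, so reducing unique sequences before dist.seqs matters far more than the raw file size does.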

Have you tried running pre.cluster instead of dist.seqs followed by cluster? The results should be very similar, and you might be able to do it with that much RAM. Part of the problem is that Windows is not that good at handling RAM, so it ends up running out of memory.

Thank you both so much for your help, and apologies for my lack of knowledge - I’m still quite new to mothur.

It is V4 only, but it is already published data that I am trying to reanalyse, so I did not generate it myself. The plan is to combine it with similar published datasets in the future, analyse them all together, and hopefully pick out organisms of interest.

I ran pre.cluster first, which ran fine, if a little slowly. After reading Pat’s thread I have been running the phylotype-based analysis since Monday, and it hasn’t crashed so far.

I went back to the original paper to try to work out why it might be having such issues, and realised this dataset was run on a HiSeq optimised panel instead of an Illumina MiSeq, so the data is paired as 2 x 151 bp reads rather than 2 x 250 bp. Could this be causing the issue with RAM?

Also, would this mean that I can’t merge it in mothur with MiSeq datasets targeting the V4 region that are 2 x 250 bp? If that is the case, I would drop this study from the ones I plan to analyse, which would make the RAM issue moot.
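To put the read-length question in numbers, here is a small sketch of how much the forward and reverse reads overlap for each run type. The amplicon length of 253 bp is an assumption (a commonly quoted size for V4 after primer removal) and is not stated anywhere in this thread; substitute the actual length for the primers used in the paper.

```python
def overlap_bp(read_len: int, amplicon_len: int) -> int:
    """Bases covered by both reads of a pair, assuming full-length reads."""
    return max(0, 2 * read_len - amplicon_len)


def double_covered_fraction(read_len: int, amplicon_len: int) -> float:
    """Fraction of the amplicon that is read in both directions."""
    return min(1.0, overlap_bp(read_len, amplicon_len) / amplicon_len)


if __name__ == "__main__":
    V4_LEN = 253  # assumed amplicon length, not taken from the thread
    for read_len in (151, 250):
        ov = overlap_bp(read_len, V4_LEN)
        frac = double_covered_fraction(read_len, V4_LEN)
        print(f"2 x {read_len} bp: {ov} bp overlap ({frac:.0%} of amplicon)")
```

Under that assumption, 2 x 151 bp reads overlap by only 49 bp, while 2 x 250 bp reads overlap by 247 bp. If fewer bases are read twice, fewer sequencing errors can be corrected when the pairs are assembled, which may leave more unique sequences and a larger distance matrix.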

