Memory leak


I'm running mothur on a cluster (though not with MPI, since the cluster is not fully set up). I'm running on the master node using 6-12 processors (out of 72), with 80 GB of RAM. However, when I run the shhh.flows command, the process is eventually killed due to memory shortage (see below).

214700 56915 56678.4
214738 56934 56697.2

Total time: 56934 58048.9

Clustering flowgrams...
Reading matrix:     ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Mar 12 03:40:23 biocluster kernel: sge_qmaster invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
Mar 12 03:40:23 biocluster kernel: [<ffffffff8111400a>] ? oom_kill_process+0x8a/0x2c0
Mar 12 03:40:23 biocluster kernel: Out of memory: Kill process 28047 (mothur) score 982 or sacrifice child
Mar 12 03:40:23 biocluster kernel: Killed process 28047, UID 509, (mothur) total-vm:134438668kB, anon-rss:81057244kB, file-rss:8kB

Is there a way to limit the amount of memory mothur uses? Any other advice would also be appreciated. Thanks!

It looks like you have 214738 unique flowgrams… I find this very hard to believe. How are you running trim.flows? Hopefully you are running this analysis with trim.flows using minflows=450 and maxflows=450 (the defaults) and are splitting the flowgrams up by sample. If not, you are likely in for headaches like the one you've got here. MPI won't help, and I suspect all the RAM in the world won't help you in this case.


Hi, thanks for your reply!

Yeah, I'm running:

trim.flows(flow=HX1JDSX01.flow, processors=12, minflows=360, maxflows=720)
shhh.flows(flow=HX1JDSX01.trim.flow, processors=12, lookup=LookUp_Titanium.pat)
  1. You need to set minflows=450, maxflows=450. We showed in the PLoS ONE paper that 360/720 does little to nothing to improve data quality, and it makes the data a lot harder to process.

  2. Do you only have one sample in there? If not, you could really help yourself by including an oligos file so that trim.flows splits the flowgrams by sample, and shhh.flows can then denoise each sample separately instead of all 214738 flowgrams at once.
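Putting both suggestions together, the adjusted commands would look something like the sketch below. The oligos filename is hypothetical (substitute your own barcode/primer file), and the `.flow.files` name assumes trim.flows follows its usual output naming when an oligos file is supplied:

```
trim.flows(flow=HX1JDSX01.flow, oligos=HX1JDSX01.oligos, minflows=450, maxflows=450, processors=12)
shhh.flows(file=HX1JDSX01.flow.files, processors=12, lookup=LookUp_Titanium.pat)
```

With the oligos file, trim.flows writes one trimmed flow file per sample plus a `.flow.files` list; passing that list to shhh.flows via file= lets it work on one sample at a time, which keeps the distance matrix, and therefore memory use, far smaller.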