Glibc error with pre.cluster

I’m analyzing a dataset with 12 groups and on the order of half a million sequences. When I get to the pre.cluster step, I have to scale the number of processors way down to get it to complete. For example, on a 48-core system with 1.5 TB of RAM, if I try to use all 48 processors I get:

*** glibc detected *** mothur: double free or corruption (fasttop): 0x00002ae840003280 ***
======= Backtrace: =========
/opt/glibc-2.14/build/lib/libc.so.6(+0x7374e)[0x2ae81440c74e]
/opt/glibc-2.14/build/lib/libc.so.6(cfree+0x6c)[0x2ae81441074c]
/act/gcc-4.9.2/lib64/libstdc++.so.6(_ZNSs6assignERKSs+0x87)[0x2ae813a94107]
mothur[0x41b1e3]
mothur[0x85064c]
mothur[0xcbcaa9]
mothur[0x13d991f]
/opt/glibc-2.14/build/lib/libpthread.so.0(+0x6e2b)[0x2ae814182e2b]
/opt/glibc-2.14/build/lib/libc.so.6(clone+0x6d)[0x2ae81447131d]
======= Memory map: ========
00400000-016c0000 r-xp 00000000 00:13 32755946 /gpfs0/export/opt/mothur/1.42.3/mothur
018bf000-018c1000 r--p 012bf000 00:13 32755946 /gpfs0/export/opt/mothur/1.42.3/mothur
018c1000-018c7000 rw-p 012c1000 00:13 32755946 /gpfs0/export/opt/mothur/1.42.3/mothur

With fewer processors, the analysis gets further before crashing, e.g. with 20:

Using 20 processors.
Reducing processors to 12.

/******************************************/
Running command: split.groups(groups=lbc10, fasta=combined.good.unique.good.good.filter.unique.fasta, count=combined.good.unique.good.good.filter.count_table)

/******************************************/
Running command: split.groups(groups=lbc9, fasta=combined.good.unique.good.good.filter.unique.fasta, count=combined.good.unique.good.good.filter.count_table)

/******************************************/
Running command: split.groups(groups=libc11, fasta=combined.good.unique.good.good.filter.unique.fasta, count=combined.good.unique.good.good.filter.count_table)

/******************************************/
Running command: split.groups(groups=libc12, fasta=combined.good.unique.good.good.filter.unique.fasta, count=combined.good.unique.good.good.filter.count_table)
/var/spool/slurmd/job1933416/slurm_script: line 101: 23939 Aborted (core dumped) mothur mothurpacbio-multisample.sop

The analysis only completes if I drop all the way down to 2 processors. At no point does RAM consumption come anywhere near the available RAM. Any ideas why this would happen?

I think this is related to a race condition we found in the current file class. Are you seeing this with the 1.43.0 beta version?
