trim.seqs running out of memory?

Hi All,

I am encountering a problem during trim.seqs (version 1.31.2, 64-bit, MPI-enabled). I am splitting Ion Torrent reads labeled with dual barcodes (one at the start of the read and one at the end) using this command:

trim.seqs(fasta=nofilter.fasta, qfile=nofilter.qual, oligos=steph_oligos, checkorient=T, maxhomop=8, maxambig=2, qaverage=20, qwindowaverage=20, processors=10, minlength=200, bdiffs=1, pdiffs=2)
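
In case the oligos layout is relevant: steph_oligos uses the paired-barcode format, roughly like the sketch below (the primer and barcode sequences here are placeholders, not my real ones; only the group names match the counts further down):

primer CCTACGGGNGGCWGCAG GACTACHVGGGTATCTAATCC
barcode ACGAGTGCGT CGTGTCTCTA site10_sample1
barcode ACGCTCGACA CTCGCGTGTC site10_sample2
barcode ACGTACGTCA CTACTATCGC site10_sample3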

I occasionally see errors like this while the program is running:

[ERROR]: std::bad_alloc has occurred in the QualityScores class function trimQScores. This error indicates your computer is running out of memory. This is most commonly caused by trying to process a dataset too large, using multiple processors, or a file format issue. If you are running our 32bit version, your memory usage is limited to 4G. If you have more than 4G of RAM and are running a 64bit OS, using our 64bit version may resolve your issue. If you are using multiple processors, try running the command with processors=1, the more processors you use the more memory is required. Also, you may be able to reduce the size of your dataset by using the commands outlined in the Schloss SOP, http://www.mothur.org/wiki/Schloss_SOP. If you are uable to resolve the issue, please contact Pat Schloss at mothur.bugs@gmail.com, and be sure to include the mothur.logFile with your inquiry.

I also see errors like these when the program is appending the output files from each process:

Appending files from process 5839
Appending files from process 5840
Appending files from process 5841
[ERROR]: Could not open /data2/sgeiboverflow/FungusMetagenomics/nofilter.fasta5841.num.temp
Appending files from process 5842
Appending files from process 5843
Appending files from process 5844
[ERROR]: Could not open /data2/sgeiboverflow/FungusMetagenomics/nofilter.fasta5844.num.temp
Appending files from process 5845
Appending files from process 5846
Appending files from process 5847
[ERROR]: Could not open /data2/sgeiboverflow/FungusMetagenomics/nofilter.fasta5847.num.temp

Group count:
site10_sample1 28480
site10_sample10 28556
site10_sample2 70524
site10_sample3 45932
site10_sample4 23900
site10_sample5 41919
site10_sample6 31779
site10_sample7 52507
etc…

If I proceed anyway, I get errors about sequences being in my group file but not in my fasta file. This machine has 250 GB of RAM, the fasta/qual files contain only 7 million sequences, and there are no other processes running, so I don't think the machine can really be running out of memory. Changing the number of processors (more or less) does not fix this error.

Any ideas about what could be causing this? Thanks so much for your help!

Erin

The "[ERROR]: Could not open /data2/sgeiboverflow/FungusMetagenomics/nofilter.fasta5847.num.temp" and file-mismatch errors are related to the bad_alloc error. After the bad_alloc, the process dies and does not finish processing its section of sequences. It never creates the nofilter.fasta5847.num.temp file, and the missing sequences are the ones it did not finish processing.
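
If you want to reconcile the fasta and group files in the meantime rather than rerunning right away, something along these lines should work. This is just a sketch assuming the default trim.seqs output names; adjust to match your actual files:

list.seqs(fasta=nofilter.trim.fasta)
get.seqs(accnos=nofilter.trim.accnos, group=nofilter.groups)

That said, the real fix is getting trim.seqs to finish cleanly, not trimming down the group file.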

Trim.seqs is not a command with MPI-enabled source, so in an MPI build the main process runs the command while the child processes wait. Have you tried it with our prebuilt version? Have you tried it with processors=1? You could rule out a file issue by running it with 1 processor and the debug flag set:

set.dir(debug=t)
trim.seqs(… processors=1)
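
A quick summary.seqs pass over the input is another cheap way to catch malformed records (just a sanity check, not a fix):

summary.seqs(fasta=nofilter.fasta, processors=1)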

Thanks for the suggestions! It also crashes when I set processors=1. I tried both your pre-compiled 64-bit version for Linux and the version I compiled on the cluster, and it crashes no matter which binary I use. I also tried it on a few different machines to make sure it wasn't an issue with the computer (bad RAM or something); same behavior. I tracked RAM usage while the program was running, and it only uses 3.5 to 4.0 GB at most, so the machine isn't really running out of memory. It seems like the 64-bit version of the program should be able to use more RAM if it is available?

I tried to run it in debug mode to make sure it wasn't a file format issue, but it was taking its sweet time! Maybe I will let it run overnight and see what happens. Thanks!

Erin