Mothur quit during chimera.uchime

Hello,
My chimera analysis ran for the entirety of yesterday. I came in this morning and is said Mothur unexpectedly quit. The newest files created are
stabilityp1.trim.contigs.good.unique.good.filter.good.precluster.denovo.uchime.accnos.byCount.1.temp
stabilityp1.trim.contigs.good.unique.good.filter.good.precluster.denovo.uchime.accnos1.temp
stabilityp1.trim.contigs.good.unique.good.filter.good.precluster.denovo.uchime.chimeras1.temp
stabilityp1.trim.contigs.good.unique.good.filter.good.precluster.denovo.uchime.accnos.byCount
stabilityp1.trim.contigs.good.unique.good.filter.good.precluster.denovo.uchime.accnos
stabilityp1.trim.contigs.good.unique.good.filter.good.precluster.denovo.uchime.chimeras

Because there are temp files, am I to assume that the analysis didn’t finish? If so, is there a way to restart it from where it left off? From the output on the screen it looks like it got through 8 or 9 of my groups. I have 10 total.

I suspect it ran out of RAM. You might try decreasing the number of processors you are using.
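For example, assuming the precluster output names from your file listing above (substitute your actual fasta and count_table names), the rerun with fewer processors would look something like:

```
chimera.uchime(fasta=stabilityp1.trim.contigs.good.unique.good.filter.good.precluster.fasta, count=stabilityp1.trim.contigs.good.unique.good.filter.good.precluster.count_table, dereplicate=t, processors=8)
```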

Pat

Thank you for the suggestion. I changed processors=12 to processors=8. It's running right now, but I suspect it will give me an error again, because I noticed an anomaly after I sent my last question to the forum. I was hoping that redoing the precluster step would solve it, but I don't think it has.

Looking at the various temp files being written during the chimera.uchime step, it appears one of the groups is being skipped. My working directory contains the outputs of the precluster: ten .map files corresponding to my ten groups, the precluster.count_table file, and the fasta file. There are a bunch of temp files for the chimera results right now, but of the ones ending in .good.precluster.tempx.temp, there isn't one for my group number 4. There is, however, one for group 4 among the files ending in .denovo.uchime.chimerasx.temp, .denovo.uchime.accnosx.temp, and .denovo.uchime.accnos.byCountx.temp. I noticed the same thing in the previous aborted run.

(As an aside, I have found the output and subsequent input files to be named slightly differently than what the MiSeq SOP specifies. The MiSeq SOP calls for pre.cluster(fasta=stability.trim.contigs.good.unique.good.filter.unique.fasta, count=stability.trim.contigs.good.unique.good.filter.count_table), but the files I have would require pre.cluster(fasta=stability.trim.contigs.good.unique.good.filter.good.fasta, count=stability.trim.contigs.good.good.good.count_table). Anytime the MiSeq SOP refers to a file with more than one "unique" in the name, the instances after the first one are always "good" in my files. Also, my count table didn't have "filter" in its name.)

Update: it did indeed give me an error. I restarted AGAIN with processors=4. It quit on me AGAIN. This time, it only made files for groups 1 through 3; temp files were never created for groups 4 through 10. I wonder if there is something wrong with my group 4, or if I'm using the wrong inputs, or if there is a way to run this step per group.

I’m still getting strange behavior from chimera.uchime.

I'm running it on a Windows machine, on a hard drive containing mothur and my files and nothing else. This computer has 32 processors and 128 GB of RAM, and no other analyses are running on it in tandem. I started another run of chimera.uchime using 8 processors, and when I returned to the computer ~2 hours later, there was a window up saying "mothur.exe has stopped working. A problem has caused the program to stop working correctly. Please close the program." However, I have not closed it, because the terminal still shows activity: there is output coming across the screen, and CPU usage still shows activity. Only 14.5 GB of the 128 GB of RAM is being used. Why is there an error message saying mothur stopped working if there is still activity on the screen? The last logfile for mothur was generated 2 hours ago. It says "it took 7501 sec to check 24868 sequences from group Bathym34_1." That's it. What scrolls across the screen is really strange too. For example:

03:39:5023 : 3 99:55.17 % 49810.16/%3 62011561 /c3h4i5m0e1r acsh ifmoeurnads
03:39:54 0 39:53.97:%5 34 8 0 19/53.670%1 94 5c7h7i/m3e5r3a2s7 fcohuinmde r(a1

And yet again, temp files weren't made for all my groups; this time, it looks like 7 were made. I will abort this run and restart with one processor. However, a colleague of mine ran a mothur analysis on this computer with no problem, though his dataset was smaller than mine. This makes me think there is something wrong with my data. I redid the precluster step (the one immediately preceding the chimera step according to the MiSeq SOP) when one of the earlier chimera attempts failed, but I guess it didn't help. The run I am about to abort will be the 5th time I've tried this step. Any guidance would be appreciated; at this point I need it to progress.

The uchime program is 32-bit, which means it can only use 4 GB of RAM. I suspect that group 4 requires more memory than that, which causes uchime to crash; when this happens, mothur crashes or the command fails. The chimera.vsearch command would be a good option if you have access to a Mac or Linux machine. We are working with the vsearch developers to add vsearch to mothur for our Windows users in our next release.
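On a Mac or Linux build, chimera.vsearch takes the same inputs as chimera.uchime. Assuming the same precluster fasta and count_table names as in your listing (adjust to your actual filenames), the call would look something like:

```
chimera.vsearch(fasta=stabilityp1.trim.contigs.good.unique.good.filter.good.precluster.fasta, count=stabilityp1.trim.contigs.good.unique.good.filter.good.precluster.count_table, dereplicate=t)
```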