I’m doing some analyses on a fairly large dataset (about 20 GB of raw sequences obtained using the Kozich protocol). I did a first run of the mothur SOP and had no problem. However, when I repeated the analysis (again, following the SOP), it wouldn’t finish classifying the sequences right after chimera removal.
This is the command as it appears in the generated logfile:
mothur > classify.seqs(fasta=/home/jdelacuesta/microbio/output/microbio.trim.contigs.good.unique.good.filter.unique.precluster.pick.fasta, count=/home/jdelacuesta/microbio/output/microbio.trim.contigs.good.unique.good.filter.unique.precluster.denovo.uchime.pick.count_table, reference=/home/jdelacuesta/microbio/db/trainset14_032015.pds.fasta, taxonomy=/home/jdelacuesta/microbio/db/trainset14_032015.pds.tax, cutoff=80, processors=1)

Using 1 processors.
Reading template taxonomy...     DONE.
Reading template probabilities...     DONE.
It took 121 seconds get probabilities.
Classifying sequences from /home/jdelacuesta/microbio/output/microbio.trim.contigs.good.unique.good.filter.unique.precluster.pick.fasta ...
and this is the stderr:
/var/spool/torque/mom_priv/jobs/15787.apolo.eafit.edu.co.SC: line 1: !/bin/bash: No such file or directory
TERM environment variable not set.
/var/spool/torque/mom_priv/jobs/15787.apolo.eafit.edu.co.SC: line 13: 10491 Segmentation fault    mothur taxonomia.batch
After some cursing, I decided to re-upload the RDP reference files, thinking they might have somehow been corrupted. That's when I noticed several new files, byproducts of the first time I executed the analysis:
trainset14_032015.pds.8mer
trainset14_032015.pds.trainset14_032015.pds.8mer.numNonZero
trainset14_032015.pds.trainset14_032015.pds.8mer.prob
trainset14_032015.pds.tree.sum
trainset14_032015.pds.tree.train
Once those files were removed, everything came back to normal and I was able to finish the run.
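For anyone hitting the same wall, this is roughly the cleanup I did, written as a small shell sketch. The file names are the ones I listed above; the `clean_classify_cache` function name and the `DB_DIR` variable are just placeholders I made up for illustration, not anything mothur provides.

```shell
#!/bin/bash
# Sketch: delete the derived template files that appeared in the reference
# directory after the first classify.seqs run, so the next run rebuilds them
# from the original .fasta/.tax files instead of reusing stale copies.
# "clean_classify_cache" and DB_DIR are hypothetical names for this example.

clean_classify_cache() {
    local db_dir="$1"
    # -f: ignore files that are already gone
    rm -f "$db_dir"/trainset14_032015.pds.8mer \
          "$db_dir"/trainset14_032015.pds.trainset14_032015.pds.8mer.numNonZero \
          "$db_dir"/trainset14_032015.pds.trainset14_032015.pds.8mer.prob \
          "$db_dir"/trainset14_032015.pds.tree.sum \
          "$db_dir"/trainset14_032015.pds.tree.train
}

# Example: DB_DIR points at the directory holding the RDP reference files.
DB_DIR="${DB_DIR:-/home/jdelacuesta/microbio/db}"
clean_classify_cache "$DB_DIR"
```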
What are those files, why do they trip up mothur (or my computer), and would it be possible for future releases to remove them automatically when mothur quits?