I am working with about 34K sequences, that are trimmed of oligos and all range between 250 to 560bp size, no ambiguities, and therefore I jumped into unique option to obtain a non-redundant set of sequences. With this I reduce my dataset to about 32.8K, which is not impressive. Am I skipping something?
I’ve aligned the file to see how it works, and then hopping to use the chimera slayer, the mothur comes back with an error:
mothur > chimera.slayer(fasta=hgut.align, template=silva.bacteria.fasta)
Checking sequences from hgut.align …
Reading sequences from silva.bacteria.fasta…Done.
Reading sequences from hgut.align…[ERROR]: St9bad_alloc has occurred in the Sequence class function Sequence.
I kind of guess that this might be because of the large alignment and I probably have to obtain a non-redundant dataset…but would appreciate your suggestions.
Thanks.
I am working with about 34K sequences, that are trimmed of oligos and all range between 250 to 560bp size, no ambiguities, and therefore I jumped into unique option to obtain a non-redundant set of sequences. With this I reduce my dataset to about 32.8K, which is not impressive. Am I skipping something?
You’ll probably see the biggest reduction after running filter.seqs with trump=. and running pre.cluster
[ERROR]: St9bad_alloc has occurred in the Sequence class function Sequence.
We typically see this when you don’t have enough memory. Do you know how much RAM you have on your computer?
When I use trump option with v.1.13.0, I am getting:
mothur > trim.seqs(fasta=hgut.fasta, qfile=hgut.qual, maxambig=0, maxhomop=8, flip=T, processors=2, trump=.)
trump is not a valid parameter.
The valid parameters are: fasta, flip, oligos, maxambig, maxhomop, minlength, maxlength, qfile, qthreshold, qwindowaverage, qstepsize, qwindowsize, qaverage, rollaverage, allfiles, qtrim, tdiffs, pdiffs, bdiffs, processors, outputdir, and inputdir.
It seems that this version does not recognise trump=. command.
Without trim option, precluster run removed 914 out of 32867 sequences (total left=31K)
Any other suggestions/advises on how to get less redundant dataset?
thank you.
Ok, tried after preclustering>alignment>filtering, and without preclustering and the resultimg alignment is realy bad:
mothur > summary.seqs(fasta=hgut.trim.unique.align)
Be sure to follow the Costello example analysis. Even if you have removed the oligos, you need to trim the bad quality scoring bases from the sequences. Also, you want to do unique.seqs > align.seqs > chimera.slayer > filter.seqs > unique.seqs > preclustering. You seem to have it a bit backward.
I think your sequences are oriented in the wrong direction. When you run trim.seqs you can flip the sequences. Alternatively, in align.seqs you can also have the aligner automatically try to flip them for you.
For some reason it looks like mothur is not trimming at all based on qual file (see bellow).
mothur > trim.seqs(fasta=hgut.fasta, qfile=hgut.qual, maxambig=0, maxhomop=8, flip=T, processors=2)
_GM2SIF004IV1XN is in your fasta file and not in your quality file, not using quality file.
When checked, both files, the fasta and qual have the sequence.
I repeated from scratch and again mothur doesn’t read qual telling that there is an error.
mothur > trim.seqs(fasta=hg.fasta, qfile=hg.qual)
Error reading quality file, name blank at position, -1
I am opening both files and both are completely identical.
thanks.
Any idea what went wrong with qual files and why mothur is not recognizing them? i sent them separately each with hgut name to mothur.bugs@gmail.com.
thanks a lot for your help!!!
I seem to be having similar problems as the original poster. Can you give some elaboration on “flipping the sequences” as suggested above. How would I have the aligner automatically flip my sequences.
Thanks. I realized that I was following the Costello example too closely. I had included the Flip=T when it was not necessary and that sent me on the wrong trajectory.
When I run the chimera slayer, after following Costello’s example:
mothur > chimera.slayer(fasta=1234.trim.unique.good.filter.fasta, template=silva.gold.filter.fasta)
Checking sequences from 1234.trim.unique.good.filter.fasta …
Reading sequences from silva.gold.filter.fasta…Done.
Reading sequences from 1234.trim.unique.good.filter.fasta…Done.
and after this mothur closes instantly.
How much RAM does your computer have? That would be my first guess. That or the reference files got corrupted and you may need to download fresh copies.