Following the Costello stool analysis example with my own data, I have had trouble with the trim.seqs command.
When I type:
trim.seqs(fasta=3A.fasta, oligos=trimming.oligos, qfile=3A.qual, maxambig=0, maxhomop=8, pdiffs=2, qwindowaverage=35, qwindowsize=50)
The output returned is:
3A_ GONO6GH0416XB7 is in your fasta file and not in your quality file, not using your quality file.
Output file names:
3A_ GONO6GH0416XB7 is definitely in the sequence quality file…
Any thoughts on how to resolve this issue?
What version of mothur are you using?
This bug has been fixed in the current version. Here’s the link to download version 1.18.1: http://www.mothur.org/wiki/Download_mothur
Hello All, and thanks for Mothur!
My issue is similar to this one. I am using version 1.21.1 on our uni server.
I want to use a qual file in my trimming, but there are two problems. 1) The fasta files we got from our sequencing service are already barcode-trimmed and binned into groups (the group name is prepended to the sequence ID), whereas the qual files do not have the group name, so we get a sequence not found notice. I can get around this somewhat by manually stripping out the prefixed group names and some other mucking about. 2) The qual file has other people’s sequences in it, not just mine. When I use a small portion of the dataset (6,000 sequences) with the massive qual file, it works fine, but when I use all of them (50,000) I get something like this:
18000 <----Works fine to a point
sequence name mismatch btwn fasta: GO2I7MQ02ISDP1 and qual file: GO2I7MQ02IOSSK <----Starts doing this
sequence name mismatch btwn fasta: GO2I7MQ02JQS7W and qual file: GO2I7MQ02GKLFO
sequence name mismatch btwn fasta: GO2I7MQ02HE6J1 and qual file: GO2I7MQ02HQZWM
sequence name mismatch btwn fasta: GO2I7MQ02GVTJD and qual file: GO2I7MQ02GN6AI
sequence name mismatch btwn fasta: GO2I7MQ02JMRNX and qual file: GO2I7MQ02FYK15
50436 <----eventually finishes ok.
Does anyone have any insight?
The trim.seqs command expects the fasta file and the qual file to contain the same sequences in the same order. You could use the list.seqs command on the fasta file to get a list of sequence names, then use get.seqs to extract those sequences from the quality file. The commands would look like:
But your quality file may still be in a different order than the fasta file.
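If the order turns out to be the remaining problem, one option outside mothur is a small script that keeps only the qual records named in the fasta file and writes them in fasta order. The Python below is only a rough sketch of that idea and is not part of mothur. The function names are made up for illustration. It assumes the sequence ID is the first whitespace-delimited token after ">" in both files, and that the whole qual file fits in memory.

```python
def header_id(line):
    """Return the first token after '>' (assumed to be the sequence ID)."""
    parts = line[1:].split()
    return parts[0] if parts else ""


def read_ids(fasta_lines):
    """Return sequence IDs in the order they appear in the fasta lines."""
    return [header_id(line) for line in fasta_lines if line.startswith(">")]


def read_qual(qual_lines):
    """Map each sequence ID to (header line, list of score lines)."""
    records = {}
    current = None
    for line in qual_lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            current = header_id(line)
            records[current] = (line, [])
        elif current is not None and line.strip():
            records[current][1].append(line)
    return records


def match_qual_to_fasta(fasta_lines, qual_lines):
    """Return (qual lines in fasta order, IDs that have no qual record)."""
    records = read_qual(qual_lines)
    out, missing = [], []
    for seq_id in read_ids(fasta_lines):
        if seq_id in records:
            header, scores = records[seq_id]
            out.append(header)
            out.extend(scores)
        else:
            missing.append(seq_id)
    return out, missing


# Example use with real files (the file names here are placeholders):
#
# with open("my.fasta") as fa, open("big.qual") as qu:
#     out, missing = match_qual_to_fasta(fa, qu)
# with open("my.matched.qual", "w") as dest:
#     dest.write("\n".join(out) + "\n")
```

Any IDs returned in the second value are in the fasta file but have no qual record, so trim.seqs would still complain about them. If the fasta IDs carry a group prefix that the qual IDs lack, the prefix would have to be stripped first.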
Thank you! I overlooked this easy way to make a suitable qual file. Trim.seqs works perfectly now.