Trim seqs command issues

Binnuch · March 31, 2011, 4:08am

Hi,

Following the Costello stool analysis example with my own data, I have had trouble with the trim sequences command.

When I type:
“trim.seqs(fasta=3A.fasta, oligos=trimming.oligos, qfile=3A.qual, maxambig=0, maxhomop=8, pdiffs=2, qwindowaverage=35, qwindowsize=50)”

The output returned is:
"3A_ GONO6GH0416XB7 is in your fasta file and not in your quality file, not using your quality file.
1000
2000
3000
4000
4052

Output file names:
trim.fasta
scrap.fasta
trim.fasta
scrap.fasta"

3A_ GONO6GH0416XB7 is definitely in the sequence quality file…

Any thoughts on how to resolve this issue?

westcott · April 25, 2011, 12:16pm

What version of mothur are you using?

Binnuch · May 1, 2011, 10:11am

Version 1.16.0.

westcott · May 3, 2011, 1:44pm

This bug has been fixed in the current version. Here’s the link to download version 1.18.1 http://www.mothur.org/wiki/Download_mothur

lwoo6888 · September 30, 2011, 7:30am

Hello All, and thanks for Mothur!

My issue is similar to this one. I am using version 1.21.1 on our uni server.

I want to use a qual file in my trimming, but there are two problems:

1) The fasta files we got from our sequencing service are already barcode-trimmed and binned into groups (the group name is prepended to the sequence ID), whereas the qual files do not have the group name prepended, so we get a sequence-not-found notice. I can get around this somewhat by manually stripping out the prefixed group names and some other mucking about.

2) The qual file has other people’s sequences in it, not just mine. When I use a small portion of the dataset (6000 seqs) with the massive qual file, it works fine, but when I use all of it (50000 seqs) I get something like this:

100
200
.etc
.etc
.etc
18000 <—Works fine to a point
sequence name mismatch btwn fasta: GO2I7MQ02ISDP1 and qual file: GO2I7MQ02IOSSK <----Starts doing this
sequence name mismatch btwn fasta: GO2I7MQ02JQS7W and qual file: GO2I7MQ02GKLFO
sequence name mismatch btwn fasta: GO2I7MQ02HE6J1 and qual file: GO2I7MQ02HQZWM
.etc
.etc
.etc
sequence name mismatch btwn fasta: GO2I7MQ02GVTJD and qual file: GO2I7MQ02GN6AI
sequence name mismatch btwn fasta: GO2I7MQ02JMRNX and qual file: GO2I7MQ02FYK15
50436 <----eventually finishes ok.

Does anyone have any insight?

Kind regards,

Laura.

westcott · October 3, 2011, 7:48pm

The trim.seqs command expects the fasta file and the qual file to have the same sequences in the same order. You could use the list.seqs command on the fasta file to get a list of sequences. Then use get.seqs to extract those sequences from the quality file. The commands would look like:

list.seqs(fasta=yourFastaFile)
get.seqs(qfile=yourQualFile, accnos=current)

But your quality file may still be in a different order from the fasta file.
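
If the order still differs, one option outside mothur is to rewrite the qual file in fasta order with a small script. The following is only a rough Python sketch, not a mothur feature: the function names `read_records` and `reorder_qual` are made up for illustration, and it assumes the sequence name is the first whitespace-delimited token after ">" in both files.

```python
def read_records(lines):
    """Parse FASTA-style lines (.fasta or .qual) into (name, header, body_lines) tuples.

    Assumption: the sequence name is the first token after '>'.
    """
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            parts = line[1:].split()
            name = parts[0] if parts else ""
            records.append((name, line, []))
        elif line.strip() and records:
            # Sequence or quality-score line belonging to the latest header.
            records[-1][2].append(line)
    return records


def reorder_qual(fasta_lines, qual_lines):
    """Return (output_lines, missing_names).

    output_lines holds the qual records for the fasta names only, in fasta order.
    missing_names lists fasta names that have no record in the qual file.
    """
    qual_by_name = {
        name: (header, body) for name, header, body in read_records(qual_lines)
    }
    out, missing = [], []
    for name, _, _ in read_records(fasta_lines):
        if name in qual_by_name:
            header, body = qual_by_name[name]
            out.append(header)
            out.extend(body)
        else:
            missing.append(name)
    return out, missing


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python reorder_qual.py my.fasta my.qual > my.sorted.qual
    with open(sys.argv[1]) as fasta, open(sys.argv[2]) as qual:
        out_lines, missing_names = reorder_qual(fasta, qual)
    print("\n".join(out_lines))
    for name in missing_names:
        print("no quality scores for " + name, file=sys.stderr)
```

The script holds the whole qual file in memory, which should be acceptable for tens of thousands of reads but may not suit much larger files. Any names reported as missing would need to be sorted out before running trim.seqs.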

lwoo6888 · October 4, 2011, 8:15am

Thank you! I overlooked this easy way to make a suitable qual file. Trim.seqs works perfectly now.
