problem with trim.seqs

bunbury · February 15, 2011, 11:07pm

Hi,
I’m using the latest version of mothur and have tried quality trimming my sequences with qwindowaverage, but no matter what value I use, all the sequences get scrapped. The quality file doesn’t seem to be the problem, since trimming based on qaverage works well. Any idea what the problem may be?
Thanks!

pschloss · February 16, 2011, 6:22pm

bunbury,

Can you check to see if you have v.1.16.1? We sent out a quick release that fixed a problem like this, and you may have 1.16.0.

pat

bunbury · February 16, 2011, 10:20pm

Yes, I have v.1.16.1. I think the problem may have been that I downloaded the 64-bit version (Mac). I am now working with the 32-bit version, and it let me trim sequences based on quality scores with no problem. However, I noticed that if I trim by quality scores, ambiguous bases, homopolymers, sequence length and oligos all at the same time, I get a different result than if I run trim.seqs separately and consecutively for each parameter.
For example:
My data set includes 543703 sequences. If I run this command: trim.seqs(fasta=healthy.fasta, oligos=healthy.oligos, qfile=healthy.qual, maxambig=0, maxhomop=8, pdiffs=2, minlength=250, maxlength=390, qwindowaverage=35, qwindowsize=50, processors=2), I get 242709 sequences in the trim file, while if I run one step at a time (first oligos, then ambig and homop, then seq length and then qual score) I end up with 449456 sequences. Any idea why this may be?

pschloss · February 17, 2011, 1:02pm

Hmm… That doesn’t make sense. The order of things in trim.seqs is to remove the primers/barcodes, trim by quality scores and then to cull sequences that don’t match the length, homop, or ambig parameters. If anything, I’d expect fewer sequences doing it piecemeal, because the quality trimming should remove regions of sequences that are long, have long homopolymers or have ambiguities. Can you send us the fasta/qual/oligos files so we can take a look (mothur.bugs@gmail.com)?

bunbury · February 17, 2011, 7:50pm

I think I figured out what my problem is. You are right, the order matters. Maybe you meant to say in your reply that the order should be quality first, then oligos, etc. If I remove the oligos first, my sequences will no longer match the quality file. By trimming step by step, starting with quality, I get the same result as if I trim everything in one command. Does that make sense?
thanks…just trying to understand what I’m doing!

pschloss · February 19, 2011, 7:33pm

Right, sorry - quality first, oligos second. I’m glad to see people are trying to dig into the black box. You guys keep us on our toes!
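The point about sequences no longer matching the quality file can be illustrated with a small sketch. This is a simplified sliding-window trimmer written for illustration only, not mothur's actual implementation: the function name `window_trim`, the rule of cutting at the start of the first failing window, and the toy read are all assumptions. It shows the general idea behind qwindowsize and qwindowaverage, and why a read whose barcode was already removed cannot be paired with its original quality scores.

```python
def window_trim(seq, quals, window_size=50, min_average=35):
    """Cut a read at the first window whose mean quality is below min_average.

    Illustrative simplification; mothur's own trimming rule may differ.
    Returns the kept (sequence, qualities) pair.
    """
    if len(seq) != len(quals):
        # This is what happens if bases were removed from the fasta
        # but the quality file was left untouched.
        raise ValueError("sequence and quality scores have different lengths")
    if not seq:
        return seq, quals

    size = min(window_size, len(seq))  # short reads: one window over the whole read
    window_sum = sum(quals[:size])
    start = 0
    while True:
        if window_sum / size < min_average:
            return seq[:start], quals[:start]
        end = start + size
        if end >= len(seq):
            return seq, quals  # every window passed
        # Slide the window one base to the right.
        window_sum += quals[end] - quals[start]
        start += 1


# Toy read: six good bases followed by four poor ones.
read = "ACGTACGTAC"
scores = [40] * 6 + [10] * 4
print(window_trim(read, scores, window_size=4, min_average=35))
# ('ACG', [40, 40, 40])
```

Passing `read[2:]` together with the unmodified `scores` raises the `ValueError`, which mirrors the mismatch described above when oligos are removed in a separate earlier step.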

Dharmesh · April 28, 2011, 5:01am

I am facing the same problem with the trim.seqs command. All of my sequences are getting scrapped (using the default settings).
I am using the latest version (v1.18.1, 64-bit for Linux) and following the Costello analysis example on my barcoded Roche 454 pyrosequencing results.
One major difference between our approach and Costello’s is that we amplified a region of around 400 bp, running from V1 to V3.

thanks,
Dharmesh

pschloss · April 28, 2011, 4:51pm

Have you looked at the codes in the scrap.fasta file, and can you tell us what you see?

Dharmesh · April 29, 2011, 1:51am

Sequence names in the scrap.fasta file are followed by the codes “bf” or “f”. In a few cases “bfq” is also present.

Dharmesh
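Counting how often each code appears can make a report like the one above more precise. The following is a minimal sketch under a stated assumption: it treats the code as the last field of each header line after a `|` separator. That header layout, the function name `tally_scrap_codes` and the example read names are assumptions for illustration, so check a few headers in your own scrap.fasta and adjust `separator` if they look different.

```python
from collections import Counter


def tally_scrap_codes(fasta_lines, separator="|"):
    """Count the trailing code on each FASTA header line.

    Assumes (hypothetically) that headers look like '>readname|bf'.
    """
    counts = Counter()
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # skip sequence lines
        header = line[1:].strip()
        if separator not in header:
            counts["(no code)"] += 1
            continue
        code = header.rsplit(separator, 1)[1].strip()
        counts[code] += 1
    return counts


# Hypothetical headers, standing in for a real scrap.fasta file.
example = [
    ">read1|bf", "ACGT",
    ">read2|f", "ACGT",
    ">read3|bf", "ACGT",
    ">read4|bfq", "ACGT",
]
for code, n in tally_scrap_codes(example).most_common():
    print(code, n)
```

For a real file, the same function can be given an open file handle instead of a list, since it only iterates over lines.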

pschloss · May 4, 2011, 5:29pm

I would double check that your barcode and primer sequences are correct in the oligos file.
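One rough way to do that check outside mothur is to test whether raw reads actually begin with a listed barcode followed by the forward primer. The sketch below is only a sanity check under stated assumptions: it reads oligos lines of the form `forward SEQUENCE` and `barcode SEQUENCE NAME`, requires an exact barcode match, and expands degenerate IUPAC bases in the primer. It does not reproduce mothur's matching (for example, it ignores the mismatches allowed by pdiffs), and the function names and example sequences are illustrative.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def parse_oligos(lines):
    """Collect forward primers and a barcode -> sample name mapping."""
    primers, barcodes = [], {}
    for line in lines:
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue
        kind = fields[0].lower()
        if kind == "forward" and len(fields) >= 2:
            primers.append(fields[1].upper())
        elif kind == "barcode" and len(fields) >= 3:
            barcodes[fields[1].upper()] = fields[2]
    return primers, barcodes


def classify_read(seq, primers, barcodes):
    """Return (sample name or None, whether a primer follows the barcode)."""
    seq = seq.upper()
    for barcode, name in barcodes.items():
        if seq.startswith(barcode):
            rest = seq[len(barcode):]
            for primer in primers:
                pattern = "".join(IUPAC[base] for base in primer)
                if re.match(pattern, rest):
                    return name, True
            return name, False
    return None, False


# Illustrative oligos lines and reads (not taken from the thread).
oligos = ["forward AGAGTTTGATCMTGGCTCAG", "barcode ACGAGTGCGT sampleA"]
primers, barcodes = parse_oligos(oligos)
print(classify_read("ACGAGTGCGTAGAGTTTGATCCTGGCTCAGTTTT", primers, barcodes))
# ('sampleA', True)
```

If most reads come back as `(None, False)`, the barcodes in the oligos file probably do not match what is at the start of the reads; if the sample name is found but the primer flag is `False`, the primer entry is the more likely suspect.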
