Flows algorithm

edenmark · July 22, 2014, 7:57pm

I am comparing mothur with other pipelines, just to see the differences.

One of the other pipelines does the normal trimming of barcodes and primers, checks the length of sequences, checks the quality of each base, etc. (all the normal stuff that you would expect, nothing algorithmically intensive). This removes roughly 25-30% of my sequences.
However, when I compare this to trim.flows --> shhh.flows, many more of my sequences are removed (roughly 90% of my sequences).

Why is this?

Thanks

pschloss · July 24, 2014, 4:12pm

If you look at the sequence names in the scrap file you will see a | and then single letter codes that indicate why a sequence was scrapped. See the wiki page for trim.flows for a description. Without knowing how you ran trim.flows or what type of data you are sequencing, it’s hard to know what’s going on. If you’re on a unix/mac box you can run the following to see why sequences are getting chucked…

cut -f 1 -d " " *scrap.flow | cut -f 2 -d "|" | sort | uniq -c

If you get a b, f, or l, that indicates a mismatch to the barcode, a mismatch to the forward primer, or a sequence that failed the length check, respectively.
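
For anyone not on a unix/mac box, the same tally can be sketched in Python. This is only a rough equivalent of the one-liner above, not part of mothur. It assumes each record starts with the sequence name and that the scrap code follows a | in that name. The function name tally_scrap_codes and the sample records are made up for illustration. Unlike cut, it skips lines whose first field has no |.

```python
from collections import Counter


def tally_scrap_codes(lines):
    """Count the code that follows '|' in the first field of each line."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        name = fields[0]  # first whitespace-delimited field, like `cut -f 1 -d " "`
        if "|" not in name:
            continue  # no scrap code on this line (assumption: e.g. a header line)
        code = name.split("|")[1]  # second '|'-delimited field, like `cut -f 2 -d "|"`
        counts[code] += 1
    return counts


# Made-up records for illustration only; not real mothur output.
sample = [
    "800",
    "seqA|b 400 1.02 0.01",
    "seqB|l 120 0.98 1.03",
    "seqC|l 95 1.01 0.02",
    "seqD|bf 300 0.03 1.00",
]

for code, n in sorted(tally_scrap_codes(sample).items()):
    print(n, code)
```

To run it on a real file, pass an open file handle in place of sample, since the function only needs an iterable of lines. A sequence scrapped for more than one reason is counted under its combined code (for example bf), just as with the shell command.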

Pat

edenmark · July 25, 2014, 2:36pm

What about shhh.flows? How does it check the sequence to determine if something is noise or not?

pschloss · July 25, 2014, 4:35pm

You might want to check out the following:

Schloss PD, Gevers D, Westcott SL (2011). Reducing the effects of PCR amplification and sequencing artifacts on 16S rRNA-based studies. PLoS ONE. 6:e27310.
Quince C, Lanzen A, Davenport RJ, Turnbaugh PJ (2011). Removing noise from pyrosequenced amplicons. BMC Bioinformatics 12:38.
Quince C, Lanzén A, Curtis TP, Davenport RJ, Hall N, Head IM, Read LF, Sloan WT (2009). Accurate determination of microbial diversity from 454 pyrosequencing data. Nat. Methods 6:639.

The Quince papers describe the PyroNoise algorithm that we cloned into C++ as shhh.flows.

pat
