trim.flows - problem with new sff file

The problem is that 454 has radically changed their flow pattern, which breaks trim.flows and shhh.flows. This will be updated, along with a new lookup file, in the next release. IonTorrent also uses an unusual flow pattern that won’t work with trim.flows and shhh.flows; again, this will be fixed in the next release. For now, please use the trim.seqs approach with qwindowaverage=35 and qwindowsize=50.

Pat

Hi Pat,

Do you have a rough date for when this occurred, or a way to tell whether the data uses this “radically changed flow pattern”? Is it safe to assume that if trim.flows doesn’t junk pretty much all the seqs then it’s OK? (I’m having the same problem: trim.flows junks pretty much all the seqs, giving a primer error, while trim.seqs on the .fasta files extracted from the sff tells me the primer is good…) But I have some “older” reads (not that much older) that seem to go through trim.flows and shhh.flows fine.


Thanks,
Andrew

If you run sffinfo(sff=whatever.sff, sfftxt=T), look at the line labeled “Flow Chars”. What is the string of bases that follows?

Thanks Pat,
Here’s the sff.txt:

Common Header:
Magic Number: 779314790
Version: 0001
Index Offset: 31175104
Index Length: 185482
Number of Reads: 9240
Header Length: 840
Key Length: 4
Number of Flows: 800
Format Code: 1
Flow Chars
Key Sequence: TCAG

I don’t really care about the barcode or flow number at this stage, but I have some .sff files that have two different amplicons with the same barcode, so I’m trying to use trim.flows to split them into separate flows, one for each amplicon, using the primer sequence. As I said above, if I use trim.seqs on the fasta files from the .sff this works well, but when I try to do it on the flows it scraps all the seqs with |f

Maybe there’s a better way to split the amplicons out into separate files?

abissett -

You have the old flow pattern - Flow Pattern A - which is TACG over and over again. So the current SOP should work for you unless you got a bad sequencing run.

Pat
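
The check described above can be scripted. The following is a minimal Python sketch, assuming the sffinfo text output contains a line of the form "Flow Chars: TACGTACG..." (the colon-separated layout is inferred from the other header lines in the dump earlier in this thread). Both function names are made up for illustration and are not part of mothur.

```python
def flow_chars_from_sfftxt(text):
    """Return the string that follows 'Flow Chars' in an sff.txt header, or None."""
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Flow Chars"):
            # Assumed layout: "Flow Chars: TACGTACG..."
            _, _, value = stripped.partition(":")
            return value.strip() or None
    return None


def is_flow_pattern_a(flow_chars):
    """True if the flow order is TACG repeated over and over (the old pattern)."""
    if not flow_chars:
        return False
    unit = "TACG"
    return all(base == unit[i % 4] for i, base in enumerate(flow_chars))
```

If is_flow_pattern_a returns False for the string taken from your own sff.txt, the run most likely uses one of the newer flow patterns and the trim.seqs approach is the safer route for now.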

Ok, that’s what I thought.


So what I don't understand now is why, when I run trim.flows and select sequences on barcode (there is only one barcode in the .flow I'm trying this on) and primer (there are two primers in there), all the sequences fail based on the primer (i.e., |f). They don't fail on quality/length, barcode, etc.
But when I run the same thing with trim.seqs (supplying the same oligos file, and using the fasta file generated by sffinfo), mothur keeps around half the seqs (which, when I look at the .fasta, is about right: about half of the seqs are from each amplicon) and scraps around half (the other amplicon).

Could you post a flowgram for a sequence that you think is good along with the barcode and primer sequence?

The .flows have around 10k+ seqs in them, so I’ve posted data from a single seq below (I can email you the full flow if you like).
Using trim.flows I get a |f error on all the seqs in the .flow, including this one.
When I run trim.seqs, using the same oligos file etc., I get an approx. 400 bp seq.
When I look at the .fasta I can see the correct barcode and primer in most of the sequences in the .flow.


HYADGMH02GW2D2 697 1.07 0.00 1.04 0.01 0.00 0.93 0.00 1.05 1.14 0.00 1.08 0.00 1.09 1.22 0.03 0.00 1.06 1.17 1.13 0.00 1.04 1.14 0.02 0.00 1.02 0.02 1.07 0.04 1.99 0.00 0.04 1.95 1.10 0.00 1.04 0.08 0.07 1.04 0.05 0.02 2.90 1.05 0.01 1.03 0.14 1.06 0.01 2.04 0.03 1.98 0.00 1.10 1.02 2.99 0.03 1.10 1.03 0.15 1.04 1.11 1.03 1.99 1.08 0.10 0.10 2.04 0.12 2.02 2.92 0.15 2.10 1.08 1.09 1.05 0.14 1.94 0.92 0.11 0.04 1.07 0.12 2.02 1.98 0.12 1.09 0.16 0.20 1.05 0.10 0.04 1.06 2.07 0.01 1.99 0.08 2.05 0.03 1.07 0.04 0.03 1.07 0.10 1.05 0.05 0.12 1.11 0.14 0.03 1.95 1.08 2.08 0.07 0.98 1.09 0.02 1.05 0.09 1.00 0.04 1.05 2.10 0.12 0.01 1.03 0.95 1.03 0.00 2.04 0.09 0.08 1.03 0.03 3.02 0.01 0.14 1.09 0.00 0.00 2.03 0.16 1.04 0.00 0.00 1.16 0.15 0.00 1.07 0.13 1.04 1.05 0.13 0.03 1.01 0.08 1.09 0.04 0.97 0.01 0.99 0.03 2.10 1.05 3.08 0.04 0.00 1.03 0.05 0.06 0.96 0.03 0.08 1.04 1.07 0.00 1.06 0.11 3.87 0.05 0.09 1.00 0.30 1.04 0.15 0.99 1.01 1.06 2.04 0.10 1.98 1.06 1.08 1.03 3.00 0.05 1.94 0.14 1.13 0.01 1.08 1.89 0.09 0.00 1.06 3.05 1.00 0.00 2.03 1.04 0.08 0.00 2.97 0.90 0.05 0.02 1.97 1.11 0.09 0.99 0.27 0.15 2.02 0.13 0.08 1.98 0.16 0.89 1.04 0.12 0.17 2.93 1.13 0.15 1.99 2.99 2.10 0.06 0.17 1.18 0.21 0.05 3.05 0.16 0.09 0.94 0.28 0.13 0.89 0.20 0.18 0.99 0.10 1.04 2.02 0.06 0.17 0.91 0.15 0.11 0.97 0.12 0.10 2.04 0.15 0.10 0.99 0.13 0.99 0.09 0.04 0.93 0.13 1.03 0.15 0.04 1.02 1.08 1.04 0.10 0.99 0.09 0.99 0.14 0.13 1.00 0.08 5.05 1.01 0.03 0.14 1.26 0.05 0.09 0.96 1.71 0.11 0.09 0.98 0.98 0.10 1.03 1.74 0.89 1.07 0.01 0.09 1.96 0.99 0.09 2.87 0.14 0.85 0.08 0.18 1.74 0.94 0.12 0.05 1.81 1.02 1.94 0.11 1.00 0.15 0.24 0.90 0.07 0.91 0.05 1.01 0.09 0.96 0.15 1.97 0.05 0.20 2.00 1.99 0.03 0.85 0.19 1.05 0.04 0.14 1.80 0.20 0.04 0.97 0.23 0.08 1.06 0.20 0.17 1.04 0.15 0.98 1.01 0.22 1.08 0.11 0.14 1.01 0.09 0.14 0.96 0.16 1.93 0.15 1.15 0.13 2.19 1.32 1.12 0.10 0.26 1.11 0.17 0.07 0.88 0.23 0.95 0.07 0.17 1.00 2.13 0.11 1.93 0.13 0.22 1.15 0.24 0.12 0.95 0.08 0.14 0.99 1.04 
0.14 0.99 0.09 0.13 0.97 2.11 0.07 1.05 1.06 1.20 0.09 1.06 1.13 0.06 0.06 1.02 1.08 0.13 0.05 1.06 0.12 1.87 0.08 0.14 2.03 0.22 0.13 1.01 0.23 0.01 1.20 0.14 0.05 1.02 0.08 1.02 0.10 1.99 0.15 0.10 2.16 0.15 1.07 0.10 0.12 0.92 0.13 0.98 1.05 0.09 0.03 0.99 0.12 2.04 0.17 0.12 0.99 0.22 1.01 0.04 0.17 0.94 0.15 0.08 0.97 0.11 1.08 1.02 0.13 2.11 0.12 0.05 0.86 0.17 0.93 0.13 3.07 0.04 0.18 0.99 0.14 2.05 1.11 1.09 0.00 0.17 0.86 0.00 0.05 1.15 1.18 0.06 0.07 0.91 0.04 0.15 2.00 0.00 0.13 0.99 0.22 0.10 0.99 1.08 0.05 0.13 4.29 0.08 2.03 0.18 0.09 1.95 0.91 0.99 0.02 0.15 2.04 0.12 1.91 0.13 0.07 0.95 0.03 0.01 0.97 0.08 0.08 4.08 0.08 0.14 0.95 0.29 0.02 1.01 0.16 0.05 1.30 0.01 0.06 1.05 0.15 0.09 2.32 0.06 0.87 0.07 0.06 0.93 2.22 0.18 0.87 1.23 0.06 0.91 0.09 0.95 0.05 0.05 0.90 0.93 1.15 0.05 1.05 0.05 0.13 1.03 0.14 0.01 3.07 0.16 0.00 1.22 1.02 0.96 2.15 0.14 1.79 0.02 1.00 0.06 0.13 1.95 0.17 1.11 0.09 0.00 0.70 0.09 0.93 0.03 1.20 0.10 1.12 0.12 0.03 0.99 0.06 0.04 1.07 0.29 2.18 0.06 0.00 1.86 1.08 0.06 0.11 0.80 1.80 0.04 0.12 3.00 0.96 0.03 0.11 0.96 2.94 0.04 0.14 0.98 0.87 0.05 0.91 0.22 1.25 0.05 1.01 0.82 0.13 0.02 1.51 0.23 1.05 0.03 1.28 0.10 1.08 0.06 0.10 0.92 0.18 0.06 0.96 0.85 0.93 0.09 0.21 1.00 1.07 1.29 0.13 0.95 0.18 0.71 0.87 0.07 0.89 0.20 1.20 1.03 0.20 0.08 1.89 0.16 0.94 0.18 1.11 0.15 0.21 4.02 0.98 0.13 0.04 2.12 0.16 0.11 2.30 0.18 0.19 2.06 0.20 0.14 0.72 0.24 0.16 0.98 0.29 0.83 0.16 0.11 2.01 2.13 0.14 0.06 1.30 1.14 1.12 1.10 0.10 0.15 2.13 0.04 0.17 1.16 0.18 1.22 0.18 0.16 3.37 0.13 1.17 2.22 0.35 0.84 0.25 0.89 0.16 0.18 0.77 1.17 0.10 0.10 1.14 0.12 0.06 0.98 0.16 1.10 1.12 0.85 1.08 0.14 0.20 1.06 0.08 0.17 0.97 0.10 1.08 0.11 0.92 0.13 1.10 0.91 0.28 0.11 0.75 1.22 0.02 0.12 1.32 0.11 3.23 0.13 0.05 0.92 0.28 0.20 0.96 0.23 0.15 1.01 0.91 0.16 1.08 0.16 1.23 0.04 0.11 0.97 0.05 1.15 0.16 1.86 1.21 0.97 0.11 0.90 0.84 0.78 0.11 0.92 0.92 1.12 0.96 0.21 2.29 0.14 0.13 3.48 0.06 1.11 0.81 1.11 0.99 0.12 0.31 0.57 0.10 0.10 
5.20 1.43 1.23 0.20
barcode TCTATACTAT primer CTTGGTCATTTAGAGGAAGTAA
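
As a sanity check on the flowgram above, the first 54 flow values can be base-called by hand under the TACG flow order. This Python sketch simply rounds each intensity to the nearest whole homopolymer length; that rounding is an assumption made for illustration and is not how trim.flows itself processes flows. The function name call_bases is hypothetical.

```python
FLOW_ORDER = "TACG"  # Flow Pattern A: TACG repeated

# First 54 flow values of HYADGMH02GW2D2, copied from the flowgram above.
FLOWS = [
    1.07, 0.00, 1.04, 0.01, 0.00, 0.93, 0.00, 1.05,
    1.14, 0.00, 1.08, 0.00, 1.09, 1.22, 0.03, 0.00,
    1.06, 1.17, 1.13, 0.00, 1.04, 1.14, 0.02, 0.00,
    1.02, 0.02, 1.07, 0.04, 1.99, 0.00, 0.04, 1.95,
    1.10, 0.00, 1.04, 0.08, 0.07, 1.04, 0.05, 0.02,
    2.90, 1.05, 0.01, 1.03, 0.14, 1.06, 0.01, 2.04,
    0.03, 1.98, 0.00, 1.10, 1.02, 2.99,
]

KEY = "TCAG"
BARCODE = "TCTATACTAT"
PRIMER = "CTTGGTCATTTAGAGGAAGTAA"


def call_bases(flows, flow_order=FLOW_ORDER):
    """Naive base calling: round each flow to a homopolymer length."""
    bases = []
    for i, intensity in enumerate(flows):
        count = int(intensity + 0.5)  # nearest integer, e.g. 2.90 -> 3
        bases.append(flow_order[i % len(flow_order)] * count)
    return "".join(bases)


read_start = call_bases(FLOWS)
print(read_start)
print(read_start.startswith(KEY + BARCODE + PRIMER))
```

Under these assumptions the called read begins with the key, the barcode and the primer in that order, so the start of this flowgram looks consistent with the oligos given above.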

I’m not sure if anything is happening here? Maybe this got lost in the other traffic?

Sorry we lost the post until you bumped it. What version of mothur are you using? When we run it in 1.29 the flow you sent goes through just fine.

That’s odd. I’m using V1.29.2

I just ran another .flow (mothur > trim.flows(flow=7031_MID1.flow, oligos=ITS800_cat.oligos, pdiffs=2, bdiffs=2)) and it gives the following output files and sizes:
7031_MID1.trim.flow 4
7031_MID1.scrap.flow 43,376,602
7031_MID1.flow.files 0

Most of the error indicators in the .scrap file point to primer and/or barcode and/or length problems, i.e., |lbf. That is, no sequences are passed to the .trim.flow.

When I run the equivalent using trim.seqs (trim.seqs(fasta=7031_MID1.fasta, qfile=7031_MID1.qual, oligos=ITS800_cat.oligos, pdiffs=2, bdiffs=2, minlength=100)) I get:

7031_MID1.trim.fasta
7031_MID1.scrap.fasta
7031_MID1.trim.qual
7031_MID1.scrap.qual
7031_MID1.groups

with 9125 seqs in the .trim.fasta.

Can I somehow send you some of the complete .flows to test?

Can you upload it to the wiki?