Problem with oligos file in trim.flows

Hello! This is my first time using mothur. Thanks in advance for any help!

I want to analyze published data from http://www.ncbi.nlm.nih.gov/sra?term=SRR770064. I’m using the 454 SOP but am running into problems at the trim.flows step: everything winds up in my SRR770064.scrap.flow file. I suspect this has something to do with my oligos file, because I get some content in my SRR770064.trim.flow file if I remove the oligos option. Or maybe it has something to do with the other options I chose in the trim.flows command?

Here is my oligos file (there are no barcodes):

forward TTGACGGGGGCCCGCACAAG
#reverse TACCTTGTTACGACTT

This puts everything in the SRR770064.scrap.flow file:

trim.flows(flow=SRR770064.flow, oligos=SisonMangus2014.oligos, pdiffs=2, processors=2, minflows=450, maxflows=450, order=A)

This works, but don’t I need the oligos file for something?

trim.flows(flow=SRR770064.flow, processors=2, minflows=450, maxflows=450, order=A)

I suspect you’re missing the barcode in your oligos file. These are not removed from the sff files and precede the primer.

Pat

There are no barcodes in this data. Each sequence starts with the forward primer. Does the function not like that it can’t find any?

Take one of the sequences from the SRA site (see the reads tab at http://trace.ncbi.nlm.nih.gov/Traces/sra/?run=SRR770064) and you get something like this:

>gnl|SRA|SRR770064.8 GSVBTXW01A46MC
tcagAGAGATGTTTGACGGGGGCCCGCACAAGCGGTGGAGCATGTGGTTTAATTCGATGC
AACGCGAAGAACCTTACCAGGGTTTGACATGATACGAATTTCTTTGAAAGAAGAAAGTGC
CTTTTGGAACGTATACACAGGTGGTGCATGGCTGTCGTCAGCTCGTGTCGTGAGATGTTG
GGTTAAGTCCCGCAACGAGCGCAACCCTTATTTTTAGTTGCCTATTTGGAACTCTAGAAA
GACTGCTGGTTATAAACCGGAGGAAGGCGGGGATGACGTCAAGTCAGCATGCCCCTTACA
CCCTGGGCTACACACGTGCTACAATGGGTGAGACAATGAGATGCAAATCTGCGAAGACAA
GCTAATCTATAAACTCTCTCTAAGTTCGGATTGTAGGCTGCAACTCGCCTGCATGAAGTT
GGAATCGCTAGTAATCGCTGGTCAGCTATACAGCGGTGAATTCGTTCCCGGGCTTGTACA
CACCGCCCGTCACACCATGGAAGCTGGTTANCGTACGNNN

The ‘tcag’ is the control tag. You’ll see that your primer (TTGACGGGGGCCCGCACAAG) starts 8 bases after it, and that those 8 bases, AGAGATGT, are a barcode. Since that barcode isn’t in your oligos file, everything gets scrapped.
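
If you want to check this yourself on other reads, here is a rough Python sketch (not part of mothur; the function name leading_barcode is made up for illustration). It strips the leading tcag, looks for an exact match to the forward primer, and reports whatever bases come before it. It assumes the primer appears without mismatches, which is stricter than what pdiffs=2 allows.

```python
# Sketch: find the forward primer in a raw 454 read and report what precedes it.
KEY = "TCAG"                      # leading control tag on each read
PRIMER = "TTGACGGGGGCCCGCACAAG"   # forward primer from the oligos file

# First line of read SRR770064.8, copied from the post above
READ = "tcagAGAGATGTTTGACGGGGGCCCGCACAAGCGGTGGAGCATGTGGTTTAATTCGATGC"


def leading_barcode(read, primer=PRIMER, key=KEY):
    """Return the bases between the key and the primer, or None if no exact primer match."""
    seq = read.upper()
    if seq.startswith(key):
        seq = seq[len(key):]      # drop the control tag
    start = seq.find(primer)      # exact match only (assumption: no mismatches)
    if start == -1:
        return None
    return seq[:start]


barcode = leading_barcode(READ)
print(barcode, len(barcode))      # prints: AGAGATGT 8
```

If the other reads turn up the same 8 bases, adding a line like "barcode AGAGATGT sample1" to the oligos file (sample1 is just a placeholder name) should let trim.flows match them. Reads from a different sample may carry a different barcode, so it is worth running this over more than one read.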

Pat

Oh I see! Thanks a lot!