Stuck in Trim.flows

Dear all,
I'm new to Mothur and sequencing analysis and hope someone can provide some insight on how to proceed.
We recently did some 454 16S sequencing on patient stool samples (10 samples) and received the sequences back in .sff format.
We first did

mothur > sffinfo(sff=sample01.sff, flow=T)

to extract the fasta, qual, and flow data. Then, we did
mothur > trim.flows(flow=sample01.flow, oligos=sample01.oligos, pdiffs=2, bdiffs=1, minflows=360, maxflows=720, processors=4)

After the run, it looks like all of the flows went to sample01.scrap.flow and nothing ended up in sample01.trim.flow.

The sample01.oligos is in this format

primer GAGTTTGATCMTGGCTCAG TACCAGGGTATCTAATCC
barcode ACGAGTGCGT ACGAGTGCGT A0087
barcode ACGCTCGACA ACGCTCGACA A0088
barcode AGACGCACTC AGACGCACTC A0117
barcode AGCACTGTAG AGCACTGTAG A0122
barcode ATCAGACACG ATCAGACACG A0124
barcode ATATCGCGAG ATATCGCGAG A0126

Please help. Thanks.

I have also had trouble trying to use an oligos file in the paired barcodes format with a single input file. I don’t think it works. The paired barcodes/primers oligos file format is designed for the MiSeq output, where you have separate files for forward read, reverse read, forward barcode, and reverse barcode sequences. This isn’t clearly indicated on the oligos file wiki page.

If you had a .fasta file you were working with, you could first split on the forward primer/barcode, reverse complement the result, and then split each again on the reverse primer/barcode.
… but I don’t know how to take the reverse complement of a .flow file.
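For the .fasta route, the reverse-complement step can be sketched in a few lines of Python. This is only an illustration for sequences, not flowgrams, and the function name `reverse_complement` is made up for the example. It handles the IUPAC ambiguity codes, so the M in the forward primer is covered. For whole fasta files, mothur's reverse.seqs command is likely the more practical choice.

```python
# Complement table covering the IUPAC nucleotide codes
# (for example, M = A or C pairs with K = G or T).
_COMPLEMENT = str.maketrans(
    "ACGTURYSWKMBDHVNacgturyswkmbdhvn",
    "TGCAAYRSWMKVHDBNtgcaayrswmkvhdbn",
)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a nucleotide sequence."""
    # Complement every base, then reverse the string.
    return seq.translate(_COMPLEMENT)[::-1]


# The two primers from the oligos file in this thread.
print(reverse_complement("GAGTTTGATCMTGGCTCAG"))
print(reverse_complement("TACCAGGGTATCTAATCC"))
```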

Yeah, the paired barcodes probably won’t work very well. Your reads would have to end with the barcode and not have anything else at the end.

Also, you really don’t want to do minflows=360/maxflows=720. As we showed in our PLoS ONE 454 paper, that really does nothing to improve the data. You’d be better off (assuming you’re using GS FLX) using minflows=450/maxflows=450. If you have FLX+, then you want to use 1000.

Pat

Thank you so much for the replies.

So if I understand correctly, I should first split the .sff or the .fasta file into two separate files, one for the forward reads and one for the reverse reads, and then analyze each file separately?

Thanks again, really appreciate your help.

That’s right - I didn’t actually catch that you had sequenced from both directions. You would then have two sets of forward primers and barcodes. You probably want to create two oligos files, one with the information for each direction. Then run trim.flows twice, renaming the output each time, and then process them separately. For the reverse read, when you get to trim.seqs you’ll want to run it with flip=T.
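As a rough sketch of the "two oligos files" idea, the Python below splits paired-format oligos text into one file's worth of lines per direction. It assumes lines of the form `primer FWD REV` and `barcode FWD_BC REV_BC NAME`, as in the file posted above, and it writes the reverse primer on a `forward` line in the second output because that is the primer those reads would start with. The function name `split_paired_oligos` is hypothetical; nothing here is part of mothur.

```python
def split_paired_oligos(text):
    """Split paired-format oligos text into (forward_text, reverse_text).

    Assumes 'primer FWD REV' and 'barcode FWD_BC REV_BC NAME' lines.
    """
    fwd_lines, rev_lines = [], []
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and commented-out lines
        kind = fields[0].lower()
        if kind == "primer" and len(fields) >= 3:
            # Each direction gets its own primer, listed as 'forward'.
            fwd_lines.append(f"forward {fields[1]}")
            rev_lines.append(f"forward {fields[2]}")
        elif kind == "barcode" and len(fields) >= 4:
            # Keep the sample name, pick the barcode for that direction.
            fwd_lines.append(f"barcode {fields[1]} {fields[3]}")
            rev_lines.append(f"barcode {fields[2]} {fields[3]}")
    return "\n".join(fwd_lines) + "\n", "\n".join(rev_lines) + "\n"


paired = """primer GAGTTTGATCMTGGCTCAG TACCAGGGTATCTAATCC
barcode ACGAGTGCGT ACGAGTGCGT A0087
barcode ACGCTCGACA ACGCTCGACA A0088
"""
forward_oligos, reverse_oligos = split_paired_oligos(paired)
print(forward_oligos)
print(reverse_oligos)
```

Both outputs use the same sample names, which is why the trim.flows output needs renaming between the two runs.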

FWIW, people generally do not sequence in both directions by 454 and instead choose to pool all their data in one direction.

Pat

Thanks Pat, but I am getting really confused now about your comment on sequencing in both directions on 454. What exactly do you mean by “choose to pool all their data in one direction”?
The sequence file was handed to me by my PI, so I actually don’t know how the sequencing was done or how these sequences were generated. As far as I can tell, it was done with 454 GS FLX.
I apologize for my stupidity, but since the oligos file was in the above (paired) format, I assumed the sequencing was done in both directions.

Given that, I did
mothur > trim.flows(flow=sample01.flow, oligos=sample01.oligos, order=B, pdiffs=2, bdiffs=1, processors=4)
with sample01.oligos in this format

forward GAGTTTGATCMTGGCTCAG
#reverse TACCAGGGTATCTAATCC
barcode ACGAGTGCGT A0087
barcode ACGCTCGACA A0088
barcode AGACGCACTC A0117
barcode AGCACTGTAG A0122
barcode ATCAGACACG A0124
barcode ATATCGCGAG A0126

everything seems to work fine and I can go all the way to the align command.

But then when I try to do the reverse, everything goes to scrap again.
I did
mothur > trim.flows(flow=sample02.flow, oligos=sample02.oligos, order=B, pdiffs=2, bdiffs=1, processors=4)
with sample02.oligos in this format

#forward GAGTTTGATCMTGGCTCAG
reverse TACCAGGGTATCTAATCC
barcode ACGAGTGCGT A0087
barcode ACGCTCGACA A0088
barcode AGACGCACTC A0117
barcode AGCACTGTAG A0122
barcode ATCAGACACG A0124
barcode ATATCGCGAG A0126

I also tried
reverse TACCAGGGTATCTAATCC
barcode ACGAGTGCGT A0087
barcode ACGCTCGACA A0088
barcode AGACGCACTC A0117
barcode AGCACTGTAG A0122
barcode ATCAGACACG A0124
barcode ATATCGCGAG A0126

and

forward TACCAGGGTATCTAATCC
barcode ACGAGTGCGT A0087
barcode ACGCTCGACA A0088
barcode AGACGCACTC A0117
barcode AGCACTGTAG A0122
barcode ATCAGACACG A0124
barcode ATATCGCGAG A0126

but none of these worked. I believe I must have misunderstood something, but I just cannot figure out what.

Please help. And I apologize for this super long post.

Ok, so maybe we’re both confused. You really need to find out from your PI, or whoever generated the data, how the sequencing was done. Were all the reads sequenced from the same starting position? This is the norm. Sometimes people sequence from both the 5’ end and the 3’ end. This turns out to be a disaster. If they did the former, then you should be good to go with how you ran it with the forward primer and the commented out reverse primer. If they did the latter, then the second time through you need to comment out the forward primer line and turn reverse into forward.
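If nobody can say how the run was set up, a quick look at the reads themselves may help. The Python below is a rough diagnostic sketch, not a mothur feature: it counts how many reads have the forward or the reverse primer immediately after a fixed-length barcode. It assumes the reads come from the trimmed fasta output (so they start with the barcode), that all barcodes are 10 bases long, and it only does exact matching with IUPAC codes expanded, so pdiffs-style mismatches are not modeled. The function names and the toy reads are invented for illustration.

```python
import re

# IUPAC codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def primer_regex(primer):
    """Compile a primer with ambiguity codes into a regex."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def count_orientations(reads, barcode_len, forward, reverse):
    """Count reads whose post-barcode bases start with each primer."""
    fwd_re, rev_re = primer_regex(forward), primer_regex(reverse)
    counts = {"forward": 0, "reverse": 0, "neither": 0}
    for read in reads:
        after_barcode = read.upper()[barcode_len:]
        if fwd_re.match(after_barcode):
            counts["forward"] += 1
        elif rev_re.match(after_barcode):
            counts["reverse"] += 1
        else:
            counts["neither"] += 1
    return counts


# Toy reads made up for this example: barcode + primer + a few bases.
toy_reads = [
    "ACGAGTGCGT" + "GAGTTTGATCATGGCTCAG" + "ACGT",
    "ACGCTCGACA" + "GAGTTTGATCCTGGCTCAG" + "TTGA",
    "AGACGCACTC" + "TACCAGGGTATCTAATCC" + "GGCA",
    "AGCACTGTAG" + "ACGTACGT",
]
counts = count_orientations(
    toy_reads, 10, "GAGTTTGATCMTGGCTCAG", "TACCAGGGTATCTAATCC"
)
print(counts)
```

If nearly every real read lands in "forward", the run was most likely sequenced from one end only; a sizeable "reverse" count would suggest both directions.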

Pat

Thanks Pat. Apparently, the sequencing was only done in the forward direction.

The problem is that the sequencing was done only in the forward direction but the samples are dual-indexed, so you need to demultiplex based on barcodes at both ends of the amplicon. Mothur doesn’t seem to be capable of doing this at the .flow stage, which means you can’t run PyroNoise on samples that have been tagged in this way.

That’s not totallllly true. I’d use the forward primer and denoise. In our experience, you can then use filter.seqs with vertical=T, trump=. and the distal primer will go away.
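To illustrate why that filtering step would remove the distal end, here is a toy Python imitation of the column filtering on an alignment. It is a sketch of the idea, not mothur's code, and it rests on two assumptions: that trump=. drops any column where at least one sequence has a '.', and that vertical=T drops columns where every sequence has a gap character. The function name `filter_columns` and the tiny alignment are made up.

```python
def filter_columns(aligned, trump="."):
    """Toy column filter for equal-length aligned sequences.

    Drops a column if any sequence has the trump character there,
    or if every sequence has a gap character ('-' or '.') there.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    keep = []
    for i in range(length):
        column = [seq[i] for seq in aligned]
        has_trump = trump in column
        all_gaps = all(char in "-." for char in column)
        if not has_trump and not all_gaps:
            keep.append(i)
    return ["".join(seq[i] for i in keep) for seq in aligned]


# '.' marks positions before a read starts or after it ends;
# '-' marks a gap inside the read.
aligned = [
    "..ACG-TACC",
    "..ACG-TA..",
    ".GACG-TACC",
]
filtered = filter_columns(aligned)
print(filtered)
```

Because the second toy read stops two columns early, those columns are dropped for every read, which is how the far end of the amplicon would get trimmed to the region all reads cover.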

In that case, do you worry about denoising resulting in changes to the barcode on the distal end of the amplicon?

Oh, a distal barcode… Hmmm, that would create big problems for splitting up the files.