Flow files

Hi, after reading the tutorial I am still not sure how to use the flow files (e.g., in trim.flows). I have data from 2 different 454 runs. The basic data set is 12 individual samples representing 3 different conditions of plant leaves (3 conditions, each with n=4 individual leaves). In one 454 run these 12 samples were analyzed for the V8V9 region. The 12 samples were analyzed for V5V7 on both 454 runs, because the sequencing efficiency of the V8V9 primer set was much greater and I wanted to sample V5V7 at a similar depth. I have used sffinfo to produce .sff.txt, .fasta, .qual and .flow files from each sample on each run/primer set. For example, Treatment_IAA_1_454reads.1.sff.txt etc. were produced.

Can I use “trim.flows(flow=ALL_V8V9.flow, oligos=All_V8V9.oligos, pdiffs=1)” on the concatenated (“cat *.flow”) flow file I’ve produced from the 12 individual (sample) flow files for V8V9? Or do I have to run each sample individually with trim.flows? Each sample has about 10,000 reads.

The V5V7 samples from the two pyrosequencing runs (different dates) use the same barcode for the same sample. I understand that trim.flows needs to be applied to each run separately (in whatever way the answer to my previous question dictates). After that, can I use the merge.files command to combine all the files for downstream analysis?

Thanks! I’m stumped but trying… I need a mother’s help… :)

Cheers, Susan

When processing multiple sff files using mothur, we recommend the following for each sff file:

sffinfo(sff=yourSffFile, flow=t)
summary.seqs(fasta=current)
trim.flows(flow=yourFlowFile, oligos=yourOligosFile)
shhh.flows(file=theFileFileCreatedByTrimFlows)
trim.seqs(fasta=fastaFileFromShhh.Flows, name=nameFileFromShhh.Flows, oligos=yourOligosFile)
summary.seqs(fasta=current, name=current)
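
For several sff files, the six commands above can be generated rather than typed by hand. The following is a minimal Python sketch, not part of mothur: the helper names (per_sff_commands, batch_text) are hypothetical, and the output file names (.flow, .flow.files, .shhh.fasta, .shhh.names) are assumptions about how mothur names its outputs, so check them against what your mothur version actually writes.

```python
def per_sff_commands(sff_file, oligos_file):
    """Return the six recommended mothur commands for one sff file."""
    # Assumption: mothur names its outputs after the sff file's stem,
    # e.g. run1.sff -> run1.flow, run1.flow.files, run1.shhh.fasta.
    if sff_file.endswith(".sff"):
        stem = sff_file[: -len(".sff")]
    else:
        stem = sff_file
    return [
        f"sffinfo(sff={sff_file}, flow=t)",
        "summary.seqs(fasta=current)",
        f"trim.flows(flow={stem}.flow, oligos={oligos_file})",
        f"shhh.flows(file={stem}.flow.files)",
        f"trim.seqs(fasta={stem}.shhh.fasta, name={stem}.shhh.names, "
        f"oligos={oligos_file})",
        "summary.seqs(fasta=current, name=current)",
    ]


def batch_text(pairs):
    """Join the commands for every (sff_file, oligos_file) pair, one per line."""
    lines = []
    for sff_file, oligos_file in pairs:
        lines.extend(per_sff_commands(sff_file, oligos_file))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(batch_text([("run1.sff", "run1.oligos"), ("run2.sff", "run2.oligos")]))
```

Saving the returned text to a file and passing that file to mothur as a batch file should run the commands in order, one sff file after another.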

When you have run this set of commands for all your sff files, you can merge the resulting .trim.fasta, .trim.names, and .groups files. We are working on an sff.multiple command that will automate this process for you. It should be part of our next release.
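
Once every sff file has been processed, the merge step might look like the following. This is a Python sketch that only builds merge.files command strings; the function name merge_commands is hypothetical, and the suffixes .shhh.trim.fasta, .shhh.trim.names and .shhh.groups are assumed output names from trim.seqs that may differ between mothur versions.

```python
def merge_commands(stems, merged_prefix):
    """Build one merge.files command per file type (fasta, names, groups)."""
    # Assumed trim.seqs output suffixes; adjust to match your own files.
    suffixes = {
        "fasta": ".shhh.trim.fasta",
        "names": ".shhh.trim.names",
        "groups": ".shhh.groups",
    }
    commands = []
    for kind, suffix in suffixes.items():
        # merge.files takes its input files as a dash-separated list.
        inputs = "-".join(stem + suffix for stem in stems)
        commands.append(
            f"merge.files(input={inputs}, output={merged_prefix}.{kind})"
        )
    return commands


if __name__ == "__main__":
    # Hypothetical stems for two processed sff files.
    for command in merge_commands(["run1", "run2"], "all_runs"):
        print(command)
```

The three merged files (fasta, names, groups) are then the inputs for the downstream analysis.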

Thanks. Because I’m new to this, could you confirm what you mean by “each sff file”? Do you mean that each individual sff (lowercase) file of sequences has to be processed separately, as opposed to each of the 2 SFF (uppercase) run files, which include, after sffinfo, 12 .sff.txt, 12 .qual, 12 .fasta and 12 .flow files corresponding to each individual sample sequenced with the primer? THANKS! Susan

I am finding that the sff.multiple command gets pretty slow when I process my samples together (about 6 at a time): by the third sample, mothur (but not the rest of my computer) becomes really slow. I can run the same commands on the samples individually relatively quickly.

Is running these samples individually and then using the merge.files command a good alternative to using sff.multiple?



Thanks