merge sff files or flow files

Hi Pat,

A colleague was kind enough to provide me with the complete data set; before, I was using only a piece of it. The two files represent two different sampling dates from a lake in Brazil. Each folder contains the same 20 groups (reads). The sequences contain the primers, bead links, amplicon tags and barcodes, as follows (which I assume is standard for these things):

CCATCTCATCCCTGCGTGTCTCCGACTCAGACGAGTGCGTGGACTACHVGGGTWTCTAAT
CCATCTCATCCCTGCGTGTCTCCGAC (hybridizes to Lib-L capture bead; anneals to emPCR primers and sequencing primer)
TCAG (sequencing key for amplicon sequencing)
ACGAGTGCGT (MID = multiplex identifier = barcode)
GGACTACHVGGGTWTCTAAT (16S primer)
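
As a quick sanity check on the breakdown above, here is a minimal Python sketch that confirms the four pieces concatenate to the full 60-base sequence and reports where each piece starts and ends. The labels (`adaptor`, `key`, `mid`, `primer`) are just illustrative names, not anything mothur requires.

```python
# The four pieces listed above, in read order. Labels are illustrative only.
PARTS = [
    ("adaptor", "CCATCTCATCCCTGCGTGTCTCCGAC"),
    ("key", "TCAG"),
    ("mid", "ACGAGTGCGT"),
    ("primer", "GGACTACHVGGGTWTCTAAT"),
]
FULL = "CCATCTCATCCCTGCGTGTCTCCGACTCAGACGAGTGCGTGGACTACHVGGGTWTCTAAT"


def layout(parts):
    """Return (name, start, end) per piece; 0-based, end exclusive."""
    spans, pos = [], 0
    for name, seq in parts:
        spans.append((name, pos, pos + len(seq)))
        pos += len(seq)
    return spans


SPANS = layout(PARTS)
JOINED = "".join(seq for _, seq in PARTS)

# Print each piece next to the slice of the full sequence it covers.
for name, start, end in SPANS:
    print(f"{name:8s} {start:2d}-{end:2d}  {FULL[start:end]}")
print("pieces match full sequence:", JOINED == FULL)
```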

Should I immediately merge the .sff files or wait and merge the flow files? Should I trim off that chunk of bead annealing sequence and the tag, by creating a separate oligos file with those sequences and using the trim command?

Thanks,
Mike
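
For readers following along, here is a rough Python sketch of building an oligos file that lists only the primer and the barcodes, leaving out the bead-annealing sequence and the key. It assumes mothur's usual oligos layout of tab-separated `forward` and `barcode` lines; the sample name `sample01` is a placeholder, since the thread gives only one barcode and no group names.

```python
# Primer and the one barcode quoted in the thread.
PRIMER = "GGACTACHVGGGTWTCTAAT"
# Barcode -> group name. "sample01" is a hypothetical placeholder name.
BARCODES = {"ACGAGTGCGT": "sample01"}


def oligos_text(primer, barcodes):
    """Build oligos file content: one forward line, one line per barcode."""
    lines = [f"forward\t{primer}"]
    for barcode, group in barcodes.items():
        lines.append(f"barcode\t{barcode}\t{group}")
    return "\n".join(lines) + "\n"


OLIGOS = oligos_text(PRIMER, BARCODES)
print(OLIGOS, end="")
```

Writing `OLIGOS` to a text file gives something that can be passed to the `oligos` parameter; with 20 groups, the dictionary would hold 20 barcode entries.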

I ended up using merge.sfffiles with the list of files. For trim.flows, I didn’t do anything about the bead annealing sequence or the amplicon tag; I just made an oligos file containing my primer and barcodes. When I ran trim.flows, I ended up with merged.xxxxx.flow files for each barcode, plus merged.trim.flow and merged.flow.files. The latter was what I used for shhh.flows, which is currently running. I hope it doesn’t fry my new Mac! I did get an error message as this was beginning:

Unable to open /Users/Lycophidion/Desktop/mothur/Lemke data/R_2013_07_06_20_08_39_JR07110653_Administrator_Lemke_Brazil_16S_amplicons_6_july_2013/D_2013_07_06_20_25_46_JR07110653_fullProcessingAmplicons/sff/lookupFiles/LookUp_Titanium.pat. Trying mothur’s executable location /Users/Lycophidion/Desktop/mothur/Lemke data/R_2013_07_06_20_08_39_JR07110653_Administrator_Lemke_Brazil_16S_amplicons_6_july_2013/D_2013_07_06_20_25_46_JR07110653_fullProcessingAmplicons/sff/LookUp_Titanium.pat

Does this indicate something is missing?

Thanks,
Mike

Hmmm, the error message got cut off on the right-hand side. I think what you actually want is sff.multiple. It will run each sff file through sffinfo, trim.flows, shhh.flows and trim.seqs, and then it will concatenate the output. Give that a try.

Pat

Hi Pat,

Thanks for the response. Actually, I did try sff.multiple, but got another series of error messages, so I decided to go through the whole thing step by step in order to better learn and troubleshoot the process. That error doesn’t seem to mean anything, although I’d like to know why I’m getting it: it tells me it can’t open a file, but then goes ahead and opens it. Here’s the whole thing with the part I cut off:

error:
Unable to open /Users/Lycophidion/Desktop/mothur/Lemke data/R_2013_07_06_20_08_39_JR07110653_Administrator_Lemke_Brazil_16S_amplicons_6_july_2013/D_2013_07_06_20_25_46_JR07110653_fullProcessingAmplicons/sff/lookupFiles/LookUp_Titanium.pat.
Trying mothur’s executable location /Users/Lycophidion/Desktop/mothur/Lemke data/R_2013_07_06_20_08_39_JR07110653_Administrator_Lemke_Brazil_16S_amplicons_6_july_2013/D_2013_07_06_20_25_46_JR07110653_fullProcessingAmplicons/sff/LookUp_Titanium.pat
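
The message reads like a fallback search: one location is tried, it fails, and a second location is tried before giving up. The Python sketch below illustrates that general pattern only; it is not mothur's actual code, and the function name `find_lookup` and the temporary folders are made up for the demonstration.

```python
import os
import tempfile


def find_lookup(filename, search_dirs):
    """Return (first existing path or None, list of paths that failed)."""
    tried = []
    for directory in search_dirs:
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            return candidate, tried
        tried.append(candidate)
    return None, tried


# Temporary folders stand in for the real directories in the message.
with tempfile.TemporaryDirectory() as root:
    first = os.path.join(root, "lookupFiles")  # never created, so it misses
    second = root                              # fallback location
    with open(os.path.join(second, "LookUp_Titanium.pat"), "w") as handle:
        handle.write("placeholder\n")
    FOUND, TRIED = find_lookup("LookUp_Titanium.pat", [first, second])
    FOUND_NAME = os.path.basename(FOUND)
    MISSES = len(TRIED)

print("missed locations:", MISSES)
print("found:", FOUND_NAME)
```

Under this reading, the first "Unable to open" line is a report of the miss rather than a fatal error, which would be consistent with the run continuing afterwards.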

So far, it seems to have worked. However, I’m having the same problem with Clearcut that I noticed a few other people had. Following the 454 SOP, I created a distance matrix:

dist.seqs(fasta=final.fasta, output=phylip, processors=2)

Then:

clearcut(phylip=final.phylip.dist)

I receive the error message:

Clearcut: Distance value out-of-range.
Clearcut: Syntax error in distance matrix at offset 43.

I checked the name at line 43 and its neighbors, and there doesn’t seem to be any problem there (all names alphanumeric and greater than 10 characters).
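
One way to narrow this down is to scan the matrix for values Clearcut might reject. The Python sketch below assumes a PHYLIP-style file (a count on the first line, then one row per sequence with the name followed by distances) and flags entries that are not numbers, not finite, or outside 0 to 1. That range is an assumption about what "out-of-range" means here, not something taken from Clearcut's documentation. It is also possible that "offset 43" is a character position in the file rather than line 43; the thread does not settle this.

```python
import math


def check_phylip(text, low=0.0, high=1.0):
    """Return (line_number, name, token) for each suspicious distance."""
    problems = []
    rows = text.splitlines()
    # Line 1 holds the sequence count, so data rows start at line 2.
    for number, row in enumerate(rows[1:], start=2):
        fields = row.split()
        if not fields:
            continue
        name, values = fields[0], fields[1:]
        for token in values:
            try:
                value = float(token)
            except ValueError:
                problems.append((number, name, token))
                continue
            if not math.isfinite(value) or value < low or value > high:
                problems.append((number, name, token))
    return problems


# Tiny made-up lower-triangle matrix with one bad value for illustration.
SAMPLE = "3\nseqA\nseqB\t0.10\nseqC\t0.25\t-1.0\n"
PROBLEMS = check_phylip(SAMPLE)
print(PROBLEMS)
```

To run it on the real file, pass the file's contents, for example `check_phylip(open("final.phylip.dist").read())`.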

Best,
Mike

Could you send your log file and final.fasta to mothur.bugs@gmail.com?

Hi!

Will do! I sent the matrix file, but it was so big it probably didn’t go through.

Mike