Hello All,
I am currently processing some bidirectional 454 data from a published manuscript, and I have received the raw .sff files from the authors. Because the sequencing was done bidirectionally, about half of the sequences are forward and the other half are reverse (still the same fungal ITS2 region, just reverse complemented). This means that to get through trim.seqs I need to run it twice: once on the forward reads and once on the reverse-complemented reads. This is no big deal. It does become a big deal when I go to denoise, because the output of trim.flows can’t easily be reverse complemented. Even if it could, the ideal would be to denoise before I even start demultiplexing the sequences by barcode and repeating that step on the reverse-complement set. Below I propose a solution, but I’d like feedback on a few things.
- I use sffinfo in mothur to generate a .flow file from my .sff file.
- I use trim.flows but do not pass a barcode mapping file. This way all barcodes are retained, but the 454 adapter sequence and other things represented by lowercase bases in the .sff file are trimmed.
- I pipe the output of trim.flows to shhh.flows to do denoising. This will generate fasta and qual files, and I can move on to demultiplex with my barcode file from there.
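
To make the demultiplexing step after denoising concrete, here is a minimal Python sketch (plain Python, not mothur) of how mixed-orientation reads could be assigned to samples and brought into one orientation. Everything in it is an assumption for illustration: the function names `reverse_complement` and `orient_read`, the toy barcode and primer sequences, the idea that each read starts with a barcode immediately followed by either the forward or the reverse primer, and exact matching with no mismatches allowed.

```python
# Complement table covering the IUPAC ambiguity codes (S, W are self-complementary).
_COMPLEMENT = str.maketrans(
    "ACGTRYKMBDHVNacgtrykmbdhvn",
    "TGCAYRMKVHDBNtgcayrmkvhdbn",
)


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def orient_read(seq, barcodes, fwd_primer, rev_primer):
    """Assign a read to a sample and return its insert in forward orientation.

    barcodes maps barcode sequence -> sample name (hypothetical layout).
    Returns (sample, insert) or None when no barcode/primer pair matches.
    Matching is exact; a real pipeline would tolerate some mismatches.
    """
    for barcode, sample in barcodes.items():
        if not seq.startswith(barcode):
            continue
        rest = seq[len(barcode):]
        if rest.startswith(fwd_primer):
            # Forward read: strip the primer and keep the insert as is.
            return sample, rest[len(fwd_primer):]
        if rest.startswith(rev_primer):
            # Reverse read: strip the primer, then flip the insert.
            return sample, reverse_complement(rest[len(rev_primer):])
    return None


if __name__ == "__main__":
    # Toy sequences only; these are not real barcodes or ITS primers.
    barcodes = {"AAAA": "sample1"}
    print(orient_read("AAAACCGGATTG", barcodes, "CCGG", "TTAA"))
    print(orient_read("AAAATTAACAAT", barcodes, "CCGG", "TTAA"))
```

Both toy reads end up with the same insert in the same orientation, which is the property I would want before clustering into OTUs. The sketch ignores any primer or adapter left at the far end of a read.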
The only potential hang-up I can foresee is that the 454 adapter sequence and other things represented by lowercase bases in the raw .sff file might be retained when I choose not to specify a mapping file in trim.flows. If so, I will have a lot of trouble assigning OTUs after denoising.
Are there any other problems with denoising before demultiplexing that I haven’t thought of, or with anything else I’ve proposed?