I am following the 454 SOP almost perfectly, and I have 3 data sets for the same sample: bacteria, fungi and algae. Everything is working fine for fungi and algae.
However during the make.shared step with the bacterial sequences I use:
“Almost perfectly” - what does that mean? What are you changing?
I think a few things are happening. First, the filter.seqs command is doing very little to help if you remove the trump=. option. When it wipes out all of your columns, that is because your settings in screen.seqs were incorrect. If you post the output from running summary.seqs before screen.seqs and the screen.seqs command I can help you find the right parameters. I think the inability to get to 0.03 is related to you having sequences that don’t fully overlap with each other because of the screen.seqs/filter.seqs problem. It might also be related to the cutoff you used. Because of quirks in the algorithm, it is important to use a cutoff in dist.seqs of 0.15 or so.
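To make the trump=. point concrete, here is a toy Python sketch of what a trump-style column filter does. It is only an illustration of the behaviour described above, not mothur's actual code; the function name, read names and sequences are all made up. Any alignment column in which at least one sequence has a "." is dropped, so reads that cover opposite ends of the alignment leave nothing behind, while reads that overlap keep the shared region.

```python
def trump_filter(alignment, trump="."):
    """Keep only the columns in which no sequence has the trump character."""
    n_cols = len(next(iter(alignment.values())))
    keep = [
        i for i in range(n_cols)
        if all(seq[i] != trump for seq in alignment.values())
    ]
    return {
        name: "".join(seq[i] for i in keep)
        for name, seq in alignment.items()
    }


# Hypothetical toy alignments ("." marks positions a read does not cover).
non_overlapping = {
    "fwd_read": "ACGTAC......",
    "rev_read": "......GGTACA",
}
overlapping = {
    "fwd_read": "ACGTACGG....",
    "rev_read": "....ACGGTACA",
}

print(trump_filter(non_overlapping))  # every column contains a ".", so all are removed
print(trump_filter(overlapping))      # only the four shared columns survive
```

In the first case both reads come back empty, which is the "wipes out all of your columns" situation; in the second, only the overlapping stretch is kept.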
“Almost perfectly” means that the sequencing centre sequenced my DNA from both the F and the R primers (I know, don’t say it).
So to deal with this, for the forward primer I used sff.multiple(file=sfffiles.txt, order=B, minflows=250, maxflows=720, pdiffs=5, bdiffs=2, maxhomop=8, minlength=200, flip=F, processors=7).
Then for the reverse primer I used sff.multiple(file=sfffiles.txt, order=B, minflows=250, maxflows=720, pdiffs=5, bdiffs=2, maxhomop=8, minlength=200, flip=T, processors=7), and I used merge.files to combine the two datasets. I know this is likely where most of my troubles originate. Interestingly, this approach seems to have worked fine for the ITS and the 23LSU, probably because the sequences are shorter from the F to the R primer and they overlap more nicely.
I agree that the problems arise because you have a longer fragment and the forward and reverse reads don’t overlap. Because of that, I’d analyze the two sets of reads separately (sorry!).
For shhh.flows to do its job, you really need minflows and maxflows to be equal. Otherwise there isn’t much denoising going on.