Denoising, Trimming from non-sff original

I am trying to extract sequence data from the SRA website. Most of the data can be converted directly from .sra to .sff, which is great. However, some of the data was not originally in .sff format and so cannot be converted to .sff. For those files, I converted the .sra file to .fastq, which I then converted to fasta and qual files.

I can’t figure out how to trim and denoise these fasta/qual files. I essentially want to do what trim.flows() and shhh.flows() do for the sff files, but because I can’t generate a .flow file, I don’t see how to get there.

How should I go about analyzing the files not available in .sff?


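You can use the trim.seqs() command to do some quality-based error screening, passing the fasta and qual files as parameters (fasta= and qfile=). It can trim reads where the quality scores drop (for example with qwindowaverage and qwindowsize) and discard reads that contain long homopolymer runs (maxhomop) or ambiguous bases (maxambig). The 454 SOP page has some good parameters for doing this.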
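
To make the idea behind those filters concrete, here is a rough Python sketch of a sliding-window quality trim followed by homopolymer and ambiguous-base checks. This is not mothur's code and may not match its exact trimming rule. The function names (max_homopolymer, window_trim, trim_read) are made up for illustration, and the default thresholds are assumptions to be replaced with whatever the SOP recommends for your data.

```python
def max_homopolymer(seq):
    """Return the length of the longest run of one repeated base."""
    longest = run = 0
    prev = None
    for base in seq:
        run = run + 1 if base == prev else 1
        prev = base
        longest = max(longest, run)
    return longest


def window_trim(seq, quals, window=50, min_avg=35):
    """Slide a window along the read; at the first window whose mean
    quality is below min_avg, keep only the bases covered by the
    previous (passing) window and everything before it."""
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lengths differ")
    if len(quals) < window:
        # Read shorter than one window: judge it as a single window.
        ok = bool(quals) and sum(quals) / len(quals) >= min_avg
        return (seq, quals) if ok else ("", [])
    for start in range(len(quals) - window + 1):
        if sum(quals[start:start + window]) / window < min_avg:
            keep = start + window - 1 if start > 0 else 0
            return seq[:keep], quals[:keep]
    return seq, quals


def trim_read(seq, quals, window=50, min_avg=35, max_homop=8, max_ambig=0):
    """Trim by quality, then reject reads with too many N's or an
    overly long homopolymer. Returns (seq, quals) or None if rejected."""
    seq, quals = window_trim(seq, quals, window, min_avg)
    if not seq:
        return None
    if seq.upper().count("N") > max_ambig:
        return None
    if max_homopolymer(seq) > max_homop:
        return None
    return seq, quals
```

For example, trim_read("ACGTACGT", [40]*4 + [10]*4, window=4) keeps only the first four bases, while a read containing a run of nine A's is rejected outright. Note that this only screens on quality scores; it is not a substitute for the flow-based denoising that shhh.flows() performs.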