fastq.info command

Hello,

I want to quality-filter my data using mothur > trim.seqs(fasta=sahl09.fna, qfile=sahl09.qual, qaverage=25).
I will prepare my qfile using fastq.info.
Can anyone please explain whether I have to run fastq.info SEPARATELY for every forward and reverse read file of every library? Or can I make a file in Notepad, as we do for the make.contigs command, and then pass that file to fastq.info?

OR do I have to combine the qfiles of the forward and reverse reads from all libraries after running fastq.info, and then use this combined file for mothur > trim.seqs(fasta=sahl09.fna, qfile=sahl09.qual, qaverage=25)?
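
For reference, the idea behind an average-quality cutoff such as qaverage=25 can be sketched in a few lines of Python. This is only an illustration of the concept (drop every read whose mean quality score falls below a threshold), not mothur's implementation; the read names and scores are made up, and keeping reads whose mean equals the threshold exactly is an assumption.

```python
from statistics import mean


def filter_by_average_quality(quals, threshold=25):
    """Keep reads whose mean quality score is at least `threshold`.

    `quals` maps a read name to its list of per-base quality scores.
    """
    return {
        name: scores
        for name, scores in quals.items()
        if mean(scores) >= threshold
    }


# Hypothetical reads and quality scores, for illustration only.
quals = {
    "read_1": [38, 37, 36, 30, 28],  # mean 33.8, kept
    "read_2": [30, 22, 15, 12, 10],  # mean 17.8, dropped
    "read_3": [25, 25, 25, 25, 25],  # mean 25.0, kept (assumed inclusive cutoff)
}

kept = filter_by_average_quality(quals, threshold=25)
print(sorted(kept))
```

Note that a filter like this judges each read as a whole, so a read with good average quality can still carry individual low-quality base calls.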

I am confused by this step. Kindly help.

Thanks,
Richa

I’m not sure what you’re really trying to do here. Why not just use make.contigs? We have not found that quality trimming reads before assembly does anything positive. I would strongly encourage you to just use the make.contigs command and move on. If you go down the path you propose, yes, you will have to do everything manually. Since I don’t think it’s a good way to go, the odds of us developing a command to do it for you are pretty low.

Pat

Hello Dr. Schloss,

Thank you very much for the response.

I was trying to exclude sequences with low quality scores at the trim.seqs step.
The challenge I faced was having to supply several qfiles at this step, in fact separate qfiles for the forward and reverse sequences, which doesn't make sense to me. I need to make contigs first, and I could not think of any way to quality-filter the forward and reverse sequences separately using their qfiles. But in one of my earlier posts it was suggested that I could use 'qwindowaverage', so I ran several trials.

If I understood your suggestion correctly, I should not try trim.seqs(fasta=xyz, qfile=xyz, qwindowaverage=xy,…)? In that case, should I completely exclude the samples with low quality scores?

I should say that I am a beginner, and I really like Mothur because it is very user friendly; in fact, it taught me what is happening to my sequences at each step!
Many people like me prefer Mothur for processing paired-end 16S MiSeq data, so it would be very helpful for us to have an option to filter on quality scores.
Looking forward,

Richa

Are you following the MiSeq SOP? It explains how to generate a file to run make.contigs on. make.contigs already takes quality scores into account.

http://www.mothur.org/wiki/MiSeq_SOP
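
As a rough sketch of preparing that file without typing it by hand in Notepad, the Python below pairs forward and reverse fastq files and builds one tab-separated line per library. It assumes the three-column layout (sample name, forward fastq, reverse fastq) and assumes the file names contain _R1 and _R2; the names libA and libB are hypothetical, so adjust the pattern to your own files and check the SOP for the exact format.

```python
def build_contigs_file_lines(fastq_names):
    """Pair forward (_R1) and reverse (_R2) fastq files by sample name.

    Returns one tab-separated line per sample:
    sample name, forward file, reverse file.
    The _R1/_R2 naming pattern is an assumption about the input files.
    """
    pairs = {}
    for name in fastq_names:
        if "_R1" in name:
            pairs.setdefault(name.split("_R1")[0], {})["fwd"] = name
        elif "_R2" in name:
            pairs.setdefault(name.split("_R2")[0], {})["rev"] = name

    lines = []
    for sample in sorted(pairs):
        reads = pairs[sample]
        # Skip samples that are missing one of the two read files.
        if "fwd" in reads and "rev" in reads:
            lines.append(f"{sample}\t{reads['fwd']}\t{reads['rev']}")
    return lines


# Hypothetical file names, for illustration only.
fastq_names = [
    "libB_R2.fastq",
    "libA_R1.fastq",
    "libA_R2.fastq",
    "libB_R1.fastq",
    "libC_R1.fastq",  # no matching reverse file, so it is skipped
]

lines = build_contigs_file_lines(fastq_names)
print("\n".join(lines))
```

The printed lines can be saved to a text file and passed to make.contigs through its file option.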

Pat

Thanks for the reply, Dr. Schloss.

The make.contigs step takes the quality scores into account, and ambiguities (N) are also removed at the trim.seqs step, which reduces sequencing error.
But then why does the number of unique sequences remain inflated because of poor-quality reads? Are they not fully excluded in the earlier quality control steps?

I am sorry if I am asking a very silly question.

Looking forward,
Richa

The problem is in the regions that do not overlap. If you have one low quality base call in the non-overlapping region, this will generate a new unique sequence. If you could trim the reads back to this base, there would be nothing left of the read to assemble into a contig.
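
To make the point concrete, here is a toy Python sketch (made-up sequences, not mothur code). It assumes four contigs from the same template, one of which carries a single miscalled base at a position covered by only one read, so no second read is available to correct it.

```python
from collections import Counter

# Hypothetical contigs from the same template, for illustration only.
contigs = [
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ACGTACGTAT",  # one miscalled base at the last position
]

# Count how many times each distinct sequence occurs.
unique_counts = Counter(contigs)

for sequence, count in unique_counts.items():
    print(sequence, count)

print("unique sequences:", len(unique_counts))
```

A single wrong base is enough to turn one true sequence into two unique sequences, which is why the count of uniques stays inflated even after the overlapping region has been corrected.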

Pat

Thank you Dr. Schloss.