Reading the MiSeq SOP, I understand that no trimming is needed before applying it to paired-end sequences. I guess that role is handled by make.contigs and the other steps in the “Reducing sequencing and PCR errors” section.
But in some papers, raw paired-end reads are trimmed based on FASTQ quality scores (e.g., with a sliding window) before following the MiSeq SOP.
Is the MiSeq SOP designed to be used directly with unprocessed FASTQ files coming from the sequencer, or are there situations in which some kind of quality-based trimming is needed?
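For readers unfamiliar with the sliding-window trimming mentioned in the question, here is a minimal Python sketch of the general idea. It is a simplified illustration, not the exact algorithm of any particular trimming tool and not part of the MiSeq SOP. The function names, the window size of 4, the mean-quality threshold of 20, and the Phred+33 encoding are all assumptions chosen for the example.

```python
def phred33(qual_line):
    """Decode a FASTQ quality string, assuming Phred+33 encoding."""
    return [ord(c) - 33 for c in qual_line]


def sliding_window_trim(seq, qual_line, window=4, min_mean_q=20):
    """Cut the read at the start of the first window whose mean quality
    falls below min_mean_q. Reads shorter than the window are returned
    unchanged. Parameter defaults are illustrative assumptions."""
    quals = phred33(qual_line)
    for start in range(0, len(quals) - window + 1):
        window_quals = quals[start:start + window]
        if sum(window_quals) / window < min_mean_q:
            return seq[:start], qual_line[:start]
    return seq, qual_line


# Toy read: six high-quality bases ('I' = Q40) followed by four poor ones ('#' = Q2).
trimmed_seq, trimmed_qual = sliding_window_trim("ACGTACGTAC", "IIIIII####")
print(trimmed_seq, trimmed_qual)
```

Trimming each read independently like this shortens the overlap available when the forward and reverse reads are later assembled, which is one reason the choice between pre-trimming and contig-based error correction matters.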
We have yet to find any evidence that quality score trimming improves the error rate over our approach in make.contigs. Also, our approach to making contigs is generally more stringent than that used in other methods.
I have a similar question/problem that needs to be addressed ASAP (I currently have a manuscript under review). My samples were amplified (V3-V4) using primers 341F - 805R and sequenced on a MiSeq with 2x300 PE chemistry. When I carried out the quality control and applied the Mothur SOP, I was very inexperienced and rather ill-advised, so the primers and indexes were not trimmed from R1 and R2 before making contigs and starting the Mothur pipeline. Now that the reviewer is asking why the primers and indexes were not removed, I’ve been checking the R1 and R2 files with FastQC to see in what proportion these sequences are present. Both R1 and R2 files show a tight length distribution around 301 bp. Apparently, R1 reads barely contain primers, indexes, or overrepresented sequences (there is no trace of them in many of the R1 files). But R2 reads show really poor quality scores, with a high proportion of reads containing the reverse primer (shown as overrepresented sequences).
My question is: is Mothur able to successfully handle untrimmed R1 - R2 reads (301 bp)? I know Mothur includes trim.seqs with the “oligos” option to remove primers and barcodes before making contigs. In my case, the contigs were generated using many R2 reads containing the reverse primer. How does this affect the generation of contigs and the subsequent filtering and alignment of those contigs against the SILVA database?
I may have to eventually run a test to compare the different results from trimmed and untrimmed reads, but I’d very much like to know the opinion of people experienced in dealing with Mothur.
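Before a full trimmed-versus-untrimmed comparison, the proportion of reads carrying a primer can be estimated directly from the FASTQ files. The following is a minimal Python sketch, not a mothur feature. The helper names and the 40-base search span are hypothetical, it only finds exact matches (degenerate IUPAC bases are expanded, but no mismatches are tolerated), and the 805R sequence used in the toy example is the commonly cited one, which should be checked against the actual protocol.

```python
import re

# IUPAC nucleotide codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGTN]",
}


def primer_regex(primer):
    """Compile a degenerate primer into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def read_fastq_sequences(path):
    """Yield the sequence line of each record in an uncompressed,
    four-line-per-record FASTQ file."""
    with open(path) as handle:
        for i, line in enumerate(handle):
            if i % 4 == 1:
                yield line.strip()


def fraction_with_primer(seqs, primer, search_span=40):
    """Fraction of reads with an exact primer match within the first
    search_span bases (search_span is an illustrative assumption)."""
    pattern = primer_regex(primer)
    seqs = list(seqs)
    if not seqs:
        return 0.0
    hits = sum(1 for s in seqs if pattern.search(s.upper()[:search_span]))
    return hits / len(seqs)


# Toy reads; the primer below is the commonly cited 805R sequence (assumption).
PRIMER_805R = "GACTACHVGGGTATCTAATCC"
toy_reads = [
    "GACTACAGGGGTATCTAATCCTTTT",     # primer at the very start
    "NNGACTACTCGGGTATCTAATCCAAAA",   # primer after two leading bases
    "ACGTACGTACGTACGTACGTACGTACGT",  # no primer
]
print(fraction_with_primer(toy_reads, PRIMER_805R))
```

On real data, `fraction_with_primer(read_fastq_sequences("sample_R2.fastq"), PRIMER_805R)` would give the proportion for one file, where the file name is a placeholder. Because sequencing errors near the start of R2 break exact matches, the result is best read as a lower bound.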
I look forward very much to your replies.
Many thanks in advance,