This is a follow-up to my question about make.contigs.
While checking a random sample of reads at each stage of Mothur, I found about half where a forward primer exact match is not present in the make.contigs output. For these reads, pcr.seqs typically trims something even though I have specified 0 mismatches in the pcr.seqs command. It usually grabs a small segment of sequence just before the reverse primer.
In addition, in some of the reads the forward primer exact match was present but pcr.seqs did not capture the entire sequence between the forward and reverse primers. It only captures a portion of it.
I have this data in an Excel spreadsheet, so I can email it to you if you want to take a closer look. I can’t really post it here in any nice format.
Can you post one sequence that you think is causing problems? It would be good to see the before and after sequence. One difference between make.contigs and pcr.seqs is that the former requires the sequences to start with the index and primer sequences with nothing else at the beginning. The latter will find the primer at any position in the sequence.
Thanks, Sarah. Yes, we are actually running that option now. It’s still running after 24 hours and the CPU usage is causing some grief for others using our shared resources. This is not 16S data; we have about 800 primers, so we know we are stretching Mothur’s capabilities. I was just hoping to figure out why that error in the primer appeared in the first place, especially since it’s correct in the R1 (the mismatch in the fwd primer appears in R2 - our design gets near-complete overlap in R1 and R2). So, basically could we fix it further up in the pipeline instead of introducing a relaxation in pdiffs downstream.
I’ve been digging further into our results. Since we have many primers, and the amplicons they are designed to capture are very similar (multiple amplicons along the length of antimicrobial resistance genes), we have instances where a a primer sequence (or a sequence very similar to a primer) occurs legitimately in an amplicon. I checked these out last week by aligning some of the reads merged by Mothur (including this example) to the “design” amplicons we want to target, and they map almost perfectly. So, these “internal primer sites” that are not chimeras and are actually part of our target are a unique “feature” of our design. We’re trying out the trimoverlap=t option in make.contigs to see if this helps, as suggested in my other post that I linked above. Thanks so much for your help, Sarah and Pat! Mothur has helped us a lot; it’s a great program and I appreciate all your work to maintain and improve it.