How does pcr.seqs trimming work?

This is a follow-up to my question about make.contigs.

While checking a random sample of reads at each stage of Mothur, I found about half where a forward primer exact match is not present in the make.contigs output. For these reads, pcr.seqs typically trims something even though I have specified 0 mismatches in the pcr.seqs command. It usually grabs a small segment of sequence just before the reverse primer.

In addition, in some of the reads the forward primer exact match was present but pcr.seqs did not capture the entire sequence between the forward and reverse primers. It only captures a portion of it.

I have this data in an Excel spreadsheet, so I can email it to you if you want to take a closer look. I can’t really post it here in any nice format.

Here’s the pcr.seqs command I run:


Can you post one sequence that you think is causing problems? It would be good to see the before and after sequence. One difference between make.contigs and pcr.seqs is that the former requires the sequences to start with the index and primer sequences with nothing else at the beginning. The latter will find the primer at any position in the sequence.

Thanks, Pat. Here’s an example:

before pcr.seqs


after pcr.seqs



This read’s R1 sequence starts with the forward primer, and its R1 sequence starts with the reverse primer.

I ran the following with our latest version,
Release Version 1.44.3 · mothur/mothur · GitHub, and here are the results.

mothur > pcr.seqs(fasta=test.fasta, oligos=test.oligos, pdiffs=0,rdiffs=0) - did not find forward primer because of the mismatch below


When I run it with pdiffs=1:

mothur > pcr.seqs(fasta=test.fasta, oligos=test.oligos, pdiffs=1,rdiffs=0) - removes both primers and this is the resulting trimmed read

test_seq fpdiffs=1(match) rpdiffs=0(match)

Thanks, Sarah. Yes, we are actually running that option now. It’s still running after 24 hours and the CPU usage is causing some grief for others using our shared resources. This is not 16S data; we have about 800 primers, so we know we are stretching Mothur’s capabilities. I was just hoping to figure out why that error in the primer appeared in the first place, especially since it’s correct in the R1 (the mismatch in the fwd primer appears in R2 - our design gets near-complete overlap in R1 and R2). So, basically could we fix it further up in the pipeline instead of introducing a relaxation in pdiffs downstream.

I’ve been digging further into our results. Since we have many primers, and the amplicons they are designed to capture are very similar (multiple amplicons along the length of antimicrobial resistance genes), we have instances where a a primer sequence (or a sequence very similar to a primer) occurs legitimately in an amplicon. I checked these out last week by aligning some of the reads merged by Mothur (including this example) to the “design” amplicons we want to target, and they map almost perfectly. So, these “internal primer sites” that are not chimeras and are actually part of our target are a unique “feature” of our design. We’re trying out the trimoverlap=t option in make.contigs to see if this helps, as suggested in my other post that I linked above. Thanks so much for your help, Sarah and Pat! Mothur has helped us a lot; it’s a great program and I appreciate all your work to maintain and improve it.

1 Like

This topic was automatically closed 10 days after the last reply. New replies are no longer allowed.