I’m trying to use trim.seqs and trim.flows to remove forward and reverse primers and barcodes. The commands appear to work when we comment out the reverse primer (#Reverse), but when we try to include the reverse primer all sequences are put into scrap. The scrap file says they fail because the reverse primer was not found/failed (|r). Visual inspection of the sequences confirms the primers are present. A couple of questions that might help resolve the problem:
- Should both primers be entered into the oligos file in 5’-3’ orientation? We tried both orientations and neither worked, but this information isn’t in the wiki.
- When the algorithm searches for the primers and barcode, does it search from end to end, or in a restricted part of the sequence? For example, is it looking for the reverse primer in the final x bases of the sequence? Generally the reverse primers are within 50 bp of the end of the sequence, but the command fails to find them.
Any help would be appreciated.
I have to admit that the reverse primer feature is pretty poor. If the read isn’t a dead-on match to the reverse primer, or the reverse primer isn’t at the very end of the sequence, the fragment will get culled. For most experiments it is rare to read all the way through the region to the reverse primer. In those cases where some reads do, I can imagine that not all of them will, so running filter.seqs(trump=.) will remove the reverse primer sections. We will be re-addressing this in the future, but for now, if your reverse primer isn’t at the end of the sequence or has some mismatches, I’d suggest removing it manually or with a personal script.
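For anyone going the personal-script route, here is a rough Python sketch of what I mean. It is not mothur code, and the function names, the 50-base window and the mismatch limit are all made up for illustration. It assumes the reads are in the forward orientation (so the reverse primer shows up as its reverse complement near the 3' end), that the primer has no ambiguous bases, and that only substitutions matter (no indels).

```python
# Illustrative sketch only; this is not mothur code, and all names and
# defaults here are assumptions chosen for the example.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse-complement an unambiguous DNA string (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def strip_reverse_primer(read, primer, max_mismatches=2, window=50):
    """Cut the read just before the best reverse-primer site near its 3' end.

    Returns the trimmed read, or None if no site with at most
    `max_mismatches` substitutions starts inside the last `window` bases.
    """
    site = reverse_complement(primer)
    n = len(site)
    bases = read.upper()
    if len(bases) < n:
        return None
    first_start = max(0, len(bases) - window)
    best_start, best_mismatches = None, max_mismatches + 1
    # Scan every candidate start in the window and keep the closest match
    # (the leftmost one if several are equally close).
    for start in range(first_start, len(bases) - n + 1):
        mismatches = sum(
            1 for a, b in zip(bases[start:start + n], site) if a != b
        )
        if mismatches < best_mismatches:
            best_start, best_mismatches = start, mismatches
    return None if best_start is None else read[:best_start]
```

With a made-up primer GATTACA, strip_reverse_primer("ACGTACGTACTGTAATCGG", "GATTACA") gives back "ACGTACGTAC", even though the primer site is followed by two extra bases. Reads that come back as None could be kept untrimmed rather than scrapped, depending on what you want.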
Hi and thanks for this useful thread. I’m having a problem with reverse primer removal as well. Our sequences are approx 260 bp in length and were sequenced using Titanium chemistry so the whole fragment is covered. I’ve had a look at my sequences and found that some of them have the full reverse primer, while others are truncated by one base at the 5’-end of the reverse primer (i.e. the very last base of the sequence). Because our sequences are V7 of the 18S, we expect a high amount of sequence diversity and don’t want to assume that any “errors” in the primer sequence are indicative of poor general read quality, provided that all other trimming criteria are met.
If I trim with the full reverse primer sequence in the oligos file, I lose all the sequences that are truncated by this one base (approx 30% of sequences). On the other hand, if I truncate the reverse primer by the one base in question in the oligos file, I still end up losing approx 20% of my sequences that end with the base that I manually truncated in the oligos file. Either way I introduce a bias. Using the trump=. option in filter.seqs() does not sound like a good alternative for me, as length polymorphisms in this region mean lots of “.” in the alignment, and I would risk losing all columns.
So, I was thinking that if trim.seqs() supports regex for pattern recognition, then I could simply modify the reverse primer sequence to accept the read if there is a base OR nothing at the first position in the reverse primer sequence? Would this be a viable option? It would just be very nice to be able to include the reverse primer removal together with all the other trimming functions. I can of course also remove the reverse primer sequences manually from the sequence files before trimming, but I thought I could check on the status of this since the last post was in 2011.
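To show what I mean by "a base OR nothing", here is a small Python sketch of the idea outside of mothur. I don't know whether trim.seqs() accepts anything like this, so treat it purely as an illustration; the function names are my own. It assumes the reads are in the forward orientation, so the reverse primer appears as its reverse complement at the very end of the read and the primer's 5' base is the last base of the read, and it assumes a primer without ambiguous bases.

```python
import re

# Illustrative sketch only; not a mothur feature. Names are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse-complement an unambiguous DNA string (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def build_tail_pattern(primer):
    """Regex for the primer site at the read's end, last base optional."""
    site = reverse_complement(primer)
    # The primer's 5' base is the final base of the site in read orientation,
    # so that is the position allowed to be missing.
    return re.compile(re.escape(site[:-1]) + "(?:" + re.escape(site[-1]) + ")?$")


def trim_tail_primer(read, primer):
    """Return the read without the terminal primer site, or None if absent."""
    match = build_tail_pattern(primer).search(read.upper())
    if match is None:
        return None
    return read[:match.start()]
```

With a made-up primer GATTACA, both "ACGTACGTACTGTAATC" (full site) and "ACGTACGTACTGTAAT" (last base missing) are trimmed to "ACGTACGTAC", so neither group of reads would be lost.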
I am using mothur v.1.33. Are there any active plans to make the reverse primer option less strict with regard to being located at the exact end of the sequence?
Thanks very much in advance.
Perhaps the easiest thing to do would be to use pcr.seqs on your reference alignment to “amplify” the region you are interested in - without including the primers. Then when you do filter.seqs(trump=.) the sequence data that overlaps with the primers will go away.
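To make the column logic concrete, here is a toy Python sketch of the rule as I understand it: with trump=., any alignment column in which at least one sequence has a "." is dropped for every sequence. This is not mothur's implementation, and the function name and the tiny alignment are invented for the example.

```python
# Illustrative sketch only; a toy model of the trump=. column rule,
# not mothur's actual code. All names and data are hypothetical.
def trump_filter(aligned, trump="."):
    """Drop every column in which any aligned sequence has `trump`."""
    sequences = list(aligned.values())
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all aligned sequences must have the same length")
    keep = [
        column
        for column in range(length)
        if all(seq[column] != trump for seq in sequences)
    ]
    return {
        name: "".join(seq[column] for column in keep)
        for name, seq in aligned.items()
    }


toy_alignment = {
    "short_read": "..ACGT..",
    "medium_read": ".AACGTT.",
    "full_read": "GAACGTTC",
}
filtered = trump_filter(toy_alignment)
```

In this toy alignment only the four middle columns survive, so every sequence is reduced to the region they all cover. The flip side is the concern raised above: a single sequence with "." in a column is enough to remove that column for everyone.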
Thanks very much for the suggestion! I will give this a try and see if it fixes my reverse primer problem.