I am using mothur v.1.38.1 and am trying to trim the primers off my sequences. When I run trim.seqs, almost all of my sequences end up in the scrap file. However, when I search the scrap file, I find sequences that do contain the primers.
My trim.seqs command is as follows:
trim.seqs(fasta=stability282.trim.contigs.fasta, oligos=primers.oligos, checkorient=T, flip=T, pdiffs=0, processors=16)
And the primers.oligos file looks like this:
Here is one sequence that ended up in scrap although the primers are there (underlined):
M03602_102_000000000-B4KW2_1_2119_21997_24364|fr(f) bdiffs=0(match) fpdiffs=1000(noMatch) rpdiffs=1000(noMatch)
I also tried the command with pdiffs=1, and then almost nothing ended up in scrap. However, I have not allowed any differences previously, so I would rather not do so this time either.
I would be more than happy if somebody could tell me what I am doing wrong here!
mothur expects your sequences to start and end with the primer sequences. It looks like you have barcodes that start/end the sequence. These need to be included in the oligos file.
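A quick way to check whether this is what is happening is to look at where the primer actually sits in a scrapped read. The sketch below is not mothur's own matching code; it is a small stand-alone Python check that honours IUPAC degenerate bases. The forward primer is the V4 primer quoted later in this thread, and the read is a made-up example with four stray bases in front of the primer.

```python
# IUPAC nucleotide codes mapped to the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def primer_offset(read, primer):
    """Return the first position where primer matches read with zero
    mismatches (degenerate primer codes allowed), or -1 if it never does."""
    read = read.upper()
    primer = primer.upper()
    for start in range(len(read) - len(primer) + 1):
        window = read[start:start + len(primer)]
        # Every read base must be one of the bases the primer code allows.
        if all(base in IUPAC.get(code, "") for base, code in zip(window, primer)):
            return start
    return -1


forward = "GTGCCAGCMGCCGCGGTAA"  # V4 forward primer quoted in this thread
# Hypothetical read: 4 stray bases, the primer (M resolved to A), then insert.
read = "ACGT" + "GTGCCAGCAGCCGCGGTAA" + "TACGGAGGGTGCAAGCG"

offset = primer_offset(read, forward)
print(offset)  # an offset above 0 means extra bases precede the primer
```

If the offset is greater than zero for most scrapped reads, the primer is present but not at the very start of the sequence, which would be consistent with reads being scrapped at pdiffs=0.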
I’ve encountered something similar, where the sequences had a random string of nucleotides (not barcodes) upstream of the primers.
The core facility used a different barcoding strategy.
I got around the issue by adding a string of Ns to the oligos file. I think it might work in your case as well.
I was able to remove my primers, which also had between one and five N bases upstream/downstream, by using pcr.seqs() with an oligos file. Maybe it will work for you too, in case you are still stuck (paired ends, MiSeq).
#2 .oligos: primer GTGCCAGCMGCCGCGGTAA GGACTACHVGGGTWTCTAAT
pcr.seqs(fasta=16S_uml_12.trim.contigs.fasta, oligos=bactV4.oligos, pdiffs=2, rdiffs=2, group=16S_uml_12.contigs.groups)
Thank you very much for these answers! I have now tried a couple of things. First, I received the barcode sequences from the lab and ran trim.seqs with an oligos file that also included the barcodes. However, everything ended up in scrap again.
Then I noticed that the situation is indeed what hleung suggested: there are some random nucleotides preceding the primers, and these are not barcodes. The length of this stretch varies at least between 2 and 8 bases. I tried adding Ns before my primer sequences in the oligos file, but I believe I might have done it wrong. I now get a couple of sequences in the trim.fasta file, but almost all still end up in the scrap file. Could you please send me an example of how the oligos file should look when I use this string of Ns?
Carla, I also tried with pdiffs=1, but then nothing ended up in the trim.fasta file, which is a bit suspicious. I am afraid that if I use anything other than pdiffs=0, I may allow other differences in the primers too. Previously I have always used pdiffs=0 and I would rather keep the protocol as similar as possible. Any new ideas on how to deal with this problem are welcome!
Just an update: I finally solved this problem. I ended up using a Unix command (sed) to replace all the nucleotides before and after the primers, and then carried on with trim.seqs. Thanks for the answers!
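The exact sed command was not posted, so the following is only a rough Python sketch of the same idea, assuming the goal was to delete everything upstream of the forward primer and downstream of the reverse primer so that each read starts and ends with a primer. It also assumes the contigs are in forward orientation, so the reverse primer appears as its reverse complement. The primers are the V4 pair quoted earlier in the thread, and the function names are made up for illustration.

```python
import re

# IUPAC codes and their complements.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A", "R": "Y", "Y": "R",
    "S": "S", "W": "W", "K": "M", "M": "K", "B": "V", "V": "B",
    "D": "H", "H": "D", "N": "N",
}


def to_regex(primer):
    """Turn a degenerate primer into a regex, e.g. M becomes [AC]."""
    return "".join(
        code if len(IUPAC[code]) == 1 else "[" + IUPAC[code] + "]"
        for code in primer.upper()
    )


def reverse_complement(seq):
    return "".join(COMPLEMENT[code] for code in reversed(seq.upper()))


def strip_flanks(read, forward, reverse):
    """Return the read cut down to forward primer ... reverse primer
    (reverse-complemented), or None if either primer is not found."""
    read = read.upper()
    fwd = re.search(to_regex(forward), read)
    if fwd is None:
        return None
    # Look for the reverse primer only downstream of the forward primer.
    rev = re.search(to_regex(reverse_complement(reverse)), read[fwd.end():])
    if rev is None:
        return None
    return read[fwd.start():fwd.end() + rev.end()]


forward = "GTGCCAGCMGCCGCGGTAA"
reverse = "GGACTACHVGGGTWTCTAAT"
# Hypothetical contig: 2 stray bases, primer, insert, primer, 3 stray bases.
contig = "AC" + "GTGCCAGCAGCCGCGGTAA" + "TACGTAGG" + "ATTAGATACCCTAGTAGTCC" + "GGT"
print(strip_flanks(contig, forward, reverse))
```

Reads trimmed this way begin and end with the primers, which is the layout mothur expects according to the earlier reply, so trim.seqs with pdiffs=0 can then be run on the cleaned fasta file.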