I’ve got a problem in mothur and hope somebody has some expert advice for me. This forum is amazing, as I keep saying.
I have Illumina MiSeq data (250 bp paired end, V4 bacterial 16S region and fungal ITS2 region). They come with barcodes and adapters removed, but the primers are still in the sequences. The company told me that the primers come with up to 10 random nucleotides attached to increase nucleotide diversity…
Now I am having a hard time removing the primers. I tried trim.seqs and make.contigs using the oligos option. Mothur scraps all sequences because, given the primer sequence, it assumes the sequences start directly with the primer. But they don’t: I manually checked the starting nucleotides in a random selection of fastq files; sometimes there are no nucleotides before the primer sequence, sometimes there are random nucleotides. How can I remove the primers?
Thanks for helping me out here.
You might try making your oligos file contain 10 N’s and then your primer sequence. You might also tell your sequence provider that their approach was a real pain in the rear and hurts your ability to get the best possible data, since you’re sequencing garbage.
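For anyone who wants to handle the variable-length random prefix outside mothur, here is a minimal Python sketch of the idea: look for the primer after 0 to 10 arbitrary bases at the start of the read and cut everything up to the end of the primer. This is not a mothur feature; the function names `primer_regex` and `strip_primer` are made up for illustration, and the sketch assumes exact matches to the degenerate primer (no mismatches or indels).

```python
import re

# IUPAC degenerate nucleotide codes mapped to regex character classes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGTN]",
}


def primer_regex(primer, max_spacer=10):
    """Compile a pattern: up to max_spacer arbitrary bases, then the primer."""
    body = "".join(IUPAC[base] for base in primer.upper())
    # Lazy quantifier so the shortest possible spacer is preferred
    return re.compile(r"^[ACGTN]{0,%d}?%s" % (max_spacer, body))


def strip_primer(seq, pattern):
    """Return seq with spacer and primer removed, or None if no primer found."""
    match = pattern.match(seq.upper())
    if match is None:
        return None
    return seq[match.end():]


# Forward primer taken from the oligos file quoted in this thread
forward = primer_regex("CCTACGGGNGGCWGCAG")
print(strip_primer("ACG" + "CCTACGGGAGGCAGCAG" + "TTTT", forward))
```

If you apply this to fastq records, the quality string has to be cut by the same number of characters as the sequence, otherwise the two lines go out of sync.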
Thanks for the fast reply.
I told them. Really annoying! Especially since they told me only after a couple of emails, when I insisted that “something is wrong!”.
Anyway, I need to deal with it now. And I am really, really sorry to have to bother other people with such things. When they make such a mess, they should at least clean it up before they sell it… They do, but you need to pay for that cleaning service…
I tried several things now, none of them really worked out.
(i) make.contigs using oligo file. It scraps nearly all sequences.
The oligo file looks something like this:
primer CCTACGGGNGGCWGCAG TACNVGGGTATCTAATCC
primer CCTACGGGNGGCWGCAG NTACNVGGGTATCTAATCC
primer CCTACGGGNGGCWGCAG NNTACNVGGGTATCTAATCC
primer CCTACGGGNGGCWGCAG NNNTACNVGGGTATCTAATCC
primer CCTACGGGNGGCWGCAG NNNNTACNVGGGTATCTAATCC
… (listing all possible combinations) …
barcode none none myGroup
(ii) trim.seqs after making contigs. It scraps a bit more than half of the sequences. The oligo looks something like this. The reverse is the reverse complement:
(iii) trim.seqs prior to make.contigs. Works fine for the forward reads; I scrap around 2% of the sequences. Running the exact same command on the reverse reads (regardless of whether I reverse complement the reverse primer; I checked just to be sure) crashes mothur instantly without even writing an entry to the logfile.
I tried the checkorient parameter; it only affects the time needed to run the command, naturally.
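In case it helps someone reproduce attempt (i), the padded primer lines can be generated rather than typed by hand. A rough Python sketch follows; the helper name `oligo_lines` is hypothetical, and padding the forward primer as well as the reverse one is my assumption (the lines shown above only pad the reverse primer). It only reproduces the file layout. As described in (i), mothur scrapped nearly all sequences with this kind of file, so this is not a fix.

```python
def oligo_lines(forward, reverse, max_spacer=10):
    """Build 'primer' lines with 0..max_spacer leading N's on each primer."""
    lines = []
    for n_fwd in range(max_spacer + 1):
        for n_rev in range(max_spacer + 1):
            lines.append(
                "primer %s %s" % ("N" * n_fwd + forward, "N" * n_rev + reverse)
            )
    return lines


# Primer pair copied from the oligos file quoted above
lines = oligo_lines("CCTACGGGNGGCWGCAG", "TACNVGGGTATCTAATCC")
print("\n".join(lines[:3]))
```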
Any further suggestions? This is a huge pain for a small task… Is there an option to simply cut the first and last 20 bp of each seq? It would solve all problems.
Thanks a lot!
The chop.seqs command allows for trimming sequences by a given number of bases. http://www.mothur.org/wiki/Chop.seqs
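To illustrate what "cut the first and last 20 bp of each seq" means in practice, here is a small Python sketch of fixed-length trimming. It is not the chop.seqs command and does not use its parameters (see the wiki page for those); the names `chop_ends` and `chop_fasta` are made up, and the FASTA helper assumes each sequence sits on a single line.

```python
def chop_ends(seq, n_front=20, n_back=20):
    """Drop n_front bases from the start and n_back bases from the end."""
    if len(seq) <= n_front + n_back:
        return ""  # nothing would be left of this read
    return seq[n_front:len(seq) - n_back]


def chop_fasta(lines, n_front=20, n_back=20):
    """Yield FASTA lines with every (single-line) sequence trimmed."""
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">") or not line:
            yield line  # header or blank line: pass through unchanged
        else:
            yield chop_ends(line, n_front, n_back)


record = [">read1\n", "A" * 20 + "CCGG" + "T" * 20 + "\n"]
print("\n".join(chop_fasta(record)))
```

Keep in mind that a fixed cut only removes the primers cleanly if the primer plus random prefix always has the same length, which is not the case for the data described in this thread, so some reads would keep a few primer bases and others would lose a few real ones.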
Thanks a lot - for the fast reply and for solving this problem. That was what I had been searching for. Perfect.
To make this story complete and not just tell bad things about sequencing companies: I insisted and they sent me the trimmed sequences.