Is there a way to include wildcards in the .oligos file for trim.seqs/trim.flows?
I could only find that . (dots) are allowed and each one replaces a single character, but I couldn’t find a wildcard character that matches 0 or more characters. Assuming there’s a regular-expression engine behind it, this should be possible with * or ?, but neither works. The reason I need this is that our sequencing output comes from the company with their MID before the barcode. Although I have the sequence for that MID, it seems more prone to errors, even indels, and these have little to do with the overall sequence quality.
At the moment I write my oligos file like this, but it’s sloppy:
barcode ....................ACACGT F515
barcode ...................ACACGT F515
barcode ..................ACACGT F515
barcode .................ACACGT F515
barcode ................ACACGT F515
barcode ...............ACACGT F515
Is there a way around it?
Or even better, is there a way to make mothur search for the barcode-primer combination anywhere in the sequence, not just at the beginning?
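As an illustration of that idea, outside mothur: a minimal Python sketch that looks for the barcode within the first few bases of a read and drops whatever precedes it. The function name `strip_linker` and the `max_linker` default of 20 are hypothetical, it matches the barcode exactly (no mismatches allowed), and it is not how mothur itself matches barcodes.

```python
def strip_linker(read, barcode, max_linker=20):
    """Return the read starting at the barcode, or None if no barcode
    starts within the first `max_linker` bases (hypothetical helper)."""
    # str.find(sub, start, end) only reports matches that lie fully inside
    # read[start:end], so the barcode may begin at index 0..max_linker.
    pos = read.find(barcode, 0, max_linker + len(barcode))
    if pos == -1:
        return None  # treat as unmatched, similar in spirit to a scrap read
    return read[pos:]


if __name__ == "__main__":
    # Four linker bases, then the barcode, then the rest of the read.
    print(strip_linker("TTTTACACGTGGG", "ACACGT"))
```

One caveat of this sketch: it takes the first occurrence, so a linker that happens to contain the barcode sequence would be trimmed at the wrong position.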
Thanks in advance,
Hmmm, the “.” character shouldn’t work. I’d suggest the following…
barcode ACACGT F515
Then set ldiffs= to the number of N’s in your longest linker. That has to be expensive to synthesize all of those primers, no?
Oh good to know for the next time; I’ll use N’s instead.
It wasn’t clear to me in the description of the function what is meant by ‘linkers’ and ‘spacers’.
Just to clarify, we synthesize only 1 primer for each barcode; the linker is added by the company.
I’m not fully aware of how they do things. I only have to deal with the consequences.
When I included the linkers in the oligos file this way
and ldiffs was set to 3, only those sequences with linker N were in the trim file; the seqs with longer linkers ended up in the scrap file. Any solution?
Mothur is designed to treat multiple matches as a mismatch. What you can do is create a separate oligos file for each wildcard linker string. Then you can run the command 3 times, feeding the scrap file from each run into the next run to find the sequences that match the other wildcards.
mothur > trim.seqs(fasta=yourFasta.fasta, oligos=oligos1, ldiffs=1, otherParameters)
mothur > trim.seqs(fasta=yourFasta.scrap.fasta, oligos=oligos2, ldiffs=2, otherParameters)
mothur > trim.seqs(fasta=yourFasta.scrap.scrap.fasta, oligos=oligos3, ldiffs=3, otherParameters)
mothur > merge.files(input=yourFasta.trim.fasta-yourFasta.scrap.trim.fasta-yourFasta.scrap.scrap.trim.fasta, output=yourFasta.merged.trim.fasta)
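If the number of linker lengths grows, the commands above get tedious to type by hand. Below is a minimal Python sketch that only builds the command strings for such a cascade. It assumes the same file-naming pattern and the oligos1, oligos2, ... names used above; the function name `build_cascade` is hypothetical, and the otherParameters placeholder is left out, so add your own parameters before running anything.

```python
def build_cascade(stem, passes):
    """Build mothur command strings: `passes` rounds of trim.seqs, each
    reading the previous round's scrap file, followed by one merge.files."""
    commands = []
    trimmed = []
    for i in range(1, passes + 1):
        # Round 1 reads stem.fasta; round i reads stem + ".scrap" * (i - 1) + ".fasta".
        prefix = stem + ".scrap" * (i - 1)
        commands.append(
            f"trim.seqs(fasta={prefix}.fasta, oligos=oligos{i}, ldiffs={i})"
        )
        trimmed.append(f"{prefix}.trim.fasta")
    merged_input = "-".join(trimmed)  # merge.files takes dash-separated names
    commands.append(
        f"merge.files(input={merged_input}, output={stem}.merged.trim.fasta)"
    )
    return commands


if __name__ == "__main__":
    for command in build_cascade("yourFasta", 3):
        print("mothur > " + command)
```

With `stem="yourFasta"` and `passes=3`, this reproduces the four commands above, minus otherParameters.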