Wildcards in .oligos file

Hi,
Is there a way to include wildcards in the .oligos file for trim.seqs/trim.flows?
I could only find that . (dots) are allowed and can replace a single character, but I couldn’t find a wildcard character (0 or more characters). Assuming there’s a regular expression engine behind it, this should be possible with * or ?, but neither works. The reason I need this is that we get our sequencing output from the company with their MID before the barcode. Although I have the sequence for that MID, it seems more prone to errors and even indels, and these have little to do with the overall sequence quality.
At the moment I write my oligos file like this, but it’s sloppy:

barcode ....................ACACGT F515
barcode ...................ACACGT F515
barcode ..................ACACGT F515
barcode .................ACACGT F515
barcode ................ACACGT F515
barcode ...............ACACGT F515

Is there a way around it?
Or, even better, is there a way to make mothur search for the barcode-primer combination anywhere in the sequence, not just at the beginning?

Thanks in advance,
Roey
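
One possible workaround outside mothur, as a rough sketch only: pre-trim each read so that it starts at the barcode. The Python below is not part of mothur; `strip_leading_mid`, `anchor` and `max_offset` are hypothetical names, and it assumes exact matching of the barcode (ideally followed by the first bases of the primer, so that a 6-base barcode is not matched by chance) within the first `max_offset` bases. The matching quality or flow data would need the same trimming, which is not shown here.

```python
def strip_leading_mid(seq, anchor, max_offset=30):
    """Drop everything before the first exact occurrence of `anchor`.

    `anchor` is the barcode, optionally followed by the start of the primer.
    Only occurrences starting within the first `max_offset` bases count.
    Returns the trimmed sequence, or None if the anchor is not found there.
    """
    # str.find limits the search to seq[0:end]; the anchor must fit entirely
    # inside that window, so its start position is at most max_offset.
    end = max_offset + len(anchor)
    pos = seq.find(anchor, 0, end)
    if pos == -1:
        return None  # leave these reads for manual inspection
    return seq[pos:]
```

For example, `strip_leading_mid("TTTTTACACGTGGG", "ACACGT")` returns `"ACACGTGGG"`, so the read then begins with the barcode that trim.seqs expects.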

Hmmm, the “.” character shouldn’t work. I’d suggest the following…

barcode ACACGT F515
linker NNNNNNN
linker NNNNNNNN
linker NNNNNNNNN
etc.

Then set ldiffs to the number of N’s in your longest linker. It has to be expensive to synthesize all of those primers, no?

Oh, good to know for next time; I’ll use N’s instead.
It wasn’t clear to me from the description of the function what is meant by ‘linkers’ and ‘spacers’.

Just to clarify, we synthesize only one primer for each barcode; the linker is added by the company.
I’m not fully aware of how they do things; I only have to deal with the consequences :)

Thanks again

When I included the linkers in the oligos file this way

linker N
linker NN
linker NNN

and set ldiffs to 3, only the sequences matching linker N ended up in the trim file; the sequences with longer linkers ended up in the scrap file. Is there a solution?

Mothur is designed to treat multiple matches as a mismatch. What you can do is create a separate oligos file for each wildcard linker string. Then you can run the command three times, feeding the scrap file from each run into the next run, to find the sequences that match the other wildcards.


mothur > trim.seqs(fasta=yourFasta.fasta, oligos=oligos1, ldiffs=1, otherParameters)
mothur > trim.seqs(fasta=yourFasta.scrap.fasta, oligos=oligos2, ldiffs=2, otherParameters)
mothur > trim.seqs(fasta=yourFasta.scrap.scrap.fasta, oligos=oligos3, ldiffs=3, otherParameters)
mothur > merge.files(input=yourFasta.trim.fasta-yourFasta.scrap.trim.fasta-yourFasta.scrap.scrap.trim.fasta, output=yourFasta.merged.trim.fasta)
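
If you have more than a few linker lengths, writing the separate oligos files by hand gets tedious. Here is a minimal Python sketch that generates them. It is a helper outside mothur, the function names `oligos_text` and `write_oligos_files` are made up for illustration, and it assumes each file should hold all of your barcode lines plus exactly one all-N linker line. The file names oligos1, oligos2, ... simply follow the commands above.

```python
from pathlib import Path

# Barcode line taken from this thread; add one entry per barcode you use.
BARCODE_LINES = ["barcode ACACGT F515"]


def oligos_text(linker_length, barcode_lines=BARCODE_LINES):
    """Return oligos-file text with a single all-N linker of the given length."""
    lines = list(barcode_lines)
    lines.append("linker " + "N" * linker_length)
    return "\n".join(lines) + "\n"


def write_oligos_files(max_linker_length, out_dir="."):
    """Write oligos1 .. oligos<max>, one linker length per file; return the paths."""
    paths = []
    for n in range(1, max_linker_length + 1):
        path = Path(out_dir) / f"oligos{n}"
        path.write_text(oligos_text(n))
        paths.append(path)
    return paths
```

Calling `write_oligos_files(3)` writes oligos1, oligos2 and oligos3 to the current directory; each file is then used with the matching ldiffs value, as in the commands above.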