How to trim variable-length ITS seqs

Hello - I’m new here. I recently took the workshop and I love everything about mothur so far. I’m currently working with fungal ITS sequences representing a diverse soil microbial community, and I’m having trouble with a seemingly simple task: trimming 25 bp off the 5’ and 3’ ends.

For any non-mushroom people reading this, fungal ITS regions vary widely in length. My ITS2 sequences range from 269 to 475 bp. Also, ITS is so hypervariable that it cannot be aligned.

I know that I can at least remove the primers by listing them in trim.seqs, but I also want to remove an additional portion from each end. These portions likely vary between species, so I can’t include them in trim.seqs as if they were extensions of the primers. Another (minor) limitation of trim.seqs is that it discards otherwise good-quality sequences if there is an insertion/deletion in the primer sequence. (Do I have that correct?)

I would use chop.seqs, but because the sequence lengths vary, there is no single length to keep that we could give to the numbases parameter. And we can’t align the sequences using align.seqs to force them to be the same length. We only know that we want to remove X bp from each end.

Am I missing something? Is there a hacky work-around?

Mothur doesn’t have a direct way to do this, but you can use this hack.

fakeout.oligos:

barcode NNNNNNNNNNNNNNNNNNNNNNNN NNNNNNNNNNNNNNNNNNNNNNNN trimGroup

If the bdiffs parameter is set, mothur will consider an N in the barcode a match to the sequence as long as the sequence doesn’t contain an N in that location.
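If you want a different trim length, here is a minimal Python sketch that builds the oligos line for you. It assumes that the number of Ns in each barcode is the number of bases removed from that end, and it simply copies the line layout shown above. The function name and its parameters are made up for illustration and are not part of mothur.

```python
def fakeout_oligos_line(trim_len: int, group: str = "trimGroup") -> str:
    """Build one paired-barcode oligos line made entirely of Ns.

    Assumption: each barcode's length equals the number of bases that
    should come off that end of the read.
    """
    if trim_len < 1:
        raise ValueError("trim_len must be at least 1")
    barcode = "N" * trim_len
    # Same layout as the fakeout.oligos line above:
    # barcode <forward> <reverse> <group name>
    return f"barcode {barcode} {barcode} {group}"


if __name__ == "__main__":
    # Print a line for a 25 bp trim; redirect it into fakeout.oligos.
    print(fakeout_oligos_line(25))
```

Saving the printed line as fakeout.oligos gives a file you can pass to the oligos parameter.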

Try this:

mothur > trim.seqs(fasta=yourFastaFile, oligos=fakeout.oligos, bdiffs=1)
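To sanity-check what comes out of mothur, or if the hack misbehaves, the same fixed-length trim can be sketched in plain Python. This is not a mothur feature, just a rough standalone script; the function names and file names are hypothetical, and it assumes an ordinary FASTA file where sequences may wrap over several lines.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def trim_ends(seq, n):
    """Remove n bases from each end; return None if nothing would be left."""
    if len(seq) <= 2 * n:
        return None
    return seq[n:len(seq) - n]


def trim_fasta(in_path, out_path, n=25):
    """Write a trimmed copy of in_path to out_path; return the number kept."""
    kept = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        for header, seq in read_fasta(src):
            trimmed = trim_ends(seq, n)
            if trimmed is None:
                continue  # too short to trim, so skip it
            dst.write(f">{header}\n{trimmed}\n")
            kept += 1
    return kept
```

For example, trim_fasta("yourFastaFile", "trimmed.fasta", n=25) would write the trimmed reads to a new file. Note that this only touches the fasta file, so any matching names, count, or qual files would go out of sync if sequences get skipped.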


Wow, thanks! I will give that a try and report back. 🙌
