Can I use trim.seqs to extract V2 from a V1V3 read?

Hi all,

I am interested in looking at just the V2 region in the 16S gene and I have V1V3 reads (from the HMP dataset).

My question is, can I use trim.seqs to extract it? If yes, I’d be grateful for any advice regarding setting up the oligos file.

If this is not a good approach, please let me know if there is a better one.


You might find that the pcr.seqs command is more suited to this. It works in a similar fashion, you provide an oligos file with your primer sequences and the fasta file to trim from.

That would be the most straight forward method, but I’ve found sometimes that using pcr.seqs discards a number of sequences (when it can’t find your primer targets in the sequence). An alternative approach, which would be a bit more complicated, is that you could align a few representative V2 sequences and then use their alignment space as a filtering criteria for the rest of your data.

For example, if you copy an E. coli V2 sequence into a fasta file, you could run the commands:

align.seqs(fasta=Ecoli.V2.fasta, reference=silva.bacteria.fasta)
filter.seqs(fasta=Ecoli.V2.align, vertical=T, trump=.)

align.seqs(fasta=your_real_data.fasta, reference=silva.bacteria.fasta, flip=T)
filter.seqs(fasta=your_read_data.align, hard=Ecoli.V2.filter)

This would allow you to trim your sequences to the V2 region and not worry about losing sequences that don’t match your primer exactly.