I’m testing classification based on full-length 16S vs V3V4 sequences (I know Dr. Schloss has reservations about V3V4, but I have to work with what I’ve got!) from the Zymo Mock Microbial Community standard kit.
What’s odd is that with full length, I’m getting less resolution in terms of species classification than with just V3V4 (I also know Dr. Schloss is not so keen on species-level classifications, but that’s what our collaborators require, even if there’s a chance it could be wrong…).
With V3V4, I can get to species level for 7 of the 8 bacterial species (B. subtilis can only be classified to genus), whereas with full-length 16S I get only 5 classified to species (B. subtilis, L. monocytogenes, E. faecalis, P. aeruginosa, L. fermentum) and the other 3 only to genus.
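For anyone who wants to repeat this comparison, here is a rough Python sketch of one way to tally how deep each sequence is classified. It is not the script behind the numbers above. It assumes a mothur-style taxonomy file with one sequence per line: a sequence name, a tab, then semicolon-separated levels that may carry a bootstrap value in parentheses, with unresolved levels ending in "unclassified". The function names and the two example lines are made up for illustration.

```python
import re
from collections import Counter


def classified_depth(taxonomy: str) -> int:
    """Count the leading taxonomy levels that carry a real name.

    Assumes levels look like 'Bacillus(100)' and that unresolved levels
    end in 'unclassified' (e.g. 'Bacillus_unclassified(100)').
    """
    depth = 0
    for level in taxonomy.strip().strip(";").split(";"):
        # Drop a trailing bootstrap value such as '(100)'.
        name = re.sub(r"\(\d+\)$", "", level.strip())
        if not name or name.endswith("unclassified"):
            break
        depth += 1
    return depth


def depth_counts(lines):
    """Map classification depth -> number of sequences reaching it."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        _seq_name, taxonomy = line.split("\t", 1)
        counts[classified_depth(taxonomy)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical lines, only to show the expected input shape.
    example = [
        "seq1\tBacteria(100);Firmicutes(100);Bacilli(100);Lactobacillales(100);"
        "Enterococcaceae(100);Enterococcus(100);Enterococcus_faecalis(92);",
        "seq2\tBacteria(100);Firmicutes(100);Bacilli(100);Bacillales(100);"
        "Bacillaceae(100);Bacillus(100);Bacillus_unclassified(100);",
    ]
    for depth, n in sorted(depth_counts(example).items()):
        print(f"{n} sequence(s) classified to level {depth}")
```

Running the same tally on the full-length and the V3V4 taxonomy files would give a side-by-side view of how many sequences reach each level in each run.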
I’m using the same settings for both full length and V3V4 (the tutorial recommends ksize=8 so I didn’t want to mess with it):
mothur "#classify.seqs(fasta=test.fasta, template=silva.nr_v132.align, taxonomy=silva.nr_v132.tax, processors=20, inputdir=in, outputdir=out, ksize=8)"
What am I doing wrong here? How can 16S full length be worse for classification than just V3V4?