Trainset Necessary?

I am doing V3V4 16S amplicon sequencing for my project, but I am quite new to bioinformatics.

For classify.seqs, is it necessary to use the trainset files for the reference and taxonomy? Can I use silva.nr_138align for the reference and taxonomy instead?

Is it also necessary to have metadata for make.biom? If so, how do I get it?

Thank you

That’s fine - “trainset” is the name of the RDP training set. “silva.nr_v138” is the name of the SILVA training set.



After classifying the sequences, some of the bacteria come out as unclassified at the genus level.

For example:
One sequence is classified as enterobacteriaceae;enterobacter
Another sequence is classified as enterobacteriaceae;enterobacteriaceae_unclassified

That means it didn’t have enough confidence to classify the second sequence to the genus level.
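To make the confidence idea concrete, here is a rough Python sketch of how a cutoff can turn a low-confidence genus call into a family_unclassified label. This is not mothur's actual code: the function name `apply_cutoff`, the per-rank scores, and the cutoff of 80 are all assumptions chosen for illustration, and the input format is only modelled on a taxonomy string with a score in parentheses after each name.

```python
import re

def apply_cutoff(taxonomy, cutoff=80):
    """Hypothetical helper: relabel ranks once confidence falls below the cutoff."""
    # Split "Name(score);Name(score);" into (name, score) pairs.
    ranks = [(name, int(score))
             for name, score in re.findall(r"([^;()]+)\((\d+)\)", taxonomy)]
    kept = []
    for i, (name, score) in enumerate(ranks):
        if score < cutoff:
            # From the first low-confidence rank onward, reuse the last
            # confident name with an "_unclassified" suffix.
            last = kept[-1] if kept else "unknown"
            kept.extend([f"{last}_unclassified"] * (len(ranks) - i))
            break
        kept.append(name)
    return ";".join(kept)

# Illustrative scores only, not real classifier output.
first = apply_cutoff("enterobacteriaceae(100);enterobacter(95);")
second = apply_cutoff("enterobacteriaceae(100);enterobacter(62);")
print(first)
print(second)
```

In this sketch the first sequence keeps its genus because both scores clear the cutoff, while the second falls back to the family name with "_unclassified" appended because its genus score does not.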


