Greengenes 13_8 minor release

It looks like there is a new 13_8 minor release of the Greengenes taxonomy files (see https://groups.google.com/forum/#!topic/qiime-forum/vaV1rgiF08E). From the release notes:

Greengenes minor release to address missing genus and species names. These corrections were sourced by the OTU clusters for the 99% OTUs. Specifically, for each OTU, if it lacked a genus or a species name, the NCBI taxonomy of the cluster members were examined and the most supported name(s) were chosen to update the taxonomy of the representative sequence. Taxonomy information was then propagated to all the lower similarity representative sequences.

Any plans to update the training set files on the wiki?

I’ve been trying out the new taxonomy, and it does seem to help classify some sequences down to genus. I’m also wondering whether their approach is a sound way to assign taxonomy to the representative sequences. It seems to me that if 16S sequences from multiple species fall in the same 99% OTU, just choosing one of those annotations for the representative sequence might not be the right solution.
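
To make the concern concrete, here is a minimal Python sketch of a "most supported name" vote. This is only my reading of the release notes, not the actual Greengenes procedure: I am assuming "most supported" means a simple plurality among the annotated cluster members. The function name `most_supported_name`, the `min_support` cutoff, and the example names are all made up for illustration.

```python
from collections import Counter

def most_supported_name(member_names, min_support=0.0):
    """Pick the most common non-empty name among OTU cluster members.

    Hypothetical helper: assumes "most supported" means plurality vote.
    Returns (name, support), where support is the fraction of annotated
    members carrying that name. Returns (None, support) when the winner
    falls below min_support, and (None, 0.0) when no member is annotated.
    """
    # Ignore members that have no name at this rank.
    names = [n for n in member_names if n]
    if not names:
        return None, 0.0
    name, count = Counter(names).most_common(1)[0]
    support = count / len(names)
    if support < min_support:
        return None, support
    return name, support

# Made-up cluster: three species share one 99% OTU, one member is unannotated.
members = ["Escherichia coli", "Shigella flexneri",
           "Escherichia coli", "Shigella sonnei", ""]

print(most_supported_name(members))                   # plurality winner
print(most_supported_name(members, min_support=0.8))  # refuses to call it
```

With a plain plurality vote, the representative sequence gets a species name that only half of the annotated members actually carry. Requiring a minimum level of support would leave the species blank in a mixed cluster like this one, which is closer to what I would expect for sequences that 16S cannot really tell apart.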

Yeah, we can do that…

I’ve updated the reference:

http://www.mothur.org/wiki/Greengenes-formatted_databases