Inconsistent taxonomy classification for Silva data?

We’re using the IM-TORNADO pipeline, which internally uses mothur for classify.seqs, among other things.

While analyzing some 18S amplicon data, we noticed that the classification appears off in many cases. This is using the provided Silva r119 SSU data from the mothur site (http://www.mothur.org/wiki/Silva_reference_files).

As an example, here is a subset of the OTU taxonomy results w/ Nematoda:

...
30 Eukaryota(100);Opisthokonta(100);Holozoa(100);Metazoa(100);Animalia(100);Nematoda(100);
74 Eukaryota(100);Opisthokonta(100);Holozoa(100);Metazoa(100);Animalia(100);Nemertea(100);
297 Eukaryota(100);Opisthokonta(100);Holozoa(100);Metazoa(100);Animalia(100);Nematoda(99.9);
482 Eukaryota(100);Opisthokonta(75.6);Holozoa(75.6);Metazoa(75);Animalia(75);Nematoda(55.4);
637 Eukaryota(100);Opisthokonta(85.1);Holozoa(85.1);Metazoa(84.9);Animalia(84.9);Nematoda(70.6);
...
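
For anyone who wants to inspect these strings programmatically, here is a minimal Python sketch that splits a taxonomy string of the form shown above into (taxon, confidence) pairs. The function names and the cutoff of 80 are only illustrative assumptions, not anything taken from mothur or IM-TORNADO.

```python
import re

# Matches one taxonomy field such as "Nematoda(55.4)".
FIELD = re.compile(r"(.+)\((\d+(?:\.\d+)?)\)")

def parse_taxonomy(tax_string):
    """Return [(taxon, confidence), ...] for a 'Name(conf);Name(conf);' string."""
    pairs = []
    for field in tax_string.strip().split(";"):
        if not field:
            continue  # skip the empty piece after the trailing ';'
        match = FIELD.fullmatch(field)
        if match:
            pairs.append((match.group(1), float(match.group(2))))
        else:
            pairs.append((field, None))  # field carries no confidence value
    return pairs

def deepest_confident(tax_string, cutoff=80.0):
    """Hypothetical helper: deepest taxon whose confidence is at least `cutoff`."""
    best = None
    for taxon, confidence in parse_taxonomy(tax_string):
        if confidence is None or confidence < cutoff:
            break
        best = taxon
    return best

# OTU 482 from the listing above.
otu_482 = "Eukaryota(100);Opisthokonta(75.6);Holozoa(75.6);Metazoa(75);Animalia(75);Nematoda(55.4);"
print(deepest_confident(otu_482))  # prints "Eukaryota"
```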

According to the tax_slv_ssu_nr_119.txt file obtained from the Silva downloads site (http://www.arb-silva.de/fileadmin/silva_databases/release_119/Exports/taxonomy/tax_slv_ssu_nr_119.txt), ‘Nematoda’ is not ranked and ‘Animalia’ is a sub-phylum (here they are ‘genus’ and ‘phylum’, respectively).
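
A rough Python sketch of how the ranks could be looked up from that file follows. It assumes the file is tab-separated, with the semicolon-terminated taxonomy path in the first column and the rank in the third; that layout should be checked against the README.txt in the same directory before relying on it. The function names are hypothetical, and taxon names in the mothur output may be spelled differently from the Silva paths (for example, underscores in place of spaces), in which case the lookup reports "unknown".

```python
def load_silva_ranks(path):
    """Map each taxonomy path (e.g. 'Eukaryota;Opisthokonta;') to its rank.

    Assumes tab-separated columns with the path first and the rank third.
    """
    ranks = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            fields = line.rstrip("\r\n").split("\t")
            if len(fields) < 3:
                continue  # skip blank or malformed lines
            ranks[fields[0]] = fields[2]
    return ranks

def annotate_lineage(names, ranks):
    """Pair each taxon in a lineage with the rank of its cumulative path."""
    annotated = []
    prefix = ""
    for name in names:
        prefix += name + ";"
        annotated.append((name, ranks.get(prefix, "unknown")))
    return annotated

# Example usage (requires the downloaded file in the working directory):
# ranks = load_silva_ranks("tax_slv_ssu_nr_119.txt")
# lineage = ["Eukaryota", "Opisthokonta", "Holozoa", "Metazoa", "Animalia", "Nematoda"]
# for name, rank in annotate_lineage(lineage, ranks):
#     print(name, rank)
```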

We can go back and validate the classification for our other runs (we also have 16S) and possibly fix it using the above file, but I wanted to know whether anyone else has run into the same issue.

Whoa, I didn’t know that file existed! That is very helpful actually. What I’ve posted and described (http://blog.mothur.org/2014/08/08/SILVA-v119-reference-files/) just uses the first 6 levels of the eukaryotic taxonomy string because I couldn’t figure it out on my own. I’ll try to get this updated to make a better eukaryotic taxonomy file.

Pat

No problem, thanks for having a look!

I should add, there are a few other files in that directory that may also be of use (and possibly better).

The README.txt file describes their purpose a bit more.