I used the new taxonomy file silva.nr_v119.tax to classify my eukaryotic sequences. I have now noticed that most of my sequences are classified only down to family level. After checking the tax file, I found that most eukaryotic sequences in the references do not contain taxonomic information at genus level. Why is that? In previous versions of the eukaryotic reference files, classification was possible down to taxlevel 19 or 20, so they definitely included information at genus level.
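One way to check how deep the lineages in a tax file actually go is to count the ranks per reference sequence. The following is a minimal sketch, not part of mothur; it assumes the usual two-column layout (sequence ID, a tab, then a semicolon-separated lineage with a trailing semicolon). The function names and the placeholder lineages (`rank2`, `rank3`, ...) are made up for illustration.

```python
from collections import Counter


def lineage_depth(lineage: str) -> int:
    """Count the non-empty ranks in a semicolon-separated lineage."""
    return sum(1 for rank in lineage.split(";") if rank.strip())


def depth_histogram(lines):
    """Map each lineage depth to the number of sequences with that depth."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Assumed layout: sequence ID, tab, lineage
        _seq_id, lineage = line.split("\t", 1)
        counts[lineage_depth(lineage)] += 1
    return counts


# Placeholder records, not real SILVA entries
example = [
    "seqA\tBacteria;rank2;rank3;rank4;rank5;rank6;",
    "seqB\tEukaryota;rank2;rank3;rank4;rank5;rank6;",
    "seqC\tEukaryota;rank2;rank3;rank4;",
]
hist = depth_histogram(example)
for depth in sorted(hist):
    print(depth, hist[depth])
```

To run it on a real file, pass an open file handle instead of the example list, e.g. `depth_histogram(open("silva.nr_v119.tax"))`. If no lineage is deeper than 6, the file was cut at that level.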
SILVA has redone their taxonomic outline in the last few years to try to get rid of the 19 or 20 levels, since that does not correspond to a Linnaean system. It could be that they only go to the family level because that’s all that is possible with the data in the database and the information in your sequence.
It is true that SILVA redid the taxonomic outline and also that they cut it down to fewer tax levels for Bacteria and Archaea. However, this was not possible for the Eukaryotes because of the many sub-classes, orders, etc. So most of the eukaryotic sequence classifications include more than 6 tax levels. It seems, though, that the taxonomy was cut after tax level 6 in the new ref 119 file that you provide. This is no problem for classifying Bacteria and Archaea, but it is not sufficient for Eukaryotes.
Is there any way to update the tax file and include more levels?
If you look back through the README file, we provide all of the code that we used to generate the files. You could modify that code to get the level of taxonomic depth you desire.
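For illustration only, the effect of the depth cutoff can be sketched as below. This is not the code from the README; `truncate_lineage` and `max_levels` are hypothetical names, and the lineage is a placeholder rather than a real SILVA entry. It assumes lineages are semicolon-separated with a trailing semicolon.

```python
def truncate_lineage(lineage: str, max_levels: int) -> str:
    """Keep at most max_levels ranks and restore the trailing semicolon."""
    ranks = [rank for rank in lineage.split(";") if rank.strip()]
    return ";".join(ranks[:max_levels]) + ";"


# Placeholder eukaryotic lineage with 9 ranks, genus last
full = "Eukaryota;rank2;rank3;rank4;rank5;rank6;rank7;rank8;Genus_x;"

six = truncate_lineage(full, 6)   # genus is lost at a cutoff of 6
nine = truncate_lineage(full, 9)  # a deeper cutoff keeps the genus

print(six)
print(nine)
```

With a cutoff of 6 the genus never reaches the output, which matches the behaviour described above for eukaryotes; raising the cutoff in whatever step trims the lineage would retain it.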