Hi, does anyone know a program that can convert a .cons.taxonomy file output by classify.otu to NCBI tax IDs? Example: Firmicutes(100);Erysipelotrichi(100);Erysipelotrichales(100);Erysipelotrichaceae(100);Allobaculum(100);Allobaculum_unclassified(100);Allobaculum_unclassified(100);Allobaculum_unclassified(100); converts to (taxid:1187017)
You’d likely need to classify your sequences against an NCBI-based taxonomy. Is that what you’re already doing? If so, unfortunately, mothur doesn’t have anything that does this conversion directly.
I suspect you could strip out the parentheses and the numbers inside them and then look the names up in a reference taxonomy in R. If you are using greengenes/silva/rdp, expect a lot of taxonomies that don’t match what NCBI has.
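Here is a minimal sketch of that strip-and-look-up idea, written in Python rather than R. It is not a mothur feature. The function names are made up for illustration. It assumes you have downloaded `names.dmp` from the NCBI taxonomy dump, and that walking up the lineage to the deepest name NCBI recognizes is an acceptable fallback. Names with no NCBI match return `None`, which will probably happen often with greengenes/silva/rdp.

```python
import re

def clean_lineage(taxonomy):
    """Split a cons.taxonomy lineage and drop the trailing '(100)'-style numbers."""
    parts = [p for p in taxonomy.strip().split(";") if p]
    return [re.sub(r"\(\d+(?:\.\d+)?\)$", "", p) for p in parts]

def load_names_dmp(path):
    """Build a {scientific name: taxid} dict from an NCBI names.dmp file.

    Fields in names.dmp are separated by '|' with tab padding:
    tax_id | name_txt | unique name | name class |
    """
    name_to_taxid = {}
    with open(path) as handle:
        for line in handle:
            fields = [f.strip() for f in line.split("|")]
            if len(fields) >= 4 and fields[3] == "scientific name":
                # Keep the first taxid seen if a name occurs more than once.
                name_to_taxid.setdefault(fields[1], int(fields[0]))
    return name_to_taxid

def lineage_to_taxid(taxonomy, name_to_taxid):
    """Return (name, taxid) for the deepest rank found in the lookup, else None."""
    for name in reversed(clean_lineage(taxonomy)):
        if name.endswith("_unclassified") or name == "unclassified":
            continue  # placeholder ranks carry no name of their own
        # Try the name as written, then with underscores turned into spaces.
        for candidate in (name, name.replace("_", " ")):
            if candidate in name_to_taxid:
                return candidate, name_to_taxid[candidate]
    return None

if __name__ == "__main__":
    line = ("Firmicutes(100);Erysipelotrichi(100);Erysipelotrichales(100);"
            "Erysipelotrichaceae(100);Allobaculum(100);"
            "Allobaculum_unclassified(100);Allobaculum_unclassified(100);"
            "Allobaculum_unclassified(100);")
    # Toy lookup: the ID is the one quoted in the question, not checked against NCBI.
    toy_lookup = {"Allobaculum": 1187017}
    print(lineage_to_taxid(line, toy_lookup))
```

For real data you would replace the toy dictionary with `load_names_dmp("names.dmp")` and run `lineage_to_taxid` on the taxonomy column of each row in the .cons.taxonomy file. Matching on name alone is crude: the same name can appear under more than one taxid, so spot-check the results.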