mothur

classify.seqs: How to get the names of the reference sequences

Hi Pat,
The classify.seqs command usually returns the taxonomy information.
How can I get the names of the reference sequences in the template?

For example, an entry in the reference taxonomy:

172163 k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__;g__;s__

How can I get “172163” returned for my submitted sequences, and not only “k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__;g__;s__”?

THANKS!

That’s a greengenes accession number. You could probably use get.seqs to pull out the actual sequence and see if there’s other information in the sequence header.
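As a rough illustration of what looking at the sequence header could involve outside of mothur, here is a minimal Python sketch that scans FASTA lines and returns the full header for a given accession. The function name `find_header` and the example records are hypothetical (only the ID 172163 comes from this thread), and it assumes the sequence ID is the first whitespace-delimited token after the ">".

```python
def find_header(fasta_lines, accession):
    """Return the header (without '>') of the record whose ID equals `accession`, else None."""
    for line in fasta_lines:
        if line.startswith(">"):
            header = line[1:].strip()
            # Assumption: the ID is the first whitespace-delimited token of the header.
            parts = header.split(None, 1)
            if parts and parts[0] == accession:
                return header
    return None


# Made-up records for illustration; real reference headers may carry no extra text at all.
example = [
    ">172163 hypothetical extra header text",
    "ACGTACGT",
    ">999999",
    "TTGACA",
]

print(find_header(example, "172163"))
```

With a real reference file, the lines could be passed in directly, for example `find_header(open(path), "172163")`, where `path` points at the reference FASTA.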

Pat

Thank you for your reply.
I may not have expressed my problem clearly.
When I use the “classify.seqs” command to classify my own sequences, the files returned are “****.summary” and “****.taxonomy”.
In “****.taxonomy”, the result is:
xg1.3.186974 (my seqs name) k__Bacteria(100); p__Gemmatimonadetes(100); c__Gemmatimonadetes(100); o__Ellin5290(85); f__(85); g__(85);

I want the “****.taxonomy” file to report the accession ID in place of the specific taxonomy information, such as:
xg1.3.186974 (my seqs name) accession ID in greengenes

Can this be done directly? If not, can it be done by some indirect means?

That’s not how the Bayesian classifier works. The classifier gives you a confidence score that a sequence has a specific classification. The problem is that there are often many things that are just as good a match and it would be biased to claim one is a better match over another. Reporting a single closest reference sequence is certainly possible using one of the other methods in classify.seqs, but I do not recommend them.
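To make the "many equally good matches" point concrete, here is a small Python sketch that lists every reference ID whose lineage agrees with an assigned classification. It is only an illustration of the ambiguity, not what the classifier does internally. The helper names, the IDs `ref_B` and `ref_C`, their lineages, and the confidence scores in the query are all made up; only 172163 and its lineage come from this thread.

```python
import re


def ranks(taxonomy):
    """Split a taxonomy string into ranks, dropping confidence scores such as '(85)'."""
    cleaned = re.sub(r"\(\d+\)", "", taxonomy)
    return tuple(r.strip() for r in cleaned.split(";") if r.strip())


def references_matching(assigned, reference_taxonomy):
    """Return every reference ID whose lineage agrees with `assigned`, rank by rank."""
    query = ranks(assigned)
    return [ref_id for ref_id, tax in reference_taxonomy.items()
            if ranks(tax)[:len(query)] == query]


# Hypothetical reference taxonomy (ID -> lineage); only 172163 appears in the thread.
reference_taxonomy = {
    "172163": "k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__;g__;s__",
    "ref_B": "k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Lachnospiraceae;g__;s__",
    "ref_C": "k__Bacteria;p__Gemmatimonadetes;c__Gemmatimonadetes;o__Ellin5290;f__;g__;s__",
}

# Illustrative classification with invented confidence scores.
assigned = "k__Bacteria(100);p__Firmicutes(100);c__Clostridia(100);o__Clostridiales(90);"

print(references_matching(assigned, reference_taxonomy))
```

More than one reference ID comes back for the same classification, so writing a single accession into the taxonomy file would mean picking one arbitrarily.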

Thank you very much !