How to get a .tax file for classify.seqs/classify.otu from RDP database?

Hello there.

I am following the MiSeq_SOP to test mothur commands for future analyses, and I have run into some problems with what I want to do later with my own sequences.

The last step I have made is clustering the sequences into OTUs with


And after that, I got a .shared file with


Now I want to do 2 things:
First, I have to subsample my groups to rarefy the sequences in each group.
Second, I need to assign the taxonomy to the sequences.

I do not know which of the two would be better to do first, but either way, I want to assign the taxonomy with a database other than SILVA, specifically RDP.
To do so, I have downloaded the unaligned Bacteria 16S fasta file from the RDP website, but as far as I can see, a .tax file is needed to run both the classify.seqs and the classify.otu commands, and I do not know where to get it or how to generate it for the RDP.fasta I have.

Does anyone know how to get a .tax file that couples with the RDP.fasta file?

If it is not possible to do this, I have thought of getting the OTU representative sequences that come out of the cluster command and comparing them outside mothur (maybe with the RDP Classifier).
For doing that, I have thought of using


probably with the options

column=, list=, count=

Would it be correct to do that? Is there any other way to get the representative sequences for the OTUs included in the .shared file?

Thanks a lot

After searching the wiki, I have found that it is possible to download the RDP database and taxonomy under names similar to ‘trainset…16.rdp.fas’ and ‘.tax’.

However, these files are only about 20 MB, while the database file downloaded from the RDP resources website is almost 3 GB.

Are they comparable in terms of outputting the same taxonomy information?


Yeah, they should give very similar output. The RDP download may include their relatively poor sequence alignment, which would suck up a lot of disk space.
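
If you want to confirm that a reference fasta and a .tax file actually pair up before handing them to classify.seqs, something like the following might help. This is only a rough sketch in Python, not part of mothur: it assumes the taxonomy file has one line per sequence with the sequence ID first, then a tab, then the semicolon-separated lineage, and that the fasta ID is the first whitespace-delimited token of each header line. The function names are made up for illustration.

```python
def fasta_ids(lines):
    """Collect sequence IDs: the first token after '>' on each header line."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                ids.add(tokens[0])
    return ids


def taxonomy_ids(lines):
    """Collect IDs from taxonomy lines, assumed to look like 'ID<TAB>lineage'."""
    ids = set()
    for line in lines:
        line = line.strip()
        if line:
            # The ID is taken to be everything before the first whitespace.
            ids.add(line.split(None, 1)[0])
    return ids


def compare_reference(fasta_lines, tax_lines):
    """Return (IDs only in the fasta, IDs only in the taxonomy), each sorted."""
    in_fasta = fasta_ids(fasta_lines)
    in_tax = taxonomy_ids(tax_lines)
    return sorted(in_fasta - in_tax), sorted(in_tax - in_fasta)
```

You would open the two files and pass the file objects to compare_reference; two empty lists mean every sequence has a taxonomy line and vice versa.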