I’m using the 16S rRNA alignment from the CRW website as a reference. Each sequence has an accession ID, so I’ve written a few messy scripts that parse this ID, look up the taxonomic information, and add it back into the alignment FASTA file in a form that mothur prefers. After removing sequences without any taxonomic information, I’m left with ~20,000 of the original ~35,000 sequences in the alignment file.
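For anyone wanting to try the same thing, here is a rough Python sketch of the filtering step. It is not my actual script. It assumes the accession ID is the first whitespace-separated token of each FASTA header, and that the lookup has already produced a dictionary mapping accession IDs to lineages (the `taxonomy` argument and both function names are made up for illustration). It writes the taxonomy as separate tab-delimited lines with semicolon-terminated lineages, which is the layout I understand mothur's taxonomy files to use.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None and line:
            # Aligned sequences may wrap over several lines; keep gap characters.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_taxonomy(
    records: Iterable[Tuple[str, str]],
    taxonomy: Dict[str, List[str]],
) -> Tuple[List[str], List[str]]:
    """Keep only records whose accession has a lineage in `taxonomy`.

    Returns (fasta_entries, taxonomy_lines).
    """
    fasta_out, tax_out = [], []
    for header, seq in records:
        parts = header.split()
        if not parts:
            continue  # skip records with an empty header
        accession = parts[0]  # assumption: accession ID is the first token
        lineage = taxonomy.get(accession)
        if not lineage:
            continue  # drop sequences without taxonomic information
        fasta_out.append(f">{accession}\n{seq}")
        tax_out.append(f"{accession}\t{';'.join(lineage)};")
    return fasta_out, tax_out
```

To use it, open the alignment file, pass the file handle to `read_fasta`, and write the two returned lists to a FASTA file and a taxonomy file, one entry per line.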
I’m curious whether anyone has used this alignment as a reference before, and what luck they’ve had with it. I decided to forgo SILVA because I wanted to put an emphasis on the bacterial genus Leptospira, which is represented by fewer sequences in SILVA than in CRW.
The alignment can be found here: http://www.rna.icmb.utexas.edu/DAT/3C/Alignment/