Classification of Sequences


After running through the MiSeq SOP, I wanted to pull out the sequences for specific OTUs. I did this with get.seqs, using a list of the original sequence names that I built by taking the unique names for my OTUs. I was able to generate a fasta file with the sequences. However, when I run them through NCBI blastn with the 16S ribosomal sequences as my database, several of the classifications are completely different from how mothur classified my OTUs. I’m not sure which is correct or whether I’m doing something wrong. I’ve tried greengenes and silva/rdp in the pipeline, but the sequences still appear to be misclassified. Any help is appreciated.
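
For anyone wanting to double-check the extraction step outside mothur, the idea of pulling records out of a fasta file by a list of names can be sketched in Python. This is only a minimal illustration, not how get.seqs is implemented: the function names (read_names, filter_fasta) and the demo data are hypothetical, and it assumes each sequence name is the first whitespace-delimited token on the fasta header line.

```python
def read_names(lines):
    """Collect sequence names, one per line, ignoring blank lines."""
    return {line.strip() for line in lines if line.strip()}


def filter_fasta(fasta_lines, wanted):
    """Yield (name, sequence) for fasta records whose name is in `wanted`.

    Assumption: the name is the first whitespace-delimited token after '>'.
    """
    name, chunks = None, []
    for line in fasta_lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A new header closes the previous record, so emit it if wanted.
            if name in wanted:
                yield name, "".join(chunks)
            parts = line[1:].split()
            name = parts[0] if parts else ""
            chunks = []
        elif line.strip():
            # Sequence lines may be wrapped; join them back together.
            chunks.append(line.strip())
    # Emit the final record, which has no header after it.
    if name in wanted:
        yield name, "".join(chunks)


# Hypothetical demo data standing in for a fasta file and a names list.
demo_fasta = [">seqA extra\n", "ACGT\n", "GG\n", ">seqB\n", "TTTT\n", ">seqC\n", "CCCC\n"]
demo_names = ["seqA\n", "\n", "seqC\n"]

picked = list(filter_fasta(demo_fasta, read_names(demo_names)))
for name, seq in picked:
    print(f">{name}\n{seq}")
```

With real files, the two lists would be replaced by open file handles. Comparing the names that come out against the names that went in is a quick way to confirm the fasta being blasted really contains the intended OTU sequences.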


Which database were you blasting against? Remember that nr is not curated at all. I’d only blast against ref_seq, and even then I would take the RDP or silva taxonomy over a ref_seq blast. Basically, the only time I’d blast a V4 seq is if someone really wanted a species name, and even then I’d strongly caution them against trusting that name too far.

I used the 16S ribosomal RNA sequences (bacteria and archaea) for the database. I also tried the nr and rdp databases, and the classifications were still off at the phylum level.

Hmm, I haven’t seen that. (I thought I did last month, but I’d rerun a set of samples several times and was mixing up rep.fasta and taxonomy files from different runs.)
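
A mix-up like the one described above (a fasta file and a taxonomy file from different runs) can often be spotted by checking that the two files list the same identifiers. Below is a rough Python sketch of that check. It rests on assumptions that may not hold for every mothur output: that the ID is the first whitespace-delimited token of each fasta header, that the taxonomy file is tab-separated with the ID in the first column, and that both files use the same kind of identifier. A header row in the taxonomy file, if present, would show up as an unmatched ID. The function names and demo data are hypothetical.

```python
def fasta_ids(fasta_lines):
    """Return the set of IDs found on fasta header lines."""
    ids = set()
    for line in fasta_lines:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                ids.add(parts[0])
    return ids


def taxonomy_ids(tax_lines):
    """Return the set of IDs from the first tab-separated column."""
    ids = set()
    for line in tax_lines:
        first = line.rstrip("\n").split("\t")[0].strip()
        if first:
            ids.add(first)
    return ids


def compare_ids(fasta_lines, tax_lines):
    """Return (IDs only in the fasta, IDs only in the taxonomy), both sorted."""
    in_fasta, in_tax = fasta_ids(fasta_lines), taxonomy_ids(tax_lines)
    return sorted(in_fasta - in_tax), sorted(in_tax - in_fasta)


# Hypothetical demo data: Otu3 is missing from the taxonomy, Otu9 from the fasta.
demo_fasta = [">Otu1\n", "ACGT\n", ">Otu2\n", "GGCC\n", ">Otu3\n", "TTAA\n"]
demo_tax = ["Otu1\tBacteria;Firmicutes;\n", "Otu2\tBacteria;Bacteroidetes;\n", "Otu9\tBacteria;\n"]

only_fasta, only_tax = compare_ids(demo_fasta, demo_tax)
print("only in fasta:", only_fasta)
print("only in taxonomy:", only_tax)
```

If either list is non-empty, the two files probably do not belong to the same run, and any disagreement between the blast hits and the taxonomy should be rechecked with a matched pair of files.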