Hi all,
I ran the following command in mothur to assign taxonomy to my sequences, using the Greengenes database as the reference.
classify.seqs(fasta=File_after_processing_all_fastafiles.fa.reorganized, reference=~/db/greengenes/gg_13_8_99.fasta, taxonomy=~/db/greengenes/gg_13_8_99.gg.tax, cutoff=60, probs=F, numwanted=1)
The output was as follows (some lines extracted from the output taxonomy file):
"
1)k__Bacteria;p__Bacteroidetes;p__Bacteroidetes_unclassified;p__Bacteroidetes_unclassified;p__Bacteroidetes_unclassified;p__Bacteroidetes_unclassified;p__Bacteroidetes_unclassified;
2)k__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rickettsiales;f__Pelagibacteraceae;f__Pelagibacteraceae_unclassified;f__Pelagibacteraceae_unclassified;
3)k__Bacteria;p__Proteobacteria;c__Betaproteobacteria;o__Burkholderiales;f__Comamonadaceae;f__Comamonadaceae_unclassified;f__Comamonadaceae_unclassified;
"
I expected the output to give the hierarchy clearly from kingdom level to species level, but, as line 1 shows, “p__Bacteroidetes_unclassified” is repeated.
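To make the pattern concrete, here is a minimal Python sketch that trims the repeated “_unclassified” entries from a taxonomy string and reports the ranks that were actually named. It assumes that each “X_unclassified” entry only pads the lineage out to a fixed depth; that is my reading of the output, not something I have confirmed in the mothur documentation. The function name deepest_assigned is hypothetical.

```python
def deepest_assigned(taxonomy):
    """Return the ranks of a semicolon-delimited lineage up to the last named taxon."""
    # Split on ';' and drop the empty string left by the trailing semicolon.
    ranks = [r for r in taxonomy.strip().split(";") if r]
    named = []
    for rank in ranks:
        # Assumption: an '_unclassified' suffix marks padding, so stop at the first one.
        if rank.endswith("_unclassified"):
            break
        named.append(rank)
    return named


# Line 1 of the output quoted above.
line = ("k__Bacteria;p__Bacteroidetes;p__Bacteroidetes_unclassified;"
        "p__Bacteroidetes_unclassified;p__Bacteroidetes_unclassified;"
        "p__Bacteroidetes_unclassified;p__Bacteroidetes_unclassified;")
print(deepest_assigned(line))  # ['k__Bacteria', 'p__Bacteroidetes']
```

Read this way, line 1 names only the kingdom and the phylum, and lines 2 and 3 stop at the family level.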
According to the example at http://www.mothur.org/wiki/Classify.seqs, the taxonomy file should clearly separate the taxonomic levels for each sequence,
like this:
AY457915 Bacteria;Firmicutes;Clostridiales;Johnsonella_et_rel.;Johnsonella_et_rel.;Johnsonella_et_rel.;Eubacterium_eligens_et_rel.;Lachnospira_pectinoschiza;
AY457914 Bacteria;Firmicutes;Clostridiales;Johnsonella_et_rel.;Johnsonella_et_rel.;Johnsonella_et_rel.;Eubacterium_eligens_et_rel.;Eubacterium_eligens;Eubacterium_eligens;
Can someone please clarify the problem?
What could cause this confusing, repeated output?
Any clarification would be highly appreciated.
Thank you