Hello -
I have encountered a strange error using the Silva taxonomy that I can't figure out. My sample contains both archaeal and bacterial sequences, so I had to use RDP to tease them apart.
First, I classified the sequences using the RDP taxonomy:
mothur > classify.seqs(fasta=CSarch.unique.good.filter.unique.precluster.pick.fasta, template=trainset6_032010.rdp.fasta, taxonomy=trainset6_032010.rdp.tax, processors=2, iters=1000, cutoff=80, method=bayesian)
Next, I removed the sequences classified as bacteria:
mothur > remove.lineage(fasta=CSarch.unique.good.filter.unique.precluster.pick.fasta, taxonomy=CSarch.unique.good.filter.unique.precluster.pick.rdp.taxonomy, name=CSarch.unique.good.filter.unique.precluster.pick.names, group=CSarch.good.pick.groups, taxon=Bacteria)
Then, I renamed the remaining files to reflect their new status as only archaea sequences:
mothur > system(cp CSarch.unique.good.filter.unique.precluster.pick.pick.fasta CSarch1.final.fasta)
mothur > system(cp CSarch.unique.good.filter.unique.precluster.pick.rdp.pick.taxonomy CSarch1.final.taxonomy)
mothur > system(cp CSarch.unique.good.filter.unique.precluster.pick.pick.names CSarch1.final.names)
mothur > system(cp CSarch.good.pick.pick.groups CSarch1.final.groups)
However, because the RDP taxonomy has so few Archaea sequences, it does not do a very good job of classifying my sequences, and many of them came back unclassified. The Silva taxonomy has a larger database of Archaea sequences, so I decided to reclassify the sequences using Silva (which I couldn't do in the first place, since the Silva database is separated into Bacteria and Archaea):
mothur > classify.seqs(fasta=CSarch1.final.fasta, template=silva.archaea.fasta, taxonomy=silva.archaea.silva.tax, processors=2, iters=1000, method=bayesian, cutoff=80)
It all seems well and good, but when I make distance matrices and cluster the data into OTUs, I run into problems with the classify.otu command:
mothur > classify.otu(taxonomy=CSarch1.final.taxonomy, name=CSarch1.final.names, list=CSarch1.final.an.list, basis=sequence, group=CSarch1.final.groups, label=unique-0.01-0.02-0.03-0.10, cutoff=80, reftaxonomy=silva.archaea.silva.tax)
unique 7567
Warning: cannot find taxon Soil_Crenarchaeotic_Group in reference taxonomy tree at level 2 for G40ZT5B03HBHTT. This may cause totals of daughter levels not to add up in summary file.
Warning: cannot find taxon Soil_Crenarchaeotic_Group in reference taxonomy tree at level 2 for G40ZT5B03F60UP. This may cause totals of daughter levels not to add up in summary file.
0.01 3496
Warning: cannot find taxon Soil_Crenarchaeotic_Group in reference taxonomy tree at level 2 for G40ZT5B03HBHTT. This may cause totals of daughter levels not to add up in summary file.
Warning: cannot find taxon Soil_Crenarchaeotic_Group in reference taxonomy tree at level 2 for G40ZT5B03F60UP. This may cause totals of daughter levels not to add up in summary file.
0.02 985
Warning: cannot find taxon Soil_Crenarchaeotic_Group in reference taxonomy tree at level 2 for G40ZT5B03HBHTT. This may cause totals of daughter levels not to add up in summary file.
Warning: cannot find taxon Soil_Crenarchaeotic_Group in reference taxonomy tree at level 2 for G40ZT5B03F60UP. This may cause totals of daughter levels not to add up in summary file.
0.03 317
Warning: cannot find taxon Soil_Crenarchaeotic_Group in reference taxonomy tree at level 2 for G40ZT5B03HBHTT. This may cause totals of daughter levels not to add up in summary file.
Warning: cannot find taxon Soil_Crenarchaeotic_Group in reference taxonomy tree at level 2 for G40ZT5B03F60UP. This may cause totals of daughter levels not to add up in summary file.
Your file does not include the label 0.10. I will use 0.04.
0.04 176
Warning: cannot find taxon Soil_Crenarchaeotic_Group in reference taxonomy tree at level 2 for G40ZT5B03HBHTT. This may cause totals of daughter levels not to add up in summary file.
Warning: cannot find taxon Soil_Crenarchaeotic_Group in reference taxonomy tree at level 2 for G40ZT5B03F60UP. This may cause totals of daughter levels not to add up in summary file.
Am I using the wrong Silva taxonomy file? I used the same silva.archaea.silva.tax file for both the classify.seqs command (as taxonomy=) and the classify.otu command (as reftaxonomy=), so how can I be running into this issue? Also, what is the difference between the silva.archaea.rdp.tax file and the corresponding silva.tax file?
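In case it helps narrow this down, here is a rough Python sketch of a check that could be run outside mothur to list which assigned taxa do not appear in the reference taxonomy at the same level. It assumes the usual tab-separated layout (sequence name, then a semicolon-delimited lineage, with bootstrap values in parentheses in the classify.seqs output). The function names are my own, the demo lines are made up, and the level-by-level comparison is a simplification rather than how mothur builds its reference tree.

```python
import re
from collections import defaultdict


def parse_lineage(lineage):
    """Split 'Archaea(100);Crenarchaeota(95);' into ['Archaea', 'Crenarchaeota']."""
    taxa = []
    for part in lineage.strip().split(";"):
        part = part.strip()
        if part:
            # Drop a trailing bootstrap confidence such as "(95)".
            taxa.append(re.sub(r"\(\d+\)$", "", part))
    return taxa


def read_taxonomy(lines):
    """Map sequence name -> list of taxa from tab-separated taxonomy lines."""
    result = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        name, lineage = line.split("\t", 1)
        result[name] = parse_lineage(lineage)
    return result


def missing_taxa(reference_lines, assigned_lines):
    """Return (sequence, level, taxon) for taxa absent from the reference at that level."""
    known = defaultdict(set)
    for taxa in read_taxonomy(reference_lines).values():
        for level, taxon in enumerate(taxa, start=1):
            known[level].add(taxon)

    problems = []
    for name, taxa in read_taxonomy(assigned_lines).items():
        for level, taxon in enumerate(taxa, start=1):
            if taxon != "unclassified" and taxon not in known[level]:
                problems.append((name, level, taxon))
    return problems


if __name__ == "__main__":
    # Made-up lines for illustration only; real input would come from
    # open("silva.archaea.silva.tax") and the taxonomy file given to classify.otu.
    reference = ["refA\tArchaea;Crenarchaeota;Thermoprotei;"]
    assigned = ["seqA\tArchaea(100);Soil_Crenarchaeotic_Group(90);unclassified;"]
    for name, level, taxon in missing_taxa(reference, assigned):
        print(name, level, taxon)
```

If a check like this reports Soil_Crenarchaeotic_Group for the two sequences named in the warnings, that would suggest the taxonomy file passed to classify.otu and the reftaxonomy file do not come from the same reference set.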
Thanks,
Kristina