sub.sample and taxonomy file problems


I’m generating a subsample of reads with sub.sample on a per-sample basis, then using those reads to build a distance matrix and cluster. After clustering, I ran classify.otu() and got many warnings like the following:

“G2RAAXJ02IU4PO is not in your taxonomy file. I will not include it in the consensus.”

I even tried running classify.seqs on my subsample.unique.fasta file, but I still got the warnings above.

What am I missing?


Commands are:

sub.sample(fasta=final.fasta, name=final.names, group=final.groups, persample=true, size=5000)

dist.seqs(fasta=final.subsample.unique.fasta, cutoff=0.15, processors=8)

cluster(column=final.subsample.unique.dist, name=final.subsample.names)

classify.seqs(fasta=final.subsample.unique.fasta, name=final.subsample.names, template=../../rdp/trainset6_032010.rdp.fasta, taxonomy=../../rdp/, group=final.subsample.groups, iters=1000, processors=10)

classify.otu(taxonomy=final.subsample.unique.rdp.taxonomy,, group=final.subsample.groups, cutoff=80)

You need to include the .names file with the classify.otu command.

Thanks, Sarah. That was it. Interesting that I can’t use the original final.taxonomy, but instead have to generate a new one by running classify.seqs on the subsample. Glad to be moving forward again.

mothur > classify.otu(taxonomy=final.subsample.unique.rdp.taxonomy,, group=final.subsample.groups, cutoff=80, name=final.subsample.names)
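For anyone hitting the same warning, here is a rough sketch of why the names file matters. As far as I understand it, the taxonomy file produced by classify.seqs on a unique fasta only lists the representative sequences, while the OTUs list every read, so duplicates look "missing" unless they are mapped back to their representative through the names file. The Python below is only an illustration of that lookup, not mothur's actual code; the function names and the toy sequence names (seqA, seqB, seqC) are made up.

```python
def parse_names(lines):
    """Map each read to its representative (names layout: rep<TAB>read1,read2,...)."""
    rep_of = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        rep, members = line.split()[:2]
        for member in members.split(","):
            rep_of[member] = rep
    return rep_of


def parse_taxonomy(lines):
    """Map each classified sequence name to its taxonomy string."""
    taxonomy = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        name, tax = line.split(None, 1)
        taxonomy[name] = tax
    return taxonomy


def missing_from_taxonomy(otu_members, taxonomy, rep_of=None):
    """Return the reads that cannot be found in the taxonomy.

    If a names mapping is given, each read is looked up via its representative.
    """
    missing = []
    for name in otu_members:
        key = rep_of.get(name, name) if rep_of else name
        if key not in taxonomy:
            missing.append(name)
    return missing


# Toy data (hypothetical): seqB is a duplicate of seqA, so only seqA was classified.
names_lines = ["seqA\tseqA,seqB", "seqC\tseqC"]
taxonomy_lines = ["seqA\tBacteria(100);Firmicutes(95);", "seqC\tBacteria(100);"]
otu_members = ["seqA", "seqB", "seqC"]

rep_of = parse_names(names_lines)
taxonomy = parse_taxonomy(taxonomy_lines)

# Without the names mapping, the duplicate read is reported as missing.
print(missing_from_taxonomy(otu_members, taxonomy))
# With the names mapping, every read resolves to a classified representative.
print(missing_from_taxonomy(otu_members, taxonomy, rep_of))
```

To try this on real files, pass open file handles instead of the toy lists, e.g. parse_names(open("final.subsample.names")), and feed in the read names from one OTU of your list file.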