I’ve finished analyzing a data set following the MiSeq protocol. After classifying OTUs, my resulting taxonomy file has many OTUs listed as unclassified, with a value of zero in the size column. Is this normal?
I have the same issue. All the OTUs that are unclassified at the first level have zero representation. I was also expecting far fewer OTUs, given that there are ~68,000 unique seqs but >500,000 OTUs. Of the listed OTUs, only 10621 have any representation, and their sizes sum to my total number of seqs. So basically all the unclassified OTUs listed are not real, and I’m not sure why they were listed in the first place.
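For anyone wanting to run the same tally on their own output, here is a minimal Python sketch. It assumes the taxonomy file from classify.otu is tab-delimited with a header row containing a "Size" column; the function name and the example rows are made up for illustration and are not from anyone's data in this thread.

```python
import csv
import io


def summarize_otu_sizes(taxonomy_text):
    """Return (total_otus, nonzero_otus, total_seqs) for taxonomy-style text.

    Assumes a tab-delimited layout with a header row that has a 'Size' column.
    """
    reader = csv.DictReader(io.StringIO(taxonomy_text), delimiter="\t")
    total = nonzero = seqs = 0
    for row in reader:
        size = int(row["Size"])
        total += 1
        if size > 0:
            nonzero += 1
            seqs += size
    return total, nonzero, seqs


# Made-up example rows, only to show the expected layout.
example = (
    "OTU\tSize\tTaxonomy\n"
    "Otu1\t120\tBacteria(100);Firmicutes(100);\n"
    "Otu2\t30\tBacteria(100);Bacteroidetes(98);\n"
    "Otu3\t0\tunclassified;\n"
    "Otu4\t0\tunclassified;\n"
)

print(summarize_otu_sizes(example))
```

If the third number matches your total number of seqs while the first is much larger than the second, you are probably seeing the same symptom described here.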
I’m not sure if you all worked out what your problem was. I had the same issue of lots of zero OTUs in the output of my classify.otu command and found out that it was because chimeras were still in my count file…
If you don’t use the data set itself as the reference (i.e. passing the count table to the chimera.uchime step) and instead use reference=silva.gold or something else, then you need to update your count table with remove.seqs after that step.
Doing that fixed the downstream problem for me…
Hope that helps
When using a reference with chimera.uchime, you need to remove the chimeras from the count table or name file using remove.seqs. Failing to do so creates a mismatch between files. Summary.seqs and a few other commands, including classify.otu, are intentionally “forgiving” of such mismatches. The classify.otu issue stems from the list file: if the chimeras are not removed from the count or name file, they end up in the list file, because mothur creates the “unique” list from the names in the count table or name file.
We will add an error message / warning in an upcoming release.
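Until such a warning exists, the mismatch described above can be checked by hand. The following is a rough Python sketch, not part of mothur: it assumes the fasta file has already had the chimeras removed, that the count table is tab-delimited with the sequence name in the first column and a single header row, and that any lines starting with "#" can be skipped. The function names and the example data are hypothetical.

```python
def fasta_names(fasta_text):
    """Collect sequence names from the '>' header lines of fasta text."""
    names = set()
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                names.add(parts[0])
    return names


def count_table_names(count_text):
    """Collect names from the first column of count-table text.

    Assumption: lines starting with '#' are comments, and the first
    remaining line is the column header.
    """
    rows = [ln for ln in count_text.splitlines()
            if ln.strip() and not ln.startswith("#")]
    return {row.split("\t")[0] for row in rows[1:]}


def names_missing_from_fasta(fasta_text, count_text):
    """Names still in the count table but absent from the fasta."""
    return sorted(count_table_names(count_text) - fasta_names(fasta_text))


# Made-up example: seqC was flagged as a chimera and removed from the
# fasta, but the count table was never updated.
fasta_example = ">seqA\nACGT\n>seqB\nACGA\n"
count_example = (
    "Representative_Sequence\ttotal\tgroup1\tgroup2\n"
    "seqA\t10\t6\t4\n"
    "seqB\t5\t5\t0\n"
    "seqC\t2\t1\t1\n"
)

print(names_missing_from_fasta(fasta_example, count_example))
```

Any names this reports are candidates for the leftover chimeras; running remove.seqs on the count table, as described above, should make the list come back empty.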