What could cause a high abundance of "Bacteria;unclassified" in the OTU table?

Hi there,

I read the MiSeq SOP and ran the analysis on the example samples. It worked out pretty well.

I now have my own samples, sequenced 250 PE over the V4 region. I used FastQC to check the quality of these samples, and it looked worse than that of the example samples in the MiSeq SOP.

I ran mothur following the MiSeq SOP, clustered the sequences into OTUs, and assigned taxonomy to each OTU. In most samples, 10-30% of the relative abundance was classified as Bacteria;unclassified. It is really high… I suspect that my data set contains a lot of low-quality reads (or maybe there are other reasons?). My samples are mouse stool samples, so there should not be so many unclassified sequences.

In my case, what could cause the high abundance of Bacteria;unclassified? If it is a quality issue, is there a way to aggressively remove low-quality reads? Are there any mothur commands that would help filter them out?

Thank you very much. I really appreciate any suggestions on this issue.


Can you try running your data through make.contigs instead of FastQC? I wonder whether FastQC is trimming the reads and making it difficult to classify them.



Pat, FastQC reports the quality of a set of reads but does not perform any trimming. As he mentioned in his post, the problem may be the quality of the run (perhaps lab prep, handling, v3 chemistry,…).

It would be good practice for him to post the commands he used for his analysis.
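
As a rough way to check the low-quality-read hypothesis raised above, here is a minimal Python sketch that computes the mean Phred score of each read and separates reads that fall below a cutoff. It is not a mothur command and is not part of the MiSeq SOP. The function names (parse_fastq, mean_phred, split_by_quality) and the cutoff of 25 are hypothetical choices for illustration, and Phred+33 encoding with four-line FASTQ records is assumed.

```python
from statistics import mean


def parse_fastq(lines):
    """Yield (header, sequence, quality) tuples from FASTQ lines.

    Assumes the simple four-line-per-record layout.
    """
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # skip blank lines between records
        seq = next(it).rstrip("\n")
        next(it)  # '+' separator line, ignored
        qual = next(it).rstrip("\n")
        yield header, seq, qual


def mean_phred(qual, offset=33):
    """Mean Phred score of a quality string (Phred+33 assumed)."""
    if not qual:
        return 0
    return mean(ord(c) - offset for c in qual)


def split_by_quality(lines, min_mean_q=25):
    """Return (kept, dropped) record lists.

    min_mean_q is an illustrative cutoff, not a recommended value.
    """
    kept, dropped = [], []
    for rec in parse_fastq(lines):
        target = kept if mean_phred(rec[2]) >= min_mean_q else dropped
        target.append(rec)
    return kept, dropped


# Tiny in-memory example: 'I' encodes Q40 and '#' encodes Q2 in Phred+33.
example = [
    "@read1", "ACGT", "+", "IIII",
    "@read2", "ACGT", "+", "####",
]
kept, dropped = split_by_quality(example)
print(len(kept), "kept;", len(dropped), "dropped")
```

On real data, the lines would come from an open FASTQ file handle instead of the in-memory list. If a large share of reads falls below the cutoff, that would be consistent with a run-quality problem, although it would not rule out the other causes mentioned in the thread.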