Sequencing Error Rate

Hi all,

I was hoping to get a bit of guidance on addressing a potential sequencing error issue (bit of a broad question, I apologize).
I recently sequenced a number of samples, and when I assess the error rate via seq.error, it’s quite high (~0.001, as opposed to the 6e-5 number cited in the mothur SOP). I’m also getting ~100 OTUs for a sample that should, in theory, contain only 5 species.

The batch file I used is based on the commands laid out in the mothur SOP, and I sequenced the V4 region using the V2 300 cycle kit. The samples were spiked onto a flow cell that was sequencing more than just 16S, if that matters.
I used a mock community I made myself, but all of the species were confirmed via Sanger sequencing beforehand.

I wasn’t sure if maybe I’m building the reference fasta incorrectly (I just copied the species-specific 16S fastas from NCBI and put them into one file) or if there might be some other issue with our pipeline. Is there anything that stands out that I should try to address first?

Any help is greatly appreciated!


The problem is probably that you are using the V3 chemistry. You really want to use the V2 chemistry. This post is still current…

Beyond the bad chemistry, there are also problems with sequencing longer (i.e. 300 nt) than the fragment (i.e. ~275 nt). This will add to the error rate. I would strongly encourage you to go back and resequence using the 2x250 V2 chemistry.
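
For anyone reading along who wants to see what the numbers above mean: an error rate like 0.001 or 6e-5 is essentially mismatched bases divided by total bases compared against the mock reference. Below is a rough Python sketch of that arithmetic only. It assumes each read has already been aligned to its true reference sequence (equal-length strings, "-" for gaps); the function name `error_rate` and the toy sequences are made up for illustration, and this is not how seq.error is implemented.

```python
def error_rate(aligned_pairs):
    """Return mismatches / compared bases over (read, reference) pairs.

    Assumption: each pair is already aligned, so both strings have the
    same length and use '-' for gaps. Columns that are gaps in both
    sequences are skipped; a gap against a base counts as an error.
    """
    errors = 0
    total = 0
    for read, ref in aligned_pairs:
        if len(read) != len(ref):
            raise ValueError("aligned sequences must have equal length")
        for q, r in zip(read.upper(), ref.upper()):
            if q == "-" and r == "-":
                continue  # shared gap column, nothing to compare
            total += 1
            if q != r:
                errors += 1
    return errors / total if total else 0.0


# Toy data (hypothetical): two 10-nt reads against the same reference,
# the second read carrying one substitution.
pairs = [
    ("ACGTACGTAC", "ACGTACGTAC"),
    ("ACGTTCGTAC", "ACGTACGTAC"),
]
rate = error_rate(pairs)  # 1 mismatch over 20 compared bases
```

The point of the sketch is that this calculation needs a known reference for every read, which is what the mock community provides.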


I have a follow-up question: is there a way to estimate sequencing error rates other than the mock community method described in the MiSeq SOP? A reviewer of my paper is asking for the error rate, but I have not performed a mock-community analysis…

Nope. Sorry.

Hi Pat,
No, don’t be sorry! That’s what I had understood. I asked that question just in case…