These sequences were aligned using the silva.bacteria reference files, screened using screen.seqs, filtered using filter.seqs, and pre.clustered prior to running the chimera check.
I searched through the mothur forum and found one post from 2009 about a high number of chimeras, but the thread ended without a resolution being posted.
Is there something I might be doing wrong? I would not expect to have so many chimeras in my data.
Anecdotal, but I’ve found this to be the case in all of my studies as well. Here’s a reference that sequenced a mock community and found similar chimera rates (32% and 36%): http://dx.doi.org/10.1007/s12275-012-2642-z
I notice this post is already two years old, but was there any resolution to the high % chimera problem? I’m running into the same problem (some of my communities have up to 60% chimeras). I’m following the mothur SOP, and my code looks essentially the same. I used 35 PCR cycles to generate my amplicons (I’m working with low-biomass sediment samples), so this may partially explain my numbers. Any recommendations?
One thing that has been seen in the literature is that lengthening the extension times reduces the rate of chimera formation. This is why we use a 5 minute extension time. It doesn’t get rid of all of the chimeras, but we generally see <15% of our total sequences being chimeras. The data for this observation are tucked away in the Supplement of this: http://www.ncbi.nlm.nih.gov/pubmed/21212162
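When comparing numbers like these, it may help to check whether a chimera percentage is counted over unique sequences or over total reads, since the two can differ a lot when the flagged sequences are rare. Below is a minimal Python sketch of that comparison. It assumes you have already parsed the abundance of each unique sequence and the set of flagged names yourself; the function name, variable names, and numbers are all made up for illustration and are not part of mothur.

```python
def chimera_rates(abundance, flagged):
    """Return (fraction of unique sequences flagged, fraction of total reads flagged).

    abundance: dict mapping unique sequence name -> number of reads it represents
    flagged:   set of sequence names reported as chimeric
    """
    if not abundance:
        raise ValueError("abundance table is empty")
    # Only count flagged names that actually appear in the abundance table
    flagged_present = [name for name in abundance if name in flagged]
    total_reads = sum(abundance.values())
    flagged_reads = sum(abundance[name] for name in flagged_present)
    return len(flagged_present) / len(abundance), flagged_reads / total_reads


# Hypothetical example data, not taken from any real run
abundance = {"seqA": 900, "seqB": 50, "seqC": 30, "seqD": 20}
flagged = {"seqC", "seqD"}

unique_rate, read_rate = chimera_rates(abundance, flagged)
print(f"{unique_rate:.0%} of unique sequences, {read_rate:.0%} of total reads")
# prints "50% of unique sequences, 5% of total reads"
```

In this toy case half of the unique sequences are flagged but they account for only 5% of the reads, so it is worth stating which denominator a reported rate uses.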
I’ve noticed that many of the sequences identified as chimeras produce great alignments when I BLAST them, so now I’m going to see how many of them are actually false positives, and whether they bias the rest of the community.
Figuring out what is really a chimera in silico is difficult. The current consensus seems to be to be very strict with chimera detection for surveys. I think the current algorithms may be too strict, but I also don’t think it’s worth fighting to keep those sequences, because there are better wet-lab ways to figure out whether things are true chimeras or false positives. If fine differences between close relatives are your research interest, it’s better to attack the question with data other than V4.