I’ve been working with a few datasets, and I often find that more than a few samples contain a fair amount of chimeras, in many cases over 30% after taking abundance into account.
I run chimera.vsearch as indicated in the MiSeq SOP, with mothur 1.39.5, VSEARCH 2.6.0 and Silva_nr v128.
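For anyone wanting to reproduce the abundance-weighted percentage, here is a minimal Python sketch of how it could be computed per sample. It assumes a plain tab-separated count table whose header is `Representative_Sequence`, `total`, then one column per sample, plus a set of sequence names flagged as chimeric (for example, read from an accnos file). The function name and the toy data are hypothetical, and the sketch treats a flagged sequence as chimeric in every sample, which is a simplification if chimeras were called per sample.

```python
import csv
import io


def chimera_fraction_per_sample(count_table_text, chimeric_names):
    """Return {sample: fraction of reads that belong to flagged sequences}.

    Assumes a header of Representative_Sequence, total, then one column
    per sample (an assumption about the count table layout).
    """
    reader = csv.reader(io.StringIO(count_table_text), delimiter="\t")
    header = next(reader)
    samples = header[2:]  # skip the sequence-name and total columns
    total = dict.fromkeys(samples, 0)
    chimeric = dict.fromkeys(samples, 0)
    for row in reader:
        if not row:  # ignore blank lines
            continue
        name, counts = row[0], row[2:]
        for sample, count in zip(samples, counts):
            n = int(count)
            total[sample] += n
            if name in chimeric_names:
                chimeric[sample] += n
    # Samples with no reads get 0.0 rather than a division error.
    return {s: (chimeric[s] / total[s] if total[s] else 0.0) for s in samples}


# Toy data, made up purely for illustration.
toy_table = (
    "Representative_Sequence\ttotal\tA\tB\n"
    "seq1\t100\t60\t40\n"
    "seq2\t50\t30\t20\n"
    "seq3\t10\t10\t0\n"
)
result = chimera_fraction_per_sample(toy_table, {"seq2", "seq3"})
for sample, fraction in result.items():
    print(f"{sample}: {fraction:.1%} chimeric reads")
```

Weighting by read counts rather than by unique sequences matters here, because chimeras are usually many distinct low-abundance sequences and the two percentages can differ a lot.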
What levels of chimeric assembled reads are normal/common?
Do you often see those numbers or do you think that there is something wrong?
Thank you in advance,
Above 30% does seem high. I wonder whether those samples are special in any way: are they expected to have low biomass or low diversity? You might also look at your PCR conditions, including your extension times. I’m not sure that it would make a difference, but we (and others) use long extension times of 5 min to reduce the rate of chimeras.
Not really. Most samples have >100,000 raw reads.
Do you mean a 5 min extension in each cycle, or in the final extension step?
The last batch I’ve been processing used these PCR conditions:
initial denaturation at 95 °C for 10 min
denaturation at 94 °C for 30 s
annealing at 55 °C for 10 s
elongation at 72 °C for 45 s
final elongation at 72 °C for 10 min
PS: Based on your comment, I guess that low-biomass and low-diversity samples tend to show higher levels of chimeras.
Additional info: this dataset was v1v3 (yes, I know…) from a 2x300 bp run.
I’m also analysing a 16S v4 dataset (2x250 bp), and the chimeras don’t seem to be as abundant, although more than a few samples are between 10% and 25%, a couple are above 25%, and none are at or above 30%.
What numbers do you usually observe with your v4 protocol?
I think around 10%.
We do a 5 min elongation step (not 45 sec).