Removing chimeric sequences

klau19 · May 19, 2011, 1:58pm

Hi there,

The Chimera Slayer updates have got me thinking about the best way to deal with chimeric sequences: is it valid to simply remove them from the analysis?

I can understand removing chimeric sequences if they arise from random effects and are present at low levels (e.g. 1% of the total sequences). Removing chimeras will also give a better idea of the actual diversity.

However, the number of chimeric sequences can get quite high in amplicon pyrosequencing data (>10% of total sequences, or even higher in some cases). In addition, according to the Chimera Slayer paper (Haas et al. 2011), chimeras are more likely to form between 16S rRNA sequences from highly abundant organisms. Chimeras between certain pairs are also more likely, and are reproducible between independent amplifications. This suggests to me that chimera formation is not just a random process (noise), and that removing chimeras may artificially lower the counts of the more abundant organisms.

It would be very interesting to hear your thoughts on this, or to know if I am completely on the wrong track with my thinking.

I also noticed that mothur's chimera.slayer function has a trim option (set to false by default) that outputs each chimeric sequence trimmed to its longest piece. Is it valid to include these trimmed sequences in subsequent analysis? Any ideas on the best approach to dealing with chimeric sequences when they are present at a significant level in your data (>10%)?

Thanks,
Kelvin

pschloss · May 20, 2011, 4:01pm

Hi Kelvin,

All good questions. Using mock communities and synthetic chimeras, the false positive rates are quite low, and chimeras that are left in are more likely to artificially increase the number of OTUs/phylotypes. The next iteration of the Costello analysis will move the chimera checking to after the pre-clustering step. There we would suggest supplying the name file as well as setting reference=self. This will allow you to check for chimeras by stepping from the most to the least abundant sequence, using the more abundant sequences as potential parents of the chimera.

Pat
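
For readers who want to see the idea of stepping from most to least abundant in concrete terms, here is a minimal Python sketch. It is not mothur's or Chimera Slayer's actual algorithm: `is_toy_chimera` and `flag_chimeras` are hypothetical names, the sequences are assumed to be aligned and of equal length, and the "chimera test" is a deliberately naive exact prefix/suffix match. Skipping flagged sequences as later parents is also an assumption of this sketch, not something stated in the thread.

```python
from typing import Dict, List


def is_toy_chimera(query: str, parents: List[str]) -> bool:
    """Naive stand-in for a real chimera test (NOT Chimera Slayer).

    Returns True if the query equals a prefix of one parent joined to the
    suffix of a different parent at some breakpoint. Assumes all sequences
    are aligned and have the same length.
    """
    if query in parents:
        return False
    for i in range(1, len(query)):
        left = [p for p in parents if p[:i] == query[:i]]
        right = [p for p in parents if p[i:] == query[i:]]
        # Need two distinct parents, one on each side of the breakpoint.
        if any(a != b for a in left for b in right):
            return True
    return False


def flag_chimeras(counts: Dict[str, int]) -> List[str]:
    """Step from most to least abundant sequence.

    Each sequence is checked only against sequences that come earlier in
    the abundance ordering; ties are broken alphabetically, so equally
    abundant sequences may act as parents (a simplification).
    """
    ordered = sorted(counts, key=lambda s: (-counts[s], s))
    parents: List[str] = []
    chimeras: List[str] = []
    for seq in ordered:
        if is_toy_chimera(seq, parents):
            chimeras.append(seq)  # flagged; not used as a parent later
        else:
            parents.append(seq)   # kept; may be a parent of rarer sequences
    return chimeras


# Made-up unique sequences with their read counts (illustrative only).
example_counts = {
    "AAAAAAAA": 100,
    "CCCCCCCC": 80,
    "AAAACCCC": 5,   # prefix of the first, suffix of the second
    "AAAAGGGG": 3,   # no abundant sequence supplies the suffix
}
flagged = flag_chimeras(example_counts)
```

In this toy data only the 5-read sequence is flagged, because both halves can be explained by more abundant sequences; the 3-read sequence is kept since nothing more abundant matches its second half.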
