Chimera identification in seq.error

joewan · August 5, 2013, 6:58pm

Hello,

I’ve been using seq.error to look at error rates in some Illumina mock community data. The documentation for seq.error is a little unclear, however, about how exactly it handles chimeras. I was hoping to get some clarification on two points:

First, how is chimera identification done in seq.error? I looked at the values in the .error.chimera file, and it looks like there must be at least 3 fewer mismatches to the best chimera than to the best single reference. Is this indeed the criterion used by mothur?

Second, I want to fine-tune parameters for chimera.uchime (or possibly another algorithm) in order to apply it to real data from the same run (my sequences are relatively short, so chimera.uchime’s default parameters don’t work so well). Is it a good idea to use the chimeras called by seq.error to evaluate parameter choices (i.e. calculate % of chimeras removed and false positive rates for different thresholds)? Or is there a better, more robust way to train chimera.uchime for my data?

Thanks,
Joe

pschloss · August 6, 2013, 11:15am

Sorry for the scant documentation. Need to work on that…

First, how is chimera identification done in seq.error? I looked at the values in the .error.chimera file, and it looks like there must be at least 3 fewer mismatches to the best chimera than to the best single reference. Is this indeed the criterion used by mothur?

We describe the chimera calling method for seq.error here: Reducing the Effects of PCR Amplification and Sequencing Artifacts on 16S rRNA-Based Studies
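
For illustration only, the threshold described in the question (at least 3 fewer mismatches to the best chimera than to the best single reference) could be written as the small check below. This is a sketch of that stated rule, not mothur's code; the function name, argument names, and the default of 3 are assumptions taken from the wording of the question.

```python
def looks_chimeric(mismatches_to_reference, mismatches_to_chimera, min_improvement=3):
    """Return True when the best chimera fits better than the best single
    reference by at least `min_improvement` mismatches.

    All names here are hypothetical; the threshold of 3 is the value
    inferred in the question, not a confirmed mothur setting.
    """
    improvement = mismatches_to_reference - mismatches_to_chimera
    return improvement >= min_improvement


# A read with 5 mismatches to its best reference and 2 to its best chimera
# improves by 3, so it meets the threshold.
print(looks_chimeric(5, 2))
# A read that improves by only 2 mismatches does not.
print(looks_chimeric(4, 2))
```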

Second, I want to fine-tune parameters for chimera.uchime (or possibly another algorithm) in order to apply it to real data from the same run (my sequences are relatively short, so chimera.uchime’s default parameters don’t work so well). Is it a good idea to use the chimeras called by seq.error to evaluate parameter choices (i.e. calculate % of chimeras removed and false positive rates for different thresholds)? Or is there a better, more robust way to train chimera.uchime for my data?

seq.error is probably the best way. Alternatively, you could use Robert Edgar’s software to develop your own training set where you know which sequences are chimeras and where they are formed - this is how he created the SIMM datasets that he used in testing UCHIME. I believe it is described in the UCHIME paper.
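
A minimal sketch of the evaluation described in the question, assuming the chimeras called by seq.error are treated as the truth set and that the sequence names have already been read from the relevant files into plain lists. The function name, variable names, and toy sequence names are hypothetical; parsing of mothur output is deliberately left out.

```python
def evaluate_calls(true_chimeras, flagged, all_sequences):
    """Compare one set of chimera calls against a truth set.

    true_chimeras: names treated as chimeric (assumed here to come from seq.error)
    flagged:       names a chimera checker flagged under one parameter setting
    all_sequences: every sequence name in the mock community data
    Returns (fraction of true chimeras removed, false positive rate).
    """
    true_chimeras = set(true_chimeras)
    flagged = set(flagged)
    non_chimeras = set(all_sequences) - true_chimeras

    true_pos = len(flagged & true_chimeras)
    false_pos = len(flagged & non_chimeras)

    # Guard against empty sets so the function never divides by zero.
    removed = true_pos / len(true_chimeras) if true_chimeras else float("nan")
    fpr = false_pos / len(non_chimeras) if non_chimeras else float("nan")
    return removed, fpr


# Toy data: made-up names, two hypothetical parameter settings.
all_seqs = ["s1", "s2", "s3", "s4", "s5", "s6"]
truth = ["s1", "s2"]
calls_by_setting = {
    "setting_a": ["s1", "s3"],
    "setting_b": ["s1", "s2", "s3", "s4"],
}

for name, calls in calls_by_setting.items():
    removed, fpr = evaluate_calls(truth, calls, all_seqs)
    print(f"{name}: chimeras removed = {removed:.2f}, false positive rate = {fpr:.2f}")
```

Running the same comparison over each candidate threshold gives one (chimeras removed, false positive rate) pair per setting, which can then be compared to pick a trade-off.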

Pat
