Are "unique seqs" analogous to ESVs?

jmquestel · October 7, 2019, 11:44pm

Hi,

I have been using Mothur for a few years now to analyze V4 and V9 18S rRNA data for metazoan zooplankton communities. Current trends in the metabarcoding field are shifting towards analyses of Exact Sequence Variants (ESVs) instead of clustering sequences into OTUs at similarity thresholds. I was wondering whether the “unique sequences” that are identified in Mothur after quality control, filtering, and chimera detection are analogous to the ESVs generated by pipelines such as DADA2 (Callahan et al., 2016. DADA2: High-resolution sample inference from Illumina amplicon data. Nature Methods 13(7):581-587. doi:10.1038/nmeth.3869). It seems that DADA2 runs sequences through more stringent quality control to ensure detection of PCR errors, fewer false positives in taxonomic identifications, and low error rates.

I have been classifying the unique sequences identified after running my Illumina data through the Mothur pipeline against a proprietary DNA reference database to get species-level identifications for zooplankton. I’m curious to know your thoughts on whether “unique seqs” would be accepted as ESVs by the scientific community, or whether I would have to use the DADA2 pipeline for this purpose.

Thanks.

Kendra · October 8, 2019, 2:50pm

DADA2 doesn’t just quality filter; it models the errors (somehow, I don’t understand how), which includes modeling all rare sequences as errors and throwing them out. 1) I don’t think all rare sequences are errors, 2) I don’t like black boxes that I don’t understand, and 3) I think trying to get single-nucleotide resolution from Illumina amplicon data is expecting more precision than the technology can offer. So I stick with mothur and OTUs.

Currently people seem to be equating DADA2 with ASVs; I’m not sure how they’d take unique seqs from mothur as ASVs.

pschloss · October 10, 2019, 12:45pm

In our experiments, mothur and dada2 are just as “stringent”. In fact, we see the same sequencing error rates with both methods (compare the output from pre.cluster to the dada2 output) if all singletons are removed. I think removing singletons is a horrible idea, since you are significantly changing the structure of the community. Furthermore, there are some significant problems with dada2, including the need to fit hundreds of parameters, leading to an overfit model, and the use of corrected P-values that are absurdly small. The end result is a real risk of actually lumping together sequences that should not be lumped together.
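
To make the terms in this thread concrete, here is a minimal Python sketch of the three operations being compared: collapsing reads into unique sequences, folding rare near-identical sequences into more abundant ones, and dropping singletons. It is only an illustration under stated assumptions, not mothur's or dada2's actual implementation: the function names, the toy reads, and the simple mismatch count on equal-length sequences are all made up for this example.

```python
from collections import Counter


def dereplicate(reads):
    """Collapse identical reads into unique sequences with abundance counts."""
    return Counter(reads)


def mismatches(a, b):
    """Count mismatched positions between two equal-length aligned sequences."""
    return sum(x != y for x, y in zip(a, b))


def merge_rare_into_abundant(counts, diffs=1):
    """Simplified abundance-ordered merging (an assumption, not mothur's code).

    Sequences are visited from most to least abundant. Each sequence is folded
    into the first already-kept sequence within `diffs` mismatches; otherwise
    it is kept as its own entry.
    """
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    kept = {}
    for seq, n in ordered:
        for center in kept:
            if len(center) == len(seq) and mismatches(center, seq) <= diffs:
                kept[center] += n
                break
        else:
            kept[seq] = n
    return kept


def drop_singletons(counts):
    """Remove every sequence that is observed exactly once."""
    return {seq: n for seq, n in counts.items() if n > 1}


# Hypothetical toy reads, invented purely for illustration.
reads = ["ACGTACGT"] * 5 + ["ACGTACGA"] + ["TTGTACCA"] * 3 + ["GGGTAACC"]

unique = dereplicate(reads)
merged = merge_rare_into_abundant(unique, diffs=1)
no_singletons = drop_singletons(merged)

print(len(unique), "unique sequences")
print(len(merged), "sequences after merging within 1 mismatch")
print(len(no_singletons), "sequences after also dropping singletons")
```

In this toy data the single-mismatch variant is absorbed into its abundant neighbor, while the distinct rare sequence survives merging and disappears only when singletons are dropped. That is the change in community structure described above.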

