Robustness and Reproducibility in the Demarcation of OTUs

Schmidt TSB, Matias Rodrigues JF, von Mering C (2014). Limits to Robustness and Reproducibility in the Demarcation of Operational Taxonomic Units. Environ Microbiol, doi:10.1111/1462-2920.12610

We have investigated six different methods for ‘de novo’ OTU demarcation: hierarchical average linkage (AL), complete linkage (CL) and single linkage (SL) clustering, as well as CD-HIT, UCLUST and UPARSE. Our basic question was: how comparable are the results provided by these methods? When analyzing a 16S dataset, how much interpretation bias is introduced by the choice of clustering method alone? How robust is clustering to minor parameter changes, such as minute threshold changes or changes in clustering context? How similar are clusterings based on subregions only to full-length clusterings?
We ran a series of tests to look into these questions. We focused on cluster composition to measure similarity between partitions – do two given methods tend to bin the same sequences together, or not?
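To make the idea of comparing partitions by cluster composition concrete, the following is a minimal sketch that counts, over all pairs of sequences, how often two clusterings agree on binning a pair together or apart. The Rand and Jaccard indices are used here purely for illustration and are not necessarily the measures used in the study; the function name `pair_agreement` and the toy labels are hypothetical.

```python
from itertools import combinations


def pair_agreement(labels_a, labels_b):
    """Compare two partitions of the same sequences by pairwise co-clustering.

    labels_a, labels_b: dicts mapping sequence ID -> OTU label.
    Returns (rand_index, jaccard_index).
    """
    if set(labels_a) != set(labels_b):
        raise ValueError("both partitions must cover the same sequences")

    both = only_a = only_b = neither = 0
    for s, t in combinations(sorted(labels_a), 2):
        same_a = labels_a[s] == labels_a[t]  # pair binned together by method A?
        same_b = labels_b[s] == labels_b[t]  # pair binned together by method B?
        if same_a and same_b:
            both += 1
        elif same_a:
            only_a += 1
        elif same_b:
            only_b += 1
        else:
            neither += 1

    total = both + only_a + only_b + neither
    # Rand index: fraction of pairs on which the two methods agree.
    rand = (both + neither) / total if total else 1.0
    # Jaccard index: agreement restricted to pairs co-clustered by at least one method.
    together = both + only_a + only_b
    jaccard = both / together if together else 1.0
    return rand, jaccard


# Toy example (hypothetical data): five sequences clustered by two methods.
method_a = {"s1": 0, "s2": 0, "s3": 1, "s4": 1, "s5": 2}
method_b = {"s1": "x", "s2": "x", "s3": "x", "s4": "y", "s5": "z"}
rand, jaccard = pair_agreement(method_a, method_b)
print(rand, jaccard)
```

The OTU labels themselves never need to match between methods; only the co-membership of sequence pairs is compared. This is what allows partitions from different tools to be scored against each other.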

Our results indicate that AL and CL, and somewhat surprisingly also CD-HIT, provided highly robust and reproducible clusterings, whereas SL, UCLUST and UPARSE were more sensitive to even slight parameter changes. For example, simply moving from a 97% cutoff to 97.2% led to significantly different clusterings for the latter heuristics.
We observed that clustering for all methods was generally ‘replicable’ – when repeating the exact same clustering run, results were replicated. However, not all methods were also ‘reproducible’ – that is, they did not necessarily provide concordant results under slightly changed computational setups.
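As a toy illustration of how a minute threshold change can alter a partition, the sketch below applies single linkage clustering to four sequences at 97% and at 97.2% identity. The pairwise identities are invented for the example, and the helper `single_linkage` is a hypothetical, simplified implementation (connected components over pairs at or above the threshold), not the code used in the study.

```python
def single_linkage(ids, identity, threshold):
    """Single linkage clustering at a fixed identity threshold.

    ids: list of sequence IDs.
    identity: dict mapping (id1, id2) -> pairwise identity in [0, 1].
    Returns a sorted list of clusters, each a sorted list of IDs.
    """
    parent = {s: s for s in ids}

    def find(s):
        # Union-find lookup with path halving.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    # Merge the clusters of every pair that meets the threshold.
    for (s, t), ident in identity.items():
        if ident >= threshold:
            parent[find(s)] = find(t)

    clusters = {}
    for s in ids:
        clusters.setdefault(find(s), []).append(s)
    return sorted(sorted(members) for members in clusters.values())


# Invented identities: s2 and s3 are 97.1% identical and bridge two tight pairs.
ids = ["s1", "s2", "s3", "s4"]
identity = {
    ("s1", "s2"): 0.985,
    ("s2", "s3"): 0.971,
    ("s3", "s4"): 0.990,
    ("s1", "s3"): 0.960,
    ("s1", "s4"): 0.955,
    ("s2", "s4"): 0.965,
}

at_970 = single_linkage(ids, identity, 0.970)
at_972 = single_linkage(ids, identity, 0.972)
print(at_970)  # one OTU containing all four sequences
print(at_972)  # two OTUs: s1+s2 and s3+s4
```

In this constructed case a single bridging pair just above 97% chains all four sequences into one OTU, and raising the cutoff by 0.2 percentage points splits it in two. Real datasets are far larger, but the same chaining effect is one plausible reason why single linkage reacts strongly to small threshold shifts.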

