rarefaction curve and singleton

Dear all,

I need your expertise on rarefaction curves, specifically on whether a rarefaction curve should be used to analyse data from which singletons have been removed (a singleton is a read whose sequence is present exactly once). The web page (http://www.drive5.com/usearch/manual/abrare.html) states that standard rarefaction software and the original formula for the rarefaction curve cannot give a correct result for data without singletons.

I know that removing singleton sequencing reads is a common procedure in NGS data cleaning, and it is what I did for my sequencing data. I use mothur for quality filtering. After that, the OTUs were clustered from the data (without singletons), and the rarefaction curve of the OTUs was calculated using the command “rarefaction.single” in mothur. However, the removal of singletons is not included in the mothur SOP for MiSeq. Here is an example rarefaction curve for my MiSeq data (https://drive.google.com/file/d/0BzJAtgd-R9RgZjQ2QTZEamc3SUk/view?usp=sharing)

Is it true that standard programmes cannot give the correct result for sequencing data without singletons, and what are the (mathematical) reasons? Is mothur among such standard programmes? If so, what would be the right method/programme for my data set?




Best wishes

Do not remove singletons. This is based on a mistaken idea and introduces various biases depending on how you do it. Let rarefaction do the correction for you.
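
To illustrate the mathematical side of the question, here is a minimal Python sketch of the classical analytical rarefaction formula (Hurlbert's expected richness), applied to the same community with and without its singletons. It is only a sketch: the abundance counts are invented for illustration, the names `rarefy`, `with_singletons` and `without_singletons` are hypothetical, and this is not a reproduction of how mothur's rarefaction.single computes its curves.

```python
from math import comb

def rarefy(abundances, n):
    """Expected number of OTUs seen in a random subsample of n reads,
    drawn without replacement (classical analytical rarefaction)."""
    total = sum(abundances)
    if not 0 <= n <= total:
        raise ValueError("n must be between 0 and the total number of reads")
    denom = comb(total, n)
    # For each OTU: 1 - P(no read of that OTU lands in the subsample).
    return sum(1 - comb(total - a, n) / denom for a in abundances)

# Invented example community: reads per OTU, three of them singletons.
with_singletons = [50, 30, 10, 5, 2, 1, 1, 1]
# The same community after discarding every OTU seen exactly once.
without_singletons = [a for a in with_singletons if a > 1]

for n in (10, 20, 50):
    print(n, round(rarefy(with_singletons, n), 2),
          round(rarefy(without_singletons, n), 2))
```

The formula depends on the full abundance distribution, and the rare OTUs are what keep the curve climbing at higher sampling depths. Once the singletons are stripped out, the curve is computed on a different distribution and levels off earlier, so it no longer describes the community that was actually sequenced.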

Pat