I read a paper evaluating sub-sampling methods: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3233110/. The authors suggested that sub-sampling to the median number of reads was more accurate than sub-sampling to the minimum number.
I thought I would start a friendly discussion here about the rationale and reasoning behind the different sub-sampling methods.
I suppose I could use Python to “recode” the data, as mentioned in the paper, for median sub-sampling. Mothur would sub-sample to the minimum size. I’m more interested in the rationale.
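To make the comparison concrete, here is a rough Python sketch of the two targets. It is not the paper's "recode" procedure and not what mothur does internally. The function names `subsample_counts` and `rarefy_table` are made up for illustration, the table is assumed to be samples in rows and OTUs in columns, and leaving samples that fall below the target depth untouched is my own assumption about how the median case might be handled.

```python
import numpy as np

def subsample_counts(counts, depth, rng):
    """Draw `depth` reads without replacement from one sample's OTU counts.

    Assumption: a sample with `depth` reads or fewer is returned unchanged.
    """
    counts = np.asarray(counts, dtype=np.int64)
    if counts.sum() <= depth:
        return counts.copy()
    # Multivariate hypergeometric = sampling reads without replacement.
    return rng.multivariate_hypergeometric(counts, depth)

def rarefy_table(table, how="median", seed=0):
    """Sub-sample every row of `table` to the median or minimum read total."""
    rng = np.random.default_rng(seed)
    table = np.asarray(table, dtype=np.int64)
    totals = table.sum(axis=1)
    # int() truncates a fractional median (even number of samples).
    depth = int(np.median(totals)) if how == "median" else int(totals.min())
    sub = np.array([subsample_counts(row, depth, rng) for row in table])
    return sub, depth

# Toy table: three samples with 100, 1000 and 10000 reads.
table = [[50, 30, 20], [500, 300, 200], [5000, 3000, 2000]]
sub_med, depth_med = rarefy_table(table, how="median")
sub_min, depth_min = rarefy_table(table, how="minimum")
print(depth_med, sub_med.sum(axis=1))
print(depth_min, sub_min.sum(axis=1))
```

With the minimum as the target, every sample ends up at the same depth but most reads from the deeper samples are discarded. With the median as the target, more reads are kept, but in this sketch the shallow samples stay at unequal depths, so they would still need some other treatment.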
I would appreciate any opinions.