Should alpha diversity also be calculated on subsampled data?

Yes, pick a number of sequences and subsample to that for both alpha and beta indices.


I’ve been told the same by several of my microbiome colleagues and that is what I do.

The question for me: what value to pick?

Recent example: I have ~40 patient samples, and I added 10 reagent controls (blanks, etc.) to that run. The nseqs for the patient samples ranged from 9,500 to 30,000. The highest nseqs for my reagent controls (I have a great tech :smiley: ) was 400. So it was easy for me, in calculating alpha and beta indices, to set subsampling to 9,500.

Is that right? And what would be a good rationale for setting the value?

I like to go a bit below my lowest sample, because subsampling the sample that has 9,500 to 9,500 is different from subsampling the 30,000 to 9,500: the first just returns every read, while the second is a random draw. I’d probably have gone with 7,500. But this is probably a really minor point.
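
To make that difference concrete, here is a rough sketch in plain Python of subsampling one sample's OTU counts without replacement. It is only an illustration, not how any particular package implements it; the function name `subsample_counts` and the OTU counts are made up for the example.

```python
import random
from collections import Counter

def subsample_counts(counts, depth, seed=None):
    """Draw `depth` reads without replacement from one sample's OTU counts.

    counts: dict mapping OTU label -> read count (hypothetical data).
    Returns a dict of subsampled counts, or None if the sample has
    fewer than `depth` reads. OTUs that receive zero reads are omitted.
    """
    total = sum(counts.values())
    if total < depth:
        return None
    rng = random.Random(seed)
    # Expand to one entry per read, then sample without replacement.
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    return dict(Counter(rng.sample(pool, depth)))

# Made-up sample with exactly 9,500 reads
shallow = {"Otu1": 6000, "Otu2": 3000, "Otu3": 500}

# Subsampling 9,500 reads to 9,500 hands back every read unchanged
print(subsample_counts(shallow, 9500, seed=1) == shallow)

# Subsampling to 7,500 is a genuine random draw
print(subsample_counts(shallow, 7500, seed=1))
```

With a depth of 7,500, every sample, including the shallowest one, goes through the same random draw.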

I’m still trying to figure out what to do with the negative controls. For now I’m processing them with the rest of the samples, but they get dropped when subsampled. That is OK for me: clients have the data, so they can see what OTUs show up in the negatives and decide what to do with that information.
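
If it helps to see up front which samples a given depth will drop, something like this small check works. It is just a sketch: the sample names and read counts are invented to echo the numbers in this thread, and `split_by_depth` is a hypothetical helper, not a function from any package.

```python
def split_by_depth(nseqs, depth):
    """Return (kept, dropped) sample names for a subsampling depth.

    nseqs: dict mapping sample name -> total number of sequences.
    Samples with fewer than `depth` sequences cannot be subsampled
    to that depth, so they land in `dropped`.
    """
    kept = sorted(name for name, n in nseqs.items() if n >= depth)
    dropped = sorted(name for name, n in nseqs.items() if n < depth)
    return kept, dropped

# Invented read counts: three patient samples and two reagent blanks
nseqs = {
    "patient_A": 9500,
    "patient_B": 30000,
    "patient_C": 18250,
    "blank_1": 400,
    "blank_2": 120,
}

kept, dropped = split_by_depth(nseqs, 7500)
print("kept:", kept)
print("dropped:", dropped)
```

Here only the blanks fall below the depth, so all patient samples are retained and the controls drop out of the subsampled analysis.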

Thanks, kmitchell, for the informative reply.