Hello, mothur community,
I have been working on a project that includes 16S V4 data from mouse lungs, provided by a collaborator. I processed these data through the MiSeq pipeline together with the associated animal fecal samples for OTU clustering, and then split the shared file into lung samples and fecal samples for separate analyses.
Read depth varies substantially across the lung samples, from 26 reads in the shallowest sample to 99,672 reads in the deepest. The samples with very low coverage likely do not have sufficient sequencing depth to represent the microbial community present. I am trying to define a cutoff for what counts as “sufficient” coverage so that I can subset these samples. I have generated the attached rarefaction curves, where I can see that most curves approach an asymptote by around 1,000 reads, but this is, of course, just my subjective view. Is there a better way to define this cutoff for the inclusion of samples and the subsequent subsampling for OTU enrichment analysis?
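To make the question concrete, one criterion I could imagine using instead of eyeballing the curves is Good's coverage, which is 1 minus the number of singleton OTUs divided by the total number of reads in a sample. Below is a minimal sketch of that idea, not something I have validated. The sample names and counts are made-up toy data, the function names are my own, and the 0.97 threshold is an arbitrary placeholder rather than a recommendation.

```python
def goods_coverage(otu_counts):
    """Good's coverage for one sample: 1 - (singleton OTUs / total reads)."""
    total = sum(otu_counts)
    if total == 0:
        return 0.0  # an empty sample has no coverage by definition here
    singletons = sum(1 for count in otu_counts if count == 1)
    return 1.0 - singletons / total


def samples_passing(samples, min_coverage=0.97):
    """Return the sorted names of samples whose coverage meets the threshold.

    `samples` maps a sample name to its list of per-OTU read counts.
    `min_coverage` is a placeholder value (assumption), not a standard.
    """
    return sorted(
        name
        for name, counts in samples.items()
        if goods_coverage(counts) >= min_coverage
    )


# Hypothetical toy data, not taken from my actual shared file.
toy_samples = {
    "lung_low": [1, 1, 1, 2, 1],           # 6 reads, 4 singletons
    "lung_high": [500, 300, 150, 49, 1],   # 1000 reads, 1 singleton
}

if __name__ == "__main__":
    for name, counts in toy_samples.items():
        print(name, sum(counts), round(goods_coverage(counts), 3))
    print("kept:", samples_passing(toy_samples))
```

With a rule like this, a sample would be kept based on how completely it was sampled rather than on a fixed read count, although the threshold itself would still be a judgment call.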