Pooling sequence data


I am posting this to (hopefully) stimulate a bit of discussion on this topic.

In quite a few 16S experiments, authors choose to pool their samples, either before DNA extraction or after extraction but before preparing them for sequencing. In my own work, I take individual samples forward for processing and sequencing. However, I am now interested in pooling the sequence data by treatment (to establish a crude “mean”) and running statistical analyses on the pooled data. The rationale is to estimate the sample sizes needed for a future experiment.

Firstly, any advice on methods to proceed with the above would be great!

Secondly, what is the consensus on the analysis described above? Clearly it is important to run analyses on the individual samples, but from an experimental unit point of view (i.e. pen, not individual), technically I have to proceed with the above!

Many thanks in advance,


If you can afford to sequence them individually, do. Once they're sequenced, cluster each sample individually, then run stats on the treatments. Don't pool sequences before clustering.
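
If you still need a value per experimental unit (the pen), one option is to summarise after clustering rather than pooling reads beforehand. Below is a minimal sketch of that idea, not a recommended pipeline: it assumes you already have a per-sample OTU count table, and it averages per-sample relative abundances within each pen so that a deeply sequenced animal does not dominate. The sample names, OTU names, counts, and the function names `relative_abundance` and `mean_profile_by_group` are all made up for illustration.

```python
from collections import defaultdict


def relative_abundance(counts):
    """Convert raw OTU counts for one sample into proportions."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {otu: n / total for otu, n in counts.items()}


def mean_profile_by_group(otu_table, sample_to_group):
    """Average per-sample relative abundances within each group (e.g. pen)."""
    sums = defaultdict(lambda: defaultdict(float))
    n_samples = defaultdict(int)
    for sample, counts in otu_table.items():
        group = sample_to_group[sample]
        n_samples[group] += 1
        for otu, p in relative_abundance(counts).items():
            sums[group][otu] += p
    # Divide by the number of samples in the group; an OTU absent from a
    # sample contributes zero to that group's mean.
    return {
        g: {otu: s / n_samples[g] for otu, s in otus.items()}
        for g, otus in sums.items()
    }


# Hypothetical per-sample OTU counts (produced after clustering each sample).
otu_table = {
    "animal_1": {"OTU_A": 50, "OTU_B": 50},
    "animal_2": {"OTU_A": 90, "OTU_B": 10},
    "animal_3": {"OTU_A": 20, "OTU_B": 80},
}
# Hypothetical mapping from each sample to its pen.
sample_to_pen = {
    "animal_1": "pen_1",
    "animal_2": "pen_1",
    "animal_3": "pen_2",
}

pen_profiles = mean_profile_by_group(otu_table, sample_to_pen)
for pen, profile in sorted(pen_profiles.items()):
    print(pen, {otu: round(p, 3) for otu, p in sorted(profile.items())})
```

The pen-level profiles can then go into whatever test you run on treatments, with pen as the unit of replication. Averaging proportions is only one choice; summing raw counts per pen would instead weight animals by sequencing depth.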