Assessing differences in communities sequenced at two different sequencing facilities (same primers)

Hi Y’all,

I would love some feedback from folks on approaches for assessing whether differences in the observed microbial communities are ecological or an artifact of the samples being sequenced at different sequencing centers.

Background: we have a dataset from 2014 and 2016. Both years contain samples collected from the same aquatic environment, and the samples were collected and processed the same way. However, in 2014 the samples were sequenced at Location A using Illumina MiSeq, and in 2016 the samples were sequenced at Location B. The samples include a prefilter and a Sterivex filter, which separate particle-associated and planktonic cells. They were sequenced using the same primer set.

We don’t see a year-to-year difference when we look at all the samples (prefilter and Sterivex) from both years together; the biggest driver of community structure is filter type. But when we dive into the individual fractions, year becomes very important within both the prefilter and the Sterivex-associated communities: it is the #1 or #2 driver (by R2) among the variables we measured.
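For anyone following along, here is a rough sketch of the kind of within-fraction test described above: a one-way PERMANOVA on Bray-Curtis distances, reporting R2 for year. This is only an illustration in plain NumPy, not the exact pipeline we ran. The count table is simulated, and the names (`bray_curtis`, `permanova_r2`, `counts`, `year`) are made up for the example. Note that it only measures how much variation tracks year; it cannot tell you whether that is ecology or the sequencing facility.

```python
import numpy as np

def bray_curtis(counts):
    """Pairwise Bray-Curtis dissimilarity for a samples x taxa count matrix."""
    counts = np.asarray(counts, dtype=float)
    # Sum of absolute differences over taxa, divided by the combined totals
    num = np.abs(counts[:, None, :] - counts[None, :, :]).sum(axis=2)
    totals = counts.sum(axis=1)
    return num / (totals[:, None] + totals[None, :])

def permanova_r2(dist, groups, n_perm=999, seed=0):
    """One-way PERMANOVA: returns (R2, pseudo-F, permutation p-value)."""
    dist = np.asarray(dist, dtype=float)
    groups = np.asarray(groups)
    n = len(groups)
    d2 = dist ** 2
    # Total sum of squares from all pairwise squared distances
    ss_total = d2[np.triu_indices(n, k=1)].sum() / n

    def ss_within(labels):
        total = 0.0
        for g in np.unique(labels):
            idx = np.where(labels == g)[0]
            sub = d2[np.ix_(idx, idx)]
            total += sub[np.triu_indices(len(idx), k=1)].sum() / len(idx)
        return total

    def pseudo_f(labels):
        k = len(np.unique(labels))
        ssw = ss_within(labels)
        return ((ss_total - ssw) / (k - 1)) / (ssw / (n - k))

    f_obs = pseudo_f(groups)
    r2 = 1.0 - ss_within(groups) / ss_total
    # Null distribution: shuffle the group labels across samples
    rng = np.random.default_rng(seed)
    perm_f = np.array([pseudo_f(rng.permutation(groups)) for _ in range(n_perm)])
    p = (np.sum(perm_f >= f_obs) + 1) / (n_perm + 1)
    return r2, f_obs, p

# Simulated data standing in for ONE filter fraction (assumption, not real data):
# 6 samples per year, 20 taxa, half of the taxa more abundant in 2016.
rng = np.random.default_rng(42)
base = np.full(20, 20.0)
shifted = base.copy()
shifted[:10] = 60.0
counts = np.vstack([rng.poisson(base, size=(6, 20)),
                    rng.poisson(shifted, size=(6, 20))])
year = np.array([2014] * 6 + [2016] * 6)

dist = bray_curtis(counts)
r2, f_obs, p = permanova_r2(dist, year)
print(f"year R2 = {r2:.2f}, pseudo-F = {f_obs:.1f}, p = {p:.3f}")
```

In practice you would run this once per fraction (prefilter, Sterivex) on the real count table rather than on simulated counts.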

The year-to-year difference may be ecologically relevant; however, IMO, we need to make sure that it reflects true year-to-year variation rather than the sequencer/location.

I would love it if anyone has any insight into how to help tease this apart or provide statistical backing for it.

Hey -

Do you have any samples that were sequenced in both batches? Would it be possible to take a handful of each and resequence them together? I’m afraid that if the conditions and sequencers are confounded, you really can’t say what’s driving the difference. This would be an example of a batch effect.
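To make the confounding point concrete, here is a small sketch, assuming a hypothetical sample sheet with 6 samples per year. If every 2014 sample went to Location A and every 2016 sample went to Location B, the year and facility columns of a design matrix are identical, so the matrix is rank deficient and no model can estimate the two effects separately. Adding a few resequenced "bridging" samples (also hypothetical here) restores full rank. The function and variable names are made up for illustration.

```python
import numpy as np

def design_matrix(years, facility):
    """Intercept + indicator for 2016 + indicator for Location B."""
    intercept = np.ones(len(years))
    is_2016 = (years == 2016).astype(float)
    is_loc_b = (facility == "B").astype(float)
    return np.column_stack([intercept, is_2016, is_loc_b])

# Hypothetical sample sheet matching the situation in this thread:
# all 2014 samples at Location A, all 2016 samples at Location B.
years = np.array([2014] * 6 + [2016] * 6)
facility = np.array(["A"] * 6 + ["B"] * 6)

X = design_matrix(years, facility)
rank = np.linalg.matrix_rank(X)
confounded = rank < X.shape[1]  # True: year and facility cannot be separated
print(f"rank {rank} of {X.shape[1]} columns, confounded = {confounded}")

# Same sheet plus three hypothetical 2014 samples resequenced at Location B.
years_bridged = np.append(years, [2014, 2014, 2014])
facility_bridged = np.append(facility, ["B", "B", "B"])

X_bridged = design_matrix(years_bridged, facility_bridged)
bridged_rank = np.linalg.matrix_rank(X_bridged)
print(f"with bridging samples: rank {bridged_rank} of {X_bridged.shape[1]} columns")
```

The same check works on a real metadata table: if the rank is lower than the number of columns, the design itself rules out separating the two effects.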

Pat

Unfortunately, I don’t think we do, and my main concern is a batch effect (I couldn’t think of the name in the original post). I don’t think we can separate the year-to-year differences in communities from the batch-to-batch differences. I was hoping to limit the effect with a correction (e.g., ConQuR) before further downstream analyses, but wasn’t sure if that was possible.
https://www.nature.com/articles/s41467-022-33071-9