Hey, anyone.
I’m curious what you think the best possible experimental design is for an amplicon-based study. Mock communities are great (not that I’ve had the pleasure of using one yet), but I’m wondering whether spike-ins could also improve the data’s reliability?
Just wondering what your “ideal” experiment looks like, especially considering the idea of combining independent experiments together.
Thanks!
Hey -
Spike-ins are an interesting idea. The challenge is what you are going to spike in - cells or DNA? If cells, then you’re assuming that they will attach to the environmental matrix and lyse as efficiently as if they had grown there natively. If DNA, then you lose the ability to look at extraction efficiency. Ultimately, even if you overlook these problems, spike-ins are best suited to quantification, and you can get much better quantification by doing qPCR with universal primers. If your question is about comparing your data to someone else’s, I don’t think spike-ins will help.
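To make the quantification idea concrete, here is a minimal sketch of how spike-in reads are typically used to rescale counts. The function name, the OTU labels, and the numbers are all hypothetical, and the calculation assumes the spike-in was extracted and amplified with the same efficiency as the native taxa, which is exactly the assumption questioned above. It also ignores 16S copy number differences.

```python
def spike_in_scale(counts, spike_id, spike_copies_added):
    """Rescale read counts so the spike-in matches its known input amount.

    counts: dict mapping OTU name -> read count for one sample (toy data).
    spike_id: name of the spike-in feature (hypothetical label).
    spike_copies_added: known number of spike-in copies added to the sample.
    """
    spike_reads = counts[spike_id]
    if spike_reads == 0:
        raise ValueError("no spike-in reads recovered; cannot scale this sample")
    # Copies represented by each read, ASSUMING equal recovery for all taxa.
    factor = spike_copies_added / spike_reads
    return {otu: n * factor for otu, n in counts.items() if otu != spike_id}


# Toy sample, invented for illustration only.
sample = {"Otu001": 600, "Otu002": 300, "SPIKE": 100}
estimates = spike_in_scale(sample, "SPIKE", spike_copies_added=1000)
print(estimates)
```

The estimates are only as good as the equal-recovery assumption, which is why qPCR with universal primers is the more direct route to total abundance.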
For comparison of independent experiments, I would strongly encourage you to interpret your results independently and then make comparisons. You cannot use someone else’s data as your control since the data were generated by different methods, by different people, with different equipment. If the signal is strong enough to be real, you should see it across studies. We’ve done work like what I suggest in a study on obesity and colon cancer.
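As a rough sketch of what "interpret independently, then compare" can look like: compute the effect within each study using only that study's own cases and controls, then check whether the direction agrees. The study names and relative abundances below are invented toy values, and a log2 ratio of means is just one simple choice of effect size, not the method used in any particular paper.

```python
import math
from statistics import mean


def log2_fold_change(case, control, pseudo=1e-6):
    """Log2 ratio of mean relative abundance, case vs. control, within one study."""
    return math.log2((mean(case) + pseudo) / (mean(control) + pseudo))


# Toy relative abundances of a single taxon; each study keeps its own controls.
studies = {
    "study_A": {"case": [0.10, 0.12, 0.08], "control": [0.05, 0.04, 0.06]},
    "study_B": {"case": [0.30, 0.25, 0.35], "control": [0.20, 0.15, 0.25]},
}

effects = {name: log2_fold_change(d["case"], d["control"])
           for name, d in studies.items()}

# A robust signal should at least point the same way in every study.
same_direction = (all(e > 0 for e in effects.values())
                  or all(e < 0 for e in effects.values()))
print(effects, same_direction)
```

Note that the raw abundances differ a lot between the two toy studies, but no sample from one study is ever compared directly against a sample from the other.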
Hope this helps,
Pat
Hey, Pat
Thanks for the considerations. I’m designing a study that will likely involve multiple experiments. Do you think it’s possible to ensure that the results of each experiment are directly comparable? Even if the samples were taken at different times or in different experiments, can’t they still be combined if they were all sequenced together?
Additionally, does anyone think HarmonizR* would be a viable batch correction tool, i.e., with representative OTU sequences as the features and their abundances as the values? I mention HarmonizR because it was designed with the expectation of missingness (i.e., some representative sequences may be missing from some samples). In this context, a mock community could be used as a reference batch. I don’t know; any thoughts on this would be appreciated.
*: HarmonizR basically applies sva::ComBat() or limma::removeBatchEffect() to each feature/representative_seq
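To illustrate the general idea of per-feature batch adjustment with missing values, here is a deliberately simplified sketch in Python. It is not HarmonizR, ComBat, or removeBatchEffect(); it only mean-centers one feature within each batch and skips missing entries. The function name, batch labels, and values are hypothetical.

```python
from statistics import mean


def center_by_batch(values, batches):
    """Shift each batch so its mean matches the overall mean for one feature.

    values: log-scale abundances for one feature; None marks a missing value.
    batches: batch label for each sample, same length as values.
    """
    present = [v for v in values if v is not None]
    if not present:
        return list(values)  # feature absent everywhere; nothing to adjust
    grand = mean(present)
    batch_means = {}
    for b in set(batches):
        observed = [v for v, bb in zip(values, batches)
                    if bb == b and v is not None]
        if observed:
            batch_means[b] = mean(observed)
    # Missing values stay missing; observed ones are re-centered.
    return [None if v is None else v - batch_means[b] + grand
            for v, b in zip(values, batches)]


# Toy data for one representative sequence across two sequencing runs.
log_abund = [2.0, 4.0, None, 5.0, 7.0]
batches = ["run1", "run1", "run1", "run2", "run2"]
corrected = center_by_batch(log_abund, batches)
print(corrected)
```

One thing the sketch makes obvious: if the batches also differ biologically (different experiments, different time points), centering removes that real difference along with the technical one.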