Removing vs. labelling contaminants

Hi! Prior to subsampling/rarefaction, I compared my data to my negative controls and realized there were some ASVs in my data that are probably contaminants. I was wondering if I should remove these ASVs from all of my samples or rename them “contaminant_ASV” prior to subsampling and rarefaction. My worry is that if I remove them, then that artificially inflates the number of “good” ASVs during subsampling and rarefaction.

In most cases, I actually encourage people to leave those in. This is one of the (many) reasons I encourage people to use rarefaction. You’ll effectively be treating all of the samples equally, and many of the contaminants will likely wash out during rarefaction because they will be rare. If you notice an ASV/OTU that overlaps with a contaminant, then when you test for differential abundance, be reluctant to draw strong conclusions about anything that is on the contaminant list and is low abundance.
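To make the "wash out" point concrete, here is a minimal Python sketch of rarefying a single sample by drawing reads without replacement. The `rarefy` function, the counts, and the depth of 1000 are all made up for illustration; this is not how mothur or any other package implements it, just the general idea.

```python
import numpy as np

def rarefy(counts, depth, seed=0):
    """Subsample one sample's ASV counts to `depth` reads without replacement."""
    counts = np.asarray(counts, dtype=int)
    if counts.sum() < depth:
        raise ValueError("sample has fewer reads than the requested depth")
    rng = np.random.default_rng(seed)
    # Expand counts into one entry per read, labelled by ASV index.
    reads = np.repeat(np.arange(counts.size), counts)
    # Draw `depth` reads without replacement and tally them per ASV.
    kept = rng.choice(reads, size=depth, replace=False)
    return np.bincount(kept, minlength=counts.size)

# Hypothetical sample: three real ASVs plus a rare suspected contaminant (last entry).
sample = np.array([6000, 3000, 990, 10])
rarefied = rarefy(sample, depth=1000)
print(rarefied)
```

With these made-up numbers the suspected contaminant is 0.1% of the reads, so at a depth of 1000 you expect it to contribute about one read, and in many draws none at all. Leaving it in also keeps the other ASVs from picking up the reads it would otherwise have taken.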

On the other hand, if something shows up in the negative control, is high abundance in your samples, and you are super confident it is a contaminant, then you might be safe to remove it from your samples. I would really only do this in low biomass samples (e.g., lung).
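If it helps, a rough Python sketch of flagging (not removing) those candidates might look like the following. The `flag_candidates` function, the 1% mean relative abundance cutoff, and the counts are all assumptions for illustration, not a recommended threshold or an established method. It also assumes every sample has at least one read.

```python
import numpy as np

def flag_candidates(sample_counts, control_counts, min_rel_abund=0.01):
    """Flag ASVs seen in the negative control that are also abundant in samples.

    sample_counts: 2D array, rows are samples and columns are ASVs.
    control_counts: 1D array of negative-control counts for the same ASVs.
    min_rel_abund: hypothetical cutoff on mean relative abundance across samples.
    """
    sample_counts = np.asarray(sample_counts, dtype=float)
    control_counts = np.asarray(control_counts)
    # Convert each sample to relative abundances so samples are comparable.
    rel = sample_counts / sample_counts.sum(axis=1, keepdims=True)
    in_control = control_counts > 0
    abundant = rel.mean(axis=0) >= min_rel_abund
    return in_control & abundant

# Made-up counts: two samples, four ASVs, one negative control.
samples = [[500, 400, 100, 0],
           [450, 450, 90, 10]]
control = [0, 20, 0, 5]
flags = flag_candidates(samples, control)
print(flags)
```

Anything flagged this way is only a candidate to look at by hand. Whether it is actually a contaminant still depends on the sample type and the questions in the next paragraph.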

Depending on how you do your library prep, things that show up in your negative control could be coming from your samples. I find that people are quick to “blindly” remove things they think are contaminants without asking a bunch of other questions first.
