Dear all,

First of all, apologies for spamming the forum, but this is the only place with the right people to answer my question.

We recently received a review of a paper we submitted. Among other things, the paper includes a differential representation analysis done with the indicspecies package.

Just a bit of background: the study examines two groups of samples from cows, called shedders (the O157 group) and non-shedders (the non-O157 group). If the O157 pathogen is detected in a cow's feces, the animal is called a shedder; if not, it is called a non-shedder.

And here is the reviewer’s comment:
“The microbiome study is interesting, and again several cited studies support the idea that microbiota composition may play a role in persistence. An indicator species analysis was successful in identifying OTUs that was found in non-O157 shedding microbiomes and other found in O157 shedding genomes. Unfortunately, the suggested indicator variants were not tested on an independent non/shedding dataset, so there was no validation of the significance of the identified variants.”

Is the reviewer, in all seriousness, asking us to go and collect a different set of samples from non-O157 animals and check whether the non-O157 OTUs (which we found to be representative of the non-O157 group) are still there?

Thank you all in advance for your responses. I just want to make sure I understood the reviewer's point!

It sounds like they are asking for that. I’m with you in thinking that this is a big ask at this point in the field’s development. I think you have a few options:

  1. Politely tell them to go fly a kite, with something like: “Future investigations will be needed to validate these biomarkers and to determine whether they have a mechanistic role in O157 shedding.” I’d especially go this route for non-Nature/Science manuscripts.

  2. Point out that the taxa you found to be important have been seen in similar roles in other studies, i.e., the results make sense.

  3. Could you artificially create a held-out dataset? You could hold out 20% of your samples, run the test on the other 80%, and then see whether the results from the 80% hold up in the 20%. You could then repeat that a bunch of times with different random splits. This is a form of cross-validation and is often used in situations like this one.
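
To make option 3 a bit more concrete, here is a minimal Python sketch of the repeated hold-out idea. It is not the indicspecies implementation: `indval` is a simplified IndVal-style stand-in (specificity times fidelity on presence/absence data), and the function names, the 0.7 threshold, and the toy data are all assumptions made up for illustration. In practice you would run your real indicator test on the 80% and apply your own criterion for "holds up" on the 20%.

```python
import random


def indval(presence, labels, otu, group):
    """Simplified IndVal-style score (assumption, not the indicspecies code):
    specificity (A) times fidelity (B) on presence/absence data."""
    freq = {}
    for g in sorted(set(labels)):
        hits = [row[otu] for row, lab in zip(presence, labels) if lab == g]
        freq[g] = sum(hits) / len(hits)  # fraction of group samples with the OTU
    total = sum(freq.values())
    if total == 0:
        return 0.0  # OTU absent everywhere in this subset
    specificity = freq[group] / total  # A: how exclusive the OTU is to the group
    fidelity = freq[group]             # B: how consistently it occurs in the group
    return specificity * fidelity


def stratified_split(labels, test_fraction, rng):
    """Hold out test_fraction of each group (needs >= 2 samples per group)."""
    train, test = [], []
    for g in sorted(set(labels)):
        idx = [i for i, lab in enumerate(labels) if lab == g]
        rng.shuffle(idx)
        n_test = max(1, round(len(idx) * test_fraction))
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return train, test


def repeated_holdout(presence, labels, group, n_repeats=100,
                     test_fraction=0.2, threshold=0.7, seed=0):
    """Count, per OTU, how often it is selected on the training split and
    how often that selection is confirmed on the held-out split."""
    rng = random.Random(seed)
    n_otus = len(presence[0])
    selected = [0] * n_otus
    confirmed = [0] * n_otus
    for _ in range(n_repeats):
        train, test = stratified_split(labels, test_fraction, rng)
        tr_x = [presence[i] for i in train]
        tr_y = [labels[i] for i in train]
        te_x = [presence[i] for i in test]
        te_y = [labels[i] for i in test]
        for otu in range(n_otus):
            if indval(tr_x, tr_y, otu, group) >= threshold:
                selected[otu] += 1
                if indval(te_x, te_y, otu, group) >= threshold:
                    confirmed[otu] += 1
    return selected, confirmed


# Toy presence/absence table (made up): 10 non-shedders, then 10 shedders.
labels = ["non-shedder"] * 10 + ["shedder"] * 10
presence = []
for i, lab in enumerate(labels):
    non = lab == "non-shedder"
    k = i % 10  # position of the sample within its group
    presence.append([
        1 if non else 0,                   # OTU 0: only in non-shedders
        1,                                 # OTU 1: present in every sample
        0 if non else 1,                   # OTU 2: only in shedders
        1 if k < (6 if non else 4) else 0  # OTU 3: weak, noisy preference
    ])

selected, confirmed = repeated_holdout(presence, labels, "non-shedder")
for otu, (s, c) in enumerate(zip(selected, confirmed)):
    print(f"OTU {otu}: selected in {s} splits, confirmed in {c}")
```

In this toy table the OTU found only in non-shedders is selected and confirmed in every split, while the OTU present in every sample and the shedder-only OTU are never selected as non-shedder indicators. An OTU whose confirmed count falls well below its selected count would be one to treat with more caution.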

Hope this helps a bit!


Yep, options 1 and 2 are what I thought as well. And yes, it helps a lot! Thanks again, P.

