Shannon and Inv Simpson for several samples combined?

I am a complete novice when it comes to metagenomics, but I have been playing with Mothur for a few months to complete some analyses for my thesis. I am nearly done, but my advisor is asking for a different approach that I don’t know how to carry out.

Is there a way to calculate the Shannon Index and Inv Simpson for several samples combined?

I can only explain what I mean by explaining my project:

I have separate microbial communities from three wipes that were grown in two salt treatments. I extracted DNA from the cultures over six months and we are observing the succession of the communities through who is present and how the diversity changes over time. I have already calculated the diversity for each time point of each wipe of both salt treatments.

My advisor would instead like me to calculate the Shannon Index and Inv Simpson for the combined wipe samples at each time point in both salts. That is, no longer breaking it down by wipe sample, only by time point (week 1, week 2, week 3, …) for each salt treatment.

This is what my table currently looks like (separated by wipe samples)

Instead, I’d like to have just one “combined wipe sample” broken down by all five time points. (I planned to embed an image here, but it won’t let me add more than one.)

I hope this makes sense.

Signed - a desperate (and sleepy) grad student. :sleeping:

Realized I could add the second image here:

So like above, I would like the end product to have all wipes combined but broken down by time points.

My brain just doesn’t seem to understand how to combine the samples.


After doing some searching, it seems I am looking for the “merge.groups” command. Am I able to run alpha diversity calculations on this file following the merge?

Hi there, yup - merge.groups is the command you need to do what your advisor is asking. That said, unless there’s a specific reason for it, I wouldn’t recommend this type of analysis. It would be better to keep the three wipes separate so you can account for the variation among them. Also, if the three wipes have different numbers of sequences, the one with the most sequences will swamp the others in the merged sample. With the approach you had already taken, you can rarefy all of the samples to a common number of sequences and then report the mean/median and sd/IQR across wipes.
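To make the swamping concern concrete, here is a rough sketch in plain Python (not mothur, and not mothur's implementation). The OTU counts are made up for illustration, and the helper names (`shannon`, `inv_simpson`, `merge`, `rarefy`) are hypothetical. It uses the textbook forms of the indices; mothur's own calculators may use different estimators, so the numbers are only meant to show the pattern.

```python
import math
import random
import statistics
from collections import Counter


def shannon(counts):
    """Shannon index: -sum(p_i * ln p_i) over OTUs with non-zero counts."""
    total = sum(counts)
    return -sum((n / total) * math.log(n / total) for n in counts if n > 0)


def inv_simpson(counts):
    """Inverse Simpson index in its simple form: 1 / sum(p_i ** 2)."""
    total = sum(counts)
    return 1.0 / sum((n / total) ** 2 for n in counts if n > 0)


def merge(samples):
    """Pool samples by summing the counts of each OTU across samples."""
    return [sum(otu_counts) for otu_counts in zip(*samples)]


def rarefy(counts, depth, rng):
    """Draw `depth` reads at random without replacement; return per-OTU counts."""
    reads = [otu for otu, n in enumerate(counts) for _ in range(n)]
    picked = Counter(rng.sample(reads, depth))
    return [picked.get(otu, 0) for otu in range(len(counts))]


# Hypothetical OTU counts for three wipes at one time point (made-up numbers).
wipes = {
    "wipe1": [9000, 500, 300, 200],  # 10,000 reads, dominated by one OTU
    "wipe2": [250, 250, 250, 250],   # 1,000 reads, perfectly even
    "wipe3": [100, 400, 400, 100],   # 1,000 reads
}

# Option A: pool the wipes, then compute one value per time point.
pooled = merge(wipes.values())

# Option B: rarefy each wipe to the smallest library, then summarise.
depth = min(sum(c) for c in wipes.values())
rng = random.Random(42)  # fixed seed so the subsampling is repeatable
per_wipe = [inv_simpson(rarefy(c, depth, rng)) for c in wipes.values()]

print("pooled Shannon:", round(shannon(pooled), 2))
print("pooled inverse Simpson:", round(inv_simpson(pooled), 2))
print("per-wipe inverse Simpson (rarefied):", [round(v, 2) for v in per_wipe])
print("mean:", round(statistics.mean(per_wipe), 2),
      "sd:", round(statistics.stdev(per_wipe), 2))
```

With these toy counts, the pooled inverse Simpson comes out well below the mean of the per-wipe values, because the wipe with ten times more reads dominates the pooled counts. The per-wipe route also gives you a spread (sd or IQR) to report, which the pooled value cannot.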



Thanks for your reply! I agree that this may not be the best approach, but I should appease my advisor and let him make that call.

Thanks again!!
