mothur

Groups explaining grouping on NMDS plot?

Dear members of the mothur community,

I would like to have your advice concerning NMDS. I have tried to perform NMDS on a set of samples (2 environments, each made up of 4-5 zones, each comprising 1-3 “replicates”).

The grouping on NMDS plot is interesting (apart from two failed samples).

I produced it like that:

dist.shared(shared=current, calc=braycurtis)
nmds(phylip=pomm-16S.braycurtis.0.03.lt.dist)
in LibreOffice, plot pomm-16S.braycurtis.0.03.lt.nmds.axes

I would like to know which groups (first OTUs, and then species) contribute to those differences. I understand that there is some way to create a “biplot” showing both samples and groups (which are the variables that describe the groups) but I have not succeeded in getting this done following the guidelines in the older forum posts (including, among other, explanations from Kendra in 2016).

Yours,
Maxime

Hi there,

Try using the corr.axes function. The correlation with the first two axes would be the x and y coordinate positions for where the arrow head on a biplot would be positioned.

Pat

Hi Pat,
Thank you for the tip. In my case, there are 9086 OTUs (at 0.03), which means the “cloud” is quite dense. Are one OTU’s distance and/or p-values related to its “contribution” to the difference between samples?
As a consequence, should I sort them by distance or by p-value (and if so, of axis1 or of axis2)?
Or should I rather remove minor OTUs (based on the number of reads)? Or increase the threshold to reduce the number of OTUs?
Yours,
Maxime

First - I’d suggest only doing a biplot with PCoA data since the axes mean something whereas they don’t mean anything with NMDS. You’ll likely want to remove anything that isn’t significant on at least one of the axes. I’d then suggest only looking at the OTUs that have the longest vector for the two dimensions.

Thank you Pat! What do you mean “anything not significant on at least one of the axes”?

Anyway, I’m afraid it will take me some more time to investigate because pcoa() outputs rsq_1 (=0.34) < rsq_3 (=0.53) < rsq_2 (=0.56) whereas I would have expected rsq_1 > rsq_2 > rsq_3.

Other statistical tool could be a simper analyses between those two groups. In R, vegan does it, and will give you which OTUs contribute to the dissimilarity between groups. IMO, better than plotting the OTUs on top.

The Rsq values are the Rsq for using 1, 2, or 3 axes. So it should increase as you increase the number of axes

Hi Leocadio! Thank you for the suggestion. Right now I am too busy with my teaching to dive deep into R but I surely will investigate this option.

Pat, if I understand you well, I should get rsq_1 < rsq_2 < rsq_3 (and not the other way around as I wrote above). But I get rsq_1 < rsq_3 < rsq_2.

This is very briefly discussed in an older forum post (https://forum.mothur.org/t/pcoa-r-squared-error/1573) but in my case not so many distances are close to 1.

According to which criterion shall I select which OTUs (out of the 9083) to plot on the biplot? The length? The p-value (and if so, 1, 2 or 3)?

I’d use the p-value first and then the length

Pat

Thank you Pat, but which p-value? The ones with a low p-value for axis1 frequently have a high one for axis2 and/or 3, and vice-versa…

If either of them are significant then they would be a candidate for inclusion.

Pat