I’ve run through another analysis and I’m now looking at how to report the values output by metastats (especially the mean of counts).
In one of my significant comparisons, metastats reports
rows = 3526, # col = 15, g = 7
OTU mean(group1) variance(group1) stderr(group1) mean_of_counts(group1) mean(group2) variance(group2) stderr(group2) mean_of_counts(group1) p-value
1 152.960007 1016.466938 13.015804 359.000000 114.408821 1223.554955 11.659783 423.444444 0.000000
2 40.454650 159.847676 5.161519 88.000000 29.147686 174.973284 4.409249 113.666667 0.000001
3 2.173682 2.684413 0.668881 5.500000 6.416141 48.313402 2.316928 27.333333 0.000000
As you can see, for the first OTU the means of groups 1 and 2 are 153 and 114, while the means of counts are 359 and 423. For OTU 2 we get means of 40 and 29 and means of counts of 88 and 114.
I assume the significance is based on the difference in unique sequences in the analysis and the mean of counts is a report on the mean of total sequences. What can I say about these OTUs, given that the mean of counts shows the opposite trend to the mean(group) values?
So the first column, mean(group1), is a normalized mean, computed so that each group in the analysis has the same number of sequences. I think the second, mean_of_counts(group1), is the raw mean without normalization. The analysis is performed on the first column.
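To show how the two columns can point in opposite directions, here is a small Python sketch. The counts are invented, and the normalization (per-sample relative abundance rescaled to a common depth of 1000) is only my assumption about what "normalized" means here; it is not taken from the mothur source.

```python
# Hypothetical counts for a single OTU; none of these numbers come from the thread.
group1_otu = [50, 70]          # OTU counts in the two group 1 samples
group1_total = [1000, 1000]    # total sequences per group 1 sample
group2_otu = [90, 110]         # OTU counts in the two group 2 samples
group2_total = [3000, 3000]    # group 2 samples were sequenced more deeply


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def normalized_mean(otu_counts, sample_totals, scale=1000):
    """Mean of per-sample relative abundances, rescaled to a common depth.

    Assumed normalization for illustration only.
    """
    return mean([scale * c / t for c, t in zip(otu_counts, sample_totals)])


raw1 = mean(group1_otu)
raw2 = mean(group2_otu)
norm1 = normalized_mean(group1_otu, group1_total)
norm2 = normalized_mean(group2_otu, group2_total)

print("raw means:       ", raw1, raw2)
print("normalized means:", norm1, norm2)
```

With these made-up numbers the raw mean is higher in group 2, but once sequencing depth is taken into account the OTU is relatively more abundant in group 1, which is the same kind of reversal as in the output above.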
Hope this helps…
Hi there. I finally got around to making a design file and using the metastats command in mothur. I was blown away by all of the near-instant comparisons that used to take me hours to upload. However, the results of mothur’s implementation of metastats differ from the results I get by manually uploading a pairwise comparison to the metastats website. Here are the results of a given comparison computed by the metastats server (I should note that these are phylotype data):
Name mean(group1) mean(group2) pvalue qvalue
Otu028 0.00104441 0.004230912 0 0
Otu010 0.013421655 0.000458172 0.000511811 0.317814449
Otu001 0.021862553 0.00523187 0.353913386 1
Otu002 0.041109759 0.019840739 0.181850394 1
Otu003 0.036736316 0.003925142 0.064409449 1
Otu004 0.485380281 0.461616572 0.794818898 1
…= one phylotype with significant q-value
And here are the results of analyzing the same data using mothur’s implementation of metastats:
OTU mean(group1) mean(group2) p-value q-value
1 98.422661 23.467353 0 0
2 185.001016 89.417507 0 0
3 165.209313 17.812766 0 0
6 59.5982 103.927769 0 0
7 19.643869 95.92188 0 0
28 4.828263 19.148579 0 0
…= 33 phylotypes (out of ~200) with significant q-value!
The means are reported differently by each program, but as you can see from the p and q values the results from mothur are much less stringent/much more noisy. Do you have any explanation for this? I am 99% sure that it is not an error in the design file because the means reported in the mothur metastats outputs are correct for a given OTU in a given comparison, indicating that mothur analyzed the samples I told her to analyze. I am inclined to “believe” the results from the metastats server based on looking at the input data (a fair bit of variance per phylotype across samples).
My questions are, why the discrepancy, and how do I fix it? I appreciate you thinking about this.
Could you send your shared and design files to email@example.com so I can take a look?
Thanks for bringing this to our attention! The C code we use in mothur was provided by the original authors (White and Pop); however, it does seem that the C and R code give different p-values for low-frequency OTUs. The next version that we post will have metastats code that more faithfully corresponds to the original R code.
I’m so glad that you found the source of the problem, and that it’s not me or my data!
Thank you for addressing this problem, and as always I’m looking forward to the next release of mothur.