Whittaker's similarity coefficient

There seems to be something wrong with the calculation of Whittaker’s similarity coefficient.

My sample A has 126 observed OTUs.
Sample B has 133 observed OTUs.
They have 57 shared OTUs.

The reported value using summary.shared() is 9.26, but this doesn’t seem to fit with the formula:

C = 2 * Stotal / (Sa + Sb) - 1
C = 2 * (126 - 57 + 133 - 57) / (126 + 133) - 1 = 0.12

Hi,
I found that the code for this calculator may not be implemented correctly in mothur 1.19…
:wink: So I changed the whittaker.cpp file a little bit:
int countA = 0;
int countB = 0;
int countC = 0;
int sTotal = shared[0]->getNumBins();
for (int i = 0; i < sTotal; i++) {
    if (shared[0]->getAbundance(i) != 0) { countA++; }
    if (shared[1]->getAbundance(i) != 0) { countB++; }
    if (shared[0]->getAbundance(i) == 0 && shared[1]->getAbundance(i) == 0) { countC++; }
}

//data[0] = 2*sTotal/(float)(countA+countB)-1;
data[0] = 2*(sTotal-countC)/(float)(countA+countB)-1;
hopefully this code works well.

I’m afraid I don’t think that is correct…

What you added:

if(shared[0]->getAbundance(i)==0&&shared[1]->getAbundance(i)==0){countC++;}

will calculate the number of shared OTUs, but that is not necessary to calculate Whittaker (see the formula at http://www.mothur.org/wiki/Whittaker). You suggested altering our formula to:

data[0] = 2*(sTotal-countC)/(float)(countA+countB)-1;

Which is saying 2(S_total-S_shared) / (S_A + S_B) - 1, which is not correct.

To return to the original example…

S_shared = 57
S_A = 126
S_B = 133
S_total = 126 + 133 - 57 = 202

index = 2 * 202 / (126 + 133) - 1 = 0.56

Sorry for the late reply :oops: I only used the command once and never looked back…

I found this interesting and ran a simple test.
I made a fake shared file:
0.03 a 25 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
0.03 b 25 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 1 1 1 0 0 0 0 0 0
0.03 c 25 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1

In it, sample a has 15 OTUs, sample b has 7 OTUs, and they share 4 OTUs. Together, a, b and c have 25 OTUs.

Using the formula at the bottom of the Whittaker page, for a and b:
2*18/(15+7)-1 = 0.636364

Using the original mothur 1.19, the result is:
label comparison whittaker
0.03 a b 1.272727 <- I think this comes from 2*25/(15+7)-1 = 1.272727
0.03 a c 0.785714
0.03 b c 1.500000

With the modified mothur 1.19, the result is:

label comparison whittaker
0.03 a b 0.636364
0.03 a c 0.785714
0.03 b c 0.500000

I also noticed that the values for the a-c comparison are the same in both versions.

So I guess that
"int sTotal = shared[0]->getNumBins();"
reads the total number of OTUs across all samples in the file.

I added
“if(shared[0]->getAbundance(i)==0&&shared[1]->getAbundance(i)==0){countC++;}”
to count the OTUs that have no reads in either a or b.

Because the page did not show an example with more than 3 samples, I may have misunderstood the theory.
:oops: