Hi,
I found a large discrepancy between the jclass output from summary.shared or dist.shared and the trees from tree.shared. The distance matrix generated with dist.shared gives exactly 1-jclass for the respective comparison in the summary.shared output, as expected. My problem is that the tree built with tree.shared based on jclass corresponds to neither the distance matrix nor the summary.shared output. My expectation was that the samples with the smallest distance (highest similarity) in the distance matrix should cluster together in the tree. Instead, the two samples with the smallest Jaccard distance end up in different parts of the Jaccard tree. Am I misunderstanding what UPGMA does, or some more general part of the theory, or is this a bug?
Thank you for any comments,
Fabian
This is the tree:
((((69.6M:0.316176,64.6M:0.316176):0.00804999,(68.6M:0.287879,82.6M:0.287879):0.0363477):0.0209049,80.6M:0.345131):0.0352789,((((63.6M:0.298173,(59.6M:0.280303,56.6M:0.280303):0.0178697):0.0131899,74.6M:0.311363):0.0167578,81.6M:0.328121):0.0424201,(67.6M:0.357143,65.6M:0.357143):0.0133978):0.00986964):0.11959;
65.6M and 63.6M should be very close in the tree, because their distance of 0.560606 is the smallest in the matrix, but they are not.
This is the distance matrix:
12
56.6M
59.6M 0.575758
63.6M 0.675676 0.699115
64.6M 0.756757 0.704225 0.771186
65.6M 0.695238 0.695238 0.560606 0.750000
67.6M 0.705128 0.705128 0.787402 0.714286 0.736842
68.6M 0.689655 0.689655 0.771429 0.878788 0.797980 0.814286
69.6M 0.666667 0.651685 0.648438 0.703297 0.630252 0.704082 0.785714
74.6M 0.732143 0.796610 0.566176 0.833333 0.678571 0.835938 0.796117 0.687500
80.6M 0.631579 0.683544 0.669492 0.756098 0.663636 0.738636 0.656250 0.659794 0.722689
81.6M 0.663265 0.663265 0.573643 0.656250 0.619048 0.710280 0.771739 0.610619 0.623077 0.657143
82.6M 0.639344 0.639344 0.752294 0.693548 0.714286 0.636364 0.725490 0.702381 0.809091 0.632353 0.736842
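
To make my expectation concrete, here is a minimal sketch of plain UPGMA (average linkage) run on the first five rows of the matrix above. It is not mothur's code, and the function names parse_lower_triangle and upgma_merges are made up for this post. It only assumes that UPGMA always joins the pair of clusters with the smallest average distance and places the join at half that distance.

```python
# Hypothetical helper names; a plain-Python sketch of UPGMA, not mothur's implementation.

# First five rows of the lower-triangle matrix posted above.
SUBSET = """5
56.6M
59.6M 0.575758
63.6M 0.675676 0.699115
64.6M 0.756757 0.704225 0.771186
65.6M 0.695238 0.695238 0.560606 0.750000
"""


def parse_lower_triangle(text):
    """Parse a lower-triangle distance matrix into (names, pairwise distances)."""
    rows = [line.split() for line in text.strip().splitlines()]
    count = int(rows[0][0])
    names, dist = [], {}
    for row in rows[1:count + 1]:
        name, values = row[0], [float(v) for v in row[1:]]
        # The k-th value on a row is the distance to the k-th earlier sample.
        for other, d in zip(names, values):
            dist[frozenset((name, other))] = d
        names.append(name)
    return names, dist


def upgma_merges(names, dist):
    """Return the merges as (cluster_a, cluster_b, height), in the order they happen."""
    clusters = [(name,) for name in names]

    def average(a, b):
        # UPGMA distance: unweighted mean over all sample pairs across the two clusters.
        total = sum(dist[frozenset((x, y))] for x in a for y in b)
        return total / (len(a) * len(b))

    merges = []
    while len(clusters) > 1:
        candidates = [
            (average(a, b), a, b)
            for i, a in enumerate(clusters)
            for b in clusters[i + 1:]
        ]
        d, a, b = min(candidates)
        merges.append((a, b, d / 2))  # the join sits at half the cluster distance
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


names, dist = parse_lower_triangle(SUBSET)
merges = upgma_merges(names, dist)
for a, b, height in merges:
    print(a, b, round(height, 6))
```

On this subset the first join is 63.6M with 65.6M at height 0.280303, which is half of 0.560606. In the tree I posted, that same branch length of 0.280303 is attached to 59.6M and 56.6M instead.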