Hi. I am analyzing 8 samples: two sediments (S), two non-contaminated soils (P) and four contaminated soils (C). I built the design file as:
M1 C
M2 C
M8 P
M24 C
M28 C
M35 P
M40 S
M57 S

Then I ran the tree.shared command on my subsampled dataset (jclass calculator) and got a tree that shows a clear difference between the sediments and the soils and, among the soils, between the contaminated and non-contaminated ones.

Then, I tested the tree with the parsimony command and got this result:

Tree# Groups ParsScore ParsSig
1 C-P 1.000000 0.1680
1 C-S 1.000000 0.1610
1 P-S 1.000000 0.3580

How should these probabilities be read to decide whether the groups I defined are significantly different or not? From the tree I'd expected probabilities below 0.05, and now I'm confused and don't know how to interpret these results. Is there any explanation that could help me? Thank you!

The experts may provide more info but you have quite limited power with such small sample sizes for each group, so it’s not surprising. You’d need pretty profound and consistent differences between groups to get a P<0.05 with such a small study population.


Part of the problem is that there are only so many ways to randomly label a tree with 8 branches and 3 groupings. As you are finding, even with perfect separation of the groups, you cannot get a p-value less than 0.05. There aren't 20 ways to do the random labeling. These are the breaks with non-parametric methods. As Scott mentioned, you need more replicates and/or fewer treatments.
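To make the counting argument concrete, here is a rough sketch in Python. It assumes each pairwise comparison is tested by shuffling the two group labels over the tips involved, which is a simplification for illustration and not necessarily how mothur builds its null distribution internally. The group sizes come from the design file above; the function names are made up for this example. Under that assumption, C-P and C-S each have 15 possible labelings and P-S has 6, so the smallest p-value such a test could return is 1 divided by that count.

```python
from math import comb

# Group sizes from the design file in the original post.
group_sizes = {"C": 4, "P": 2, "S": 2}


def n_labelings(a: int, b: int) -> int:
    """Distinct ways to assign two group labels to a + b tree tips."""
    return comb(a + b, a)


def min_p_value(a: int, b: int) -> float:
    """Lower bound on an exact label-permutation p-value.

    At best only one labeling is as extreme as the observed one,
    so the p-value cannot fall below 1 / (number of labelings).
    """
    return 1 / n_labelings(a, b)


for g1, g2 in [("C", "P"), ("C", "S"), ("P", "S")]:
    a, b = group_sizes[g1], group_sizes[g2]
    print(f"{g1}-{g2}: {n_labelings(a, b)} labelings, "
          f"min p = {min_p_value(a, b):.3f}")
```

With these group sizes none of the three lower bounds drops below 0.05, which is why perfect separation on the tree still cannot give a significant result here.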

Thank you very much Pat and Scott! When I use the branch length with unifrac it gets better, and now with your explanations I understand better what is happening. Excuse my silly questions, but statistics is not my strong suit.