precision and clustering

I’m working with someone who is attempting to use pyrotags to look at microOTUs (sub 3% difference). He started playing with the precision option in clustering and has found significantly different fine clusters when he increases the precision to 1000. It seems to me that you couldn’t calculate a precision of 1000 for sequences that are only 200 bp long, so I don’t understand how that would influence the clustering. I’ve read your 2011 paper and don’t know how useful this exploration is, but he’s already done it. I’m just trying to understand why he’s getting clusters he “likes” with precision=1000 but not with the default.

This is probably so far into the weeds that I’m not sure why anyone would really care. At some point between DOTUR and now, we changed the way we deal with the cutoff. Previously we would round the distance to the level of precision. Now we don’t round because we got sick of explaining why we round. Anyway, a 1-base difference over 250 bases is a distance of 0.004. If your precision is 100, then that gets truncated to 0.00. If your precision is 1000, then it stays 0.004. Voilà, the difference. As for why someone would “like” one over another, I’m not sure. Part of the concept behind OTUs is that it removes this type of subjectivity that plagues traditional taxonomy definitions (see Bacillus thuringiensis, anthracis and cereus).
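To make the arithmetic concrete, here is a minimal Python sketch of truncating a distance to a given precision. It is only an illustration of the numbers above, not mothur's actual code: the function name `truncated_distance` and its arguments are hypothetical, and it assumes "truncated" means dropping everything past the precision level.

```python
def truncated_distance(mismatches, length, precision):
    """Distance mismatches/length, truncated (not rounded) to 1/precision.

    Hypothetical helper for illustration; integer floor division is used
    so the truncation does not depend on floating-point rounding.
    """
    # Number of whole 1/precision steps that fit in the distance.
    steps = (mismatches * precision) // length
    return steps / precision


# 1 mismatch over 250 bases: raw distance is 0.004.
print(truncated_distance(1, 250, 100))   # 0.0
print(truncated_distance(1, 250, 1000))  # 0.004

# 1 mismatch over 200 bases: raw distance is 0.005.
print(truncated_distance(1, 200, 100))   # 0.0
print(truncated_distance(1, 200, 1000))  # 0.005
```

Under these assumptions, at precision 100 any pair of 200-250 bp reads that differ by a single base looks identical (distance 0.00), while at precision 1000 the same pair keeps a non-zero distance, which would be enough to change the fine-scale clusters.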

Hope that’s helpful…

Yes, thanks. I think I can discuss the results a bit better now.

I agree that “liking” a particular result doesn’t make it true, but that’s for another, more philosophical discussion, which would probably require beer.