Problem with hcluster?

cfriedline · December 4, 2009, 2:19pm

Hello,

I ran hcluster on our large distance matrix (85699 seconds!), but see interesting cutoff values when things start to get far apart. We’re not sure if these are meaningful for our analysis yet, but wanted to bring it to your attention. When I read the list file created by the cluster (with associated groups file) using

read.otu(list=..., groups=...)

I get the following output:

...
0.90
0.91
0.92
0.93
0.94
0.95
0.96
0.97
0.98
0.99
1.00
1.01
0.00
1.01
1.02
1.03
0.00
1.03
1.04
0.00
...
2.55
2.56
0.00
2.56
0.00
2.56
0.00
2.56
...

Is this an error in the code, or should I expect multiple zeros like this? It continues in this way until all distances are found. Also, what about the duplicates at 2.56? Is this because I’m using reads that are unique across all samples?
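To pin down exactly where the labels misbehave, a short script can help. The following is a minimal sketch, not part of mothur: the helper names `read_labels` and `find_label_anomalies` are hypothetical, and it assumes the cutoff label is the first tab-delimited field on each line of the list file. It flags every label that falls below the largest value seen so far or that repeats it.

```python
def read_labels(path):
    """Collect the first tab-delimited field of each non-empty line.

    Assumption: the cutoff label is the first field of each list-file line.
    """
    with open(path) as handle:
        return [line.split("\t", 1)[0].strip() for line in handle if line.strip()]


def find_label_anomalies(labels):
    """Return (index, label, reason) for labels that drop below or repeat the running maximum."""
    anomalies = []
    highest = None
    for index, text in enumerate(labels):
        try:
            value = float(text)
        except ValueError:
            # Skip labels that are not numbers (for example a text label).
            continue
        if highest is not None:
            if value < highest:
                anomalies.append((index, text, "decreases"))
            elif value == highest:
                anomalies.append((index, text, "repeats"))
        if highest is None or value > highest:
            highest = value
    return anomalies


# A few of the labels from the output above, used as sample input.
sample = ["1.00", "1.01", "0.00", "1.01", "1.02", "1.03", "0.00", "1.03", "1.04", "0.00"]
for index, label, reason in find_label_anomalies(sample):
    print(index, label, reason)
```

Running `find_label_anomalies(read_labels(...))` on the real list file would give positions that could be sent along with a bug report.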

Thanks,
Chris

pschloss · December 4, 2009, 5:18pm

Actually, my bigger concern is how you got distances >0.30… I suspect you have a number of sequences that don’t overlap and give large distances. To check this, you should run summary.seqs on the alignment file that you ran dist.seqs on. You’ll probably find a number of sequences that end before most sequences start and vice versa.
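As a rough illustration of what "don't overlap" means here, the check can be sketched in Python. This is not how summary.seqs or dist.seqs work internally; the function names are hypothetical, the toy sequences are made up, and it assumes `-` and `.` are the gap characters in the alignment. It finds the first and last non-gap column of each aligned sequence and lists pairs whose spans never intersect.

```python
def aligned_span(seq, gap_chars="-."):
    """Return (first, last) column holding a base, or None if the row is all gaps."""
    positions = [i for i, ch in enumerate(seq) if ch not in gap_chars]
    if not positions:
        return None
    return positions[0], positions[-1]


def non_overlapping_pairs(alignment):
    """List name pairs whose aligned spans share no column.

    `alignment` maps sequence name -> aligned sequence string (all the same length).
    The comparison is all-against-all, so this is only practical for small inputs.
    """
    spans = {name: aligned_span(seq) for name, seq in alignment.items()}
    names = sorted(spans)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            a, b = spans[first], spans[second]
            # No overlap if either row is empty or one ends before the other starts.
            if a is None or b is None or a[1] < b[0] or b[1] < a[0]:
                pairs.append((first, second))
    return pairs


# Made-up toy alignment: seqA ends before seqB starts.
toy = {
    "seqA": "ACGT------",
    "seqB": "------ACGT",
    "seqC": "--GTACGT--",
}
print(non_overlapping_pairs(toy))
```

Pairs reported this way are the ones for which a distance calculation has no shared columns to compare.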

Regardless, the bug is weird. Is there any way that you could either post the distance matrix online or email it to us? Alternatively, you could email us (mothur.bugs@gmail.com) the sequences and the list of commands you ran to get to the hcluster step. This is a new command, so it’s entirely possible that the kinks aren’t all ironed out yet.

Pat

cfriedline · December 4, 2009, 8:28pm

Distances larger than 0.3 are a result of using my own matrix of paralinear distances (I posted something similar to the commands board) instead of a %-based distance metric like those from dist.seqs. I’ll check about sending the matrix; it’s pretty big (>5G).
