Regarding cluster command

I'm using 1.46.1 and running the cluster command, which gets killed. My dist file is around 650 GB.

cluster(column=final.dist, count=final.count_table, cutoff=0.03


Unfortunately it is not useful to only show this incomplete line of code (by the way you are missing the last parenthesis).

For example, what does the logfile say? What is the situation before the command?

You need to look at the logfile and think “does that make sense”?

If your distance matrix is 650 GB, it is unlikely to go through cluster; it will simply use way too much RAM. You might want to check out this blog post…
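To get a feel for the scale, here is a rough back-of-the-envelope sketch in Python. The numbers are assumptions for illustration only: `bytes_per_line=30` is a guess at the average length of one line in a column-format distance file, and `bytes_per_pair=12` is a hypothetical in-memory cost (two 4-byte indices plus a 4-byte distance), not what cluster actually allocates internally.

```python
def estimate_pairs(file_size_bytes, bytes_per_line=30):
    """Rough count of pairwise distances in a column-format file.

    bytes_per_line is an assumed average line length, not a measured value.
    """
    return file_size_bytes // bytes_per_line


def estimate_ram_gb(n_pairs, bytes_per_pair=12):
    """Rough memory needed to hold all pairs, in GB.

    bytes_per_pair is a hypothetical cost per stored distance.
    """
    return n_pairs * bytes_per_pair / 1e9


if __name__ == "__main__":
    dist_size = 650 * 10**9  # 650 GB distance file, as reported above
    pairs = estimate_pairs(dist_size)
    print(f"approx. pairs: {pairs:,}")
    print(f"approx. RAM:   {estimate_ram_gb(pairs):.0f} GB")
```

Even under these optimistic assumptions the estimate lands in the hundreds of GB of RAM, so the first thing to look at is why the distance file is that large.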


I did not spot that, indeed 650 GB is an impossibly fat distance matrix…