problem with cluster.split...?

Hi! I am running mothur 1.33.3 on Mac OS 10.8 with 12 processors and 24 GB of RAM. My distance matrix is large (37 Gb), so I am running cluster.split with cluster=F. The command appears to run without error, but the output files are .temp files, and they are not empty. There are no .list or .files files, which is what I was expecting. I would like to know either 1) how to use the .temp files, or 2) if there is an error, how to overcome it. My first thought was that the command was failing because of the size of my dataset, but I successfully ran cluster() on a different 38 Gb distance matrix on my PC using only 3 processors (it took over a day), so it seems to me that the problem is specific to either the Mac or the cluster.split command.

Thanks for thinking about it!


A distance matrix that large likely won’t fit in RAM for clustering, but with a cutoff and splitting you may be able to process it. Can you tell me what version you are using, and on which OS? If you set cluster=f, cluster.split won’t create a *.list file, because it won’t cluster. It will only split the distance matrix and create a set of *.number.tempdist files.
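To check which kinds of output files a run actually left behind, a quick inventory by extension can help. The following is only a minimal sketch in Python, not part of mothur: the function names `group_by_suffix` and `summarize` are made up for illustration, and the script assumes the output files sit in a single directory.

```python
from collections import defaultdict
from pathlib import Path


def group_by_suffix(filenames):
    """Group file names by their final extension (e.g. '.temp', '.list')."""
    groups = defaultdict(list)
    for name in filenames:
        # Path.suffix returns only the last extension, so 'x.0.temp' -> '.temp'
        groups[Path(name).suffix].append(name)
    return dict(groups)


def summarize(directory="."):
    """Print how many files of each extension exist in the given directory."""
    names = sorted(p.name for p in Path(directory).iterdir() if p.is_file())
    for suffix, members in sorted(group_by_suffix(names).items()):
        print(f"{suffix or '(no extension)'}: {len(members)} file(s)")
```

Calling `summarize("path/to/output")` (a placeholder path) would show at a glance whether any .list files were written alongside the temporary ones.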

To answer your question regarding what I am operating,

Also, I am confused about an aspect of your answer:

This confuses me because on the command information page for cluster.split, the instructions say, “If you set cluster=F, mothur will generate a file containing a list of the split files.” So perhaps this page needs to be revised to more accurately reflect the output of cluster.split when cluster=F, and how to proceed using the output that mothur does provide?

Thank you for your help!