I have been using mgcluster on a Mac OS X machine with 8 processors and 8 GB of RAM. I've been running the following mgcluster command:
mgcluster(blast=HCM_virome_pooled_vs_itself.blastp, name=HCM_virome_pooled.names, method=furthest, cutoff=0.30)
The blastp input file is about 1.4 GB. The run has been going for about four weeks now, and rather than speeding up over time, it seems to be slowing down (it has clustered down to a cutoff of 0.20 so far). Since this is taking forever, I'm wondering whether there may be an issue with the software, or whether this is simply the nature of the beast. If it's the latter, would it be possible to implement a way to run mgcluster on multiple processors in parallel? I'm also wondering whether memory might be limiting the process. I've heard it may be possible to write the distance matrix to a file rather than holding it in memory; if so, is there a way to do that in mgcluster?
I'm just starting to use mgcluster, and it seems like it could have great applications for what I'm interested in doing, but I'm trying to figure out my options for dealing with the speed issue.