cluster.split running out of RAM

We have been struggling to complete the cluster.split command in our workflow. The command we are using is:

```
cluster.split(fasta=FieldTwo.trim.contigs.good.unique.good.filter.precluster.unique.pick.pick.fasta,
  count=FieldTwo.trim.contigs.good.unique.good.filter.precluster.unique.uchime.pick.pick.count_table,
  taxonomy=FieldTwo.trim.contigs.good.unique.good.filter.precluster.unique.pick.pds.wang.pick.taxonomy,
  splitmethod=classify, taxlevel=5, cutoff=0.15, large=T, processors=2)
```

We increased the RAM to 128GB and set large=T. The job now runs longer (6 days rather than 3) before running out of RAM, but it ultimately still does. Any advice on how we might reduce the RAM requirements for this command? I’m not sure what statistics about our input you would need to quantify the RAM requirement, but I’m happy to send anything you need.

thanks,
charlie

First - do not use large=T. :slight_smile:

Second, these things typically happen with MiSeq data when you didn’t sequence a region where the reads fully overlap, so the data can’t be fully denoised. If that’s the case, you might just be stuck with running phylotype.
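For reference, the phylotype route could look something like this (a sketch reusing the taxonomy and count_table filenames from the command above; `label=1` is the genus level, and `current` just tells mothur to reuse the files from the previous step — check the wiki pages for phylotype, make.shared, and classify.otu for the exact options):

```
phylotype(taxonomy=FieldTwo.trim.contigs.good.unique.good.filter.precluster.unique.pick.pds.wang.pick.taxonomy,
  count=FieldTwo.trim.contigs.good.unique.good.filter.precluster.unique.uchime.pick.pick.count_table)
make.shared(list=current, count=current, label=1)
classify.otu(list=current, count=current, taxonomy=current, label=1)
```

Because phylotype bins sequences by their classification rather than by pairwise distances, it avoids building the distance matrix that is exhausting your RAM in cluster.split.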

Pat