cluster.split running out of RAM

We have been struggling to complete the cluster.split command in our workflow. The command we are using is:

```
cluster.split(fasta=FieldTwo.trim.contigs.good.unique.good.filter.precluster.unique.pick.pick.fasta,
  count=FieldTwo.trim.contigs.good.unique.good.filter.precluster.unique.uchime.pick.pick.count_table,
  taxonomy=FieldTwo.trim.contigs.good.unique.good.filter.precluster.unique.pick.pds.wang.pick.taxonomy,
  splitmethod=classify, taxlevel=5, cutoff=0.15, large=T, processors=2)
```

We increased the RAM to 128GB and set large=T. The job now runs longer (6 days rather than 3) before running out of RAM, but it ultimately still does. Any advice on how we might reduce the RAM requirements for this command? I’m not sure what statistics about our input you would need to quantify the RAM requirement, but I’m happy to send anything you need.

thanks,
charlie

First - do not use large=T. :slight_smile:

Second, these things typically happen with MiSeq data when you didn’t sequence a region where the reads fully overlap, so the data can’t be fully denoised. If that’s the case, you might just be stuck with running phylotype.
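For reference, the phylotype route could look something like this (a sketch reusing the taxonomy and count_table filenames from the command above; `label=1` is the genus level, and `current` just tells mothur to reuse the files from the previous step — check the wiki pages for phylotype, make.shared, and classify.otu for the exact options):

```
phylotype(taxonomy=FieldTwo.trim.contigs.good.unique.good.filter.precluster.unique.pick.pds.wang.pick.taxonomy,
  count=FieldTwo.trim.contigs.good.unique.good.filter.precluster.unique.uchime.pick.pick.count_table)
make.shared(list=current, count=current, label=1)
classify.otu(list=current, count=current, taxonomy=current, label=1)
```

Because phylotype bins sequences by their classification rather than by pairwise distances, it avoids building the distance matrix that is exhausting your RAM in cluster.split.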

Pat