We have been struggling to complete the cluster.split command in our workflow. The command we are using is:
cluster.split(fasta=FieldTwo.trim.contigs.good.unique.good.filter.precluster.unique.pick.pick.fasta,
count=FieldTwo.trim.contigs.good.unique.good.filter.precluster.unique.uchime.pick.pick.count_table, taxonomy=FieldTwo.trim.contigs.good.unique.good.filter.precluster.unique.pick.pds.wang.pick.taxonomy,
splitmethod=classify, taxlevel=5, cutoff=0.15, large=T, processors=2)
We have increased the RAM to 128 GB and set large=T. The command now runs longer (6 days rather than 3) before running out of RAM, but it still does eventually. Any advice on what we might do to reduce the RAM requirements for this command? I’m not sure what statistics you would need about our input to quantify the RAM requirement, but I’m happy to send anything you need to do that.
Thanks,
Charlie