cluster.split with opticlust

I’m trying to recluster a big dataset with opticlust and have a couple of questions.

cluster.split(fasta=current, count=current, taxonomy=current, splitmethod=classify, taxlevel=4, cutoff=0.15, processors=16, cluster=f)

cluster.split(file=current, processors=4)

Dist size and RAM: opticlust has the same restrictions, correct? Does the entire dist file have to be read into RAM? If I’m using 4 processors, do I need 4x as much RAM as my largest dist file, or 1x?

I have a 212 GB trim.contigs.good.unique.good.filter.precluster.pick.pick.dist
and many <60 GB trim.contigs.good.unique.good.filter.precluster.pick.pick.fasta.4.dist
Is cluster.split going to do anything with the larger, non-numbered .dist file?

With the second cluster.split command I get the output below. Why does it say I didn’t set a cutoff, and why is it using 0.03?

Clustering bb16.oc.trim.contigs.good.unique.good.filter.precluster.pick.pick.fasta.9.dist

tp tn fp fn sensitivity specificity ppv npv fdr accuracy mcc f1score
531415 374828924 162459 391692 0.575681 0.999567 0.765867 0.998956 0.234133 0.998526 0.663297 0.657293


You did not set a cutoff, using 0.03.

There might be a small bug with using the file option in cluster.split. By default we use a cutoff of 0.03. When you run the first step (or dist.seqs), you can set cutoff=0.03 instead of 0.15. Distances larger than the cutoff are not saved, so this will save a ton of disk space and RAM.
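
For example, the two steps with the lower cutoff might look something like this. It is only a sketch: it reuses the parameters from the commands in the question and changes nothing but the cutoff value in the first one, and it has not been run on this dataset.

```
cluster.split(fasta=current, count=current, taxonomy=current, splitmethod=classify, taxlevel=4, cutoff=0.03, processors=16, cluster=f)

cluster.split(file=current, processors=4)
```

The second command is unchanged. If the file option does ignore the cutoff as suspected, it would still fall back to the 0.03 default, which then matches the cutoff used to calculate the distances.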

Pat