cluster.split with opticlust

Kendra · February 1, 2017, 7:24pm

I’m trying to recluster a big dataset with opticlust. Couple of questions.

cluster.split(fasta=current, count=current, taxonomy=current, splitmethod=classify, taxlevel=4, cutoff=0.15, processors=16, cluster=f)

cluster.split(file=current, processors=4)

Dist size and RAM: opticlust has the same restrictions, correct? Does the entire dist have to be read into RAM? If I’m using 4 processors, do I need 4x as much RAM as my largest dist, or 1x?

I have a 212 GB trim.contigs.good.unique.good.filter.precluster.pick.pick.dist
and many <60 GB trim.contigs.good.unique.good.filter.precluster.pick.pick.fasta.4.dist files.
Is cluster.split going to do anything with the larger, non-numbered .dist?

The second cluster.split command gives me the output below. Why does it say I didn’t set a cutoff, and why is it using 0.03?

Clustering bb16.oc.trim.contigs.good.unique.good.filter.precluster.pick.pick.fasta.9.dist

tp tn fp fn sensitivity specificity ppv npv fdr accuracy mcc f1score
531415 374828924 162459 391692 0.575681 0.999567 0.765867 0.998956 0.234133 0.998526 0.663297 0.657293


You did not set a cutoff, using 0.03.
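
For anyone reading the log line above, the reported metrics can be reproduced from the four counts. This is a minimal Python sketch that assumes mothur uses the standard confusion-matrix definitions, which the thread does not confirm; the function name `confusion_metrics` is made up for illustration.

```python
import math

def confusion_metrics(tp, tn, fp, fn):
    """Standard confusion-matrix metrics (assumed, not confirmed, to match mothur's)."""
    ppv = tp / (tp + fp)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": ppv,
        "npv": tn / (tn + fn),
        "fdr": 1.0 - ppv,
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        # Matthews correlation coefficient
        "mcc": (tp * tn - fp * fn)
        / math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
        "f1score": 2 * tp / (2 * tp + fp + fn),
    }

# Counts copied from the log line in the post above.
m = confusion_metrics(tp=531415, tn=374828924, fp=162459, fn=391692)
for name, value in m.items():
    print(f"{name}\t{value:.6f}")
```

Under that assumption the counts give ppv and npv as two separate values, about 0.7659 and 0.9990.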

pschloss · February 1, 2017, 9:05pm

There might be a small bug with using the file option in cluster.split. By default we use a cutoff of 0.03. When you run the first step (or dist.seqs), you can set the cutoff to 0.03 there as well. This will save a ton of disk space and RAM.

Pat
