Run today on v1.42.3.
pre.cluster(fasta=iu_choneArchuniquealigngoodfilter.unique.good.fasta, count=iu_choneArchuniquealigngoodfilter.good.count_table, diffs=4, processors=6)
Partial output from the logfile:
Processing group HF4dpn:
LF2bkpn 32352 -46753 79105
Total number of sequences before pre.cluster was 32352.
pre.cluster removed 79105 sequences.
It took 14 secs to cluster 32352 sequences.
Processing group LF3bkm:
BH1bkpn 22640 -28136 50776
Total number of sequences before pre.cluster was 22640.
pre.cluster removed 50776 sequences.
It took 9 secs to cluster 22640 sequences.
Processing group BH1f:
LF3bkm 5379 -3117 8496
Total number of sequences before pre.cluster was 5379.
pre.cluster removed 8496 sequences.
It took 1 secs to cluster 5379 sequences.
It seems that the logfile is just swapping the ‘numbers before pre.cluster’ and ‘numbers removed’ values, which makes more sense than removing more sequences than there are to remove. In the first block, for example, 79105 - 32352 = 46753, which matches the middle column apart from its sign. Also, the “Processing group X” line often doesn’t match the group named on the stats line that follows, which I assume is just due to interleaved output from the 6 processors. Again, the process itself seems to work fine.
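For anyone who wants to check the same thing in their own logfile, here is a minimal Python sketch of the arithmetic. It assumes each stats line is "group, reported-before, difference, reported-removed"; that column reading is my guess from the numbers above, not something taken from the mothur documentation, and `check_swap` is just a hypothetical helper name.

```python
# Stats lines copied from the logfile excerpt above.
LOG_LINES = [
    "LF2bkpn 32352 -46753 79105",
    "BH1bkpn 22640 -28136 50776",
    "LF3bkm 5379 -3117 8496",
]


def check_swap(line):
    """Test the 'swapped columns' hypothesis on one stats line.

    Assumed layout (a guess, not confirmed by mothur's docs):
    group, reported 'before' count, difference, reported 'removed' count.
    """
    group, first, middle, last = line.split()
    reported_before = int(first)
    difference = int(middle)
    reported_removed = int(last)

    # Does the middle column equal before - removed exactly as printed?
    consistent_as_printed = (reported_before - reported_removed) == difference

    # If the two counts are swapped, this many sequences would remain.
    remaining_if_swapped = reported_removed - reported_before

    return group, consistent_as_printed, remaining_if_swapped


for line in LOG_LINES:
    group, consistent, remaining = check_swap(line)
    print(group, consistent, remaining)
```

If the hypothesis holds, every line reports `True` and the last value is a positive count of remaining sequences, i.e. the negative middle column with its sign flipped.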