I first tried to run cluster.split, and when it crashed I tried dist.seqs + cluster on 192 files (the dist.seqs output file is 1.7 TB... is that normal?), which also crashed.
I noticed that during the process all of the RAM was used (32 GB), and also the swap (8 GB, on a 256 GB SSD). Do you think I should reinstall my system with more swap? Or is the 1.7 TB file not normal?
I tried increasing swap with an SSD and it still didn't work. You'll need to use cluster.split, because to cluster a 1.7 TB file you'd need 1.7 TB of RAM. You can look at your temp .dist files to see if it's going to work: the size of your biggest .dist file multiplied by the number of processors needs to be smaller than your RAM. With 32 GB of RAM, you'll probably want to use only one processor for the cluster.split command.
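
If it helps, here is a rough sketch of that check in Python. It only applies the rule of thumb above (biggest .dist size times processors must stay under RAM), so treat the result as an estimate rather than a guarantee. The function names, the `*.dist` glob pattern, the current-directory search, and the hard-coded 32 GB figure are my own assumptions for illustration; point it at wherever your temp .dist files actually are.

```python
import glob
import os


def largest_dist_bytes(directory, pattern="*.dist"):
    """Return the size in bytes of the largest file matching pattern (0 if none)."""
    paths = glob.glob(os.path.join(directory, pattern))
    return max((os.path.getsize(p) for p in paths), default=0)


def max_processors(largest_bytes, ram_bytes):
    """Largest n such that largest_bytes * n is strictly smaller than ram_bytes."""
    if largest_bytes <= 0:
        raise ValueError("largest_bytes must be positive")
    return (ram_bytes - 1) // largest_bytes


if __name__ == "__main__":
    ram = 32 * 10**9  # assumption: 32 GB of RAM, as in this thread
    biggest = largest_dist_bytes(".")  # assumption: temp .dist files are in the current directory
    if biggest == 0:
        print("No .dist files found here.")
    else:
        print("Biggest .dist file:", biggest, "bytes")
        print("Processors that fit by the rule of thumb:", max_processors(biggest, ram))
```

A result of 0 means even one processor would not fit by this rule, so the split would need to produce smaller pieces first.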