cluster.split

Hi there,
I tried to run cluster.split on my MiSeq data, but was unable to complete it because the temporary files it produced took up approximately 21 TB. I just had a look at how many sequences I was trying to cluster: 929,618 unique out of 6,416,820 total, which is down from roughly 26 million raw reads at the beginning of my analysis. Someone must have encountered this problem before. Any insight as to what I can do?

Thanks in advance,
Jessica

I think this will explain what’s going wrong and suggest some paths forward…

http://blog.mothur.org/2014/09/11/Why-such-a-large-distance-matrix%3F/

Pat
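
For context, a rough back-of-envelope sketch of why that many unique sequences can plausibly fill ~21 TB of distance files before clustering finishes. The bytes-per-line figure below is an assumption (a typical "nameA nameB distance" row in a column-format .dist file), not a measured value, and cluster.split does partition the matrix by taxonomy, so the actual total depends on how evenly the sequences split:

    # Rough estimate of distance-file size for ~930,000 unique sequences.
    # Assumption: ~50 bytes per "nameA nameB dist" line in a column-format file.

    n_unique = 929_618                          # unique sequences reported above
    n_pairs = n_unique * (n_unique - 1) // 2    # all pairwise distances (worst case, no split)
    bytes_per_pair = 50                         # assumed average line length in bytes

    total_bytes = n_pairs * bytes_per_pair
    print(f"{n_pairs:,} pairs -> ~{total_bytes / 1e12:.1f} TB")
    # ~432 billion pairs -> ~21.6 TB, in line with the ~21 TB of temporary files observed

Since the file size grows roughly with the square of the number of unique sequences, reducing the number of uniques upstream (better read overlap, stricter screening, pre-clustering) is the main lever, which is what the linked post discusses.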