For 16S distance calculation, I am using:
dist.seqs( fasta = file.fasta, cutoff = 0.03, processors=8 )
It turns out that my file.fasta has about 240K sequences (the file size is about 3 GB).
So far, the computation has been running for more than 10 hours on our cluster.
I am worried that it may not finish even in 2 days.
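For a sense of scale, here is a rough back-of-the-envelope sketch in Python of how many pairwise distances are involved. It assumes every pair of sequences is compared exactly once, i.e. n(n-1)/2 comparisons; I do not know how dist.seqs schedules the work internally, so this only estimates the workload, not the runtime. The function name is just mine for illustration.

```python
def pairwise_comparisons(n_sequences: int) -> int:
    """Number of unordered pairs among n sequences: n * (n - 1) / 2."""
    return n_sequences * (n_sequences - 1) // 2


n = 240_000  # approximate sequence count in file.fasta
pairs = pairwise_comparisons(n)
print(f"{n:,} sequences -> {pairs:,} pairwise distances")
```

That works out to roughly 2.9 x 10^10 pairs, which is why I doubt the run will finish soon.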
Is there a way to cope with such a large number of sequences?
Your timely input will be a great help. Thanks in advance.