Why does this require the dist file? It seems that the representative sequences could be pulled from the fasta file using just the list and names files. I have a giant dist file (yes, I followed the SOP; I just have a huge, diverse dataset) that I can only work with on our server. I want to create 3, 5, and 10% OTU databases for BLASTing, but my other computer can't read the distance matrix. What am I missing?

It needs the distance file to figure out which sequence in each OTU has the minimum distance to all the other sequences in that OTU.
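To illustrate why the pairwise distances are needed, here is a minimal Python sketch of the idea. It is not mothur's actual code: the function name `pick_representative`, the toy distances, and the choice of "smallest summed distance" as the meaning of "minimum distance" are all assumptions made for illustration.

```python
def pick_representative(otu, dist):
    """Return the member of `otu` with the smallest summed distance to the others.

    `dist` maps an unordered pair of sequence names (a frozenset) to a distance.
    Using the sum as the criterion is an assumption for this sketch.
    """
    def total(seq):
        return sum(dist[frozenset((seq, other))] for other in otu if other != seq)

    return min(otu, key=total)


# Toy OTU with three sequences and made-up pairwise distances.
otu = ["seqA", "seqB", "seqC"]
dist = {
    frozenset(("seqA", "seqB")): 0.01,
    frozenset(("seqA", "seqC")): 0.02,
    frozenset(("seqB", "seqC")): 0.03,
}

rep = pick_representative(otu, dist)
print(rep)
```

Without the distances, nothing in the list or names files says which member sits closest to the rest of the OTU.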

Ah, I see. Is there any way to just get the sequence that represents the most (numerically) uniques in an OTU?
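One possible approach, as a rough sketch only: count how many reads each unique sequence stands for in the names file, then pick the highest count within each OTU of the list file. The helper names below are hypothetical, and the sketch assumes the usual tab-separated layouts (names file: unique name, then a comma-separated list of the reads it represents; list file: label, number of OTUs, then one comma-separated column per OTU). Check these against your own files, since some list files also carry a header row.

```python
def count_duplicates(names_lines):
    """Map each unique sequence name to the number of reads it represents."""
    counts = {}
    for line in names_lines:
        if not line.strip():
            continue  # skip blank lines
        unique, members = line.split()[:2]
        counts[unique] = len(members.split(","))
    return counts


def most_abundant_per_otu(list_line, counts):
    """For one list-file line, return the most abundant unique in each OTU."""
    otus = list_line.split()[2:]  # skip the label and the OTU count
    reps = []
    for otu in otus:
        members = otu.split(",")
        # Prefer names that are uniques in the names file (assumed to be
        # the ones present in the fasta); fall back to all members.
        candidates = [name for name in members if name in counts] or members
        reps.append(max(candidates, key=lambda name: counts.get(name, 1)))
    return reps


# Made-up example data in the assumed formats.
names_lines = ["seqA\tseqA,seqD,seqE", "seqB\tseqB", "seqC\tseqC,seqF"]
list_line = "0.03\t2\tseqA,seqB\tseqC"

reps = most_abundant_per_otu(list_line, count_duplicates(names_lines))
print(reps)
```

The resulting names could then be used to pull the matching records out of the fasta file. Ties go to whichever name appears first in the OTU.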