How big should the *.unique.filter.phylip.fn.list file be?

I have a 26 GB *unique.filter.phylip.dist file and ran the cluster command to produce the *.unique.filter.phylip.fn.list file. It has been running for 5 hours now and shows this:

********************###########
Reading matrix: |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||


and I am still waiting. How big will the .list file be, and will it take much longer?

It looks like you are using a phylip-formatted distance matrix with furthest neighbor (why not average?). Are you using cluster.classic, cluster, or cluster.split?

If you have to run a phylip matrix like this, it could take a while depending on how similar the sequences are. I’d estimate that it could take up to 2 weeks. Alternatively, I’d suggest using the cluster.split approach as outlined in the MiSeq SOP on the wiki.
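To give a rough sense of why a matrix like this is slow to read and cluster, here is a back-of-the-envelope sketch in Python. It relies on the fact that a lower-triangle matrix of N sequences holds N(N-1)/2 distances. The figure of 7 bytes per distance is only an assumption for illustration (the real value depends on the output precision and on whether the matrix is square or lower-triangle), and the function names are hypothetical, not part of mothur.

```python
import math


def lower_triangle_entries(n_seqs: int) -> int:
    """Number of pairwise distances in a lower-triangle matrix of n_seqs sequences."""
    return n_seqs * (n_seqs - 1) // 2


def estimate_n_seqs(file_bytes: float, bytes_per_entry: float = 7.0) -> int:
    """Rough number of sequences implied by the size of a lower-triangle matrix.

    bytes_per_entry is an ASSUMED average size of one distance on disk
    (digits plus separator); sequence names are ignored.
    """
    entries = file_bytes / bytes_per_entry
    # Solve n * (n - 1) / 2 = entries for n (positive root of the quadratic).
    return int((1 + math.sqrt(1 + 8 * entries)) / 2)


if __name__ == "__main__":
    n = estimate_n_seqs(26e9)  # the 26 GB file from this thread
    print(f"about {n} unique sequences, {lower_triangle_entries(n)} distances")
```

Because the number of distances grows with the square of the number of sequences, halving the number of unique sequences cuts the matrix to roughly a quarter of its size.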

Pat

First of all, thank you for your quick reply.

I used the cluster command and chose furthest neighbor at random because I couldn't really guess which method to use. In the end I got this table, which only goes down to the genus level, but I would like to reach the species level. How could I do that, Dr. Schloss? Here is part of the table I have; does everything look OK?

OTU Size Taxonomy
Otu00001 25137 Bacteria(100) Proteobacteria(100) Betaproteobacteria(100) Burkholderiales(100) Alcaligenaceae(100) Bordetella(99)
Otu00002 15182 Bacteria(100) Proteobacteria(100) Alphaproteobacteria(100) Rhodobacterales(100) Rhodobacteraceae(100) Paracoccus(100)
Otu00003 2758 Bacteria(100) Proteobacteria(100) Betaproteobacteria(100) Burkholderiales(100) Alcaligenaceae(100) Bordetella(99)
Otu00004 2493 Bacteria(100) Proteobacteria(100) Alphaproteobacteria(100) Rhodobacterales(100) Rhodobacteraceae(100) Paracoccus(100)
Otu00005 1516 Bacteria(100) Proteobacteria(100) Betaproteobacteria(100) Burkholderiales(100) Alcaligenaceae(100) Bordetella(77)
Otu00006 1436 Bacteria(100) Proteobacteria(100) Alphaproteobacteria(61) Rhodobacterales(52) Rhodobacteraceae(52) unclassified(100)
.
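One way to sanity-check a table like this is to look at where the confidence values in parentheses drop off. The Python sketch below is only an illustration: it assumes whitespace-separated fields as pasted above, the rank names and the threshold of 80 are my own choices, and `deepest_confident_rank` is a hypothetical helper, not a mothur command.

```python
import re

# Rank names ASSUMED for the six taxonomy columns shown in the table.
RANKS = ["kingdom", "phylum", "class", "order", "family", "genus"]
# Matches a field such as "Bordetella(99)" -> name "Bordetella", confidence 99.
TAXON = re.compile(r"^(.+)\((\d+)\)$")


def deepest_confident_rank(line, min_conf=80):
    """Return (otu, size, (rank, name, conf)) for the deepest well-supported rank.

    The last element is None if even the first rank falls below min_conf.
    """
    fields = line.split()
    otu, size, taxa = fields[0], int(fields[1]), fields[2:]
    best = None
    for rank, field in zip(RANKS, taxa):
        match = TAXON.match(field)
        if match is None:
            break
        name, conf = match.group(1), int(match.group(2))
        # Stop at the first rank that is unclassified or weakly supported.
        if name == "unclassified" or conf < min_conf:
            break
        best = (rank, name, conf)
    return otu, size, best


if __name__ == "__main__":
    row = ("Otu00005 1516 Bacteria(100) Proteobacteria(100) Betaproteobacteria(100) "
           "Burkholderiales(100) Alcaligenaceae(100) Bordetella(77)")
    print(deepest_confident_rank(row))
```

With this threshold, Otu00005 is only supported down to the family level (Bordetella is at 77), and Otu00006 only down to the phylum level.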

I would encourage you to follow the SOPs that are available on the wiki. The default for cluster and cluster.split is average neighbor, which is far preferred to any other method. For your other question, see your other post: How can I classify OTUs to "SPECIES" level with mothur?