read.dist hang-up

I’ve been trying to use mothur to analyze some 454 datasets (~20,000 seqs total; 10 libraries of ~2,000 seqs each). I’ve been able to filter FASTA-formatted alignments and to generate distance matrices (using dist.seqs). However, when I attempt to load the distance matrices (either the merged libraries or the individual libraries) using read.dist, the program hangs. The “reading matrix” progress bar advances for a few minutes, then just stops. Any guesses what might be the problem? I’ve tried setting the cutoff to 0.1, but that does not seem to help.
Distance matrices of a smaller library (ca. 300 seqs) can be read just fine. Also, I amended the “makefile” within mothur as directed for pyrosequencing analysis.
Thank you for any insight! I truly appreciate you making this wonderful resource available to the community and so well supported.

Hi Alison

What are the specs on your computer (Mac/PC/Linux, Processor, RAM)?

Oops, yes, I can see how that information would be helpful:
Mac OS X 10.5.8
Processor: 3.06 GHz Intel Core 2 Duo
Memory: 4GB 1067 MHz DDR3

Seems like you have plenty of computing power.

Can you provide me with the summary.seqs report on your .align file before you run dist.seqs?

Also, is the smaller library (~300 seqs) you tested a subset of your 454 dataset? If not, I would take just a couple hundred sequences from the 454 dataset and run them through the same steps you used on the large dataset. If that small subset fails too, there might be something wrong with your file format.
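
In case it helps, here is one way to pull such a subset out of a FASTA file. This is a minimal Python sketch and not part of mothur; the function name take_first_records and the file names in the usage note are made up for illustration.

```python
def take_first_records(fasta_text, n):
    """Return the first n FASTA records of fasta_text as a string."""
    kept = []
    count = 0
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # A header line starts a new record.
            count += 1
            if count > n:
                break
        if count >= 1:
            # Keep header and sequence lines of the records seen so far.
            kept.append(line)
    return "\n".join(kept) + ("\n" if kept else "")
```

Usage would be something like reading the whole alignment with open("my454.align").read(), calling take_first_records(text, 200), and writing the result to a new file (both file names are hypothetical). Taking the first records rather than a random sample keeps the test simple and repeatable.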

Hope this helps

Ah, yes, I think it was a formatting issue. I tried a subset of 100 seqs from the 454 files and it did not work either. The sequence names seem to have been the problem (maybe too long?! They were 11 characters). I deleted several characters from each name and it seems to be fine now.
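
For anyone who hits the same thing and does not want to edit names by hand, here is a rough Python sketch that replaces every sequence name with a short sequential one and keeps a lookup table back to the originals. It is not a mothur command; the function name rename_sequences and the "s" prefix are invented for this example, and whether name length was really the cause is only my guess from the behaviour above.

```python
def rename_sequences(fasta_text, prefix="s"):
    """Give every FASTA record a short sequential name.

    Returns (new_fasta_text, mapping), where mapping goes from each
    new short name to the full original header line (without '>').
    """
    mapping = {}
    out = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # Sequential names are unique by construction,
            # which plain truncation would not guarantee.
            new_name = f"{prefix}{len(mapping) + 1}"
            mapping[new_name] = line[1:].strip()
            out.append(">" + new_name)
        else:
            out.append(line)
    return ("\n".join(out) + "\n" if out else ""), mapping
```

With ~20,000 sequences the longest name this produces is "s20000" (6 characters). Any other file that refers to the sequences by name would need the same renaming, so it is worth saving the mapping.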