I’m trying to cluster up to 17,000 16S rRNA gene sequences obtained from public databases.
Running the cluster command with a cutoff of 0.2 in both mothur 1.21 and 1.27 (Windows), the log file reported a lower cutoff value than the distances present in the distance matrix. The cutoff in the log file was very close to unique (<0.004), but in the distance matrix we could find values of 0.5 and others higher than the one in the log file.
Furthermore, both mothur versions sometimes crashed.
Could these problems be related to low RAM on the computer? Or is it just a matter of the input file, since the sequences come from different 16S rRNA regions (different primers were used to generate them)?
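To give a rough sense of scale for the RAM question, here is a minimal sketch that counts the pairwise distances among 17,000 sequences and estimates the size of a full lower-triangle matrix. The 8 bytes per distance is only our assumption for illustration; we do not know how mothur actually stores distances, and a cutoff would reduce the number kept.

```python
def pair_count(n):
    """Number of unique pairwise distances among n sequences."""
    return n * (n - 1) // 2


def full_matrix_bytes(n, bytes_per_distance=8):
    """Size of a full lower-triangle matrix.

    bytes_per_distance=8 is an assumption (one double per distance),
    not mothur's actual in-memory or on-disk format.
    """
    return pair_count(n) * bytes_per_distance


n_seqs = 17000
print("pairwise distances:", pair_count(n_seqs))
print("approx. size in GiB:", full_matrix_bytes(n_seqs) / 1024**3)
```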
Thank you all
Marc and Mireia