I’m a Mothur beginner, and I’m wondering whether it is plausible for the “cluster” command to take only 49 seconds on a modest (not particularly powerful) computer for 8462 unique sequences. It is supposed to be a very long step…
Here is what I did:
mothur > dist.seqs(fasta=majorque.pick.fasta, cutoff=0.15, processors=2)
Output File Name:
It took 224 to calculate the distances for 8462 sequences.
mothur > cluster(column=majorque.pick.dist, name=majorque.pick.names)
Reading matrix: |||||||||||||||||||||||||||||||||||||||||||||||||||
changed cutoff to 0.0540724
Output File Names:
It took 49 seconds to cluster
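In case it helps with the diagnosis, here is a minimal Python sketch of a sanity check on the distance file. It assumes the column-format file has three whitespace-separated fields per line (two sequence names and a distance). The function names are my own and are not part of mothur. It compares the number of distances kept under the cutoff with the total number of possible pairs. If only a small fraction of pairs was kept, that alone might explain a short clustering time.

```python
import os


def total_pairs(n_seqs):
    """Number of unordered pairs among n_seqs sequences."""
    return n_seqs * (n_seqs - 1) // 2


def summarize_column_dist(path):
    """Count the distances stored in a column-format file and find the largest.

    Assumes each data line is: name1 name2 distance (whitespace separated).
    Lines that do not have exactly three fields are skipped.
    """
    count = 0
    max_dist = 0.0
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) != 3:
                continue
            count += 1
            max_dist = max(max_dist, float(fields[2]))
    return count, max_dist


if __name__ == "__main__":
    n_unique = 8462                  # unique sequences reported by dist.seqs
    dist_file = "majorque.pick.dist"  # file passed to cluster(column=...)
    possible = total_pairs(n_unique)
    print("possible pairs:", possible)
    if os.path.exists(dist_file):
        kept, largest = summarize_column_dist(dist_file)
        print("distances kept:", kept)
        print("fraction kept: %.4f" % (kept / possible))
        print("largest distance kept:", largest)
```

For 8462 unique sequences there are 35,798,491 possible pairs, so the fraction printed shows how sparse the matrix read by cluster actually is.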
Here is what I did from the beginning, following the SOP:
Uchime (tested with or without an additional align+filter after chimera removal)
Remove lineage (Mitochondria-Chloroplast-Archaea-Eukarya-unknown)
I followed the SOP and used the recommended parameters throughout, except for the following:
- I used 360-720 as parameters for trim.flows (with 450 flows, more than 50% of my sequences ended up in the scrap file)
- I did not use “trump=.” when I first filtered the sequences because it shortened the mean sequence length from 420 to 257 bp. (By the way, I did not find an explanation for this on the wiki or forum, so any explanation here is more than welcome; note that the sequences seem to align well and the overlap seems OK.) After chimera removal, I tested an additional align+filter step, this time using “trump=.”, and the results look the same, with the cluster command taking 70 sec.
Is there something I’m doing wrong?
Hoping I’m not missing something obvious here,
Thanks in advance for your help,