Hi, I am following the MiSeq SOP to process 16S rRNA data. My goal is to run pre.cluster with multiple cores. However, if I submit the job without the group= option, I get this message: “When using running without group information mothur can only use 1 processor, continuing.”
So I provided the latest group file, generated by the screen.seqs step right after make.contigs. I then got many errors listing read names that are found in the group table but not in the fastq file. This is expected, though, since many duplicated reads were removed from the input fastq for pre.cluster (those reads only appear in the names file, I guess).
Did I miss something here? Can you confirm that pre.cluster can run with multiple cores?
Thanks a lot.
Update:
After searching the posts, I found a solution: list.seqs and get.seqs can generate updated group and name files. So now pre.cluster runs with the group file.
However, the remaining problem is that, although I used the processors=5 option, I see only one core running, with 87 GB of memory in use. Can pre.cluster use multiple cores? If so, how?
Thanks.
For it to use multiple cores, you have to provide multiple groups. It puts each group on a different core. If you have a group that is much larger than others, it will likely take longer to process than the others.
Pat
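Since each group goes to its own core, it can help to check how many groups the group file actually contains and how unevenly the sequences are spread across them. Below is a minimal Python sketch for that check, not part of mothur. It assumes the group file is plain text with one sequence name and one group name per line, separated by whitespace; the function names are hypothetical.

```python
from collections import Counter


def group_sizes(group_lines):
    """Count sequences per group from 'name<TAB>group' lines."""
    counts = Counter()
    for line in group_lines:
        fields = line.split()
        # Skip blank or malformed lines that lack a group column.
        if len(fields) >= 2:
            counts[fields[1]] += 1
    return counts


def largest_share(counts):
    """Return (group, fraction of all sequences) for the biggest group."""
    total = sum(counts.values())
    group, size = counts.most_common(1)[0]
    return group, size / total


if __name__ == "__main__":
    # Toy lines standing in for a real group file (assumption: in
    # practice you would pass an open file handle instead of this list).
    toy = ["s1\tA\n", "s2\tA\n", "s3\tA\n", "s4\tB\n"]
    sizes = group_sizes(toy)
    print(len(sizes), "groups:", dict(sizes))
    print("largest:", largest_share(sizes))
```

If the check reports a single group, then by the rule above there is only one unit of work to hand out, whatever processors= is set to. If one group holds most of the sequences, the core handling it will probably still be running after the others have finished.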