Problem in run pre.cluster

yangfa1 · March 18, 2016, 8:04pm

Hi, I am following the MiSeq SOP to process 16S rRNA data. My goal is to run pre.cluster on MULTIPLE cores. However, if I submit the job without the group= option, I get this message: “When using running without group information mothur can only use 1 processor, continuing.”
So I provided the latest group file, generated by the screen.seqs step right after make.contigs. I then got many errors listing read names that are found in the group file but not in the fastq file. This is expected, though, since many duplicate reads were removed from the input fastq for pre.cluster (I guess those reads appear only in the names file).
Did I miss something here? Can you confirm that pre.cluster can run on multiple cores?
Thanks a lot.

yangfa1 · March 19, 2016, 9:09pm

Update:
After searching the posts, I found a solution: list.seqs and get.seqs can generate updated group and names files. So pre.cluster now runs with the group file.
However, one problem remains. Although I used the processors=5 option, I see only one core running, with 87GB of memory used by it. Can pre.cluster use multiple cores? If yes, how do I do it?
Thanks.

pschloss · March 21, 2016, 6:34pm

For it to use multiple cores, you have to provide multiple groups. It puts each group on a different core. If you have a group that is much larger than others, it will likely take longer to process than the others.

Pat
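
To make the point above concrete, here is a rough Python sketch of why processors=5 can still leave a single core busy. It is only a toy model, not mothur's actual scheduler: the function name estimate_core_loads, the group names, the read counts, and the largest-first assignment of groups to cores are all assumptions made for illustration. The only thing taken from the answer is that a group is never split across cores.

```python
import heapq


def estimate_core_loads(group_sizes, processors):
    """Toy model: assign whole groups to cores and return the load per core.

    group_sizes maps a (hypothetical) group name to its read count.
    A group is never split, so the number of busy cores cannot exceed
    the number of groups, whatever value processors has.
    """
    workers = min(processors, len(group_sizes))
    # Min-heap of (current load, core id); the least loaded core is popped first.
    loads = [(0, core) for core in range(workers)]
    heapq.heapify(loads)
    # Assumption: largest groups are placed first, each on the least loaded core.
    for size in sorted(group_sizes.values(), reverse=True):
        load, core = heapq.heappop(loads)
        heapq.heappush(loads, (load + size, core))
    return sorted(load for load, _ in loads)


# One group only: a single core does all the work, even with processors=5.
print(estimate_core_loads({"all_reads": 8_000_000}, 5))

# Several groups (made-up sizes): work spreads out, but the core holding
# the largest group finishes last.
samples = {"s1": 5_000_000, "s2": 1_000_000, "s3": 1_000_000,
           "s4": 500_000, "s5": 500_000}
print(estimate_core_loads(samples, 5))
```

In this model the slowest core is always the one holding the biggest group, which matches the note that a group much larger than the others will take longer to process.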
