is it feasible to process 8M reads with pre.cluster

yangfa1 · March 22, 2016, 2:03am

I am following the MiSeq SOP. I have ~8M reads in each group, covering the ~1500 bp of 16S rRNA. So the 8M reads from one group can only be processed by one core. I found that it took 7.5 hr to process 50,000 reads. If the time required is proportional to the number of reads, it would take at least 1200 hr to process 8M reads, which makes this step unrealistic. Could you please suggest a solution? Do I have to skip this step? Or could I break the 8M reads into 100 parts and process them in parallel, which might be better than skipping the step?
Thanks.
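
The estimate in the post can be checked with a quick calculation. This is only a sketch under the post's own assumption that runtime is proportional to the number of reads. The function name `estimate_hours` is made up for illustration, and the figure for 100 chunks further assumes every chunk runs at the same time on its own core. It says nothing about whether splitting a group would change the clustering result.

```python
def estimate_hours(total_reads, sample_reads, sample_hours):
    """Linearly extrapolate runtime from a timed sample (assumes runtime ~ read count)."""
    return total_reads / sample_reads * sample_hours


# Timing reported in the post: 50,000 reads took 7.5 hr on one core.
SAMPLE_READS = 50_000
SAMPLE_HOURS = 7.5
TOTAL_READS = 8_000_000

# Whole group on a single core.
single_core_hours = estimate_hours(TOTAL_READS, SAMPLE_READS, SAMPLE_HOURS)

# Hypothetical split into 100 equal chunks, all running in parallel.
N_CHUNKS = 100
per_chunk_hours = estimate_hours(TOTAL_READS // N_CHUNKS, SAMPLE_READS, SAMPLE_HOURS)

print(f"single core: {single_core_hours:.0f} hr")
print(f"{N_CHUNKS} parallel chunks: about {per_chunk_hours:.0f} hr each")
```

Under these assumptions the single-core figure matches the 1200 hr quoted in the post, and each of the 100 chunks would need roughly 12 hr.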

pschloss · March 24, 2016, 11:56am

Can you split the reads by sample group? Have you run the sequences through unique.seqs? If you can split them, then you can use multiple processors to precluster the samples separately.

Pat
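
The question about unique.seqs matters because, in the MiSeq SOP, pre.cluster is run on the dereplicated output of unique.seqs, so the number of distinct sequences drives the workload rather than the raw read count. The sketch below only illustrates the idea of dereplication on made-up toy reads. It is not mothur's implementation, and the function name `dereplicate` and the read names are hypothetical.

```python
from collections import Counter


def dereplicate(records):
    """Collapse exactly identical sequences and count how often each occurs.

    `records` is an iterable of (name, sequence) pairs. Only exact string
    matches are merged; this is a simplification for illustration.
    """
    counts = Counter()
    for _name, seq in records:
        counts[seq] += 1
    return counts


# Toy reads, invented for this example.
reads = [
    ("read1", "ACGTACGT"),
    ("read2", "ACGTACGT"),
    ("read3", "ACGTACGA"),
    ("read4", "ACGTACGT"),
]

unique = dereplicate(reads)
print(f"{len(reads)} reads -> {len(unique)} unique sequences")
for seq, n in unique.most_common():
    print(seq, n)
```

If many of the 8M reads in a group are exact duplicates, the number of sequences that actually need to be compared could be much smaller than 8M. How much smaller depends entirely on the data.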
