Long processing time with chimera.uchime?

sekhwal · March 13, 2017, 4:13am

Hi, I am running 21 18S rRNA samples together against the silva.seed_v123 database. The chimera.uchime command is taking 23 h to process a single group. Could you let me know whether this is normal?

pschloss · March 13, 2017, 1:17pm

It’s hard to diagnose problems if we don’t have the exact command you are running. Can you provide more details? Also, you seem to be firing off a bunch of posts to forums. If you could ask a single question about a single problem that would make things much easier for us to follow.

sekhwal · March 13, 2017, 8:56pm

Hi Pat,

Sorry for posting so many questions in the forums. I will consolidate my queries.

Please have a look at the command and output below. The pipeline took 29 h to find chimeras in group JCT40_3_S25.

mothur > chimera.uchime(fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.fasta, count=stability.trim.contigs.good.unique.good.filter.unique.precluster.count_table, dereplicate=t)

Using 1 processors.

Checking sequences from stability.trim.contigs.good.unique.good.filter.unique.precluster.fasta …

It took 5843 secs to check 16062 sequences from group JCT28_2_S24.
It took 1158 secs to check 6828 sequences from group JCT3_3_S23.
It took 105137 secs to check 75188 sequences from group JCT40_3_S25.

dwaite · March 13, 2017, 10:01pm

That’s an unfortunate run time for the group, but I’m not sure it’s a bug. If you compare the times for each sample, you’ll notice a direct relationship between the number of sequences in a sample and its run time. For example, JCT3_3_S23 has ~7,000 sequences and took ~1,100 seconds to run. JCT28_2_S24 has a bit more than twice as many sequences and took ~5,800 seconds. Your ‘problem’ sample has about 5 times as many sequences as JCT28_2_S24, so it will obviously take longer to run.

I don’t know the exact growth rate of the uchime algorithm, but since it involves comparing all of your rare sequences to all the abundant sequences, I would expect the run time to grow much faster than linearly (roughly quadratically for that kind of all-against-all comparison) as samples get bigger.
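
As a rough check on that growth rate, the three timings from the log posted above can be fitted to a power law, time ≈ c · n^k, using least squares on the log-log values. This is only a sketch on three data points, so treat the result as indicative at best; `fit_power_law` is a name made up for this illustration and is not part of mothur.

```python
import math

# (sequences, seconds) for each group, copied from the mothur log in this thread
timings = [(6828, 1158), (16062, 5843), (75188, 105137)]

def fit_power_law(points):
    """Least-squares fit of log(t) = log(c) + k * log(n); returns (c, k)."""
    xs = [math.log(n) for n, _ in points]
    ys = [math.log(t) for _, t in points]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    # Slope of the regression line in log-log space is the scaling exponent
    k = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    c = math.exp(my - k * mx)
    return c, k

c, k = fit_power_law(timings)
print(f"estimated exponent k = {k:.2f}")
```

On these three groups the exponent comes out a little below 2, which looks closer to quadratic growth than to a true exponential, though three samples cannot pin it down.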

Kendra · March 14, 2017, 9:57pm

I agree with dwaite, that’s pretty normal for checking tens of thousands of seqs/sample. I’ve considered subsampling the samples that win the pooling lottery and get an order of magnitude more seqs than I need.
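
For a feel of what subsampling might buy, here is a back-of-the-envelope estimate. It assumes run time scales as n^k with k ≈ 1.9, which is a guess read off the three timings posted earlier in this thread and not a documented property of uchime. `estimate_secs` is a hypothetical helper, not a mothur command.

```python
# Reference point from the log: group JCT40_3_S25, 75188 sequences in 105137 secs
REF_SEQS, REF_SECS = 75188, 105137
K = 1.9  # assumed scaling exponent; a rough guess from only three timings

def estimate_secs(n_seqs, k=K):
    """Scale the reference run time to n_seqs, assuming time ~ n_seqs ** k."""
    return REF_SECS * (n_seqs / REF_SEQS) ** k

# Hypothetical target depths, chosen only for illustration
for target in (16000, 30000, 50000):
    print(f"{target} seqs: ~{estimate_secs(target) / 3600:.1f} h")
```

The counts in the log appear to be sequences checked per group after preclustering, so subsampling raw reads would not necessarily shrink them in proportion; the numbers are only a guide.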
