Different results make.contigs 1.47 vs older version

Alexandre_Thibodeau · March 15, 2022, 6:27pm

Hello!

I am updating a previous run to Mothur 1.47 so that I can start using it as a basis to build a database with cluster.fit.

There are some differences with version 1.47.

Firstly, I am keeping a lot more sequences with the new version. Ambiguous bases also remain after make.contigs. So far these are the main differences. Any idea why? I am posting both logs for comparison. The one other difference is that I set the seed to 100 in v1.47 before the contig, precluster and clustering steps for reproducibility. Is the fixed seed affecting the process?

Many thanks.

Old (1.44 I think)
Setting logfile name to megacampy_logFile_clustersplit

mothur > make.contigs(file=megacampy.files, oligos=primers.oligo.txt, checkorient=t, pdiffs=2, deltaq=5)

…

mothur > screen.seqs(fasta=current, group=current, summary=current, maxambig=0, maxhomop=70)

…

mothur > unique.seqs(fasta=current)

……

mothur > summary.seqs(fasta=current, count=current)

Using 32 processors.

             Start  End  NBases  Ambigs  Polymer  NumSeqs
Minimum:     1      49   49      0       3        1
2.5%-tile:   1      252  252     0       3        1547809
25%-tile:    1      253  253     0       4        15478090
Median:      1      253  253     0       4        30956180
75%-tile:    1      253  253     0       5        46434269
97.5%-tile:  1      253  253     0       6        60364550
Maximum:     1      463  463     0       70       61912358
Mean:        1      252  252     0       4

# of unique seqs: 4672367

total # of seqs: 61912358

It took 150 secs to summarize 61912358 sequences.

mothur > align.seqs(fasta=current, reference=silva.nr_v132.pcr.align, flip=t)

……..

             Start  End    NBases  Ambigs  Polymer  NumSeqs
Minimum:     0      0      0       0       1        1
2.5%-tile:   1968   11550  252     0       3        1547809
25%-tile:    1968   11550  253     0       4        15478090
Median:      1968   11550  253     0       4        30956180
75%-tile:    1968   11550  253     0       5        46434269
97.5%-tile:  1968   11550  253     0       6        60364550
Maximum:     13425  13425  418     0       50       61912358
Mean:        1968   11549  252     0       4

# of unique seqs: 4672367

total # of seqs: 61912358

New (1.47)
mothur > set.seed(seed=100)

Setting random seed to 100.

mothur > make.contigs(file=megacampy.files, oligos=primers.oligo.txt, checkorient=t, pdiffs=2, deltaq=5, maxambig=0, maxhomop=50)

……..

mothur > unique.seqs(fasta=current)

……

mothur > summary.seqs(fasta=current, count=current)

Using megacampy.trim.contigs.count_table as input file for the count parameter.

Using megacampy.trim.contigs.unique.fasta as input file for the fasta parameter.

Using 32 processors.

             Start  End  NBases  Ambigs  Polymer  NumSeqs
Minimum:     1      23   23      0       3        1
2.5%-tile:   1      252  252     0       3        1968628
25%-tile:    1      253  253     0       4        19686276
Median:      1      253  253     0       4        39372552
75%-tile:    1      253  253     0       5        59058827
97.5%-tile:  1      253  253     12      6        76776475
Maximum:     1      463  463     189     230      78745102
Mean:        1      252  252     1       4

# of unique seqs: 18276278

total # of seqs: 78745102

……….

mothur > align.seqs(fasta=current, reference=silva.nr_v132.pcr.align, flip=t)

             Start  End    NBases  Ambigs  Polymer  NumSeqs
Minimum:     0      0      0       0       1        1
2.5%-tile:   1968   11550  252     0       3        1968628
25%-tile:    1968   11550  253     0       4        19686276
Median:      1968   11550  253     0       4        39372552
75%-tile:    1968   11550  253     0       5        59058827
97.5%-tile:  1968   11550  253     12      6        76776475
Maximum:     13425  13425  452     94      95       78745102
Mean:        1968   11549  252     1       4

# of unique seqs: 18276278

total # of seqs: 78745102

Alexandre_Thibodeau · March 17, 2022, 7:47pm

Hello, just an update.

So I stopped the job and restarted it with modifications. It turns out that the seed is not the problem. Adding maxambig=0 to screen.seqs (after alignment) put things back to normal. It looks like make.contigs is not able to remove all ambigs.

Cheers!

westcott · March 21, 2022, 1:51pm

Thanks for reporting this bug. The removal of ambiguous bases and homopolymers in make.contigs is not triggered unless you add the maxlength option. This bug will be corrected in our next release.
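For anyone who wants to confirm independently whether the ambiguity and homopolymer filters were actually applied to a make.contigs output, here is a minimal Python sketch that is not part of mothur. It assumes that any character other than A, C, G, T, U or an alignment gap counts as ambiguous, which may differ in detail from mothur's own counting. The function names and default thresholds are hypothetical and only mirror the maxambig=0, maxhomop=50 values used above.

```python
from itertools import groupby


def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].strip(), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def count_ambigs(seq):
    """Count bases that are not A/C/G/T/U, ignoring gap characters (assumption)."""
    return sum(1 for base in seq.upper() if base not in "ACGTU.-")


def longest_homopolymer(seq):
    """Length of the longest run of one repeated base, gaps removed."""
    bases = [base for base in seq.upper() if base not in ".-"]
    return max((len(list(run)) for _, run in groupby(bases)), default=0)


def failing_reads(lines, maxambig=0, maxhomop=50):
    """Names of reads that a maxambig/maxhomop screen would be expected to drop."""
    return [
        name
        for name, seq in read_fasta(lines)
        if count_ambigs(seq) > maxambig or longest_homopolymer(seq) > maxhomop
    ]
```

To use it, open the contigs FASTA (for example with `open("your.trim.contigs.fasta")`, where the file name is a placeholder) and pass the file object to `failing_reads`. A non-empty result suggests the filters were not applied at the make.contigs step.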

Alexandre_Thibodeau · March 21, 2022, 2:02pm

Glad I was of some use.

system · March 31, 2022, 2:03pm

This topic was automatically closed 10 days after the last reply. New replies are no longer allowed.
