summary.seqs error - count table not unique

Hi all - after getting over an initial problem with the format of my input .fastq files, I’ve gotten through many of the steps in the MiSeq SOP using my own data (paired-end, 250bp, 16S V4).

I can now get through the align.seqs and screen.seqs steps, but I get the following error when I run summary.seqs:

[ERROR]: Your count table contains more than 1 sequence named HWI-M00590_260_000000000-AGRFM_1_2114_16059_29324, sequence names must be unique. Please correct.

Reading through the forum, I found this post suggesting that the error may be caused by a bug in the software (at least when it occurs during cluster.split). Is this still a bug, or have I done something wrong?

Best,

Dan

Could you post your log file so I can see the commands you ran?

Hi Sarah - see below for the full log up to the point where I get the error. Thanks in advance for helping troubleshoot this.

Best,

Dan



Mac version

Using ReadLine

Running 64Bit Version

mothur v.1.35.1
Last updated: 3/31/2015

by
Patrick D. Schloss

Department of Microbiology & Immunology
University of Michigan
pschloss@umich.edu
http://www.mothur.org

When using, please cite:
Schloss, P.D., et al., Introducing mothur: Open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl Environ Microbiol, 2009. 75(23):7537-41.

Distributed under the GNU General Public License

Type ‘help()’ for information on the commands that are available

Type ‘quit()’ to exit program
Script Mode


mothur > make.contigs(ffastq=FGC1279_s_1_R1.fastq, rfastq=FGC1279_s_1_R4.fastq, findex=FGC1279_s_1_R2.fastq, rindex=FGC1279_s_1_R3.fastq, oligos=MiSeq3_gz08jan16.oligos, bdiffs=1, processors=8)

Using 8 processors.
Making contigs…


mothur > summary.seqs(fasta=FGC1279_s_1_R1.trim.contigs.fasta, processors=8)

Using 8 processors.

Start End NBases Ambigs Polymer NumSeqs
Minimum: 1 237 237 0 3 1
2.5%-tile: 1 251 251 0 3 331631
25%-tile: 1 252 252 0 4 3316301
Median: 1 253 253 0 4 6632602
75%-tile: 1 253 253 2 5 9948902
97.5%-tile: 1 254 254 7 6 12933572
Maximum: 1 500 500 105 250 13265202
Mean: 1 255.988 255.988 1.36727 4.53383

# of Seqs: 13265202

Output File Names:
FGC1279_s_1_R1.trim.contigs.summary

It took 63 secs to summarize 13265202 sequences.

mothur > screen.seqs(fasta=FGC1279_s_1_R1.trim.contigs.fasta, summary=FGC1279_s_1_R1.trim.contigs.summary, maxambig=0, maxlength=275, processors=8)

Using 8 processors.

Output File Names:
FGC1279_s_1_R1.trim.contigs.good.summary
FGC1279_s_1_R1.trim.contigs.good.fasta
FGC1279_s_1_R1.trim.contigs.bad.accnos


It took 121 secs to screen 13265202 sequences.

mothur > unique.seqs(fasta=FGC1279_s_1_R1.trim.contigs.good.fasta)
6728021 1016888

Output File Names:
FGC1279_s_1_R1.trim.contigs.good.names
FGC1279_s_1_R1.trim.contigs.good.unique.fasta

mothur > count.seqs(name=FGC1279_s_1_R1.trim.contigs.good.names, processors=8)

Using 8 processors.
It took 5 secs to create a table for 6728021 sequences.


Total number of sequences: 6728021

Output File Names:
FGC1279_s_1_R1.trim.contigs.good.count_table

mothur > summary.seqs(count=FGC1279_s_1_R1.trim.contigs.good.count_table, processors=8, fasta=FGC1279_s_1_R1.trim.contigs.good.unique.fasta)

Using 8 processors.

Start End NBases Ambigs Polymer NumSeqs
Minimum: 1 240 240 0 3 1
2.5%-tile: 1 252 252 0 3 168201
25%-tile: 1 252 252 0 4 1682006
Median: 1 253 253 0 4 3364011
75%-tile: 1 253 253 0 5 5046016
97.5%-tile: 1 253 253 0 6 6559821
Maximum: 1 275 275 0 58 6728021
Mean: 1 252.646 252.646 0 4.50301

# of unique seqs: 1016888

total # of seqs: 6728021

Output File Names:
FGC1279_s_1_R1.trim.contigs.good.unique.summary

It took 7 secs to summarize 6728021 sequences.

mothur > pcr.seqs(fasta=silva.nr_v123.align, start=11894, end=25319, keepdots=F, processors=8)

Using 8 processors.

Output File Names:
silva.nr_v123.pcr.align


It took 95 secs to screen 172418 sequences.

mothur > summary.seqs(fasta=silva.nr.v123.fasta, processors=8)

Using 8 processors.

Start End NBases Ambigs Polymer NumSeqs
Minimum: 1 9876 212 0 3 1
2.5%-tile: 1 13425 292 0 4 4311
25%-tile: 1 13425 293 0 4 43105
Median: 1 13425 293 0 5 86210
75%-tile: 1 13425 293 0 5 129314
97.5%-tile: 1 13425 459 1 6 168108
Maximum: 4225 13425 1122 5 16 172418
Mean: 1.14112 13424.9 308.769 0.0498904 4.74949

# of Seqs: 172418

Output File Names:
silva.nr.v123.summary

It took 20 secs to summarize 172418 sequences.

mothur > align.seqs(fasta=FGC1279_s_1_R1.trim.contigs.good.unique.fasta, reference=silva.nr.v123.fasta, processors=8)

Using 8 processors.

Reading in the silva.nr.v123.fasta template sequences… DONE.
It took 53 to read 172418 sequences.
Aligning sequences from FGC1279_s_1_R1.trim.contigs.good.unique.fasta …
Some of you sequences generated alignments that eliminated too many bases, a list is provided in FGC1279_s_1_R1.trim.contigs.good.unique.flip.accnos. If you set the flip parameter to true mothur will try aligning the reverse compliment as well.
It took 1156 secs to align 1016888 sequences.


Output File Names:
FGC1279_s_1_R1.trim.contigs.good.unique.align
FGC1279_s_1_R1.trim.contigs.good.unique.align.report
FGC1279_s_1_R1.trim.contigs.good.unique.flip.accnos

mothur > summary.seqs(fasta=FGC1279_s_1_R1.trim.contigs.good.unique.align, count=FGC1279_s_1_R1.trim.contigs.good.count_table, processors=8)

Using 8 processors.

Start End NBases Ambigs Polymer NumSeqs
Minimum: 1 3 1 0 1 1
2.5%-tile: 1968 11550 252 0 3 168201
25%-tile: 1968 11550 252 0 4 1682006
Median: 1968 11550 253 0 4 3364011
75%-tile: 1968 11550 253 0 5 5046016
97.5%-tile: 1968 11550 253 0 6 6559821
Maximum: 13425 13425 271 0 58 6728021
Mean: 1968.45 11549.8 252.645 0 4.503

# of unique seqs: 1016888

total # of seqs: 6728021

Output File Names:
FGC1279_s_1_R1.trim.contigs.good.unique.summary

It took 49 secs to summarize 6728021 sequences.

mothur > screen.seqs(fasta=FGC1279_s_1_R1.trim.contigs.good.unique.align, count=FGC1279_s_1_R1.trim.contigs.good.count_table, summary=FGC1279_s_1_R1.trim.contigs.good.unique.summary, start=1968, end=11550, maxhomop=8, processors=8)

Using 8 processors.

Output File Names:
FGC1279_s_1_R1.trim.contigs.good.unique.good.summary
FGC1279_s_1_R1.trim.contigs.good.unique.good.align
FGC1279_s_1_R1.trim.contigs.good.unique.bad.accnos
FGC1279_s_1_R1.trim.contigs.good.good.count_table


It took 131 secs to screen 1016888 sequences.

mothur > summary.seqs(fasta=FGC1279_s_1_R1.trim.contigs.good.unique.good.align, count=FGC1279_s_1_R1.trim.contigs.good.good.count_table, processors=8)

Using 8 processors.
[ERROR]: Your count table contains more than 1 sequence named HWI-M00590_260_000000000-AGRFM_1_2114_15526_29324, sequence names must be unique. Please correct.
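
One way to tell whether the count table written by screen.seqs really lists a name twice (as opposed to the problem lying in how summary.seqs reads it) is to scan the file outside mothur. The following is a minimal sketch, not part of mothur: it assumes the count table is tab-delimited with the sequence name in the first column and a header line starting with `Representative_Sequence`, and the function name `duplicate_names` is hypothetical.

```python
import sys
from collections import Counter


def duplicate_names(count_table_path):
    """Return {name: occurrences} for names listed more than once.

    Assumes a tab-delimited count table whose first column holds the
    sequence name and whose header starts with 'Representative_Sequence'.
    """
    seen = Counter()
    with open(count_table_path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            # Skip blank lines and any comment-style lines.
            if not line or line.startswith("#"):
                continue
            name = line.split("\t", 1)[0]
            # Skip the header row.
            if name == "Representative_Sequence":
                continue
            seen[name] += 1
    return {name: n for name, n in seen.items() if n > 1}


if __name__ == "__main__" and len(sys.argv) > 1:
    # Print each duplicated name with the number of rows it appears on.
    for name, n in sorted(duplicate_names(sys.argv[1]).items()):
        print(f"{name}\t{n}")
```

Saved under a name such as find_dups.py (hypothetical) and run with FGC1279_s_1_R1.trim.contigs.good.good.count_table as its argument, it prints nothing if every name is unique. Running it on FGC1279_s_1_R1.trim.contigs.good.count_table as well would show whether any duplicates were already present before the second screen.seqs step.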