1 sequence remains after align.seqs

I’m following the Costello Analysis using my own data.
After trim.seqs(fasta=TDPool1.raw.fasta, oligos=TDPool1Barcodes.oligos2, qfile=TDPool1.raw.qual, maxambig=0, maxhomop=8, flip=T, bdiffs=1, pdiffs=2, qwindowaverage=35, qwindowsize=50) I get:
Start End NBases Ambigs Polymer
Minimum: 1 28 28 0 2
2.5%-tile: 1 29 29 0 3
25%-tile: 1 249 249 0 4
Median: 1 349 349 0 4
75%-tile: 1 420 420 0 5
97.5%-tile: 1 450 450 0 6
Maximum: 1 529 529 0 8

# of Seqs: 703350

After unique.seqs(fasta=TDPool1.raw.trim.fasta) I get:
Start End NBases Ambigs Polymer
Minimum: 1 28 28 0 2
2.5%-tile: 1 40 40 0 4
25%-tile: 1 299 299 0 4
Median: 1 369 369 0 4
75%-tile: 1 429 429 0 5
97.5%-tile: 1 450 450 0 6
Maximum: 1 529 529 0 8

# of Seqs: 200575

After align.seqs(candidate=TDPool1.raw.trim.unique.fasta, template=…/Mothur.source/silva.bacteria.fasta, flip=T) I get:
Some of you sequences generated alignments that eliminated too many bases, a list is provided in TDPool1.raw.trim.unique.flip.accnos. If the reverse compliment proved to be better it was reported.
It took 5067 secs to align 200575 sequences.
Start End NBases Ambigs Polymer
Minimum: 26790 28464 79 0 8
2.5%-tile: 26790 28464 79 0 8
25%-tile: 26790 28464 79 0 8
Median: 26790 28464 79 0 8
75%-tile: 26790 28464 79 0 8
97.5%-tile: 26790 28464 79 0 8
Maximum: 26790 28464 79 0 8

# of Seqs: 1

How did I end up with one sequence?

I’d appreciate any help pinpointing what the heck I’m doing wrong. Thank you!
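One way to narrow this down is to count the records in the FASTA files outside of mothur. The sketch below is not part of mothur; `count_fasta_records` is a hypothetical helper that simply counts header lines starting with `>`. If the `.align` file turns out to hold roughly 200575 records, the alignment was probably written correctly and the problem is in how the file is being read back; if it holds 1, the problem is in align.seqs itself. The file names are the ones from the post above.

```python
from pathlib import Path


def count_fasta_records(path):
    """Count FASTA records by streaming the file one line at a time.

    Only header lines (those starting with '>') are counted, so the
    whole file never has to fit in memory.
    """
    count = 0
    with open(path, "r") as handle:
        for line in handle:
            if line.startswith(">"):
                count += 1
    return count


if __name__ == "__main__":
    # File names taken from the post; adjust the paths as needed.
    for name in ("TDPool1.raw.trim.unique.fasta",
                 "TDPool1.raw.trim.unique.align"):
        if Path(name).exists():
            print(name, count_fasta_records(name))
        else:
            print(name, "not found")
```

If the two counts agree, rerunning summary.seqs on the `.align` file with a different mothur build would be a reasonable next step.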

Can you also provide the summary.seqs function calls that you used?

Here you go Pat. Thank you for your help!
Bonnie Youmans

mothur > summary.seqs(fasta=TDPool1.raw.trim.fasta)

Start End NBases Ambigs Polymer
Minimum: 1 28 28 0 2
2.5%-tile: 1 29 29 0 3
25%-tile: 1 249 249 0 4
Median: 1 349 349 0 4
75%-tile: 1 420 420 0 5
97.5%-tile: 1 450 450 0 6
Maximum: 1 529 529 0 8

# of Seqs: 703350

Output File Name:
TDPool1.raw.trim.fasta.summary


mothur > unique.seqs(fasta=TDPool1.raw.trim.fasta)

Output File Names:
TDPool1.raw.trim.unique.fasta
TDPool1.raw.trim.names


mothur > summary.seqs(fasta=TDPool1.raw.trim.unique.fasta)

Start End NBases Ambigs Polymer
Minimum: 1 28 28 0 2
2.5%-tile: 1 40 40 0 4
25%-tile: 1 299 299 0 4
Median: 1 369 369 0 4
75%-tile: 1 429 429 0 5
97.5%-tile: 1 450 450 0 6
Maximum: 1 529 529 0 8

# of Seqs: 200575

Output File Name:
TDPool1.raw.trim.unique.fasta.summary

mothur > align.seqs(candidate=TDPool1.raw.trim.unique.fasta, template=…/Mothur.source/silva.bacteria.fasta, flip=T)

Reading in the …/Mothur.source/silva.bacteria.fasta template sequences… DONE.
Aligning sequences from TDPool1.raw.trim.unique.fasta …

Some of you sequences generated alignments that eliminated too many bases, a list is provided in TDPool1.raw.trim.unique.flip.accnos. If the reverse compliment proved to be better it was reported.
It took 5067 secs to align 200575 sequences.


Output File Names:
TDPool1.raw.trim.unique.align
TDPool1.raw.trim.unique.align.report
TDPool1.raw.trim.unique.flip.accnos

mothur > summary.seqs(fasta=TDPool1.raw.trim.unique.align)

Start End NBases Ambigs Polymer
Minimum: 26790 28464 79 0 8
2.5%-tile: 26790 28464 79 0 8
25%-tile: 26790 28464 79 0 8
Median: 26790 28464 79 0 8
75%-tile: 26790 28464 79 0 8
97.5%-tile: 26790 28464 79 0 8
Maximum: 26790 28464 79 0 8

# of Seqs: 1

Output File Name:
TDPool1.raw.trim.unique.align.summary
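The log above also mentions TDPool1.raw.trim.unique.flip.accnos, so it may be worth checking how many of the 200575 unique sequences were flagged there. This is a rough sketch, not a mothur feature: it assumes the accnos file has one entry per line with the sequence name as the first whitespace-separated field, and `read_accnos` and `flagged_fraction` are hypothetical helper names.

```python
from pathlib import Path


def read_accnos(path):
    """Return the set of sequence names listed in an accnos-style file.

    Assumption: one entry per line, name in the first whitespace-separated
    field. Blank lines are skipped.
    """
    names = set()
    with open(path, "r") as handle:
        for line in handle:
            fields = line.split()
            if fields:
                names.add(fields[0])
    return names


def flagged_fraction(names, total):
    """Fraction of `total` sequences that appear in `names`."""
    if total <= 0:
        raise ValueError("total must be positive")
    return len(names) / total


if __name__ == "__main__":
    accnos = "TDPool1.raw.trim.unique.flip.accnos"  # name from the log above
    total_unique = 200575  # sequence count reported by summary.seqs above
    if Path(accnos).exists():
        flagged = read_accnos(accnos)
        print(len(flagged), "flagged,",
              round(100 * flagged_fraction(flagged, total_unique), 2), "%")
    else:
        print(accnos, "not found")
```

A small number of flagged sequences would suggest the alignment mostly worked, which would point away from the data and toward how the output is being read.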

Could you send your TDPool1.raw.trim.unique.fasta file to mothur.bugs@gmail.com?

Thank you, westcott. I sent an email with a link to the file.

Bonnie

I am also having this same problem (with a different data set). Several people have tried the data and run into the same problem, but at least one person using version 1.16 of mothur processed it just fine. Has a solution been found?

The solution for me was to download the executable mothur file. I had originally downloaded the source code and compiled it myself. Once I downloaded the executable, everything worked great. (I have to give a shout out to Sarah Westcott for her help!)

I hope that helps.

Bonnie

Hi All,

I am not sure whether this problem has been resolved, but I have seen the same thing when I generate files that are too large for my computer to handle. If the file I am working with is larger than the amount of RAM on my computer, I get something similar to what you are seeing. I don’t think the computer is able to open the file in that case, so it only gets through the first sequence. I would try switching to a computer with more RAM and then running summary.seqs again. It usually happened to me after align.seqs, when the files get HUGE.

-Gary
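To test Gary's idea quickly, you could compare the size of the aligned file with the machine's physical memory. The sketch below is only an illustration: `physical_memory_bytes` and `exceeds_memory` are hypothetical helpers, and the `os.sysconf` lookup is an assumption that holds on Linux and similar Unix systems but not on Windows, where the function returns `None`.

```python
import os


def physical_memory_bytes():
    """Return total physical RAM in bytes, or None if it cannot be read.

    Uses os.sysconf, which is available on Linux and similar Unix systems
    only; on other platforms this returns None.
    """
    try:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        return None


def exceeds_memory(file_size, memory):
    """True if the file is larger than the known amount of memory."""
    return memory is not None and file_size > memory


if __name__ == "__main__":
    path = "TDPool1.raw.trim.unique.align"  # name from earlier in the thread
    memory = physical_memory_bytes()
    if os.path.exists(path):
        size = os.path.getsize(path)
        print("file bytes:", size, "RAM bytes:", memory)
        print("larger than RAM:", exceeds_memory(size, memory))
    else:
        print(path, "not found")
```

A file larger than RAM would fit the explanation above, though it would not prove it, since Bonnie's case was fixed by switching to the downloaded executable.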