1 sequence remains after align.seqs

bpyoumans · January 22, 2011, 7:46pm

I’m following the Costello Analysis using my own data.
After trim.seqs(fasta=TDPool1.raw.fasta, oligos=TDPool1Barcodes.oligos2, qfile=TDPool1.raw.qual, maxambig=0, maxhomop=8, flip=T, bdiffs=1, pdiffs=2, qwindowaverage=35, qwindowsize=50) I get:
Start End NBases Ambigs Polymer
Minimum: 1 28 28 0 2
2.5%-tile: 1 29 29 0 3
25%-tile: 1 249 249 0 4
Median: 1 349 349 0 4
75%-tile: 1 420 420 0 5
97.5%-tile: 1 450 450 0 6
Maximum: 1 529 529 0 8

# of Seqs: 703350

After unique.seqs(fasta=TDPool1.raw.trim.fasta) I get:
Start End NBases Ambigs Polymer
Minimum: 1 28 28 0 2
2.5%-tile: 1 40 40 0 4
25%-tile: 1 299 299 0 4
Median: 1 369 369 0 4
75%-tile: 1 429 429 0 5
97.5%-tile: 1 450 450 0 6
Maximum: 1 529 529 0 8

# of Seqs: 200575

After align.seqs(candidate=TDPool1.raw.trim.unique.fasta, template=…/Mothur.source/silva.bacteria.fasta, flip=T) I get:
Some of you sequences generated alignments that eliminated too many bases, a list is provided in TDPool1.raw.trim.unique.flip.accnos. If the reverse compliment proved to be better it was reported.
It took 5067 secs to align 200575 sequences.
Start End NBases Ambigs Polymer
Minimum: 26790 28464 79 0 8
2.5%-tile: 26790 28464 79 0 8
25%-tile: 26790 28464 79 0 8
Median: 26790 28464 79 0 8
75%-tile: 26790 28464 79 0 8
97.5%-tile: 26790 28464 79 0 8
Maximum: 26790 28464 79 0 8

# of Seqs: 1

How did I end up with one sequence?

I’d appreciate any help pinpointing what the heck I’m doing wrong. Thank you!

pschloss · January 24, 2011, 8:37pm

Can you also provide the summary.seqs function calls that you used?

bpyoumans · January 24, 2011, 9:15pm

Here you go Pat. Thank you for your help!
Bonnie Youmans

mothur > summary.seqs(fasta=TDPool1.raw.trim.fasta)

Start End NBases Ambigs Polymer
Minimum: 1 28 28 0 2
2.5%-tile: 1 29 29 0 3
25%-tile: 1 249 249 0 4
Median: 1 349 349 0 4
75%-tile: 1 420 420 0 5
97.5%-tile: 1 450 450 0 6
Maximum: 1 529 529 0 8

# of Seqs: 703350

Output File Name:
TDPool1.raw.trim.fasta.summary

mothur > unique.seqs(fasta=TDPool1.raw.trim.fasta)

Output File Names:
TDPool1.raw.trim.unique.fasta
TDPool1.raw.trim.names

mothur > summary.seqs(fasta=TDPool1.raw.trim.unique.fasta)

Start End NBases Ambigs Polymer
Minimum: 1 28 28 0 2
2.5%-tile: 1 40 40 0 4
25%-tile: 1 299 299 0 4
Median: 1 369 369 0 4
75%-tile: 1 429 429 0 5
97.5%-tile: 1 450 450 0 6
Maximum: 1 529 529 0 8

# of Seqs: 200575

Output File Name:
TDPool1.raw.trim.unique.fasta.summary

mothur > align.seqs(candidate=TDPool1.raw.trim.unique.fasta, template=…/Mothur.source/silva.bacteria.fasta, flip=T)

Reading in the …/Mothur.source/silva.bacteria.fasta template sequences… DONE.
Aligning sequences from TDPool1.raw.trim.unique.fasta …

Some of you sequences generated alignments that eliminated too many bases, a list is provided in TDPool1.raw.trim.unique.flip.accnos. If the reverse compliment proved to be better it was reported.
It took 5067 secs to align 200575 sequences.

Output File Names:
TDPool1.raw.trim.unique.align
TDPool1.raw.trim.unique.align.report
TDPool1.raw.trim.unique.flip.accnos

mothur > summary.seqs(fasta=TDPool1.raw.trim.unique.align)

Start End NBases Ambigs Polymer
Minimum: 26790 28464 79 0 8
2.5%-tile: 26790 28464 79 0 8
25%-tile: 26790 28464 79 0 8
Median: 26790 28464 79 0 8
75%-tile: 26790 28464 79 0 8
97.5%-tile: 26790 28464 79 0 8
Maximum: 26790 28464 79 0 8

# of Seqs: 1

Output File Name:
TDPool1.raw.trim.unique.align.summary
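
One way to tell whether the .align file really holds a single sequence, or whether summary.seqs simply stopped reading early, is to count the FASTA header lines outside of mothur. Below is a minimal Python sketch, assuming ordinary FASTA headers that begin with ">". The function name count_fasta_records and the demo sequences are made up for illustration and are not part of mothur.

```python
import os
import tempfile


def count_fasta_records(path):
    """Count sequences in a FASTA file by counting lines that start with '>'."""
    count = 0
    # Read line by line so the whole file is never held in memory at once.
    with open(path, "r") as handle:
        for line in handle:
            if line.startswith(">"):
                count += 1
    return count


# Small self-contained demo on a throwaway file (the sequences are invented).
with tempfile.NamedTemporaryFile("w", suffix=".fasta", delete=False) as tmp:
    tmp.write(">seq1\nACGT\n>seq2\nAC--GT\n")
    demo_path = tmp.name

demo_count = count_fasta_records(demo_path)
os.remove(demo_path)
print(demo_count)
```

Calling count_fasta_records on TDPool1.raw.trim.unique.fasta and on TDPool1.raw.trim.unique.align should, I would expect, give the same number if align.seqs wrote out every sequence. If the two counts match but summary.seqs still reports 1, the problem is more likely in how the file is being read than in the alignment itself.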

westcott · February 3, 2011, 2:15pm

Could you send your TDPool1.raw.trim.unique.fasta file to mothur.bugs@gmail.com?

bpyoumans · February 3, 2011, 9:00pm

Thank you, westcott. I sent an email with a link to the file.

Bonnie

turn0180 · March 2, 2011, 1:32am

I am also having this same problem (different data set). Several people have tried the data and run into the same problem, but at least one person with version 1.16 of mothur processed it just fine. Has a solution been discovered?

bpyoumans · March 2, 2011, 3:58pm

The solution for me was to download the executable mothur file. I had originally downloaded the source code and compiled it myself. Once I downloaded the executable, everything worked great. (I have to give a shout out to Sarah Westcott for her help!)

I hope that helps.

Bonnie

glecleir · June 17, 2011, 7:51pm

Hi All,

I am not sure if this problem has been resolved, but I have noticed this happening when I am generating files that are too large for my computer to handle. If the file I am working with is larger than the amount of RAM on my computer, I get something similar to what you are seeing. I don’t think the computer is able to open the whole file in that case, so it only gets through the first sequence. I would try switching to a computer with more RAM and then running summary.seqs again. It would usually happen to me after align.seqs, when the files get HUGE.

-Gary
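
To test the file-size idea quickly, one could compare the size of the aligned file against a memory budget before running summary.seqs. Here is a rough Python sketch. The helper name file_exceeds_budget and the budget value are assumptions for illustration: the budget is a number you supply yourself, and the script does not detect how much RAM the machine actually has.

```python
import os
import tempfile


def file_exceeds_budget(path, budget_bytes):
    """Return True if the file at `path` is larger than `budget_bytes`."""
    # os.path.getsize reports the size on disk without opening the file.
    return os.path.getsize(path) > budget_bytes


# Demo on a throwaway 10-byte file; real aligned files are far larger.
with tempfile.NamedTemporaryFile("w", delete=False) as tmp:
    tmp.write("ACGTACGTAC")
    demo_path = tmp.name

too_big_for_5 = file_exceeds_budget(demo_path, 5)
too_big_for_100 = file_exceeds_budget(demo_path, 100)
os.remove(demo_path)
print(too_big_for_5, too_big_for_100)
```

If the .align file turns out to be bigger than the memory you have available, that would be consistent with the behaviour described above, though it would not prove it. Note that earlier in this thread the same symptom was fixed by switching from a self-compiled mothur to the downloaded executable.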
