
Hi there,
When I run:
mothur > filter.seqs(fasta=Lema_16S_adults.trim2.unique.good.align, vertical=T, trump=.)

I end up with sequences of only 4 base pairs!

Length of filtered alignment: 4
Number of columns removed: 49996
Length of the original alignment: 50000
Number of sequences used to construct filter: 13104

So I run:
mothur > filter.seqs(fasta=Lema_16S_adults.trim2.unique.good.align)
and I get my sequences back. What is happening?
I decided to use the latter one. Is this OK?

Length of filtered alignment: 1178
Number of columns removed: 48822
Length of the original alignment: 50000
Number of sequences used to construct filter: 13104


How are you running screen.seqs? Can you send us the output from summary.seqs using the fasta and name file you are giving screen.seqs?

Hi Pat,
So basically, when I align my sequences I get:

mothur > align.seqs(fasta=Lema_16S_adults.trim2.unique.fasta, reference=silva.bacteria.fasta, flip=t)

              Start     End       NBases    Ambigs   Polymer   NumSeqs
Minimum:      1044      1048      1         0        1         1
2.5%-tile:    1044      3857      30        0        3         351
25%-tile:     1044      6389      279       0        4         3507
Median:       1044      8419      372       0        5         7014
75%-tile:     1044      10303     433       0        5         10520
97.5%-tile:   43007     43116     493       0        5         13676
Maximum:      43116     43116     500       0        5         14026
Mean:         3185.6    10294.7   341.795   0        4.67275
# of Seqs:    14026

So I screen like this:
mothur > screen.seqs(fasta=Lema_16S_adults.trim2.unique.align, name=Lema_16S_adults.trim2.names, start=1044)
mothur > summary.seqs()

              Start     End       NBases    Ambigs   Polymer   NumSeqs
Minimum:      1044      1048      3         0        1         1
2.5%-tile:    1044      3855      155       0        4         328
25%-tile:     1044      6333      297       0        5         3277
Median:       1044      8411      379       0        5         6553
75%-tile:     1044      10261     437       0        5         9829
97.5%-tile:   1044      13862     493       0        5         12777
Maximum:      1044      14965     500       0        5         13104
Mean:         1044      8525      358.279   0        4.76389
# of Seqs:    13104

Basically, I then had the problem when I ran dist.seqs and cluster(). I ended up making a Phylip distance matrix and the clustering worked, but I don't really understand what my mistake was in all that... :roll:

Many thanks!!!

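So the earliest a sequence ends is at position 1048. With trump=., any column where even one sequence has a "." gets removed, so you're only keeping the columns that every sequence covers, which run from 1044 to 1048. Voilà, a 4 bp alignment. Instead try this…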

screen.seqs(fasta=Lema_16S_adults.trim2.unique.align, name=Lema_16S_adults.trim2.names, start=1044, end=6333)
Also, it doesn't look like you're really doing much for quality trimming.
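
To make the effect above concrete, here is a minimal Python sketch on a made-up 12-column alignment. It is a toy model, not mothur's implementation: the functions `screen_seqs` and `filter_seqs`, the sequence names, and the sequences themselves are all hypothetical, and the start/end semantics (a sequence must start at or before `start` and end at or after `end`) are assumed from how the commands are used in this thread.

```python
# Toy model of the screen.seqs -> filter.seqs interaction discussed above.
# Hypothetical data and helper names; this is NOT mothur's own code.

GAP_CHARS = ".-"

TOY = {
    "short": "..ACGT......",  # first base in column 3, last base in column 6
    "mid":   "..ACGTAC-G..",  # columns 3 to 10
    "long":  "..AC-TACGGT.",  # columns 3 to 11
    "late":  ".....TACGGT.",  # columns 6 to 11
}


def screen_seqs(aligned, start, end=None):
    """Keep sequences starting at or before `start` and, if `end` is
    given, ending at or after `end` (1-based alignment columns)."""
    kept = {}
    for name, seq in aligned.items():
        cols = [i + 1 for i, ch in enumerate(seq) if ch not in GAP_CHARS]
        if not cols or cols[0] > start:
            continue
        if end is not None and cols[-1] < end:
            continue
        kept[name] = seq
    return kept


def filter_seqs(aligned, vertical=True, trump=None):
    """Drop columns that contain `trump` in any sequence and, if
    `vertical` is set, columns that are gaps in every sequence."""
    seqs = list(aligned.values())
    keep = []
    for col in range(len(seqs[0])):
        column = [s[col] for s in seqs]
        if trump is not None and trump in column:
            continue
        if vertical and all(ch in GAP_CHARS for ch in column):
            continue
        keep.append(col)
    return {n: "".join(s[c] for c in keep) for n, s in aligned.items()}


def alignment_length(aligned):
    """Length of the (equal-length) sequences in an alignment."""
    return len(next(iter(aligned.values())))


# Screening on start only keeps the very short sequence, and trump='.'
# then trims every column down to what that short sequence covers.
only_start = filter_seqs(screen_seqs(TOY, start=3), trump=".")

# Adding an end position removes the short sequence before filtering.
with_end = filter_seqs(screen_seqs(TOY, start=3, end=10), trump=".")

# Without trump, only the all-gap columns are removed.
no_trump = filter_seqs(screen_seqs(TOY, start=3))

print(alignment_length(only_start), alignment_length(with_end),
      alignment_length(no_trump))
```

In this toy case the start-only screen leaves the "short" sequence in the set, so the trump filter keeps just the columns it covers. Screening with an end position as well removes that sequence first, and the filtered alignment stays much longer.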


Thanks Pat :smiley:
Now everything works…!
I wish I could attend your August workshop, as I am quite new to this and really like mothur! Unfortunately I am on the other side of the world (Australia), so it's a bit far!