Hi there,
When I run:
mothur > filter.seqs(fasta=Lema_16S_adults.trim2.unique.good.align, vertical=T, trump=.)
I end up with sequences of only 4 base pairs!
Length of filtered alignment: 4
Number of columns removed: 49996
Length of the original alignment: 50000
Number of sequences used to construct filter: 13104
So I run:
mothur > filter.seqs(fasta=Lema_16S_adults.trim2.unique.good.align)
and I get my sequences again. What is happening?
I decided to use the latter one. Is this OK?
Length of filtered alignment: 1178
Number of columns removed: 48822
Length of the original alignment: 50000
Number of sequences used to construct filter: 13104
Thanks!
kim
How are you running screen.seqs? Can you send us the output from summary.seqs using the fasta and name file you are giving screen.seqs?
Hi Pat,
thanks,
So basically, when I align my sequences I get:
mothur > align.seqs(fasta=Lema_16S_adults.trim2.unique.fasta, reference=silva.bacteria.fasta, flip=t)
             Start    End      NBases   Ambigs  Polymer  NumSeqs
Minimum:     1044     1048     1        0       1        1
2.5%-tile:   1044     3857     30       0       3        351
25%-tile:    1044     6389     279      0       4        3507
Median:      1044     8419     372      0       5        7014
75%-tile:    1044     10303    433      0       5        10520
97.5%-tile:  43007    43116    493      0       5        13676
Maximum:     43116    43116    500      0       5        14026
Mean:        3185.6   10294.7  341.795  0       4.67275
# of Seqs:   14026
so I screen like:
mothur > screen.seqs(fasta=Lema_16S_adults.trim2.unique.align, name=Lema_16S_adults.trim2.names, start=1044)
mothur > summary.seqs()
             Start    End      NBases   Ambigs  Polymer  NumSeqs
Minimum:     1044     1048     3        0       1        1
2.5%-tile:   1044     3855     155      0       4        328
25%-tile:    1044     6333     297      0       5        3277
Median:      1044     8411     379      0       5        6553
75%-tile:    1044     10261    437      0       5        9829
97.5%-tile:  1044     13862    493      0       5        12777
Maximum:     1044     14965    500      0       5        13104
Mean:        1044     8525     358.279  0       4.76389
# of Seqs:   13104
Basically, I then had the problem when I ran dist.seqs and cluster().
I ended up making a Phylip distance matrix and the cluster step worked, but I don't really understand what my mistake was in all that.
Many thanks!!!
So the earliest a sequence ends is at position 1048, which means you're keeping sequences that only run from 1044 to 1048. With trump=., any column where even one sequence has a "." gets removed, so voilà, a 4 bp alignment. Instead, try this:
screen.seqs(fasta=Lema_16S_adults.trim2.unique.align, name=Lema_16S_adults.trim2.names, start=1044, end=6333)
Also, it doesn't look like you're really doing much for quality trimming.
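To see why a single short read collapses the whole alignment, here is a minimal Python sketch of the column-filtering idea. It is not mothur's actual code: the function name filter_columns and the toy reads are made up for illustration, and it only assumes that trump drops a column as soon as any sequence has the trump character there, while vertical drops columns that contain nothing but gap characters.

```python
def filter_columns(seqs, trump=".", vertical=True):
    """Toy stand-in for an alignment column filter (hypothetical, not mothur code)."""
    length = len(seqs[0])
    keep = []
    for col in range(length):
        column = [s[col] for s in seqs]
        if trump is not None and trump in column:
            continue  # trump: one hit in any sequence drops the column
        if vertical and all(c in "-." for c in column):
            continue  # vertical: column holds only gap characters
        keep.append(col)
    return ["".join(s[i] for i in keep) for s in seqs]


# Three made-up aligned reads that all start at the same column;
# the second one ends early, like the read ending at 1048 in the thread.
toy = [
    "..ACGTACGTAC..",
    "..ACGT........",
    "..ACG-ACGTACGT",
]

trumped = filter_columns(toy, trump=".", vertical=True)
untrumped = filter_columns(toy, trump=None, vertical=True)

print(len(trumped[0]), trumped)      # only the columns every read covers
print(len(untrumped[0]), untrumped)  # only the all-gap columns are gone
```

In the toy data the one short read limits the trumped alignment to the four columns it covers, while leaving trump off keeps every column that has at least one base. Screening out the short reads first (for example with an end= cutoff) is what lets trump=. keep a useful number of columns.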
Pat
Thanks Pat
Now everything works…!
I wish I could attend your August workshop, as I am quite new to this and really like mothur! Unfortunately I am on the other side of the world (Australia), so it's a bit far!
Cheers
kim