Hi,
I ran screen.seqs(fasta=HX1JDSX01.shhh.trim.align, name=HX1JDSX01.shhh.trim.names, group=HX1JDSX01.shhh.groups, optimize=start, end=6409, criteria=95, processors=20)
My concern is why I'm losing so many sequences in this step. Most of the seqs already seem to cover the same region of the 16S gene before the command. Is it due to my start/end/criteria settings?
Before screen.seqs: 230330 seqs
After screen.seqs: 2632 seqs
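To put a number on the loss, here is a quick Python sketch of the fraction retained. The helper name retained_percent is just my own, not anything from mothur; the two counts are the totals reported above.

```python
def retained_percent(before: int, after: int) -> float:
    """Return the percentage of sequences that survived a filtering step."""
    if before <= 0:
        raise ValueError("'before' must be a positive sequence count")
    return 100.0 * after / before


# Total sequence counts before and after screen.seqs (from this post).
seqs_before = 230330
seqs_after = 2632

kept = retained_percent(seqs_before, seqs_after)
print(f"kept {kept:.2f}% of sequences, removed {100.0 - kept:.2f}%")
```

So only about 1% of the sequences make it through the screen.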
mothur > summary.seqs(fasta=HX1JDSX01.shhh.trim.align, name=HX1JDSX01.shhh.trim.names, processors=20)
Using 20 processors.
             Start    End      NBases   Ambigs  Polymer  NumSeqs
Minimum:     0        0        0        0       1        1
2.5%-tile:   1044     5443     248      0       3        5759
25%-tile:    1044     6091     256      0       4        57583
Median:      1044     6109     273      0       5        115166
75%-tile:    1044     6202     279      0       5        172748
97.5%-tile:  1079     6409     296      0       6        224572
Maximum:     43115    43116    309      0       8        230330
Mean:        1137.62  6131.41  269.069  0       4.44837
# of unique seqs: 89050
total # of seqs: 230330
mothur > summary.seqs(fasta=HX1JDSX01.shhh.trim.good.align, name=HX1JDSX01.shhh.trim.good.names)
Using 1 processors.
             Start    End      NBases   Ambigs  Polymer  NumSeqs
Minimum:     1044     6409     276      0       4        1
2.5%-tile:   1044     6409     287      0       4        66
25%-tile:    1044     6411     290      0       5        659
Median:      1044     6418     292      0       5        1317
75%-tile:    1044     6420     293      0       5        1975
97.5%-tile:  1044     6424     295      0       5        2567
Maximum:     1044     6447     302      0       7        2632
Mean:        1044     6415.7   291.258  0       4.98594
# of unique seqs: 1346
total # of seqs: 2632
Thanks,