Hi,
After using screen.seqs in the new 454 SOP, I ended up with only 53% of my original sequences. After running the whole pipeline, many of my samples (which I had already run through the pipeline before) ended up with fewer than ~1,000 sequences, where I had previously obtained more than ~7,000.
After aligning, this is what I got:
mothur > summary.seqs(fasta=current, name=current)
Using All_Howler.unique.align as input file for the fasta parameter.
Using All_Howler.names as input file for the name parameter.
Using 8 processors.
Start End NBases Ambigs Polymer NumSeqs
Minimum: 0 0 0 0 1 1
2.5%-tile: 1044 11895 239 0 4 27921
25%-tile: 1044 11895 438 0 4 279203
Median: 1065 13870 479 0 5 558405
75%-tile: 1482 13870 491 0 5 837607
97.5%-tile: 5431 13870 512 0 6 1088888
Maximum: 43116 43116 535 0 7 1116808
Mean: 1757.19 13191.9 445.846 0 4.87609
# of unique seqs: 888386
total # of seqs: 1116808
Output File Names:
All_Howler.unique.summary
[b]Then I ran screen.seqs in this way:
mothur > screen.seqs(fasta=current, name=current, group=All_Howler.groups, end=13870, optimize=start, criteria=95, processors=8)
Afterwards, this is what I got:[/b]
mothur > summary.seqs(fasta=current, name=current)
Using All_Howler.unique.good.align as input file for the fasta parameter.
Using All_Howler.good.names as input file for the name parameter.
Using 8 processors.
Start End NBases Ambigs Polymer NumSeqs
Minimum: 1044 13870 245 0 3 1
2.5%-tile: 1044 13870 290 0 4 15829
25%-tile: 1044 13870 441 0 5 158288
Median: 1071 13870 482 0 5 316576
75%-tile: 1468 13870 497 0 5 474864
97.5%-tile: 5260 13871 516 0 6 617323
Maximum: 5341 16283 535 0 7 633151
Mean: 1578.76 13881.8 457.626 0 5.01822
# of unique seqs: 535028
total # of seqs: 633151
Output File Names:
All_Howler.unique.good.summary
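For reference, here is a quick sketch in Python of how the loss works out from the two summaries above. It uses only the "# of unique seqs" and "total # of seqs" values printed by summary.seqs before and after screen.seqs. The function and variable names are just mine for illustration, and the percentages may differ a little from ones computed from other counts.

```python
def percent_retained(before: int, after: int) -> float:
    """Return the percentage of sequences kept after a filtering step."""
    if before <= 0:
        raise ValueError("'before' must be a positive count")
    return 100.0 * after / before


# Counts copied from the two summary.seqs outputs above.
total_before, total_after = 1116808, 633151
unique_before, unique_after = 888386, 535028

total_kept = percent_retained(total_before, total_after)
unique_kept = percent_retained(unique_before, unique_after)

print(f"total seqs:  kept {total_kept:.1f}%, lost {100 - total_kept:.1f}%")
print(f"unique seqs: kept {unique_kept:.1f}%, lost {100 - unique_kept:.1f}%")
```

Going by these two summaries alone, that is roughly 57% of the total sequences and 60% of the unique sequences kept after screen.seqs.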
[b]As you can see, I lost ~47% of all my sequences. I can't think of any other step that could have caused the low number of reads at the end.
Can anybody tell me whether the screen.seqs command above is written correctly?
Thanks a lot.
A[/b]