Hello,
I'm sure this is a recurring problem, but I'm missing something that I cannot figure out.
I used only a forward primer for the 454 amplification of the V6 region.
After:
trim.seqs(fasta=F2.fna, oligos=barcode.oligos, qfile=F2.qual, maxambig=0, maxhomop=8, bdiffs=1, pdiffs=2, qwindowaverage=35, qwindowsize=50, minlength=250, maxlength=500)
I get:
mothur > summary.seqs(fasta=F2.fasta)
             Start   End     NBases  Ambigs  Polymer
Minimum:     1       250     250     0       3
2.5%-tile:   1       278     278     0       4
25%-tile:    1       361     361     0       5
Median:      1       392     392     0       5
75%-tile:    1       423     423     0       5
97.5%-tile:  1       491     491     0       6
Maximum:     1       500     500     0       8
# of Seqs:   14628
After running [b]unique.seqs[/b]:
mothur > summary.seqs(fasta=F2.unique.fasta)
             Start   End     NBases  Ambigs  Polymer
Minimum:     1       250     250     0       3
2.5%-tile:   1       278     278     0       4
25%-tile:    1       360     360     0       5
Median:      1       392     392     0       5
75%-tile:    1       425     425     0       5
97.5%-tile:  1       491     491     0       6
Maximum:     1       500     500     0       8
# of Seqs:   13593
After running align.seqs(candidate=F2.unique.fasta, template=silva.bacteria.fasta, flip=t)
(no sequences were flipped):
mothur > summary.seqs(fasta=F2.unique.align)
             Start   End     NBases  Ambigs  Polymer
Minimum:     1044    5271    221     0       3
2.5%-tile:   1044    6104    277     0       4
25%-tile:    1044    8186    360     0       5
Median:      1044    8507    392     0       5
75%-tile:    1044    9964    425     0       5
97.5%-tile:  1044    13862   491     0       6
Maximum:     1467    15641   500     0       8
# of Seqs:   13593
This summary suggests there is a problem with the alignment. The start positions look consistent (I can remove all the sequences that do not start at 1044), but the end positions vary widely, from 5271 to 15641. If I reject all the sequences whose end position is far from the median, I will lose a large amount of data.
How can I improve this?
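To put a number on that loss, the per-sequence summary could be tallied against a candidate cutoff. This is only a minimal Python sketch. It assumes the summary file written by summary.seqs is tab-delimited with a header row containing "start" and "end" columns. The function name count_kept and the three example rows are made up for illustration and are not data from this run.

```python
import csv
import io


def count_kept(summary_text, start, min_end):
    """Count sequences that start at or before `start` and end at or after `min_end`.

    Assumes tab-delimited text with a header row that has "start" and "end" columns.
    Returns a (kept, total) tuple.
    """
    reader = csv.DictReader(io.StringIO(summary_text), delimiter="\t")
    total = 0
    kept = 0
    for row in reader:
        total += 1
        if int(row["start"]) <= start and int(row["end"]) >= min_end:
            kept += 1
    return kept, total


# Hypothetical rows for illustration only.
example = (
    "seqname\tstart\tend\tnbases\tambigs\tpolymer\tnumSeqs\n"
    "seqA\t1044\t8507\t392\t0\t5\t1\n"
    "seqB\t1044\t5271\t221\t0\t3\t1\n"
    "seqC\t1467\t15641\t500\t0\t8\t1\n"
)

kept, total = count_kept(example, start=1044, min_end=8186)
print(f"{kept} of {total} sequences kept")
```

With a real file, the text could be read with open(...).read() and passed in the same way. Trying a few values of min_end (for example the 25%-tile and the median end positions) would show how quickly the retained count drops.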
Thanks for your help.