Hello,
I used the Greengenes reference taxonomy and alignment files. After running chimera.uchime and remove.seqs, I lost 85% of my unique sequences. Why?
Another question: why did screen.seqs not work? Below is the summary output.
mothur > align.seqs(fasta=stability.trim.contigs.trim.good.unique.fasta, reference=gg.refalign, flip=T)
mothur > summary.seqs(fasta=current)
Start End NBases Ambigs Polymer NumSeqs
Minimum: 5 2263 370 0 3 1
2.5%-tile: 9 2266 370 0 4 312
25%-tile: 9 2266 371 0 4 3119
Median: 9 2266 372 0 5 6237
75%-tile: 9 2266 373 0 5 9355
97.5%-tile: 9 2266 376 0 6 12161
Maximum: 13 2293 376 0 8 12472
Mean: 9.01379 2266.03 372.169 0 4.71208
# of Seqs: 12472
mothur > screen.seqs(fasta=stability.trim.contigs.trim.good.unique.align, count=stability.trim.contigs.trim.good.count_table, summary=stability.trim.contigs.trim.good.unique.summary, start=9, end=2266, maxhomop=8)
It took 1 secs to screen 12472 sequences.
mothur > summary.seqs(fasta=current, count=current)
Using 8 processors.
Start End NBases Ambigs Polymer NumSeqs
Minimum: 5 2266 370 0 3 1
2.5%-tile: 9 2266 370 0 4 3339
25%-tile: 9 2266 372 0 4 33387
Median: 9 2266 372 0 5 66774
75%-tile: 9 2266 373 0 5 100161
97.5%-tile: 9 2266 376 0 6 130209
Maximum: 9 2293 376 0 8 133547
Mean: 8.99997 2266 372.232 0 4.62525
# of unique seqs: 12400
total # of seqs: 133547
Looking forward to your suggestions.