Suffering with align.seqs


I’m analyzing a 454 sequencing sample and I’m at the alignment step. My problem is the alignment quality, specifically the percentage of similarity between query and template (SimBtwnQuery&Template). To show this, I plotted the similarity values of all the alignments, ordered from highest to lowest, and this is what I saw.


As you can see, a lot of sequences have very low similarity to their respective match, and I don’t know how to reduce their number or remove them from my analysis.
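For anyone who wants to reproduce this kind of ranking, here is a minimal Python sketch. It assumes the report written by align.seqs is tab-delimited with a header row, that the read name sits in a column called QueryName (an assumption on my part; SimBtwnQuery&Template is the column mentioned above), and that similarity is reported as a percentage. The tiny report and the cutoff of 75 are made up for illustration.

```python
import csv
import io

SIM_COLUMN = "SimBtwnQuery&Template"  # column named in the post
NAME_COLUMN = "QueryName"             # assumed name of the read-ID column


def rank_by_similarity(report_file):
    """Return (read name, similarity) pairs sorted from highest to lowest."""
    reader = csv.DictReader(report_file, delimiter="\t")
    rows = [(row[NAME_COLUMN], float(row[SIM_COLUMN])) for row in reader]
    return sorted(rows, key=lambda pair: pair[1], reverse=True)


def below_cutoff(ranked, cutoff):
    """Return the names of reads whose similarity is below the cutoff."""
    return [name for name, sim in ranked if sim < cutoff]


# Made-up three-read report standing in for the real file.
example = (
    "QueryName\tSimBtwnQuery&Template\n"
    "read1\t98.5\n"
    "read2\t62.0\n"
    "read3\t88.1\n"
)

ranked = rank_by_similarity(io.StringIO(example))
low = below_cutoff(ranked, 75.0)  # hypothetical cutoff, not a recommendation
print(ranked)
print(low)
```

With a real report you would pass an open file handle instead of the in-memory example, for instance `open(path, newline="")`, where `path` points at your own report file. The list of low-similarity names is only a way to see which reads fall under the black square; it does not remove anything from the analysis.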

This is the command I used for the alignment. My database was silva (if I’m right, it is the recommended database):

align.seqs(candidate=sequences.shhh.trim.unique.fasta, template=silva.bacteria.fasta, processors=1, search=kmer, flip=t, threshold=0.75)

I don’t know what to do with the sequences under the black square, how to increase the match quality, or how to remove them. I would be really grateful if somebody could help me.


A few options…

  1. You could just carry on and not worry. This is what 99% of people do. This is what we did in the original paper when we saw that you don’t have to have a close match to get a great alignment.

  2. You could update your reference alignment to the new and larger silva reference alignment.

  3. You can remove them with screen.seqs: