too many gaps in alignment with template

afioredonno · January 27, 2016, 5:32pm

Dear Patrick, dear mothur users,
I have two questions, one related to Illumina MiSeq read quality (paired-end reads) and one to the way mothur aligns.

I noticed that there are many quite large deletions (often of 10 nucleotides) in a very conserved region after the V4 of the 18S (I’m working with a group of protists). Has anyone else observed the same pattern? Has it been reported already?
I use mothur to align my reads according to a template. The template doesn’t have any gaps in this region and I used a high penalty for opening gaps (gapopen=-5).
In the alignment, I noticed that most often the deletions result in several gaps, like this:

T G A A G T A A T A T G A T T G A T A G G G
T G A A G _ _ _ _ _ _ _ _ _ _ _ A T A G G G -> one deletion of 11 nucleotides, should be one gap
T _ _ _ G _ A _ _ _ A _ G _ _ _ _ A T A G G G -> the flanking region before the gap is “spread” into the gap, resulting in 5 gaps!

Since the number of gaps can make a difference in the distance between otherwise identical (or closely related) sequences (option “calc” with the default “onegap” in dist.seqs), adding artificial gaps will increase the number of final OTUs. I don’t think that using the “nogap” option would be a solution, since there are other parts of the alignment where gaps actually have a meaning (differences in length in the variable helices).
Why does mothur tend to create that many gaps? Is there any option to correct this?
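To make the concern concrete, here is a minimal Python sketch that counts runs of consecutive gap characters in the two aligned reads shown above. It is only an illustration of why a "one gap per run" style of counting would treat the two alignments differently; the function name `count_gap_runs` is hypothetical and this is not mothur's own distance code.

```python
import re

def count_gap_runs(aligned: str, gap_chars: str = "_-.") -> int:
    """Count maximal runs of consecutive gap characters in an aligned read."""
    pattern = "[" + re.escape(gap_chars) + "]+"
    return len(re.findall(pattern, aligned))

# The two aligned reads from the example, written without spaces.
single_gap = "TGAAG" + "_" * 11 + "ATAGGG"   # one contiguous deletion
spread_gap = "T___G_A___A_G____ATAGGG"        # same bases, deletion split up

print(count_gap_runs(single_gap))  # 1
print(count_gap_runs(spread_gap))  # 5
```

If each run is counted as a single difference, the first read is one event away from the template and the second is five, even though both carry the same bases.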
Waiting for your answer,
Anna Maria

pschloss · January 28, 2016, 1:36pm

I’m not 100% sure what you’re asking, but here are a couple of things that I think will help. (1) The extra gaps in the SILVA reference are largely structural and really don’t matter. (2) After aligning and screening the sequences, you should run filter.seqs(trump=., vertical=T), which will remove a lot of the structural gaps. (3) In dist.seqs, if the two sequences you are comparing have gaps in the same positions, those positions are ignored. The punchline is that I don’t think what you’re worried about is an issue.
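As a rough illustration of point (2), the Python sketch below drops alignment columns in which every sequence has a gap, which is the idea behind vertical=T. It is a simplified stand-in, not mothur's implementation; the function name `drop_all_gap_columns` is hypothetical, and it assumes all sequences have the same aligned length.

```python
def drop_all_gap_columns(seqs, gap_chars="-."):
    """Remove columns where every aligned sequence has a gap character."""
    if not seqs:
        return []
    width = len(seqs[0])
    # Keep a column if at least one sequence has a real base there.
    keep = [i for i in range(width)
            if any(s[i] not in gap_chars for s in seqs)]
    return ["".join(s[i] for i in keep) for s in seqs]

aligned = ["A--C", "A-GC"]
print(drop_all_gap_columns(aligned))  # ['A-C', 'AGC']
```

Columns that are gaps in all sequences carry no information for pairwise comparisons, so removing them shortens the alignment without changing which positions differ.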

Pat
