Alignment in V4 region

Ray · May 30, 2022, 3:35pm

Hi all,

I’m quite confused about how the align.seqs step in mothur works. The V4 region is only ~250 bases long. However, in the SOP, after alignment the start position is 1969 and the end position is 11551, even though the reads are only ~250 bases. Why does that happen? How was the SILVA database built, and how does mothur carry out the alignment step?

Thanks

pschloss · May 31, 2022, 7:29pm

Hi Ray,

The silva reference alignment (like the greengenes one) has extra columns that contain only gap characters. Those columns don’t contain any data. They are there as padding in case novel sequence diversity is encountered. As an example, the TM7 have an intron in the V1 region, so it’s good to have padding there to accommodate TM7 sequences. The start and end positions you see are therefore column coordinates within that padded alignment, not counts of bases, which is why ~250 bases can run from column 1969 to column 11551. You can learn more about the algorithm in align.seqs by looking at these papers from my group…

Hope these help a bit…
Pat
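
To make the padding idea concrete, here is a minimal Python sketch (not mothur code) of how ~250 bases can report a start of 1969 and an end of 11551 once they sit in a gap-padded alignment. The alignment width of 50000 columns, the even spacing of bases, and the helper names `spread_over_columns` and `summarize_aligned` are all assumptions made up for illustration; a real alignment places bases wherever the reference homology puts them.

```python
# Toy illustration of alignment-column coordinates vs. base counts.
# Nothing here is mothur's implementation; names and sizes are hypothetical.

TOTAL_COLUMNS = 50000  # illustrative alignment width (an assumption)
GAP_CHARS = ".-"       # '.' for leading/trailing padding, '-' for internal gaps


def spread_over_columns(bases, start, end, total_columns):
    """Place `bases` between 1-based columns `start` and `end`.

    Bases are spaced evenly purely for illustration; every other column
    is filled with a gap character.
    """
    n = len(bases)
    span = end - start + 1
    if n < 2 or span < n:
        raise ValueError("need at least 2 bases and enough columns to hold them")
    row = ["."] * total_columns
    for col in range(start, end + 1):
        row[col - 1] = "-"
    for k, base in enumerate(bases):
        # Integer spacing keeps the first base at `start` and the last at `end`.
        col = start + (k * (span - 1)) // (n - 1)
        row[col - 1] = base
    return "".join(row)


def summarize_aligned(seq, gap_chars=GAP_CHARS):
    """Return (start, end, n_bases) using 1-based alignment columns."""
    cols = [i for i, ch in enumerate(seq, start=1) if ch not in gap_chars]
    if not cols:
        return None, None, 0
    return cols[0], cols[-1], len(cols)


read = ("ACGT" * 63)[:250]  # a made-up 250-base read
aligned = spread_over_columns(read, 1969, 11551, TOTAL_COLUMNS)
start, end, n_bases = summarize_aligned(aligned)
print(start, end, n_bases, len(aligned))
```

The start and end come from the columns the first and last bases occupy, while the base count ignores every gap column in between, so the two kinds of numbers are not expected to match.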

Ray · June 2, 2022, 3:54am

Got it, thanks, Pat!
