Truncation of chunks out of the alignment

Ltomassini · January 6, 2010, 6:20am

Hello
I am wondering if there is a function that allows me to truncate a chunk at the beginning of the alignment and another chunk at the end. My 16S sequences are aligned, but at the beginning and the end of the alignment the sequences are all staggered relative to each other for a length of about 170 sites. The way I dealt with this when using MEGA 4 was to just truncate those chunks at the beginning and at the end, so that I would be left with the core central part that does not have those empty stretches in between (about 500 nucleotides were left at this point). Is it possible to do the same in mothur? I am wondering if the “filter” command is the one that would work…
Thanks
Letizia

Rewski52 · January 6, 2010, 7:33pm

When you say truncate, you mean that you don’t want the beginning and end sections to be counted when building the distance matrix, right?

If so, generating a hard filter and using the filter.seqs command would work for this. Take a look at the Lane mask to get an idea of what it looks like (http://www.mothur.org/w/images/2/2a/Lane1241.gg.filter). You want to place zeroes in the columns you want to ignore, and ones in the columns you want to count. You need to build the filter so that it is exactly the length of your alignment space. The length of your alignment space depends on whether you aligned against greengenes or silva, and on whether you have already used filter.seqs to remove the columns that contain only dots and dashes.

Hope this helps
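
For anyone who wants to see what such a hard mask looks like in practice, here is a minimal Python sketch, not part of mothur itself. It assumes the mask is simply a string of 0s and 1s with one character per alignment column, as described above. The function names `build_hard_mask` and `apply_hard_mask` and the toy numbers are made up for illustration.

```python
def build_hard_mask(alignment_length, trim_start, trim_end):
    """Return a 0/1 string: 0 = ignore the column, 1 = keep the column.

    Hypothetical helper: masks the first `trim_start` and the last
    `trim_end` columns and keeps everything in between.
    """
    if trim_start < 0 or trim_end < 0:
        raise ValueError("trim lengths must be non-negative")
    if trim_start + trim_end > alignment_length:
        raise ValueError("cannot trim more columns than the alignment has")
    kept = alignment_length - trim_start - trim_end
    return "0" * trim_start + "1" * kept + "0" * trim_end


def apply_hard_mask(sequence, mask):
    """Keep only the characters whose mask position is '1'."""
    if len(sequence) != len(mask):
        raise ValueError("mask must be exactly as long as the aligned sequence")
    return "".join(base for base, flag in zip(sequence, mask) if flag == "1")


# Toy example: a 10-column alignment, dropping 3 columns at the start
# and 2 at the end (real alignments are far longer).
mask = build_hard_mask(10, 3, 2)
trimmed = apply_hard_mask("..ACGTAC..", mask)
```

The mask string would then be saved to a file and handed to filter.seqs. As far as I know the option for this is `hard=`, but check the filter.seqs documentation for your mothur version.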

Ltomassini · January 8, 2010, 4:46am

Thanks! Your answer helped me understand what a hard mask is and when it is useful. I actually ended up not needing a hard mask though, because I realized that “trump=.” would take care of all of the staggered ends of the sequences and make the beginning and the end of the alignment even: if there is just one “.” in a column, the whole column will be deleted, even if other sequences have nucleotide data in that same column.
Thanks again
Letizia
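
The column-dropping rule described above can be sketched in a few lines of Python. This only illustrates the behaviour as stated in this thread (a single “.” in a column removes that column for every sequence). It is not mothur's actual code, and the name `trump_filter` and the toy sequences are invented for the example.

```python
def trump_filter(seqs, trump="."):
    """Drop every column in which at least one sequence has `trump`.

    `seqs` maps sequence names to aligned sequences of equal length.
    Hypothetical helper mimicking the rule described in the post above.
    """
    if not seqs:
        return {}
    length = len(next(iter(seqs.values())))
    if any(len(s) != length for s in seqs.values()):
        raise ValueError("all aligned sequences must have the same length")
    # A column survives only if no sequence has the trump character there.
    keep = [i for i in range(length)
            if all(s[i] != trump for s in seqs.values())]
    return {name: "".join(s[i] for i in keep) for name, s in seqs.items()}


# Toy alignment with staggered ends; "-" is an internal gap, "." marks
# positions before a sequence starts or after it ends.
example = {
    "seqA": "..ACGTAC..",
    "seqB": ".GACGTACT.",
    "seqC": "...CG-ACTG",
}
filtered = trump_filter(example)
# filtered == {"seqA": "CGTAC", "seqB": "CGTAC", "seqC": "CG-AC"}
```

Note that the internal “-” in seqC survives, because only “.” is treated as the trump character here.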

Rewski52 · January 8, 2010, 5:10pm

Good call!

That is definitely the easier way to do it. I guess the hard mask would only be useful if you wanted to mask sequences that DO overlap completely without any “.” or “-” in any of the columns.
