I am wondering if there is a function that allows me to truncate a chunk at the beginning of the alignment and another chunk at the end. My 16S sequences are aligned, but at the beginning and the end of the alignment the sequences are staggered relative to each other over a length of about 170 sites. The way I dealt with this in MEGA 4 was to truncate those chunks at the beginning and at the end, so that I was left with the central core that does not have those “empty lines in the betweens” (about 500 nucleotides were left at that point). Is it possible to do the same in mothur? I am wondering if the “filter” command is the one that would work…
When you say truncate, you mean that you don’t want the beginning and end sections to be counted when building the distance matrix, right?
If so, generating a hard filter and using the filter.seqs command would work for this. Take a look at the Lane mask to get an idea of what it looks like (http://www.mothur.org/w/images/2/2a/Lane1241.gg.filter). You want to place zeroes (0) in the columns you want to ignore, and ones (1) in the columns you want to count. You need to build the filter so that it is exactly the length of your alignment space. The size of your alignment space depends on whether you aligned against greengenes or silva, and on whether you have already used filter.seqs to remove the columns that contain only dots and dashes.
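If it helps, here is a minimal Python sketch of building such a mask string. The function name build_hard_mask and the numbers (840 columns in total, 170 ignored at each end, leaving 500) are only assumptions for illustration based on the figures in the question; substitute the real length of your own alignment.

```python
def build_hard_mask(alignment_length, trim_start, trim_end):
    """Return a string of 0s and 1s, one character per alignment column.

    0 = ignore the column, 1 = keep the column.
    All names and arguments here are hypothetical, not part of mothur.
    """
    if trim_start < 0 or trim_end < 0:
        raise ValueError("trim lengths must not be negative")
    if trim_start + trim_end > alignment_length:
        raise ValueError("trimmed regions are longer than the alignment")
    kept = alignment_length - trim_start - trim_end
    return "0" * trim_start + "1" * kept + "0" * trim_end


# Assumed example: 840 columns, ignore 170 at each end, keep the 500 in between.
mask = build_hard_mask(840, 170, 170)
```

The resulting string could then be saved to a text file and given to filter.seqs through its hard parameter. The mask has to be exactly as long as the aligned sequences, so it is worth checking len(mask) against one of them first.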
Hope this helps
Thanks! Your answer helped me understand what a hard mask is and what it is useful for. I actually ended up not needing a hard mask though, because I realized that “trump=.” takes care of all of the staggered ends of the sequences and makes the beginning and the end of the alignment even: if there is just one “.” in a column, the whole column is deleted, even if other sequences have nucleotide data in that column.
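To make the rule concrete, here is a small Python sketch of the behavior described above: a column is dropped as soon as any sequence has the trump character in it. This only illustrates the idea and is not mothur's own code; the function name trump_filter and the toy sequences are made up.

```python
def trump_filter(seqs, trump="."):
    """Drop every column in which at least one sequence has the trump character.

    seqs is a list of aligned sequences of equal length (hypothetical helper).
    """
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    # Keep a column only if no sequence has the trump character there.
    keep = [i for i in range(length) if all(s[i] != trump for s in seqs)]
    return ["".join(s[i] for i in keep) for s in seqs]


# Toy alignment with staggered ends marked by dots.
aligned = ["..ACGT-A..", ".GACGTTA..", "..ACG-TAC."]
trimmed = trump_filter(aligned)
# trimmed is ["ACGT-A", "ACGTTA", "ACG-TA"]
```

Note that the columns containing “-” are kept here, because only “.” is used as the trump character.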
That is definitely the easier way to do it. I guess the hard mask would only be useful if you wanted to mask sequences that DO overlap completely without any “.” or “-” in any of the columns.