controlling alignment

Kirk · March 27, 2012, 1:04pm

Hi all,

I don’t know of a better place on the forum to ask this question, so pardon my enthusiasm.

I’d like to check the quality of my alignments. However, I wasn’t able to open the alignment with several programs (Mega, ClustalX2, BioEdit). Even after upgrading to 4 GB of RAM the programs still crash, so I guess the problem is software related.

I’d like to check the alignments before screening/filtering (so the largest files), since I noticed some bad alignments in the filtered ones (which will inflate the number of OTUs, I guess). So what do you people use to open and check (or adjust) this many sequences?
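When an alignment is too large for a GUI editor, one option is to summarize it without loading it all into memory. Below is a minimal Python sketch, not a mothur feature: it streams an aligned FASTA file one record at a time and counts how many sequences have a base in each column. The function names `read_fasta` and `column_occupancy` are made up for illustration, and treating both `-` and `.` as gap characters is an assumption about the file.

```python
from typing import Iterable, Iterator, List, Tuple

GAP_CHARS = set("-.")  # assumption: both characters mark gaps


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA lines, one record at a time."""
    name = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].strip(), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def column_occupancy(records: Iterable[Tuple[str, str]]) -> Tuple[List[int], int]:
    """Return (bases per column, number of sequences) for an alignment."""
    counts = None
    n_seqs = 0
    for name, seq in records:
        if counts is None:
            counts = [0] * len(seq)
        elif len(seq) != len(counts):
            raise ValueError(f"{name} has length {len(seq)}, expected {len(counts)}")
        for i, ch in enumerate(seq):
            if ch not in GAP_CHARS:
                counts[i] += 1
        n_seqs += 1
    return (counts or []), n_seqs


# Small in-memory demo; a real file handle can be passed to read_fasta instead.
demo = [">a", "-GA-", ">b", "-G--"]
counts, n_seqs = column_occupancy(read_fasta(demo))
```

Because only one sequence and one list of counters are held at a time, memory use depends on the alignment width rather than on the number of sequences.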

Thanks

pschloss · March 27, 2012, 7:31pm

ARB

We also have a command - align.check - that will tell you how well the 16S rRNA secondary structure is preserved.

Pat

Kirk · April 5, 2012, 7:57am

Dear Pat,

I’m not only worried about how well the alignments fit the secondary structure, but also about general alignment quality.

I often see bases shifted just a couple of positions, clearly out of place (lower case).

-GA-A-CG------
-GA-A-CG------
-GA-G-C--g----
-GA-G-CG------
-GA-A-CG------
-GA-A-CG-G----
-GA-A-CG------
-GA-A-CG------
-GA-A-CG-A----
-GC-Acggtctg--
-GA-G-CG------
-GC-G-CG-C----
-GA-G-GG------
-aa---GG-G----
-GT-G-GG------

So, perhaps these things can be corrected manually in nice conserved blocks, where such aberrations are easily seen, but this seems quite impossible in more variable regions, with the bases scattered over a lot of columns, no? How do you handle this?
This will have an impact on the screening, filtering, chimera detection (or is this (Uchime) independent of the alignment?) and number of OTUs, right?
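Some of the stray bases in the example above can be found automatically. The following is a rough Python sketch of one heuristic, not something mothur or ARB provides: it reports bases that sit in columns where only a small fraction of sequences have a base at all. The function name `flag_stray_bases` and the `max_occupancy` threshold are hypothetical. It would catch the extra lower-case bases in the tenth row of the example, but not a base shifted into a column that many other sequences also occupy.

```python
from typing import Dict, List, Tuple

GAP_CHARS = set("-.")  # assumption: both characters mark gaps


def flag_stray_bases(
    alignment: List[Tuple[str, str]], max_occupancy: float = 0.1
) -> Dict[str, List[int]]:
    """Return {name: [0-based columns]} for bases in sparsely occupied columns.

    A column is "sparse" when at most max_occupancy of the sequences
    have a base (not a gap) there.
    """
    if not alignment:
        return {}
    width = len(alignment[0][1])
    if any(len(seq) != width for _, seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    # Count bases per column across the whole alignment.
    counts = [0] * width
    for _, seq in alignment:
        for i, ch in enumerate(seq):
            if ch not in GAP_CHARS:
                counts[i] += 1

    n_seqs = len(alignment)
    flagged: Dict[str, List[int]] = {}
    for name, seq in alignment:
        cols = [
            i
            for i, ch in enumerate(seq)
            if ch not in GAP_CHARS and counts[i] / n_seqs <= max_occupancy
        ]
        if cols:
            flagged[name] = cols
    return flagged


# Toy alignment: s4 carries two bases in columns that are otherwise empty.
toy = [
    ("s1", "-GA-CG--"),
    ("s2", "-GA-CG--"),
    ("s3", "-GA-CG--"),
    ("s4", "-GAcCGt-"),
]
stray = flag_stray_bases(toy, max_occupancy=0.25)
```

In variable regions, where many columns are legitimately sparse, this kind of check will produce many false positives, so the threshold would need tuning per region.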

I’ve tried ARB to align my sequences, with little or no improvement.
Altering parameters probably will never eliminate all of these mistakes?
Wouldn’t a “de novo” approach be better to calculate the number of OTUs?

I don’t want to nag, you know, just wondering how somebody else deals with this or if I’m misinterpreting some things here …

Kind regards.

ps: bug alert: recalling the reference database after saving it doesn’t seem to work. When realigning, it has to be read in again, which adds to the amount of memory used. (Or could it be because the save parameter is still set to true?)

pschloss · April 5, 2012, 1:14pm

Hi Kirk,

Yeah, we’ve seen a few cases of this, and in general you can go in and manually correct the misaligned portions in the reference alignment and then re-align. The problem with de novo methods is that they pretty much leave out any reference to the secondary structure. Tools like uclust and esprit overestimate the similarity between sequences because they aren’t forced to maintain homologous sites across multiple sequences.

Pat

Kirk · May 11, 2012, 2:57pm

The thing is, scanning through the reference (silva.bacteria.fasta) with ARB, I don’t see any misalignments in those places.

I took a subsample and checked the misaligned sequences, and the mistakes seemed consistent (same faults for same genus), so I was quite confident that it wouldn’t have much impact.

However, I’ve just aligned some Flavobacteria, some of which have identical sequences, and here too I see misalignments, even among the identical ones. Strange.
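The observation about identical sequences can be checked systematically. Here is a minimal Python sketch, assuming the alignment is available as (name, aligned sequence) pairs: it strips the gaps, groups sequences whose ungapped bases are identical, and reports the groups that were aligned in more than one way. The function name `inconsistent_identical` is hypothetical, and treating `.` as equivalent to `-` and ignoring case are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

GAP_CHARS = "-."  # assumption: both characters mark gaps


def inconsistent_identical(alignment: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Map each ungapped sequence that has more than one aligned form
    to the names of the sequences that share it."""
    forms = defaultdict(set)   # ungapped sequence -> distinct aligned forms
    names = defaultdict(list)  # ungapped sequence -> sequence names
    for name, aligned in alignment:
        ungapped = "".join(ch for ch in aligned if ch not in GAP_CHARS).upper()
        # Normalize case and gap style so only base placement is compared.
        forms[ungapped].add(aligned.upper().replace(".", "-"))
        names[ungapped].append(name)
    return {seq: names[seq] for seq in forms if len(forms[seq]) > 1}


# Toy data: a, b and c are the same sequence, but c is aligned differently.
toy = [
    ("a", "-GA-CG-"),
    ("b", "-GA-CG-"),
    ("c", "-G-ACG-"),
    ("d", "-GT-CG-"),
]
conflicts = inconsistent_identical(toy)
```

Any group this returns contains sequences that are identical before alignment yet differ afterwards, which points at the aligner or the reference rather than at the reads themselves.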
