Alignment due diligence

Hi all, I have been looking around to explore the different options available out there to inspect my alignment. I would like to be able to assess how many of my OTUs are simply a function of my alignment errors. I have seen abstracts with > 50k OTUs (97%)!!! Errors during alignment can simply be inflated by the growing number of sequences.
So basically the question is: what tools should I use to inspect my alignment space, and what critical points should I look at?
Please be generous with details, especially when it comes to opening mothur’s files with other programs, i.e. formats, …


many thanks

O.

You can open the align file in ARB’s alignment editor. But it will likely be a disaster. Most likely this large number of OTUs is not due to alignment problems but to sequencing errors.

Pat, thanks for your reply. I was able to open the alignment file (shhh.trim.unique.good.filter.fasta ~ 120 MB) in both TextWrangler and Sequencher (it took a while with the latter). How would you describe a good alignment? As for sequencing errors, can I use the shhh.qual files to screen for higher quality?

many thanks

O.

ARB is the only way that I can think of.

Hi, I am posting this question in this thread as it is related to my previous question.
I removed some sequences from my shh.trim.unique.align (manually) and re-ran unique.seqs, which gave me a current names file. But now I have a problem: my groups file no longer matches my fasta and names files! I don’t have a record of the deleted sequences. How can I create an accnos file from the deleted sequences in order to remove those sequences from my groups file? This would also help me keep a record of the deleted sequences.

many many thanks

O.

Use the remove.seqs command.

Hi. Thanks for your time.
remove.seqs requires an accnos file, which is what I was after. Here is how I did it: I created an accnos file for the original fasta and one for the new fasta using list.seqs, and then created a third one for the difference, i.e. the removed sequences, using grep in Linux. I did that using the name option to make sure I was not working with unique sequences only. Then I ran remove.seqs with the accnos file for the difference.
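For anyone who prefers a script over grep, the difference step could be sketched roughly as below. This is only a minimal sketch, not the exact commands I ran: it assumes each accnos file holds one sequence name per line, and the file names (original.accnos, current.accnos, removed.accnos) and the function names are made up for illustration.

```python
from pathlib import Path


def read_accnos(path):
    """Return the set of sequence names in an accnos file (assumed: one name per line)."""
    names = set()
    for line in Path(path).read_text().splitlines():
        name = line.strip()
        if name:  # skip blank lines
            names.add(name)
    return names


def removed_names(original_accnos, current_accnos):
    """Names listed in the original accnos file but missing from the current one."""
    return sorted(read_accnos(original_accnos) - read_accnos(current_accnos))


def write_accnos(names, path):
    """Write one sequence name per line."""
    Path(path).write_text("".join(f"{name}\n" for name in names))


# Hypothetical usage (file names are placeholders, not from my run):
# deleted = removed_names("original.accnos", "current.accnos")
# write_accnos(deleted, "removed.accnos")
```

The resulting file is both the record of the deleted sequences and the accnos input for remove.seqs on the groups file.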

thanks

O.