Building Reference Alignments

Hi-

Could someone point me to sources regarding building reference alignment libraries? I’m unsure as to how the gaps are determined, and what their purpose is, exactly. My advisor wishes me to translate what occurs in mothur to a formal mathematical language, but it’s terribly difficult to create well-defined objects when you don’t really know what they are.

Cheers,
Stephen

The SILVA- and greengenes-based reference alignments were taken from SILVA and greengenes. The gaps are inserted to preserve positional homology and the secondary structure of the rRNA that the gene codes for.
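If it helps with the formal side, one way to picture positional homology is to treat the alignment as a rectangular table: one row per reference sequence, all rows the same length, so that each column is meant to hold bases at the same homologous position. The sketch below is only an illustration of that idea. The sequences are made-up toy data (not real SILVA or greengenes entries), the names `reference_alignment`, `alignment_columns`, and `ungap` are hypothetical, and treating both `-` and `.` as gap characters is an assumption.

```python
# Toy data for illustration only; not taken from SILVA or greengenes.
# Assumption: '-' and '.' both mark a gap (no base at that position).
GAP_CHARS = {"-", "."}

reference_alignment = {
    "refA": "AC-GT.",
    "refB": "ACGGT.",
    "refC": "A--GTT",
}


def alignment_columns(aligned):
    """Return the alignment as a list of columns (one tuple per position).

    Raises ValueError if the rows differ in length, since columns are
    only meaningful when every row spans the same positions.
    """
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return list(zip(*aligned.values()))


def ungap(seq):
    """Strip gap characters to recover the unaligned sequence."""
    return "".join(ch for ch in seq if ch not in GAP_CHARS)


columns = alignment_columns(reference_alignment)
unaligned = {name: ungap(seq) for name, seq in reference_alignment.items()}
```

In this picture the gaps carry no sequence information of their own. Removing them gives back the original sequence, and their only job is to keep each base in the column it shares with the corresponding bases of the other rows.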

Hope this gets you going…
Pat

In fact that’s perfect. I also have a question regarding reference sequence identifiers as per your (2009) paper: how are these chosen?

Thanks,
Stephen

Here’s a description…

http://www.mothur.org/wiki/Silva_reference_files

That’s great, thank you.