Alignment seqs of different lengths?

Perhaps this has been addressed by another comment, but I haven’t been able to find it. When using the align.seqs command, I’ve noticed that the resulting alignment contains sequences of different lengths: some 1439, some 1460, etc. Most programs require all sequences in an alignment to be the same length, padded with gaps as needed. Is there an option to make mothur do this, so that a mothur-generated alignment can be used with such programs?

Thanks,

Jonathan Badger

Hi Jonathan,

Hmm… I’m not quite sure what you’re referring to. To run align.seqs you have to give mothur a reference alignment. Typically one would use either the SILVA or the greengenes alignment, which are 50,000 and 7,682 columns long, respectively. You would then align your sequences to one of those, and the aligner adds the padding and gaps. I think you might be looking for a de novo aligner, which mothur does not provide. But I think the only reason to go that route would be if you had a protein-coding gene. Perhaps I’ve missed something in your question, though.

Pat

I am using a template; this is the command:
align.seqs(candidate=seq.fasta, template=silva.bacteria.fasta, flip=T, processors=2)

The issue is that the sequences in the resulting seq.align have different lengths, whereas one would expect them all to be the same.

I have reduced the issue to a few sequences so you can see the problem. (attached example.zip)

Jonathan

I’m pretty sure that you’re using the unaligned silva.bacteria.fasta file from the taxonomy outline download. You want the aligned version from the alignment databases download (http://www.mothur.org/wiki/Alignment_database). Sorry for the confusion; this is something I need to work on…
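
For anyone who runs into the same thing, one quick sanity check is to confirm that every sequence in the template file has the same length, gap characters included. The following is a minimal Python sketch of that check, not part of mothur; the function names are made up for illustration, and it assumes a plain FASTA file in which gaps are written as characters such as "-" or ".".

```python
import sys
from collections import Counter


def read_fasta_lengths(lines):
    """Yield (name, length) for each FASTA record.

    Every character on a sequence line is counted, gaps included,
    so the length is the number of alignment columns.
    """
    name, length = None, 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, length
            parts = line[1:].split()
            name, length = (parts[0] if parts else ""), 0
        else:
            length += len(line)
    if name is not None:
        yield name, length


def length_counts(lines):
    """Map each observed sequence length to the number of records with it."""
    return Counter(length for _, length in read_fasta_lengths(lines))


def looks_aligned(lines):
    """True if there is at least one record and all records share one length."""
    return len(length_counts(lines)) == 1


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical script name): python check_lengths.py silva.bacteria.fasta
    with open(sys.argv[1]) as handle:
        counts = length_counts(handle)
    for length, n in sorted(counts.items()):
        print(f"{n} sequence(s) of length {length}")
    print("all the same length" if len(counts) == 1 else "lengths differ")
```

If the template reports more than one length, it is most likely an unaligned file. The same check can be run on the seq.align output.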

Yes, that was the problem. Thanks.

Jonathan