My old post about 23S data seems to have vanished when the wiki and forum crashed, so I’m reposting my question.
I started a project to use mothur to classify 23S rRNA sequences. To do this, I’ve tried to create my own alignment database from BLAST results using the SINA aligner. Afterwards I do some quality control on the aligned output file, such as removing spaces from the identifiers and deleting duplicate identifiers. However, when I run align.seqs against example sequences to test the alignment, or when I use an E. coli sequence to get the start and end values for pcr.seqs, I get this error from mothur:
[ERROR]: template is not aligned, aborting.
Does anyone have any advice?
Thanks for your help,
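For readers following along, the cleanup step described in the question (removing spaces from identifiers and deleting duplicate identifiers) could be sketched in Python roughly as below. This is only an illustration, not the poster's actual procedure: the function name clean_fasta is hypothetical, and it assumes that spaces in a header are replaced with underscores and that the first record with a given identifier is the one to keep.

```python
def clean_fasta(lines):
    """Return FASTA lines with space-free headers and duplicate records dropped.

    Assumptions (not from the original post): whitespace in a header is
    collapsed to underscores, and the first record for each identifier wins.
    """
    seen = set()   # identifiers already written
    keep = False   # whether the current record is being kept
    out = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # Join the whitespace-separated parts of the header with underscores.
            name = "_".join(line[1:].split())
            keep = name not in seen
            if keep:
                seen.add(name)
                out.append(">" + name)
        elif keep and line:
            # Sequence line belonging to a record we are keeping.
            out.append(line)
    return out
```

Sequence lines are passed through unchanged, so alignment gaps are preserved. Called on the lines of the SINA output, the returned list can be written back out with one entry per line.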
Most often, when people get this error it’s because one or two sequences are shorter than the rest. Here’s a little trick that may help you figure out which ones:
mothur > set.dir(debug=t)
mothur > filter.seqs(fasta=yourReferenceFile, vertical=T)
Mothur will attempt to filter the sequences. You don’t care about the results, but you should see something like:
Using 1 processors.
[DEBUG]: U68589 length = 943
[DEBUG]: U68590 length = 497
[ERROR]: Sequences are not all the same length, please correct.
This should help you find the short sequence.
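If the reference file is large and the debug output is hard to scan, the same check can be done outside mothur. The following is a minimal Python sketch, not part of mothur: the function names read_fasta and find_odd_lengths are made up for this example, and it assumes that the most common sequence length in the file is the intended alignment length.

```python
from collections import Counter


def read_fasta(handle):
    """Yield (identifier, sequence) pairs from an open FASTA file or any iterable of lines."""
    name, chunks = None, []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Use the first whitespace-delimited token of the header as the identifier.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            chunks = []
        elif name is not None:
            chunks.append(line.strip())
    if name is not None:
        yield name, "".join(chunks)


def find_odd_lengths(handle):
    """Return (expected_length, [(identifier, length), ...]) for sequences that differ.

    Assumption: the most common length is the correct aligned length.
    """
    records = [(name, len(seq)) for name, seq in read_fasta(handle)]
    if not records:
        return None, []
    counts = Counter(length for _, length in records)
    expected = counts.most_common(1)[0][0]
    odd = [(name, length) for name, length in records if length != expected]
    return expected, odd
```

Passing an open file handle for the reference alignment to find_odd_lengths returns the majority length together with the identifiers and lengths of every sequence that does not match it. Gap characters count toward the length, which is what matters for an aligned template.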
That worked great! One note for anyone else who tries this: when I first entered these commands, I got the same output as before and no debug lines appeared. Changing debug=t in the first command to a capital “T” (debug=T) fixed this, and I then saw the sequence lengths.