PR2.fasta file not aligned

Hi,

I am attempting to align against the template pr2.fasta, but I only get the following error:


mothur > align.seqs(fasta=18S.unique.fasta, reference=pr2_gb203_version_4.5.fasta)

Using 1 processors.

Reading in the pr2_gb203_version_4.5.fasta template sequences… [ERROR]: template is not aligned, aborting.
DONE.
It took 0 to read 0 sequences.

Any help with this would be appreciated.

Regards,

Nicole


Hi NicoleDames14,

The align.seqs command ‘aligns a user-supplied fasta-formatted candidate sequence file to a user-supplied fasta-formatted template alignment’ (see https://www.mothur.org/wiki/Align.seqs). In your case, the ‘user-supplied fasta-formatted candidate sequence file’ is ‘18S.unique.fasta’ and the ‘user-supplied fasta-formatted template alignment’ is ‘pr2_gb203_version_4.5.fasta’.

The problem is that the Protist Ribosomal Reference database (PR2) is not aligned, so you cannot use it as the ‘fasta-formatted template alignment’.
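If you want to confirm this for any reference file before running align.seqs, a quick check can be sketched in Python. This is only a rough sketch: it assumes that "aligned" means every sequence in the file has the same length (gaps included), which is the usual property of an alignment but not necessarily the exact test mothur applies. The function names read_fasta_lengths and looks_aligned are made up for this example.

```python
def read_fasta_lengths(path):
    """Return the sequence length of every record in a FASTA file."""
    lengths = []
    current = None  # length of the record being read; None before the first header
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            if line.startswith(">"):
                # A new header closes the previous record, if there was one.
                if current is not None:
                    lengths.append(current)
                current = 0
            elif current is not None:
                # Sequence lines may be wrapped, so add up all of them.
                current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def looks_aligned(path):
    """True if the file has at least one record and all records share one length.

    Assumption: equal length (gap characters included) is used as a proxy
    for "aligned"; this is not taken from the mothur source code.
    """
    lengths = read_fasta_lengths(path)
    return len(lengths) > 0 and len(set(lengths)) == 1
```

Calling looks_aligned("pr2_gb203_version_4.5.fasta") should return False whenever the sequences in that file differ in length, which is what you would expect from an unaligned database.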

There are two options: (1) align the PR2 database first using an MSA (Multiple Sequence Alignment) tool, such as MAFFT or MUSCLE; or (2) skip the alignment step, i.e. continue with the mothur pipeline using an alignment-independent approach.

I have been using the second option.

Basically, you skip that step and go directly to pre.cluster. Then, with the cluster command, you can use an alignment-independent clustering algorithm such as VSEARCH dgc (method=dgc).


I also think this question has already been raised by another user on this forum, so try searching for that thread.
I hope this helps. Kind regards, @renh@

Thank you @renh@.

I realize now that PR2 is not aligned and should rather be used for taxonomy.

I have managed to figure out what to do, but I will also try your suggested method.

Regards,

Nicole