The align.seqs command ‘aligns a user-supplied fasta-formatted candidate sequence file to a user-supplied fasta-formatted template alignment’ (see https://www.mothur.org/wiki/Align.seqs). In your case, the ‘user-supplied fasta-formatted candidate sequence file’ is ‘18S.unique.fasta’ and the ‘user-supplied fasta-formatted template alignment’ is ‘pr2_gb203_version_4.5.fasta’.
The problem is that the Protist Ribosomal Reference database (PR2) is not aligned, so you cannot use it as the ‘fasta-formatted template alignment’.
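If you want to confirm this for a reference file yourself, one rough check is whether every sequence in it has the same length, since that holds for any alignment. Below is a minimal sketch in Python, not part of mothur; the function names `fasta_lengths` and `looks_aligned` are my own. Equal lengths are necessary but not sufficient, so treat a "True" only as a hint.

```python
def fasta_lengths(lines):
    """Return the length of each sequence in FASTA-formatted lines."""
    lengths = []
    in_record = False
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A header line starts a new record.
            lengths.append(0)
            in_record = True
        elif in_record:
            # Sequence lines (possibly wrapped) add to the current record.
            lengths[-1] += len(line)
    return lengths


def looks_aligned(lines):
    """True if there is at least one sequence and all have equal length."""
    lengths = fasta_lengths(lines)
    return len(lengths) > 0 and len(set(lengths)) == 1


# Usage (file name taken from the question above):
# with open("pr2_gb203_version_4.5.fasta") as handle:
#     print(looks_aligned(handle))
```

For an unaligned database such as PR2, the sequences differ in length, so the check returns False.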
There are two options: (1) align the PR2 database first using an MSA (Multiple Sequence Alignment) tool, such as MAFFT or MUSCLE; or (2) skip the alignment step, i.e. continue with the mothur pipeline using an alignment-independent approach.
I have been using the second option.
Basically, you skip that step and go directly to pre.cluster. Then, with the cluster command, you can use an alignment-independent clustering algorithm such as Vsearch dgc (method=dgc).
I also think this same question was raised by another colleague here, so try searching for that thread.
I hope this helps.
Kind regards,
@renh@