Error while processing custom nifH database (reference and taxonomy files)

I am getting the following errors while trying to classify nifH query sequences using classify.seqs with custom nifH reference and taxonomy files. I have attached a Dropbox link to the nifH query, reference and taxonomy files, and a screenshot of the error. I used mothur v.1.40.5.

For example,
AHJU02000025 is already in your taxonomy file, names must be unique
‘AHJU02000025’ is in your template file and is not in your taxonomy file. Please correct.
In short, errors like these were shown for all 100 entries present in the reference and taxonomy files.

Could anybody help me solve this issue?

Thanks in advance


Are you seeing this issue with our current version, 1.43.0? If so, can you send me your reference files so I can take a closer look for you?

Dear Westcott,
Thanks. To check whether the problem is due to the version, I ran the same mothur v.1.40.5 with the mcrA taxonomy and reference files given in 10.1016/j.mimet.2014.05.006. It ran properly without any problem. I will send the files to your email address. I'll also try running the newest version. Thanks
Dinesh S L

Thanks for sending your files. The references you sent include duplicate lines for several sequences. For example, sequence CP020898 appears on lines 19, 43 and 55. Mothur expects the reference sequence names to be unique. Removing the duplicates from both files can be done with the list.seqs and get.seqs commands. Here’s how:

mothur > list.seqs(fasta=nifH100.fasta) - list unique names in fasta file

mothur > get.seqs(fasta=current, taxonomy=nifH100.taxonomy) - select only unique names from references. Command will generate a bunch of warnings about the duplicate names

mothur > classify.seqs(fasta=nifHqueryseqs.fas, reference=nifH100.pick.fasta, taxonomy=nifH100.pick.taxonomy) - classify your sequences using references without duplicates
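If you want to see which names are repeated, and on which lines, before running the commands above, a small script outside of mothur can report them. This is only a rough sketch and not part of mothur: the function name duplicate_names is made up for illustration, and it assumes each fasta header is ">" followed by the sequence name, and each taxonomy line starts with the sequence name followed by whitespace and the taxonomy string.

```python
from collections import defaultdict


def duplicate_names(lines, is_fasta):
    """Map each repeated sequence name to the 1-based line numbers where it occurs.

    Hypothetical helper, not a mothur function. `lines` is any iterable of
    text lines (an open file handle works).
    """
    seen = defaultdict(list)
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if is_fasta:
            if not line.startswith(">"):
                continue  # sequence data, not a header line
            parts = line[1:].split()
        else:
            # Assumed taxonomy layout: name, whitespace, taxonomy string
            parts = line.split()
        if not parts:
            continue  # header with no name on it
        seen[parts[0]].append(number)
    # Keep only the names that occur more than once
    return {name: numbers for name, numbers in seen.items() if len(numbers) > 1}
```

For example, calling duplicate_names(open("nifH100.fasta"), is_fasta=True) and duplicate_names(open("nifH100.taxonomy"), is_fasta=False) should list names such as CP020898 together with the lines they appear on. An empty result for both the original files or the .pick files means the names are unique.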

Dear Westcott,
Your solution worked. Thanks a lot.
Dinesh S L