classify.seqs error in mothur-1.22

Hi there,

I’ve come across a potential bug with classify.seqs in the new version of mothur-1.22. We are using the greengenes taxonomy and fasta file downloaded from http://greengenes.lbl.gov/Download/Sequence_Data/Fasta_data_files/current_GREENGENES_gg16S_unaligned.fasta.gz as the reference. With past versions of mothur, we have been able to run this with no errors, but for some reason with version 1.22, we get warnings like this:

Warning: cannot find taxon Actinobacteria_ in reference taxonomy tree at level 2 for HA66EFZ01EIYUK. This may cause totals of daughter levels not to add up in summary file.
Warning: cannot find taxon Actinobacteria_ in reference taxonomy tree at level 2 for HAFFLFH02HXAK3. This may cause totals of daughter levels not to add up in summary file.
Warning: cannot find taxon Actinobacteria_ in reference taxonomy tree at level 2 for HAFFLFH02JDFYX. This may cause totals of daughter levels not to add up in summary file.
Warning: cannot find taxon Actinobacteria_ in reference taxonomy tree at level 2 for HAFFLFH02I8XCF. This may cause totals of daughter levels not to add up in summary file.
Warning: cannot find taxon Actinobacteria_ in reference taxonomy tree at level 2 for HAFFLFH02I5U00. This may cause totals of daughter levels not to add up in summary file.


The warnings come up for the following taxa:

Actinobacteria_
Lentisphaerae_
Fusobacteria_
Spirochaetes_

All of these have _(class) in the text string, so maybe the parentheses are the issue? That said, this warning has never come up with past versions of mothur using the same greengenes taxonomy files.

Thanks very much for your help!

Best,
Emiley

Hi Emiley,
The problem is occurring because when mothur sees parentheses in a taxonomy file, it assumes that they contain a confidence score and removes them. I am working on a fix for this, which will be part of our next release. In the meantime, if you remove the parentheses you should be fine.
-Sarah
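
For anyone who wants to script the workaround, here is a minimal sketch of one way to remove the parentheses from the reference taxonomy file. It is not part of mothur. It assumes the taxonomy file has two tab-separated columns (sequence name, then the semicolon-delimited taxonomy string), and the function names and file paths are hypothetical. Only the "(" and ")" characters are dropped, so a taxon such as Actinobacteria_(class) would become Actinobacteria_class.

```python
import sys


def strip_parentheses(line):
    """Remove '(' and ')' from the taxonomy column of one line.

    Assumes a tab separates the sequence name from the taxonomy string.
    Lines without a tab are returned unchanged (minus the newline).
    """
    name, sep, taxonomy = line.rstrip("\n").partition("\t")
    if not sep:
        return name
    cleaned = taxonomy.replace("(", "").replace(")", "")
    return name + "\t" + cleaned


def clean_file(in_path, out_path):
    """Write a copy of the taxonomy file with parentheses removed."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            dst.write(strip_parentheses(line) + "\n")


if __name__ == "__main__":
    # Hypothetical usage: python strip_parens.py in.taxonomy out.taxonomy
    if len(sys.argv) == 3:
        clean_file(sys.argv[1], sys.argv[2])
```

The cleaned file would then be passed to classify.seqs as the reference taxonomy in place of the original. The sequence names in the first column are left untouched so they still match the reference fasta file.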