Hello,
I am working on classifying 23S sequences using the ugreen database. I am using Mothur v1.47.0, and I formatted my own taxonomy and align files to make them Mothur-compatible. I started out with the SILVA LSU database, but a huge number of my sequences came back unclassified, which is why I turned to a database specifically curated for 23S rDNA sequences.
When I try to run classify.seqs I get the following error: “‘XXXX’ is in your template file and is not in your taxonomy file. Please correct.”
To correct this, I then ran list.seqs(fasta=final_edit_23S.pcr.align) followed by get.seqs(taxonomy=final_edit_23S.tax, accnos=current), but got a new error:
“[WARNING]: final_edit_23S.tax does not contain any sequence from the .accnos file.
Selected 0 sequences from final_edit_23S.tax.”
I double-checked that my align file has twice as many lines as my taxonomy file. The formatting looks correct on everything. Please let me know if you have any ideas on where to go from here.
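A line count only shows that the two files hold the same number of records, not that the names agree. Below is a minimal Python sketch of a name-level comparison. It assumes a sequence name is everything after `>` up to the first whitespace, and that the taxonomy name is the first whitespace-delimited field on each line. The function names are hypothetical helpers, not part of Mothur.

```python
def fasta_names(lines):
    """Collect sequence names: text after '>' up to the first whitespace."""
    names = []
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.append(fields[0])
    return names


def taxonomy_names(lines):
    """Collect the first whitespace-delimited field of each non-empty line."""
    names = []
    for line in lines:
        fields = line.split()
        if fields:
            names.append(fields[0])
    return names


def compare_names(fasta_lines, tax_lines):
    """Return (names only in the FASTA, names only in the taxonomy), sorted."""
    in_fasta = set(fasta_names(fasta_lines))
    in_tax = set(taxonomy_names(tax_lines))
    return sorted(in_fasta - in_tax), sorted(in_tax - in_fasta)


# Small demo on one shortened record from each file (sequence is a placeholder).
demo_fasta = [">KM817942;Eukaryota;Ochrophyta;\n", "GCGAAGAAGG\n"]
demo_tax = ["KM817942\tEukaryota;Ochrophyta;\n"]
only_fasta, only_tax = compare_names(demo_fasta, demo_tax)
print("only in FASTA:", only_fasta)
print("only in taxonomy:", only_tax)
```

To run it on the real files, pass open file handles instead of lists, for example `compare_names(open("final_edit_23S.pcr.align"), open("final_edit_23S.tax"))`. If both returned lists are empty, the names match exactly.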
Thanks!
First several lines of my taxonomy file:
JN603992 Eukaryota;Discoba;Euglenozoa;Euglenida;Euglenales;Monomorphina;Monomorphina_rudicula;
KM817942 Eukaryota;Ochrophyta;Synurophyceae;Synurales;Synurales_X;Mallomonas;Mallomonas_bronchartiana;
DQ629271 Eukaryota;Streptophyta;Embryophyceae;Embryophyceae_X;Embryophyceae_XX;Myurium;Myurium_hochstetteri;
L42985 Eukaryota;Chlorophyta;Chlorophyceae;Chlamydomonadales;Chlamydomonadales_X;Carteria;Carteria_crucifera;
KC598087_a Eukaryota;Ochrophyta;Eustigmatophyceae;Eustigmatophyceae_X;Eustigmatophyceae_XX;Nannochloropsis;Nannochloropsis_oculata;
KC598087_b Eukaryota;Ochrophyta;Eustigmatophyceae;Eustigmatophyceae_X;Eustigmatophyceae_XX;Nannochloropsis;Nannochloropsis_oculata;
FJ805954 Bacteria;Oxyphotobacteria;Nostocales;Chroococcidiopsaceae;Chroococcidiopsis;uncultured_Chroococcidiopsis_sp.;
KM590702 Eukaryota;Ochrophyta;Synurophyceae;Synurales;Synurales_X;Synura;Synura_curtispina;
FJ805858 Bacteria;Oxyphotobacteria;Nostocales;Chroococcidiopsaceae;Chroococcidiopsis;uncultured_Chroococcidiopsis_sp.;
KM817940 Eukaryota;Ochrophyta;Synurophyceae;Synurales;Synurales_X;Mallomonas;Mallomonas_aerolata;
NC_023033 Bacteria;Oxyphotobacteria;Thermosynechococcales;Thermosynechococcaceae;Thermosynechococcus;Thermosynechococcus_sp._NK55a;
JQ356829 Eukaryota;Discoba;Euglenozoa;Euglenida;Euglenales;Cryptoglena;Cryptoglena_sp._1_JIK-2012;
KM817972 Eukaryota;Ochrophyta;Synurophyceae;Synurales;Synurales_X;Mallomonas;Mallomonas_matvienkoae;
KM817929 Eukaryota;Ochrophyta;Synurophyceae;Synurales;Synurales_X;Mallomonas;Mallomonas_acaroides;
CP000239_a Bacteria;Oxyphotobacteria;Eurycoccales;Eurycoccales_Incertae_Sedis;Synechococcus;Synechococcus_sp._JA-3-3Ab;
CP000239_b Bacteria;Oxyphotobacteria;Eurycoccales;Eurycoccales_Incertae_Sedis;Synechococcus;Synechococcus_sp._JA-3-3Ab;
KF907430 Eukaryota;Cryptophyta;Cryptophyceae;Cryptophyceae_X;Cryptomonadales;Cryptomonas;Cryptomonas_phaseolus;
First several lines of my alignment file:
>JN603992;Eukaryota;Discoba;Euglenozoa;Euglenida;Euglenales;Monomorphina;Monomorphina_rudicula;
TCAAGAGGGAAACAGCCCAGATCACCGTTTAAGGCCCCTAAATAATTACTAAGTGGTAAAGGAGGTAATCGTACATAGACAACCAGGAGGTTTGCCTAGAAGCAGCCACCCTTTAAAAATAGCGTAATAGCTTACTGGTCAAGTGCTTTTGCGCCGAAAATGAATGGGACTAAGTAATTTGCCGAAAATGTGAGATAATATCGGTAGAAGAGCGTTCTGTTTAAGTTTGAAGCAATAGTGTAAACAGTTGTGGACTAATCAGAAGTGAGAATGTCGGCTTAAGTAACGAAAACATTGGTGGAAATCCAATGCCCCGAAAACCTAAGGATTCCTTCGCAAGGTTCGTCCACGAAGGGTTAGTCAGGACCTAAAATGAGGTTTAAAAACGTAATTGATGGATAACGGGTTAATATTCCCGTACTGCATTTTATATGGTTAGGAGTGACGGAGAAGGCTATATCATCCAAGTTTTGGTTTTTGGTTTAATTTTTCAAGGTGTTGAGATATACTAAAAAAAGTATATTGAGCTGAGAAGAGAATACGAGAGATACTACGGTATTGAAAGTGATAAATGTCATACTTCCTAGAAAAACTTCTTTAATTTCTTTAATTTGTACCTGTACCTTAAACTGACACAGGTAGGTTGGTAGAGAATACCAAGGGGCGCGAGATAACTCTTTCTAAGGAACTCGGCAAAATGACTCTGTAACTTCGGGAGAAAGAGTGCCACATTTTTGTGGTCGCAGTAAACGGGCCCAAGCGACTGTTTACCAAAAACACAGGTCTCCGCTAAGTTGTAAAACTATGTATGGGGGCTGACGCCTGCCCAGTGCCGGAAGGTTAAAGAAATTGGTTAGCTTACGCGAAGCTGGTGACTGAAGCCCTGGTGAACGGCGGCCGTAACTATAACGGTCCTAAGGTAGCGAAATTCCTTGTCGGGTAAGTTCCGACCCGCACGAAAGGCGTAACGATTTGGGTACTGTCTCAGAAAGAGACTCGGTGAAATAGAATTGACTGTGAAGATGCGGTCTACTTGCACTTGGACAGAAAGACCCTATGAAGCTTTACTGTAACTTGAGACTGGTTTTGGGCTTTTCTTGCGCAGCATAGGTGGGAGGCTATGATTCTCTTTTTCGGAAGTTGAGGAGCCGCAATGTGAGATACCACTCTAGAAAAGTTAAAAATCTAATGACATTTTTTTAGCAAGATGTTTGACACTTTCAGGCGGGCAGTTTTACTGGGGCGGTAGCCTCCTAAAAAGTAACGGAGGCGTACAAAGGTTTTCTCAGGCTGGACGGAAATTAGTTGTAGAGTGTAAAGATATAAGAAAGCTTGACTGTGAGACCTACAAGTCGAACAGAGACTAAAGTCGGTCTTAGTGATCCGACGGTGCTGTATGGAAAGGCCGTCGCTCAACGGATAAAAGTTACTCTAGGGATAACAGGCTGATCTCCCCCAAGAGTTCACATCGACGGGGAGGTTTGGCACCTCGATGTCGGCTCATCGCAACCTGGGGCGGAAGTACGTTCCAAGGGTTGGGCTGTTCGCCCATTAAAGCGGTACGTGAGCTGGGTTCAGAACGTCGAGAGACAGTTCGGTCCATATCCGGTGTGAGTGTTAGAGTATTGAAAGGAGCTTTCCTTAGTACGAGAGGACCGGGAAGGACACACCACTGATGTGCCAGTTTTTGTACCAACAGAATATGCTGGGTAGTCACGTGTGGAGTGGATAACTGCTGAAAAGCATAAA
>KM817942;Eukaryota;Ochrophyta;Synurophyceae;Synurales;Synurales_X;Mallomonas;Mallomonas_bronchartiana;
GCGAAGAAGGACGTGGCTTCCGGCGAAACGTTTCGGGGAGTTGGAAGTAAGCTTTGATCCGAAAGTGTCCGAATGAGGAAACTCTAAAACTTATTACTGAATCTATAAGTAAAAAAAGAGCGAACCTAGGGAACTGAAACATCTTAGTACCTAGAGGAAAAGAAAGTAACAACGATTCCCTAAGTAGTGGCGAACGAAACGGGATCAGCCTAAACTTTTTAGAAATAAAAAGGGTTGTGGGGTAAGAAAGAGAGTGAATTGAGTAACTTTTAATAGTGAAAAGATTTTTCGGAATCATTTCATTATAAGAAATTAGGTGAAATAACTGGAAAGTTATACCAAAGAAAGTGATAGTCTTGTAACCGAAAATTTTTTAATAATTCTTTATTCCCGAGTAGCATGGGACACGTGGAATCCCGTGTGAATCTGCGAGGACCACCTCGTAAGGCTAAATATTCCTGGATGTCCGATAGCGAATAGTACCGCGAGGGAAAGGTGAAAAGAACCCCGGGAGGGGAGTGAAATAGAACGTGAAATTGTAAGCCCACAAACAGAAGGAGAACGACTTAGCGTTTAACTTCGTGCCTGTTGAAGAATGTTCCGGCGACTTATAGTTAGTGGCAGGTTAAGATAGAGATATCGAAGCCAGAGTGAAAGCGAGCTTGAATATAGAGCGTTGGTCACTAGTTATAGACCCGAACCCGGTTGATCTAACCATGGCCAGGATGAAACTTGGGTAACACTAAATGGAGGTCCGAACCGACTGATGTTGAAAAATCAGCGGATGAGTTGTGGTTAGGGGTGAAATGCCAATCGAAACCGGAGCTAGCTGGTTCTCCCCGAAATGTATTTAGGTACAGCGGTTAATATTATAGCGTAGGGGTAAAGCACTGTTTCGGTGCGGGCTGGTAACTCGGTACCAAATCGAGGCAAACTCTGAATACTACGTGTACAATTAACCAGTAAGACTATGGGGGATAAGCTTCATAGTCAAGAGGGAAACAGCCCAGATCACCAGCTAAGGCCCCTAAATAATTACTAAGTGATAAAGGAAGTGGAAAGGCTTAGACAACCAGGAGGTTTGCTTAGAAGCAGCAATCCTTTAAAAAGTGCGTAATAGCTTACTGGTCTAGCAATTCTGCGCCGAAAACTTACGGGACTAAGTAATTTGCCGAAGCTGTGAGATATACTTTTTGTATATCGGTAGGGGAGCGTTCTGTTGTAGGTTGAAGTGTTAGCGAAAGCGGGCATGGACGAAACAGAAGTGAGAATGTCGGATTGAGTAACGAAAACATTGGTGAGAATCCAATGCTCCGAAACCCTAAGGTTTCCTCCGCAAGGCTCGTCCACGGAGGGTGAGTCAGGTCCTAAGGTGAGGCTGAGAAGCGTAGTCGATGGACAATAGGTTAATATTCCTATACTGATTTTTACTGGTATCGAGGGACGAAGAAGGCTAAACTAGCCAGATGTTGGTTACTGGTTTAAGAATCGAGGTGTTGAAGATTAGAGAAAATAATCTGAGCTGAGAAACGAATACAAGATTGTAAAAAATCAAAGTAGTTGATGTCATACTTTCAAGAAAAGCTCGCTATGCCTTAAGTAAAAATCACCTGTACCTTAAACCGACACAGGTAGGGAGGTAGAGAATACCAAGGAGCGCGGGAGAACCCTCCTTAAGGAACTCGGCAAAATAGCTCTGTAACTTCGGAAGAAAGAGTGCCTTTTAGGCTGCAATATCAAGTTCCAAGCAACTGTTTAGCAAAAACACAGGTCTCCGCAAAGTGGTAACACGACGTATGGGGGCTGACGCCTGCCCAGTGCCGGAAGGTTAAAGAAGTCGGTTAGCGCAAGCAAAGCTGGTAACTGAAGCCCCGGTGAACGGCGGCCGTAACTATAACGGTCCTAAGGTAGCGAAATTCCTTGTCGGGTAAGTTCCGACCCGCACGAATGGCGTAATGATTTGGAAGCTGTCTCAAGGAGGGACTCGGTGAAATAGAACT
GGCTGTGAAGATGCGGCCTACCTGCGCCTGGACAGTAAGACCCTATGAAGCTTTACTGTATCCTGAAATTGGATTTCGATTTTATTTGCGCAGAATAGGTGGGAGACATTGAAACTTTGCTTTCGGGCTTAGTTGAGTCATTAGTGAGATACCACTCTGATAAAATTAAAATTCTAACTTCAAACCGTAAGCCGGTTGAAAGACAGTTTCAGGTGGGCAGTTTGACTGGGGCGGTCGCCTCCCAAAAGGTAACGGAGGCGTACAAAGGTTCCCTCAGATCGGTCAGAAATCGATCTTCGAGTGTAAAGGCAAAAGGGAGCTTAACTGTGAGACAAACAAGTCGAGCAGAGACGAAAGTCGGTCTTAGTGATCTGACGATGCTGTGTGGAAAGGTCGTCACTCAACGGATAAAAGTTACTCTAGGGATAACAGGCTGATCTCCCCCAAGAGTTCCTATCGACGGGGAGGTTTGGCACCTCAATTGGGGGCACATTTTTAGTAATAGAATGTGACAATTTGGCTATATGCTGGAAACTCCGTTCACGAAAAGTGATGTAGAATCCTAAAATACTGCATCATCAAAAATGATGAAGTAAAAATTTTTAGGTGCGGACAATCAGCAGGGAAGTTTACCATTCTTTTGGTATAACCCCTCAACGACTACACGCCGAACAACCAATTTTTTGGTTGATGATATAGTCTGATCTTTTAGGCGACTAAAAGGGCTTTTAAAAAGCCATAGAATTTTAAAGGTTGTTAGAACCTTTGAATATTTTATTAACATATTGCGATGTCGGCTCATCGCATCCTGGGGCGGTAGTACGTCCCAAGGGTTGGGCTGTTCGCCCATTAAAGCGGTACGTGAGCTGGGTTCAGAACGTTGCGAAACAGTTCGGTCCATATCCGGCGTAGGCGTTAGAGTATTGAGAGTGACCTTTCTCTAGTACGAGAGGACCGAGAAAGACATACCTCTAGTGTACCAGTTATCGTGCCATCGGTAGACGCTGGGTAGCTATGTATGGTTAGGATAACCGCTGAAAGCATCT
>DQ629271;Eukaryota;Streptophyta;Embryophyceae;Embryophyceae_X;Embryophyceae_XX;Myurium;Myurium_hochstetteri;
GCACCCAGAGACGAGGAAGGGCGTAGCAAGCGACGAAATGCTTCGGGGAGCTGAAAATAAGCATGGATCCGGAGATTCCCGAATAGGTTAACCTTTTGAACTGCTGCTGAATTCATAGGCAGGCAAGAGACAACCTGGCGAACTGAAACATCTTAGTAGCCAGAGGAAAAGAAAGCAAAAGCGATTCCCGTAGTAGCGGCGAGCGAAATGGGAGTAGCCTAAACCGTGAAAACGGGGTTGTGGGGGAGCAAAGTAGGCGTTGTGTTGCTAGGCGAAGCAGTTGAATCCTGCACCATAGATGGTGAGAGTCCAGTAGCCGAAAGCATCACTAGTTTATGCTCTAACCCGAGTAGCATGGGGCACGTGGAATCCCGTGTGAATCAGCAAGGACCACCTTGTAAGGCTAAATACTCCTGGGTGACCGATAGCGAAGTAGTACCGTGAGGGAAAGGTGAAAAGAACCCCCATAGGGGAGTGAAATAGAACATGAAACCGTAAGCTCCCAAGCAGTGGGAGGAGAATCGAATCTCTGACCGCGTGCCTGTTGAAGAATGAGCCGGCGACTTATAGGTAGTGGCCTGGTTAAGGGAACCCACCGGAGCCGTAGCGAAAGCGAGTCTTCCTAGGGCAATTGTCACTACTTATGGACCCGAACCTGGGTGATCTATCCATGACCAGGATGAAGCTTGGGTGAAACTAAGTGGAGGTCCGAACCGACTGATGTTGAAAAATCAGCGGATGAGTTGTGGTTAGGGGTGAAATGCCACTCGAACCCAGAGCTAGCTGGTTCTCCCCGAAATGCGTTGAGGCGCAGCAGTTGACTAGACTATCTAGGGGTAAAGCACTGTTTCGGTGCGGGCTGCGAGAGCGGTACCAAATCGAGGCAAACTCTGAATACTAGATATGATCCCCAAATAACAGGGGATAAAGGTCAACCAGTGAGACGGTGGGGGATAAGCTCCATCGTCAAGAGGGAAACAGCCCAGATCACCAGGTAAGGCCCCTAAATGACCGCTCAGTGGTAAAGGAGGTAGGAGTGCAAAGACAGCCAGGAGGTTTGCCTAGAAGCAGCCACCCTTGAAAGAGTGCGTAATAGCTCACTGATCAAGCGCTCTTGCGCCGAAGATGAACGGGACTAAGCGGCCTGCCGAAGCTGTGGGATGTCAAAATACATCGGTAGGGGAGCGTTCCGCCTTAGAGGGAAGCATCAGCGCGAGCAGGTGTGGACGAAGCGGAAGCGAGAATGTCGGCTTGAGTAACGCAAACATTGGTGAGAATCCAATGCCCCGAAAACGTAAGGGTTCCTCCGCAAGGTTCGTCCACGGAGGGTGAGTCAGGGCCTAAGATCAGGCCGAAAGGCGTAGTCGATGGACAACAGGTGAATATTCCTGTACTACCCATTGTTGGTCCCGAGGGACGAAGGAGGCTAGGTTAGCCGAAAGATGGTTATCGGTTCAAAGACGCAAGGTTGTTCACCTTAATTCTTTTAAGATAAAAAAGGGTAGAGAAAATGCCTCGAGCCAACGTCTGAGTACTAGGCGCTACGGTGCTGAAGTAACCAATGCCACACTTCCAAGAAGAGCTCGAACGACCATTAACAAGTGGGTACCTGTACCCGAAACCGACACAGGTAGGTAGGTAGAGAATACCTAGGGGCGCGAGATAACTCTCTCTAAGGAACTCGGCAAAATAGCCCCGTAACTTCGGGAGAAGGGGTGCCTCCTTACCAAGGAGGTCGCAGTGACCAGGCCCAGGCGACTGTTTACCAAAAACACAGGTCTCCGCAAAGTCGTAAGACCATGTATGGGGGCTGACGCCTGCCCAGTGCCGGAAGGTTAAGGAAGTTGGTGATCTGATGACAGAGAAGCCAGCGACCGAAGCCCCGGTGAACGGCGGCCGTAACTATAACGGTCCTAAGGTAGCGAAATTCCTTGTCGGGTAAGTTCCGACCCGCACGAAAGGCGTAACGATCTGGGCACTGTCTCGGAGAGAGACTCGGT
GAAATAGACATGTCTGTGAAGATGCGGACTACCTGCACCTGGACAGAAAGACCCTATGAAGCTTTACTGTTCCCTGGGATTGGCTTTGGGCTCTTCTTGCGCAGCTTAGGTGGAAGGCGAAGAAGGCCCTCTTCCGGGAGGGCTCGAGCCATCAGTGAAATACCACTCTAGAAGAGCTAGAATTCTAACCTTGTGTCAATACGGGCCAAGGGACAGTCTCAGGTAGACAGTTTCTATGGGGCGTAGGCCTCCCAAAAGGTAACGGAGGCGTGCAAAGGTTTCCTCAGGCTGGACGGAAATCAGCCTTCGAGTGTAAAGGCAAAAGGGAGCTTGACTGCAAGACCTACCCGTCGAGCAGGGACGAAAGTCGGCCTTAGTGATCCGACGGTACCGAGTGGAAGGGCCGTCGCTCAACGGATAAAAGTTACTCTAGGGATAACAGGCTGATCTTCCCCAAGAGTTCACATCGACGGGAAGGTTTGGCACCTCGATGTCGGCTCTTCGCCACCTGGGGCGGTAGTACGTTCCAAGGGTTGGGCTGTTCGCCCATTAAAGCGGTACGTGAGCTGGGTTCAGAACGTCGTGAGACAGTTCGGTCCATATCCGGTGCGGGCGTTAGAGCATTGAGAGGACCTTTCCCTAGTACGAGAGGACCGGGAAGGACGCACCTCTGGTGTACCAGTTATCGTGCCCACGGTAAACGCTGGGTAGCCATGTGCGGAG
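
In the alignment excerpt above, each header carries the full taxonomy string after the accession, while the taxonomy file uses the accession alone as the name. If that difference is why the names fail to match (an assumption, not something confirmed here), one possible fix is to cut each header at the first semicolon. Below is a minimal Python sketch. `trim_headers` and `trim_fasta_file` are hypothetical helper names, and the output path in the usage note is a placeholder.

```python
def trim_headers(lines):
    """Yield FASTA lines, cutting each header at the first ';'.

    Sequence lines (anything not starting with '>') pass through unchanged,
    so sequences wrapped over several lines are preserved as they are.
    """
    for line in lines:
        if line.startswith(">"):
            name = line[1:].strip().split(";", 1)[0]
            yield ">" + name + "\n"
        else:
            yield line


def trim_fasta_file(src_path, dst_path):
    """Write a copy of src_path to dst_path with trimmed headers."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(trim_headers(src))
```

For example, `trim_fasta_file("final_edit_23S.pcr.align", "final_edit_23S.trimmed.align")` would write a new file and leave the original untouched. Suffixed names such as `KC598087_a` are kept whole because only the text after the first semicolon is dropped.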