Error with classify.seqs(), " **** Exceeded maximum allowed command errors, quitting **** "

Hello,

I should start by saying I’m new to this, and I’m trying to follow the MiSeq protocol through to completion before diving into the details.

I’m running into an error with the classify.seqs() command and I’m not sure why. The .align and .tax files are the reference files recommended by mothur. The error is below; I’ve copied and pasted the head and tail of the output.

      mothur > classify.seqs(fasta=04test.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.fasta, count=04test.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.count_table, reference=silva.v4.fasta, taxonomy=silva.nr_v138_2.tax)


      Using 32 processors.
      Generating search database...    DONE.
      It took 28 seconds generate search database.

      Reading in the silva.nr_v138_2.tax taxonomy...  [ERROR]: KC189639.Unc81268 is missing the final ';', ignoring.
      [ERROR]: JQ769778.Unc88719 is missing the final ';', ignoring.
      [ERROR]: LN612929.UncR5461 is missing the final ';', ignoring.
      [ERROR]: KC439348.E88Spec8 is missing the final ';', ignoring.
      [ERROR]: JQ684271.Unc76929 is missing the final ';', ignoring.
      [ERROR]: FPLK01001426.GJ6Z3519 is missing the final ';', ignoring.
      [ERROR]: FJ230802.Unc66247 is missing the final ';', ignoring.
      [ERROR]: KC683071.Unc09sfs is missing the final ';', ignoring.
      [ERROR]: KC683078.Unc66555 is missing the final ';', ignoring.
      [ERROR]: AY221059.Unc77530 is missing the final ';', ignoring.

      **** Exceeded maximum allowed command errors, quitting ****
      [ERROR]: KC683124.Unc09sjy is missing the final ';', ignoring.
      DONE.
      'KM213004.Unc55991' is in your template file and is not in your taxonomy file. Please correct.
      'KC247157.HZLGynue' is in your template file and is not in your taxonomy file. Please correct.
      'DQ181686.UncCy124' is in your template file and is not in your taxonomy file. Please correct.


Couple thousand lines or so later…

      'GU437694.Unc46467' is in your template file and is not in your taxonomy file. Please correct.
      'FJ800528.Unc47314' is in your template file and is not in your taxonomy file. Please correct.
      'KF037397.Unc57379' is in your template file and is not in your taxonomy file. Please correct.
      'FJ538172.UncCl239' is in your template file and is not in your taxonomy file. Please correct.
      DONE.
      It took 35 seconds get probabilities.

      mothur > 

The code I’ve run is below:

      make.file(inputdir=., type=fastq, prefix=04test)
      make.contigs(inputdir=., outputdir=., trimoverlap=T, file=04test.files, pdiffs=2, checkorient=t)
      summary.seqs(fasta=04test.trim.contigs.fasta)
      screen.seqs(fasta=04test.trim.contigs.fasta, count=04test.contigs.count_table, maxambig=0, maxlength=275, maxhomop=8)
      summary.seqs(fasta=04test.trim.contigs.good.fasta, count=04test.contigs.good.count_table)
      unique.seqs(fasta=04test.trim.contigs.good.fasta, count=04test.contigs.good.count_table)
      summary.seqs(fasta=04test.trim.contigs.good.fasta, count=04test.contigs.good.count_table)
      pcr.seqs(fasta=silva.nr_v138_2.align, start=11895, end=25318, keepdots=F)
      rename.file(input=silva.nr_v138_2.pcr.align, new=silva.v4.fasta)
      align.seqs(fasta=04test.trim.contigs.good.unique.fasta, reference=silva.v4.fasta)
      summary.seqs(fasta=04test.trim.contigs.good.unique.align, count=04test.trim.contigs.good.count_table)
      screen.seqs(fasta=04test.trim.contigs.good.unique.align, count=04test.trim.contigs.good.count_table, start=1977, end=11546)
      summary.seqs(fasta=current, count=current)
      filter.seqs(fasta=04test.trim.contigs.good.unique.good.align, vertical=T, trump=.)
      unique.seqs(fasta=04test.trim.contigs.good.unique.good.filter.fasta, count=04test.trim.contigs.good.good.count_table)
      summary.seqs(fasta=current, count=current)
      pre.cluster(fasta=04test.trim.contigs.good.unique.good.filter.unique.fasta, count=04test.trim.contigs.good.unique.good.filter.count_table, diffs=2)
      summary.seqs(fasta=04test.trim.contigs.good.unique.good.filter.unique.precluster.fasta)
      chimera.vsearch(fasta=04test.trim.contigs.good.unique.good.filter.unique.precluster.fasta, count=04test.trim.contigs.good.unique.good.filter.unique.precluster.count_table, dereplicate=t)
      classify.seqs(fasta=04test.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.fasta, count=04test.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.count_table, reference=silva.v4.fasta, taxonomy=silva.nr_v138_2.tax)

I suspect the error is a result of the lineages in the taxonomy and alignment files not matching. I ran some grep searches on the two files and easily found discrepancies. In the alignment file (silva.v4.fasta), each entry has a “>” prefix, spaces instead of underscores, and a more complete taxonomic lineage. See below:

      grep "KC189639.Unc81268" silva.v4.fasta
      >KC189639.Unc81268      93.31   Bacteria;Pseudomonadota;Alphaproteobacteria;Hyphomicrobiales;Hyphomicrobiales Incertae Sedis;Incertae Sedis;


      grep "KC189639.Unc81268" silva.nr_v138_2.tax
      KC189639.Unc81268       Bacteria;Pseudomonadota;Alphaproteobacteria;Hyphomicrobiales;Hyphomicrobiales_
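
The two grep checks above can be generalised to every entry. What follows is a minimal sketch, not part of mothur: the function names `fasta_ids` and `split_taxonomy` are hypothetical, and it assumes the taxonomy file has the ID as the first whitespace-separated field with the lineage after it. It also assumes, based on the log above, that mothur ignores any lineage lacking the final ';' and then treats that ID as absent from the taxonomy file. The demo uses only lines quoted in this post.

```python
def fasta_ids(lines):
    """Collect sequence IDs: the text after '>' up to the first whitespace."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.add(fields[0])
    return ids


def split_taxonomy(lines):
    """Return (IDs whose lineage ends in ';', IDs whose lineage does not)."""
    complete, truncated = set(), set()
    for line in lines:
        fields = line.split(None, 1)  # ID, then the rest of the line
        if not fields:
            continue  # skip blank lines
        lineage = fields[1].strip() if len(fields) > 1 else ""
        if lineage.endswith(";"):
            complete.add(fields[0])
        else:
            truncated.add(fields[0])
    return complete, truncated


# Demo on lines quoted in this post (assumption: fields are whitespace-separated).
fasta_demo = [
    ">KC189639.Unc81268\t93.31\tBacteria;Pseudomonadota;Alphaproteobacteria;"
    "Hyphomicrobiales;Hyphomicrobiales Incertae Sedis;Incertae Sedis;\n",
]
tax_demo = [
    "KC189639.Unc81268\tBacteria;Pseudomonadota;Alphaproteobacteria;"
    "Hyphomicrobiales;Hyphomicrobiales_\n",
    "AB824402.UncB2940\tBacteria;Bacillota;Clostridia;Lachnospirales;"
    "Lachnospiraceae;Lachnospiraceae_NK4A136_group;\n",
]

complete, truncated = split_taxonomy(tax_demo)
print("lineages missing the final ';':", sorted(truncated))
print("reference IDs without a usable lineage:",
      sorted(fasta_ids(fasta_demo) - complete))
```

For the real files, an open file handle can be passed in place of the demo lists, for example `split_taxonomy(open("silva.nr_v138_2.tax"))`.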

So then I searched for “_” in both files.

      grep "_" ./silva.v4.fasta

      [elundin@bio 04_Test_20241030]$ grep "_" ./silva.nr_v138_2.tax
      AB824402.UncB2940       Bacteria;Bacillota;Clostridia;Lachnospirales;Lachnospiraceae;Lachnospiraceae_NK4A136_group;
      AF005457.Di2Litto       Eukaryota;Arthropoda;Insecta;Archaeognatha;Archaeognatha_fa;Archaeognatha_ge;
      EF465492.FrrCapuc       Eukaryota;Diatomea;Coscinodiscophytina_cl;Fragilariales;Fragilariales_fa;Fragilaria;

      .....Couple thousand lines later.....

      EF032753.UncA4888       Bacteria;Acidobacteriota;Acidobacteriae;Subgroup_2;
      CP030993.AraHy387       Eukaryota;Phragmoplastophyta;Embryophyta;Fabales;Fabales_fa;Arachis;
      EF032777.Unc59335       Bacteria;Verrucomicrobiota;Omnitrophia;Omnitrophales;Omnitrophaceae;Candidatus_Omnitrophus;
      MF034602.HazBasi3       Eukaryota;Chlorophyta_ph;Ulvophyceae;Ulotrichales;Ulotrichales_fa;Hazenia;
      LC081127.A8NRugos       Eukaryota;Scalidophora;Kinorhyncha_cl;Homalorhagida;Homalorhagida_fa;Homalorhagida_ge;
      HQ384693.D2ECaryo       Eukaryota;Phragmoplastophyta;Embryophyta;Solanales;Solanales_fa;Montinia;

So something’s up with the taxonomy file: the lineages have been messed up somehow. The ‘_fa’ suffixes don’t look correct to me.

I’m assuming the taxonomy in the two files should match, so I considered writing a script to copy the lineages from the alignment file into the taxonomy file, matching on the unique identifiers. Given that I’m new to this, though, I don’t know what new errors I’d be introducing.
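
For illustration, the per-entry conversion such a script would need might look like the sketch below. It is untested against mothur, and it rests on assumptions drawn only from the single header quoted above: that each header has three whitespace-separated parts (ID, a number, lineage) and that replacing spaces with underscores is enough to produce a valid lineage. The function name `header_to_tax_line` is hypothetical, and the sketch does not reproduce placeholder ranks such as the ‘_fa’ names in the taxonomy file.

```python
def header_to_tax_line(header):
    """Turn '>ID <number> <lineage>' into 'ID<TAB>lineage'.

    Spaces inside the lineage become underscores and a final ';' is
    ensured. Returns None if the header lacks any of the three parts.
    """
    if not header.startswith(">"):
        return None
    fields = header[1:].split(None, 2)  # ID, number, rest of line
    if len(fields) < 3:
        return None
    seq_id, _number, lineage = fields
    lineage = lineage.strip().replace(" ", "_")
    if not lineage.endswith(";"):
        lineage += ";"
    return f"{seq_id}\t{lineage}"


# Demo on the header quoted earlier in this post.
demo_header = (
    ">KC189639.Unc81268\t93.31\tBacteria;Pseudomonadota;Alphaproteobacteria;"
    "Hyphomicrobiales;Hyphomicrobiales Incertae Sedis;Incertae Sedis;\n"
)
print(header_to_tax_line(demo_header))
```

Any file built this way would need to be checked against the official reference files before being used for classification.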

I feel like that may be the reason classify.seqs() isn’t working. I reran the analysis using another person’s code, whose mothur.log file indicated it had worked, yet I ran into the same issue. I also re-downloaded the reference files, which made no difference.

Any help is appreciated, thanks!

Update: I ran the same script with the Silva 138.1 alignment and taxonomy files. I got some [WARNING] messages but no [ERROR] messages after running classify.seqs(), so I think it works with the older reference files. Output below:

      mothur > classify.seqs(fasta=04test.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.fasta, count=04test.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.count_table, reference=silva.v4.fasta, taxonomy=silva.nr_v138_1.tax)


      Using 32 processors.
      Reading template taxonomy...     DONE.
      Reading template probabilities...     DONE.
      It took 6 seconds get probabilities.
      Classifying sequences from 04test.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.fasta ...
      [WARNING]: M03075_700_000000000-LFHRK_1_1111_24969_25468 could not be classified. You can use the remove.lineage command with taxon=unknown; to remove such sequences.
      [WARNING]: M03075_700_000000000-LFHRK_1_1111_13672_3211 could not be classified. You can use the remove.lineage command with taxon=unknown; to remove such sequences.
      [WARNING]: M03075_700_000000000-LFHRK_1_2108_17191_22419 could not be classified. You can use the remove.lineage command with taxon=unknown; to remove such sequences.
      [WARNING]: M03075_700_000000000-LFHRK_1_2111_4016_15124 could not be classified. You can use the remove.lineage command with taxon=unknown; to remove such sequences.
      [WARNING]: M03075_700_000000000-LFHRK_1_2110_16130_25216 could not be classified. You can use the remove.lineage command with taxon=unknown; to remove such sequences.
      [WARNING]: M03075_700_000000000-LFHRK_1_2102_14024_6183 could not be classified. You can use the remove.lineage command with taxon=unknown; to remove such sequences.
      [WARNING]: M03075_700_000000000-LFHRK_1_1105_4578_15010 could not be classified. You can use the remove.lineage command with taxon=unknown; to remove such sequences.
      [WARNING]: M03075_700_000000000-LFHRK_1_2105_23713_24630 could not be classified. You can use the remove.lineage command with taxon=unknown; to remove such sequences.
      [WARNING]: M03075_700_000000000-LFHRK_1_1103_22850_15726 could not be classified. You can use the remove.lineage command with taxon=unknown; to remove such sequences.
      [WARNING]: M03075_700_000000000-LFHRK_1_1101_27262_14088 could not be classified. You can use the remove.lineage command with taxon=unknown; to remove such sequences.

      **** Exceeded maximum allowed command warnings, silencing warnings ****
      [WARNING]: M03075_700_000000000-LFHRK_1_1110_26246_17600 could not be classified. You can use the remove.lineage command with taxon=unknown; to remove such sequences.
      [WARNING]: M03075_700_000000000-LFHRK_1_1114_27275_13949 could not be classified. You can use the remove.lineage command with taxon=unknown; to remove such sequences.
      [WARNING]: M03075_700_000000000-LFHRK_1_1112_10395_19553 could not be classified. You can use the remove.lineage command with taxon=unknown; to remove such sequences.

      ....... 100 or so lines later.........

      157
      [WARNING]: M03075_700_000000000-LFHRK_1_2107_8852_16030 could not be classified. You can use the remove.lineage command with taxon=unknown; to remove such sequences.
      157
      157
      156
      157
      157
      157
      157
      157
      156
      157
      156
      157
      157
      157
      157
      157
      
      It took 6 secs to classify 5015 sequences.
      
      
      It took 0 secs to create the summary file for 5015 sequences.
      
      
      Output File Names: 
      04test.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.nr_v138_1.wang.taxonomy
      04test.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.nr_v138_1.wang.tax.summary

Thanks for bringing this to my attention. The reference files should now be fixed if you want to redownload them and try again.

Please let me know how it goes for you with the new files.

Pat

Hey Dr. Schloss,

It appears to work; I didn’t see any errors. Thanks for updating that!

