Remove.lineage gets stuck on alignreport file

I’m running the following command:

remove.lineage(taxonomy=file.taxonomy.dat,taxon='Bacteria;Firmicutes;-Bacteria;Actinobacteria;',fasta=fasta_in.dat,group=group_in.dat,alignreport=alignreport_in.dat,list=list_in.dat,name=name_in.dat,dups=true)

where alignreport_in.dat points to this file tools-iuc/tools/mothur/test-data/Mock_S280_L001_R1_001_small.contigs.report at 08fc71363fdec7c4ff2c912e1f85dab4ea306c43 · bernt-matthias/tools-iuc · GitHub

Then I get gigabytes of the following message:

[WARNING]: M00967_43_000000000-A3JHG_1_1101_19936_3208 is in your alignreport file more than once.  Mothur requires sequence names to be unique. I will only add it once.

I guess the format of the file is wrong, but mothur should still not fail in this way.

Background: I’m currently working on updating the mothur Galaxy tool wrappers.

Edit: I just retried with an up-to-date alignreport file (which has one more column), with the same effect. I also noticed that the header has one additional tab character at the end.
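For anyone who wants to check their own report file for the two things mentioned above (repeated sequence names and a trailing tab in the header), here is a minimal Python sketch. It is not part of mothur. It assumes the report is tab-separated, has a single header line, and carries the sequence name in the first column; the helper name `inspect_report` and the sample data are made up for illustration.

```python
import io
from collections import Counter


def inspect_report(handle):
    """Return (header_has_trailing_tab, duplicated_names).

    Assumption: tab-separated text, one header line, sequence name in column 1.
    """
    header = handle.readline().rstrip("\r\n")
    trailing_tab = header.endswith("\t")

    counts = Counter()
    for line in handle:
        # Take everything before the first tab as the sequence name.
        name = line.rstrip("\r\n").split("\t", 1)[0]
        if name:
            counts[name] += 1

    duplicates = {name: n for name, n in counts.items() if n > 1}
    return trailing_tab, duplicates


# Made-up sample data, not taken from the real report file.
sample = "Name\tLength\t\nseqA\t250\nseqB\t251\nseqA\t250\n"
trailing_tab, duplicates = inspect_report(io.StringIO(sample))
print("header ends with tab:", trailing_tab)
print("names seen more than once:", duplicates)
```

To run it on a real file, open the file with `open(path, newline="")` and pass the handle to `inspect_report` instead of the `io.StringIO` sample.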

xref remove.lineage gets stuck on alignreport file · Issue #848 · mothur/mothur · GitHub

I don’t know anything about what Galaxy has loaded, since we don’t work with them. Can you retry without including the alignreport argument?

Pat

Thanks for the reply. Without the alignreport argument, the command executes successfully in less than a second.

The Galaxy tool uses a conda environment for executing the mothur command (and it just executes the single command).

The file names mentioned above are pointing to the following files:

  • file.taxonomy.dat: abrecovery.pds.wang.taxonomy
  • fasta_in.dat: abrecovery.fasta
  • group_in.dat: abrecovery.groups
  • alignreport_in.dat: Mock_S280_L001_R1_001_small.contigs.report
  • list_in.dat: amazon.an.list
  • name_in.dat: abrecovery.names

Those files can be found at tools-iuc/tools/mothur/test-data at 08fc71363fdec7c4ff2c912e1f85dab4ea306c43 · bernt-matthias/tools-iuc · GitHub

Let me know if I can make it easier for you to reproduce the problem, e.g. a zip file with all the files and a command line.

Thanks - I’d just leave out the alignreport file. mothur doesn’t use it for much of anything.

Pat