Changes in version 1.47.0

Hi,
I’m trying to upgrade the Debian package of mothur from version 1.46.1 to version 1.47.0. Debian runs a CI test with this batch file, which worked smoothly with the previous version. When I run it with version 1.47.0 in the Debian CI system, I get:

mothur > cluster(method=furthest, column=HA.unique.dist, name=HA.names, cutoff=0.01, precision=1000)
Unable to open HA.names. Trying mothur's executable directory HA.names.
Unable to open HA.names.
Unable to open HA.names
Using 2 processors.
[ERROR]: did not complete cluster.

I wonder what I need to change to get this test functional again. (Please note: I’m not a mothur user, just maintaining the package for Debian)

Kind regards, Andreas.

This issue seems very similar to this thread. I opened it anyway because, not being a mothur user, I need concrete syntax help for the given example.

The 1.47.0 release does include several changes that will affect end-user scripts and batch files. Some are forced and some are optional. The MiSeq SOP reflects the changes. These are some of the highlights:

  1. Addition of the mothurhome keyword. This can be used throughout mothur, but is perhaps most helpful for the make.file command.

mothur > make.file(inputdir=mothurhome, type=fastq, prefix=stability)

  2. Changes intended to move users to count tables instead of name / group files. This starts with the make.contigs and trim.seqs commands. Neither command outputs group files any longer; both output count files instead. Batches and scripts that expect a group file will need to be updated.

  3. Screening options are now available in make.contigs. You can still run screening separately, but running it within make.contigs improves speed by avoiding reprocessing the files.

mothur > make.contigs(file=stability.files, maxambig=0, maxlength=275)

  4. Unique.seqs now outputs a count file by default.

  5. Chimeras are removed by default. You can still run the remove.seqs command without error, but it is not necessary.

  6. Blast options are removed, so any batches or scripts with Blast as an option will fail.
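
As a rough illustration of the count-table changes listed above, a batch can chain commands through mothur's `current` keyword so that no group or name file is referenced by name. This is only a sketch: it assumes make.contigs registers its count file as the current one in 1.47.0, and it reuses the stability.files input from the example above.

```
# Sketch only: relies on "current" instead of explicit output file names.
# Assemble and screen reads; a count file is produced instead of a group file.
make.contigs(file=stability.files, maxambig=0, maxlength=275)
# Dereplicate, passing along the count file rather than a name/group pair.
unique.seqs(fasta=current, count=current)
# Summarize using the count file so abundances are taken into account.
summary.seqs(fasta=current, count=current)
```

Using `current` keeps the batch independent of the exact output file names, which is where most breakage from this release is likely to show up.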

If you post your complete batch file I can help you figure out the changes needed.

Another quick note: We do not recommend using the ‘furthest’ method for clustering. The opti method is the default and our recommendation due to the improved quality of OTU assignments. Here’s a link to our paper about this method, http://www.schlosslab.org/assets/pdf/2017_westcott.pdf.
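
For reference, a minimal sketch of what clustering with the default opti method might look like for the example in this thread. It assumes the count file from unique.seqs is the current one; the file name HA.unique.dist is taken from the original batch file.

```
# Sketch only: omitting method= lets mothur use its default (opti).
cluster(column=HA.unique.dist, count=current, cutoff=0.01)
```

Note that the list file name and its labels will differ from a furthest-neighbor run (the `fn` tag in HA.unique.fn.list is specific to that method), so downstream commands such as summary.single would need adjusting.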

In my initial posting I linked to this batch file which reads:

unique.seqs(fasta=HA.fasta)
dist.seqs(fasta=HA.unique.fasta, countends=F, cutoff=0.01)
cluster(method=furthest, column=HA.unique.dist, name=HA.names, cutoff=0.01, precision=1000)
summary.single(list=HA.unique.fn.list, calc=nseqs-sobs-chao, label=unique-0.001-0.003-0.005-0.008)
heatmap.bin(scale=linear, label=unique-0.001-0.003)
rarefaction.single(calc=sobs-chao, label=unique-0.001-0.003, freq=10)

The corresponding data are in the Debian autopkgtest directory.

Please note: There is no need to fix this particular batch file. We simply want to run some sensible test for mothur. Lacking a useful answer to the issue “please provide url for testfiles #746” (no idea why this was simply closed - I just don’t get what XXX might mean), we have to use some simple datasets and a simple test.

Any help to get a sensible test framework for the Debian package is welcome,
Andreas.

I recommend using the example files from the MiSeq SOP for testing.

Try this instead:

unique.seqs(fasta=HA.fasta) - **now outputs a count file by default**
dist.seqs(fasta=HA.unique.fasta, countends=F, cutoff=0.01)
cluster(method=furthest, column=HA.unique.dist, count=current, cutoff=0.01, precision=1000) - **update to use count file instead of name file**
summary.single(list=HA.unique.fn.list, calc=nseqs-sobs-chao, label=unique-0.001-0.003-0.005-0.008)
heatmap.bin(scale=linear, label=unique-0.001-0.003)
rarefaction.single(calc=sobs-chao, label=unique-0.001-0.003, freq=10)

Alternatively, you could force the unique.seqs command to output a name file with the following:

unique.seqs(fasta=HA.fasta, output=name)

Thank you. This seems to work.

Any further hints on downloading the data needed to run TestBatches?

Andreas.

We are still working on making the testing batches and files accessible.
