Processing fails with 1.43.0

Kendra · February 11, 2020, 8:39pm

I haven’t been able to finish processing a project with 1.43.0. I’m using the same bash script I was using with 1.41.x. Two projects have failed at remove.seqs after chimera checking with the following error (different sequence name, same error):

[ERROR]: M01676_138_000000000-CB4PJ_1_1101_10080_19438 is not in your count table. Please correct.

Another fails at pre.cluster with no error message; mothur just quits. I’ll send you all of the logfiles.

westcott · February 13, 2020, 4:57pm

Could you send your input files for the pre.cluster as well?

westcott · February 13, 2020, 8:28pm

I noticed that the count file contained 3 empty groups which was causing a problem. I removed them using the make.table command, and the pre.cluster command was able to finish without issue.

mothur > make.table(count=zymoTest.trim.contigs.good.good.count_table, compress=t) - removed 3 empty samples

mothur > pre.cluster(fasta=zymoTest.trim.contigs.good.unique.good.filter.fasta, count=current, diffs=2)
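
For anyone who wants to check a count table for empty groups before running pre.cluster, here is a minimal sketch. It assumes the table is in the full, uncompressed layout (tab-separated, with the columns Representative_Sequence, total, then one column per group); the function name `empty_groups` is made up for illustration and is not part of mothur.

```shell
# Hypothetical helper, not a mothur command: print every group whose
# counts are zero for all sequences in a count table.
# Assumption: full (uncompressed) count table layout, tab-separated,
# with group columns starting at column 3.
empty_groups() {
    awk -F'\t' '
        NR == 1 {                      # header row: remember group names
            for (i = 3; i <= NF; i++) name[i] = $i
            ncol = NF
            next
        }
        {                              # data rows: add up counts per group
            for (i = 3; i <= ncol; i++) sum[i] += $i
        }
        END {                          # report groups that never had a read
            for (i = 3; i <= ncol; i++)
                if (sum[i] == 0 && name[i] != "") print name[i]
        }
    ' "$1"
}

# Example call (file name taken from the make.table command above):
# empty_groups zymoTest.trim.contigs.good.good.count_table
```

If this prints any group names, running make.table as shown above should drop them.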

Kendra · February 14, 2020, 4:03pm

thanks Sarah, I’ve updated my bash after chimera checking.

Would it be reasonable to add the make.table command before pre.cluster in my usual processing script and just run it every time?

westcott · February 14, 2020, 5:43pm

I added a check for missing groups in pre.cluster, so that it will provide an error message instead of crashing. I’d like to correct the issue if it was caused within mothur. Did you modify the table outside of mothur, or did mothur include the empty groups?

Kendra · February 19, 2020, 6:14pm

Hi Sarah, I didn’t do anything outside of mothur. I’m running this as a batch.

`#!bash
PROJECTNAME=$1




mothur "#make.contigs(file=$PROJECTNAME.file, processors=32); 
summary.seqs(fasta=current); 
screen.seqs(fasta=current, group=current, summary=current, maxambig=0, maxlength=275); 
summary.seqs(fasta=current); 
unique.seqs(fasta=current); 
summary.seqs(fasta=current, name=current); 
count.seqs(name=current, group=current); 
align.seqs(fasta=current, reference=silva.nr_v119.v4.align); 
summary.seqs(fasta=current, count=current); 
screen.seqs(fasta=current, count=current, summary=current, start=1968, end=11550, maxhomop=8); 
filter.seqs(fasta=current, vertical=T); 
summary.seqs(fasta=current, count=current);
make.table(count=current, compress=t);
pre.cluster(fasta=current, diffs=2, count=current); 
summary.seqs(fasta=current, count=current); 
chimera.vsearch(fasta=current, count=current, dereplicate=t); 
remove.seqs(fasta=current, accnos=current, count=current); 
summary.seqs(fasta=current, count=current); 
classify.seqs(fasta=current, count=current, reference=silva.nr_v119.v4.align, taxonomy=silva.nr_v119.tax, cutoff=80); 
remove.lineage(fasta=current, count=current, taxonomy=current, taxon=Chloroplast-Mitochondria-unknown-Archaea-Eukaryota); 
summary.tax(taxonomy=current, count=current); 
dist.seqs(fasta=current, countends=F, cutoff= 0.03, processors=16); 
cluster(column=current, count=current, method=opti); 
summary.seqs(processors=32); 
make.shared(list=current, count=current); 
classify.otu(list=current, count=current, taxonomy=current); 
get.oturep(fasta=current, count=current, list=current, method=abundance); 
count.groups(shared=current); 
summary.single(shared=current, calc=nseqs-sobs-coverage-shannon-shannoneven-invsimpson, subsample=10000); 
dist.shared(shared=current, calc=braycurtis-jest-thetayc, subsample=10000); 
sub.sample(count=$PROJECTNAME.trim.contigs.good.unique.good.filter.precluster.denovo.uchime.pick.pick.pick.count_table, shared=current, list=$PROJECTNAME.trim.contigs.good.unique.good.filter.precluster.pick.pick.opti_mcc.list, size=10000, persample=true, label=0.03); 

sub.sample(taxonomy=$PROJECTNAME.trim.contigs.good.unique.good.filter.precluster.pick.nr_v119.wang.pick.taxonomy, count=$PROJECTNAME.trim.contigs.good.unique.good.filter.precluster.denovo.uchime.pick.pick.pick.count_table, list=$PROJECTNAME.trim.contigs.good.unique.good.filter.precluster.pick.pick.opti_mcc.list, size=10000, persample=true, label=0.03); 
summary.tax(taxonomy=current, count=current); system(mkdir send); 
system(cp *shared send); system(cp *cons.tax* send); system(cp *pick.tax.summary send); system(cp *pick.subsample.tax.summary send); system(cp *.rep.fasta send); system(cp *lt.ave.dist send); system(cp *groups.ave-std.summary send); system(cp mothur.bash send); system(cp mothur.*.logfile send);"

`

westcott · February 19, 2020, 8:19pm

The issue is the combination of chimera.vsearch with dereplicate=t and remove.seqs with the count table included. If dereplicate is false, a sequence that is found to be chimeric in any one group is treated as chimeric in all groups. If you set dereplicate=t, a sequence found to be chimeric in a group is only removed from that group. When dereplicate=t, mothur also creates a modified count file for you with the chimeric reads already removed, so you do not want to pass that modified count file to the remove.seqs command. Instead try this:

mothur > chimera.vsearch(fasta=current, count=current, dereplicate=t) - remove chimeras from count table and create accnos file for removing them from other files

mothur > remove.seqs(fasta=current, accnos=current) - remove chimeras from the fasta file
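
To confirm that the fasta file and the count table agree after this step, something like the sketch below could be used. It is only an illustration: `missing_from_count` is a made-up helper name, and it assumes the first tab-separated column of the count table holds the sequence names.

```shell
# Hypothetical helper, not a mothur command: list sequence names that
# appear in a fasta file but not in a count table.
# Assumption: the first tab-separated column of the count table is the
# sequence name; the header row and any lines starting with '#' are skipped.
missing_from_count() {
    # $1 = fasta file, $2 = count table (the count table is read first)
    awk -F'\t' '
        FILENAME == ARGV[1] {
            if ($1 != "Representative_Sequence" && $1 !~ /^#/) seen[$1] = 1
            next
        }
        /^>/ {
            id = substr($1, 2)          # drop the leading ">"
            sub(/[ \t].*/, "", id)      # drop anything after the name
            if (!(id in seen)) print id
        }
    ' "$2" "$1"
}

# Example call; the file names are placeholders for the current fasta
# and count files reported in your logfile:
# missing_from_count my.pick.fasta my.count_table
```

No output means every sequence in the fasta file has an entry in the count table.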

Kendra · February 24, 2020, 5:24pm

Thanks, this worked. One minor point, however: the resulting fasta doesn’t include “denovo.vsearch” in the name like it did in past versions. Can that be added back in?

1.43 output

mothur > remove.seqs(fasta=current, accnos=current)
Using zymoTest.trim.contigs.good.unique.good.filter.precluster.denovo.vsearch.accnos as input file for the accnos parameter.
Using zymoTest.trim.contigs.good.unique.good.filter.precluster.fasta as input file for the fasta parameter.
[WARNING]: This command can take a namefile and you did not provide one. The current namefile is zymoTest.trim.contigs.good.names which seems to match zymoTest.trim.contigs.good.unique.good.filter.precluster.fasta.
Removed 31904 sequences from your fasta file.

Output File Names: 
zymoTest.trim.contigs.good.unique.good.filter.precluster.pick.fasta

