Issues with cluster.split command: groups removed, no list file produced, and mothur hangs

jgcx December 31, 2021, 10:15pm 1

The cluster.split command is not producing a list file and removes groups at the end of the run. I have repeated the command with taxlevel=4 and taxlevel=6, and the same thing happens each time. I do not have a mock community, so I skipped all of the commands for removing one. When cluster.split finishes, it reports the number of seconds it took (about 47,000 in my case), but no output files are listed and the session then hangs without returning to the mothur> prompt. I know it is not a RAM or disk space issue because only 20%-35% of the RAM was in use while the command was running and I am using a 4TB storage device.

Below are the commands used up to the cluster.split command.

fastq.info(fastq=R_2021_10_05_12_00_01_user_S51010052021_Chip.fastq)

summary.seqs(fasta=R_2021_10_05_12_00_01_user_S51010052021_Chip.fasta, processors=8)

trim.seqs(fasta=R_2021_10_05_12_00_01_user_S51010052021_Chip.fasta, oligos=CXMicroMothurOligos.txt, maxambig=0, maxhomop=6, bdiffs=0, pdiffs=0, minlength=265, keepfirst=285, flip=F, processors=8)

summary.seqs(fasta=current, processors=8)

get.current()

unique.seqs(fasta=current)

get.current()

summary.seqs(fasta=current, name=current, processors=8)

count.seqs(name=current, group=current)

get.current()

#pcr.seqs(fasta=silva.nr_v138_1.align,start=11895,end=25318,keepdots=F,processors=32)
#rename.file(input=silva.nr_v138_1.pcr.align,new=silva.v4.fasta)
#summary.seqs(fasta=silva.v4.fasta)
#get.current()

align.seqs(fasta=current, reference=silva.v4.fasta, processors=2)

summary.seqs(fasta=current, count=current, processors=8)

get.current()

summary.seqs(fasta=current, count=current, processors=8)

screen.seqs(fasta=current, count=current, summary=current, start=1967, optimize=end, criteria=95, processors=8)

summary.seqs(fasta=current, count=current, processors=8)

filter.seqs(fasta=current, vertical=T, trump=.)

summary.seqs(fasta=current, count=current, processors=8)

unique.seqs(fasta=current, count=current)

summary.seqs(fasta=current, count=current, processors=8)

get.current()

pre.cluster(fasta=current, count=current, diffs=2)

summary.seqs(fasta=current, count=current, processors=8)

chimera.vsearch(fasta=current, count=current, dereplicate=t)

remove.seqs(accnos=current, fasta=current)

summary.seqs(fasta=current, count=current, processors=8)

classify.seqs(fasta=current, count=current, template=silva.nr_v138_1.align, taxonomy=silva.nr_v138_1.tax, cutoff=80, processors=8)

remove.lineage(fasta=current, count=current, taxonomy=current, taxon=Chloroplast-Mitochondria-unknown-Eukaryota)

summary.seqs(fasta=current, count=current, processors=8)

summary.tax(taxonomy=current, count=current)

get.current()

rename.file(fasta=current, count=current, taxonomy=current, prefix=10_05_2021CurationComplete)

cluster.split(fasta=current, count=current, taxonomy=current, taxlevel=6, cutoff=0.30)

Alexandre_Thibodeau January 5, 2022, 2:26pm 2

Hello!

It is strange that no files are produced when the command finishes.

Firstly, I suggest using 0.03 as the cutoff for the clustering. The idea here is to cluster at the traditional “species” level after everything has been split by taxonomy at the genus level.
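
As a minimal sketch of that suggestion, the cluster.split call from the original post would look like this with only the cutoff changed. Every other parameter is copied as-is from that post, and whether this change alone resolves the missing list file is not established in this thread.

```
# Same call as in the original post; only cutoff is changed from 0.30 to 0.03
cluster.split(fasta=current, count=current, taxonomy=current, taxlevel=6, cutoff=0.03)
```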

Also, could you please post the last lines that cluster.split wrote to the log file?

Regards,

jgcx January 7, 2022, 1:46pm 3

Hi @Alexandre_Thibodeau Happy New Year!

Update: I finally figured out how to get a virtual computer via a cloud computing service. In the meantime I have been using split.abund. However, I have a few questions about this, which I will ask in a separate post.

Here are the lines of the end message I was getting when trying to use cluster.split().

Removing group: BC44.515F because all sequences have been removed.

Removing group: BC45.515F because all sequences have been removed.

Removing group: BC47.515F because all sequences have been removed.

Removing group: BC9.515F because all sequences have been removed.

Removed 4345912 sequences from your count file.

/******************************************/

It took 42377 seconds to split the distance file.

system Closed January 17, 2022, 1:47pm 4

This topic was automatically closed 10 days after the last reply. New replies are no longer allowed.
