problem with clearcut

Hello,

I’m using the standard operating procedure to analyse my 454 data. Everything worked well except for the generation of the phylogenetic trees. This command: mothur > dist.seqs(fasta=final.fasta, output=phylip, processors=2) runs fine, but the next command: mothur > clearcut(phylip=final.phylip.dist) does not seem to work, because nothing happens. Did I miss something? I read on the website that clearcut is now integrated in mothur. I have the following version: mothur 1.31.2+repack-0biolinux1, so I think that should be the case for mine. If not, could you explain to me how to install it? I tried to do it, but I don’t know anything about programming and I am a beginner with Linux…

I would be very grateful if you could help me find a solution. Thank you very much.

Cheers,

Stephanie

How big is your distance matrix? Are you sure it’s failing and not just still processing?

My distance matrix contains 24362 sequences and the file size is around 2.1 GB. It took 3775 seconds to calculate this matrix at the “dist.seqs” step. I let clearcut run for more than one hour. Do you think that’s not enough and that I’m too impatient?

The algorithm is O(N^2) or greater. If it hasn’t given you an error, I would give it more time.
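To put the quadratic scaling in numbers, here is a rough sketch in Python (not part of mothur; the function name `pairwise_count` is made up for illustration). It only counts the pairwise distances that a matrix for N sequences holds. It says nothing about how long clearcut actually takes on a given machine.

```python
def pairwise_count(n_seqs: int) -> int:
    """Number of unique sequence pairs, i.e. distances in a lower-triangle matrix."""
    return n_seqs * (n_seqs - 1) // 2


if __name__ == "__main__":
    n = 24362  # number of sequences reported earlier in this thread
    print(n, "sequences:", pairwise_count(n), "pairwise distances")
    # Doubling the number of sequences roughly quadruples the number of distances.
    print(2 * n, "sequences:", pairwise_count(2 * n), "pairwise distances")
```

For 24362 sequences this comes to roughly 297 million distances, so a run that takes hours rather than minutes is not surprising.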

Hello, I am having the same problem: I run clearcut and nothing happens. I am running it on a Linux server (the Mac version). Now the question: does the Mac version run on a Linux server or not? Should I just run clearcut on my Mac?

many thanks

O.

What do you mean “nothing happens”? How long have you waited? clearcut in mothur will run on any operating system.

How many sequences are you trying to make a tree from?

Pat

Pat, when I ran it on the server it took hours and my terminal did not respond to anything. When I checked the running processes from another terminal, it did not show that mothur was running at all. When I run it on the Mac I get this (within a matter of seconds, and it shuts down my mothur):


mothur > clearcut(phylip=final.pick.phylip.dist)

Clearcut: Distance value out-of-range.
Clearcut: Syntax error in distance matrix at offset 44.

I will try to keep it running on the server for a longer time and let you know.

I am running 24,177 sequences. I have already emailed my fasta and log files for the Mac run to mothur.bugs@gmail.com (subject: clearcut trouble).

many thanks

Ousama

I was simply not patient enough with it. On the server (Linux, 32 cores/128 GB of RAM), it took 7 hours (I thought it was “doing nothing” and kept shutting it down). However, when I run the same command on a Mac (4 cores/8 GB of RAM), I still get the error above and mothur is terminated.

many thanks

O.

When I run it on a Mac, I get this:

mothur > clearcut(phylip=hongjuan.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.pick.phylip.dist)

Clearcut: Distance value out-of-range.
Clearcut: Syntax error in distance matrix at offset 110.

Many thanks,

HJ

Thanks for bringing this bug to our attention. We have fixed it and will be releasing a new version with the change shortly.

Apologies for beating this horse. If it isn’t dead, I know the curators want it to be.

I ran into problems with clearcut at the end of last week after following the MiSeq SOP with my 2x250bp PE V4 reads (Caporaso et al.). I searched this forum and saw that my issue might have something to do with the short names I’d given my samples, so I made each sample name 10 characters. I just finished re-running the same analysis with the longer sample names and got the same problem:

mothur > clearcut(phylip=1411_readF.trim.contigs.good.unique.good.filter.precluster.pick.pick.phylip.dist)

Clearcut: Incorrect number of distance values on row.
Clearcut: Syntax error in distance matrix at offset 1932793883.

My dist.seqs-generated phylip distance matrices for each run are large. (The first is only 3 GB and the more recently generated one is 13 GB. Did I mess something up to produce such drastically different sizes?)

Would really appreciate help in completing the phylogenetic analysis. Thanks in advance.


Joe

Joe,

What version of mothur are you running?

Pat

The freshly downloaded v.1.34.0. Thanks for any help, Pat.


Joe

Thanks for sending your distance file. After taking a closer look at it, it appears that the issue is not with clearcut. The distance file does in fact have the wrong number of distances on one row. Sequence M00232_40_000000000-ABTDE_1_2108_6613_5244 has 11601 distances on its row, which is correct, but on the next row sequence M00232_40_000000000-ABTDE_1_2108_5861_12699 only has 3738. The following row has 25943 distances. Can you try rerunning the dist.seqs command?
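A check like the one described above can be scripted before handing a large matrix to clearcut. The following is a minimal sketch in Python, not a mothur tool; `find_bad_rows` and its `layout` parameter are hypothetical names. It assumes the file starts with a line holding the number of sequences, that each row sits on one line as a sequence name followed by whitespace-separated distances, and that in the lower-triangle layout the first row carries no distances, the second one, and so on.

```python
from typing import Iterable, List, Tuple


def find_bad_rows(lines: Iterable[str], layout: str = "lt") -> List[Tuple[int, str, int, int]]:
    """Return (row_number, name, found, expected) for rows with the wrong count.

    layout="lt":     row i (1-based) is expected to hold i - 1 distances.
    layout="square": every row is expected to hold n distances.
    """
    it = iter(lines)
    n = int(next(it).split()[0])  # first line: number of sequences
    bad = []
    row = 0
    for line in it:
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        row += 1
        expected = row - 1 if layout == "lt" else n
        found = len(fields) - 1  # the first field is the sequence name
        if found != expected:
            bad.append((row, fields[0], found, expected))
    return bad
```

Because the function reads one line at a time, it can be pointed at an open file handle, for example `with open("my.phylip.dist") as handle: print(find_bad_rows(handle))`, where the file name is a placeholder. An empty list means every row had the expected number of distances under the stated assumptions.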

Dear All, I am following the 454 SOP. After running clearcut, I got a tree, but I don’t understand how I can get the tree with real genus or species names. At present it shows only codes, with no identification.

Another problem is that I cannot make a shared file.

Please check whether I made a mistake:
mothur > dist.seqs(fasta=fileAP.shhh.trim.unique.good.filter.unique.precluster.fasta, output=phylip, processors=2)

Using 2 processors.

Output File Names:
C:\Back up\mothur\fileAP.shhh.trim.unique.good.filter.unique.precluster.phylip.dist

It took 2 to calculate the distances for 843 sequences.

mothur > clearcut(phylip=fileAP.shhh.trim.unique.good.filter.unique.precluster.phylip.dist)

Output File Names:
C:\Back up\mothur\fileAP.shhh.trim.unique.good.filter.unique.precluster.phylip.tre


mothur > cluster.split(fasta=fileAP.shhh.trim.unique.good.filter.unique.precluster.fasta, taxonomy=fileAP.shhh.trim.unique.good.filter.unique.precluster.silva.knn.equalized.taxonomy, name=fileAP.shhh.trim.unique.good.filter.unique.precluster.names, taxlevel=3, processors=1)

Using 1 processors.
Using splitmethod fasta.
Splitting the file…
/******************************************/

Using 1 processors.
/******************************************/

Output File Names:
C:\Back up\mothur\fileAP.shhh.trim.unique.good.filter.unique.precluster.fasta.0.dist

It took 5 to calculate the distances for 841 sequences.
/******************************************/

Using 1 processors.
/******************************************/
C:\Back up\mothur\fileAP.shhh.trim.unique.good.filter.unique.precluster.fasta.1.dist is blank. This can result if there are no distances below your cutoff.

Output File Names:
C:\Back up\mothur\fileAP.shhh.trim.unique.good.filter.unique.precluster.fasta.1.dist

It took 0 to calculate the distances for 2 sequences.
It took 21 seconds to split the distance file.

Reading C:\Back up\mothur\fileAP.shhh.trim.unique.good.filter.unique.precluster.fasta.0.dist
********************###########
Reading matrix: |||||||||||||||||||||||||||||||||||||||||||||||||||


Clustering C:\Back up\mothur\fileAP.shhh.trim.unique.good.filter.unique.precluster.fasta.0.dist
Cutoff was 0.255 changed cutoff to 0.18
Cutoff was 0.255 changed cutoff to 0.18
It took 4 seconds to cluster
Merging the clustered files…
It took 0 seconds to merge.

Output File Names:
C:\Back up\mothur\fileAP.shhh.trim.unique.good.filter.unique.precluster.an.sabund
C:\Back up\mothur\fileAP.shhh.trim.unique.good.filter.unique.precluster.an.rabund
C:\Back up\mothur\fileAP.shhh.trim.unique.good.filter.unique.precluster.an.list


mothur > make.shared(list=fileAP.shhh.trim.unique.good.filter.unique.precluster.an.list, group=fileAP.shhh.good.groups, label=0.03)
[ERROR]: IQHHCZ102JWU1Q is in your listfile and not in your groupfile. Please correct.
[ERROR]: IQHHCZ102JDFCI is in your listfile and not in your groupfile. Please correct.
[ERROR]: IQHHCZ102JA6UZ is in your listfile and not in your groupfile. Please correct.

How did you create the group file? Are the sequences reported by mothur in the group file?
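One way to answer the second question is to compare the two files directly. The sketch below is in Python and is not a mothur command; all function names are hypothetical. It assumes a list file where each data line is a label, the number of OTUs, and then one column per OTU with comma-separated sequence names, and a group file with one "name, whitespace, group" pair per line.

```python
from typing import Iterable, List, Set


def names_in_list(lines: Iterable[str], label: str) -> Set[str]:
    """Collect every sequence name on the list-file line(s) with the given label."""
    names: Set[str] = set()
    for line in lines:
        fields = line.split()
        # Skip header lines, blank lines, and lines for other labels.
        if len(fields) < 3 or fields[0] != label:
            continue
        for otu in fields[2:]:  # fields[1] is the number of OTUs
            names.update(otu.split(","))
    return names


def names_in_groups(lines: Iterable[str]) -> Set[str]:
    """Collect the sequence names (first column) of a group file."""
    return {line.split()[0] for line in lines if line.strip()}


def missing_from_groups(list_lines: Iterable[str],
                        group_lines: Iterable[str],
                        label: str) -> List[str]:
    """Names present in the list file at `label` but absent from the group file."""
    return sorted(names_in_list(list_lines, label) - names_in_groups(group_lines))
```

Called with the open list and group files and the label used in make.shared (0.03 in the post above), it returns the names that would trigger the "is in your listfile and not in your groupfile" error, provided the files follow the assumed layout.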

Dear mothur
I am using mothur v.1.31.2 on Linux:
mothur > clearcut(phylip=stability.trim.contigs.good.redundant.pick.phylip.dist,neighbor=t)
Clearcut: Memory allocation error in NJ_parse_distance_matrix()
Clearcut: Syntax error in distance matrix at offset 7.

Can you try it with our current version? https://github.com/mothur/mothur/releases

Good morning,

I am running into the exact same issue. I have created my .dist file, which is around 195 GB, and when I run the clearcut command I get that exact error. I am running the latest version of mothur, so I am curious what could be causing the problem and whether there are any ways to work around that error.

Any help with the issue would be greatly appreciated.

Thanks,

http://blog.mothur.org/2014/09/11/Why-such-a-large-distance-matrix%3F/