cluster problem with opticlust

Hi,
I'm using the new release to redo an analysis I was running shortly before the release came out. Now, when I run cluster (454 SOP, old data, sorry) I get a final.opti_mcc.list file instead of the final.an.list file.
When I try make.shared I get a warning saying:
Your group file contains 218058 sequences and list file contains 215315 sequences. Please correct.
Any idea why this problem is happening? Did I do something wrong, or is there a problem with the new opticlust?
It worked with the previous version and the final.an.list file. However, if I now run cluster with method=average, it only gives a final.an.list file with the unique label (and not 0.03). The same happens if I add cutoff=0.03 to the cluster parameters.
So I cannot get a shared file at the 0.03 level, either with the new default method or with the average algorithm.
Thanks
Susana

You need your cutoff to be bigger. Cutoff tells it what distances to include in your dist file (not in your shared file). I typically use cutoff=0.15

For dist.seqs I used cutoff=0.2; the problem came later. With the old 1.38 version I usually did:

dist.seqs(fasta=final.fasta, cutoff=0.2, processors=24)
cluster(column=final.dist, name=final.names)
make.shared(list=final.an.list, group=final.groups, label=0.03)

Now, with the new release, the name of the list file changed, so I did:

dist.seqs(fasta=final.fasta, cutoff=0.2, processors=24)
cluster(column=final.dist, name=final.names)
make.shared(list=final.opti_mcc.list, group=final.groups, label=0.03)

And there was that warning: Your group file contains 218058 sequences and list file contains 215315 sequences. Please correct.
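To see which sequences account for a mismatch like that, one option is to compare the name sets in the two files directly. The following is a toy sketch (not a mothur feature; all file names and sequence names here are made up) based on the standard formats: a group file has one `name<TAB>group` pair per line, and each line of a list file is `label<TAB>numOtus<TAB>otu1<TAB>otu2...` with comma-separated sequence names per OTU:

```python
# Write tiny example files in the two mothur formats (made-up names).
with open("demo.groups", "w") as f:
    f.write("seqA\tsample1\nseqB\tsample1\nseqC\tsample2\nseqD\tsample2\n")
with open("demo.list", "w") as f:
    f.write("0.03\t2\tseqA,seqB\tseqC\n")

def read_group_names(path):
    """Sequence names from a mothur group file (name<TAB>group per line)."""
    with open(path) as f:
        return {line.split()[0] for line in f if line.strip()}

def read_list_names(path, label="0.03"):
    """Sequence names from the line of a mothur list file matching `label`."""
    names = set()
    with open(path) as f:
        for line in f:
            fields = line.rstrip("\n").split("\t")
            if fields and fields[0] == label:
                for otu in fields[2:]:  # skip the label and numOtus columns
                    names.update(otu.split(","))
    return names

missing = read_group_names("demo.groups") - read_list_names("demo.list")
print(sorted(missing))  # seqD is in the group file but not in the list file
```

Listing the missing names at least tells you whether they were dropped at a particular upstream step.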

So, I tried to use the average algorithm:

cluster(column=final.dist, name=final.names, method=average)
make.shared(list=final.an.list, group=final.groups, label=0.03)

but the shared file was based on the unique label and not on the 0.03 level. So I added cutoff=0.03 to cluster:

cluster(column=final.dist, name=final.names, method=average, cutoff=0.03)
make.shared(list=final.an.list, group=final.groups, label=0.03)

But the result was the same. So I opened the list file and realized that, whether or not I added the cutoff=0.03 option, the list file generated by the cluster command contained only the unique label. The other list file, the one from the new default opticlust algorithm, did contain the 0.03 level rather than unique, but it failed with the make.shared command.

Should I go back to the 1.38 version, with which I could process this same dataset a few weeks ago? I just wanted to try the new version of cluster :frowning:

Cheers,

You actually don't need to set the higher threshold with the opticlust algorithm. Use cutoff=0.03 from now on…

I read the paper, but I don't understand why a higher cutoff isn't needed anymore?

If you look through the example in the supplement, you’ll see that we don’t use any distances above the threshold you’re interested in. The first thing the method does is go through and find the sequence pairs with a distance <= 0.03. After that we don’t even look at the distances.
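That first step can be sketched roughly like this (a toy illustration of the idea, not mothur's actual implementation; the distance values are made up): from a column-format distance file (`name name distance` per line), keep only the pairs at or below the cutoff, and discard everything else before clustering even starts:

```python
def close_pairs(dist_lines, cutoff=0.03):
    """Keep only sequence pairs whose distance is <= cutoff;
    the larger distances are dropped up front and never consulted again."""
    pairs = set()
    for line in dist_lines:
        a, b, d = line.split()
        if float(d) <= cutoff:
            pairs.add(frozenset((a, b)))
    return pairs

# Toy column-format distances: only the first pair is within 0.03.
dists = ["seqA seqB 0.01", "seqA seqC 0.12", "seqB seqC 0.25"]
print(close_pairs(dists))
```

Since only the below-cutoff pairs survive this filter, computing distances beyond the threshold you care about buys you nothing, which is why a larger dist.seqs cutoff is unnecessary.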

Pat