cluster problem with opticlust

Hi,
I'm using the new release to redo an analysis I was running shortly before the release came out. Now, when I run cluster (454 SOP, old data, sorry) I get a final.opti_mcc.list file instead of the final.an.list file.
When I try make.shared I get a warning saying:
Your group file contains 218058 sequences and list file contains 215315 sequences. Please correct.
Any idea why this problem is happening? Did I do something wrong, or is there a problem with the new opticlust?
It worked with the previous version and the final.an.list file. However, if I now run cluster with method=average, it only gives a final.an.list file with the unique label (and not 0.03). The same happens if I add cutoff=0.03 to the cluster parameters.
So I cannot get a shared file at the 0.03 level, either with the new default method or with the average algorithm.
Thanks
Susana

You need your cutoff to be bigger. Cutoff tells it what distances to include in your dist file (not in your shared file). I typically use cutoff=0.15

For dist.seqs I used cutoff=0.2; the problem came later. With the old 1.38 version I usually did:

dist.seqs(fasta=final.fasta, cutoff=0.2, processors=24)
cluster(column=final.dist, name=final.names)
make.shared(list=final.an.list, group=final.groups, label=0.03)

Now, with the new release, the name of the list file changed, so I did:

dist.seqs(fasta=final.fasta, cutoff=0.2, processors=24)
cluster(column=final.dist, name=final.names)
make.shared(list=final.opti_mcc.list, group=final.groups, label=0.03)

And there was that warning: Your group file contains 218058 sequences and list file contains 215315 sequences. Please correct.
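To see which sequences account for a mismatch like that, one option is to compare the name sets in the two files directly. The following is a toy sketch (not a mothur feature; all file names and sequence names here are made up) based on the standard formats: a group file has one `name<TAB>group` pair per line, and each line of a list file is `label<TAB>numOtus<TAB>otu1<TAB>otu2...` with comma-separated sequence names per OTU:

```python
# Write tiny example files in the two mothur formats (made-up names).
with open("demo.groups", "w") as f:
    f.write("seqA\tsample1\nseqB\tsample1\nseqC\tsample2\nseqD\tsample2\n")
with open("demo.list", "w") as f:
    f.write("0.03\t2\tseqA,seqB\tseqC\n")

def read_group_names(path):
    """Sequence names from a mothur group file (name<TAB>group per line)."""
    with open(path) as f:
        return {line.split()[0] for line in f if line.strip()}

def read_list_names(path, label="0.03"):
    """Sequence names from the line of a mothur list file matching `label`."""
    names = set()
    with open(path) as f:
        for line in f:
            fields = line.rstrip("\n").split("\t")
            if fields and fields[0] == label:
                for otu in fields[2:]:  # skip the label and numOtus columns
                    names.update(otu.split(","))
    return names

missing = read_group_names("demo.groups") - read_list_names("demo.list")
print(sorted(missing))  # seqD is in the group file but not in the list file
```

Listing the missing names at least tells you whether they were dropped at a particular upstream step.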

So, I tried to use the average algorithm:

cluster(column=final.dist, name=final.names, method=average)
make.shared(list=final.an.list, group=final.groups, label=0.03)

but the shared file was based on the unique label and not on the 0.03 level. So I added cutoff=0.03 to cluster:

cluster(column=final.dist, name=final.names, method=average, cutoff=0.03)
make.shared(list=final.an.list, group=final.groups, label=0.03)

But the result was the same. So I opened the list file and realized that, whether or not I added the cutoff=0.03 option, the list file generated by the cluster command contained only the unique label. The other list file, the one from the new default opticlust algorithm, did contain the 0.03 level rather than unique, but it failed with the make.shared command.

Should I go back to the 1.38 version, with which I could process this same dataset a few weeks ago? I just wanted to try the new version of cluster :frowning:

Cheers,

You actually don't need to set the higher threshold with the opticlust algorithm. Use cutoff=0.03 from now on…

I read the paper, but I don't understand why a higher cutoff isn't needed anymore?

If you look through the example in the supplement, you’ll see that we don’t use any distances above the threshold you’re interested in. The first thing the method does is go through and find the sequence pairs with a distance <= 0.03. After that we don’t even look at the distances.
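That first step can be sketched roughly like this (a toy illustration of the idea, not mothur's actual implementation; the distance values are made up): from a column-format distance file (`name name distance` per line), keep only the pairs at or below the cutoff, and discard everything else before clustering even starts:

```python
def close_pairs(dist_lines, cutoff=0.03):
    """Keep only sequence pairs whose distance is <= cutoff;
    the larger distances are dropped up front and never consulted again."""
    pairs = set()
    for line in dist_lines:
        a, b, d = line.split()
        if float(d) <= cutoff:
            pairs.add(frozenset((a, b)))
    return pairs

# Toy column-format distances: only the first pair is within 0.03.
dists = ["seqA seqB 0.01", "seqA seqC 0.12", "seqB seqC 0.25"]
print(close_pairs(dists))
```

Since only the below-cutoff pairs survive this filter, computing distances beyond the threshold you care about buys you nothing, which is why a larger dist.seqs cutoff is unnecessary.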

Pat