Cluster () gives me only a unique list...

kimitas · May 28, 2012, 4:59am

Hi there,
Following the Schloss SOP example, I ran my 16S pyrosequences as shown below.
When I use cluster(), I lose all the cutoffs and end up with only one "unique" list (see below). Is this a bug?
Thanks,
kim

mothur > dist.seqs(fasta=Lema_16s_adults_final.unique.fasta, cutoff=0.15)

Output File Name: Lema_16s_adults_final.unique.dist

It took 559 to calculate the distances for 13029 sequences.

mothur > cluster(column=Lema_16s_adults_final.unique.dist, name=Lema_16s_adults_final.names)

********************###########
Reading matrix: ||||||||||||||||||||||||||||||||||||||||||||||||||||

unique 1 13029
changed cutoff to 0.006536

Output File Names:
Lema_16s_adults_final.unique.an.sabund
Lema_16s_adults_final.unique.an.rabund
Lema_16s_adults_final.unique.an.list

It took 371 seconds to cluster

pschloss · May 29, 2012, 3:31pm

This generally happens when people include sequences that do not fully overlap with each other (i.e. did you use filter.seqs(trump=., vertical=T)?)
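
For readers who want to check the overlap themselves, here is a minimal Python sketch (not part of mothur). It assumes the alignment marks gaps with "." and "-", as in the aligned fasta files the SOP produces. The function names `span` and `shared_region` and the toy sequences are made up for illustration. It reports the range of columns that every sequence covers; if that range is empty or very short, the sequences do not fully overlap.

```python
def span(aligned):
    """Return (first, last) 0-based columns holding a base, or None if all gaps."""
    cols = [i for i, ch in enumerate(aligned) if ch not in ".-"]
    return (cols[0], cols[-1]) if cols else None


def shared_region(seqs):
    """Return the (start, end) column range covered by every sequence, or None."""
    spans = [span(s) for s in seqs.values()]
    if not spans or any(s is None for s in spans):
        return None
    start = max(s[0] for s in spans)  # latest starting sequence
    end = min(s[1] for s in spans)    # earliest ending sequence
    return (start, end) if start <= end else None


# Hypothetical toy alignments, 12 columns each.
overlapping = {"a": "..ACGT-ACG..", "b": "....GTTACGTA"}
disjoint = {"c": "ACG.........", "d": ".......ACG.."}

print(shared_region(overlapping))  # (4, 9)
print(shared_region(disjoint))     # None
```

In a real data set the start and end columns reported by summary.seqs give the same information without any extra code.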

mattias · October 25, 2013, 3:27pm

Hi,

I have the same issue as above for a data set. The filter.seqs command was run with the mentioned settings. This is the output:

Length of filtered alignment: 1176
Number of columns removed: 48824
Length of the original alignment: 50000
Number of sequences used to construct filter: 287861

Before running dist.seqs I also ran unique.seqs, pre.cluster and chimera.uchime.

Any more suggestions what might be the problem?

Thanks!

westcott · October 25, 2013, 3:34pm

You might also try increasing your cutoff value.

"Why does the cutoff change when I cluster with average neighbor?

This is a product of using the average neighbor algorithm with a sparse distance matrix. When you run cluster, the algorithm looks for pairs of sequences to merge in the rows and columns that are getting merged together. Let’s say you set the cutoff to 0.05. If one cell has a distance of 0.03 and the cell it is getting merged with has a distance above 0.05 then the cutoff is reset to 0.03, because it’s not possible to merge at a higher level and keep all the data. All of the sequences are still there from multiple phyla. Incidentally, although we always see this, it is a bigger problem for people that include sequences that do not fully overlap."
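
As a rough illustration of the rule in that quote, here is a toy Python sketch. It is not mothur's actual code; the function name `reset_cutoff` and the use of `None` for a distance that was above the cutoff (and so never stored in the sparse matrix) are assumptions made for the example.

```python
def reset_cutoff(cutoff, d_row, d_col):
    """Toy version of the cutoff reset described in the quoted FAQ.

    d_row and d_col are the distances from the two clusters being merged to
    some third cluster. None means the distance was above the cutoff and is
    therefore missing from the sparse matrix.
    """
    known = [d for d in (d_row, d_col) if d is not None]
    if len(known) == 1 and known[0] < cutoff:
        # One distance is stored and its partner is missing, so the average
        # cannot be computed. The cutoff drops to the stored distance.
        return known[0]
    return cutoff


print(reset_cutoff(0.05, 0.03, None))  # 0.03
print(reset_cutoff(0.05, 0.03, 0.04))  # 0.05
```

With poorly overlapping sequences many pairs have large distances that fall above the cutoff, so this situation can come up very early in clustering, which would be consistent with the "changed cutoff to 0.006536" message in the first post.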
