Getting to cluster, having problems, what am I doing wrong?

I cannot seem to get one of my datasets to process. I’ve pasted my batch file below. I get all the way up to cluster, and then things fall apart: either the command crashes, or, like today, it finishes in 7 seconds but reports only the unique label, with no 0.01 or 0.03 labels to carry into make.shared. The dataset processes fine if I leave out the minflows and maxflows settings at the start, but I’d really like to keep that read length if I can, both to match a previous publication and because the extra base pairs let me better identify some important community members. I’ve processed other datasets this way with no trouble; this is the first time I’ve hit this problem.
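
For context, this is the downstream step I’m trying to reach once cluster gives me a 0.03 label. This is just a sketch; the list file name assumes mothur’s default average-neighbor output (NDRIAllD.final.an.list) from the cluster command at the end of the batch:

make.shared(list=NDRIAllD.final.an.list, group=NDRIAllD.final.groups, label=0.03)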

I have simplified the set of samples included as much as possible. Using the results from the run where I excluded minflows and maxflows, I found I had to eliminate 10 samples (too close to the controls), and then determined it was OK to remove the controls themselves. Even with those 15 samples removed, I still can’t get past cluster.

What am I doing wrong?

sff.multiple(file=NDRIAllD.txt, minflows=360, maxflows=720, pdiffs=2, minlength=200, maxhomop=6, processors=6)
unique.seqs(fasta=NDRIAllD.fasta, name=NDRIAllD.names)
align.seqs(fasta=NDRIAllD.unique.fasta, reference=silva.nr_v119.align, flip=t)
screen.seqs(fasta=NDRIAllD.unique.align, name=NDRIAllD.unique.names, group=NDRIAllD.groups, minlength=200, optimize=end)
filter.seqs(fasta=NDRIAllD.unique.good.align, vertical=T)
unique.seqs(fasta=NDRIAllD.unique.good.filter.fasta, name=NDRIAllD.unique.good.names)
pre.cluster(fasta=NDRIAllD.unique.good.filter.unique.fasta, name=NDRIAllD.unique.good.filter.names, group=NDRIAllD.good.groups, diffs=2)
chimera.uchime(fasta=NDRIAllD.unique.good.filter.unique.precluster.fasta, name=NDRIAllD.unique.good.filter.unique.precluster.names, group=NDRIAllD.good.groups)
remove.seqs(accnos=NDRIAllD.unique.good.filter.unique.precluster.uchime.accnos, fasta=NDRIAllD.unique.good.filter.unique.precluster.fasta, name=NDRIAllD.unique.good.filter.unique.precluster.names, group=NDRIAllD.good.groups)
classify.seqs(fasta=NDRIAllD.unique.good.filter.unique.precluster.pick.fasta, name=NDRIAllD.unique.good.filter.unique.precluster.pick.names, group=NDRIAllD.good.pick.groups, template=silva.nr_v119.align, taxonomy=silva.nr_v119.tax, cutoff=80)
remove.lineage(fasta=NDRIAllD.unique.good.filter.unique.precluster.pick.fasta, name=NDRIAllD.unique.good.filter.unique.precluster.pick.names, group=NDRIAllD.good.pick.groups, taxonomy=NDRIAllD.unique.good.filter.unique.precluster.pick.nr_v119.wang.taxonomy, taxon=Bacteria;Cyanobacteria;-Eukaryota;-Archaea;)
system(cp NDRIAllD.unique.good.filter.unique.precluster.pick.pick.fasta NDRIAllD.final.fasta)
system(cp NDRIAllD.unique.good.filter.unique.precluster.pick.pick.names NDRIAllD.final.names)
system(cp NDRIAllD.good.pick.pick.groups NDRIAllD.final.groups)
system(cp NDRIAllD.unique.good.filter.unique.precluster.pick.nr_v119.wang.taxonomy NDRIAllD.final.taxonomy)
dist.seqs(fasta=NDRIAllD.final.fasta, cutoff=0.4)
cluster(column=NDRIAllD.final.dist, name=NDRIAllD.final.names)

From the shhh.flows documentation about min and max flows: “shhh.flows uses an expectation-maximization algorithm to correct flowgrams to identify the idealized form of each flowgram and translate that flowgram to a DNA sequence. Our testing has shown that when Titanium data are trimmed to 450 flows using trim.flows, shhh.flows provides the highest quality data of any method available. In contrast, when we use the min/max number of flows suggested by Quince of 360/720, the error rate is not that great.”
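
If that is what is going on here, one thing to try (just a sketch, keeping the rest of your batch file the same) is trimming to 450 flows instead of the 360/720 window in your first command, e.g.:

sff.multiple(file=NDRIAllD.txt, minflows=450, maxflows=450, pdiffs=2, minlength=200, maxhomop=6, processors=6)

Since the output file names come from the file= parameter, the downstream commands should not need to change.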

The FAQ entry “Why is my dataset only clustering to ‘unique’?” may also help:

http://www.mothur.org/wiki/Frequently_asked_questions#Why_is_my_dataset_only_clustering_to_.22unique.22.3F