Getting to cluster, having problems, what am I doing wrong?

I cannot seem to get one of my datasets to process. I’ve pasted my batch file below. I get all the way up to cluster, and then things fall apart: either the command crashes, or, like today, it clusters in 7 seconds but produces only the unique label, not the 0.01 or 0.03 labels I need for make.shared. The dataset processes fine if I exclude minflows and maxflows at the beginning, but I’d really like to keep the longer read length if I can, both to match a previous publication and because the extra base pairs let me better identify some important community members. I’ve processed other datasets this way without trouble; this is the first time I’ve hit this problem.

I have simplified the samples included as much as possible. Using the run where I excluded minflows and maxflows, I found I had to eliminate 10 samples (too close to the controls), and then determined that it was safe to remove my controls as well. Even with these 15 samples removed, I still can’t get past cluster.

What am I doing wrong?

sff.multiple(file=NDRIAllD.txt, minflows=360, maxflows=720, pdiffs=2, minlength=200, maxhomop=6, processors=6)
unique.seqs(fasta=NDRIAllD.fasta, name=NDRIAllD.names)
align.seqs(fasta=NDRIAllD.unique.fasta, reference=silva.nr_v119.align, flip=t)
screen.seqs(fasta=NDRIAllD.unique.align, name=NDRIAllD.unique.names, group=NDRIAllD.groups, minlength=200, optimize=end)
filter.seqs(fasta=NDRIAllD.unique.good.align, vertical=T)
unique.seqs(fasta=NDRIAllD.unique.good.filter.fasta, name=NDRIAllD.unique.good.names)
pre.cluster(fasta=NDRIAllD.unique.good.filter.unique.fasta, name=NDRIAllD.unique.good.filter.names, group=NDRIAllD.good.groups, diffs=2)
chimera.uchime(fasta=NDRIAllD.unique.good.filter.unique.precluster.fasta, name=NDRIAllD.unique.good.filter.unique.precluster.names, group=NDRIAllD.good.groups)
remove.seqs(accnos=NDRIAllD.unique.good.filter.unique.precluster.uchime.accnos, fasta=NDRIAllD.unique.good.filter.unique.precluster.fasta, name=NDRIAllD.unique.good.filter.unique.precluster.names, group=NDRIAllD.good.groups)
classify.seqs(fasta=NDRIAllD.unique.good.filter.unique.precluster.pick.fasta, name=NDRIAllD.unique.good.filter.unique.precluster.pick.names, group=NDRIAllD.good.pick.groups, template=silva.nr_v119.align, cutoff=80)
remove.lineage(fasta=NDRIAllD.unique.good.filter.unique.precluster.pick.fasta, name=NDRIAllD.unique.good.filter.unique.precluster.pick.names, group=NDRIAllD.good.pick.groups, taxon=Bacteria;Cyanobacteria;-Eukaryota;-Archaea;)
system(cp NDRIAllD.unique.good.filter.unique.precluster.pick.pick.fasta
system(cp NDRIAllD.unique.good.filter.unique.precluster.pick.pick.names
system(cp NDRIAllD.good.pick.pick.groups
dist.seqs(, cutoff=0.4)
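
For context, the cluster and make.shared calls I’m attempting after this look roughly like the following (the "final" file names here are only placeholders for whatever the system copies above produced, not my actual file names):

```
cluster(column=final.dist, name=final.names, cutoff=0.25)
make.shared(list=final.an.list, group=final.groups, label=0.03)
```

It is at the cluster step that I either get a crash or a list file containing only the unique label.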

From the shhh.flows documentation about min and max flows: “shhh.flows uses an expectation-maximization algorithm to correct flowgrams to identify the idealized form of each flowgram and translate that flowgram to a DNA sequence. Our testing has shown that when Titanium data are trimmed to 450 flows using trim.flows, shhh.flows provides the highest quality data of any other method available. In contrast, when we use the min/max number of flows suggested by Quince of 360/720, the error rate is not that great.”
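
If the 360/720 flow window is the culprit, I suppose I could trim to 450 flows as the documentation recommends, by changing only the flow parameters in my first command (a sketch, untested on this dataset):

```
sff.multiple(file=NDRIAllD.txt, minflows=450, maxflows=450, pdiffs=2, minlength=200, maxhomop=6, processors=6)
```

But that would shorten my reads, which is what I’m trying to avoid for comparability with the earlier publication.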