shhh.flows doesn't do what I need :(


For the past month I’ve been trying to do a metagenomic analysis using mothur. After reviewing the wiki, I started following the Standard Operating Procedure (SOP), but I ran into trouble at the third step, where I need to use the shhh.flows command. This is how I ran it:

shhh.flows(file=sequencing_file.flow.files, cutoff=0.01, sigma=0.06, processors=6)

My flow files are in order, the barcodes work fine, I have the appropriate lookup file, and I tried these parameters. But it didn’t work.

So I simplified the command to this:

shhh.flows(file=sequencing_file.flow.files, processors=6)

But I had the same problem. I also considered running each flow file separately, like this:


But it gets stuck at the same step as the others. The exact problem is this: when I run the command it starts and runs fine, but it makes no further progress once it reaches the step Denoising flowgrams…

31300   2102    2098.61
31400   2117    2113.12
31500   2131    2127.66
31600   2146    2142.32
31700   2161    2157
31800   2175    2171.62
31900   2190    2186.42
32000   2205    2201.3
32100   2220    2216.18
32200   2235    2231.15
32300   2250    2246.08
32400   2265    2261.11
32500   2280    2276.24
32600   2295    2291.41
32700   2311    2306.62
32800   2326    2321.91
32900   2341    2337.23
32971   2352    2347.97

Clustering flowgrams...
Reading matrix:     ||||||||||||||||||||||||||||||||||||||||||||||||||||

Denoising flowgrams...
iter    maxDelta        nLL             cycletime

It stays there forever :S

I’ve tried 3 machines.

The first: 6 cores, 8 GB RAM, 2 GB swap. It failed before that point.
The second: 8 cores, 16 GB RAM, 5 GB swap. I left it running for 150 hours and it made no further progress.
The last: 192 cores, 180 GB RAM, 13 GB swap. It does the same as the second.

These are the commands I ran before that:

sffinfo(sff=sequencing_file.sff, flow=T)

summary.seqs(fasta=sequencing_file.fasta, processors=5)

trim.flows(flow=sequencing_file.flow, oligos=barcodes.oligos, pdiffs=2, bdiffs=1, minflows=360, fasta=T, maxhomop=9, processors=5)

I’ve tried using MPI and different configurations… AND NOTHING WORKS.
My data come from a 454 sequencing run, and I had no problems with them in BLAST or other software.
The file is 500 MB and is the unprocessed sff.

So… what am I doing wrong? I would be really grateful if somebody could help me.




sffinfo(sff=sequencing_file.sff, flow=T)
summary.seqs(fasta=sequencing_file.fasta, processors=6)
trim.flows(flow=sequencing_file.flow, oligos=barcodes.oligos, pdiffs=2, bdiffs=1, maxhomop=9, processors=6)
shhh.flows(file=sequencing_file.flow.files, processors=6)

minflows and maxflows both need to be set to 450 to get proper denoising. When they are set to different values you get problems, and even if it does go through there is only minimal improvement in the error rate relative to what you started with. Just use the defaults in trim.flows as above.
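
If it helps to see the defaults spelled out, the trim.flows call could also be written with both values given explicitly. This is only a sketch: it assumes 450 is the default for both minflows and maxflows, as implied above, and it reuses the file names from the original post.

```
trim.flows(flow=sequencing_file.flow, oligos=barcodes.oligos, pdiffs=2, bdiffs=1, minflows=450, maxflows=450, maxhomop=9, processors=6)
```

The only difference from the original command is that minflows=360 is gone, so every read is trimmed to the same 450 flows before shhh.flows runs.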