[ERROR]: std::bad_alloc has occurred in the ShhherCommand class function getOTUData. Please contact Pat Schloss at mothur.bugs@gmail.com, and be sure to include the mothur.logFile with your inquiry.
The error you are getting indicates you are running out of memory. The shhh.flows command is very memory intensive. Here are two options that may help reduce the memory needed:
1. Run trim.flows with an oligos file before you run shhh.flows. When you run trim.flows with an oligos file, mothur will separate your flows by sample, which allows you to run smaller sets through shhh.flows. Trim.flows will create a .flow.files file that can be used with the shhh.flows file parameter to run each sample separately and combine the results into one .shhh.fasta file for you when complete (see the command sketch after these two options).
2. Use 1 processor. The more processors you use, the more memory is needed.
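Something along these lines should work; substitute your own file names (yourdata.flow and yourdata.oligos here are placeholders):

mothur > trim.flows(flow=yourdata.flow, oligos=yourdata.oligos, processors=1)
mothur > shhh.flows(file=yourdata.flow.files, processors=1)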
I used 1 processor and I got the same error. And when I set the oligos file in the trim.flows command, almost all the flows ended up in the scrap file. I checked the oligos file, and the samples have a reasonable number of reads when I run the trim.seqs command with the oligos file set. Maybe I should mention that in my oligos file the primer and tag were used in combination as the barcode, and no primer was defined.
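For context, my oligos file looked roughly like this (the sequences and group names below are placeholders, not my real tags or primers), with the tag and primer concatenated into a single barcode entry and no primer line:

barcode ACGAGTGCGTACTGACTGACTGACTGAC sample1
barcode ACGCTCGACAACTGACTGACTGACTGAC sample2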
Yes, the barcode and its length are the reason it failed. If you want to send your logfile, flow and oligos files to mothur.bugs@gmail.com, I can try to troubleshoot it for you.
If you translate this to DNA sequence, the beginning is:
GACTACGTACACACTACTACTATGTTCTGGAC [The initial GACT is the test sequence]
This isn’t close to anything in your oligos file. A bigger problem for you is that your sequences are very short and noisy. I’d figure out what’s wrong with your sequencer and go from there…
Hi again,
Thank you for your help. I figured out the problem: I had to include the linker and sequencing primer in the oligos file. As you mentioned, the general quality of my reads is low. I decided to trim the flows with noise=0.5 and got ~240,000 reads (out of ~400,000) in the output file (noise=0.7 gave only ~100,000); is this a wise thing to do?
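For anyone hitting the same issue, my oligos file now looks roughly like this (the linker, primer, and barcode sequences below are placeholders, and if I recall the keywords correctly the sequencing primer goes on its own forward line while the tag stays as the barcode):

linker AC
forward ACTGACTGACTGACTGAC
barcode ACGAGTGCGT sample1
barcode ACGCTCGACA sample2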
Also, I would like to get a consensus sequence for each OTU, but the consensus.seqs command requires aligned sequences. The problem is that my reads are fungal ITS sequences which, as you know, are not alignable across the kingdom. Do you have any idea how to solve this?
> I decided to trim the flows with noise=0.5 and got ~240,000 reads (out of ~400,000) in the output file (noise=0.7 gave only ~100,000); is this a wise thing to do?
Um, probably not - but do you have mock community data? Without mock community data showing no bad effects, I wouldn't trust it. Also, the lookup files were originally developed for 454, not IonTorrent, so that might be another hiccup in the works. We're working on creating a lookup file, but it might be a few weeks.
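Once an IonTorrent-appropriate lookup file exists, the idea would be to point shhh.flows at it through its lookup parameter, along the lines of the sketch below (the file names, including LookUp_IonTorrent.pat, are placeholders):

mothur > shhh.flows(file=yourdata.flow.files, lookup=LookUp_IonTorrent.pat, processors=1)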