Get.oturep command

Hi everybody

I am working with mothur 1.48.0. When I run the get.oturep command, I get an error:

mothur > get.oturep(column=final.pick.Archaea.dist, count=final.pick.Archaea.count_table, list=final.pick.Archaea.opti_mcc.list, fasta=final.pick.Archaea.fasta, cutoff=0.0-0.01-0.02-0.03-0.04-0.05-0.06)
19
[ERROR]: std::bad_alloc, RAM used: 0.015255 Gigabytes. Total RAM: 31.3363 Gigabytes.

has occurred in the ListVector class function getLabels. This error indicates your computer is running out of memory. This is most commonly caused by trying to process a dataset too large, using multiple processors, or a file format issue. If you are using multiple processors, try running the command with processors=1; the more processors you use, the more memory is required. Also, you may be able to reduce the size of your dataset by using the commands outlined in the Schloss SOP, http://www.mothur.org/wiki/Schloss_SOP. If you are unable to resolve the issue, please contact Pat Schloss at mothur.bugs@gmail.com, and be sure to include the mothur.logFile with your inquiry.
sed: can’t read final.pick.Archaea.opti_mcc.0.biom: No such file or directory
sed: can’t read final.pick.Archaea.opti_mcc.0.03.biom: No such file or directory

Here is the log report from just before the problem:
mothur > dist.seqs(fasta=final.pick.Archaea.fasta, cutoff=0.10)

Using 40 processors.

Sequence Time Num_Dists_Below_Cutoff

It took 0 secs to find distances for 88 sequences. 664 distances below cutoff 0.1.

Output File Names:
final.pick.Archaea.dist

mothur > cluster(column=final.pick.Archaea.dist, count=final.pick.Archaea.count_table, method=opti, cutoff=0.0-0.01-0.02-0.03-0.04-0.05-0.06)

Using 40 processors.

Clustering final.pick.Archaea.dist

iter time label num_otus cutoff tp tn fp fn sensitivity specificity ppv npv fdr accuracy mcc f1score

0.0
0 0 0 88 0 0 3828 0 0 0 1 0 1 1 1 0 0
1 0 0 88 0 0 3828 0 0 0 1 0 1 1 1 0 0

0.01
0 0 0.01 88 0.01 0 3816 0 12 0 1 0 0.996865 1 0.996865 0 0
1 0 0.01 78 0.01 12 3815 1 0 1 0.999738 0.923077 1 0.923077 0.999739 0.960643 0.96
2 0 0.01 78 0.01 12 3815 1 0 1 0.999738 0.923077 1 0.923077 0.999739 0.960643 0.96

0.02
0 0 0.02 88 0.02 0 3749 0 79 0 1 0 0.979363 1 0.979363 0 0
1 0 0.02 52 0.02 51 3740 9 28 0.64557 0.997599 0.85 0.992569 0.85 0.990334 0.736148 0.733813
2 0 0.02 52 0.02 51 3744 5 28 0.64557 0.998666 0.910714 0.992577 0.910714 0.991379 0.762845 0.755556
3 0 0.02 51 0.02 54 3743 6 25 0.683544 0.9984 0.9 0.993365 0.9 0.991902 0.780529 0.776978
4 0 0.02 52 0.02 54 3744 5 25 0.683544 0.998666 0.915254 0.993367 0.915254 0.992163 0.787319 0.782609
5 0 0.02 52 0.02 54 3744 5 25 0.683544 0.998666 0.915254 0.993367 0.915254 0.992163 0.787319 0.782609

0.03
0 0 0.03 88 0.03 0 3657 0 171 0 1 0 0.955329 1 0.955329 0 0
1 0 0.03 35 0.03 114 3647 10 57 0.666667 0.997266 0.919355 0.984611 0.919355 0.982497 0.774708 0.772881
2 0 0.03 34 0.03 131 3641 16 40 0.766082 0.995625 0.891156 0.989133 0.891156 0.985371 0.818854 0.823899
3 0 0.03 34 0.03 134 3640 17 37 0.783626 0.995351 0.887417 0.989937 0.887417 0.985893 0.826704 0.832298
4 0 0.03 33 0.03 138 3635 22 33 0.807018 0.993984 0.8625 0.991003 0.8625 0.985632 0.826836 0.833837

0.04
0 0 0.04 88 0.04 0 3570 0 258 0 1 0 0.932602 1 0.932602 0 0
1 0 0.04 32 0.04 172 3565 5 86 0.666667 0.998599 0.971751 0.976445 0.971751 0.976228 0.794231 0.790805
2 0 0.04 28 0.04 200 3546 24 58 0.775194 0.993277 0.892857 0.983907 0.892857 0.978579 0.820834 0.829876
3 0 0.04 28 0.04 220 3531 39 38 0.852713 0.989076 0.849421 0.989353 0.849421 0.979885 0.84028 0.851064
4 0 0.04 27 0.04 226 3525 45 32 0.875969 0.987395 0.833948 0.991004 0.833948 0.979885 0.843939 0.854442
5 0 0.04 27 0.04 226 3525 45 32 0.875969 0.987395 0.833948 0.991004 0.833948 0.979885 0.843939 0.854442

0.05
0 0 0.05 88 0.05 0 3511 0 317 0 1 0 0.917189 1 0.917189 0 0
1 0 0.05 27 0.05 236 3495 16 81 0.744479 0.995443 0.936508 0.977349 0.936508 0.97466 0.822304 0.829525
2 0 0.05 23 0.05 255 3492 19 62 0.804416 0.994588 0.930657 0.982555 0.930657 0.97884 0.854202 0.862944
3 0 0.05 23 0.05 256 3491 20 61 0.807571 0.994304 0.927536 0.982827 0.927536 0.97884 0.854399 0.863406
4 0 0.05 23 0.05 256 3491 20 61 0.807571 0.994304 0.927536 0.982827 0.927536 0.97884 0.854399 0.863406

0.06
0 0 0.06 88 0.06 0 3439 0 389 0 1 0 0.89838 1 0.89838 0 0
1 0 0.06 21 0.06 310 3428 11 79 0.796915 0.996801 0.965732 0.977474 0.965732 0.976489 0.865239 0.873239
2 0 0.06 20 0.06 367 3406 33 22 0.943445 0.990404 0.9175 0.993582 0.9175 0.985632 0.922395 0.930292
3 0 0.06 19 0.06 371 3403 36 18 0.953728 0.989532 0.911548 0.994738 0.911548 0.985893 0.924588 0.932161
4 0 0.06 19 0.06 371 3403 36 18 0.953728 0.989532 0.911548 0.994738 0.911548 0.985893 0.924588 0.932161

It took 0 seconds to cluster

Output File Names:
final.pick.Archaea.opti_mcc.list
final.pick.Archaea.opti_mcc.steps
final.pick.Archaea.opti_mcc.sensspec

I only have this problem with Archaea.
I use the same script with Bacteria and the result is fine.

Thanks for your help

Hi there,

You are running out of RAM possibly because the distance matrix is so large. A few suggestions…

  1. Pick a single threshold. I’m not sure there’s a scientific reason to use numerous thresholds like you are trying.

  2. Instead of giving a distance matrix, use method=abundance.

  3. If you are using get.oturep for classification, I’d strongly recommend using classify.otu instead.

  4. If you absolutely have to use multiple thresholds and want a distance-based approach, check the quality of your data.
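
Concretely, suggestions 1 and 2 applied to the command in the first post would look something like this (same file names, with 0.03 chosen here only as an illustration of a single cutoff):

mothur > get.oturep(column=final.pick.Archaea.dist, count=final.pick.Archaea.count_table, list=final.pick.Archaea.opti_mcc.list, fasta=final.pick.Archaea.fasta, cutoff=0.03)

mothur > get.oturep(list=final.pick.Archaea.opti_mcc.list, count=final.pick.Archaea.count_table, fasta=final.pick.Archaea.fasta, method=abundance, cutoff=0.03)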

pat

Hi
I tested with a single cutoff level and had the same problem.

I also tested with method=abundance: same problem.

I use the get.oturep function to get the fasta file for each level (0 and 0.03) and the classify.otu function for the OTU taxonomy.

I don’t understand the memory problem, because the files used are less than 50 KB in size.
I use the same script for the analysis of Bacteria, with much larger files.
This problem appeared when we changed mothur versions (to 1.48.0).
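
To illustrate, the two steps in question look roughly like this, mirroring the syntax from my first post (the taxonomy file name here is only a placeholder, since it is not shown above):

mothur > get.oturep(column=final.pick.Archaea.dist, count=final.pick.Archaea.count_table, list=final.pick.Archaea.opti_mcc.list, fasta=final.pick.Archaea.fasta, cutoff=0.0-0.03)

mothur > classify.otu(list=final.pick.Archaea.opti_mcc.list, count=final.pick.Archaea.count_table, taxonomy=final.pick.Archaea.taxonomy, label=0.03)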

Thanks for your help

Can you send the files to mothur.bugs@gmail.com for us to take a look?

Thanks,
Pat

Hello!

I also tried to get there using the column file, but even with a small dataset and a single cutoff I get a core dump. So I am running out of RAM even though everything went super smoothly with the normal pipeline (finished in 2 hours) using the same computer resources (cpus-per-task=32, mem=128000M).

I will try to get there with abundance.

Mothur is still core dumping on me (sad I am), even with abundance.

set.logfile(name=database_chicken_getoturep)

set.current(processors=32)

set.seed(seed=100)

get.oturep(list=sophiefile.trim.contigs.unique.good.good.filter.unique.precluster.denovo.vsearch.pick.opti_mcc.list, cutoff=0.02, fasta=sophiefile.trim.contigs.unique.good.good.filter.unique.precluster.denovo.vsearch.pick.fasta, method=abundance, count=sophiefile.trim.contigs.unique.good.good.filter.unique.precluster.denovo.vsearch.pick.count_table)
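
Next I will also try processors=1, since the error text in the first post notes that more processors require more memory; something like:

set.current(processors=1)

get.oturep(list=sophiefile.trim.contigs.unique.good.good.filter.unique.precluster.denovo.vsearch.pick.opti_mcc.list, cutoff=0.02, fasta=sophiefile.trim.contigs.unique.good.good.filter.unique.precluster.denovo.vsearch.pick.fasta, method=abundance, count=sophiefile.trim.contigs.unique.good.good.filter.unique.precluster.denovo.vsearch.pick.count_table)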