
My current configs are:

> cat /proc/sys/vm/panic_on_oom
0
> cat /proc/sys/vm/oom_kill_allocating_task
0
> cat /proc/sys/vm/overcommit_memory
1

but when I run a task, it's killed anyway.

> ./test/mem.sh
Killed
> dmesg | tail -2
[24281.788131] Memory cgroup out of memory: Kill process 10565 (bash) score 1001 or sacrifice child
[24281.788133] Killed process 10565 (bash) total-vm:12601088kB, anon-rss:5242544kB, file-rss:64kB
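
I notice the dmesg lines say `Memory cgroup out of memory`, so this kill seems to come from a cgroup memory limit rather than from global memory pressure; the `vm.*` sysctls above apparently don't control it. A sketch of how one could check that limit (assuming cgroup v1; `<pid>` and `<cgroup-path>` are placeholders that depend on how the task is launched):

> grep memory /proc/<pid>/cgroup
> cat /sys/fs/cgroup/memory/<cgroup-path>/memory.limit_in_bytes
> cat /sys/fs/cgroup/memory/<cgroup-path>/memory.usage_in_bytes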

Update

My tasks are used for scientific computing, which consumes a lot of memory; it seems that overcommit_memory=1 may be the best choice.

Update 2

Actually, I'm working on a data analysis project which consumes more than 16 GB of memory, but I was asked to limit it to about 5 GB. It is probably impossible to implement this requirement by optimizing the program itself, because the project runs many sub-commands, and most of them do not provide options like Java's Xms or Xmx.

Update 3

My project should run as an overcommitted system. Exactly as a3f said, it seems that my apps prefer to crash via xmalloc when a memory allocation fails.

> cat /proc/sys/vm/overcommit_memory
2
> ./test/mem.sh
./test/mem.sh: xmalloc: .././subst.c:3542: cannot allocate 1073741825 bytes (4295237632 bytes allocated)
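
For reference, with mode 2 in effect the kernel exposes the resulting cap in `/proc/meminfo`: comparing `CommitLimit` (swap plus `overcommit_ratio` percent of physical RAM) with `Committed_AS` (what is already committed) shows how close the system is to refusing allocations:

> grep -i commit /proc/meminfo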

I don't want to surrender, although so many awful tests have left me exhausted. So please show me a way to the light ; )

Yang
  • I just found this: "How to Configure the Linux Out-of-Memory Killer" https://www.oracle.com/technical-resources/articles/it-infrastructure/dev-oom-killer.html I'm glad the accepted answer links to documentation, but a full tutorial/guide would be more helpful I think. – PJ Brunet Jul 11 '22 at 16:34
  • Since you tagged `docker`, the best way to limit memory resources is via docker/compose/k8s. Just check the docs, depending on the docker orchestration mechanism you are using; e.g. in docker-compose it's [`mem_limit`](https://docs.docker.com/compose/compose-file/#mem_limit) (a minimal `docker run` sketch follows below). – RodolfoAP Mar 07 '23 at 09:46
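
For reference, a minimal sketch of the same idea with plain `docker run` (the image name and command are placeholders; `5g` matches the limit from _Update 2_):

> docker run --memory=5g --memory-swap=5g my-analysis-image ./run.sh

Setting `--memory-swap` equal to `--memory` prevents the container from using swap on top of the 5 GB.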

2 Answers


The OOM killer won't go away. If there is no memory, someone's got to pay. What you can do is set a limit after which memory allocations fail. That's exactly what setting vm.overcommit_memory to 2 achieves.

From the docs:

The Linux kernel supports the following overcommit handling modes

2 - Don't overcommit. The total address space commit for the system is not permitted to exceed swap + a configurable amount (default is 50%) of physical RAM. Depending on the amount you use, in most situations this means a process will not be killed while accessing pages but will receive errors on memory allocation as appropriate.
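
Applied to your case, switching modes could look like this sketch (`vm.overcommit_ratio` defaults to 50; add both keys to `/etc/sysctl.conf` to make them persistent):

> sysctl -w vm.overcommit_memory=2
> sysctl -w vm.overcommit_ratio=50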

Normally, the kernel will happily hand out virtual memory (overcommit). Only when you actually reference a page does the kernel have to map it to a real physical frame. If it can't service that request, the OOM killer has to kill a process to make space.

Disabling overcommit means that e.g. malloc(3) will return NULL if the kernel couldn't commit the amount of memory requested. This makes things a bit more predictable, albeit limited (many applications allocate more than they would ever need).
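
If what you actually need is a per-process cap (as in your _Update 2_), one more option is the shell's virtual-memory limit, which likewise makes allocations fail instead of waking the OOM killer. A sketch assuming a 5 GB cap (the value is in kB and is inherited by child processes):

> ulimit -v 5242880
> ./test/mem.sh

Note that this limits address space rather than resident memory, so programs that reserve large sparse mappings may hit it earlier than expected.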

a3f
  • Thanks! My tasks always consume a lot of memory. If I set `overcommit_memory=2`, will the task be paused? That might not be good for scientific computing tasks. – Yang Mar 04 '16 at 09:04
  • @Yang I updated the answer. If you want to avoid OOM situations, you need to buy more RAM (or fix your memory management strategy or keep using overcommit and hope for the best). – a3f Mar 04 '16 at 09:23
  • Thanks for the detailed explanation of VM and paging, I really appreciate it. In _Update 2_, I need a method to limit memory usage for a process while ensuring it keeps running rather than being killed, i.e., it should get `NULL` when requesting more memory but keep computing, like `docker run --oom-kill-disable`. Is there something I missed, or is this idea totally wrong? – Yang Mar 04 '16 at 13:44
  • 1
    @Yang Check out the link I posted. You can set an upper limit after which allocations fail. Your application would then have to deal with the memory allocation failure. Many applications just crash (either implicitly by null pointer dereference or explicitly by e.g. `xmalloc`). I don't know how your application handles it. – a3f Mar 04 '16 at 13:50
  • In _Update 3_, my stupid apps crashed via `xmalloc`; it seems that they do not have any plan to handle memory allocation failure. As [an overcommitted project](http://stackoverflow.com/questions/35695261/how-does-the-container-use-more-memory-than-the-limit), `Always overcommit` might be the appropriate overcommit handling mode, I think. – Yang Mar 04 '16 at 14:24
  • 1
    @Yang Either rewrite the application to consume less or increase the physical memory. The second option is probably cheaper. – a3f Mar 04 '16 at 15:27
  • malloc NEVER returns NULL under Linux; either some other process is killed, or the process itself is killed – davide May 09 '16 at 11:07
  • 1
    @workless This depends on the overcommit strategy, check the linked doc for more information. And even with the default overcommit behavior,`malloc(-1)` will give you `NULL`. – a3f May 09 '16 at 17:04
  • @a3f Thanks, I wasn't aware of that. There are so many programs out there that don't even check the returned address; let's be careful about disabling overcommit – davide May 10 '16 at 09:55

The possible values of oom_adj range from -17 to +15. The higher the score, the more likely the associated process is to be killed by the OOM killer. If oom_adj is set to -17, the process is not considered for OOM-killing at all.
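
A sketch of applying this to a running process (`<pid>` is a placeholder; note that newer kernels deprecate `oom_adj` in favour of `oom_score_adj`, which ranges from -1000 to +1000):

> echo -17 > /proc/<pid>/oom_adj
> echo -1000 > /proc/<pid>/oom_score_adj    # equivalent on newer kernels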

But increasing RAM is the better choice; if increasing RAM is not possible, then add swap memory.

To increase swap memory, you can create and enable a swap file.
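
A minimal sketch of that procedure (the `/swapfile` path and the 4G size are assumptions; add an entry to `/etc/fstab` to keep the swap area across reboots):

> fallocate -l 4G /swapfile
> chmod 600 /swapfile
> mkswap /swapfile
> swapon /swapfile
> swapon --show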

Ravipati Praveen
  • Please [do not post an answer that consists essentially of a link](https://stackoverflow.com/questions/how-to-answer). Include the important points in your answer; leave the link for extra information or as a reference. – glennsl Sep 25 '17 at 11:32
  • @glennsl Thanks. – Ravipati Praveen Sep 25 '17 at 11:42
  • @glennsl, thanks for the information and link you provided. I updated my post. – Ravipati Praveen Sep 25 '17 at 11:59
  • @RavipatiPraveen Thank you! Disabling OOM-killing is really what I want, by setting `oom_adj` to -17 for the task. Increasing swap sounds really cool, and it might solve the problem. I'll give it a try later. Thank you! – Yang Sep 26 '17 at 00:03