1

I am testing my server app in a Docker container, and I saw that it stopped with exit code 137.

root@debian:~# docker ps -a
CONTAINER ID        IMAGE                   COMMAND                 CREATED             STATUS                      PORTS               NAMES
821959f20624        webserver-in-c_server   "./webserver -p 8080"   2 weeks ago         Exited (137) 40 hours ago                       server
root@debian:~# 

Here is the docker inspect output for the dead container; OOMKilled is set to false:

root@debian:~# docker inspect server
[
    {
        "Id": "821959f206244d90297cfa0e31a89f4c8e06e3459cd8067e92b7cbb2e6fca3e0",
        "Created": "2020-11-25T15:13:10.989199751Z",
        "Path": "./webserver",
        "Args": [
            "-p",
            "8080"
        ],
        "State": {
            "Status": "exited",
            "Running": false,
            "Paused": false,
            "Restarting": false,
            "OOMKilled": false,
            "Dead": false,
            "Pid": 0,
            "ExitCode": 137,
            "Error": "",
            "StartedAt": "2020-11-25T15:13:12.321234415Z",
            "FinishedAt": "2020-12-09T17:55:30.649883125Z"
        },

So my question is: could dmesg messages like the ones below also cause the container to be killed?

...
[1969112.586796] TCP: out of memory -- consider tuning tcp_mem
[1969122.585736] TCP: out of memory -- consider tuning tcp_mem
[1969132.585344] TCP: out of memory -- consider tuning tcp_mem
[1969142.585455] TCP: out of memory -- consider tuning tcp_mem
[1969152.598334] TCP: out of memory -- consider tuning tcp_mem
[1969162.585242] TCP: out of memory -- consider tuning tcp_mem

Thanks in advance!

Xin
  • My guess is that they're not directly related, but if your system is generally low on memory then you could see both symptoms. Without more details or a [mcve] it's a little hard to say more, or give more than a yes-or-no answer. – David Maze Dec 11 '20 at 12:14
  • @DavidMaze The only additional info I can provide is my VPS memory: 1 GB. – Xin Dec 11 '20 at 12:51
  • A quick search through the sources shows that when the general TCP memory limit is reached, the kernel just drops (sends a reset packet to the peer for) the connection which hits the memory limit, thus decreasing the TCP stack memory usage. No OOM code is involved, which is logical: OOM is the last resort thing, and there is no reason for killing some process as a whole when you can just get rid of some connections. – Danila Kiver Dec 11 '20 at 16:19
  • As to why the container gets killed... Do you have some TCP/HTTP-based health check for your container (also, do you run it directly with Docker, or using some orchestrator like k8s)? If yes, the container could be killed just because its health check connection was dropped (due to the TCP memory limit being reached), and the container was considered unhealthy by the supervising process. – Danila Kiver Dec 11 '20 at 16:21
  • @DanilaKiver, hi, thanks for your comments! I just run it directly with Docker, and I don't have a TCP/HTTP health check for this container either. So I am very curious about how it ended up like this... – Xin Dec 12 '20 at 12:46

2 Answers

1

If you don't set a memory limit on the container, then docker will never be the one to kill the process, and OOMKilled is expected to always be false.

Whether or not a memory limit is set in Docker, it's always possible for the Linux kernel itself to kill the process when the Linux host runs out of memory. (The reason for configuring a Docker memory limit is to avoid this, or at least to have some control over which containers get killed before this happens.) When the kernel kills your process, it gets signal 9, aka SIGKILL, which the application cannot trap, so it exits immediately. This shows up as exit code 137 (128 + 9).
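
For example, a minimal shell sketch (reusing the container name from the question) to confirm how 137 decodes and to pull just the relevant state fields:

# 137 - 128 = 9, and signal 9 is SIGKILL
kill -l 9    # prints: KILL

# Show only the exit code and the OOMKilled flag for the container
docker inspect --format 'exit={{.State.ExitCode}} oomkilled={{.State.OOMKilled}}' server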

You can dig more into syslog, various kernel logs under /var/log, and dmesg to find more evidence of the kernel encountering an OOM and killing processes on the host. When this happens, the options are to avoid running memory hungry processes, adjust the app to use less memory, or add more memory to the host.
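
A few hedged starting points for that search (exact log locations vary by distro; the date below is just the FinishedAt timestamp from the inspect output):

# Host-wide OOM killer activity usually leaves lines like "Out of memory: Killed process ..."
dmesg -T | grep -iE 'out of memory|oom-killer|killed process'
journalctl -k --since '2020-12-09' | grep -iE 'oom|killed process'
grep -i 'killed process' /var/log/syslog /var/log/messages 2>/dev/null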

BMitch
  • This answer just does not address the core point of the question (which makes it unique and interesting): is TCP memory overflow related to the container's crash, and how? Does the kernel kill the process which causes TCP memory overflow or not? Does Docker somehow react to such exotic kind of event or no? – Danila Kiver Dec 16 '20 at 18:15
  • @DanilaKiver my suspicion is that TCP out of memory is a side effect of the host running out of memory. Docker is unlikely to have special handling, but perhaps Go would panic or throw an error that would crash part of docker. That's unlikely to be the case here since the app itself is shown to have exited. – BMitch Dec 16 '20 at 18:35
  • I dug around before; the kernel logs and dmesg showed me nothing suggesting the kernel was doing this "dirty" job... – Xin Dec 16 '20 at 21:11
  • If it did, the log can be in a few places depending on the distro settings; see the various answers to https://stackoverflow.com/q/624857/596285 – BMitch Dec 16 '20 at 21:24
0

I have never seen a "TCP: out of memory" message before, so I will drop you a few lines that may help. First, regarding these fields from the inspect output:

        "OOMKilled": false,
        "Dead": false,

These are the default values when you run a container (for example, "docker run xxx"). I also found an already-answered entry; in its last answer you can find more information about why a container can hit an OOM even with those flags set to false. As BNT's reply, quoting the official Docker documentation, put it: "By default, kernel kills processes in a container if an out-of-memory (OOM) error occurs."
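
As a sketch of the docker run knobs that control that behaviour (the limit values below are only illustrative, not recommendations):

# --memory sets a hard cgroup limit; exceeding it triggers the cgroup OOM killer
# and is what makes Docker report OOMKilled=true
# --oom-score-adj makes the container a more (or less) likely victim when the
# whole host runs out of memory
docker run -d --name server \
  --memory 512m --memory-swap 512m --oom-score-adj 500 \
  webserver-in-c_server ./webserver -p 8080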

Second, the "TCP: out of memory -- consider tuning tcp_mem" message, as far as I can see, is more likely an issue at the kernel TCP level, as explained in this SAP tech article, where in their case they changed the network kernel parameters to solve the issue. However, I would recommend checking your settings with:

sysctl -a | grep rmem

Try changing the parameters via /proc just to test, and then, if you need them to persist across reboots, make the changes in /etc/sysctl.conf. Here is more info about TCP/IP kernel parameters.
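
For example, a quick sketch of checking and tuning tcp_mem (the three numbers are the low/pressure/high thresholds in memory pages; the values below are purely illustrative, so size them from your available RAM):

# Current thresholds
cat /proc/sys/net/ipv4/tcp_mem
sysctl net.ipv4.tcp_mem

# Temporary change via /proc (lost on reboot)
echo '16384 32768 65536' > /proc/sys/net/ipv4/tcp_mem

# Persist across reboots
echo 'net.ipv4.tcp_mem = 16384 32768 65536' >> /etc/sysctl.conf
sysctl -p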

Moreover, whether or not you set the flag that prevents the container from being killed, you will keep getting the "TCP OOM" until you tune your TCP socket parameters as explained above. Additionally, I shared a pretty good analysis here that explains the flow of tcp_mem and the function tcp_check_oom(); so basically there will not be a SIGKILL, just a TCP OOM. One quote from it says:

In addition, the Linux kernel transitions to memory pressure mode when TCP OOM occurs, limiting the memory allocated to the send/receive buffers of TCP sockets. Therefore, there is a penalty for sending and receiving performance over TCP.
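
To see how close the host actually is to those limits, one rough check is to compare the TCP page count in /proc/net/sockstat against the tcp_mem thresholds:

# "mem" on the TCP line is the number of pages currently used by TCP buffers
cat /proc/net/sockstat
sysctl net.ipv4.tcp_mem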

Hope the information is useful for you; whether it gets marked as an answer or not, it remains open to review/edit so a better answer can be found.

J.Rojas
  • Hi, thanks for the answer. I understand now that the OOM killer won't be enabled unless I set that flag to enable it, good to know! But what I found interesting is that when I checked the system logs, e.g. `dmesg`, I couldn't find any clue that this container process was killed by the kernel. Also, these are my memory settings: `net.ipv4.tcp_rmem = 4096 131072 6291456` `net.ipv4.udp_rmem_min = 4096` – Xin Dec 12 '20 at 12:50
  • Hi Xin, as far as I could read and understand about "TCP: out of memory", you are in a state where the TCP memory quota exceeds the hard limit (tcp_mem). In this state the host apparently "does not kill the container"; the scenario happens at the kernel level. So my thought: when the TCP OOM is detected, the kernel sends an RST on the corresponding socket, discards the socket, and sends a probe, so the connection is interrupted, but the process is not killed (I am not 100% sure). I will edit my "answer/help" above to add more related info, because apparently it's a tricky but interesting topic. – J.Rojas Dec 15 '20 at 13:45