
My containers run in Kubernetes, and I see a lot of OOM kills in the Kubernetes node log. The log entries only contain the host process ID, so it's hard to work out which containers were OOM-killed. I don't know how to find the container ID from the host process ID.

Is there any way I can get the host process ID from within a container, so that I can build a mapping?

This is the node log:

2020-04-28 09:27:15.530 HKT
I0428 01:27:15.530763 1627 log_monitor.go:115] New status generated: &{Source:kernel-monitor Events:[{Severity:warn Timestamp:2020-04-28 01:27:08.060896434 +0000 UTC m=+89600.088785273 Reason:OOMKilling Message:Memory cgroup out of memory: Kill process 2493556 (node) score 1432 or sacrifice child

2020-04-28 09:29:15.000 HKT
Memory cgroup out of memory: Kill process 2493562 (node) score 1529 or sacrifice child Killed process 2493562 (node) total-vm:14009952kB, anon-rss:3146688kB, file-rss:28720kB, shmem-rss:0kB

2020-04-28 09:29:15.000 HKT
Memory cgroup out of memory: Kill process 2496433 (node) score 1275 or sacrifice child Killed process 2496433 (node) total-vm:7183684kB, anon-rss:1833580kB, file-rss:28804kB, shmem-rss:0kB

2020-04-28 09:29:15.309 HKT
I0428 01:29:15.309829 1627 log_monitor.go:115] New status generated: &{Source:kernel-monitor Events:[{Severity:warn Timestamp:2020-04-28 01:29:07.829961434 +0000 UTC m=+89719.857850273 Reason:OOMKilling Message:Memory cgroup out of memory: Kill process 2493562 (node) score 1529 or sacrifice child

2020-04-28 09:29:15.330 HKT
I0428 01:29:15.329925 1627 log_monitor.go:115] New status generated: &{Source:kernel-monitor Events:[{Severity:warn Timestamp:2020-04-28 01:29:07.849907434 +0000 UTC m=+89719.877796273 Reason:OOMKilling Message:Memory cgroup out of memory: Kill process 2496433 (node) score 1275 or sacrifice child

2020-04-28 09:48:29.000 HKT
Memory cgroup out of memory: Kill process 3086395 (monitor) score 237 or sacrifice child Killed process 3086395 (monitor) total-vm:130128kB, anon-rss:9204kB, file-rss:15488kB, shmem-rss:0kB
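To collect the PIDs from messages like the ones above, a small script can pick out the "Killed process" fragments. This is only a sketch: the regular expression is written against the exact wording in this log, and the names `KILLED_RE` and `killed_processes` are made up for the example.

```python
import re

# Matches the kernel's "Killed process <pid> (<name>)" fragment as it
# appears in the node log above. Other kernel versions may word it differently.
KILLED_RE = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def killed_processes(log_text):
    """Return (pid, command name) pairs for every OOM kill found in log_text."""
    return [(int(pid), name) for pid, name in KILLED_RE.findall(log_text)]


sample = (
    "Memory cgroup out of memory: Kill process 2493562 (node) score 1529 "
    "or sacrifice child Killed process 2493562 (node) total-vm:14009952kB, "
    "anon-rss:3146688kB, file-rss:28720kB, shmem-rss:0kB"
)
print(killed_processes(sample))
```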

Gabriel Wu

2 Answers


How to find the pod name from the host process ID:

nsenter -t $PID -u hostname

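Run this on the node. It prints the hostname seen inside the process's UTS namespace, which by default is the pod name. From there you can find the container ID easily, I think.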
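Another way to go from a host PID to a container, sketched here in Python, is to read `/proc/<pid>/cgroup` on the node. It assumes the usual `kubepods` cgroup layout, where the path contains the pod UID and a 64-character container ID; the exact layout depends on the cgroup driver and container runtime, so treat the patterns as assumptions. The function names are hypothetical. It only works while the process is still alive, because the `/proc` entry disappears once the process has been killed.

```python
import re

# Assumption: container IDs appear in the cgroup path as 64 hex characters.
CONTAINER_ID_RE = re.compile(r"([0-9a-f]{64})")
# Assumption: the pod UID follows "pod" in the path; the systemd cgroup
# driver writes it with underscores instead of dashes.
POD_UID_RE = re.compile(r"pod([0-9a-f]{8}(?:[-_][0-9a-f]{4}){3}[-_][0-9a-f]{12})")


def parse_cgroup(cgroup_text):
    """Extract (pod_uid, container_id) from the contents of /proc/<pid>/cgroup.

    Either value is None when the pattern is not found.
    """
    pod = POD_UID_RE.search(cgroup_text)
    cid = CONTAINER_ID_RE.search(cgroup_text)
    pod_uid = pod.group(1).replace("_", "-") if pod else None
    container_id = cid.group(1) if cid else None
    return pod_uid, container_id


def lookup(pid):
    """Read /proc/<pid>/cgroup on the host and parse it. Run this on the node."""
    with open(f"/proc/{pid}/cgroup") as f:
        return parse_cgroup(f.read())
```

Calling `lookup(pid)` on the node for a live process returns the pod UID and container ID, which can then be matched against `kubectl get pods -o yaml` output.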

YwH

Alright, this looks like a design error to me. I would suggest running one process per container and making that process the main process (PID 1). With this design you can tell which pod is having issues, since Kubernetes will restart the pod as soon as the process hits the memory limit. In any case, with Kubernetes you should be able to see which pod has been restarted by running kubectl get pods and looking for pods in an error state or with an increasing restart counter. If you want to use only the node logs, which I discourage, you might find that this solution helps: CoreOS - get docker container name by PID?
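As a rough illustration of the kubectl approach, the sketch below lists containers whose last termination reason was `OOMKilled`, based on the `lastState.terminated.reason` field in `kubectl get pods -o json` output. The function names are invented for the example, and `fetch_pods` assumes `kubectl` is on the PATH and configured for the cluster. A container may not be reported this way if only a child process was killed and the container itself kept running.

```python
import json
import subprocess


def oom_killed_containers(pod_list):
    """Yield (namespace, pod, container, restart_count) for containers whose
    last termination reason was OOMKilled.

    pod_list is the parsed JSON from `kubectl get pods -o json`.
    """
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        statuses = pod.get("status", {}).get("containerStatuses", [])
        for status in statuses:
            terminated = status.get("lastState", {}).get("terminated") or {}
            if terminated.get("reason") == "OOMKilled":
                yield (
                    meta.get("namespace"),
                    meta.get("name"),
                    status.get("name"),
                    status.get("restartCount", 0),
                )


def fetch_pods():
    """Fetch all pods as parsed JSON. Assumes kubectl is installed and configured."""
    out = subprocess.run(
        ["kubectl", "get", "pods", "--all-namespaces", "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)
```

For example, `for row in oom_killed_containers(fetch_pods()): print(row)` prints one tuple per affected container.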

Giorgio Cerruti
  • It's hard to guarantee one process per container. Within a container there can be lots of child processes. In the end, it turned out that mapping the host PID to a container ID is meaningless, because the problem can be in a child process (for example, a child process gets OOM-killed) while the parent process PID has nothing to do with the OOM at all. So in the end I chose to run "ps aux --forest" on the node on a schedule and keep snapshots of the processes there – Gabriel Wu Apr 28 '20 at 09:59
  • I took a look at "CoreOS - get docker container name by PID?". It doesn't meet my needs. I need to get the host PID from within the container while the container is running, whereas that docker command has to be executed on the host. Still, thanks @Giorgio, I appreciate your reply – Gabriel Wu Apr 28 '20 at 10:02
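The scheduled "ps aux --forest" snapshot mentioned in the comments could look roughly like the sketch below. It is not the commenter's actual script: the function names, the output directory, the file naming, and the 60-second interval are all assumptions made for illustration. It has to run on the node, and `--forest` requires the procps version of `ps`.

```python
import subprocess
import time
from datetime import datetime, timezone
from pathlib import Path


def take_snapshot(out_dir, command=("ps", "aux", "--forest")):
    """Run command once and save its output to a timestamped file in out_dir.

    Returns the path of the file that was written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    path = out_dir / f"ps-{stamp}.txt"
    result = subprocess.run(
        list(command), check=True, capture_output=True, text=True
    )
    path.write_text(result.stdout)
    return path


def run_forever(out_dir, interval_seconds=60):
    """Take a snapshot every interval_seconds until interrupted."""
    while True:
        take_snapshot(out_dir)
        time.sleep(interval_seconds)
```

After an OOM kill, the PID from the node log can be searched for in the most recent snapshot to see which process tree it belonged to.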