
I need to export a memory dump from an AKS cluster and save it to some location.

How can I do this? Is it easy to export to a storage account? Is there another solution? Can someone give me a step-by-step guide?

exitista

2 Answers


EDIT: My previous answer was wrong; I didn't notice that you needed a dump. You will actually need to get it from Boot Diagnostics or from some command-line tool:

https://learn.microsoft.com/en-us/azure/virtual-machines/troubleshooting/boot-diagnostics#enable-boot-diagnostics-on-existing-virtual-machine

Thiago Custodio

This question is quite old, but let me nevertheless share how I solved it:

Linux has a per-process resource limit called RLIMIT_CORE, which caps the size of the core dump written when your application crashes. This is the part you find quite quickly.
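To make the limit concrete, here is a minimal Python sketch that reads RLIMIT_CORE for the current process and raises the soft limit to the hard limit. It uses the standard `resource` module (Unix only); the function name `raise_core_limit` is my own and not part of any tool mentioned here.

```python
import resource


def raise_core_limit():
    """Raise the soft RLIMIT_CORE to the hard limit and return the new pair."""
    # getrlimit returns (soft, hard); resource.RLIM_INFINITY means "unlimited".
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    # An unprivileged process may raise its soft limit up to the hard limit.
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


if __name__ == "__main__":
    print(raise_core_limit())
```

A soft limit of 0 means no core file is written at all, so this is the first thing to check when no dump shows up.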

Next, you have to define where core files are saved, which is done in the file /proc/sys/kernel/core_pattern. The value can be a relative file name (saved in the working directory of the process that crashed), an absolute path (absolute within the mount namespace of that process) or, and here is where it gets interesting, a pipe symbol followed by an absolute path to an executable (application or script). According to the docs (see the section "Piping core dumps to a program"), this program is started as user and group root. Furthermore, according to this post on the Linux mailing list, it is also executed in the global namespace, in other words, outside of the container.
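The three forms can be told apart by the first character of the value. A small Python sketch, assuming a Linux host for the read; both function names are hypothetical helpers, not part of any library:

```python
import os

CORE_PATTERN_FILE = "/proc/sys/kernel/core_pattern"


def classify_core_pattern(pattern):
    """Return 'pipe', 'absolute' or 'relative' for a core_pattern value."""
    pattern = pattern.strip()
    if pattern.startswith("|"):
        return "pipe"      # the rest is a program that receives the dump on stdin
    if pattern.startswith("/"):
        return "absolute"  # a path inside the crashing process's mount namespace
    return "relative"      # resolved against the process's working directory


def read_core_pattern(path=CORE_PATTERN_FILE):
    """Read the current value; the file only exists on Linux."""
    with open(path) as f:
        return f.read().strip()


if __name__ == "__main__" and os.path.exists(CORE_PATTERN_FILE):
    current = read_core_pattern()
    print(current, "->", classify_core_pattern(current))
```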

If, like me, you do not have access to the image used for new nodes on your AKS cluster, you can set these values using a DaemonSet, which runs one pod on every node.

Armed with all this knowledge, you can do the following:

  1. Create a DaemonSet - a pod running on every machine performing the initial setup.
  2. This DaemonSet will run as a privileged container to allow it to switch to the root namespace.
  3. After having switched namespaces successfully, it can change the value of /proc/sys/kernel/core_pattern.
  4. The value should be something like |/bin/dd of=/core/%h.%e.%p.%t (dd reads the core file from stdin and writes it to the location given by the of parameter). Core files will now be saved in /core/. The file name is built from the variables described in the docs for core files.
  5. Since the files are now saved to /core/ in the root namespace, we can mount our storage there; in my case, Azure File Storage. Here's a tutorial on how to mount Azure File Storage.
  6. Pods in a DaemonSet have their RestartPolicy set to Always. Since the job of your pod is done at this point and you don't want it to be restarted automatically, keep it running using sleep infinity.
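Steps 1 to 4 and 6 could be sketched as follows. This is not the configuration from the Microsoft thread, only an illustration: the Python script builds a DaemonSet manifest and prints it as JSON, which kubectl accepts. The name `core-pattern-setup` and the image are placeholders I made up, and the image is assumed to ship `nsenter`. Mounting the storage at /core on the node (step 5) is left out.

```python
import json

# Value from step 4; dd receives the core file on stdin and writes it to /core/.
CORE_PATTERN = "|/bin/dd of=/core/%h.%e.%p.%t"


def build_daemonset(name="core-pattern-setup", image="alpine:3.19"):
    """Return a DaemonSet manifest as a dict (name and image are hypothetical)."""
    # nsenter joins the mount namespace of PID 1 on the host (needs hostPID),
    # writes the pattern, and then the container idles so it is not restarted.
    script = (
        "nsenter -t 1 -m -- sh -c "
        f"'echo \"{CORE_PATTERN}\" > /proc/sys/kernel/core_pattern' "
        "&& sleep infinity"
    )
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": name},
        "spec": {
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "hostPID": True,  # makes the host's PID 1 visible to nsenter
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "securityContext": {"privileged": True},
                            "command": ["sh", "-c", script],
                        }
                    ],
                },
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_daemonset(), indent=2))
```

Saved under a file name of your choice, the output can be piped straight into `kubectl apply -f -`.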

This writeup is almost a copy of what I found out while in contact with Microsoft support. Here's the thread in their forum, which contains an almost finished configuration for a DaemonSet.

I'll leave some links here which I used during my research:

Sidenote:

I could also have just mounted the Azure File Storage share into every container and set /proc/sys/kernel/core_pattern to just /core/%h.%e.%p.%t, but this would require me to declare the mount on every container. Doing it the way described above frees the pod configuration from this administrative task and puts it where it (in my opinion) belongs: in the initial machine setup.
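To see what file names the template /core/%h.%e.%p.%t produces, here is a rough Python sketch of the substitution. It only handles %h (hostname), %e (executable name), %p (PID), %t (dump time in seconds since the epoch) and %%; the kernel knows more specifiers, and this helper (a name of my own) leaves anything else untouched.

```python
import re


def expand_core_template(template, hostname, exe, pid, timestamp):
    """Expand the %h, %e, %p, %t and %% specifiers of a core_pattern template."""
    values = {
        "h": hostname,
        "e": exe,
        "p": str(pid),
        "t": str(timestamp),
        "%": "%",
    }
    # Replace each %<char>; unknown specifiers are kept as written.
    return re.sub(r"%(.)", lambda m: values.get(m.group(1), m.group(0)), template)


if __name__ == "__main__":
    # Sample values are made up for illustration.
    print(expand_core_template("/core/%h.%e.%p.%t", "node1", "myapp", 1234, 1700000000))
```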

SimonSimCity
  • Could you please elaborate on the sidenote you have written? We have our services running in containers, and there are 100+ of them. We want to get the core file of a container if it crashes. Our requirement is to make each container write its core file to a shared Azure file storage. Can we accomplish that by just changing the core_pattern for each container? – PraveenMak Oct 18 '21 at 18:53
  • @PraveenMak I only know of the prescribed ways. If you do not want to mount a share into every container, I'd advise you to follow the steps above and ignore my sidenote. If you cannot take control of the host-system (like not being able to run a privileged container), I'd have to check. This thread might help also: https://github.com/moby/moby/issues/11740 – SimonSimCity Oct 22 '21 at 18:07