2

I have been deploying containers on GCP Compute Engine VMs using Google's Container-Optimized OS, and I am struggling to understand what happens to the deployed containers when the host VM is stopped in GCP.

When my containers receive a SIGTERM or SIGINT signal, they perform some cleanup and write some files into mounted volumes. I have tested this extensively with docker stop and docker kill -s SIGINT. However, the cleanup doesn't seem to happen when I stop the host machine in GCP.

I'm not entirely sure how to debug this process. I tried attaching to the VM's serial console, but it doesn't seem to have any info pertaining to the container shutdown logic.

Any guidance would be very appreciated! For reference, this is the image I am deploying.


Full reproduction steps:

Create a new "Compute Engine" VM with "Deploy a container image to this VM." I have been using an e2-medium with a 20 GB boot disk.

Use the "lloesche/valheim-server" image.

Set the following env variables:

SERVER_NAME: Test
WORLD_NAME: Test
SERVER_PASS: Password # must be at least 5 characters

Add a volume mount of type "Directory" with "/config" as the mount path and "/home/YOUR_GCP_USERNAME/valheim-server-config" as the host path in "Read/write" mode.

After the container starts up, you should have the image running on the host machine (lloesche/valheim-server). You should also have a file created at ~/valheim-server-config/worlds/ called Test.fw1.

Now, stopping this container (docker stop) should cause a write to that file. You can verify this by stopping the container and then observing that file's modified date.

However, this doesn't seem to happen when the host instance is stopped. If you restart the host so the container is running again and then issue a "stop" to the host, that file isn't saved before the container is killed.
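One way to narrow this down is to swap the game server for a tiny probe container that does nothing except record whether a signal arrives. The following is a minimal sketch, not something from the original setup: the `MARKER` path, `on_signal`, and `run_probe` are hypothetical names, and it assumes the marker file lives on the same "/config" mount so it survives the container.

```shell
#!/bin/sh
# Hypothetical probe: appends a line to MARKER when SIGTERM or SIGINT arrives.
# MARKER defaults to a file on the /config mount (an assumption for this sketch).
MARKER="${MARKER:-/config/shutdown-marker.txt}"

on_signal() {
    # $1 is the signal name supplied by the traps in run_probe.
    echo "received $1 at $(date -u +%Y-%m-%dT%H:%M:%SZ)" >> "$MARKER"
    exit 0
}

run_probe() {
    trap 'on_signal TERM' TERM
    trap 'on_signal INT' INT
    # Sleep in the background and wait on it, so the shell can run the
    # trap as soon as the signal is delivered.
    while :; do
        sleep 1 &
        wait $!
    done
}
```

Calling `run_probe` as the last line of the container's entrypoint script would keep the container alive until it is signalled. If the marker file gains a line after docker stop but not after stopping the VM, the signal is probably not reaching the container at all.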

  • Can you provide more details on how you tested the shutdown behavior, or any logs that you have? How did you test the (let's call it) "GCP shutdown"? Can you provide steps for reproduction? What's your goal here? – Wojtek_B Feb 17 '21 at 08:47
  • @Wojtek_B - part of my issue is that I'm not seeing any logs that pertain to the docker on the host during shutdown. I have provided full reproduction steps. – si1entstill Feb 17 '21 at 15:14
  • I think the issue is that the log level for shutdown is "INFO". The Container OS log level is set lower. I never figured out how to change that "persistently", meaning a change to the level that survives rebooting the instance. My comments to this question show how to change the logging level (2nd comment from the end): https://stackoverflow.com/q/65721133/8016720 – John Hanley Feb 17 '21 at 19:40
  • Details from the referenced question: Edit the file /etc/stackdriver/logging.config.d/fluentd-lakitu.conf Look for the section Collects all journal logs with priority >= warning. The PRIORITY is 0 -> 4. If you add "5" and "6" to the list, then the startup-scripts are logged in Operations Logging. However, this change is not persistent across reboots. The question now is how to make this change persistent. – John Hanley Feb 17 '21 at 19:40
  • @JohnHanley - My docker service's logs are typically being written fine (I see them in the gce logs explorer). However, the first log I see after shutting down the instance is a "Daemon shutdown complete" log about a second after the container receives the "stop" command. – si1entstill Feb 17 '21 at 20:11
  • I think you missed my point. There are different logging levels. The container startup and shutdown detail is not logged at the current level. – John Hanley Feb 17 '21 at 20:51
  • ah - you just mentioned startup scripts so I thought it may be tied to something with script output specifically. I can see if the daemon or the containers say anything else if I change it. – si1entstill Feb 17 '21 at 22:13
  • Were you able to solve this ? Did you build your container image directly from Github repo or first cloned it to your machine ? – Wojtek_B Feb 22 '21 at 11:50
  • @Wojtek_B - unfortunately not. I tried to alter the logging levels per John's suggestion, but I still wasn't seeing anything from the containers during shutdown before I got the "Daemon shutdown complete" log. I'm not sure if there aren't any additional logs, or my tweak was unsuccessful. – si1entstill Feb 22 '21 at 14:57
  • Maybe try with a different container - or maybe another version ? I tried to replicate your issue but got the same results too. – Wojtek_B Feb 22 '21 at 15:28
  • @Wojtek_B - I tried a different image and observed similar behavior. It doesn't look like the daemon is waiting for the containers to exit after sending a SIGTERM (if it's even sending one). – si1entstill Feb 22 '21 at 16:20
  • Do you have any logs from the host machine that can be analysed ? – Wojtek_B Feb 23 '21 at 14:28
  • @Wojtek_B - [here](https://pastebin.com/rDkXPBPJ) are all of the logs I'm getting – si1entstill Feb 25 '21 at 22:02

2 Answers

2

I had the same problem and I found a workaround (not perfect but works for me). Add as part of your startup-script:

mkdir -p /etc/systemd/system/docker.service.d
printf "[Service]\nExecStop=/bin/sh -c 'docker stop \$(docker ps -q)'" > /etc/systemd/system/docker.service.d/override.conf

Usually (and also in this case for testing) you can edit the override file (which adds your config to the existing config) with sudo systemctl edit docker.service. Unfortunately, the override file is apparently deleted every time the system starts, which is why I persisted it via the startup-script.
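The same drop-in can be written with a here-document, which avoids the escaping inside printf. This is only a sketch that produces the same file contents as the two lines above; the function name `write_docker_stop_override` and its optional directory argument are hypothetical, added so the function can be tried against a scratch directory first.

```shell
#!/bin/sh
# Sketch: write the docker.service drop-in from the startup script.
# The optional first argument (hypothetical) overrides the target directory,
# which makes a dry run in a scratch directory possible.
write_docker_stop_override() {
    dropin_dir="${1:-/etc/systemd/system/docker.service.d}"
    mkdir -p "$dropin_dir"
    # The quoted 'EOF' keeps $(docker ps -q) literal, so it is evaluated
    # when the service stops rather than when this file is written.
    cat > "$dropin_dir/override.conf" <<'EOF'
[Service]
ExecStop=/bin/sh -c 'docker stop $(docker ps -q)'
EOF
}
```

The startup script would then call `write_docker_stop_override` with no argument. systemd normally needs a `systemctl daemon-reload` before it notices a new drop-in; whether that is required at this point in the Container-Optimized OS boot sequence is an assumption that has not been verified here.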

Before this approach I tried what Wojtek_B suggested (sorry, my reputation is too low to comment directly), but that did not work. The reason is that the docker daemon gets the termination signal before the shutdown script is processed. Since involving docker within the shutdown-script of the "Container-Optimized OS" fails (or is at least risky), it could be regarded as a bug.

0

I went through the logs and found nothing that would point me to a solution.

There may, however, be a workaround for this.

You can use a shutdown script to stop your containers more "gracefully" before the VM shuts down.

You can provide the script using the gcloud command:

gcloud compute instances create example-instance \
    --metadata-from-file shutdown-script=examples/scripts/install.sh

or using the console UI, where you specify the shutdown script directly with the shutdown-script metadata key:

In the Cloud Console, go to the VM instances page.

Click Create instance.

On the Create a new instance page, fill in the properties for your instance.

For advanced configuration options, expand the Management, security, disks, networking, sole tenancy section.

In the Metadata section, fill in shutdown-script as the metadata key.

In the Value box, supply the contents of your shutdown script.

Click Create to create the instance.
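As an illustration of what such a shutdown script could contain, here is a minimal sketch. The names `GRACE_SECONDS`, `DOCKER`, and `stop_all_containers` are hypothetical, the 30-second grace period is an arbitrary choice, and whether the script runs before the Docker daemon is stopped on Container-Optimized OS is not confirmed here.

```shell
#!/bin/sh
# Hypothetical shutdown-script body: ask every running container to stop
# and give each one GRACE_SECONDS to finish its cleanup.
GRACE_SECONDS="${GRACE_SECONDS:-30}"
# DOCKER can point at a stub for a dry run (an assumption for this sketch).
DOCKER="${DOCKER:-docker}"

stop_all_containers() {
    ids=$("$DOCKER" ps -q)
    # Nothing to do when no containers are running.
    [ -z "$ids" ] && return 0
    # $ids is intentionally unquoted so each container ID is its own argument.
    "$DOCKER" stop -t "$GRACE_SECONDS" $ids
}
```

The script would end with a call to `stop_all_containers`. Compute Engine gives shutdown scripts only a limited time window, so the grace period should stay well inside it.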

If that does not help, you can create a new issue in the Google Issue Tracker and explain what behavior you expect.

Wojtek_B
  • Thanks for taking a look! I did try this and it didn't seem as though the host OS was executing the shutdown script (supplied via the metadata key). I was attempting to have it issue a shutdown command to the docker container. I may give it another shot and open an issue on the issue tracker. Thanks again! – si1entstill Feb 27 '21 at 02:36
  • Thanks for the feedback - if the shutdown script doesn't work for some reason in your case, report a bug at [IssueTracker](https://issuetracker.google.com) since this may affect more users. – Wojtek_B Mar 01 '21 at 07:26