3

I am using golang to programmatically create and destroy one-off Compute Engine instances using the Compute Engine API.

I can create an instance just fine, but what I'm really having trouble with is launching a container on startup.

You can do it from the Console UI:

[Screenshot: Console UI]

But as far as I can tell, it's extremely hard to do programmatically, especially with Container Optimized OS as the base image. I tried a startup script that runs `docker pull us-central1-docker.pkg.dev/project/repo/image:tag`, but it fails because you first need to run `gcloud auth configure-docker us-central1-docker.pkg.dev`, and COOS has neither gcloud nor a package manager to install it.

All my workarounds seem hacky:

  • Manually create a VM template that has the desired container and create instances of the template
  • Put the container in an external registry like Docker Hub (not acceptable)
  • Use Ubuntu instead of COOS, so that a package manager is available and I can programmatically install gcloud, Docker, and the container on startup
  • Use COOS to pull an image containing gcloud from Docker Hub, then do some sort of docker-in-docker mount to pull my own image

Am I missing something or is it just really cumbersome to deploy a container to a compute engine instance without using gcloud or the Console UI?

Gillespie

4 Answers

2

To have a Compute Engine instance start a container at boot, you have to define metadata that describes the container. When COOS starts, it appears to run an agent called konlet, which can be found here:

https://github.com/GoogleCloudPlatform/konlet

If we look at the documentation for this, it says:

The agent parses container declaration that is stored in VM instance metadata under gce-container-declaration key and starts the container with the declared configuration options.

Unfortunately, I haven't found any formal documentation for the structure of this metadata, but I did find two possible approaches:

  1. Decipher the source code of konlet and break it apart to find out how the metadata maps to what is passed when the docker container is started

or

  2. Create a Compute Engine instance by hand with the desired container definition and start it. SSH into the instance and retrieve its current metadata. Retrieving metadata is described here:

https://cloud.google.com/compute/docs/metadata/overview
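The second approach can be sketched in Go. This is a minimal sketch, assuming it runs on the VM itself: the endpoint path and the `Metadata-Flavor: Google` header follow the metadata server documentation linked above, while `newAttributeRequest` is a hypothetical helper name chosen for illustration.

```go
package main

import (
	"fmt"
	"net/http"
)

// metadataBase is the metadata-server path under which custom instance
// attributes (such as gce-container-declaration) are exposed.
const metadataBase = "http://metadata.google.internal/computeMetadata/v1/instance/attributes/"

// newAttributeRequest (hypothetical helper) builds a GET request for one
// instance attribute. The metadata server rejects requests that lack the
// Metadata-Flavor header.
func newAttributeRequest(key string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, metadataBase+key, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Metadata-Flavor", "Google")
	return req, nil
}

func main() {
	req, err := newAttributeRequest("gce-container-declaration")
	if err != nil {
		panic(err)
	}
	// Only the request is built here; on the VM you would send it with
	// http.DefaultClient.Do(req) and read the response body.
	fmt.Println(req.Method, req.URL.String())
}
```

Sending the request only works from inside the instance, which is why the sketch stops at building it.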

Kolban
  • Thanks for this. I ended up figuring out that COOS comes bundled with `docker-credential-gcr`, so you can use that to pull down your image and launch a container on startup using a startup script (https://cloud.google.com/compute/docs/instances/startup-scripts/linux#api) – Gillespie Nov 12 '21 at 04:48
2

It turns out, it's not too hard to pull down a container from Artifact Registry in Container Optimized OS:

  • Run `docker-credential-gcr configure-docker --registries [region]-docker.pkg.dev`

See: https://cloud.google.com/container-optimized-os/docs/how-to/run-container-instance#accessing_private_images_in_or

So what you can do is put the above line, along with `docker pull [image]` and `docker run ...`, into a startup script. You can specify a startup script when creating an instance using the metadata field: https://cloud.google.com/compute/docs/instances/startup-scripts/linux#api

This seems the least hacky way of provisioning an instance with a container programmatically.
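As a rough illustration, the script and the metadata entry that carries it could be assembled in Go like this. It is a sketch under assumptions: `metadataItem` is a local stand-in for one entry of `metadata.items` in the instance-creation request body (not the client library's type), `startupScript` is a hypothetical helper, and the bare `docker run` omits whatever flags your container needs.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// metadataItem is a local stand-in for one key/value entry of
// "metadata.items" in the instance-creation request body.
type metadataItem struct {
	Key   string `json:"key"`
	Value string `json:"value"`
}

// startupScript (hypothetical helper) returns a shell script that sets up
// registry credentials, pulls the image, and runs it.
func startupScript(registry, image string) string {
	lines := []string{
		"#!/bin/bash",
		"docker-credential-gcr configure-docker --registries " + registry,
		"docker pull " + image,
		"docker run " + image, // add the flags your container needs
	}
	return strings.Join(lines, "\n") + "\n"
}

func main() {
	item := metadataItem{
		Key: "startup-script",
		Value: startupScript(
			"us-central1-docker.pkg.dev",
			"us-central1-docker.pkg.dev/project/repo/image:tag",
		),
	}
	out, err := json.MarshalIndent(item, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

With the Go client library you would set the same key and value on the instance's metadata instead of marshalling JSON by hand.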

Gillespie
  • 1
    Your solution looks like it will work great ... however, I'm wondering if it is the "architected" way? If I'm understanding the COOS story correctly, if you have a Compute Engine that loads the COOS image AND that Compute Engine contains in its defined metadata the identity of the image you want to run, doesn't everything else (including the startup of the identified container) just happen? – Kolban Nov 12 '21 at 05:11
1

You mentioned you used docker-credential-gcr to solve your problem. I tried the same in my startup script:

docker-credential-gcr configure-docker --registries us-east1-docker.pkg.dev

But it returns:

ERROR: Unable to save docker config: mkdir /root/.docker: read-only file system

Is there some other step needed? Thanks.

Derrick
  • It uses the `HOME` environment variable. You can do `HOME=/tmp docker-credential-gcr configure-docker --registries us-east1-docker.pkg.dev` and then when invoking docker use `docker --config /tmp/.docker ...` – Gillespie Mar 03 '22 at 02:13
  • And if that doesn't work (it should) you can always fall back on: `echo '{"auths":{},"credHelpers":{"us-east1-docker.pkg.dev": "gcr"}}' > /tmp/config.json` and then when invoking docker use `docker --config /tmp ...` – Gillespie Mar 03 '22 at 02:16
  • Thanks. I just realized that I can simply use `gcloud -q auth configure-docker us-east1-docker.pkg.dev` since I am using their Debian image rather than a Container-Optimized VM. This allows me to shut down the VM automatically after my docker batch process is finished. – Derrick Mar 03 '22 at 02:32
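If you generate the startup script from Go, the fallback `config.json` from the comment above can be produced with `encoding/json` rather than hand-written. This is only a sketch: `dockerConfig` and `credHelperConfig` are hypothetical names, and the struct covers just the two fields that appear in that comment.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// dockerConfig models only the two fields used in the fallback config.
type dockerConfig struct {
	Auths       map[string]struct{} `json:"auths"`
	CredHelpers map[string]string   `json:"credHelpers"`
}

// credHelperConfig (hypothetical helper) maps each registry host to the
// "gcr" credential helper and returns the config as JSON.
func credHelperConfig(registries ...string) (string, error) {
	cfg := dockerConfig{
		Auths:       map[string]struct{}{},
		CredHelpers: map[string]string{},
	}
	for _, r := range registries {
		cfg.CredHelpers[r] = "gcr"
	}
	b, err := json.Marshal(cfg)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	s, err := credHelperConfig("us-east1-docker.pkg.dev")
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```

The resulting string can be written to `/tmp/config.json` by the startup script and picked up with `docker --config /tmp ...`, as described in the comment.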
0

I recently ran into the other side of these limitations (and asked a question on the topic).

Basically, I wanted to provision a COOS instance without launching a container. I was unable to, so I launched a container from a base image instead. Later, in my CI/CD pipeline, I Dockerized my app, uploaded it to Artifact Registry, and replaced the base image on the COOS instance with my newly built app.

The metadata I provided to launch the initial base image as a container:

spec:
  containers:
    - image: blairnangle/python3-numpy-ta-lib:latest
      name: containervm
      securityContext:
        privileged: false
      stdin: false
      tty: false
      volumeMounts: []
  restartPolicy: Always
  volumes: []
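For the Go use case in the question, a declaration like this could be generated per image and attached under the `gce-container-declaration` metadata key mentioned in the konlet documentation. The following is a sketch, not a documented schema: placing `restartPolicy` and `volumes` directly under `spec` is an assumption about the format, and `containerDeclaration` is a hypothetical helper.

```go
package main

import "fmt"

// declarationFormat is a container declaration with placeholders for the
// image and the container name. The field layout is an assumption; the
// format has no formal documentation.
const declarationFormat = `spec:
  containers:
    - image: %s
      name: %s
      securityContext:
        privileged: false
      stdin: false
      tty: false
      volumeMounts: []
  restartPolicy: Always
  volumes: []
`

// containerDeclaration (hypothetical helper) fills in the image and name.
func containerDeclaration(image, name string) string {
	return fmt.Sprintf(declarationFormat, image, name)
}

func main() {
	decl := containerDeclaration("blairnangle/python3-numpy-ta-lib:latest", "containervm")
	// The string would be sent as the value of the metadata item whose
	// key is "gce-container-declaration".
	fmt.Print(decl)
}
```

Swapping the image later then amounts to regenerating the string with a different image and updating the instance metadata.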

I'm a Terraform fanboi, so the metadata exists within some Terraform configuration. I have a public project with the code that achieves this if you want to take a proper look: blairnangle/dockerized-flask-on-gce.

Blair Nangle