
There are claims that build arguments of a Docker image can be extracted after you pull the image (example).

I've tested this with the following Dockerfile:

FROM scratch

ARG SECRET

ADD Dockerfile .

When I build the image:

$ docker build -t build-args-test --build-arg SECRET=12345 .

And inspect it as specified in the article:

$ docker image history --no-trunc build-args-test
IMAGE                   CREATED          CREATED BY                    SIZE      COMMENT
sha256:(hash omitted)   17 minutes ago   ADD Dockerfile . # buildkit   43B       buildkit.dockerfile.v0
<missing>               17 minutes ago   ARG SECRET                    0B        buildkit.dockerfile.v0

I can't see the actual build argument (12345).

Is there a way to extract the build arguments from the image?

Would the answer be different if the image is not built on my machine but pulled from a repository?

I am aware of the Docker build secret functionality. However, I am asking specifically about ARG.

Koterpillar

1 Answer


It depends.

If you use the secret, it will show up in the layers of the image where that secret was used. Since the ADD step didn't use the ARG, you didn't see it. But every RUN step injects the ARG values as environment variables, so with a RUN step you get something like:

$ cat df.secret-arg 
FROM alpine:latest

ARG SECRET
RUN echo doing something with a secret

$ DOCKER_BUILDKIT=0 docker build -t test-secret-arg --build-arg SECRET=password123 -f df.secret-arg .
Sending build context to Docker daemon  22.02kB
Step 1/3 : FROM alpine:latest
 ---> 49f356fa4513
Step 2/3 : ARG SECRET
 ---> Using cache
 ---> 7181367a28e6
Step 3/3 : RUN echo doing something with a secret
 ---> Running in a46ee00e682a
doing something with a secret
Removing intermediate container a46ee00e682a
 ---> e3eeea5f5d6d
Successfully built e3eeea5f5d6d
Successfully tagged test-secret-arg:latest

$ docker history test-secret-arg
IMAGE          CREATED          CREATED BY                                      SIZE      COMMENT
e3eeea5f5d6d   6 seconds ago    |1 SECRET=password123 /bin/sh -c echo doing …   0B
7181367a28e6   33 seconds ago   /bin/sh -c #(nop)  ARG SECRET                   0B
49f356fa4513   3 months ago     /bin/sh -c #(nop)  CMD ["/bin/sh"]              0B
<missing>      3 months ago     /bin/sh -c #(nop) ADD file:7119167b56ff1228b…   5.61MB

There you can see SECRET=password123 in the top history entry, next to the echo command. There is a lot I'm glossing over in the "layers of the image where that secret was used" condition, since there are things like multi-stage builds. But the risk is that if you get it wrong, the secret leaks.

Pushing the image to a registry doesn't help either:

$ docker tag test-secret-arg localhost:5000/test:secret-arg

$ docker push localhost:5000/test:secret-arg
The push refers to repository [localhost:5000/test]
8ea3b23f387b: Mounted from test/layer-bot
secret-arg: digest: sha256:969350bede26545fd35b53abed429ca0ecf2fc8717435a2edd842b4c9572b5bc size: 528

$ regctl image config localhost:5000/test:secret-arg
{
  "created": "2021-06-30T01:06:04.766667999Z",
  "architecture": "amd64",
  "os": "linux",
  "config": {
    "Env": [
      "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
    ],
    "Cmd": [
      "/bin/sh"
    ]
  },
  "rootfs": {
    "type": "layers",
    "diff_ids": [
      "sha256:8ea3b23f387bedc5e3cee574742d748941443c328a75f511eb37b0d8b6164130"
    ]
  },
  "history": [
    {
      "created": "2021-03-31T20:10:06.686359124Z",
      "created_by": "/bin/sh -c #(nop) ADD file:7119167b56ff1228b2fb639c768955ce9db7a999cd947179240b216dfa5ccbb9 in / "
    },
    {
      "created": "2021-03-31T20:10:06.934368604Z",
      "created_by": "/bin/sh -c #(nop)  CMD [\"/bin/sh\"]",
      "empty_layer": true
    },
    {
      "created": "2021-06-30T01:05:37.908964654Z",
      "created_by": "/bin/sh -c #(nop)  ARG SECRET",
      "empty_layer": true
    },
    {
      "created": "2021-06-30T01:06:04.766667999Z",
      "created_by": "|1 SECRET=password123 /bin/sh -c echo doing something with a secret",
      "empty_layer": true
    }
  ]
}

(Note, regctl is available from my repo here, but this is just making registry API calls. Pulling the image and inspecting on another machine would still show the build arg.)
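As a rough sketch of how that extraction could be automated, the snippet below scans the `history` array of an image config (the JSON shape shown above) for the `|N KEY=value ...` prefix recorded in front of `RUN` commands. The function name `find_build_args` is hypothetical. The parsing assumes argument values contain no spaces, and the optional leading `RUN ` is an assumption about how BuildKit-built images record the same information.

```python
import json
import re
from typing import Dict

# Prefix recorded before RUN commands: "|<count> KEY=value ... <command>".
# The optional "RUN " is an assumption about BuildKit-built images.
_ARG_PREFIX = re.compile(r"^(?:RUN )?\|(\d+) (.*)$", re.DOTALL)


def find_build_args(config_json: str) -> Dict[str, str]:
    """Collect build args recorded in the history of an image config."""
    found: Dict[str, str] = {}
    config = json.loads(config_json)
    for entry in config.get("history", []):
        match = _ARG_PREFIX.match(entry.get("created_by", ""))
        if match is None:
            continue
        count = int(match.group(1))
        # Assumes values contain no spaces, so the first `count`
        # space-separated tokens are the KEY=value pairs.
        tokens = match.group(2).split(" ", count)[:count]
        for token in tokens:
            key, sep, value = token.partition("=")
            if sep:
                found[key] = value
    return found


# Trimmed copy of the config shown above.
SAMPLE_CONFIG = json.dumps({
    "history": [
        {"created_by": "/bin/sh -c #(nop)  ARG SECRET", "empty_layer": True},
        {
            "created_by": "|1 SECRET=password123 /bin/sh -c echo doing something with a secret",
            "empty_layer": True,
        },
    ]
})

if __name__ == "__main__":
    print(find_build_args(SAMPLE_CONFIG))
```

Anyone who can pull the image can run something like this against its config, which is why build args should not be treated as secrets.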

In general, don't use secrets inside your Docker image builds; it's a code smell. Typically the CI system should be checking out the code with the secrets, and the image should have binaries and libraries in it, not data or configurations that you want to keep private. Data belongs in a volume, and configurations are injected into the container in a variety of ways (config file mount, secret, environment variable, etc., but in the container, not in the image). For more ways to work with secrets, see this answer to What is the best way to pass AWS credentials to a Docker container?.
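To illustrate the "in the container, not in the image" point, here is a minimal sketch of an application reading a credential at runtime. It checks a file mounted into the container first and falls back to an environment variable. The function name `load_secret`, the `/run/secrets` default directory, the lowercased file name, and the `DB_PASSWORD` name are all assumptions for illustration.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def load_secret(
    name: str,
    secrets_dir: str = "/run/secrets",
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return a secret supplied to the running container, or None."""
    env = os.environ if env is None else env
    # Prefer a file mounted at runtime (assumed to be named after the
    # secret, lowercased).
    secret_file = Path(secrets_dir) / name.lower()
    if secret_file.is_file():
        return secret_file.read_text().strip()
    # Fall back to an environment variable set when the container starts.
    return env.get(name)


if __name__ == "__main__":
    password = load_secret("DB_PASSWORD")
    print("secret found" if password is not None else "secret not provided")
```

Either way, the value only exists in the running container, so it never appears in the image history or config.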

BMitch
  • Does BuildKit change the answer? I just tested with BuildKit and it is visible in the same way, but curious to why you disabled it. – Koterpillar Jun 30 '21 at 01:24
  • @Koterpillar I disabled it for visibility (so you could see the echo step running). It doesn't change this as you've seen, but I leave it enabled by default for lots of other really good reasons. – BMitch Jun 30 '21 at 01:34
  • Ah, you want `--progress plain` then! – Koterpillar Jun 30 '21 at 01:40
  • @Koterpillar I could, but it goes the other way of being a lot more verbose, which clutters up the answer. I didn't mean for turning off buildkit to be such a distraction from the answer that it seems to have become. The key here is no matter what you do, build args aren't treated as a secret. – BMitch Jun 30 '21 at 02:05
  • The main point here is that no matter what mechanism you use for secrets, if you use them then that use will be recorded in the layers of the image and can be extracted. – Software Engineer Jun 30 '21 at 09:05
  • "no matter what mechanism" - even the BuildKit secrets? @SoftwareEngineer – Koterpillar Jun 30 '21 at 23:28
  • Buildkit secrets should be secure. The risk isn't that the secret itself is leaked by the image metadata, but that using the secret is moving protected data from a protected location into the image (and people often accidentally log the secret). – BMitch Jul 01 '21 at 00:43
  • @Koterpillar -- I misspoke earlier; there are methods within buildkit that would work, but afaik they're non-standard and only apply specifically to Docker software, not containers in general. Buildkit secrets should be a lot safer for this type of thing, or even just putting the secret into a file and using a tmpfs to mount it should work ok. – Software Engineer Jul 04 '21 at 11:33
  • "the image should have binaries and libraries in it" - sure but you still have to get those libraries installed properly into the image in the first place. Dependency management tools (Maven, Gradle, Pip) sometimes require credentials. – Shannon Jul 13 '23 at 17:36
  • @Shannon this is answering the question of whether build args can be extracted. For the suggested way to use secrets when they are required, see the links at the end of my answer. – BMitch Jul 13 '23 at 19:58