
I'm building an image that needs to run GPU tests during the build. GPU containers run well:

$ docker run --rm --runtime=nvidia nvidia/cuda:9.2-devel-ubuntu18.04 nvidia-smi
Wed Aug  7 07:53:25 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.54                 Driver Version: 396.54                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  TITAN X (Pascal)    Off  | 00000000:04:00.0 Off |                  N/A |
| 24%   43C    P8    17W / 250W |   2607MiB / 12196MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

but the build fails when a step needs the GPU:

$ cat Dockerfile
FROM nvidia/cuda:9.2-devel-ubuntu18.04

RUN nvidia-smi
# RUN build something
# RUN tests require GPU

$ docker build .
Sending build context to Docker daemon  2.048kB
Step 1/2 : FROM nvidia/cuda:9.2-devel-ubuntu18.04
 ---> cdf6d16df818
Step 2/2 : RUN nvidia-smi
 ---> Running in 88f12f9dd7a5
/bin/sh: 1: nvidia-smi: not found
The command '/bin/sh -c nvidia-smi' returned a non-zero code: 127

I'm new to docker, but I think we need sanity checks while building an image. So how can I build a docker image with the CUDA runtime available?

talonmies
zingdle

3 Answers


Configuring the docker daemon with --default-runtime=nvidia solved the problem.

Please refer to this wiki for more info.
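For reference, this is roughly what the daemon configuration looks like in /etc/docker/daemon.json (a sketch: the runtime path below is the usual nvidia-container-runtime install location, so adjust it if yours differs; restart the docker daemon after editing):

```json
{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "/usr/bin/nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
```

With the default runtime set to nvidia, RUN steps in docker build execute in a container that can see the GPU, so the nvidia-smi step in the Dockerfile succeeds.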

zingdle
    For an updated answer for docker >= 19.03 see [this question](https://stackoverflow.com/questions/59691207/docker-build-with-nvidia-runtime) – RunOrVeith Mar 24 '21 at 13:08
  • It's also useful to note that if you don't want to modify the arguments to the docker daemon you can pass "--runtime=nvidia" to your "docker run" command. – Ethan May 19 '21 at 02:44
  • @Ethan `nvidia-docker` is deprecated. `--gpus [all|num|dev]` should be used instead from Docker 19.03 on as @RunOrVeith stated. See the comments in [this answer](https://stackoverflow.com/q/52865988/4368898) for more info. – Matt Popovich Jul 26 '21 at 20:31
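To illustrate the comments above: on Docker 19.03 and later, the --gpus flag is the supported way to give a container GPU access at run time (note this affects docker run only; docker build still relies on the default-runtime daemon setting):

```shell
# Docker >= 19.03: request GPUs with --gpus instead of --runtime=nvidia
docker run --rm --gpus all nvidia/cuda:9.2-devel-ubuntu18.04 nvidia-smi

# or restrict the container to a specific device
docker run --rm --gpus '"device=0"' nvidia/cuda:9.2-devel-ubuntu18.04 nvidia-smi
```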

Today (27.03.2023) I faced another issue with building a docker image with the CUDA runtime.

Despite configuring the proper nvidia runtime, my docker build . and docker-compose commands couldn't access CUDA.

I solved it by disabling the new Docker BuildKit with:

DOCKER_BUILDKIT=0 docker build .

or

DOCKER_BUILDKIT=0 docker-compose build

You can also permanently disable BuildKit by adding the following to /etc/docker/daemon.json:

{
  "features": {
    "buildkit": false
  }
}

It appears that BuildKit has some problems exposing CUDA and GPUs during builds, so if you are using it, try this workaround.

Kecz

Maybe it's because you are using the RUN instruction in the Dockerfile. I'd try CMD (see the documentation for that instruction) or ENTRYPOINT, so the command is executed when you call docker run. RUN instructions are meant for build-time steps that execute before the container is available, not for a process whose output you want at run time.

Good luck with that,

  • Actually I'm running the tests when building the image instead of after the container is available. These tests make sure that the built binary is correct. – zingdle Aug 08 '19 at 01:55