22

I tried to install nvidia-docker after installing docker-ce, following the instructions at https://github.com/NVIDIA/nvidia-docker. It seems to have installed correctly.

I tried to run:

$ sudo docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi
docker: Error response from daemon: Unknown runtime specified nvidia.
See 'docker run --help'.

However, this works (without --runtime=nvidia):

$ docker container run -ti ubuntu bash

Some additional info on my system: it is an Ubuntu Server 16.04 machine with 8 GPUs (Titan Xp) and NVIDIA driver version 387.26. I can run nvidia-smi -l 1 on the host system and it works as expected.

$ dpkg -l | grep -E '(nvidia|docker)'
ii  docker-ce                              18.06.1~ce~3-0~ubuntu                        amd64        Docker: the open-source application container engine
ii  libnvidia-container-tools              1.0.0-1                                      amd64        NVIDIA container runtime library (command-line tools)
ii  libnvidia-container1:amd64             1.0.0-1                                      amd64        NVIDIA container runtime library
ii  nvidia-container-runtime               2.0.0+docker18.06.1-1                        amd64        NVIDIA container runtime
ii  nvidia-container-runtime-hook          1.4.0-1                                      amd64        NVIDIA container runtime hook
ii  nvidia-docker2                         2.0.3+docker18.06.1-1                        all          nvidia-docker CLI wrapper



$ cat /etc/docker/daemon.json 
{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}

I have come across: https://github.com/NVIDIA/nvidia-docker/issues/501, but I am not sure how I should go about it.

mkuse
  • 14
    `--runtime nvidia` is just for **nvidia-docker2**. `--gpus [all|num|dev]` should be used instead from Docker 19.03 on. https://github.com/NVIDIA/nvidia-docker#usage – BugKiller Sep 22 '19 at 06:54
  • 1
    [nvidia-docker is deprecated.](https://superuser.com/questions/1636390/is-nvidia-docker-outdated-are-there-cases-where-a-new-project-would-still-r) – questionto42 Mar 26 '21 at 21:12
  • This worked for me: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker – Sheece Gardazi Dec 28 '21 at 21:28

7 Answers

6

From the nvidia-docker GitHub repo:

curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update
sudo apt-get install -y nvidia-docker2
sudo pkill -SIGHUP dockerd
scepeda
5

You can try restarting the Docker daemon with the following commands:

sudo systemctl daemon-reload
sudo systemctl restart docker

Or you can try rebooting your system to make nvidia-docker work.

4

This is how I resolved the above problem on CentOS 7; hopefully it helps anyone with a similar problem.

  • Add necessary repos to get nvidia-container-runtime:
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-container-runtime/$distribution/nvidia-container-runtime.repo | sudo tee /etc/yum.repos.d/nvidia-container-runtime.repo
  • (Optional) In my case, I disabled the experimental repos:
sudo yum-config-manager --disable libnvidia-container-experimental
sudo yum-config-manager --disable nvidia-container-runtime-experimental
  • Install nvidia-container-runtime package:
sudo yum install nvidia-container-runtime
  • Update docker daemon:
sudo vim /etc/docker/daemon.json

with the path to nvidia-container-runtime:

{
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
  • Finally, you need to make docker update the path:
sudo pkill -SIGHUP dockerd
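
Before sending the signal, it may be worth confirming that the edited file still parses, since a stray comma or brace in daemon.json can stop dockerd from loading it. The following is a minimal sketch and not part of the original steps: the helper name is_valid_json is hypothetical, and it assumes python3 is installed (json.tool ships with its standard library).

```shell
# Hypothetical helper: succeed only if the given file parses as JSON.
# Assumes python3 is on PATH; json.tool is in its standard library.
is_valid_json() {
    python3 -m json.tool "$1" >/dev/null 2>&1
}

config="/etc/docker/daemon.json"
if is_valid_json "$config"; then
    echo "$config parses as JSON; the daemon can be reloaded"
    # sudo pkill -SIGHUP dockerd
else
    echo "$config is missing or not valid JSON; fix it before reloading" >&2
fi
```

The reload command is left commented out so that the check can be run on its own first.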
Minh Nguyen
2

It seems you may need to purge Docker and reinstall it, as described in a post in the GitHub issues:

sudo apt remove docker-ce
sudo apt autoremove
sudo apt-get install docker-ce=5:18.09.0~3-0~ubuntu-bionic
sudo apt install nvidia-docker2
  • 2
    Links are good but you should also copy paste the relevant contents from there to your post so that even if the link is not available in future your answer will still make sense. – Vaibhav Vishal Feb 07 '19 at 10:52
  • 6
    I believe `sudo apt install nvidia-docker2` is deprecated as of Jan 16, 2020. On [this](https://github.com/NVIDIA/nvidia-docker) page from nvidia-docker, it suggests `sudo apt-get install -y nvidia-container-toolkit` – Nathan majicvr.com Jan 17 '20 at 02:36
  • 1
    E: Version '5:18.09.0~3-0~ubuntu-bionic' for 'docker-ce' was not found – Mona Jalal Jan 07 '21 at 03:47
  • 1
    @Nathan E: Unable to locate package nvidia-container-toolkit – Mona Jalal Jan 07 '21 at 03:48
  • @MonaJalal Sorry, somehow I just saw this. Did you get it working? Probably haven't thought about this in quite awhile by now – Nathan majicvr.com Mar 05 '21 at 17:55
1

From nvidia-docker Frequently Asked Questions:

Why do I get the error Unknown runtime specified nvidia?

Make sure the runtime was registered to dockerd. You also need to reload the configuration of the Docker daemon.
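
One way to check both halves of that advice is sketched below, under stated assumptions: the helper name has_nvidia_runtime is hypothetical, the default path is the daemon.json shown in the question, and the grep is only a rough text match rather than a real JSON parse. The file check tells you whether the runtime is configured; the docker info call asks the running daemon what it has actually loaded.

```shell
# Hypothetical helper: succeed if the config file has a key named "nvidia".
# This is a plain text match, not a JSON parse, so treat it as a quick check.
has_nvidia_runtime() {
    config="${1:-/etc/docker/daemon.json}"
    [ -r "$config" ] && grep -q '"nvidia"[[:space:]]*:' "$config"
}

if has_nvidia_runtime; then
    echo "nvidia runtime is configured in daemon.json"
else
    echo "nvidia runtime is missing from daemon.json"
fi

# Ask the running daemon which runtimes it knows about (skipped if docker is absent).
if command -v docker >/dev/null 2>&1; then
    docker info --format '{{json .Runtimes}}' || true
fi

# If the file lists "nvidia" but the daemon does not, reload or restart it:
#   sudo pkill -SIGHUP dockerd
#   sudo systemctl restart docker
```

If the file mentions the runtime but the daemon's list does not, the configuration has most likely not been reloaded yet.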

LW001
Richard Tran
0

If you're having trouble installing nvidia-docker then try running this shell script. It worked for me even when nvidia-docker crashed.

-3

Change the --runtime=nvidia flag to --gpus all (available from Docker 19.03 on); hopefully it will then run.
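
Which flag applies depends on the Docker version, as the comment under the question notes. The sketch below picks one from a version string; the function name gpu_flag_for is hypothetical, and it assumes versions of the form MAJOR.MINOR.PATCH such as 18.06.1. It does not check that the NVIDIA packages the flag relies on are actually installed.

```shell
# Hypothetical helper: print the GPU flag suited to a Docker version string.
# Docker 19.03 and later accept --gpus; older releases need --runtime=nvidia.
gpu_flag_for() {
    version="$1"
    major="${version%%.*}"     # text before the first dot
    rest="${version#*.}"       # text after the first dot
    minor="${rest%%.*}"        # text up to the second dot
    minor="${minor#0}"         # drop one leading zero, e.g. "03" becomes "3"
    minor="${minor:-0}"        # a bare "0" becomes empty above; restore it
    if [ "$major" -gt 19 ] || { [ "$major" -eq 19 ] && [ "$minor" -ge 3 ]; }; then
        echo "--gpus all"
    else
        echo "--runtime=nvidia"
    fi
}

# The version from the question, and the first release with --gpus.
echo "Docker 18.06.1 -> $(gpu_flag_for 18.06.1)"
echo "Docker 19.03.0 -> $(gpu_flag_for 19.03.0)"
```

With the flag chosen, the command from the question would become, for example, sudo docker run --gpus all --rm nvidia/cuda nvidia-smi on a new enough Docker.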

ammar naich