Installing nvidia-docker offline on Ubuntu: the complete process (the simplest fix for "nvidia-docker: command not found" and "Unknown runtime specified")

卢子民
2023-12-01

Problem description

After installing Docker and the NVIDIA driver, run:

nvidia-docker

It fails with:

nvidia-docker: command not found

A second possible error:

Error response from daemon: Unknown runtime specified nvidia.
See 'docker run --help'.

【Note】 For the second error, go straight to "3.3 Edit the daemon.json configuration file", then continue with steps 4 and 5 in order.

Environment

  • Ubuntu 18.04
  • Docker version 18.09.6
  • NVIDIA-SMI 460.32.03 Driver Version: 460.32.03 CUDA Version: 11.2

Solution

  • 1. Download the five deb files
    libnvidia-container1
    libnvidia-container-tools
    nvidia-container-toolkit
    nvidia-container-runtime
    nvidia-docker2

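    Download link (a mirror of the nvidia-docker package repository, not the Docker site; the directory is named ubuntu16.04, and these are the packages I used on Ubuntu 18.04):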
    http://mirror.cs.uchicago.edu/nvidia-docker/libnvidia-container/stable/ubuntu16.04/amd64/

    The files I downloaded:

    libnvidia-container-tools_1.7.0-1_amd64.deb  
    nvidia-container-toolkit_1.7.0-1_amd64.deb
    libnvidia-container1_1.7.0-1_amd64.deb  
    nvidia-container-runtime_3.7.0-1_all.deb     
    nvidia-docker2_2.8.0-1_all.deb
    
  • 2. Install

    From the folder containing the five files, run:

    sudo dpkg -i ./lib*  ./nvidia*
    
  • 3. If step 2 fails with the following error:

    dpkg: dependency problems prevent configuration of nvidia-docker2:
     nvidia-docker2 depends on docker-ce (>= 18.06.0~ce~3-0~ubuntu) | docker-ee (>= 18.06.0~ce~3-0~ubuntu) | docker.io (>= 18.06.0); however:
      Package docker-ce is not installed.
      Package docker-ee is not installed.
      Package docker.io is not installed.
    
    

    Fix: install one of docker-ce, docker-ee, or docker.io. The dependency is an "or", so a single one is enough; docker-ce is used below.

  • 3.1 Download the following files
    【Note】 The Docker version I installed is 18.09.6. (The files for Docker 20.10.2 are listed later.)

    	containerd.io_1.2.6-3_amd64.deb  
    	docker-ce_18.09.6~3-0~ubuntu-bionic_amd64.deb  
    	docker-ce-cli_18.09.6~3-0~ubuntu-bionic_amd64.deb
    

    Official download directory: https://download.docker.com/linux/ubuntu/dists/bionic/pool/stable/amd64/

    The exact links I used:

    https://download.docker.com/linux/ubuntu/dists/bionic/pool/stable/amd64/docker-ce_18.09.6~3-0~ubuntu-bionic_amd64.deb
    https://download.docker.com/linux/ubuntu/dists/bionic/pool/stable/amd64/docker-ce-cli_18.09.6~3-0~ubuntu-bionic_amd64.deb
    https://download.docker.com/linux/ubuntu/dists/bionic/pool/stable/amd64/containerd.io_1.2.6-3_amd64.deb
    

    Save the downloaded files in a folder named docker_deb/

  • 3.2 Install

    	cd docker_deb/
    	sudo dpkg -i ./*deb
    
    Near the end of the installation, dpkg asks:
    
    	Configuration file '/etc/docker/daemon.json'
    	 ==> File on system created by you or by a script.
    	 ==> File also in package provided by package maintainer.
    	   What would you like to do about it ?  Your options are:
    	    Y or I  : install the package maintainer's version
    	    N or O  : keep your currently-installed version
    	      D     : show the differences between the versions
    	      Z     : start a shell to examine the situation
    	 The default action is to keep your current version.
    	*** daemon.json (Y/I/N/O/D/Z) [default=N] ? 
    	
    

    Answer N to keep the existing daemon.json.

  • 3.3 Edit the daemon.json configuration file
    First, look at the current /etc/docker/daemon.json:

    {
      "data-root": "/var/lib/docker",
      "exec-opts": ["native.cgroupdriver=systemd"],
      "insecure-registries": ["xxx"],
      "max-concurrent-downloads": 10,
      "live-restore": true,
      "log-driver": "json-file",
      "log-level": "warn",
      "log-opts": {
        "max-size": "50m",
        "max-file": "1"
        },
      "storage-driver": "overlay2"
    }
    

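    Problem: the "runtimes" and "default-runtime" keys are missing. Add the following inside the top-level object (the trailing comma assumes other keys follow; drop it if this is the last entry):

      "default-runtime": "nvidia",
      "runtimes": {
        "nvidia": {
          "path": "nvidia-container-runtime",
          "runtimeArgs": []
        }
      },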
    
  • 4. Restart the Docker service
    First stop all containers:

    docker stop $(docker ps -a -q)
    

    Then restart the Docker service:

    sudo systemctl daemon-reload
    sudo systemctl restart docker
    
  • 5. Verify nvidia-docker

    Run:

    nvidia-docker -v
    

    Output:

    Docker version 18.09.6, build 481bc77
    

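    This shows that nvidia-docker is installed: the command is now found and, being a wrapper around docker, it prints the Docker version.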
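
Because `nvidia-docker -v` only prints the Docker version, it may be worth checking separately that the edit from step 3.3 took effect. The following is a minimal sketch, not part of nvidia-docker itself: it assumes python3 is available (it normally ships with Ubuntu 18.04) and that `docker info` prints a `Runtimes:` line. The function names `daemon_json_has_nvidia` and `info_lists_nvidia` are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: sanity checks after editing daemon.json and restarting Docker.
# Function names are illustrative; they are not provided by nvidia-docker.

# Succeeds if the given file is valid JSON and registers a runtime named
# "nvidia". A stray trailing comma makes the file invalid and is reported.
daemon_json_has_nvidia() {
    python3 - "$1" <<'PY'
import json, sys
try:
    with open(sys.argv[1]) as f:
        cfg = json.load(f)
except (OSError, ValueError) as err:
    sys.exit("invalid daemon.json: %s" % err)
sys.exit(0 if "nvidia" in cfg.get("runtimes", {}) else 1)
PY
}

# Reads `docker info` output on stdin; succeeds if the "Runtimes:" line
# lists nvidia. The "Default Runtime:" line is deliberately not matched.
info_lists_nvidia() {
    grep -E '^ *Runtimes:' | grep -qw nvidia
}

# Typical use on the target machine:
#   daemon_json_has_nvidia /etc/docker/daemon.json && echo "config ok"
#   docker info 2>/dev/null | info_lists_nvidia && echo "runtime registered"
```

If the first check fails, Docker will usually refuse to start after step 4; if only the second fails, the service has probably not been restarted since the file was edited.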

Testing nvidia-docker

nvidia-docker run -it -d \
    --name your_name \
    -e TZ='Asia/Shanghai' \
    your_ai_image:latest

If no error is reported, nvidia-docker is working.
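
A container that starts without error does not by itself prove the GPU is visible inside it. One common way to check is to run nvidia-smi in a throwaway container. The sketch below assumes the runtime is registered under the name nvidia, as configured in daemon.json above. The `gpu_smoke_test` function and the `DOCKER` override variable are hypothetical helpers introduced here, and `your_ai_image:latest` is the placeholder image name used above.

```shell
#!/usr/bin/env bash
# Sketch: run nvidia-smi inside a container to confirm the GPU is visible.
# DOCKER can be overridden (for example DOCKER=echo) to print the command
# instead of running it.
DOCKER="${DOCKER:-docker}"

gpu_smoke_test() {
    local image="$1"
    # The two NVIDIA_* variables ask the runtime to expose all GPUs and to
    # mount the driver utilities (which include nvidia-smi).
    "$DOCKER" run --rm --runtime=nvidia \
        -e NVIDIA_VISIBLE_DEVICES=all \
        -e NVIDIA_DRIVER_CAPABILITIES=compute,utility \
        "$image" nvidia-smi
}

# Usage on the target machine:
#   gpu_smoke_test your_ai_image:latest
```

If the driver table printed inside the container matches what nvidia-smi shows on the host, the runtime is passing the GPU through.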

Installing Docker 20.10.2

Files to download:

containerd.io_1.4.6-1_amd64.deb
docker-ce_20.10.2~3-0~ubuntu-bionic_amd64.deb
docker-ce-cli_20.10.2~3-0~ubuntu-bionic_amd64.deb

The corresponding download links:

https://download.docker.com/linux/ubuntu/dists/bionic/pool/stable/amd64/containerd.io_1.4.6-1_amd64.deb
https://download.docker.com/linux/ubuntu/dists/bionic/pool/stable/amd64/docker-ce-cli_20.10.2~3-0~ubuntu-bionic_amd64.deb
https://download.docker.com/linux/ubuntu/dists/bionic/pool/stable/amd64/docker-ce_20.10.2~3-0~ubuntu-bionic_amd64.deb

The remaining steps are the same as for Docker 18.09.6.
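
On an offline machine a forgotten file is only discovered when dpkg stops halfway, so it can help to confirm that every expected package has a .deb file before running dpkg -i. The following is a small sketch under that assumption. The `missing_debs` helper is hypothetical, and it relies only on the `NAME_VERSION_ARCH.deb` naming seen in the file lists above.

```shell
#!/usr/bin/env bash
# Sketch: report which expected packages have no matching .deb in a folder.
# missing_debs DIR NAME...  prints each NAME that lacks a DIR/NAME_*.deb
# file and returns non-zero if anything is missing.
missing_debs() {
    local dir="$1" name status=0
    shift
    for name in "$@"; do
        # ls fails when the glob matches nothing; the underscore after the
        # name keeps docker-ce from matching docker-ce-cli.
        if ! ls "$dir/${name}_"*.deb > /dev/null 2>&1; then
            echo "$name"
            status=1
        fi
    done
    return "$status"
}

# Usage with the package names from this article:
#   missing_debs . libnvidia-container1 libnvidia-container-tools \
#       nvidia-container-toolkit nvidia-container-runtime nvidia-docker2
#   missing_debs docker_deb containerd.io docker-ce docker-ce-cli
```

An empty output with exit status 0 means every listed package has at least one file; it does not check that the versions are compatible with each other.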

References

【1】https://blog.csdn.net/zengNLP/article/details/126732645?spm=1001.2014.3001.5502
【2】http://mirror.cs.uchicago.edu/nvidia-docker/libnvidia-container/stable/ubuntu16.04/amd64/
【3】https://download.docker.com/linux/ubuntu/dists/bionic/pool/stable/amd64/
【4】https://www.codenong.com/cs109532661/
