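After installing docker and the NVIDIA driver, run:
nvidia-docker
It fails with:
nvidia-docker: command not found
A second error you may see:
Error response from daemon: Unknown runtime specified nvidia.
See 'docker run --help'.
[Note] To fix the second error, go straight to "3.3 Edit the configuration file daemon.json", then follow steps 4 and 5 in order.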
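To tell which of the two situations applies on a given machine, a small check like the following may help. It is only a sketch: the function names have_cmd and diagnose are made up for illustration, and the script only looks at whether the commands are on PATH. It does not contact the docker daemon.

```shell
#!/bin/sh
# Return 0 if the given command name can be found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Classify the situation (hypothetical labels):
#   missing-docker   docker itself is not installed
#   missing-wrapper  matches "nvidia-docker: command not found"
#   check-runtime    the wrapper exists, so an "Unknown runtime" error
#                    points at the runtime entries in daemon.json
diagnose() {
    if ! have_cmd docker; then
        echo "missing-docker"
    elif ! have_cmd nvidia-docker; then
        echo "missing-wrapper"
    else
        echo "check-runtime"
    fi
}

diagnose
```

If it prints missing-wrapper, start from step 1; if it prints check-runtime, the daemon.json section is the place to look.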
1. Download five deb files
libnvidia-container1
libnvidia-container-tools
nvidia-container-toolkit
nvidia-container-runtime
nvidia-docker2
Download link (a mirror site):
http://mirror.cs.uchicago.edu/nvidia-docker/libnvidia-container/stable/ubuntu16.04/amd64/
The files I downloaded:
libnvidia-container-tools_1.7.0-1_amd64.deb
nvidia-container-toolkit_1.7.0-1_amd64.deb
libnvidia-container1_1.7.0-1_amd64.deb
nvidia-container-runtime_3.7.0-1_all.deb
nvidia-docker2_2.8.0-1_all.deb
2. Install
Run:
sudo dpkg -i ./lib* ./nvidia*
3. If the installation in step 2 reports the following error:
dpkg: dependency problems prevent configuration of nvidia-docker2:
nvidia-docker2 depends on docker-ce (>= 18.06.0~ce~3-0~ubuntu) | docker-ee (>= 18.06.0~ce~3-0~ubuntu) | docker.io (>= 18.06.0); however:
Package docker-ce is not installed.
Package docker-ee is not installed.
Package docker.io is not installed.
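Fix: install one of docker-ce, docker-ee, or docker.io. The dependency is a list of alternatives, so a single one is enough; the steps below install docker-ce.
3.1 Download the following files
[Note] The docker version I installed is 18.09.6. (The files for docker 20.10.2 are listed at the end.)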
containerd.io_1.2.6-3_amd64.deb
docker-ce_18.09.6~3-0~ubuntu-bionic_amd64.deb
docker-ce-cli_18.09.6~3-0~ubuntu-bionic_amd64.deb
Official download page: https://download.docker.com/linux/ubuntu/dists/bionic/pool/stable/amd64/
The links I used:
https://download.docker.com/linux/ubuntu/dists/bionic/pool/stable/amd64/docker-ce_18.09.6~3-0~ubuntu-bionic_amd64.deb
https://download.docker.com/linux/ubuntu/dists/bionic/pool/stable/amd64/docker-ce-cli_18.09.6~3-0~ubuntu-bionic_amd64.deb
https://download.docker.com/linux/ubuntu/dists/bionic/pool/stable/amd64/containerd.io_1.2.6-3_amd64.deb
Save the downloaded files to the folder docker_deb/.
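The download can also be scripted. The sketch below reuses the base URL and the three file names listed above; the variable and function names (BASE_URL, DEB_FILES, DEST_DIR, list_urls, download_all) are illustrative, and it assumes wget is installed. By default it only prints the URLs, so nothing is fetched until the last line is uncommented.

```shell
#!/bin/sh
# Base URL and file names as listed above.
BASE_URL="https://download.docker.com/linux/ubuntu/dists/bionic/pool/stable/amd64"
DEB_FILES="containerd.io_1.2.6-3_amd64.deb
docker-ce-cli_18.09.6~3-0~ubuntu-bionic_amd64.deb
docker-ce_18.09.6~3-0~ubuntu-bionic_amd64.deb"

# Target folder, matching the folder name used in this guide.
DEST_DIR="docker_deb"

# Print one full download URL per line.
list_urls() {
    for f in $DEB_FILES; do
        echo "$BASE_URL/$f"
    done
}

# Fetch every file into DEST_DIR; -c resumes partial downloads,
# -P sets the directory the files are saved to.
download_all() {
    mkdir -p "$DEST_DIR"
    list_urls | while read -r url; do
        wget -c -P "$DEST_DIR" "$url"
    done
}

list_urls            # dry run: show what would be fetched
# download_all       # uncomment to actually download
```

For docker 20.10.2, swapping in the other set of file names should be the only change needed.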
3.2 Install
cd docker_deb/
sudo dpkg -i ./*deb
At the end of the installation you are prompted:
Configuration file '/etc/docker/daemon.json'
==> File on system created by you or by a script.
==> File also in package provided by package maintainer.
What would you like to do about it ? Your options are:
Y or I : install the package maintainer's version
N or O : keep your currently-installed version
D : show the differences between the versions
Z : start a shell to examine the situation
The default action is to keep your current version.
*** daemon.json (Y/I/N/O/D/Z) [default=N] ?
Choose N to keep the existing daemon.json.
3.3 Edit the configuration file daemon.json
First, look at the current daemon.json:
{
"data-root": "/var/lib/docker",
"exec-opts": ["native.cgroupdriver=systemd"],
"insecure-registries": ["xxx"],
"max-concurrent-downloads": 10,
"live-restore": true,
"log-driver": "json-file",
"log-level": "warn",
"log-opts": {
"max-size": "50m",
"max-file": "1"
},
"storage-driver": "overlay2"
}
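Problem: the runtimes and default-runtime entries are missing. Add the following inside the top-level object, before an existing key (note the trailing comma):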
"default-runtime": "nvidia",
"runtimes": {
"nvidia": {
"path": "nvidia-container-runtime",
"runtimeArgs": []
}
},
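For reference, this is roughly what the merged file could look like, written to a scratch file first so that the live configuration is not touched until the JSON has been checked. It is a sketch under assumptions: the values are copied from the daemon.json shown above (including the "xxx" registry placeholder), the variable name candidate is made up, and the JSON check only runs if python3 happens to be installed.

```shell
#!/bin/sh
# Write the merged configuration to a temporary file.
candidate="$(mktemp)"
cat > "$candidate" <<'EOF'
{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "runtimeArgs": []
    }
  },
  "data-root": "/var/lib/docker",
  "exec-opts": ["native.cgroupdriver=systemd"],
  "insecure-registries": ["xxx"],
  "max-concurrent-downloads": 10,
  "live-restore": true,
  "log-driver": "json-file",
  "log-level": "warn",
  "log-opts": {
    "max-size": "50m",
    "max-file": "1"
  },
  "storage-driver": "overlay2"
}
EOF

# Optional syntax check; skipped silently when python3 is not available.
if command -v python3 >/dev/null 2>&1; then
    python3 -m json.tool "$candidate" >/dev/null && echo "valid JSON: $candidate"
fi

# To apply it, back up the current file and copy the candidate over it:
# sudo cp /etc/docker/daemon.json /etc/docker/daemon.json.bak
# sudo cp "$candidate" /etc/docker/daemon.json
```

Keeping a .bak copy makes it easy to roll back if docker refuses to start after the change.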
4. Restart the docker service
First stop all running containers:
docker stop $(docker ps -q)
Then restart the docker service:
sudo systemctl daemon-reload
sudo systemctl restart docker
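After the restart, it may be worth confirming that the daemon actually picked up the new runtime. The sketch below parses the plain-text output of docker info, which normally contains a "Runtimes:" line and a "Default Runtime:" line; the exact layout can differ between docker versions, so treat the patterns as an assumption. The function names are illustrative.

```shell
#!/bin/sh
# Read `docker info` text on stdin; succeed if the Runtimes line lists nvidia.
has_nvidia_runtime() {
    grep -E '^ *Runtimes:' | grep -qw 'nvidia'
}

# Read `docker info` text on stdin; succeed if nvidia is the default runtime.
nvidia_is_default() {
    grep -Eq '^ *Default Runtime: *nvidia *$'
}

if command -v docker >/dev/null 2>&1; then
    # Keep going even if the daemon cannot be reached.
    info="$(docker info 2>/dev/null)" || true
    if printf '%s\n' "$info" | has_nvidia_runtime; then
        echo "nvidia runtime is registered"
    else
        echo "nvidia runtime is NOT registered; re-check daemon.json"
    fi
    if printf '%s\n' "$info" | nvidia_is_default; then
        echo "nvidia is the default runtime"
    else
        echo "nvidia is not the default runtime"
    fi
else
    echo "docker not found on PATH; skipping check"
fi
```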
5. Verify nvidia-docker
Run:
nvidia-docker -v
It returns:
Docker version 18.09.6, build 481bc77
This shows that the nvidia-docker command is now installed.
As a further check, start a container:
nvidia-docker run -it -d \
--name your_name \
-e TZ='Asia/Shanghai' \
your_ai_image:latest
If this runs without errors, nvidia-docker is working normally.
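A container that starts without an error can still exit right away, so a follow-up check along these lines may be useful. This is a sketch: your_name is the same placeholder as in the run command, name_listed is a made-up helper, and the nvidia-smi call assumes that the tool is available inside the container, which depends on the image and on how the NVIDIA runtime is configured.

```shell
#!/bin/sh
CONTAINER_NAME="your_name"   # placeholder, same as in the run command

# Read container names (one per line) on stdin; succeed on an exact match.
name_listed() {
    grep -qxF -- "$1"
}

if command -v docker >/dev/null 2>&1; then
    # `docker ps` lists running containers only.
    if docker ps --format '{{.Names}}' 2>/dev/null | name_listed "$CONTAINER_NAME"; then
        echo "$CONTAINER_NAME is running"
        # Assumption: nvidia-smi can be executed inside the container.
        docker exec "$CONTAINER_NAME" nvidia-smi \
            || echo "nvidia-smi failed inside the container"
    else
        echo "$CONTAINER_NAME is not running; inspect it with: docker logs $CONTAINER_NAME"
    fi
else
    echo "docker not found on PATH; skipping check"
fi
```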
Files to download for docker 20.10.2:
containerd.io_1.4.6-1_amd64.deb
docker-ce_20.10.2~3-0~ubuntu-bionic_amd64.deb
docker-ce-cli_20.10.2~3-0~ubuntu-bionic_amd64.deb
Corresponding download links:
https://download.docker.com/linux/ubuntu/dists/bionic/pool/stable/amd64/containerd.io_1.4.6-1_amd64.deb
https://download.docker.com/linux/ubuntu/dists/bionic/pool/stable/amd64/docker-ce-cli_20.10.2~3-0~ubuntu-bionic_amd64.deb
https://download.docker.com/linux/ubuntu/dists/bionic/pool/stable/amd64/docker-ce_20.10.2~3-0~ubuntu-bionic_amd64.deb
The rest of the procedure is the same as for docker 18.09.6.