NVIDIA-Docker Installation Tutorial

谢麒
2023-12-01

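Once Docker is installed, configure it so it can be run without sudo (add the current user to the docker group):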
sudo addgroup --system docker
sudo adduser $USER docker
newgrp docker
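
A quick way to confirm that the group change took effect is to check the current user's group list. The sketch below is only an illustration; `in_group` is a hypothetical helper name, not part of Docker.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if user $1 belongs to group $2.
in_group() {
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qxF "$2"
}

if in_group "$(id -un)" docker; then
    echo "docker group membership is active"
else
    echo "not in the docker group yet; log out and back in, or run: newgrp docker"
fi
```

If the second message is printed right after running adduser, the membership usually only becomes visible in a new login session or after newgrp.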

Ubuntu 18.04:

1 Install docker-ce:
https://docs.docker.com/engine/install/ubuntu/

2 Install the GPU driver (if no driver is installed yet):
https://github.com/NVIDIA/nvidia-docker/wiki/Installation-(Native-GPU-Support)

3 Install the NVIDIA Docker driver (install the CUDA toolkit):
https://github.com/NVIDIA/nvidia-docker

Installing Docker on CentOS:
https://www.cnblogs.com/yufeng218/p/8370670.html
https://blog.csdn.net/u010349092/article/details/107514401

Test that a container can see the GPU:
docker run --gpus all nvidia/cuda:9.0-base nvidia-smi
4 Install nvidia-container-runtime:
sudo apt-get install nvidia-container-runtime
5 Then restart Docker:
sudo systemctl daemon-reload
sudo service docker restart

Choose the specific image version to use. The available tags are listed at:
https://hub.docker.com/r/nvidia/cuda/tags?page=6

Example: https://www.cnblogs.com/journeyonmyway/p/11234732.html
Note:
Because TensorFlow 1.14.0 only supports CUDA 10.0, pull the CUDA 10.0 image:
sudo docker pull nvidia/cuda:10.0-cudnn7-devel
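
The choice of image tag follows from the TensorFlow version. As a rough sketch, the lookup can be written as a small shell function. `cuda_for_tf` is a hypothetical helper, and apart from the 1.14.0 entry stated above, the mappings are an assumption taken from TensorFlow's published tested build configurations and should be checked against that table.

```shell
#!/bin/sh
# Hypothetical helper: print the CUDA version a TensorFlow GPU release
# was built against. The mapping is an assumption; verify it against
# the TensorFlow tested build configurations before relying on it.
cuda_for_tf() {
    case "$1" in
        1.13.*|1.14.*|1.15.*|2.0.*) echo "10.0" ;;
        2.1.*|2.2.*|2.3.*)          echo "10.1" ;;
        *) echo "unknown TensorFlow version: $1" >&2; return 1 ;;
    esac
}

tf_version="1.14.0"
if cuda=$(cuda_for_tf "$tf_version"); then
    # Only print the command so it can be reviewed before running it.
    echo "sudo docker pull nvidia/cuda:${cuda}-cudnn7-devel"
fi
```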

Enter the container:
1. Install conda
sudo docker run -it --gpus all -P --name dl-base -v "$(pwd)":/host nvidia/cuda:10.0-devel

sudo docker run -it --gpus 1 --name train_server_1 -v /home/sctech/docker_easytrain:/host nvidia/cuda:10.0-sctech_12_2

docker run -it --gpus all -p 5000:5000 --name train_server -v /home/zjbing/ai_training_platform/:/host docker.io/nvidia/cuda:10.1-base-ubuntu18.0
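
The run commands above differ only in a few flags: --gpus all exposes every GPU while --gpus 1 exposes a single one, -p 5000:5000 publishes one port while -P publishes all exposed ports, and -v HOST_DIR:/host mounts a host directory into the container. As an illustration, a small wrapper can assemble the command from its parts. `gpu_run_cmd` is a hypothetical helper that only prints the command and does not start a container.

```shell
#!/bin/sh
# Hypothetical helper: print (do not execute) a docker run command.
#   $1 container name   $2 image   $3 host directory mounted at /host
#   $4 value for --gpus (optional, defaults to "all")
gpu_run_cmd() {
    printf 'docker run -it --gpus %s --name %s -v %s:/host %s\n' \
        "${4:-all}" "$1" "$3" "$2"
}

# Mount the current directory, exposing all GPUs.
gpu_run_cmd dl-base nvidia/cuda:10.0-devel "$(pwd)"
```

Printing the command first makes it easy to check the mount path and image tag before actually running it.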

sh Miniconda3-latest-Linux-x86_64.sh (answer yes and press Enter at each prompt)
Then make the changes take effect:
source ~/.bashrc
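2. Install TensorFlow
Download the tensorflow-gpu 1.14.0 package from the pip website.
3. Install cuDNN
Download a cuDNN build that supports CUDA 10.0 from the cuDNN website.
Extract the archive as follows:
$ cp cudnn-8.0-linux-x64-v5.1.solitairetheme8 cudnn-8.0-linux-x64-v5.1.tgz
$ tar -xvf cudnn-8.0-linux-x64-v5.1.tgz
Copy the files: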
https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html#installlinux-tar
Copy the following files into the CUDA Toolkit directory, and change the file permissions.
$ sudo cp cuda/include/cudnn.h /usr/local/cuda/include
$ sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
$ sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
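
After copying, it can help to confirm which cuDNN version the header declares. The following is a minimal sketch that assumes the version macros (CUDNN_MAJOR, CUDNN_MINOR, CUDNN_PATCHLEVEL) are defined in cudnn.h, as they are in cuDNN 7 and earlier; newer releases keep them in cudnn_version.h. `cudnn_version` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the cuDNN version declared in a header file.
#   $1 path to the header (optional, defaults to the CUDA include directory)
cudnn_version() {
    header="${1:-/usr/local/cuda/include/cudnn.h}"
    [ -r "$header" ] || return 1
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END {
            if (major == "" || minor == "" || patch == "") exit 1
            print major "." minor "." patch
        }
    ' "$header"
}

cudnn_version || echo "cuDNN version macros not found"
```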

Add the environment variable (append the export line to /etc/profile, then reload it):
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64
$ source /etc/profile
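
Because a profile file is sourced on every login, a plain append can add the same directory repeatedly. One possible way to keep the variable clean is to append only when the directory is missing. `append_ld_path` below is a hypothetical helper, shown as a sketch rather than a required step.

```shell
#!/bin/sh
# Hypothetical helper: add directory $1 to LD_LIBRARY_PATH only if it
# is not already listed, and avoid a leading ":" when the variable is empty.
append_ld_path() {
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$1:"*) ;;  # already present, nothing to do
        *) LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$1" ;;
    esac
    export LD_LIBRARY_PATH
}

append_ld_path /usr/local/cuda/lib64
```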

Install vim:
apt update
apt install vim

Install dependencies:
pip install -U flask
pip install opencv-python

Fixing OpenCV dependency problems:
apt-get install libsm6
apt-get install libxrender1
apt-get install libxext-dev
apt-get install libglib2.0-dev

Commit the container as an image:
docker commit CONTAINER_ID NEW_IMAGE_NAME
Save the image to a local file:
docker save -o nvidia_cuda_10.0_setech_2019_9_6.tar 793
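
The commit and save steps can be combined, and the resulting archive restored on another machine with docker load. The sketch below is one possible arrangement; `tar_name_for` and `export_container` are hypothetical helpers, and nothing runs until `export_container` is called. Saving by image name rather than by ID keeps the repository tag inside the archive, so the image is still named after loading.

```shell
#!/bin/sh
# Hypothetical helper: derive a file name from an image name by
# replacing "/" and ":" with "_".
tar_name_for() {
    printf '%s.tar\n' "$(printf '%s' "$1" | tr '/:' '__')"
}

# Hypothetical helper: commit container $1 as image $2, then save it.
export_container() {
    docker commit "$1" "$2" &&
    docker save -o "$(tar_name_for "$2")" "$2"
}

# On the target machine the archive is restored with:
#   docker load -i ARCHIVE.tar
```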

Setting up the Flask Docker environment:
Pull the image:
docker pull continuumio/miniconda3

Install dependencies inside the container:
pip install numpy
pip install flask
pip install opencv-python
pip install docker
pip install grpcio
pip install tensorflow-serving-api-gpu

Start the container:
docker run -it -p 5000:5000 -v /home/sctech/docker_easytrain:/host miniconda3:flask_9_26 /bin/bash

docker run -it -p 5000:5000 -v /home/zjbing/ai_training_platform:/host miniconda3:flask_11_25 /bin/bash

/home/zjbing/ai_training_platform
