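After installing CUDA 9.2 + Anaconda 5.0 + PyTorch 1.0.0 (py3.7_cuda90_cudnn7_1), programs that do not use the GPU run normally, but calling cuda() fails with: RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED
The same code runs without problems on Ubuntu.
// The error output is as follows: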
Traceback (most recent call last):
File "D:/ProjectWork/Pythonworkp/DFT01/dft1.py", line 98, in <module>
rnn.cuda()
File "C:\Users\OFC\Anaconda3\envs\torch2\lib\site-packages\torch\nn\modules\module.py", line 260, in cuda
return self._apply(lambda t: t.cuda(device))
File "C:\Users\OFC\Anaconda3\envs\torch2\lib\site-packages\torch\nn\modules\module.py", line 187, in _apply
module._apply(fn)
File "C:\Users\OFC\Anaconda3\envs\torch2\lib\site-packages\torch\nn\modules\rnn.py", line 117, in _apply
self.flatten_parameters()
File "C:\Users\OFC\Anaconda3\envs\torch2\lib\site-packages\torch\nn\modules\rnn.py", line 113, in flatten_parameters
self.batch_first, bool(self.bidirectional))
RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED
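Before reinstalling anything, it can help to print which CUDA and cuDNN versions the installed PyTorch build was compiled against. The following is a minimal sketch, not part of the original session; the helper name describe_torch_build is made up for illustration, and it simply reports that torch is missing if it cannot be found.

```python
import importlib.util


def describe_torch_build():
    """Collect the CUDA/cuDNN versions the installed PyTorch was built with."""
    info = {"torch_installed": importlib.util.find_spec("torch") is not None}
    if not info["torch_installed"]:
        return info

    import torch

    info["torch_version"] = torch.__version__
    # CUDA version the binary was compiled against (None for CPU-only builds).
    info["built_with_cuda"] = torch.version.cuda
    info["cudnn_version"] = torch.backends.cudnn.version()
    info["cuda_available"] = torch.cuda.is_available()
    if info["cuda_available"]:
        info["device_name"] = torch.cuda.get_device_name(0)
    return info


if __name__ == "__main__":
    for key, value in describe_torch_build().items():
        print(f"{key}: {value}")
```

If built_with_cuda differs from the CUDA toolkit installed on the machine, as with the cuda90 build here on a CUDA 9.2 system, that mismatch is a plausible suspect for the cuDNN failure.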
# The version previously installed online (note that the resolved build is cuda90, not cuda92)
C:\Windows\system32>activate torch2
(torch2) C:\Windows\system32>conda install pytorch torchvision cuda92 -c pytorch
Fetching package metadata ...............
Solving package specifications: .
Package plan for installation in environment C:\Users\OFC\Anaconda3\envs\torch2:
The following NEW packages will be INSTALLED:
cuda92: 1.0-0 pytorch
ninja: 1.8.2-py37he980bc4_1
pytorch: 1.0.0-py3.7_cuda90_cudnn7_1 pytorch
torchvision: 0.2.1-py_2 pytorch
Proceed ([y]/n)? y
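The build string in the package plan already shows which CUDA version a package targets. As a rough illustration, the sketch below pulls the CUDA and cuDNN versions out of the two build strings that appear in this post. The function name parse_build_string is hypothetical, and the parsing assumes the naming pattern seen here (for example cuda90 meaning 9.0), which is a convention of these packages rather than a guaranteed format.

```python
import re


def parse_build_string(build):
    """Extract the CUDA and cuDNN versions encoded in a conda build string."""
    cuda = re.search(r"cuda(\d+(?:\.\d+)?)", build)
    cudnn = re.search(r"cudnn(\d+)", build)
    cuda_version = None
    if cuda:
        digits = cuda.group(1)
        if "." in digits or len(digits) < 2:
            cuda_version = digits
        else:
            # Older builds write "90" for 9.0 and "92" for 9.2 (no dot).
            cuda_version = f"{digits[:-1]}.{digits[-1]}"
    return {"cuda": cuda_version, "cudnn": cudnn.group(1) if cudnn else None}


installed = parse_build_string("py3.7_cuda90_cudnn7_1")
offline = parse_build_string("py37_cuda92_cudnn7he774522_1")
print(installed)  # {'cuda': '9.0', 'cudnn': '7'}
print(offline)    # {'cuda': '9.2', 'cudnn': '7'}
```

Here the online install resolved to a CUDA 9.0 build, while the offline package used later in this post targets CUDA 9.2.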
# Uninstall
(torch2) C:\Windows\system32>conda uninstall pytorch
Fetching package metadata .............
Solving package specifications: .
Download an offline PyTorch package
Offline PyTorch packages are available at: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/pytorch/win-64/
Place the downloaded package in C:\Users\OFC\Anaconda3\envs, then install PyTorch:
conda install pytorch-0.4.1-py37_cuda92_cudnn7he774522_1.tar.bz2
pip install torchvision
# Reinstall PyTorch
(torch2) C:\Users\OFC\Anaconda3\envs>conda install pytorch-0.4.1-py37_cuda92_cudnn7he774522_1.tar.bz2
(torch2) C:\Users\OFC\Anaconda3\envs>pip install torchvision
Following the prompts:
pip install PyHamcrest==1.9.0
python -m pip install --upgrade pip
Warning! HDF5 library version mismatched error
The HDF5 header files used to compile this application do not match
the version used by the HDF5 library to which this application is linked.
Data corruption or segmentation faults may occur if the application continues.
This can happen when an application was compiled by one version of HDF5 but
linked with a different version of static or shared HDF5 library.
You should recompile the application or check your shared library related
settings such as 'LD_LIBRARY_PATH'.
You can, at your own risk, disable this warning by setting the environment
variable 'HDF5_DISABLE_VERSION_CHECK' to a value of '1'.
Setting it to 2 or higher will suppress the warning messages totally.
Headers are 1.10.2, library is 1.10.1
Install hdf5 in this virtual environment: conda install -c anaconda hdf5=1.10.2
(torch2) C:\Users\OFC\Anaconda3\envs>conda install -c anaconda hdf5=1.10.2
Fetching package metadata ...............
Solving package specifications: .
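Set environment variables in PyCharm. The likely cause is that the hdf5 1.10.1 from the base Anaconda installation was being loaded instead of the hdf5 1.10.2 newly installed in the virtual environment, so the environment variables need to be set:
Run -> Edit Configurations... -> Environment -> Environment variables: click the folder icon at the right of the field -> click "+" and add the following environment variables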
LD_LIBRARY_PATH: C:\Users\OFC\Anaconda3\envs\torch2\Library\mingw-w64
PATH: C:\Users\OFC\Anaconda3\envs\torch2\Library\bin
Run the program: the GPU is now used successfully!
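The same PATH change can also be made from inside the script instead of in the PyCharm run configuration. The sketch below is an untested alternative, not what was done in this post. The helper prepend_to_path is hypothetical, and the directory is the one from the PyCharm setting above. It assumes Python 3.7, as used here. On Python 3.8 and later on Windows, PATH is no longer searched for the DLL dependencies of extension modules, and os.add_dll_directory is the documented mechanism instead.

```python
import os


def prepend_to_path(directory, env=None):
    """Put `directory` at the front of PATH so its DLLs are found first."""
    env = os.environ if env is None else env
    parts = [p for p in env.get("PATH", "").split(os.pathsep) if p]
    # Remove any existing occurrence so the directory is listed once, in front.
    parts = [p for p in parts if os.path.normcase(p) != os.path.normcase(directory)]
    env["PATH"] = os.pathsep.join([directory] + parts)
    return env["PATH"]


# Path taken from the PyCharm setting above; adjust it for your own environment.
ENV_BIN = r"C:\Users\OFC\Anaconda3\envs\torch2\Library\bin"

if __name__ == "__main__":
    prepend_to_path(ENV_BIN)
    # Import torch, h5py, etc. only after PATH has been updated.
    print(os.environ["PATH"].split(os.pathsep)[0])
```

The call has to run before any library that loads the HDF5 DLLs is imported, otherwise the older copy may already be in memory.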