Question:

I can't get Tensorflow 2.0 to run on my GPU

云曦之
2023-03-14

I have been writing programs with Tensorflow on my computer, which runs Linux Mint. For whatever reason, I cannot get Tensorflow to run on my GPU.

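2021-04-26 15:46:11.462612: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
Ignore above cudart dlerror if you do not have a GPU set up on your machine.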

I know CUDA is installed, because the GPU works fine with PyTorch:

import torch

mydevice = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(mydevice)

Output:

cuda

Also, when I ran a program with tensorflow, I got:

START TIME:  Mon Apr 26 16:34:24 2021
2021-04-26 16:34:24.499178: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-04-26 16:34:24.499862: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2021-04-26 16:34:24.526372: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-04-26 16:34:24.526781: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: GeForce GTX 1650 computeCapability: 7.5
coreClock: 1.56GHz coreCount: 16 deviceMemorySize: 3.82GiB deviceMemoryBandwidth: 119.24GiB/s
2021-04-26 16:34:24.526900: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-04-26 16:34:24.526986: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcublas.so.11'; dlerror: libcublas.so.11: cannot open shared object file: No such file or directory
2021-04-26 16:34:24.527069: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcublasLt.so.11'; dlerror: libcublasLt.so.11: cannot open shared object file: No such file or directory
2021-04-26 16:34:24.528676: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2021-04-26 16:34:24.528994: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2021-04-26 16:34:24.530990: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2021-04-26 16:34:24.531125: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcusparse.so.11'; dlerror: libcusparse.so.11: cannot open shared object file: No such file or directory
2021-04-26 16:34:24.531230: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudnn.so.8'; dlerror: libcudnn.so.8: cannot open shared object file: No such file or directory
2021-04-26 16:34:24.531245: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1757] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2021-04-26 16:34:24.531641: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-04-26 16:34:24.532140: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-04-26 16:34:24.532178: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1261] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-04-26 16:34:24.532192: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1267]      
2021-04-26 16:34:24.592917: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2)
2021-04-26 16:34:24.593369: I tensorflow/core/platform/profile_utils/cpu_utils.cc:112] CPU Frequency: 2400000000 Hz

I installed tensorflow in Anaconda using conda, although I believe it was built from PyPI. Please let me know your suggestions. Thanks a lot.

2 answers

百里俭
2023-03-14

Which channel did you install from? If you use the default channel, you have to request the GPU build of tensorflow explicitly.

conda install tensorflow=2.4.*=gpu* -c anaconda

商佑运
2023-03-14

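Your error log shows that tensorflow is picking up your GPU (GTX 1650). The problem is that the cudatoolkit and cudnn versions are probably incompatible with your tensorflow version; TF is quite specific about these requirements. The error lines to pay attention to are: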

2021-04-26 16:34:24.526900: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library **'libcudart.so.11.0'**; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-04-26 16:34:24.526986: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library **'libcublas.so.11'**; dlerror: libcublas.so.11: cannot open shared object file: No such file or directory
2021-04-26 16:34:24.527069: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library **'libcublasLt.so.11'**; dlerror: libcublasLt.so.11: cannot open shared object file: No such file or directory
2021-04-26 16:34:24.528676: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10

2021-04-26 16:34:24.531125: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library **'libcusparse.so.11'**; dlerror: libcusparse.so.11: cannot open shared object file: No such file or directory
2021-04-26 16:34:24.531230: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library **'libcudnn.so.8'**; dlerror: libcudnn.so.8: cannot open shared object file: No such file or directory
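
With a long log, it can help to pull out just the library names that failed to load. The following is a minimal sketch, not part of the original answer: the function name `missing_libraries` and the constant `SAMPLE_LOG` are made up for illustration, and the regular expression assumes the `Could not load dynamic library '...'` wording shown in the log above (it also tolerates the `**` emphasis markers).

```python
import re

# Assumed pattern: the dso_loader warning wording from the log above,
# optionally with ** emphasis markers around the quoted library name.
MISSING_LIB = re.compile(r"Could not load dynamic library \*{0,2}'([^']+)'")


def missing_libraries(log_text):
    """Return library names that failed to load, in order, without duplicates."""
    seen = []
    for name in MISSING_LIB.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


# A few lines copied from the question's log, used here as sample input.
SAMPLE_LOG = """\
2021-04-26 16:34:24.526900: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-04-26 16:34:24.526986: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcublas.so.11'; dlerror: libcublas.so.11: cannot open shared object file: No such file or directory
2021-04-26 16:34:24.528676: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2021-04-26 16:34:24.531230: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudnn.so.8'; dlerror: libcudnn.so.8: cannot open shared object file: No such file or directory
"""

if __name__ == "__main__":
    for lib in missing_libraries(SAMPLE_LOG):
        print(lib)
```

On the sample lines this lists libcudart.so.11.0, libcublas.so.11 and libcudnn.so.8. The version suffix in each name tells you which CUDA and cuDNN release tensorflow is looking for.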

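The latest tensorflow release, tensorflow-2.4.0 (see the full table), only works with cuDNN 8.0 and CUDA 11.0. (Newer versions of both have been released since, so check which versions you have; I think you are probably on CUDA 10.)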
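
One rough way to see which of these libraries the dynamic loader can find is to try opening them from Python with `ctypes`. This is only a sketch under stated assumptions: the names in `REQUIRED_LIBS` are copied from the dso_loader lines in the question's log, `probe` is a hypothetical helper, and `ctypes.CDLL` uses the current process's library search path, which may differ from the paths TensorFlow itself searches (for example inside a conda environment).

```python
import ctypes

# Library names copied from the dso_loader lines in the question's log.
REQUIRED_LIBS = [
    "libcudart.so.11.0",
    "libcublas.so.11",
    "libcublasLt.so.11",
    "libcufft.so.10",
    "libcurand.so.10",
    "libcusolver.so.10",
    "libcusparse.so.11",
    "libcudnn.so.8",
]


def probe(lib_names):
    """Map each library name to True if the dynamic loader can open it."""
    results = {}
    for name in lib_names:
        try:
            ctypes.CDLL(name)
            results[name] = True
        except OSError:
            # Raised when the shared object cannot be found or loaded.
            results[name] = False
    return results


if __name__ == "__main__":
    for name, found in probe(REQUIRED_LIBS).items():
        print(("found    " if found else "MISSING  ") + name)
```

If the `.so.11` entries show up as missing while PyTorch still works, one possible explanation is that PyTorch is using CUDA libraries bundled with its own package rather than a system-wide CUDA 11.0 install.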

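I recommend taking a look at this article (it is fairly old, but the commands and principles still apply).

  1. Make a yaml file (example yaml file for tensorflow)
  2. Create a new environment for Tensorflow using the above yaml file

conda env create -f environment.yml

conda activate tensorflow_env_388

Note: a fresh environment avoids any conflicting packages.

conda list cudnn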

# packages in environment at /rds/general/user/home/anaconda3/envs/tensorflow_env_388:
#
# Name                    Version                   Build  Channel
cudnn                     7.0.5.39             ha5ca753_1    conda-forge

conda list cudatoolkit
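
The two `conda list` checks can also be compared against the required versions in a script. The sketch below is illustrative only: `parse_conda_list`, `major_minor`, `mismatches` and `REQUIRED` are hypothetical names, the required versions are the ones this answer states for tensorflow 2.4, and the parser assumes the whitespace-separated column layout shown in the cudnn output above with purely numeric version parts.

```python
# Versions this answer says tensorflow 2.4 needs (major, minor).
REQUIRED = {"cudatoolkit": (11, 0), "cudnn": (8, 0)}


def parse_conda_list(text):
    """Parse `conda list` output into {package_name: version_string}."""
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the commented header
        fields = line.split()
        if len(fields) >= 2:
            packages[fields[0]] = fields[1]
    return packages


def major_minor(version):
    """Reduce a version such as '7.0.5.39' to (7, 0)."""
    return tuple(int(part) for part in version.split(".")[:2])


def mismatches(packages, required=REQUIRED):
    """Return {name: (found_or_None, wanted)} for missing or differing packages."""
    problems = {}
    for name, wanted in required.items():
        found = packages.get(name)
        if found is None or major_minor(found) != wanted:
            problems[name] = (found, wanted)
    return problems


# The cudnn listing shown above, used as sample input.
SAMPLE = """\
# packages in environment at /rds/general/user/home/anaconda3/envs/tensorflow_env_388:
#
# Name                    Version                   Build  Channel
cudnn                     7.0.5.39             ha5ca753_1    conda-forge
"""

if __name__ == "__main__":
    for name, (found, wanted) in mismatches(parse_conda_list(SAMPLE)).items():
        print(name, "found:", found, "wanted:", ".".join(map(str, wanted)))
```

For the sample listing this flags cudnn (7.0 instead of 8.0) and reports cudatoolkit as absent, since only the cudnn listing was given as input.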

Then install cudnn/cuda as required:

conda install cudatoolkit=11.0

conda install cudnn=8.0
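
After installing, it is worth confirming from Python that tensorflow now registers the GPU. The snippet below is a small sketch rather than part of the original answer: `visible_gpus` is a hypothetical helper, and it relies on `tf.config.list_physical_devices`, which is available in recent TensorFlow 2.x releases, including the 2.4 release discussed here.

```python
def visible_gpus():
    """Return names of GPUs tensorflow can see, or None if tensorflow is not importable."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    gpus = visible_gpus()
    if gpus is None:
        print("tensorflow is not installed in this environment")
    elif not gpus:
        print("tensorflow imported, but no GPU was registered")
    else:
        print("GPUs visible to tensorflow:", gpus)
```

An empty list here, together with fresh "Could not load dynamic library" warnings, would suggest the CUDA/cuDNN versions in the environment still do not match what tensorflow expects.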
