Could not load dynamic library 'libcudnn.so.7'
Background:
After installing tensorflow 2.0 together with cuda 10.0 and cudnn 7.4.2, calling tf.test.is_gpu_available() returned False and logged the following warnings:
W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudnn.so.7'; dlerror: libcudnn.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-10.0/lib64
2023-06-17 15:25:07.128275: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1641] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
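I was sure that cuda 10.0 was installed correctly. The cudnn files had been copied into /usr/local/cuda-10.0/lib64 and /usr/local/cuda-10.0/include and given read permission, and the LD_LIBRARY_PATH environment variable was set; running echo $LD_LIBRARY_PATH in a terminal confirmed it. Even so, tensorflow still could not load libcudnn.so.7 from that path, no matter what I tried.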
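Before suspecting tensorflow itself, it can help to check the two things the warning depends on: whether a file named libcudnn.so.7 actually exists in one of the LD_LIBRARY_PATH directories, and whether the dynamic loader can open it. The following is a minimal sketch using only the Python standard library; the helper names find_on_ld_library_path and try_dlopen are made up for this illustration and are not part of tensorflow or cuda.

```python
import ctypes
import os


def find_on_ld_library_path(lib_name, ld_library_path=None):
    """Return every existing file named lib_name in the given search path.

    ld_library_path is a colon-separated string; when omitted, the current
    LD_LIBRARY_PATH environment variable is used.
    """
    if ld_library_path is None:
        ld_library_path = os.environ.get("LD_LIBRARY_PATH", "")
    hits = []
    for directory in ld_library_path.split(":"):
        if not directory:
            continue
        candidate = os.path.join(directory, lib_name)
        # os.path.exists follows symlinks, so a dangling symlink is skipped.
        if os.path.exists(candidate):
            hits.append(candidate)
    return hits


def try_dlopen(path):
    """Try to load the library at path; return None on success, else the error text."""
    try:
        ctypes.CDLL(path)
        return None
    except OSError as exc:
        return str(exc)


if __name__ == "__main__":
    # Hypothetical usage: report every libcudnn.so.7 candidate and its load status.
    for path in find_on_ld_library_path("libcudnn.so.7"):
        print(path, "->", try_dlopen(path) or "loaded OK")
```

If the file is found but try_dlopen returns an error, the file is present yet unusable (wrong permissions, wrong architecture, or a corrupt copy), which matches the situation described here.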
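Troubleshooting:
Run ldconfig /usr/local/cuda-10.0/lib64. If it prints something like "ldconfig is for unknown machine 21", you have found the problem. According to what I found online, this message means the libcudnn file is not compatible with the system; in other words, the cudnn package I had downloaded was built for a different kind of machine. I therefore downloaded a different cudnn package from https://developer.nvidia.com/compute/machine-learning/cudnn/secure/v7.4.2/prod/10.0_20181213/cudnn-10.0-linux-x64-v7.4.2.24.tgz, reinstalled it, and granted read permission again, after which the problem was resolved.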
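The same check can be made without ldconfig by reading the e_machine field from the library's ELF header. In the ELF specification the value 21 is EM_PPC64 (64-bit PowerPC) and 62 is EM_X86_64, so "unknown machine 21" plausibly means the first download was a PowerPC build, although the post does not say which package it was. The sketch below is an illustration using only the standard library; the function name elf_machine and the deliberately partial ELF_MACHINES table are assumptions of this example.

```python
import struct

# Partial table of e_machine values from the ELF specification.
ELF_MACHINES = {
    3: "x86 (32-bit)",
    21: "PowerPC 64-bit",
    40: "ARM (32-bit)",
    62: "x86-64",
    183: "AArch64",
}


def elf_machine(path):
    """Return the numeric e_machine field of the ELF file at path."""
    with open(path, "rb") as f:
        header = f.read(20)
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError(f"{path} is not an ELF file")
    # e_ident[5] (EI_DATA): 1 means little-endian, 2 means big-endian.
    byte_order = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18, after e_ident (16) and e_type (2).
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return machine


def describe_machine(path):
    """Return a readable name for the file's target architecture."""
    machine = elf_machine(path)
    return ELF_MACHINES.get(machine, f"unknown machine {machine}")
```

Calling describe_machine("/usr/local/cuda-10.0/lib64/libcudnn.so.7") and comparing the result with the output of uname -m shows whether the installed library was built for the host architecture.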