Running nvidia-smi shows: NVIDIA-SMI has failed because it couldn't communicate wi

news/2025/2/1 5:09:24/

Symptom

Running nvidia-smi prints the following error:

jiang@jiang-ThinkStation-P520:~$ nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

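Everything worked when I tested it a few days ago, then it suddenly stopped working.
I checked, and both CUDA and cuDNN are still installed.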

jiang@jiang-ThinkStation-P520:~$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_19:24:38_PDT_2019
Cuda compilation tools, release 10.2, V10.2.89
jiang@jiang-ThinkStation-P520:~$ cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 6
#define CUDNN_PATCHLEVEL 5
--
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)

#include "driver_types.h"
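
Note that nvcc and cudnn.h only confirm that the CUDA toolkit and cuDNN headers are on disk; they say nothing about whether the NVIDIA kernel module is loaded, which is what nvidia-smi needs. A minimal sketch of a direct check follows. The function name nvidia_module_loaded is a hypothetical helper, not something from the original post.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): reads lsmod-style text on
# stdin and succeeds only if a module named exactly "nvidia" is listed.
nvidia_module_loaded() {
    awk '$1 == "nvidia" { found = 1 } END { exit !found }'
}

# Usage on a real machine (not executed here):
#   lsmod | nvidia_module_loaded && echo "nvidia module loaded" \
#                                || echo "nvidia module NOT loaded"
#   cat /proc/driver/nvidia/version   # present only while the module is loaded
```

If the module is not loaded, the toolkit checks above can all pass while nvidia-smi still fails.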

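Cause analysis

A Baidu search turned up reports that the kernel had been upgraded automatically and no longer matched the NVIDIA driver, so the kernel version has to be pinned.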
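
One way to test this hypothesis is to compare the running kernel with the kernels for which DKMS has an nvidia module installed. The sketch below assumes `dkms status` prints lines such as `nvidia, 460.56, 5.4.0-91-generic, x86_64: installed` (newer DKMS releases print `nvidia/460.56, ...`); the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads `dkms status` text on stdin and succeeds only if
# an nvidia module is reported as installed for the kernel given as $1.
# Assumed line formats:
#   nvidia, 460.56, 5.4.0-91-generic, x86_64: installed
#   nvidia/460.56, 5.4.0-91-generic, x86_64: installed
nvidia_installed_for_kernel() {
    awk -v k="$1" '
        $0 ~ "^nvidia[,/]" && index($0, ", " k ", ") && /: installed/ { ok = 1 }
        END { exit !ok }
    '
}

# Usage on a real machine (not executed here):
#   dkms status | nvidia_installed_for_kernel "$(uname -r)" \
#       || echo "no nvidia module built for $(uname -r)"
```

If the check fails for the running kernel, a kernel upgrade without a matching module rebuild is a likely cause.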

Solution

The following steps got it working again for me.

(1) First, check which NVIDIA driver version is installed

ls /usr/src | grep nvidia
jiang@jiang-ThinkStation-P520:~$ ls /usr/src | grep nvidia
nvidia-460.56
jiang@jiang-ThinkStation-P520:~$ 
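
To avoid copying the version by hand, the number can be pulled out of the directory name. This is a sketch that assumes the source tree is named nvidia-<version> as in the listing above; nvidia_src_version is a hypothetical helper, and it simply takes the first match if several versions are present.

```shell
#!/bin/sh
# Hypothetical helper: reads a directory listing on stdin and prints the
# version part of the first entry named nvidia-<version>.
nvidia_src_version() {
    sed -n 's/^nvidia-\([0-9][0-9.]*\)$/\1/p' | head -n 1
}

# Usage on a real machine (not executed here):
#   ls /usr/src | nvidia_src_version
```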

(2) Then run the following commands in a terminal

sudo apt install dkms
sudo dkms install -m nvidia -v 460.56
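
The same step can be written so that the target kernel is explicit and the source tree is checked first. This is only a sketch: dkms_install_cmd is a hypothetical helper that prints the command for review instead of running it, and the optional third argument (source root, default /usr/src) exists only to make the function easy to try out. The `-k` option tells dkms which kernel to build for; without it, dkms targets the running kernel.

```shell
#!/bin/sh
# Hypothetical helper: prints the dkms command for a given driver version and
# kernel, after checking that the source tree <src_root>/nvidia-<version>
# exists. It does not run dkms itself.
dkms_install_cmd() {
    version="$1"
    kernel="$2"
    src_root="${3:-/usr/src}"
    if [ ! -d "${src_root}/nvidia-${version}" ]; then
        echo "no source tree at ${src_root}/nvidia-${version}" >&2
        return 1
    fi
    printf 'dkms install -m nvidia -v %s -k %s\n' "$version" "$kernel"
}

# Usage on a real machine (not executed here):
#   dkms_install_cmd 460.56 "$(uname -r)"   # review the printed command,
#                                           # then run it with sudo
```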

(3) Run nvidia-smi again; it now prints the usual GPU status table (the full output is at the end of the process log below).

Process log

jiang@jiang-ThinkStation-P520:~$ ls /usr/src | grep nvidia
nvidia-460.56
jiang@jiang-ThinkStation-P520:~$ sudo apt install dkms
Reading package lists... Done
Building dependency tree
Reading state information... Done
dkms is already the newest version (2.3-3ubuntu9.7).
dkms set to manually installed.
The following packages were automatically installed and are no longer required:
  libatomic1:i386 libbsd0:i386 libdrm-amdgpu1:i386 libdrm-intel1:i386 libdrm-nouveau2:i386 libdrm-radeon1:i386 libdrm2:i386 libedit2:i386 libelf1:i386 libexpat1:i386
  libffi6:i386 libfwup1 libgl1:i386 libgl1-mesa-dri:i386 libglapi-mesa:i386 libglvnd0:i386 libglx-mesa0:i386 libglx0:i386 libllvm10:i386 libllvm9 libnvidia-cfg1-440-server
  libnvidia-cfg1-450-server libnvidia-common-440 libnvidia-common-450 libnvidia-common-460 libnvidia-compute-440-server libnvidia-compute-450-server
  libnvidia-decode-440-server libnvidia-decode-450-server libnvidia-encode-440-server libnvidia-encode-450-server libnvidia-extra-440-server libnvidia-extra-450-server
  libnvidia-fbc1-440-server libnvidia-fbc1-450-server libpciaccess0:i386 libsensors4:i386 libstdc++6:i386 libx11-6:i386 libx11-xcb1:i386 libxau6:i386 libxcb-dri2-0:i386
  libxcb-dri3-0:i386 libxcb-glx0:i386 libxcb-present0:i386 libxcb-sync1:i386 libxcb1:i386 libxdamage1:i386 libxdmcp6:i386 libxext6:i386 libxfixes3:i386 libxnvctrl0
  libxshmfence1:i386 libxxf86vm1:i386 linux-hwe-5.4-headers-5.4.0-47 linux-hwe-5.4-headers-5.4.0-48 nvidia-compute-utils-440-server nvidia-compute-utils-450-server
  nvidia-prime nvidia-settings nvidia-utils-440-server nvidia-utils-450-server screen-resolution-extra xserver-xorg-video-nvidia-440-server
  xserver-xorg-video-nvidia-450-server
Use 'sudo apt autoremove' to remove them.
0 upgraded, 0 newly installed, 0 to remove and 73 not upgraded.
jiang@jiang-ThinkStation-P520:~$ sudo dkms install -m nvidia -v 460.56

Creating symlink /var/lib/dkms/nvidia/460.56/source ->
                 /usr/src/nvidia-460.56

DKMS: add completed.

Kernel preparation unnecessary for this kernel.  Skipping...

Building module:
cleaning build area...
'make' -j8 NV_EXCLUDE_BUILD_MODULES='' KERNEL_UNAME=5.4.0-91-generic IGNORE_CC_MISMATCH='' modules..........
Signing module:
 - /var/lib/dkms/nvidia/460.56/5.4.0-91-generic/x86_64/module/nvidia-uvm.ko
 - /var/lib/dkms/nvidia/460.56/5.4.0-91-generic/x86_64/module/nvidia-modeset.ko
 - /var/lib/dkms/nvidia/460.56/5.4.0-91-generic/x86_64/module/nvidia-drm.ko
 - /var/lib/dkms/nvidia/460.56/5.4.0-91-generic/x86_64/module/nvidia.ko
Secure Boot not enabled on this system.
cleaning build area...

DKMS: build completed.

nvidia.ko:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/5.4.0-91-generic/updates/dkms/

nvidia-uvm.ko:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/5.4.0-91-generic/updates/dkms/

nvidia-modeset.ko:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/5.4.0-91-generic/updates/dkms/

nvidia-drm.ko:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/5.4.0-91-generic/updates/dkms/

depmod....

DKMS: install completed.
jiang@jiang-ThinkStation-P520:~$ nvidia-smi
Fri Dec 31 15:52:30 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.156.00   Driver Version: 460.56       CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 2080    Off  | 00000000:65:00.0 Off |                  N/A |
| 22%   50C    P0    29W / 225W |      0MiB /  7974MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
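
The header of this output reports NVIDIA-SMI 450.156.00 next to Driver Version 460.56, so the nvidia-smi tool and the kernel driver come from different releases. That did not stop nvidia-smi from working here, but it may point to leftover user-space packages from another driver series. The sketch below extracts both numbers from the header line; the function names are hypothetical, and the pattern assumes the header layout shown above.

```shell
#!/bin/sh
# Hypothetical helper: reads nvidia-smi output on stdin and prints
# "<nvidia-smi version> <driver version>" taken from the header line.
smi_versions() {
    sed -n 's/.*NVIDIA-SMI \([0-9.]*\) *Driver Version: \([0-9.]*\).*/\1 \2/p' \
        | head -n 1
}

# Hypothetical helper: succeeds only if both versions were found and are equal.
smi_versions_match() {
    set -- $(smi_versions)
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Usage on a real machine (not executed here):
#   nvidia-smi | smi_versions
#   nvidia-smi | smi_versions_match || echo "tool and driver versions differ"
```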

References:

1. https://www.jianshu.com/p/6b998ba2c6a6
2. https://blog.csdn.net/sinat_23619409/article/details/85220561

