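After a k3s node reboot, GPU virtualization stops working and the nvidia.com/gpu count reverts to the number of physical GPUs; restarting the hami-device-plugin pod restores the correct nvidia.com/gpu count #829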
Comments
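You probably did not uninstall the NVIDIA device plugin beforehand, right?
I installed the NVIDIA GPU Operator first and then HAMi. After installing HAMi, do I need to uninstall nvidia-device-plugin-daemonset?
Yes, unless you use a custom resource name.
OK, understood. Thanks a lot!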
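To see whether a second device plugin is still running next to HAMi, one option is to list every DaemonSet whose name contains "device-plugin". This is only a sketch: the function name `list_device_plugin_daemonsets` is made up for illustration, and the namespaces on your cluster depend on how the GPU Operator and HAMi were installed.

```shell
# List every DaemonSet whose name contains "device-plugin", across all namespaces.
# Output columns: namespace, name.
# Hypothetical helper name; it only wraps standard kubectl and grep.
list_device_plugin_daemonsets() {
  kubectl get daemonsets --all-namespaces --no-headers \
    -o custom-columns=NAMESPACE:.metadata.namespace,NAME:.metadata.name \
    | grep 'device-plugin'
}
```

Calling `list_device_plugin_daemonsets` and seeing both nvidia-device-plugin-daemonset and hami-device-plugin would suggest that two plugins are registering the same nvidia.com/gpu resource name.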
After installing the NVIDIA GPU Operator, is there a way to delete nvidia-device-plugin-daemonset? I tried deleting it, but it is recreated right after deletion.
Following this issue. The documentation currently does not say that nvidia-device-plugin-daemonset has to be deleted.
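Nobody in this thread confirms a fix for the recreated DaemonSet, so treat the following as an untested sketch. The GPU Operator manages its DaemonSets itself, which would explain why a manually deleted one comes back. The operator's ClusterPolicy resource has a `devicePlugin.enabled` field, so one possible approach is to switch it off there. The function name and the default ClusterPolicy name `cluster-policy` are assumptions; check `kubectl get clusterpolicies.nvidia.com` for the actual name on your cluster.

```shell
# Ask the GPU Operator to stop deploying its own device plugin.
# $1: ClusterPolicy name (assumption: defaults to "cluster-policy").
# Hypothetical helper; not confirmed by the maintainers in this thread.
disable_operator_device_plugin() {
  kubectl patch clusterpolicies.nvidia.com "${1:-cluster-policy}" \
    --type merge \
    -p '{"spec":{"devicePlugin":{"enabled":false}}}'
}
```

For a Helm-based install, setting the chart value `devicePlugin.enabled=false` should have the same effect, under the same caveat.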
What happened:
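After a k3s node reboot, GPU virtualization stops working and the nvidia.com/gpu count reverts to the number of physical GPUs. Restarting the hami-device-plugin pod restores the correct nvidia.com/gpu count.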
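The symptom can be checked from the command line by comparing the nvidia.com/gpu count that the node advertises with the number of physical GPUs. A minimal sketch, assuming kubectl can reach the cluster; the function names and the node name used below are placeholders, not part of HAMi.

```shell
# Print the nvidia.com/gpu count a node currently advertises as allocatable.
# $1: node name
gpu_allocatable() {
  kubectl get node "$1" -o jsonpath='{.status.allocatable.nvidia\.com/gpu}'
}

# Compare the advertised count with the physical GPU count.
# $1: node name, $2: number of physical GPUs on that node
# Equal counts after a reboot match the symptom described in this issue.
check_gpu_virtualization() {
  advertised="$(gpu_allocatable "$1")"
  if [ "$advertised" = "$2" ]; then
    echo "advertised=$advertised physical=$2: matches physical count"
  else
    echo "advertised=$advertised physical=$2: differs from physical count"
  fi
}
```

On the GPU node itself, the physical count can be taken from `nvidia-smi --list-gpus | wc -l`, for example `check_gpu_virtualization my-node "$(nvidia-smi --list-gpus | wc -l | tr -d ' ')"`, where `my-node` is a placeholder. Running it before and after restarting the hami-device-plugin pod shows whether the count recovers.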
What you expected to happen:
How to reproduce it (as minimally and precisely as possible):
Anything else we need to know?:
nvidia-smi -a on your host
/etc/docker/daemon.json
sudo journalctl -r -u kubelet
dmesg
Environment:
docker version
uname -a