mirror of
https://github.com/XuehaiPan/nvitop.git
synced 2026-05-15 06:06:12 -06:00
[GH-ISSUE #29] [Enhancement] Backward compatible NVML Python bindings #21
Reference: github-starred/nvitop#21
Originally created by @XuehaiPan on GitHub (Jul 23, 2022).
Original GitHub issue: https://github.com/XuehaiPan/nvitop/issues/29
Originally assigned to: @XuehaiPan on GitHub.
Runtime Environment
Python version: 3.9.13
NVIDIA driver version: 470.129.06
nvitop version or commit: v0.7.1
nvidia-ml-py version: 11.450.51
Locale: en_US.UTF-8
Context
The official NVML Python bindings (PyPI package `nvidia-ml-py`) do not guarantee backward compatibility across NVIDIA driver versions. For example, NVML added `nvmlDeviceGetComputeRunningProcesses_v2` and `nvmlDeviceGetGraphicsRunningProcesses_v2` in the CUDA 11.x drivers (R450+), but the package `nvidia-ml-py` unconditionally calls the latest version of the function from the unversioned function. This causes an `NVMLError_FunctionNotFound` error on CUDA 10.x drivers (e.g. R430).
The `v3` versions of the `nvmlDeviceGet{Compute,Graphics,MPSCompute}RunningProcesses` functions now come with the R510+ drivers, for example in `nvidia-ml-py==11.515.48`.
The `v2` version of the memory struct, `c_nvmlMemory_v2_t`, is appearing on the horizon (not found in the R510 driver yet). This causes issue #13.
Possible Solutions
1. Determine the best dependency version of `nvidia-ml-py` during installation.
   This requires the user to install the NVIDIA driver first, which may not be the case on a freshly installed system. Besides, it is hard to express this driver dependency in the package metadata.
2. Wait for the PyPI package `nvidia-ml-py` to become backward compatible.
   The package `NVIDIA/go-nvml` offers backward compatible APIs. I posted this on the NVIDIA developer forums ("[PyPI/nvidia-ml-py] Issue Reports for `nvidia-ml-py`") but have not received an official response yet.
3. Vendor `nvidia-ml-py` in `nvitop`. (Note: `nvidia-ml-py` is released under the BSD License.)
   This requires bumping the vendored version and making a minor release of `nvitop` each time a new version of `nvidia-ml-py` comes out.
4. Automatically patch the `pynvml` module when the first call to a versioned API fails. This can be achieved by manipulating the module's `__dict__` attribute or the `module.__class__` attribute.
   The goal of this solution is not to provide fully backward-compatible Python bindings. That may be out of the scope of `nvitop`, e.g. `ExcludedDeviceInfo` -> `BlacklistDeviceInfo`. Also, note that this solution may cause performance issues due to a much deeper call stack.
@wookayin commented on GitHub (Oct 17, 2022):
This is a great job. gpustat will have a conflicting dependency on `nvidia-ml-py`, as it is still pinned to older versions, so I will also have to catch up to make them compatible.
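Returning to the fourth solution in the issue body (patching the `pynvml` module when a versioned call fails), the idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not the approach `nvitop` actually ships: it runs against a stand-in module built with `types.ModuleType` instead of the real `pynvml`, and `FunctionNotFound`, `make_fake_bindings`, `patch_unversioned`, and the `_v1` suffix are hypothetical names invented for the demo.

```python
import types

BASE = "nvmlDeviceGetComputeRunningProcesses"
SUFFIXES = ("_v3", "_v2", "_v1")  # newest first; "_v1" is a made-up name for this demo


class FunctionNotFound(Exception):
    """Stand-in for the NVMLError_FunctionNotFound error mentioned in the issue."""


def make_fake_bindings(driver_symbols):
    """Build a stand-in for `pynvml`; `driver_symbols` mimics what the driver exports."""
    mod = types.ModuleType("fake_pynvml")

    def make_versioned(name):
        def versioned(handle):
            if name not in driver_symbols:
                raise FunctionNotFound(name)
            return f"{name}({handle})"

        return versioned

    for suffix in SUFFIXES:
        mod.__dict__[BASE + suffix] = make_versioned(BASE + suffix)
    # Mimic the reported behavior: the unversioned name always calls the newest version.
    mod.__dict__[BASE] = lambda handle: mod.__dict__[BASE + SUFFIXES[0]](handle)
    return mod


def patch_unversioned(mod, base=BASE, suffixes=SUFFIXES):
    """Rebind `mod.<base>` to a dispatcher that falls back to older versions."""

    def dispatcher(handle):
        for suffix in suffixes:
            func = mod.__dict__[base + suffix]
            try:
                result = func(handle)
            except FunctionNotFound:
                continue  # symbol missing in this driver; try the next older version
            mod.__dict__[base] = func  # cache the hit so later calls skip probing
            return result
        raise FunctionNotFound(base)

    mod.__dict__[base] = dispatcher


# Pretend the installed driver predates the v3 functions.
old_driver = make_fake_bindings({BASE + "_v2", BASE + "_v1"})
patch_unversioned(old_driver)
print(old_driver.nvmlDeviceGetComputeRunningProcesses(0))
# prints "nvmlDeviceGetComputeRunningProcesses_v2(0)"
```

Rebinding the unversioned name to the first version that works means the probing cost is paid only once, which may soften the deeper-call-stack concern raised in the issue. Whether the same trick is safe against the real `pynvml` module would need to be checked per function.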