mirror of
https://github.com/XuehaiPan/nvitop.git
synced 2026-05-15 06:06:12 -06:00
[PR #79] [MERGED] fix(api/libnvml): fix process info support for NVIDIA R535 driver (CUDA 12.2+) #150
Reference: github-starred/nvitop#150
📋 Pull Request Information
Original PR: https://github.com/XuehaiPan/nvitop/pull/79
Author: @XuehaiPan
Created: 7/14/2023
Status: ✅ Merged
Merged: 7/16/2023
Merged by: @XuehaiPan
Base: main ← Head: fix-r535-driver
📝 Commits (9)
ecb23a6 fix(api/libnvml): fix process info support for NVIDIA R535 driver
788a1fb docs(api/libnvml): add comments for type struct fields
29b047c feat(api/process): set used_gpu_cc_protected_memor for GpuProcess
090cd6b docs(api/libnvml): update docstrings
ce46b3a chore(cli): remove unreachable warnings
3486b45 style(api/libnvml): update private function name
727a432 style(api/process): update method name
db9fb6c chore(api/process): add usedGpuCcProtectedMemory to process snapshot
9c7545f deps(nvidia-ml-py): add nvidia-ml-py 12.535.77 to support list
📊 Changes
10 files changed (+425 additions, -295 deletions)
📝 .pre-commit-config.yaml (+3 -3)
📝 docs/source/spelling_wordlist.txt (+5 -0)
📝 nvitop/api/device.py (+16 -7)
📝 nvitop/api/libnvml.py (+351 -245)
📝 nvitop/api/process.py (+38 -7)
📝 nvitop/api/utils.py (+8 -0)
📝 nvitop/cli.py (+1 -31)
📝 nvitop/version.py (+1 -0)
📝 pyproject.toml (+1 -1)
📝 requirements.txt (+1 -1)
📄 Description
Issue Type
Description
Starting with the NVIDIA R510 driver, new version 3 APIs were added for nvmlDeviceGet{Compute,Graphics,MPSCompute}RunningProcesses. However, the version 3 functions still used the version 2 type struct as the function argument type.
Recently, the NVIDIA R535 driver came out. Its version 3 APIs start to use the new version 3 type struct without a version bump. Passing the old struct to them results in invalid memory access and produces wrong results.
The two type structs have different sizes:
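The size difference can be illustrated with a small ctypes sketch. The class names here are hypothetical, and the field layouts are an assumption based on the usedGpuCcProtectedMemory field named in this PR's commits; the NVML header shipped with the driver is the authority. On a typical 64-bit platform this would be expected to report 24 and 32 bytes, but the exact values depend on the platform's alignment rules.

```python
import ctypes


class ProcessInfoV2(ctypes.Structure):
    """Assumed layout of the version 2 process-info struct (illustrative)."""

    _fields_ = [
        ('pid', ctypes.c_uint),
        ('usedGpuMemory', ctypes.c_ulonglong),
        ('gpuInstanceId', ctypes.c_uint),
        ('computeInstanceId', ctypes.c_uint),
    ]


class ProcessInfoV3(ctypes.Structure):
    """Assumed layout of the version 3 struct: version 2 plus one trailing field."""

    _fields_ = [
        ('pid', ctypes.c_uint),
        ('usedGpuMemory', ctypes.c_ulonglong),
        ('gpuInstanceId', ctypes.c_uint),
        ('computeInstanceId', ctypes.c_uint),
        ('usedGpuCcProtectedMemory', ctypes.c_ulonglong),
    ]


size_v2 = ctypes.sizeof(ProcessInfoV2)
size_v3 = ctypes.sizeof(ProcessInfoV3)
print(f'sizeof(v2) = {size_v2}, sizeof(v3) = {size_v3}')

# If the driver writes an array of version 3 entries into a buffer that the
# caller allocated and indexes as version 2 entries, every entry after the
# first is read at the wrong offset, and the driver may write past the end
# of the buffer.
for index in range(3):
    print(f'entry {index}: v2 offset = {index * size_v2}, v3 offset = {index * size_v3}')
```

Because the extra field sits at the end of the struct, the first entry still decodes correctly under the old layout, which is one reason a mismatch like this can go unnoticed when only a single process is running.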
This PR adds a helper function that determines the API version and type struct version of nvmlDeviceGet{Compute,Graphics,MPSCompute}RunningProcesses on the first API call.
Motivation and Context
Fixes #75
Fixes #76
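The "determine on the first API call" idea from the Description can be sketched as a cached resolver. This is not the code in nvitop/api/libnvml.py: make_version_resolver, has_symbol, and driver_uses_v3_struct are hypothetical names, the fallback order is an assumption, and how the real helper tells the two struct versions apart is not stated in this PR description. The sketch only shows the shape of the technique: probe once, cache the answer, and keep the function version and the struct version as two separate results.

```python
from functools import lru_cache
from typing import Callable, Tuple


def make_version_resolver(
    has_symbol: Callable[[str], bool],
    driver_uses_v3_struct: Callable[[], bool],
) -> Callable[[str], Tuple[str, int]]:
    """Build a resolver that maps a base function name to (symbol name, struct version).

    Both callables are hypothetical probes supplied by the caller:
    has_symbol(name) reports whether the loaded library exports `name`, and
    driver_uses_v3_struct() reports whether the `_v3` functions expect the
    version 3 struct.
    """

    @lru_cache(maxsize=None)
    def resolve(base_name: str) -> Tuple[str, int]:
        # Probing happens only on the first call per base name; later calls
        # are served from the cache.
        if has_symbol(base_name + '_v3'):
            # The `_v3` suffix alone does not identify the struct version,
            # so it has to be determined separately.
            struct_version = 3 if driver_uses_v3_struct() else 2
            return base_name + '_v3', struct_version
        if has_symbol(base_name + '_v2'):
            return base_name + '_v2', 2
        # Assumed fallback: unversioned function with the oldest struct.
        return base_name, 1

    return resolve


# Usage with a fake library that exports only the `_v3` compute function.
fake_exports = {'nvmlDeviceGetComputeRunningProcesses_v3'}
resolve = make_version_resolver(
    has_symbol=lambda name: name in fake_exports,
    driver_uses_v3_struct=lambda: True,
)
print(resolve('nvmlDeviceGetComputeRunningProcesses'))
```

Returning the struct version separately from the symbol name reflects the situation described above, where the same `_v3` function name is paired with different struct layouts depending on the driver.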
🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.