[GH-ISSUE #23] nvidia-ml-py version conflicts with other packages (e.g., gpustat) #19

Closed
opened 2026-05-05 03:22:13 -06:00 by gitea-mirror · 6 comments
Owner

Originally created by @wookayin on GitHub (Jul 4, 2022).
Original GitHub issue: https://github.com/XuehaiPan/nvitop/issues/23

Originally assigned to: @XuehaiPan on GitHub.

Context: https://github.com/wookayin/gpustat/pull/107 (trying to use `nvidia-ml-py`). Related issues: #4

Hello @XuehaiPan,

I just realized that `nvitop` requires `nvidia-ml-py` to be pinned at `11.450.51` due to the incompatible API, as discussed in wookayin/gpustat#107. *My solution* (in `gpustat`) to this bothersome library is to require a `pynvml` version greater than 11.450.129, but this would create some nuisance problems for normal users who may have both `nvitop` and `gpustat>=1.0` installed.

From nvitop's README:

> IMPORTANT: pip will install nvidia-ml-py==11.450.51 as a dependency for nvitop. Please verify whether the nvidia-ml-py package is compatible with your NVIDIA driver version. You can check the release history of nvidia-ml-py at [nvidia-ml-py's Release History](https://pypi.org/project/nvidia-ml-py/11.450.51/#history), and install the compatible version manually by:

> Since nvidia-ml-py>=11.450.129, the definition of nvmlProcessInfo_t has introduced two new fields gpuInstanceId and computeInstanceId (GI ID and CI ID in newer nvidia-smi) which are incompatible with some old NVIDIA drivers. nvitop may not display the processes correctly due to this incompatibility.

Is having the `pynvml` version NOT pinned at a specific version an option for you? More specifically, `nvmlDeviceGetComputeRunningProcesses_v2` has existed since 11.450.129. In my opinion, pinning `nvidia-ml-py` at such an old and specific version isn't a great idea, although I also admit that the solution I accepted isn't ideal at all.

We could discuss and coordinate to avoid any package conflict issues, because in the current situation `gpustat` and `nvitop` would not be compatible with each other due to the `nvidia-ml-py` version.
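As an aside on the version numbers involved: `nvidia-ml-py` releases such as `11.450.51` and `11.450.129` compare component-wise as integers, so `11.450.129` is the *newer* release even though a naive string comparison suggests otherwise. A minimal illustration (not part of either project's code):

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Split an nvidia-ml-py style version into integer components."""
    return tuple(int(part) for part in version.split("."))

# 11.450.129 is newer than 11.450.51 (129 > 51 in the last component) ...
print(parse_version("11.450.129") > parse_version("11.450.51"))  # True
# ... even though a plain string comparison orders them the other way.
print("11.450.129" > "11.450.51")  # False
```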

gitea-mirror 2026-05-05 03:22:13 -06:00

@XuehaiPan commented on GitHub (Jul 4, 2022):

Hi @wookayin, thanks for raising this.

The reason for pinning the version of `nvidia-ml-py == 11.450.51` is to support Ubuntu 16.04 LTS. For the same reason, I also do not drop support for Python 3.5 (the default Python on Ubuntu 16.04 LTS), although both Ubuntu 16.04 and Python 3.5 are out of maintenance.

The maximum NVIDIA driver version in `ppa:graphics-drivers` is the R430 driver (the maximum supported CUDA version is 10.1). The struct `nvmlProcessInfo_v2_t` was introduced in CUDA 11.x, which requires at least the R450 driver. An admin can upgrade the NVIDIA driver on Ubuntu 16.04 to R465 with a CUDA `.deb` file, but I think the goal of a PyPI package should be to remain usable for users who do not have `sudo` privileges.

> Is having `pynvml` version NOT pinned at the specific version an option for you?

I'm going to add an optional argument to `pip` and let the user choose the version of `nvidia-ml-py` themselves, as:

```bash
pip3 install 'nvitop[pynvml-11.450.51]'
pip3 install 'nvitop[pynvml-11.450.129]'
```

EDIT: The `extra` option in `package[extra]` should start with a letter, not a number.
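To sketch how such versioned extras might look in a `setup.py` (a hypothetical fragment for illustration, not nvitop's actual packaging code; the extra names here start with a letter, matching the constraint noted in the EDIT):

```python
# Hypothetical setuptools extras sketch: let users pick the nvidia-ml-py
# pin via an extra. Names begin with a letter, not a digit.
extras_require = {
    "pynvml-11.450.51": ["nvidia-ml-py == 11.450.51"],    # CUDA 10.x / R430 drivers
    "pynvml-11.450.129": ["nvidia-ml-py == 11.450.129"],  # CUDA 11.x / R450+ drivers
}
```

This would be passed as `extras_require=extras_require` to `setuptools.setup()`, so `pip3 install 'nvitop[pynvml-11.450.51]'` selects the older pin.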


@wookayin commented on GitHub (Jul 4, 2022):

Thank you, I wasn't aware of the Ubuntu 16.04 LTS compatibility concern, but considering more modern systems, it would make more sense to allow CUDA-11-compatible (driver 450+) `pynvml` installations.

> I'm going to add an optional argument to pip and let the user choose the version of nvidia-ml-py themselves, as:

That sounds like `nvitop` may support different versions of `pynvml`, which would be great, though features might be incomplete if the driver and library installations do not match. We might need to add some fallback mechanisms --- I don't have a universal solution yet without bundling (copy-pasting) the source code of the `nvmlProcessInfo` struct or the `nvmlDeviceGetComputeRunningProcesses_v2` function.


@XuehaiPan commented on GitHub (Jul 4, 2022):

> We might need to add some fallback mechanisms --- I don't have a universal solution yet without bundling (copy-pasting) the source code of the `nvmlProcessInfo` struct or the `nvmlDeviceGetComputeRunningProcesses_v2` function.

I haven't found a way to solve this either :(. The module `pynvml.py` provides either the v1 or the v2 struct of `nvmlProcessInfo_t`, not both. We need to copy-paste the source code.
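As a rough sketch of what such copy-pasting could look like: both layouts of the process-info struct can be defined side by side with `ctypes`, so a caller could attempt the v2 call and fall back to the v1 layout on older drivers. The field layout below is an assumption reconstructed from the fields named in this thread, not nvitop's actual implementation:

```python
import ctypes


class c_nvmlProcessInfo_v1_t(ctypes.Structure):
    """Layout shipped in nvidia-ml-py <= 11.450.51 (pre-MIG drivers)."""
    _fields_ = [
        ("pid", ctypes.c_uint),
        ("usedGpuMemory", ctypes.c_ulonglong),
    ]


class c_nvmlProcessInfo_v2_t(ctypes.Structure):
    """Layout since nvidia-ml-py 11.450.129: adds the MIG instance IDs."""
    _fields_ = [
        ("pid", ctypes.c_uint),
        ("usedGpuMemory", ctypes.c_ulonglong),
        ("gpuInstanceId", ctypes.c_uint),      # "GI ID" in newer nvidia-smi
        ("computeInstanceId", ctypes.c_uint),  # "CI ID" in newer nvidia-smi
    ]


# A fallback mechanism could call the process-listing NVML function with
# the v2 struct first and retry with the v1 struct if the driver rejects it.
```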


@XuehaiPan commented on GitHub (Jul 5, 2022):

> We could discuss and coordinate together to avoid any package conflict issues, because in the current situation `gpustat` and `nvitop` would not be compatible with each other due to the `nvidia-ml-py` version.

@wookayin I have loosened the dependency constraints to `nvidia-ml-py >=11.450.51,<11.500.00` in PR #24.

In `gpustat`, the constraints are `nvidia-ml-py >=11.450.129,<=11.495.46` (in `nvitop`, the extra version `11.450.51` is provided for CUDA 10.x drivers).

https://github.com/wookayin/gpustat/blob/cd65b1ef5fe115fd057691a2246263844dffed86/setup.py#L77-L82

For a fresh installation, both `gpustat` and `nvitop` will install `nvidia-ml-py == 11.495.46`:

```bash
pip3 install gpustat
pip3 install nvitop
```
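The overlap between the two projects' constraints can be checked mechanically. A small sketch using plain integer-tuple comparison (rather than a packaging library), encoding the two ranges quoted above:

```python
def parse(v: str) -> tuple[int, ...]:
    """Split a version like '11.495.46' into integer components."""
    return tuple(int(part) for part in v.split("."))


def nvitop_ok(v: str) -> bool:
    # nvitop after PR #24: nvidia-ml-py >= 11.450.51, < 11.500.00
    return parse("11.450.51") <= parse(v) < parse("11.500.00")


def gpustat_ok(v: str) -> bool:
    # gpustat: nvidia-ml-py >= 11.450.129, <= 11.495.46
    return parse("11.450.129") <= parse(v) <= parse("11.495.46")


# 11.495.46 satisfies both projects' constraints ...
print(nvitop_ok("11.495.46") and gpustat_ok("11.495.46"))  # True
# ... while 11.450.51 satisfies only nvitop's.
print(nvitop_ok("11.450.51"), gpustat_ok("11.450.51"))     # True False
```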

@mjkanji commented on GitHub (Sep 7, 2023):

Hi @XuehaiPan,

I'm running into the same issue when trying to install `nvitop` with `cuml` in a conda environment:

```bash
micromamba create -n test -c conda-forge -c rapidsai -c conda-forge cuml cuda-version=11.8 nvitop
```

This installs a *really* old version of `cuml` (0.6.1), because `cuml` depends on `dask-cuda`, which depends on `pynvml`.

I don't know much about how `conda` packages are built and don't fully understand the discussion above about NVIDIA driver internals, but a fix that resolves this for the conda package would be much appreciated. Thanks!


@XuehaiPan commented on GitHub (Sep 9, 2023):

Hi @mjkanji, maybe you can use [`pipx`](https://pypa.github.io/pipx) to install `nvitop` in an isolated environment. You can also set it as an alias:

```bash
alias nvitop='pipx run nvitop'
```
Reference: github-starred/nvitop#19