[GH-ISSUE #44] [Bug] Corrupted dependency of version 0.10.0 with pynvml #29

Closed
opened 2026-05-05 03:22:33 -06:00 by gitea-mirror · 2 comments

Originally created by @lounicotra on GitHub (Oct 19, 2022).
Original GitHub issue: https://github.com/XuehaiPan/nvitop/issues/44

Runtime Environment

  • Operating system and version: Ubuntu 20.04 LTS
  • Terminal emulator and version: N/A
  • Python version: 3.8.10
  • NVML version (driver version): 510.47.03 (hydra1), 515.43.04 (hydra4)
  • nvitop version or commit: 0.10.0
  • python-ml-py version: 11.450.51 (nvidia-ml-py, per the pip output below)
  • Locale: [e.g. C / C.UTF-8 / en_US.UTF-8]
  • CUDA: 11.6/11.7

Current Behavior

Version 0.10.0 fails at startup with: AttributeError: module 'pynvml' has no attribute '_nvmlGetFunctionPointer'. The sequence below shows what works and what does not. The affected machines are Dell servers running A100 GPUs, and they have the latest versions of py3nvml and pynvml installed. By contrast, I just built a new Ubuntu 20.04 system on a Dell PowerEdge R720 with CUDA 11.5 and GTX 1080s, and I was able to install 0.10.0 there with no issues; it is working fine. Thanks.

root@hydra1 ~# nvt
Wed Oct 19 13:57:29 2022
╒═════════════════════════════════════════════════════════════════════════════╕
│ NVITOP 0.9.0       Driver Version: 510.47.03      CUDA Driver Version: 11.6 │
├───────────────────────────────┬──────────────────────┬──────────────────────┤
│ GPU  Name        Persistence-M│ Bus-Id        Disp.A │ MIG M.   Uncorr. ECC │
│ Fan  Temp  Perf  Pwr:Usage/Cap│         Memory-Usage │ GPU-Util  Compute M. │
╞═══════════════════════════════╪══════════════════════╪══════════════════════╪══════════════════════════════════════════════════════╕
│   0  A100-SXM4-80GB      On   │ 00000000:01:00.0 Off │ Disabled           0 │ MEM: █████████▋ 22.4%                                │
│ N/A   28C    P0    69W / 500W │  18380MiB / 80.00GiB │      0%      Default │ UTL: ▏ 0%                                            │
├───────────────────────────────┼──────────────────────┼──────────────────────┼──────────────────────────────────────────────────────┤
│   1  A100-SXM4-80GB      On   │ 00000000:41:00.0 Off │ Disabled           0 │ MEM: █████████████████████████████████████████▊ 97%  │
│ N/A   55C    P0   142W / 500W │  77.74GiB / 80.00GiB │     99%      Default │ UTL: ██████████████████████████████████████████▋ 99% │
├───────────────────────────────┼──────────────────────┼──────────────────────┼──────────────────────────────────────────────────────┤
│   2  A100-SXM4-80GB      On   │ 00000000:81:00.0 Off │ Disabled           0 │ MEM: ▍ 1.0%                                          │
│ N/A   34C    P0    61W / 500W │    850MiB / 80.00GiB │      0%      Default │ UTL: ▏ 0%                                            │
├───────────────────────────────┼──────────────────────┼──────────────────────┼──────────────────────────────────────────────────────┤
│   3  A100-SXM4-80GB      On   │ 00000000:C1:00.0 Off │ Disabled           0 │ MEM: █████▋ 13.0%                                    │
│ N/A   28C    P0    68W / 500W │  10619MiB / 80.00GiB │      0%      Default │ UTL: ▏ 0%                                            │
╘═══════════════════════════════╧══════════════════════╧══════════════════════╧══════════════════════════════════════════════════════╛
[ CPU: █▏ 1.4%                                                                                  ]  ( Load Average:  1.91  1.65  2.59 )
[ MEM: █████▏ 6.1%                                                                              ]  [ SWP: ▏ 0.0%                     ]

╒════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╕
│ Processes:                                                                                                      root@hydra1.som.ma │
│ GPU     PID      USER  GPU-MEM %SM  %CPU  %MEM       TIME  COMMAND                                                                 │
╞════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
│   0   27113 C    root 10141MiB   0   0.5   0.5  28.0 days  trver --log-verbose=0 --strict-model-config=true --model-repos.. │
│   0   59066 C snaith+  7385MiB   0   0.0   0.8   9.8 days  /home/snani/anaconda3/envs//bin/python3.8 /home/sn.. │
├────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│   1   13134 C    root 76.91GiB  89  99.7   0.7    9:58:03  python ./scripts/speectext_bpe.py --config-path=/rpice/dgx.. │
├────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│   3   41771 C    root  9765MiB   0   0.5   0.4   46:36:44  tritr --log-verbose=0 --strict-model-config=true --model-repos.. │
╘════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╛
root@hydra1 ~#  pip3 install --upgrade nvitop
Collecting nvitop
  Downloading nvitop-0.10.0-py3-none-any.whl (159 kB)
     |████████████████████████████████| 159 kB 1.0 MB/s
Requirement already satisfied, skipping upgrade: cachetools>=1.0.1 in /usr/local/lib/python3.8/dist-packages (from nvitop) (5.0.0)
Requirement already satisfied, skipping upgrade: nvidia-ml-py<11.516.0a0,>=11.450.51 in /usr/local/lib/python3.8/dist-packages (from nvitop) (11.450.51)
Requirement already satisfied, skipping upgrade: psutil>=5.6.6 in /usr/local/lib/python3.8/dist-packages (from nvitop) (5.9.0)
Requirement already satisfied, skipping upgrade: termcolor>=1.0.0 in /usr/local/lib/python3.8/dist-packages (from nvitop) (1.1.0)
Installing collected packages: nvitop
  Attempting uninstall: nvitop
    Found existing installation: nvitop 0.9.0
    Uninstalling nvitop-0.9.0:
      Successfully uninstalled nvitop-0.9.0
Successfully installed nvitop-0.10.0
root@hydra1 ~# nvt
Traceback (most recent call last):
  File "/usr/local/bin/nvitop", line 5, in <module>
    from nvitop.cli import main
  File "/usr/local/lib/python3.8/dist-packages/nvitop/__init__.py", line 6, in <module>
    from nvitop import core
  File "/usr/local/lib/python3.8/dist-packages/nvitop/core/__init__.py", line 6, in <module>
    from nvitop.core import host, libcuda, libnvml, utils
  File "/usr/local/lib/python3.8/dist-packages/nvitop/core/libnvml.py", line 543, in <module>
    __patch_backward_compatibility_layers()
  File "/usr/local/lib/python3.8/dist-packages/nvitop/core/libnvml.py", line 539, in __patch_backward_compatibility_layers
    with_mapped_function_name()  # patch first and only for once
  File "/usr/local/lib/python3.8/dist-packages/nvitop/core/libnvml.py", line 443, in with_mapped_function_name
    _pynvml._nvmlGetFunctionPointer  # pylint: disable=protected-access
AttributeError: module 'pynvml' has no attribute '_nvmlGetFunctionPointer'
root@hydra1 ~# pip3 install nvitop==0.9.0
Collecting nvitop==0.9.0
  Using cached nvitop-0.9.0-py3-none-any.whl (157 kB)
Requirement already satisfied: psutil>=5.6.6 in /usr/local/lib/python3.8/dist-packages (from nvitop==0.9.0) (5.9.0)
Requirement already satisfied: cachetools>=1.0.1 in /usr/local/lib/python3.8/dist-packages (from nvitop==0.9.0) (5.0.0)
Requirement already satisfied: termcolor>=1.0.0 in /usr/local/lib/python3.8/dist-packages (from nvitop==0.9.0) (1.1.0)
Requirement already satisfied: nvidia-ml-py<11.500.0a0,>=11.450.51 in /usr/local/lib/python3.8/dist-packages (from nvitop==0.9.0) (11.450.51)
Installing collected packages: nvitop
  Attempting uninstall: nvitop
    Found existing installation: nvitop 0.10.0
    Uninstalling nvitop-0.10.0:
      Successfully uninstalled nvitop-0.10.0
Successfully installed nvitop-0.9.0
root@hydra1 ~# nvt
Wed Oct 19 14:00:57 2022
╒═════════════════════════════════════════════════════════════════════════════╕
│ NVITOP 0.9.0       Driver Version: 510.47.03      CUDA Driver Version: 11.6 │
├───────────────────────────────┬──────────────────────┬──────────────────────┤
│ GPU  Name        Persistence-M│ Bus-Id        Disp.A │ MIG M.   Uncorr. ECC │
│ Fan  Temp  Perf  Pwr:Usage/Cap│         Memory-Usage │ GPU-Util  Compute M. │
╞═══════════════════════════════╪══════════════════════╪══════════════════════╪══════════════════════════════════════════════════════╕
│   0  A100-SXM4-80GB      On   │ 00000000:01:00.0 Off │ Disabled           0 │ MEM: █████████▋ 22.4%                                │
│ N/A   28C    P0    70W / 500W │  18380MiB / 80.00GiB │      0%      Default │ UTL: ▏ 0%                                            │
├───────────────────────────────┼──────────────────────┼──────────────────────┼──────────────────────────────────────────────────────┤
│   1  A100-SXM4-80GB      On   │ 00000000:41:00.0 Off │ Disabled           0 │ MEM: █████████████████████████████████████████▊ 97%  │
│ N/A   55C    P0   346W / 500W │  77.74GiB / 80.00GiB │    100%      Default │ UTL: ███████████████████████████████████████████ MAX │
├───────────────────────────────┼──────────────────────┼──────────────────────┼──────────────────────────────────────────────────────┤
│   2  A100-SXM4-80GB      On   │ 00000000:81:00.0 Off │ Disabled           0 │ MEM: ▍ 1.0%                                          │
│ N/A   34C    P0    61W / 500W │    850MiB / 80.00GiB │      0%      Default │ UTL: ▏ 0%                                            │
├───────────────────────────────┼──────────────────────┼──────────────────────┼──────────────────────────────────────────────────────┤
│   3  A100-SXM4-80GB      On   │ 00000000:C1:00.0 Off │ Disabled           0 │ MEM: █████▋ 13.0%                                    │
│ N/A   28C    P0    68W / 500W │  10619MiB / 80.00GiB │      0%      Default │ UTL: ▏ 0%                                            │
╘═══════════════════════════════╧══════════════════════╧══════════════════════╧══════════════════════════════════════════════════════╛
[ CPU: █▌ 1.8%                                                                                  ]  ( Load Average:  1.42  1.53  2.35 )
[ MEM: █████▎ 6.2%                                                                              ]  [ SWP: ▏ 0.0%                     ]

╒════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╕
│ Processes:                                                                                                      root@hydra1.som.ma │
│ GPU     PID      USER  GPU-MEM %SM  %CPU  %MEM       TIME  COMMAND                                                                 │
╞════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
│   0   27113 C    root 10141MiB   0   0.5   0.5  28.0 days  trir --log-verbose=0 --strict-model-config=true --model-repos.. │
│   0   59066 C snaith+  7385MiB   0   0.0   0.8   9.8 days  /home/snani/anaconda3/envs//home/sn.. │
├────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│   1   13134 C    root 76.91GiB  88 100.6   0.7   10:01:31  python ./scripts/stext_bpe.py --config-path=/rprice/dgx.. │
├────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│   3   41771 C    root  9765MiB   0   0.5   0.4   46:40:12  tritr --log-verbose=0 --strict-model-config=true --model-repos.. │
╘════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╛
root@hydra1 ~# logout

Reverting to 0.9.0 fixes the issue.
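The traceback above fails on the attribute lookup `_pynvml._nvmlGetFunctionPointer`, so a quick way to narrow this down is to check which file the name `pynvml` resolves to and whether that module exposes the attribute. The following is a minimal diagnostic sketch, not part of nvitop; the helper name `describe_module` is hypothetical.

```python
import importlib
import importlib.util


def describe_module(name, attr):
    """Return (origin, has_attr) for module `name`.

    `origin` is the file the import system resolves `name` to, and
    `has_attr` says whether the imported module exposes `attr`.
    Returns (None, False) when the module cannot be found at all.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None, False
    module = importlib.import_module(name)
    return spec.origin, hasattr(module, attr)


if __name__ == "__main__":
    # Check the exact attribute named in the AttributeError.
    origin, present = describe_module("pynvml", "_nvmlGetFunctionPointer")
    print("pynvml resolves to:", origin)
    print("has _nvmlGetFunctionPointer:", present)
```

If the reported origin is a package directory (a path ending in `pynvml/__init__.py`) rather than a single `pynvml.py` file, that would suggest a different distribution is supplying the module than the one nvitop expects. This is an assumption to verify on the affected hosts, not something the logs above confirm.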

# Different server with CUDA 11.7
root@hydra4 aide#  pip3 install --upgrade nvitop
Collecting nvitop
  Downloading nvitop-0.10.0-py3-none-any.whl (159 kB)
     |████████████████████████████████| 159 kB 15.3 MB/s
Requirement already satisfied, skipping upgrade: psutil>=5.6.6 in /usr/local/lib/python3.8/dist-packages (from nvitop) (5.9.0)
Requirement already satisfied, skipping upgrade: termcolor>=1.0.0 in /usr/local/lib/python3.8/dist-packages (from nvitop) (1.1.0)
Requirement already satisfied, skipping upgrade: cachetools>=1.0.1 in /usr/local/lib/python3.8/dist-packages (from nvitop) (5.0.0)
Requirement already satisfied, skipping upgrade: nvidia-ml-py<11.516.0a0,>=11.450.51 in /usr/local/lib/python3.8/dist-packages (from nvitop) (11.450.51)
Installing collected packages: nvitop
  Attempting uninstall: nvitop
    Found existing installation: nvitop 0.9.0
    Uninstalling nvitop-0.9.0:
      Successfully uninstalled nvitop-0.9.0
Successfully installed nvitop-0.10.0
root@hydra4 aide# nvitop
Traceback (most recent call last):
  File "/usr/local/bin/nvitop", line 5, in <module>
    from nvitop.cli import main
  File "/usr/local/lib/python3.8/dist-packages/nvitop/__init__.py", line 6, in <module>
    from nvitop import core
  File "/usr/local/lib/python3.8/dist-packages/nvitop/core/__init__.py", line 6, in <module>
    from nvitop.core import host, libcuda, libnvml, utils
  File "/usr/local/lib/python3.8/dist-packages/nvitop/core/libnvml.py", line 543, in <module>
    __patch_backward_compatibility_layers()
  File "/usr/local/lib/python3.8/dist-packages/nvitop/core/libnvml.py", line 539, in __patch_backward_compatibility_layers
    with_mapped_function_name()  # patch first and only for once
  File "/usr/local/lib/python3.8/dist-packages/nvitop/core/libnvml.py", line 443, in with_mapped_function_name
    _pynvml._nvmlGetFunctionPointer  # pylint: disable=protected-access
AttributeError: module 'pynvml' has no attribute '_nvmlGetFunctionPointer'

root@hydra4 aide# pip3 install nvitop==0.9.0
Collecting nvitop==0.9.0
  Using cached nvitop-0.9.0-py3-none-any.whl (157 kB)
Requirement already satisfied: termcolor>=1.0.0 in /usr/local/lib/python3.8/dist-packages (from nvitop==0.9.0) (1.1.0)
Requirement already satisfied: nvidia-ml-py<11.500.0a0,>=11.450.51 in /usr/local/lib/python3.8/dist-packages (from nvitop==0.9.0) (11.450.51)
Requirement already satisfied: cachetools>=1.0.1 in /usr/local/lib/python3.8/dist-packages (from nvitop==0.9.0) (5.0.0)
Requirement already satisfied: psutil>=5.6.6 in /usr/local/lib/python3.8/dist-packages (from nvitop==0.9.0) (5.9.0)
Installing collected packages: nvitop
  Attempting uninstall: nvitop
    Found existing installation: nvitop 0.10.0
    Uninstalling nvitop-0.10.0:
      Successfully uninstalled nvitop-0.10.0
Successfully installed nvitop-0.9.0
root@hydra4 aide# nvt
Wed Oct 19 14:01:18 2022
╒═════════════════════════════════════════════════════════════════════════════╕
│ NVITOP 0.9.0       Driver Version: 515.43.04      CUDA Driver Version: 11.7 │
├───────────────────────────────┬──────────────────────┬──────────────────────┤
│ GPU  Name        Persistence-M│ Bus-Id        Disp.A │ MIG M.   Uncorr. ECC │
│ Fan  Temp  Perf  Pwr:Usage/Cap│         Memory-Usage │ GPU-Util  Compute M. │
╞═══════════════════════════════╪══════════════════════╪══════════════════════╪══════════════════════════════════════════════════════╕
│   0  A100-SXM4-80GB      On   │ 00000000:01:00.0 Off │  Enabled           0 │ MEM: ▍ 1.0%                                          │
│ N/A   26C    P0    51W / 500W │    854MiB / 80.00GiB │     N/A      Default │ UTL: ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ N/A │
├───────────────────────────────┼──────────────────────┼──────────────────────┼──────────────────────────────────────────────────────┤
│ 0:0      2g.20gb @ GI/CI: 3/0 │     13MiB / 19968MiB │ BAR1:    18MiB /  0% │ MEM: ▏ 0.1%                                          │
│ 0:1      2g.20gb @ GI/CI: 4/0 │     13MiB / 19968MiB │ BAR1:    22MiB /  0% │ MEM: ▏ 0.1%                                          │
│ 0:2      2g.20gb @ GI/CI: 5/0 │     13MiB / 19968MiB │ BAR1:     2MiB /  0% │ MEM: ▏ 0.1%                                          │
├───────────────────────────────┼──────────────────────┼──────────────────────┼──────────────────────────────────────────────────────┤
│   1  A100-SXM4-80GB      On   │ 00000000:41:00.0 Off │ Disabled           0 │ MEM: █████▊ 13.5%                                    │
│ N/A   38C    P0   157W / 500W │  11027MiB / 80.00GiB │     89%      Default │ UTL: ██████████████████████████████████████▎ 89%     │
├───────────────────────────────┼──────────────────────┼──────────────────────┼──────────────────────────────────────────────────────┤
│   2  A100-SXM4-80GB      On   │ 00000000:81:00.0 Off │ Disabled           0 │ MEM: █████▊ 13.5%                                    │
│ N/A   42C    P0   139W / 500W │  11043MiB / 80.00GiB │     90%      Default │ UTL: ██████████████████████████████████████▊ 90%     │
├───────────────────────────────┼──────────────────────┼──────────────────────┼──────────────────────────────────────────────────────┤
│   3  A100-SXM4-80GB      On   │ 00000000:C1:00.0 Off │ Disabled           0 │ MEM: ▍ 1.0%                                          │
│ N/A   25C    P0    55W / 500W │    815MiB / 80.00GiB │      0%      Default │ UTL: ▏ 0%                                            │
╘═══════════════════════════════╧══════════════════════╧══════════════════════╧══════════════════════════════════════════════════════╛
[ CPU: ███▎ 3.8%                                                                                ]  ( Load Average:  6.83  7.11  6.50 )
[ MEM: ███▌ 4.2%                                                                                ]  [ SWP: ▏ 0.0%                     ]

╒════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╕
│ Processes:                                                                                                 root@hydra4.rdct.som.ma │
│ GPU     PID      USER  GPU-MEM %SM  %CPU  %MEM   TIME  COMMAND                                                                     │
╞════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
│   1   25414 C aljolje 10209MiB  84 100.3   0.5  25:02  /n/redta/rc043h/PYTORCHY/miniconda2/envs/py3.9_torch1.10_cuda11.3/bin/p.. │
├────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│   2   29172 C aljolje 10225MiB  86 100.0   0.5  20:35  /n/redta/rc043h/PYTORCHY/miniconda2/envs/py3.9_torch1.10_cuda11.3/bin/p.. │
╘════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╛
root@hydra4 aide# pip3 install --upgrade py3nvml
Requirement already up-to-date: py3nvml in /usr/local/lib/python3.8/dist-packages (0.2.7)
Requirement already satisfied, skipping upgrade: xmltodict in /usr/local/lib/python3.8/dist-packages (from py3nvml) (0.12.0)
root@hydra4 aide# pip3 install --upgrade pynvml
Requirement already up-to-date: pynvml in /usr/local/lib/python3.8/dist-packages (11.4.1)
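The pip output above shows nvidia-ml-py 11.450.51, pynvml 11.4.1, and py3nvml 0.2.7 installed side by side in the same `dist-packages` directory. As far as I know, both the nvidia-ml-py and pynvml distributions install a top-level module named `pynvml`, so one may shadow the other; this is an assumption and is not confirmed in this report. A small sketch for listing which of these distributions are present (the helper name `installed_versions` is hypothetical):

```python
from importlib import metadata  # available in the standard library since Python 3.8


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Distribution names taken from the pip output in this report.
    for dist, version in installed_versions(
        ["nvitop", "nvidia-ml-py", "pynvml", "py3nvml"]
    ).items():
        print(f"{dist}: {version or 'not installed'}")
```

Comparing this listing between the failing A100 hosts and the working GTX 1080 machine would show whether the set of NVML-related packages differs between them.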
Processes: root@hydra4.rdct.som.ma │ │ GPU PID USER GPU-MEM %SM %CPU %MEM TIME COMMAND │ ╞════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡ │ 1 25414 C aljolje 10209MiB 84 100.3 0.5 25:02 /n/redta/rc043h/PYTORCHY/miniconda2/envs/py3.9_torch1.10_cuda11.3/bin/p.. │ ├────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤ │ 2 29172 C aljolje 10225MiB 86 100.0 0.5 20:35 /n/redta/rc043h/PYTORCHY/miniconda2/envs/py3.9_torch1.10_cuda11.3/bin/p.. │ ╘════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╛ root@hydra4 aide# pip3 install --upgrade py3nvml Requirement already up-to-date: py3nvml in /usr/local/lib/python3.8/dist-packages (0.2.7) Requirement already satisfied, skipping upgrade: xmltodict in /usr/local/lib/python3.8/dist-packages (from py3nvml) (0.12.0) root@hydra4 aide# pip3 install --upgrade pynvml Requirement already up-to-date: pynvml in /usr/local/lib/python3.8/dist-packages (11.4.1) ```

@XuehaiPan commented on GitHub (Oct 20, 2022):

Hi @lounicotra, thanks for the feedback. This happens when the dependency package `pynvml.py` is corrupted. I will add a more informative message for this in a patch release.

Please reinstall `nvitop` and `nvidia-ml-py` as follows:

pip3 install --force-reinstall nvitop nvidia-ml-py

or install `nvitop` in a new clean virtual environment.

------

> Version 0.10.0 complains about 'pynvml' has no attribute '_nvmlGetFunctionPointer' Here's sequence of working/not working. Servers have the latest versions of py3nvml and pynvml.

[`nvidia-ml-py`](https://pypi.org/project/nvidia-ml-py), [`nvidia-ml-py3`](https://pypi.org/project/nvidia-ml-py3), and [`pynvml`](https://pypi.org/project/pynvml) all install a module named `pynvml.py`, so they conflict with one another. You should uninstall `pynvml` and force-reinstall `nvidia-ml-py`. Otherwise, please install `nvitop` in a clean virtual environment (do not install [`nvidia-ml-py3`](https://pypi.org/project/nvidia-ml-py3) or [`pynvml`](https://pypi.org/project/pynvml) there). Then everything will work as expected.
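To check whether an environment is in the conflicting state described above, something like the following may help. It is only a rough sketch using the standard library (`importlib.metadata`, Python 3.8+); the helper names `installed_pynvml_providers` and `module_has_attribute` are made up for illustration and are not part of `nvitop` or any NVML binding.

```python
import importlib
import importlib.metadata as metadata

# Distributions that, per the comment above, all install a top-level `pynvml` module.
CONFLICTING = ("nvidia-ml-py", "nvidia-ml-py3", "pynvml")


def installed_pynvml_providers(names=CONFLICTING):
    """Return {distribution name: version} for each listed distribution that is installed."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass  # this distribution is not installed in the current environment
    return found


def module_has_attribute(module_name="pynvml", attr="_nvmlGetFunctionPointer"):
    """Return True/False if the module imports and has/lacks `attr`; None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return hasattr(module, attr)


if __name__ == "__main__":
    providers = installed_pynvml_providers()
    print("installed distributions providing pynvml:", providers)
    if len(providers) > 1:
        print("more than one provider is installed; they can overwrite each other's pynvml.py")
    print("pynvml has _nvmlGetFunctionPointer:", module_has_attribute())
```

If more than one provider is listed, or the attribute check prints `False`, the uninstall and force-reinstall steps above (or a clean virtual environment) are the suggested way out.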


@lounicotra commented on GitHub (Oct 20, 2022):

Thanks for looking into this Xuehai!

Reference: github-starred/nvitop#29