[GH-ISSUE #127] [BUG] UTF-8 Error during decoding device name on R555 driver #81

Closed
opened 2026-05-05 03:24:46 -06:00 by gitea-mirror · 10 comments

Originally created by @kangkannnng on GitHub (May 26, 2024).
Original GitHub issue: https://github.com/XuehaiPan/nvitop/issues/127

Originally assigned to: @XuehaiPan on GitHub.

Required prerequisites

  • I have read the documentation https://nvitop.readthedocs.io.
  • I have searched the Issue Tracker that this hasn't already been reported. (comment there if it has.)
  • I have tried the latest version of nvitop in a new isolated virtual environment.

What version of nvitop are you using?

1.3.2

Operating system and version

Ubuntu 22.04 / WSL

NVIDIA driver version

555.42.03

NVIDIA-SMI

Sun May 26 15:50:45 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 555.42.03              Driver Version: 555.85         CUDA Version: 12.5     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3090        On  |   00000000:01:00.0  On |                  N/A |
| 36%   42C    P8             31W /  370W |    1544MiB /  24576MiB |      6%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A        24      G   /Xwayland                                   N/A      |
+-----------------------------------------------------------------------------------------+

Python environment

3.12.3 | packaged by conda-forge | (main, Apr 15 2024, 18:38:13) [GCC 12.3.0] linux
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu12==8.9.2.26
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu12==12.1.0.106
nvidia-nccl-cu12==2.20.5
nvidia-nvjitlink-cu12==12.4.127
nvidia-nvtx-cu12==12.1.105

Problem description

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf8 in position 0: invalid start byte

Steps to Reproduce

nvitop

Traceback

Traceback (most recent call last):
  File "/home/kang/miniconda3/envs/pytorch/bin/nvitop", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/kang/miniconda3/envs/pytorch/lib/python3.12/site-packages/nvitop/cli.py", line 353, in main
    ui = UI(
         ^^^
  File "/home/kang/miniconda3/envs/pytorch/lib/python3.12/site-packages/nvitop/gui/ui.py", line 43, in __init__
    self.main_screen = MainScreen(
                       ^^^^^^^^^^^
  File "/home/kang/miniconda3/envs/pytorch/lib/python3.12/site-packages/nvitop/gui/screens/main/__init__.py", line 38, in __init__
    self.device_panel = DevicePanel(self.devices, compact, win=win, root=root)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/kang/miniconda3/envs/pytorch/lib/python3.12/site-packages/nvitop/gui/screens/main/device.py", line 61, in __init__
    self.snapshots = self.take_snapshots()
                     ^^^^^^^^^^^^^^^^^^^^^
  File "/home/kang/miniconda3/envs/pytorch/lib/python3.12/site-packages/cachetools/__init__.py", line 702, in wrapper
    v = func(*args, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^
  File "/home/kang/miniconda3/envs/pytorch/lib/python3.12/site-packages/nvitop/gui/screens/main/device.py", line 142, in take_snapshots
    snapshots = [device.as_snapshot() for device in self.all_devices]
                 ^^^^^^^^^^^^^^^^^^^^
  File "/home/kang/miniconda3/envs/pytorch/lib/python3.12/site-packages/nvitop/gui/library/device.py", line 72, in as_snapshot
    self._snapshot = super().as_snapshot()
                     ^^^^^^^^^^^^^^^^^^^^^
  File "/home/kang/miniconda3/envs/pytorch/lib/python3.12/site-packages/nvitop/api/device.py", line 2146, in as_snapshot
    **{key: getattr(self, key)() for key in self.SNAPSHOT_KEYS},
            ^^^^^^^^^^^^^^^^^^^^
  File "/home/kang/miniconda3/envs/pytorch/lib/python3.12/site-packages/nvitop/api/device.py", line 868, in name
    self._name = libnvml.nvmlQuery('nvmlDeviceGetName', self.handle)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/kang/miniconda3/envs/pytorch/lib/python3.12/site-packages/nvitop/api/libnvml.py", line 433, in nvmlQuery
    retval = func(*args, **kwargs)  # type: ignore[operator]
             ^^^^^^^^^^^^^^^^^^^^^
  File "/home/kang/miniconda3/envs/pytorch/lib/python3.12/site-packages/pynvml.py", line 1921, in wrapper
    return res.decode()
           ^^^^^^^^^^^^
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf8 in position 0: invalid start byte
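
The last frame of the traceback is a bare `res.decode()` on the bytes returned for the device name. The sketch below reproduces the same error message from a single `0xf8` byte and shows one tolerant way to decode. The helper `decode_device_name` is a hypothetical name used only for illustration; it is not part of nvitop or pynvml, and it is not necessarily how either project handles this case.

```python
def decode_device_name(raw: bytes) -> str:
    """Decode a device-name byte string without raising on invalid UTF-8.

    Hypothetical helper for illustration; not part of nvitop or pynvml.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Invalid bytes become U+FFFD instead of aborting the caller.
        return raw.decode("utf-8", errors="replace")


# 0xF8 can never start a UTF-8 sequence, so strict decoding fails.
try:
    b"\xf8".decode()
except UnicodeDecodeError as exc:
    message = str(exc)

fallback = decode_device_name(b"\xf8")
clean = decode_device_name(b"NVIDIA GeForce RTX 3090")
```

A tolerant decode only hides the symptom: the name shown would still be wrong if the bytes returned by the driver are garbage.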

Logs

No response

Expected behavior

No response

Additional context

No response


@XuehaiPan commented on GitHub (May 26, 2024):

Similar issues on other repos:

  • wookayin/gpustat#170
  • gpuopenanalytics/pynvml#53

I cannot reproduce this on native Linux with the 555.42.02 driver (the latest driver shipped with CUDA Toolkit 12.5 at the time of this comment).

$ nvidia-smi
Sun May 26 22:40:20 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 555.42.02              Driver Version: 555.42.02      CUDA Version: 12.5     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3060        On  |   00000000:01:00.0 Off |                  N/A |
| 53%   45C    P8             14W /  170W |       2MiB /  12288MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

This seems to be a bug that only occurs in WSL with the 555.85 driver.
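
Since the failure appears tied to WSL, a script may want to know whether it is running there. A minimal sketch, assuming the common convention that WSL kernels carry "microsoft" in their release string (a heuristic, not a guarantee); `looks_like_wsl` is a hypothetical name:

```python
import platform


def looks_like_wsl(kernel_release: str) -> bool:
    """Heuristic check: WSL kernel release strings usually contain 'microsoft'."""
    return "microsoft" in kernel_release.lower()


# Apply the heuristic to the kernel release of the current machine.
running_in_wsl = looks_like_wsl(platform.uname().release)
```

Taking the release string as a parameter keeps the check testable on machines that are not running WSL.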


@ssjjrrr commented on GitHub (Jun 14, 2024):

Same problem

$ nvidia-smi
Fri Jun 14 20:48:14 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 555.52.01              Driver Version: 555.99         CUDA Version: 12.5     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4070 ...    On  |   00000000:01:00.0  On |                  N/A |
| 71%   72C    P0            279W /  285W |    6457MiB /  16376MiB |    100%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A     32210      C   /python3.8                                  N/A      |
+-----------------------------------------------------------------------------------------+

@Saya47 commented on GitHub (Jul 3, 2024):

Same issue for me as well, on Windows 11 with WSL2. nvitop used to work fine, then stopped working with the UnicodeDecodeError. After finding this issue, I remembered that I had also upgraded the NVIDIA driver.


@winkeylucky commented on GitHub (Jul 4, 2024):

Same problem, Windows 11, WSL2
| NVIDIA-SMI 555.58.02 Driver Version: 556.12 CUDA Version: 12.5


@winkeylucky commented on GitHub (Jul 4, 2024):

image
Based on the error message, I modified this file (~/anaconda3/lib/python3.12/site-packages/pynvml.py, line 1921), and then got this result:
image


@XuehaiPan commented on GitHub (Jul 4, 2024):

Could you try to use the latest version of nvitop and downgrade the nvidia-ml-py version?

pip3 install git+https://github.com/XuehaiPan/nvitop.git#egg=nvitop
pip3 install nvidia-ml-py==11.515.48

@winkeylucky commented on GitHub (Jul 4, 2024):

> install nvidia-ml-py==11.515.48

I tried this:
nvidia-ml-py 11.515.48
nvitop 1.3.3.dev20+g6bc8a8b
image
Same problem, only the location has changed.


@Gh0stExp10it commented on GitHub (Jul 20, 2024):

I was able to confirm that the problem is fixed with NVIDIA driver version 560.70.

https://github.com/wookayin/gpustat/issues/170#issuecomment-2241108111
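
To check an installed driver against the version reported as fixed above, dotted version strings can be compared as integer tuples. This is a rough sketch; `parse_driver_version` and `predates_reported_fix` are hypothetical names. Under WSL, the relevant value is presumably the "Driver Version" field of `nvidia-smi` (the Windows driver), which is an assumption based on the outputs posted in this thread.

```python
def parse_driver_version(version: str) -> tuple[int, ...]:
    """Turn a dotted version such as '555.85' into (555, 85)."""
    return tuple(int(part) for part in version.split("."))


# Version reported in this thread as no longer showing the error.
FIXED_IN = parse_driver_version("560.70")


def predates_reported_fix(version: str) -> bool:
    """Return True if the driver is older than the reported fixed version."""
    return parse_driver_version(version) < FIXED_IN
```

A driver that predates 560.70 is not necessarily affected: the maintainer could not reproduce the error on native Linux with 555.42.02.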


@kenvix commented on GitHub (Aug 11, 2024):

The latest nvidia driver has fixed this issue. Simply download the latest Windows driver from https://www.nvidia.cn/drivers/lookup/


@XuehaiPan commented on GitHub (Dec 29, 2024):

FYI, a new release has been made.

Reference: github-starred/nvitop#81