[GH-ISSUE #136] [BUG] Windows Server Unexpectedly Shuts Down When Using Nvitop to Monitor GPU Usage #86

Open
opened 2026-05-05 03:25:01 -06:00 by gitea-mirror · 4 comments
Owner

Originally created by @NI-MingCheng on GitHub (Oct 23, 2024).
Original GitHub issue: https://github.com/XuehaiPan/nvitop/issues/136

Originally assigned to: @XuehaiPan on GitHub.

Required prerequisites

  • I have read the documentation https://nvitop.readthedocs.io.
  • I have searched the Issue Tracker that this hasn't already been reported. (comment there if it has.)
  • I have tried the latest version of nvitop in a new isolated virtual environment.

What version of nvitop are you using?

1.3.2

Operating system and version

Windows Server 2022 Datacenter

NVIDIA driver version

516.01

NVIDIA-SMI

Wed Oct 23 15:33:05 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 516.01       Driver Version: 516.01       CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA RTX A5000   WDDM  | 00000000:02:00.0 Off |                  Off |
| 30%   38C    P8    17W / 230W |    464MiB / 24564MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA RTX A5000   WDDM  | 00000000:21:00.0 Off |                  Off |
| 30%   34C    P8     6W / 230W |      0MiB / 24564MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  NVIDIA RTX A5000   WDDM  | 00000000:49:00.0 Off |                  Off |
| 30%   30C    P8     5W / 230W |      0MiB / 24564MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  NVIDIA RTX A5000   WDDM  | 00000000:4A:00.0 Off |                  Off |
| 30%   33C    P8     4W / 230W |      0MiB / 24564MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      9436    C+G   ...lPanel\SystemSettings.exe    N/A      |
|    0   N/A  N/A     12172    C+G   ...5n1h2txyewy\SearchApp.exe    N/A      |
|    0   N/A  N/A     15244    C+G   C:\Windows\explorer.exe         N/A      |
|    0   N/A  N/A     17912    C+G   ...2txyewy\TextInputHost.exe    N/A      |
|    0   N/A  N/A     18180    C+G   ...y\ShellExperienceHost.exe    N/A      |
+-----------------------------------------------------------------------------+

Python environment

python -m pip freeze
(base) C:\Users\Administrator>python -m pip freeze
absl-py==2.1.0
accelerate==0.24.1
aiofiles==23.2.0
aiohttp==3.8.5
aiosignal==1.3.1
altair==5.1.2
anaconda-anon-usage @ file:///C:/b/abs_95v3x0wy8p/croot/anaconda-anon-usage_1697038984188/work
anaconda-client==1.12.0
anaconda-cloud-auth @ file:///C:/b/abs_410afndtyf/croot/anaconda-cloud-auth_1697462767853/work
anaconda-navigator @ file:///C:/b/abs_cfvv8k_j21/croot/anaconda-navigator_1704813334508/work
anaconda-project @ file:///C:/ci_311/anaconda-project_1676458365912/work
annotated-types==0.6.0
ansicon==1.89.0
antlr4-python3-runtime==4.9.3
anyio==3.7.1
archspec @ file:///croot/archspec_1709217642129/work
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
arrow==1.3.0
asttokens==2.4.1
async-lru==2.0.4
async-timeout==4.0.3
attrs==24.2.0
Babel==2.14.0
backports.functools-lru-cache @ file:///tmp/build/80754af9/backports.functools_lru_cache_1618170165463/work
backports.tempfile @ file:///home/linux1/recipes/ci/backports.tempfile_1610991236607/work
backports.weakref==1.0.post1
beautifulsoup4 @ file:///C:/b/abs_0agyz1wsr4/croot/beautifulsoup4-split_1681493048687/work
bleach==6.1.0
blessed==1.20.0
blinker==1.7.0
boltons @ file:///C:/ci_311/boltons_1677729932371/work
Brotli @ file:///C:/ci_311/brotli-split_1676435766766/work
cachetools==5.3.2
certifi @ file:///C:/b/abs_1fw_exq1si/croot/certifi_1725551736618/work/certifi
cffi @ file:///C:/b/abs_924gv1kxzj/croot/cffi_1700254355075/work
chardet @ file:///C:/ci_311/chardet_1676436134885/work
charset-normalizer @ file:///tmp/build/80754af9/charset-normalizer_1630003229654/work
click @ file:///C:/b/abs_f9ihnt72pu/croot/click_1698129847492/work
clip==0.2.0
clyent==1.2.1
colorama @ file:///C:/ci_311/colorama_1676422310965/work
coloredlogs==15.0.1
comm==0.2.1
conda @ file:///C:/b/abs_85jnuwc__u/croot/conda_1729193917673/work
conda-build @ file:///C:/b/abs_3ed9gavxgz/croot/conda-build_1708025907525/work
conda-content-trust @ file:///tmp/build/80754af9/conda-content-trust_1617045594566/work
conda-libmamba-solver @ file:///croot/conda-libmamba-solver_1727775630457/work/src
conda-pack @ file:///tmp/build/80754af9/conda-pack_1611163042455/work
conda-package-handling @ file:///C:/b/abs_b9wp3lr1gn/croot/conda-package-handling_1691008700066/work
conda-repo-cli==1.0.75
conda-token @ file:///Users/paulyim/miniconda3/envs/c3i/conda-bld/conda-token_1662660369760/work
conda-verify @ file:///D:/bld/conda-verify_1667049856137/work
conda_index @ file:///croot/conda-index_1706633791028/work
conda_package_streaming @ file:///C:/b/abs_6c28n38aaj/croot/conda-package-streaming_1690988019210/work
contourpy==1.2.0
cpm-kernels==1.0.11
cryptography @ file:///C:/b/abs_531eqmhgsd/croot/cryptography_1707523768330/work
cycler==0.12.1
ddddocr==1.5.5
debugpy==1.8.1
decorator==5.1.1
defusedxml @ file:///tmp/build/80754af9/defusedxml_1615228127516/work
distro @ file:///C:/b/abs_a3uni_yez3/croot/distro_1701455052240/work
easydict==1.12
einops==0.7.0
executing==2.0.1
fastapi==0.104.1
fastjsonschema @ file:///C:/ci_311/python-fastjsonschema_1679500568724/work
ffmpy==0.3.1
filelock @ file:///C:/b/abs_f2gie28u58/croot/filelock_1700591233643/work
flatbuffers==24.3.25
fonttools==4.49.0
fqdn==1.5.1
frozendict @ file:///C:/b/abs_2alamqss6p/croot/frozendict_1713194885124/work
frozenlist==1.4.1
fsspec==2023.10.0
ftfy==6.1.3
future @ file:///C:/ci_311_rebuilds/future_1678998246262/work
gitdb==4.0.11
GitPython==3.1.40
gmpy2 @ file:///C:/ci_311/gmpy2_1677743390134/work
gpustat==1.1.1
gradio==3.39.0
gradio_client==0.7.0
grpcio==1.60.1
h11==0.14.0
httpcore==1.0.1
httpx==0.25.1
huggingface-hub==0.19.0
humanfriendly==10.0
idna @ file:///C:/ci_311/idna_1676424932545/work
importlib-metadata==6.8.0
ipykernel==6.29.2
ipython==8.21.0
ipywidgets==8.1.2
isoduration==20.11.0
jaraco.classes @ file:///tmp/build/80754af9/jaraco.classes_1620983179379/work
jedi==0.19.1
Jinja2 @ file:///C:/b/abs_f7x5a8op2h/croot/jinja2_1706733672594/work
jinxed==1.2.1
json5==0.9.14
jsonpatch @ file:///tmp/build/80754af9/jsonpatch_1615747632069/work
jsonpointer==2.1
jsonschema @ file:///C:/b/abs_d1c4sm8drk/croot/jsonschema_1699041668863/work
jsonschema-specifications @ file:///C:/b/abs_0brvm6vryw/croot/jsonschema-specifications_1699032417323/work
jupyter==1.0.0
jupyter-console==6.6.3
jupyter-events==0.9.0
jupyter-lsp==2.2.2
jupyter_client==8.6.0
jupyter_core @ file:///C:/b/abs_c769pbqg9b/croot/jupyter_core_1698937367513/work
jupyter_server==2.12.5
jupyter_server_terminals==0.5.2
jupyterlab==4.1.1
jupyterlab-language-pack-zh-CN==4.0.post3
jupyterlab_pygments==0.3.0
jupyterlab_server==2.25.3
jupyterlab_widgets==3.0.10
keyring @ file:///C:/b/abs_dbjc7g0dh2/croot/keyring_1678999228878/work
kiwisolver==1.4.5
latex2mathml==3.76.0
libarchive-c @ file:///tmp/build/80754af9/python-libarchive-c_1617780486945/work
libmambapy @ file:///C:/b/abs_2euls_1a38/croot/mamba-split_1704219444888/work/libmambapy
linkify-it-py==2.0.3
Markdown==3.7
markdown-it-py==2.2.0
MarkupSafe @ file:///C:/b/abs_ecfdqh67b_/croot/markupsafe_1704206030535/work
matplotlib==3.8.3
matplotlib-inline==0.1.6
mdit-py-plugins==0.3.3
mdtex2html==1.2.0
mdurl==0.1.2
menuinst @ file:///C:/b/abs_099kybla52/croot/menuinst_1706732987063/work
mistune==3.0.2
mkl-fft @ file:///C:/b/abs_19i1y8ykas/croot/mkl_fft_1695058226480/work
mkl-random @ file:///C:/b/abs_edwkj1_o69/croot/mkl_random_1695059866750/work
mkl-service==2.4.0
more-itertools @ file:///C:/b/abs_36p38zj5jx/croot/more-itertools_1700662194485/work
mpmath @ file:///C:/b/abs_7833jrbiox/croot/mpmath_1690848321154/work
multidict==6.1.0
navigator-updater @ file:///C:/b/abs_895otdwmo9/croot/navigator-updater_1695210220239/work
nbclient==0.9.0
nbconvert==7.16.1
nbformat @ file:///C:/b/abs_5a2nea1iu2/croot/nbformat_1694616866197/work
nest-asyncio==1.6.0
networkx @ file:///C:/b/abs_e6gi1go5op/croot/networkx_1690562046966/work
notebook==7.1.0
notebook_shim==0.2.4
numpy @ file:///C:/b/abs_16b2j7ad8n/croot/numpy_and_numpy_base_1704311752418/work/dist/numpy-1.26.3-cp311-cp311-win_amd64.whl#sha256=5f2c4b54fd5d52b9fb18e32607c79b03cf14665cecce8a5a10e2950559df4651
nvidia-ml-py==12.535.161
nvitop==1.3.2
omegaconf==2.3.0
onnxruntime==1.19.2
opencv-python-headless==4.10.0.84
orjson==3.9.10
outcome==1.3.0.post0
overrides==7.7.0
packaging @ file:///C:/b/abs_28t5mcoltc/croot/packaging_1693575224052/work
pandas==2.0.3
pandocfilters==1.5.1
parso==0.8.3
pathlib @ file:///Users/ktietz/demo/mc3/conda-bld/pathlib_1629713961906/work
pillow @ file:///C:/b/abs_e22m71t0cb/croot/pillow_1707233126420/work
pkce @ file:///C:/b/abs_d0z4444tb0/croot/pkce_1690384879799/work
pkginfo @ file:///C:/b/abs_d18srtr68x/croot/pkginfo_1679431192239/work
platformdirs @ file:///C:/b/abs_b6z_yqw_ii/croot/platformdirs_1692205479426/work
pluggy @ file:///C:/ci_311/pluggy_1676422178143/work
ply==3.11
prettytable==3.9.0
prometheus_client==0.20.0
prompt-toolkit==3.0.43
propcache==0.2.0
protobuf==4.25.0
psutil @ file:///C:/ci_311_rebuilds/psutil_1679005906571/work
pure-eval==0.2.2
pyarrow==12.0.1
pycosat @ file:///C:/b/abs_31zywn1be3/croot/pycosat_1696537126223/work
pycparser @ file:///tmp/build/80754af9/pycparser_1636541352034/work
pydantic @ file:///C:/b/abs_9byjrk31gl/croot/pydantic_1695798904828/work
pydantic_core==2.10.1
pydeck==0.8.1b0
pydub==0.25.1
Pygments==2.17.2
PyJWT @ file:///C:/ci_311/pyjwt_1676438890509/work
pyparsing==3.1.1
PyQt5==5.15.10
PyQt5-sip @ file:///C:/b/abs_c0pi2mimq3/croot/pyqt-split_1698769125270/work/pyqt_sip
pyreadline3==3.5.4
PySocks @ file:///C:/ci_311/pysocks_1676425991111/work
python-dateutil @ file:///tmp/build/80754af9/python-dateutil_1626374649649/work
python-dotenv @ file:///C:/ci_311/python-dotenv_1676455170580/work
python-json-logger==2.0.7
python-multipart==0.0.6
pytz @ file:///C:/b/abs_19q3ljkez4/croot/pytz_1695131651401/work
pywin32==305.1
pywin32-ctypes @ file:///C:/ci_311/pywin32-ctypes_1676427747089/work
pywinpty==2.0.12
PyYAML @ file:///C:/b/abs_782o3mbw7z/croot/pyyaml_1698096085010/work
pyzmq==25.1.2
qtconsole==5.5.1
QtPy @ file:///C:/b/abs_derqu__3p8/croot/qtpy_1700144907661/work
referencing @ file:///C:/b/abs_09f4hj6adf/croot/referencing_1699012097448/work
regex==2023.12.25
requests @ file:///C:/b/abs_474vaa3x9e/croot/requests_1707355619957/work
requests-mock==1.12.1
requests-toolbelt @ file:///C:/b/abs_2fsmts66wp/croot/requests-toolbelt_1690874051210/work
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
rich==13.6.0
rpds-py @ file:///C:/b/abs_76j4g4la23/croot/rpds-py_1698947348047/work
ruamel-yaml-conda @ file:///C:/ci_311/ruamel_yaml_1676455799258/work
ruamel.yaml @ file:///C:/ci_311/ruamel.yaml_1676439214109/work
safetensors==0.3.3
selenium==4.25.0
semantic-version==2.10.0
semver @ file:///tmp/build/80754af9/semver_1603822362442/work
Send2Trash==1.8.2
sentencepiece==0.1.99
sip @ file:///C:/b/abs_edevan3fce/croot/sip_1698675983372/work
six @ file:///tmp/build/80754af9/six_1644875935023/work
smmap==5.0.1
sniffio==1.3.0
sortedcontainers==2.4.0
soupsieve @ file:///C:/b/abs_bbsvy9t4pl/croot/soupsieve_1696347611357/work
sse-starlette==1.6.5
stack-data==0.6.3
starlette==0.27.0
streamlit==1.28.1
sympy @ file:///C:/b/abs_82njkonm7f/croot/sympy_1701397685028/work
tenacity==8.2.3
tensorboard==2.16.2
tensorboard-data-server==0.7.2
termcolor==2.4.0
terminado==0.18.0
timm==0.9.12
tinycss2==1.2.1
tokenizers==0.13.3
toml==0.10.2
toolz==1.0.0
torch==2.2.0
torchaudio==2.2.0
torchvision==0.17.0
tornado @ file:///C:/b/abs_0cbrstidzg/croot/tornado_1696937003724/work
tqdm @ file:///C:/b/abs_f76j9hg7pv/croot/tqdm_1679561871187/work
traitlets @ file:///C:/ci_311/traitlets_1676423290727/work
transformers==4.30.2
trio==0.26.2
trio-websocket==0.11.1
truststore @ file:///C:/b/abs_55z7b3r045/croot/truststore_1695245455435/work
types-python-dateutil==2.8.19.20240106
typing_extensions==4.12.2
tzdata==2023.3
tzlocal==5.2
uc-micro-py==1.0.3
ujson @ file:///C:/ci_311/ujson_1676434714224/work
uri-template==1.3.0
urllib3 @ file:///C:/b/abs_4etpfrkumr/croot/urllib3_1707770616184/work
uvicorn==0.24.0.post1
validators==0.22.0
watchdog==3.0.0
wcwidth==0.2.13
webcolors==1.13
webdriver-manager==4.0.2
webencodings==0.5.1
websocket-client==1.8.0
websockets==11.0.3
Werkzeug==3.0.4
widgetsnbextension==4.0.10
win-inet-pton @ file:///C:/ci_311/win_inet_pton_1676425458225/work
windows-curses==2.3.3
wsproto==1.2.0
yarl==1.14.0
zipp @ file:///C:/b/abs_b0beoc27oa/croot/zipp_1704206963359/work
zstandard==0.19.0

Problem description

When monitoring GPU usage with nvitop on Windows Server systems, the system experiences unexpected shutdowns. This issue appears to be caused by compatibility conflicts between nvitop and Windows Server's hardware monitoring system.

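Log Name:         System
Source:           Microsoft-Windows-Kernel-Power
Date:             2024/10/23 15:18:48
Event ID:         41
Task Category:    (63)
Level:            Critical
Keywords:         (70368744177664),(2)
User:             SYSTEM
Computer:         WIN-3I9RKHAQAH5
Description:
The system has rebooted without cleanly shutting down first. This error could be caused if the system stopped responding, crashed, or lost power unexpectedly.
Event Xml: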
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Microsoft-Windows-Kernel-Power" Guid="{331c3b3a-2005-44c2-ac5e-77220c37d6b4}" />
    <EventID>41</EventID>
    <Version>8</Version>
    <Level>1</Level>
    <Task>63</Task>
    <Opcode>0</Opcode>
    <Keywords>0x8000400000000002</Keywords>
    <TimeCreated SystemTime="2024-10-23T07:18:48.0040790Z" />
    <EventRecordID>210129</EventRecordID>
    <Correlation />
    <Execution ProcessID="4" ThreadID="8" />
    <Channel>System</Channel>
    <Computer>WIN-3I9RKHAQAH5</Computer>
    <Security UserID="S-1-5-18" />
  </System>
  <EventData>
    <Data Name="BugcheckCode">80</Data>
    <Data Name="BugcheckParameter1">0xffffdc8d2fdfa000</Data>
    <Data Name="BugcheckParameter2">0x2</Data>
    <Data Name="BugcheckParameter3">0xfffff8029b2505e6</Data>
    <Data Name="BugcheckParameter4">0x0</Data>
    <Data Name="SleepInProgress">0</Data>
    <Data Name="PowerButtonTimestamp">0</Data>
    <Data Name="BootAppStatus">0</Data>
    <Data Name="Checkpoint">0</Data>
    <Data Name="ConnectedStandbyInProgress">true</Data>
    <Data Name="SystemSleepTransitionsToOn">0</Data>
    <Data Name="CsEntryScenarioInstanceId">136</Data>
    <Data Name="BugcheckInfoFromEFI">false</Data>
    <Data Name="CheckpointStatus">0</Data>
    <Data Name="CsEntryScenarioInstanceIdV2">136</Data>
    <Data Name="LongPowerButtonPressDetected">false</Data>
  </EventData>
</Event>
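
A note for readers triaging this log: Event ID 41 stores BugcheckCode in decimal, so 80 corresponds to stop code 0x50, which Microsoft documents as PAGE_FAULT_IN_NONPAGED_AREA. A non-zero bugcheck code suggests a kernel-mode stop error rather than a power loss. The sketch below shows one way to pull those fields out of the event XML. The names `decode_bugcheck` and `KNOWN_BUGCHECKS` are hypothetical helpers for illustration, not part of nvitop or any Windows tool, and the lookup table deliberately covers only the code seen here.

```python
import xml.etree.ElementTree as ET

# Namespace used by Windows event XML, as shown in the log above.
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}

# Illustrative lookup only; it lists just the stop code seen in this report.
KNOWN_BUGCHECKS = {0x50: "PAGE_FAULT_IN_NONPAGED_AREA"}


def decode_bugcheck(event_xml: str) -> dict:
    """Extract the bugcheck code (as hex) and its four parameters."""
    root = ET.fromstring(event_xml)
    data = {d.get("Name"): d.text for d in root.findall("e:EventData/e:Data", NS)}
    code = int(data["BugcheckCode"])  # Event ID 41 reports this in decimal
    return {
        "code_hex": f"0x{code:02X}",
        "name": KNOWN_BUGCHECKS.get(code, "unknown"),
        "parameters": [data[f"BugcheckParameter{i}"] for i in range(1, 5)],
    }


# Trimmed copy of the EventData fields from the report above.
SAMPLE = """<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <EventData>
    <Data Name="BugcheckCode">80</Data>
    <Data Name="BugcheckParameter1">0xffffdc8d2fdfa000</Data>
    <Data Name="BugcheckParameter2">0x2</Data>
    <Data Name="BugcheckParameter3">0xfffff8029b2505e6</Data>
    <Data Name="BugcheckParameter4">0x0</Data>
  </EventData>
</Event>"""

info = decode_bugcheck(SAMPLE)
print(info["code_hex"], info["name"], info["parameters"])
```

The hex stop code and its parameters are what a kernel memory dump analysis would start from, so they are worth quoting alongside the raw event.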

Steps to Reproduce

First, start deep learning training on the GPU.
Then run nvitop to view GPU usage.
The system shuts down unexpectedly.

Traceback

None

Logs

None

Expected behavior

None

Additional context

None

gitea-mirror added the bug label 2026-05-05 03:25:01 -06:00
Author
Owner

@XuehaiPan commented on GitHub (Oct 23, 2024):

@NI-MingCheng Thanks for the report. I wonder whether the R515 driver works with CUDA 11.7 on Windows. The latest production driver I found for the RTX A5000 on Windows Server 2022 is the R550 driver: NVIDIA RTX Server Driver Release 550 R550 U10 (553.24) | Windows Server 2022 (https://www.nvidia.com/en-us/drivers/details/233143).

It would be helpful if you could run the following Python code in a REPL (e.g. ipython or just type python in the terminal) manually:

>>> from nvitop import CudaDevice

>>> cuda0 = CudaDevice(0)
>>> print(cuda0.as_snapshot())

>>> cuda1 = CudaDevice(1)
>>> print(cuda1.as_snapshot())

>>> cuda2 = CudaDevice(2)
>>> print(cuda2.as_snapshot())

>>> cuda3 = CudaDevice(3)
>>> print(cuda3.as_snapshot())
<!-- gh-comment-id:2431252459 --> @XuehaiPan commented on GitHub (Oct 23, 2024): @NI-MingCheng thanks for the report, I wonder if the R515 driver can work with CUDA 11.7 on Windows. I found the latest production driver for WinServer 2022 for RTX A5000 is the R550 driver [NVIDIA RTX Server Driver Release 550 R550 U10 (553.24) | Windows Server 2022](https://www.nvidia.com/en-us/drivers/details/233143). It would be helpful if you could run the following Python code in a REPL (e.g. `ipython` or just type `python` in the terminal) manually: ```python >>> from nvitop import CudaDevice >>> cuda0 = CudaDevice(0) >>> print(cuda0.as_snapshot()) >>> cuda1 = CudaDevice(1) >>> print(cuda1.as_snapshot()) >>> cuda2 = CudaDevice(2) >>> print(cuda2.as_snapshot()) >>> cuda3 = CudaDevice(3) >>> print(cuda3.as_snapshot()) ```

@NI-MingCheng commented on GitHub (Oct 23, 2024):

The results are as follows.


@NI-MingCheng commented on GitHub (Oct 23, 2024):

倪明成 @.*** wrote on Wed, Oct 23, 2024 at 17:23:

> The results are as follows.

@NI-MingCheng commented on GitHub (Mar 29, 2025):

from nvitop import CudaDevice
cuda0 = CudaDevice(0)
print(cuda0.as_snapshot())
CudaDeviceSnapshot(
    real=CudaDevice(cuda_index=0, nvml_index=0, name="NVIDIA RTX A5000", total_memory=23.99GiB),
    bus_id='00000000:02:00.0',
    clock_infos=ClockInfos(graphics=1874, sm=1874, memory=7593, video=1634),
    clock_speed_infos=ClockSpeedInfos(current=ClockInfos(graphics=1874, sm=1874, memory=7593, video=1634), max=ClockInfos(graphics=2100, sm=2100, memory=8001, video=1950)),
    compute_mode='Default',
    cuda_compute_capability=(8, 6),
    cuda_index=0,
    current_driver_model='WDDM',
    decoder_utilization=0,
    display_active='Disabled',
    display_mode='Disabled',
    encoder_utilization=0,
    fan_speed=46,
    gpu_utilization=0,
    index=0,
    max_clock_infos=ClockInfos(graphics=2100, sm=2100, memory=8001, video=1950),
    memory_clock=7593,
    memory_free=342388736,
    memory_free_human='326.5MiB',
    memory_info=MemoryInfo(total=25757220864, free=342388736, used=25087676416),
    memory_percent=97.4,
    memory_total=25757220864,
    memory_total_human='23.99GiB',
    memory_usage='23.36GiB / 23.99GiB',
    memory_used=25087676416,
    memory_used_human='23.36GiB',
    memory_utilization=0,
    mig_mode='N/A',
    name='NVIDIA RTX A5000',
    pcie_rx_throughput=0,
    pcie_rx_throughput_human='0B/s',
    pcie_throughput=ThroughputInfo(tx=0, rx=0),
    pcie_tx_throughput=0,
    pcie_tx_throughput_human='0B/s',
    performance_state='P3',
    persistence_mode='N/A',
    physical_index=0,
    power_limit=230000,
    power_status='104W / 230W',
    power_usage=103949,
    sm_clock=1874,
    temperature=70,
    total_volatile_uncorrected_ecc_errors='N/A',
    utilization_rates=UtilizationRates(gpu=0, memory=0, encoder=0, decoder=0),
    uuid='GPU-1c954d72-e6d6-22b9-3afe-49d7b7592a47',
)
cuda1 = CudaDevice(1)
print(cuda0.as_snapshot())
CudaDeviceSnapshot(
    real=CudaDevice(cuda_index=0, nvml_index=0, name="NVIDIA RTX A5000", total_memory=23.99GiB),
    bus_id='00000000:02:00.0',
    clock_infos=ClockInfos(graphics=1874, sm=1874, memory=7593, video=1634),
    clock_speed_infos=ClockSpeedInfos(current=ClockInfos(graphics=1874, sm=1874, memory=7593, video=1634), max=ClockInfos(graphics=2100, sm=2100, memory=8001, video=1950)),
    compute_mode='Default',
    cuda_compute_capability=(8, 6),
    cuda_index=0,
    current_driver_model='WDDM',
    decoder_utilization=0,
    display_active='Disabled',
    display_mode='Disabled',
    encoder_utilization=0,
    fan_speed=46,
    gpu_utilization=0,
    index=0,
    max_clock_infos=ClockInfos(graphics=2100, sm=2100, memory=8001, video=1950),
    memory_clock=7593,
    memory_free=342388736,
    memory_free_human='326.5MiB',
    memory_info=MemoryInfo(total=25757220864, free=342388736, used=25087676416),
    memory_percent=97.4,
    memory_total=25757220864,
    memory_total_human='23.99GiB',
    memory_usage='23.36GiB / 23.99GiB',
    memory_used=25087676416,
    memory_used_human='23.36GiB',
    memory_utilization=0,
    mig_mode='N/A',
    name='NVIDIA RTX A5000',
    pcie_rx_throughput=0,
    pcie_rx_throughput_human='0B/s',
    pcie_throughput=ThroughputInfo(tx=0, rx=0),
    pcie_tx_throughput=0,
    pcie_tx_throughput_human='0B/s',
    performance_state='P3',
    persistence_mode='N/A',
    physical_index=0,
    power_limit=230000,
    power_status='95W / 230W',
    power_usage=95486,
    sm_clock=1874,
    temperature=70,
    total_volatile_uncorrected_ecc_errors='N/A',
    utilization_rates=UtilizationRates(gpu=0, memory=0, encoder=0, decoder=0),
    uuid='GPU-1c954d72-e6d6-22b9-3afe-49d7b7592a47',
)
cuda2 = CudaDevice(2)
print(cuda0.as_snapshot())
CudaDeviceSnapshot(
    real=CudaDevice(cuda_index=0, nvml_index=0, name="NVIDIA RTX A5000", total_memory=23.99GiB),
    bus_id='00000000:02:00.0',
    clock_infos=ClockInfos(graphics=1874, sm=1874, memory=7593, video=1634),
    clock_speed_infos=ClockSpeedInfos(current=ClockInfos(graphics=1874, sm=1874, memory=7593, video=1634), max=ClockInfos(graphics=2100, sm=2100, memory=8001, video=1950)),
    compute_mode='Default',
    cuda_compute_capability=(8, 6),
    cuda_index=0,
    current_driver_model='WDDM',
    decoder_utilization=0,
    display_active='Disabled',
    display_mode='Disabled',
    encoder_utilization=0,
    fan_speed=46,
    gpu_utilization=0,
    index=0,
    max_clock_infos=ClockInfos(graphics=2100, sm=2100, memory=8001, video=1950),
    memory_clock=7593,
    memory_free=342388736,
    memory_free_human='326.5MiB',
    memory_info=MemoryInfo(total=25757220864, free=342388736, used=25087676416),
    memory_percent=97.4,
    memory_total=25757220864,
    memory_total_human='23.99GiB',
    memory_usage='23.36GiB / 23.99GiB',
    memory_used=25087676416,
    memory_used_human='23.36GiB',
    memory_utilization=0,
    mig_mode='N/A',
    name='NVIDIA RTX A5000',
    pcie_rx_throughput=0,
    pcie_rx_throughput_human='0B/s',
    pcie_throughput=ThroughputInfo(tx=0, rx=0),
    pcie_tx_throughput=0,
    pcie_tx_throughput_human='0B/s',
    performance_state='P3',
    persistence_mode='N/A',
    physical_index=0,
    power_limit=230000,
    power_status='92W / 230W',
    power_usage=92390,
    sm_clock=1874,
    temperature=67,
    total_volatile_uncorrected_ecc_errors='N/A',
    utilization_rates=UtilizationRates(gpu=0, memory=0, encoder=0, decoder=0),
    uuid='GPU-1c954d72-e6d6-22b9-3afe-49d7b7592a47',
)
cuda3 = CudaDevice(3)
print(cuda0.as_snapshot())
CudaDeviceSnapshot(
    real=CudaDevice(cuda_index=0, nvml_index=0, name="NVIDIA RTX A5000", total_memory=23.99GiB),
    bus_id='00000000:02:00.0',
    clock_infos=ClockInfos(graphics=1874, sm=1874, memory=7593, video=1634),
    clock_speed_infos=ClockSpeedInfos(current=ClockInfos(graphics=1874, sm=1874, memory=7593, video=1634), max=ClockInfos(graphics=2100, sm=2100, memory=8001, video=1950)),
    compute_mode='Default',
    cuda_compute_capability=(8, 6),
    cuda_index=0,
    current_driver_model='WDDM',
    decoder_utilization=0,
    display_active='Disabled',
    display_mode='Disabled',
    encoder_utilization=0,
    fan_speed=46,
    gpu_utilization=0,
    index=0,
    max_clock_infos=ClockInfos(graphics=2100, sm=2100, memory=8001, video=1950),
    memory_clock=7593,
    memory_free=342388736,
    memory_free_human='326.5MiB',
    memory_info=MemoryInfo(total=25757220864, free=342388736, used=25087676416),
    memory_percent=97.4,
    memory_total=25757220864,
    memory_total_human='23.99GiB',
    memory_usage='23.36GiB / 23.99GiB',
    memory_used=25087676416,
    memory_used_human='23.36GiB',
    memory_utilization=0,
    mig_mode='N/A',
    name='NVIDIA RTX A5000',
    pcie_rx_throughput=0,
    pcie_rx_throughput_human='0B/s',
    pcie_throughput=ThroughputInfo(tx=0, rx=0),
    pcie_tx_throughput=0,
    pcie_tx_throughput_human='0B/s',
    performance_state='P3',
    persistence_mode='N/A',
    physical_index=0,
    power_limit=230000,
    power_status='90W / 230W',
    power_usage=89738,
    sm_clock=1874,
    temperature=67,
    total_volatile_uncorrected_ecc_errors='N/A',
    utilization_rates=UtilizationRates(gpu=0, memory=0, encoder=0, decoder=0),
    uuid='GPU-1c954d72-e6d6-22b9-3afe-49d7b7592a47',
)
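
In the transcript above, every `print` call uses `cuda0`, so all four snapshots describe the same device (`cuda_index=0`, same UUID). A minimal sketch that takes each snapshot from its own device object is shown below. It assumes nvitop's `CudaDevice.all()` classmethod for enumerating the visible CUDA devices; the helper names `snapshot_each` and `main` are hypothetical.

```python
from typing import Iterable, List


def snapshot_each(devices: Iterable) -> List[str]:
    """Return one printable snapshot per device, taken from that device."""
    return [str(device.as_snapshot()) for device in devices]


def main() -> None:
    # Imported lazily so snapshot_each() can be exercised without a GPU.
    from nvitop import CudaDevice

    # Assumption: CudaDevice.all() lists every CUDA-visible device, which
    # avoids hard-coding indices that may not exist on the machine.
    for text in snapshot_each(CudaDevice.all()):
        print(text)
```

Calling `main()` in the same environment as the training job should print one snapshot per GPU rather than repeating device 0.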

Reference: github-starred/nvitop#86