mirror of
https://github.com/XuehaiPan/nvitop.git
synced 2026-05-15 06:06:12 -06:00
[PR #210] Add mx-smi backend support for MetaX GPUs #211
Reference: github-starred/nvitop#211
📋 Pull Request Information
Original PR: https://github.com/XuehaiPan/nvitop/pull/210
Author: @mhson-kyle
Created: 4/29/2026
Status: 🔄 Open
Base: main ← Head: main
📝 Commits (4)
a306d69 Add mx-smi MetaX GPU backend
dd9aeb7 libmxsmi: cache mx-smi -L inventory separately with 60s TTL
7336642 device: replace is_available() in _nvml_probe() with shutil.which check
ee8b997 Merge pull request #1 from mhson-kyle/metax-mx-smi-support
📊 Changes
5 files changed (+741 additions, -23 deletions)
📝 nvitop/__init__.py (+2 -0)
📝 nvitop/api/__init__.py (+2 -0)
📝 nvitop/api/device.py (+228 -21)
➕ nvitop/api/libmxsmi.py (+500 -0)
📝 nvitop/tui/screens/main/panels/device.py (+9 -2)
📄 Description
Issue Type
Improvement/feature implementation
Runtime Environment
Operating system and version: AlmaLinux 9.7
Terminal emulator and version: screen / remote shell
Python version: 3.9.25
NVML version (driver version): N/A for MetaX; mx-smi KMD driver 2.16.0, MACA runtime 3.0.0.8
nvitop version or commit: 1.6.3.dev11+ga306d69 / a306d69
python-ml-py version: nvidia-ml-py 13.595.45
Locale: en_US.UTF-8
Description
This adds support for MetaX GPUs through mx-smi, allowing nvitop to run on systems where NVIDIA NVML is unavailable but MetaX devices are present.
The change introduces an mx-smi backend that parses MetaX GPU inventory, utilization, memory, temperature, power, driver/runtime versions, and process information. The existing Device API now falls back to mx-smi when NVML is unavailable, and the backend can also be forced with:
NVITOP_GPU_BACKEND=mx-smi
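For reviewers, the selection order described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: `select_backend`, `_nvml_usable`, and the returned strings are hypothetical names, and the NVML check assumes the `pynvml` module from nvidia-ml-py. Only the `NVITOP_GPU_BACKEND=mx-smi` override and the `shutil.which` lookup come from the description and commit list.

```python
import os
import shutil


def _nvml_usable():
    """Best-effort NVML check (assumes the `pynvml` module from nvidia-ml-py)."""
    try:
        import pynvml

        pynvml.nvmlInit()
        pynvml.nvmlShutdown()
    except Exception:
        # Missing module, missing driver library, or no NVIDIA device.
        return False
    return True


def select_backend(environ=None, nvml_usable=_nvml_usable, which=shutil.which):
    """Return 'mx-smi', 'nvml', or 'none' (hypothetical labels for this sketch)."""
    environ = os.environ if environ is None else environ

    # 1. An explicit override wins.
    forced = environ.get("NVITOP_GPU_BACKEND", "").strip().lower()
    if forced == "mx-smi":
        return "mx-smi"

    # 2. Otherwise prefer NVML when it initializes.
    if nvml_usable():
        return "nvml"

    # 3. Fall back to mx-smi only if the executable is on PATH.
    if which("mx-smi") is not None:
        return "mx-smi"

    return "none"
```

Passing `environ`, `nvml_usable`, and `which` as parameters keeps the decision testable on machines that have neither NVIDIA nor MetaX hardware.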
The TUI header was also updated to show MetaX-specific version labels, using KMD and MACA versions instead of NVIDIA driver/CUDA labels when the active backend is mx-smi.
Motivation and Context
nvitop currently assumes NVIDIA/NVML availability. On MetaX GPU servers, nvidia-smi/NVML is not available, while GPU information is exposed through mx-smi.
This allows users on MetaX systems to use the same nvitop interface for monitoring GPU status and GPU processes.
Testing
Tested on a MetaX C500 server with 8 GPUs and /usr/bin/mx-smi available.
Checks run:
/usr/bin/python3.9 -m py_compile nvitop/api/libmxsmi.py nvitop/api/device.py nvitop/api/__init__.py nvitop/__init__.py nvitop/tui/screens/main/panels/device.py
API smoke test verified:
Also tested:
CUDA_VISIBLE_DEVICES=1,0
to verify MetaX device filtering/order handling.
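The filtering/order check above can be illustrated with a small sketch. It is not the PR's implementation: `visible_devices` is a hypothetical helper, it handles integer indices only (no UUID entries), and it assumes the CUDA convention that the list is cut off at the first invalid entry.

```python
def visible_devices(physical, value):
    """Reorder and filter `physical` according to a CUDA_VISIBLE_DEVICES-style string.

    `physical` is the device list in physical index order; `value` is the raw
    environment value, or None when the variable is unset.
    """
    if value is None:
        # Variable unset: every device is visible in physical order.
        return list(physical)

    visible = []
    for token in value.split(","):
        token = token.strip()
        # Stop at the first entry that is not a valid physical index.
        if not token.isdigit() or int(token) >= len(physical):
            break
        visible.append(physical[int(token)])
    return visible


devices = ["gpu0", "gpu1", "gpu2"]
print(visible_devices(devices, "1,0"))  # ['gpu1', 'gpu0']
```

With `CUDA_VISIBLE_DEVICES=1,0`, physical device 1 is therefore expected to be listed first and physical device 0 second.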
TUI smoke test:
nvitop --once
confirmed all 8 MetaX C500 devices render correctly.
Exporter smoke test also passed after installing nvitop-exporter.
🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.