mirror of
https://github.com/XuehaiPan/nvitop.git
synced 2026-05-15 14:15:55 -06:00
[GH-ISSUE #45] [Enhancement] Skip error gpus and show normal infos automatically #31
Originally created by @jue-jue-zi on GitHub (Oct 22, 2022).
Original GitHub issue: https://github.com/XuehaiPan/nvitop/issues/45
Originally assigned to: @XuehaiPan on GitHub.
Runtime Environment
nvitop version or commit: 0.10.0
nvidia-ml-py version: 11.515.75
Current Behavior
There are four GPUs on our server. One of them overheated for some reason, and it can no longer be recognized. If I run the nvidia-smi command without any arguments to query all the GPUs, the error "Unable to determine the device handle for GPU 0000:0C:00.0: Unknown Error" is shown, and none of the remaining healthy GPUs' information is displayed. But if the command selects only the healthy GPUs (nvidia-smi -i 0,1,3), all of their information is shown as usual.
And if I use the nvitop command to show the GPUs' information, nvidia-ml-py throws exceptions like the one below,
Expected Behavior
I hope that the nvitop command can skip all GPUs with errors automatically and show the healthy GPUs' information. If possible, the errored GPUs' information could be shown as a note below the normal output, in red for emphasis.
@XuehaiPan commented on GitHub (Oct 22, 2022):
@jue-jue-zi Thanks for the feedback! I'll add a quick fix soon.
@XuehaiPan commented on GitHub (Oct 22, 2022):
@jue-jue-zi I pushed a new commit to handle this. You can reinstall nvitop from GitHub by:
@jue-jue-zi commented on GitHub (Oct 22, 2022):
Thanks for fixing it so soon, but it seems that some problems still exist,
@XuehaiPan commented on GitHub (Oct 22, 2022):
Fixed by the newest commit.
@jue-jue-zi commented on GitHub (Oct 22, 2022):
It works right now! Thanks, it is a really great project.
@jue-jue-zi commented on GitHub (Oct 22, 2022):
Maybe red fonts for errors would be better.
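Editor's illustration: the behavior requested in this thread (query each GPU on its own, skip the ones that fail, and list the failures in red under the healthy ones) could be sketched roughly as below. This is not the fix that was committed to nvitop. The names probe_devices, render, and fake_get_handle are hypothetical, and the fake query function stands in for a real driver call so the sketch runs without a GPU. With nvidia-ml-py, one would presumably call pynvml.nvmlInit() first, then pass pynvml.nvmlDeviceGetHandleByIndex as get_handle and (pynvml.NVMLError,) as errors.

```python
from typing import Callable, List, Tuple

# ANSI escape codes for red text; most terminals honor them.
RED = "\033[31m"
RESET = "\033[0m"


def probe_devices(
    count: int,
    get_handle: Callable[[int], object],
    errors: Tuple[type, ...] = (Exception,),
) -> Tuple[List[Tuple[int, object]], List[Tuple[int, str]]]:
    """Query each device index separately so one failure does not abort the rest."""
    healthy, failed = [], []
    for index in range(count):
        try:
            healthy.append((index, get_handle(index)))
        except errors as exc:
            # Remember the failure instead of propagating it.
            failed.append((index, str(exc)))
    return healthy, failed


def render(healthy, failed) -> List[str]:
    """Healthy devices first, then errored devices highlighted in red."""
    lines = [f"GPU {index}: OK" for index, _ in healthy]
    lines += [f"{RED}GPU {index}: ERROR ({message}){RESET}" for index, message in failed]
    return lines


def fake_get_handle(index: int) -> str:
    """Hypothetical stand-in for a driver query; device 2 is the broken one."""
    if index == 2:
        raise RuntimeError("Unknown Error")
    return f"handle-{index}"


healthy, failed = probe_devices(4, fake_get_handle, errors=(RuntimeError,))
for line in render(healthy, failed):
    print(line)
```

Catching the error per device rather than around the whole enumeration is what lets devices 0, 1, and 3 still be reported, matching the nvidia-smi -i 0,1,3 workaround described in the issue.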