mirror of
https://github.com/XuehaiPan/nvitop.git
synced 2026-05-15 14:15:55 -06:00
[GH-ISSUE #106] [BUG][exporter] Process metrics still exist when the process is gone #66
Originally created by @caotangdaiduong on GitHub (Nov 22, 2023).
Original GitHub issue: https://github.com/XuehaiPan/nvitop/issues/106
Originally assigned to: @XuehaiPan on GitHub.
Required prerequisites
What version of nvitop are you using?
1.3.1
Operating system and version
Ubuntu 20.04.4 LTS
NVIDIA driver version
510.47.03
NVIDIA-SMI
Python environment
3.8.10 (default, May 26 2023, 14:05:08)
[GCC 9.4.0] linux
nvidia-ml-py==12.535.133
nvitop==1.3.1
nvitop-exporter==1.3.1
Problem description
nvitop-exporter caches metric values: they are retained and not refreshed.
Steps to Reproduce
The Python snippets (if any):
Command lines:
Traceback
No response
Logs
No response
Expected behavior
No response
Additional context
No response
@XuehaiPan commented on GitHub (Nov 22, 2023):
Hi @caotangdaiduong, do you set up a prometheus service to retrieve the latest metrics automatically?
@caotangdaiduong commented on GitHub (Nov 22, 2023):
Currently I'm using cron to restart the service every minute. This may sound crazy, but with that workaround the metrics are completely accurate.
@XuehaiPan commented on GitHub (Nov 22, 2023):
@caotangdaiduong I can see the metrics are updating on my side. I'm running watch --differences:
The metrics for GPU processes are actively updated on my side.
I can confirm if the GPU process is gone, the gauge keys still exist. Do you mean you want to remove these keys if the corresponding processes are gone?
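The behavior described above (gauge keys outliving their processes) can be illustrated with a minimal sketch. This is not the code from nvitop-exporter or from the fix; ToyGauge, update_process_metrics, and the (gpu index, pid) label pair are hypothetical names used only for illustration. The idea is that on every refresh, any label set whose process is no longer alive gets removed before the live values are written. In a real exporter built on prometheus_client, the equivalent step would presumably use the gauge's remove(*labelvalues) method.

```python
from typing import Dict, Tuple

# Hypothetical label set for a per-process metric: (gpu index, pid).
LabelKey = Tuple[str, str]


class ToyGauge:
    """Stand-in for a labelled gauge: one value per label tuple."""

    def __init__(self) -> None:
        self.samples: Dict[LabelKey, float] = {}

    def set(self, key: LabelKey, value: float) -> None:
        self.samples[key] = value

    def remove(self, key: LabelKey) -> None:
        # Dropping the key makes the series disappear from the next scrape.
        self.samples.pop(key, None)


def update_process_metrics(gauge: ToyGauge, alive: Dict[LabelKey, float]) -> None:
    """Refresh live processes and drop label sets of processes that are gone."""
    for key in set(gauge.samples) - set(alive):
        gauge.remove(key)
    for key, value in alive.items():
        gauge.set(key, value)


gauge = ToyGauge()
update_process_metrics(gauge, {("0", "1234"): 512.0, ("0", "5678"): 2048.0})
update_process_metrics(gauge, {("0", "5678"): 1024.0})  # PID 1234 has exited
print(sorted(gauge.samples.items()))  # [(('0', '5678'), 1024.0)]
```

Without the removal loop, the entry for PID 1234 would keep its last value indefinitely, which matches the stale metrics reported in this issue.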
@XuehaiPan commented on GitHub (Nov 22, 2023):
@caotangdaiduong I can confirm this and opened a PR #107 to resolve this. You can try it via:
@caotangdaiduong commented on GitHub (Nov 23, 2023):
Hi @XuehaiPan
Thanks for your efforts, I tested it and it works as expected
TTLCache usage to CLI-only #146