Commit graph

749 commits

Author SHA1 Message Date
Xuehai Pan
a876f6e711 chore(pre-commit): update pre-commit hooks 2024-10-26 20:38:40 +08:00
Xuehai Pan
878200d111 deps(nvidia-ml-py): add nvidia-ml-py 12.550.89 and 12.560.30 to support list 2024-10-04 03:07:56 +08:00
Xuehai Pan
4127c69a3a chore(pre-commit): update pre-commit hooks 2024-10-04 03:07:55 +08:00
Xuehai Pan
5522c9baf6 deps(docs): update documentation dependencies 2024-09-11 23:03:23 +08:00
Xuehai Pan
e57db4debb chore(pre-commit): update pre-commit hooks 2024-09-11 22:59:31 +08:00
Christian Bauer
80100c7c85 fix(collector): fix documentation for ResourceMetricCollector.clear() function (#132) 2024-08-07 19:16:31 +08:00
Xuehai Pan
61add4194a chore(pre-commit): update pre-commit hooks 2024-08-07 17:41:27 +08:00
Xuehai Pan
623b1e5360 chore(pre-commit): update pre-commit hooks 2024-07-31 15:36:15 +08:00
Xuehai Pan
51f08cd106 refactor: remove unnecessary global constants 2024-07-12 17:08:59 +00:00
Xuehai Pan
c00b80d3cf feat(api): handle exceptions for function getpass.getuser() (#130) 2024-07-12 23:07:52 +08:00
Xuehai Pan
35ce4ad1a4 docs(README): add notes to set alias for pipx run nvitop 2024-07-12 12:04:36 +00:00
Xuehai Pan
56940d1b25 fix(api/libnvml): ignore UTF-8 decoding errors from pynvml 2024-07-12 11:38:13 +00:00
Xuehai Pan
6bc8a8bf10 fix(api/libnvml): gracefully ignore UTF-8 decoding errors 2024-07-04 16:51:14 +08:00
Xuehai Pan
4f46184441 refactor(setup): refactor setup scripts 2024-07-04 16:51:14 +08:00
Xuehai Pan
4833f201ad deps(nvidia-ml-py): add nvidia-ml-py 12.555.43 to support list 2024-06-01 17:01:16 +00:00
Xuehai Pan
2e15f86940 deps(nvidia-ml-py): add nvidia-ml-py 11.525.150 to support list 2024-06-01 17:01:16 +00:00
Xuehai Pan
e927bba14d chore(pre-commit): update pre-commit hooks 2024-06-01 17:01:15 +00:00
Xuehai Pan
8fb96531da feat(api/device): add aggregated total NVLink throughput 2024-05-09 02:38:32 +08:00
Xuehai Pan
52b44a77d8 chore(pre-commit): update pre-commit hooks 2024-05-09 02:36:30 +08:00
Xuehai Pan
29e0136934 deps(nvidia-ml-py): add nvidia-ml-py 12.550.52 to support list 2024-04-30 15:05:24 +00:00
Xuehai Pan
23ef17f655 deps(nvidia-ml-py): add nvidia-ml-py 12.535.161 to support list 2024-04-30 15:04:32 +00:00
Xuehai Pan
c5ba41dbdf chore(pre-commit): update pre-commit hooks 2024-04-30 15:04:10 +00:00
Xuehai Pan
b724d18fec chore(pre-commit): update pre-commit hooks 2024-04-21 15:50:39 +00:00
Xuehai Pan
201caef035 docs(.readthedocs.yaml): update read the docs config 2024-03-24 15:11:00 +00:00
Xuehai Pan
c2f7af4901 docs(.readthedocs.yaml): update read the docs config 2024-03-12 13:41:52 +00:00
Xuehai Pan
a0387867fb chore(pre-commit): update pre-commit hooks 2024-03-12 13:41:52 +00:00
Xuehai Pan
470245dc3d chore(pre-commit): update pre-commit hooks 2024-03-05 06:07:01 +00:00
Xuehai Pan
8e0c203a1d chore: update license header 2024-02-16 09:58:19 +00:00
Xuehai Pan
1710579c66 lint: update ruff rules 2024-02-16 09:43:47 +00:00
Xuehai Pan
64e35336cd chore(install-nvidia-driver): set LANGUAGE environment variable 2023-12-21 02:31:30 +08:00
dependabot[bot]
327223ef6e deps(workflows): bump actions/download-artifact from 3 to 4 (#115) 2023-12-18 15:06:57 +08:00
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
dependabot[bot]
dca862eb9c deps(workflows): bump actions/upload-artifact from 3 to 4 (#116) 2023-12-18 15:06:37 +08:00
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Xuehai Pan
9c5d330076 ver: bump version to v1.3.2 2023-12-17 19:18:16 +08:00
Xuehai Pan
bff355bcc4 fix(callbacks/lightning): populate callback for lightning (#114) 2023-12-17 19:13:19 +08:00
dependabot[bot]
b50b83767b deps(workflows): bump actions/setup-python from 4 to 5 (#111) 2023-12-11 14:26:02 +08:00
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Xuehai Pan <XuehaiPan@pku.edu.cn>
Xuehai Pan
8c8bc18ea0 feat(exporter): remove metrics if process is gone (#107) 2023-11-23 19:08:40 +08:00
Xuehai Pan
83f90f3fa1 chore(install-nvidia-driver): do not call nvidia-smi when the NVIDIA kernel modules are not loaded 2023-11-10 19:45:21 +08:00
Xuehai Pan
36e66bb43c chore(pre-commit): update pre-commit hooks 2023-11-10 19:45:21 +08:00
Xuehai Pan
4bcb6c92b3 style: miscellaneous style housekeeping 2023-11-05 16:05:38 +08:00
Xuehai Pan
1cff66bc03 deps(nvidia-ml-py): add nvidia-ml-py 12.535.133 to support list 2023-11-05 00:17:15 +08:00
Xuehai Pan
5cba62ffe1 chore(pre-commit): update pre-commit hooks 2023-10-25 23:11:17 +08:00
Xuehai Pan
2b3ec124d5 ver: bump version to v1.3.1 2023-10-05 20:04:13 +08:00
Xuehai Pan
a37e63fcf3 deps(python): add Python 3.12 classifiers (#101) 2023-10-05 20:01:13 +08:00
Xuehai Pan
9da41a5d12 fix(libcuda): fix cuDeviceGetUuid() when the UUID contains 0x00 (#100) 2023-10-05 19:48:41 +08:00
dependabot[bot]
49c164cf30 deps(workflows): bump actions/checkout from 3 to 4 (#96) 2023-09-11 15:17:47 +08:00
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Xuehai Pan
e8dd7e26e2 chore(pre-commit): update pre-commit hooks 2023-09-09 16:39:48 +08:00
Xuehai Pan
ed10216f6a chore(.readthedocs.yaml): remove deprecated config entry 2023-09-09 16:33:25 +08:00
Xuehai Pan
46ea686c33 deps(nvidia-ml-py): add nvidia-ml-py 12.535.108 to support list 2023-08-31 17:15:11 +08:00
Xuehai Pan
410785e283 ver: bump version to 1.3.0 2023-08-26 17:39:40 +00:00
Xuehai Pan
daf72c7bf3 feat(exporter): add Prometheus exporter (#92) 2023-08-27 01:37:04 +08:00