Fix count of allocated/idle cores in metrics #377

Merged: 3 commits, Nov 5, 2024
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -28,6 +28,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - Add `metrics` > `restrict` parameter for the agent.
 - Add `ui` > `templates`, `message_template`, `message_login` parameters for
   the gateway.
+- Select `alloc_cpus` and `alloc_idle_cpus` nodes fields on `slurmrestd`
+  `/slurm/*/nodes` and `/slurm/*/node/<node>` endpoints.
 - Introduce service message template.
 - show-conf: Introduce `slurm-web-show-conf` utility to dump current
   configuration settings of gateway and agent components with their origin,
3 changes: 3 additions & 0 deletions conf/vendor/agent.yml
@@ -148,6 +148,8 @@ filters:
     - state
     - reason
     - partitions
+    - alloc_cpus
+    - alloc_idle_cpus
     doc: |
       List of nodes fields selected in slurmrestd API, all other fields are
       filtered out.
@@ -169,6 +171,7 @@ filters:
     - reason
     - partitions
     - alloc_cpus
+    - alloc_idle_cpus
     - alloc_memory
     doc: |
       List of individual node fields selected in slurmrestd API, all other fields
6 changes: 6 additions & 0 deletions docs/modules/conf/examples/agent.ini
@@ -208,6 +208,8 @@ ctldjob=
 # - state
 # - reason
 # - partitions
+# - alloc_cpus
+# - alloc_idle_cpus
 nodes=
   name
   cpus
@@ -217,6 +219,8 @@ nodes=
   state
   reason
   partitions
+  alloc_cpus
+  alloc_idle_cpus

 # List of individual node fields selected in slurmrestd API, all other fields
 # are filtered out.
@@ -236,6 +240,7 @@ nodes=
 # - reason
 # - partitions
 # - alloc_cpus
+# - alloc_idle_cpus
 # - alloc_memory
 node=
   name
@@ -252,6 +257,7 @@ node=
   reason
   partitions
   alloc_cpus
+  alloc_idle_cpus
   alloc_memory

 # List of partitions fields selected in slurmrestd API, all other fields are
6 changes: 6 additions & 0 deletions docs/modules/conf/partials/conf-agent.adoc
@@ -340,6 +340,10 @@ filtered out.

 * `partitions`

+* `alloc_cpus`
+
+* `alloc_idle_cpus`
+

 |-

@@ -383,6 +387,8 @@ are filtered out.

 * `alloc_cpus`

+* `alloc_idle_cpus`
+
 * `alloc_memory`

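The filter changes above matter because the agent keeps only the listed fields of each node and filters out all others, so a field read by the metrics code presumably has to be selected first. A minimal sketch of that selection, assuming a hypothetical `select_fields` helper and a made-up node record (neither is taken from Slurm-web):

```python
def select_fields(item, fields):
    """Keep only the selected keys of a dict (hypothetical helper)."""
    return {key: value for key, value in item.items() if key in fields}


# Made-up node record, loosely shaped like a slurmrestd node entry.
node = {
    "name": "cn1",
    "cpus": 16,
    "state": ["MIXED"],
    "alloc_cpus": 4,
    "alloc_idle_cpus": 12,
    "boot_time": 0,
}

# Shortened filter lists for illustration only.
old_filter = ["name", "cpus", "state"]
new_filter = old_filter + ["alloc_cpus", "alloc_idle_cpus"]

filtered_old = select_fields(node, old_filter)
filtered_new = select_fields(node, new_filter)

# With the old filter the per-core counters are gone; with the new one
# they are available to code that reads node["alloc_cpus"].
print(sorted(filtered_old))
print(sorted(filtered_new))
```

With the old list, looking up `alloc_cpus` on the filtered record would raise a `KeyError`, which is why the configuration and the counting code change together in this pull request.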
5 changes: 3 additions & 2 deletions slurmweb/slurmrestd/__init__.py
@@ -145,7 +145,6 @@ def nodes_cores_states(self):
         }
         cores_states = {
             "idle": 0,
-            "mixed": 0,
             "allocated": 0,
             "down": 0,
             "drain": 0,
@@ -157,7 +156,9 @@ def nodes_cores_states(self):
             cores = node["cpus"]
             if "MIXED" in node["state"]:
                 nodes_states["mixed"] += 1
-                cores_states["mixed"] += cores
+                # Look at number of actually allocated/idle cores
+                cores_states["allocated"] += node["alloc_cpus"]
+                cores_states["idle"] += node["alloc_idle_cpus"]
             elif "ALLOCATED" in node["state"]:
                 nodes_states["allocated"] += 1
                 cores_states["allocated"] += cores
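To illustrate the effect of the change in `nodes_cores_states`, here is a simplified standalone sketch of the new counting rule. It is not the project's implementation: the function name `count_cores`, the sample nodes, and the `IDLE` branch are assumptions made for the example, and the `down` and `drain` states are left out.

```python
def count_cores(nodes):
    """Count allocated and idle cores (simplified, hypothetical sketch)."""
    cores_states = {"idle": 0, "allocated": 0}
    for node in nodes:
        cores = node["cpus"]
        if "MIXED" in node["state"]:
            # A mixed node is split into its allocated and idle cores
            # instead of being counted under a separate "mixed" bucket.
            cores_states["allocated"] += node["alloc_cpus"]
            cores_states["idle"] += node["alloc_idle_cpus"]
        elif "ALLOCATED" in node["state"]:
            cores_states["allocated"] += cores
        elif "IDLE" in node["state"]:
            # Assumption: fully idle nodes contribute all their cores.
            cores_states["idle"] += cores
    return cores_states


# Made-up sample data: one mixed, one allocated and one idle node.
nodes = [
    {"cpus": 16, "state": ["MIXED"], "alloc_cpus": 4, "alloc_idle_cpus": 12},
    {"cpus": 16, "state": ["ALLOCATED"], "alloc_cpus": 16, "alloc_idle_cpus": 0},
    {"cpus": 16, "state": ["IDLE"], "alloc_cpus": 0, "alloc_idle_cpus": 16},
]

result = count_cores(nodes)
print(result)  # {'idle': 28, 'allocated': 20}
```

Under the previous logic the 16 cores of the mixed node would have been reported as `mixed`, leaving 16 allocated and 16 idle cores; with the new rule the same 48 cores are reported as 20 allocated and 28 idle.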