Skip to content

updated user-mode data collection and support for flux resource manager #335

updated user-mode data collection and support for flux resource manager

updated user-mode data collection and support for flux resource manager #335

Workflow file for this run

name: System mode
on:
push:
branches: [ main, dev ]
pull_request:
branches: [ main, dev ]
jobs:
test:
name: System-level Omnistat
runs-on: ubuntu-22.04
strategy:
matrix:
execution: [ source, package ]
steps:
- name: Check out repository code
uses: actions/checkout@v4
- name: Comment out GPU devices (not available in GitHub)
run: sed -i "/devices:/,+2 s/^/#/" test/docker/slurm/compose.yaml
- name: Disable SMI collector (won't work in GitHub)
run: >
sed -i "s/enable_rocm_smi = True/enable_rocm_smi = False/" \
test/docker/slurm/omnistat-system.config
- name: Start containerized environment
env:
TEST_OMNISTAT_EXECUTION: ${{ matrix.execution }}
run: docker compose -f test/docker/slurm/compose.yaml up -d
- name: Wait for Prometheus
run: >
timeout 1m bash -c \
'until $(curl -o /dev/null --fail -s localhost:9090/metrics); do \
echo "Waiting for Prometheus..."; \
sleep 5; \
done'
- name: Wait for Omnistat
run: >
timeout 15m bash -c \
'until [[ $(curl -s -g "localhost:9090/api/v1/query?query=up{instance=\"node1:8001\"}>0" | jq ".data.result|length") != 0 ]]; do \
echo "Waiting for Omnistat..."; \
sleep 15; \
done'
- name: Display node logs
run: for i in node1 node2; do docker logs slurm-$i; done
- name: Install test dependencies
run: pip3 install -r test/requirements.txt
- name: Run tests
working-directory: ./test
run: pytest -v test_integration.py test_job_system.py