llm-swarm backend integration for slurm clusters #77

Triggered via pull request: March 1, 2024 05:59
Status: Failure
Total duration: 20m 4s
test_cli_tensorrt_llm.yaml

on: pull_request
pull_image_and_run_cli_tensorrt_llm_tests: 0s
Annotations

1 error
pull_image_and_run_cli_tensorrt_llm_tests
The self-hosted runner: hf-dgx-01 lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.