Skip to content

Fix issue while logging cuda device utilization to tensorboard (#3727) #11318

Fix issue while logging cuda device utilization to tensorboard (#3727)

Fix issue while logging cuda device utilization to tensorboard (#3727) #11318

Triggered via push October 13, 2023 14:08
Status Failure
Total duration 41m 11s
Artifacts 7

pytest.yml

on: push
Matrix: integration-tests
Matrix: pytest
Fit to window
Zoom out
Zoom in

Annotations

1 error and 15 warnings
integration_tests_e
Unhandled exception. System.IO.IOException: No space left on device : '/home/runner/runners/2.309.0/_diag/Worker_20231013-140824-utc.log' at System.IO.RandomAccess.WriteAtOffset(SafeFileHandle handle, ReadOnlySpan`1 buffer, Int64 fileOffset) at System.IO.Strategies.BufferedFileStreamStrategy.FlushWrite() at System.IO.StreamWriter.Flush(Boolean flushStream, Boolean flushEncoder) at System.Diagnostics.TextWriterTraceListener.Flush() at System.Diagnostics.TraceSource.Flush() at GitHub.Runner.Common.TraceManager.Dispose(Boolean disposing) at GitHub.Runner.Common.TraceManager.Dispose() at GitHub.Runner.Common.HostContext.Dispose(Boolean disposing) at GitHub.Runner.Common.HostContext.Dispose() at GitHub.Runner.Worker.Program.Main(String[] args) System.IO.IOException: No space left on device : '/home/runner/runners/2.309.0/_diag/Worker_20231013-140824-utc.log' at System.IO.RandomAccess.WriteAtOffset(SafeFileHandle handle, ReadOnlySpan`1 buffer, Int64 fileOffset) at System.IO.Strategies.BufferedFileStreamStrategy.FlushWrite() at System.IO.StreamWriter.Flush(Boolean flushStream, Boolean flushEncoder) at System.Diagnostics.TextWriterTraceListener.Flush() at GitHub.Runner.Common.HostTraceListener.WriteHeader(String source, TraceEventType eventType, Int32 id) at GitHub.Runner.Common.HostTraceListener.TraceEvent(TraceEventCache eventCache, String source, TraceEventType eventType, Int32 id, String message) at System.Diagnostics.TraceSource.TraceEvent(TraceEventType eventType, Int32 id, String message) at GitHub.Runner.Worker.Worker.RunAsync(String pipeIn, String pipeOut) at GitHub.Runner.Worker.Program.MainAsync(IHostContext context, String[] args) System.IO.IOException: No space left on device : '/home/runner/runners/2.309.0/_diag/Worker_20231013-140824-utc.log' at System.IO.RandomAccess.WriteAtOffset(SafeFileHandle handle, ReadOnlySpan`1 buffer, Int64 fileOffset) at System.IO.Strategies.BufferedFileStreamStrategy.FlushWrite() at System.IO.StreamWriter.Flush(Boolean flushStream, Boolean flushEncoder) at System.Diagnostics.TextWriterTraceListener.Flush() at GitHub.Runner.Common.HostTraceListener.WriteHeader(String source, TraceEventType eventType, Int32 id) at GitHub.Runner.Common.HostTraceListener.TraceEvent(TraceEventCache eventCache, String source, TraceEventType eventType, Int32 id, String message) at System.Diagnostics.TraceSource.TraceEvent(TraceEventType eventType, Int32 id, String message) at GitHub.Runner.Common.Tracing.Error(Exception exception) at GitHub.Runner.Worker.Program.MainAsync(IHostContext context, String[] args)
py3.8, torch-1.13.0, distributed, ubuntu-latest, ray 2.2.0
The following actions uses node12 which is deprecated and will be forced to run on node16: actions/checkout@v2, actions/setup-python@v2, actions/cache@v2, actions/upload-artifact@v2. For more info: https://github.blog/changelog/2023-06-13-github-actions-all-actions-will-run-on-node16-instead-of-node12-by-default/
Event File
The following actions uses node12 which is deprecated and will be forced to run on node16: actions/upload-artifact@v2. For more info: https://github.blog/changelog/2023-06-13-github-actions-all-actions-will-run-on-node16-instead-of-node12-by-default/
py3.9, torch-2.0.0, distributed, ubuntu-latest, ray 2.3.0
The following actions uses node12 which is deprecated and will be forced to run on node16: actions/checkout@v2, actions/setup-python@v2, actions/cache@v2, actions/upload-artifact@v2. For more info: https://github.blog/changelog/2023-06-13-github-actions-all-actions-will-run-on-node16-instead-of-node12-by-default/
py3.10, torch-nightly, distributed, ubuntu-latest, ray 2.3.1
The following actions uses node12 which is deprecated and will be forced to run on node16: actions/checkout@v2, actions/setup-python@v2, actions/cache@v2, actions/upload-artifact@v2. For more info: https://github.blog/changelog/2023-06-13-github-actions-all-actions-will-run-on-node16-instead-of-node12-by-default/
Test Minimal Install
The following actions uses node12 which is deprecated and will be forced to run on node16: actions/checkout@v2, actions/setup-python@v2. For more info: https://github.blog/changelog/2023-06-13-github-actions-all-actions-will-run-on-node16-instead-of-node12-by-default/
integration_tests_b
The following actions uses node12 which is deprecated and will be forced to run on node16: actions/checkout@v2, actions/setup-python@v2. For more info: https://github.blog/changelog/2023-06-13-github-actions-all-actions-will-run-on-node16-instead-of-node12-by-default/
integration_tests_a
The following actions uses node12 which is deprecated and will be forced to run on node16: actions/checkout@v2, actions/setup-python@v2. For more info: https://github.blog/changelog/2023-06-13-github-actions-all-actions-will-run-on-node16-instead-of-node12-by-default/
integration_tests_c
The following actions uses node12 which is deprecated and will be forced to run on node16: actions/checkout@v2, actions/setup-python@v2. For more info: https://github.blog/changelog/2023-06-13-github-actions-all-actions-will-run-on-node16-instead-of-node12-by-default/
py3.8, torch-1.13.0, not distributed, ubuntu-latest, ray 2.2.0
The following actions uses node12 which is deprecated and will be forced to run on node16: actions/checkout@v2, actions/setup-python@v2, actions/cache@v2, actions/upload-artifact@v2. For more info: https://github.blog/changelog/2023-06-13-github-actions-all-actions-will-run-on-node16-instead-of-node12-by-default/
py3.10, torch-nightly, not distributed, ubuntu-latest, ray 2.3.1
The following actions uses node12 which is deprecated and will be forced to run on node16: actions/checkout@v2, actions/setup-python@v2, actions/cache@v2, actions/upload-artifact@v2. For more info: https://github.blog/changelog/2023-06-13-github-actions-all-actions-will-run-on-node16-instead-of-node12-by-default/
py3.9, torch-2.0.0, not distributed, ubuntu-latest, ray 2.3.0
The following actions uses node12 which is deprecated and will be forced to run on node16: actions/checkout@v2, actions/setup-python@v2, actions/cache@v2, actions/upload-artifact@v2. For more info: https://github.blog/changelog/2023-06-13-github-actions-all-actions-will-run-on-node16-instead-of-node12-by-default/
LLM Tests
The following actions uses node12 which is deprecated and will be forced to run on node16: actions/checkout@v2, actions/setup-python@v2. For more info: https://github.blog/changelog/2023-06-13-github-actions-all-actions-will-run-on-node16-instead-of-node12-by-default/
Combinatorial Tests
The following actions uses node12 which is deprecated and will be forced to run on node16: actions/checkout@v2, actions/setup-python@v2. For more info: https://github.blog/changelog/2023-06-13-github-actions-all-actions-will-run-on-node16-instead-of-node12-by-default/
integration_tests_d
The following actions uses node12 which is deprecated and will be forced to run on node16: actions/checkout@v2, actions/setup-python@v2. For more info: https://github.blog/changelog/2023-06-13-github-actions-all-actions-will-run-on-node16-instead-of-node12-by-default/
integration_tests_e
You are running out of disk space. The runner will stop working when the machine runs out of disk space. Free space left: 93 MB

Artifacts

Produced during runtime
Name Size
Event File Expired
9.72 KB
Unit Test Results (Python 3.10 distributed) Expired
2.01 KB
Unit Test Results (Python 3.10 not distributed) Expired
2.53 KB
Unit Test Results (Python 3.8 distributed) Expired
2.01 KB
Unit Test Results (Python 3.8 not distributed) Expired
2.53 KB
Unit Test Results (Python 3.9 distributed) Expired
2.01 KB
Unit Test Results (Python 3.9 not distributed) Expired
2.53 KB