/home/jenkins/workspace/Grinder: No space left on device #2251
Comments
From |
Test_openjdk18_hs_extended.openjdk_x86-64_alpine-linux_testList_0 was using 43Gb on the Alpine 3.12 container. Similarly it was the Both have been cleared and the host now has about 131Gb available which should resolve the problem. Therefore closing. |
I could see it happened again. |
test-docker-ubuntu1604-x64 similar issue: |
docker-packet-ubuntu2004-amd-1 similar issue: |
test-docker-ubuntu2010-x64-1 |
test-docker-fedora33-x64-1: |
test-docker-fedora33-x64-2: |
test-docker-ubuntu1804-x64-1 |
test-docker-fedora33-x64-1 https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/2875/console |
Not sure why the host is using so much space, but a Biggest uses of space on the Fedora box appear to have been these, so I've also cleared them out
|
test-docker-ubuntu1804-x64-1 |
https://ci.adoptopenjdk.net/view/Test_grinder/job/Test_Job_Auto_Gen/277/ |
@Haroon-Khel As the new expert in the DockerStatic stuff, can you take a look and see what we can do with this please? We probably need some sort of automation (jenkins job or otherwise) that goes over the dockerhost machines and checks and if necessary reports any problems with:
Doing something with the output of something like these commands may be a good place to start: |
I've created https://ci.adoptopenjdk.net/view/Tooling/job/DockerhostHealthStatus/ for now. It runs https://github.com/Haroon-Khel/openjdk-infrastructure/blob/dockerhosthealth/ansible/playbooks/AdoptOpenJDK_Unix_Playbook/roles/DockerStatic/scripts/dockerhosthealth.sh, which is still at the draft stage. |
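As an illustration of the kind of check such a job could run, here is a minimal Python sketch that flags filesystems whose free space falls below a threshold. It is not the content of `dockerhosthealth.sh`; the function name `check_free_space`, the 10% default threshold, and the example path are assumptions made for illustration only.

```python
import shutil


def check_free_space(paths, min_free_fraction=0.10):
    """Return (path, free_fraction) pairs for paths below the threshold.

    The threshold and the paths are hypothetical; tune them per host.
    """
    problems = []
    for path in paths:
        usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
        # Treat a zero-sized filesystem as having no free space.
        free_fraction = usage.free / usage.total if usage.total else 0.0
        if free_fraction < min_free_fraction:
            problems.append((path, free_fraction))
    return problems


if __name__ == "__main__":
    # Hypothetical path list; a real check would name each host's own mounts.
    for path, fraction in check_free_space(["/"]):
        print(f"LOW SPACE: {path} has {fraction:.1%} free")
```

A Jenkins job could run something like this on each dockerhost and mark the build unstable when the returned list is non-empty.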
@Haroon-Khel The latest JDK11 release didn't appear to cause a filling up of the file system. I think you asserted that adoptium/aqa-tests#3326 hadn't taken effect, although that may be a result of using the
|
Fresh issue on test-docker-ubuntu2010-x64-2: https://ci.adoptopenjdk.net/view/work-in-progress/job/WIP_Test_Job_Auto_Gen/72/console
|
Adding to May 2022 plan (as it looks partly worked, and it does still affect releases) |
No current issues so removing from the May milestone. I'll keep it open for another month or so and then we can close if no more occurrences (Can always be reopened if required) |
test-docker-fedora34-x64-1:
https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/5012/console |
Hmmm cleared up some old volumes, although I thought I'd done a clear-up on this host earlier today so we'll see if it fills up again. If so we'll need to investigate what's using it up. I've only been able to reclaim 25% of the 400Gb volume, and it shouldn't be using anywhere near that amount. |
An extra 50Gb seems to have been used up overnight on the file system. That's not normal |
Could you list the top files/folders that use the most space? Maybe we can get some clues. |
It's not quite that simple when it's a load of docker containers on the host system unfortunately. |
Looks like this process might have been keeping a lot of space in use, probably from deleted files which still had file handles open to them: |
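For anyone checking whether another host is affected in the same way, the following is a minimal sketch (not the output referred to above) of how deleted-but-still-open files can be found on Linux. It relies on the kernel appending ` (deleted)` to the link target of entries under `/proc/<pid>/fd`; the function names are hypothetical, and reading other users' processes is assumed to need root.

```python
import os

DELETED_SUFFIX = " (deleted)"


def is_deleted_target(link_target):
    """True if a /proc/<pid>/fd link target refers to an unlinked file."""
    return link_target.endswith(DELETED_SUFFIX)


def deleted_open_files(pid):
    """Yield (fd, target, size_bytes) for deleted files still held open by pid.

    Linux only: yields nothing if /proc/<pid>/fd is missing or unreadable.
    """
    fd_dir = f"/proc/{pid}/fd"
    try:
        fds = os.listdir(fd_dir)
    except OSError:
        return  # no such process, no permission, or no /proc
    for fd in fds:
        fd_path = os.path.join(fd_dir, fd)
        try:
            target = os.readlink(fd_path)
            if not is_deleted_target(target):
                continue
            # stat() follows the link to the still-open file, so the size
            # is reported even though the path no longer exists.
            size = os.stat(fd_path).st_size
        except OSError:
            continue  # descriptor closed while we were looking
        yield fd, target, size
```

Summing `size_bytes` per process gives a rough idea of how much space would be released if that process exited.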
https://ci.adoptopenjdk.net/job/SXA-processCheck/label=test-docker-fedora34-x64-1/295/console cannot complete on this machine due to the space issue. In the test Jenkins script, it detects the leftover processes. I think we should enforce the logic to kill the leftover processes before and after the test job. The ideal place for this logic should be in TKG. If that cannot be completed soon, maybe we should do it in the Jenkins script for now. |
I think it has been done in the Jenkins script https://github.com/adoptium/aqa-tests/blob/57c4bc2f4907cffdecedbd5387d4e7b6f6a33f9a/buildenv/jenkins/JenkinsfileBase#L854-L859 |
re #2251 (comment), the above code only lists the processes. |
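To make the difference between listing and killing concrete, here is a rough sketch of the "kill leftovers before and after the test job" idea. It does not reproduce the logic in `JenkinsfileBase` or TKG; the function names, the substring matching, and the example pattern are all assumptions for illustration.

```python
import os
import signal


def select_leftovers(processes, patterns, protected_pids=()):
    """Pick pids whose command line contains any of the given patterns.

    processes: iterable of (pid, cmdline) pairs, e.g. parsed from `ps` output.
    The current process and any protected_pids are never selected.
    """
    protected = set(protected_pids) | {os.getpid()}
    return [
        pid
        for pid, cmdline in processes
        if pid not in protected and any(p in cmdline for p in patterns)
    ]


def terminate(pids, sig=signal.SIGTERM):
    """Send sig to each pid, ignoring processes that have already exited."""
    for pid in pids:
        try:
            os.kill(pid, sig)
        except ProcessLookupError:
            pass  # already gone
```

Keeping selection separate from termination means the same selection can still be used for a list-only report, with `terminate` called only when cleanup is enabled.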
test-sxa-armv7l-ubuntu2004-odroid-2 got |
Will cover this under #2829 |
I believe all of the problems related to the DockerHost systems have now been resolved since we allocated more dedicated space to /var/lib/docker a few months ago, so I'm going to close this issue now. If we have any further problems they can be opened in separate issues. |
/home/jenkins/workspace/Grinder: No space left on device. The error was found on the following docker machines:
test-docker-fedora33-x64-1
test-docker-fedora33-x64-2
https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/1027/
https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/1028/console