test/e2e: libvirt: Try and reduce the resource usage of the kcli cluster #2117
Comments
https://cloudprice.net/vm/Standard_D4s_v4 Currently we run the tests on a 4 vCPU, 16GB RAM machine.
That's very interesting: if it's a 4x16 machine then it's the same size as the GitHub hosted runners (and might explain some of the libvirt CI flakiness, as we try and squeeze 10 vCPUs and 20GB RAM out of a 4x16 box!). Maybe we can try out libvirt e2e on a self-hosted runner now... (I'll be back with results)
It failed (https://github.com/stevenhorsman/cloud-api-adaptor/actions/runs/11331308718/job/31511037253) with:
The GH runners have 14GB of storage, which may not be enough, so that might be another path to investigate
currently we build the kbs client with rust, which can produce a surprisingly large target folder. we could either clean that up or download the kbs-client via oras?
Yeah, I think that would be great. I'll try out the e2e tests without the KBS section and see if that helps and also re-run it once the caching PR is merged 😃
Ooh - cutting out the KBS deployment and test meant the gh-runner tests worked: https://github.com/stevenhorsman/cloud-api-adaptor/actions/runs/11331632234/job/31512084686 😃
that's great. if we change this line:
to
    oras pull "ghcr.io/confidential-containers/staged-images/kbs-client:sample_only-x86_64-linux-gnu-${KBS_SHA}"
    chmod +x ./kbs-client
we can also drop the rust toolchain installation
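A minimal sketch of the download step suggested above, wrapped in two helper functions. The function names `kbs_client_ref` and `fetch_kbs_client` are hypothetical and only for illustration; the image reference and the `chmod` step are copied from the comment. It assumes `oras` is installed and that the artifact unpacks a `kbs-client` file into the current directory, which has not been verified here.

```shell
#!/bin/sh
# Hypothetical helper: compose the staged-image reference quoted in the comment.
# $1 is the KBS commit SHA (KBS_SHA in the comment above).
kbs_client_ref() {
    printf 'ghcr.io/confidential-containers/staged-images/kbs-client:sample_only-x86_64-linux-gnu-%s\n' "$1"
}

# Hypothetical helper: pull the prebuilt client and make it executable.
# Assumes the artifact contains a file named kbs-client (not verified).
fetch_kbs_client() {
    ref=$(kbs_client_ref "$1") || return 1
    oras pull "$ref" || return 1
    chmod +x ./kbs-client
}
```

Calling `fetch_kbs_client "$KBS_SHA"` in place of the cargo build would avoid both the Rust toolchain installation and the large target folder on the test instance, if the published binary turns out to work.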
Cool - I'll give that a try in my fork
When I get a chance I'll try and re-create and debug locally
hmm I've never tested it either, the whole kbs-client business is a bit of a black box to me, so the available binary might not work. an alternative would be to build the kbs-client in another job and pass it around as an artifact. upside: faster builds, b/c it can be built in parallel before the test; doesn't consume space on the test instance
Ah - the kbs client is extracted to the wrong directory as it's built to targets/release. I'll try and fix that and see if that helps. I also want to understand why we aren't hitting errors in the e2e tests when they first try to use a non-existent file?
So we just ignore any errors thrown in the kbs client code. I thought I remembered fixing that, but https://github.com/confidential-containers/cloud-api-adaptor/pull/2055/files hasn't merged yet.
I think we just move the expectation for the client to be in
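One way to surface the missing-binary problem early, instead of silently ignoring it, is an explicit check before the client is used. This is only a sketch: the function name `require_kbs_client` is hypothetical, and the path to pass in is whatever location the tests end up expecting.

```shell
#!/bin/sh
# Hypothetical guard: fail loudly when the kbs-client binary is missing
# or not executable, rather than letting later steps swallow the error.
# $1 is the path where the tests expect the client to be.
require_kbs_client() {
    client_path=$1
    if [ ! -x "$client_path" ]; then
        echo "kbs-client not found or not executable at: $client_path" >&2
        return 1
    fi
}
```

A setup script could call `require_kbs_client <path> || exit 1` right after the build or download step, so a wrong extraction directory fails the job immediately.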
At the moment in the libvirt testing we are using the default node size. This leads to the situation where the worker and control-plane nodes each default to 4 vCPU and 6GB RAM.
In an ideal world we'd like to reduce our test footprint to fit inside the GitHub hosted runner, which is a 4x16GB machine.
Our peer pod VM is currently using 2x8GB of its own, which we are working on reducing, but the 8 vCPU and 12GB RAM that the kcli cluster uses is way too big. Actually reducing this shouldn't be too tricky, as I think it's just editing the default parameters we pass in kcli_cluster.sh, but the tricky bit is working out the minimum resources we can get away with without impacting the tests, so looking at the resource usage on an existing cluster might help there.
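The numbers above can be tallied in a few lines of shell. This is a rough sketch using only the figures quoted in this issue; the variable names are made up for illustration, and it assumes the cluster is one control-plane node plus one worker (which is what 8 vCPU and 12GB at 4x6 per node implies).

```shell
#!/bin/sh
# Figures quoted in this issue; variable names are illustrative only.
RUNNER_VCPUS=4      # GitHub hosted runner
RUNNER_RAM_GB=16
NODE_COUNT=2        # assumption: one control-plane node plus one worker
NODE_VCPUS=4        # current default per node
NODE_RAM_GB=6
PODVM_VCPUS=2       # peer pod VM
PODVM_RAM_GB=8

# Total requested by the kcli cluster plus the peer pod VM.
TOTAL_VCPUS=$(( NODE_COUNT * NODE_VCPUS + PODVM_VCPUS ))
TOTAL_RAM_GB=$(( NODE_COUNT * NODE_RAM_GB + PODVM_RAM_GB ))

echo "requested: ${TOTAL_VCPUS} vCPUs, ${TOTAL_RAM_GB}GB RAM"
echo "available: ${RUNNER_VCPUS} vCPUs, ${RUNNER_RAM_GB}GB RAM"
echo "over budget by: $(( TOTAL_VCPUS - RUNNER_VCPUS )) vCPUs, $(( TOTAL_RAM_GB - RUNNER_RAM_GB ))GB RAM"
```

Changing `NODE_VCPUS` and `NODE_RAM_GB` shows how far the per-node defaults would need to drop before the whole setup fits in the runner, though the real minimum still has to come from measuring an existing cluster.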