Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update k8s dependencies to 1.32 #2427

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

kale-amruta
Copy link

@kale-amruta kale-amruta commented Jan 25, 2025

What issue type does this pull request address? (keep at least one, remove the others)
/kind enhancement
/kind feature
/kind test

What does this pull request do? Which issues does it resolve? (use resolves #<issue_number> if possible)
resolves ENG-5389
ENG-5387

Please provide a short message that should be published in the vcluster release notes
Updated vcluster Kubernetes dependencies to v1.32

What else do we need to know?

This PR also fixes vcluster for failing conformance tests because of following issues:

  1. The field status.qosClass has been made immutable from k8s version 1.32 as done over here. When there is a mismatch in status.qosClass field in vcluster and host cluster and vlcuster tries to patch the field in vcluster, the patch fails with an error. This causes conformance tests to fail. The PR fixes tests by ignoring this field in the pod syncer patch updates.
  2. Test Job should allow to use a pod failure policy to ignore failure matching on DisruptionTarget condition [Conformance] was failing because the job controller on vcluster expected the pod to be in failed state before deleting it.
    In this case when the host pod is marked for deletion, the host pod gets into Failed state. But these status updates were not getting propogated to vcluster pod. This led to vcluster pod to be dangling state which caused the test failure. The PR adds a fix of propogating the status updates of host pod to vcluster pod when the host pod is marked for deletion.

Copy link

netlify bot commented Jan 25, 2025

Deploy Preview for vcluster-docs ready!

Built without sensitive environment variables

Name Link
🔨 Latest commit 6e2b055
🔍 Latest deploy log https://app.netlify.com/sites/vcluster-docs/deploys/6794defeb33cb20008003f50
😎 Deploy Preview https://deploy-preview-2427--vcluster-docs.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@kale-amruta kale-amruta marked this pull request as ready for review January 25, 2025 12:19
@kale-amruta kale-amruta changed the title Conformance test fix bump k8s version to 1.32 Jan 25, 2025
@kale-amruta kale-amruta changed the title bump k8s version to 1.32 Update k8s version to 1.32 Jan 25, 2025
@kale-amruta kale-amruta changed the title Update k8s version to 1.32 Update k8s dependencies to 1.32 Jan 25, 2025
@ApsTomar
Copy link

ApsTomar commented Jan 25, 2025

There are two more tests which are failing. These tests were recently added in conformance test suite (a couple of months back). Below are the tests and the RCA:

  1. should run through the lifecycle of a node
  2. should run through the lifecycle of a csinode

Why are these tests failing?

These tests perform the lifecycle operations (CRUD) for nodes and CSI nodes on vcluster respectively but since vcluster doesn't inherently support the node creation on the virtual cluster, so syncToHost method simply deletes the node that was created during the conformance test. Hence, the test can't perform any further action such as get or update and the test fails.

Probable Solutions:

Since this is a case of a feature not supported so ideally it should be skipped but that might impact the certification failure as mentioned in conformance test instructions.
In that case, I tried some workarounds which certainly work but they have some downsides as well. I've detailed them out, please take a look:

  1. The conformance test creates the nodes which have a certain prefix in the name and are unschedulable in nature. This can be used to identify the test nodes and we can skip their deletion during sync operation. Later the test clean-up will delete that node after completion. The caveat is that- a user can misuse this to create a similar node which vcluster won't be able to delete during the regular sync operation.

  2. We can add some gracePeriod to the node deletion in syncToHost method which will keep the node active till the tests are passed. The caveat is that- it'll impact all the node deletion process and the users might not like that the nodes are taking more time to get deleted now. Also, what should be the ideal grace period is also uncertain and might make the test flaky.

  3. Better solution might be a combination of both of the above points: we identify the node created during the test (similar to point-1) and then mark it for deletion with a decent grace period (30 to 60 sec). This will not only make sure that the code change won't impact vcluster's natural behaviour but also won't allow any user to misuse it because it'll eventually get deleted after the grace period runs out.

Let me know your opinions and based on that I will create the PR to fix these remaining 2 conformance tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants