Feature: Adds reconfigure control plane #11871

Open: wants to merge 1 commit into base: master

Hector295

What type of PR is this?
/kind feature

What this PR does / why we need it:

This PR introduces tasks and configurations that allow for the reconfiguration of the control plane (kube-apiserver, controller-manager, and scheduler) in Kubespray without requiring a full cluster reprovision.

Additionally, a new playbook, reconfigure-control-plane.yml, has been added. To reconfigure an existing cluster without performing an upgrade, run:

ansible-playbook -i <INVENTORY> reconfigure-control-plane.yml --skip-tags upgrade

Which issue(s) this PR fixes:

Fixes #11552

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

Feature: Adds new tasks, templates, and a playbook (reconfigure-control-plane.yml) for control plane reconfiguration in Kubespray, allowing incremental updates without requiring a full cluster reprovision.

@k8s-ci-robot added the kind/feature label (Categorizes issue or PR as related to a new feature.) on Jan 8, 2025
@k8s-ci-robot
Contributor

Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot added the do-not-merge/release-note-label-needed label (Indicates that a PR should not merge because it's missing one of the release note labels.) on Jan 8, 2025

CLA Not Signed

@k8s-ci-robot added the cncf-cla: no label (Indicates the PR's author has not signed the CNCF CLA.) on Jan 8, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Hector295
Once this PR has been reviewed and has the lgtm label, please assign yankay for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot
Contributor

Welcome @Hector295!

It looks like this is your first PR to kubernetes-sigs/kubespray 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/kubespray has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot
Contributor

Hi @Hector295. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.


@k8s-ci-robot added the needs-ok-to-test (Indicates a PR that requires an org member to verify it is safe to test.) and size/L (Denotes a PR that changes 100-499 lines, ignoring generated files.) labels on Jan 8, 2025
@VannTen
Contributor

VannTen commented Jan 9, 2025

Hmm, could you explain how this differs from ansible-playbook -l kube_control_plane,etcd upgrade-cluster.yml, for instance?

I'm not completely closed to the idea, but IMO this would be best implemented with limit and tags, or with a more intelligent playbook (we probably restart some things a bit too eagerly today).

@Hector295
Author

Hector295 commented Jan 9, 2025

Hi @VannTen, thank you for your comment and for taking the time to review this PR.

The command you mentioned (ansible-playbook -l kube_control_plane,etcd upgrade-cluster.yml) does not actually perform the necessary updates to the ConfigMap or the values that should be applied to the control plane pods (kube-apiserver, kube-controller-manager, kube-scheduler). The only change it makes is to the kubeadm-config.yaml file, but this does not propagate to the cluster configuration or the pods.

What I’ve implemented follows the official Kubernetes documentation, which outlines the correct process to ensure changes are properly reflected in the cluster.

Regarding your suggestion about limit, I agree that it could be a good option for scoping changes more precisely. However, I don’t believe that tags are strictly necessary in this context, as the main goal is to ensure the configuration changes are fully applied to the cluster.

Additionally, I tested this implementation by changing the value of kube_apiserver_node_port_range from 30000-32767 to 30001-32766. The results showed that with the command you mentioned, the changes were not fully reflected:

root@mistl-node-0:/etc/kubernetes# cat kubeadm-config.yaml | grep 3000
    value: "30001-32766"
root@mistl-node-0:/etc/kubernetes# kubectl get cm -n kube-system kubeadm-config -oyaml | grep 3000
        value: 30000-32767
root@mistl-node-0:/etc/kubernetes# kubectl describe pod -n kube-system kube-apiserver-mistl-node-0 | grep 3000
      --service-node-port-range=30000-32767
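
The three checks above can be folded into one comparison. The following is a minimal sketch, not part of the PR: the helper names `extract_port_range` and `check_drift` are hypothetical, and the sample inputs are the lines from the output above rather than live cluster data.

```shell
#!/usr/bin/env bash
# Sketch: detect drift between the rendered kubeadm config file, the
# kubeadm-config ConfigMap, and the running kube-apiserver pod.
# Helper names are hypothetical; inputs are passed in as plain text.

# Print the first "<low>-<high>" port range found on stdin (empty if none).
extract_port_range() {
  grep -Eo '[0-9]{4,5}-[0-9]{4,5}' | head -n 1
}

# $1: text from the config file, $2: text from the ConfigMap,
# $3: text from the pod description.
# Prints a one-line verdict and returns 1 when the values disagree.
check_drift() {
  local file_range cm_range pod_range
  file_range=$(printf '%s\n' "$1" | extract_port_range)
  cm_range=$(printf '%s\n' "$2" | extract_port_range)
  pod_range=$(printf '%s\n' "$3" | extract_port_range)
  if [ "$file_range" = "$cm_range" ] && [ "$cm_range" = "$pod_range" ]; then
    echo "in sync: $file_range"
  else
    echo "drift: file=$file_range configmap=$cm_range pod=$pod_range"
    return 1
  fi
}

# Sample values copied from the output above.
check_drift 'value: "30001-32766"' \
            'value: 30000-32767' \
            '--service-node-port-range=30000-32767' || true
# prints: drift: file=30001-32766 configmap=30000-32767 pod=30000-32767
```

On a control plane node, the three arguments could instead be captured from the same `cat`, `kubectl get cm` and `kubectl describe pod` commands shown above.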

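It’s also worth noting that the behavior of using the --config flag in kubeadm has changed. For additional context, you can refer to the following discussions:

* [Comment on PR #11352](https://github.com/kubernetes-sigs/kubespray/pull/11352#issuecomment-2210283864)
* [Comment on kubeadm issue #3084](https://github.com/kubernetes/kubeadm/issues/3084#issuecomment-2209300846)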

@yankay
Member

yankay commented Jan 10, 2025

/ok-to-test

@k8s-ci-robot added the ok-to-test label (Indicates a non-member PR verified by an org member that is safe to test.) and removed the needs-ok-to-test label (Indicates a PR that requires an org member to verify it is safe to test.) on Jan 10, 2025
@k8s-ci-robot
Contributor

@Hector295: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-kubespray-yamllint | c5055ea | link | true | /test pull-kubespray-yamllint |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.


@VannTen
Contributor

VannTen commented Jan 10, 2025

> The command you mentioned (ansible-playbook -l kube_control_plane,etcd upgrade-cluster.yml) does not actually perform the necessary updates to the ConfigMap or the values that should be applied to the control plane pods (kube-apiserver, kube-controller-manager, kube-scheduler). The only change it makes is to the kubeadm-config.yaml file, but this does not propagate to the cluster configuration or the pods.

Is that recent? Because it absolutely should reconfigure the control plane, and it does on older releases (for example, I did this yesterday on 1.25 to add OIDC parameters).
In that case, that is a regression, which we should fix.

> Regarding your suggestion about limit, I agree that it could be a good option for scoping changes more precisely. However, I don’t believe that tags are strictly necessary in this context, as the main goal is to ensure the configuration changes are fully applied to the cluster.

I agree that tags are not strictly necessary. I was working from the assumption that you meant to minimize cluster disruption by only changing the configuration, but if upgrade-cluster is no longer updating configuration, I see where you're coming from.

> It’s also worth noting that the behavior of using the --config flag in kubeadm has changed. For additional context, you can refer to the following discussions:
>
> * [Comment on PR #11352](https://github.com/kubernetes-sigs/kubespray/pull/11352#issuecomment-2210283864)
> * [Comment on kubeadm issue #3084](https://github.com/kubernetes/kubeadm/issues/3084#issuecomment-2209300846)

I was distantly aware of this issue, but I haven't had the time to focus on this yet.

I don't think a new playbook is the answer though, for several reasons:

  • the existing workflow uses upgrade-cluster, so this would be a new thing for users to be aware of.
  • this adds a non-trivial maintenance overhead

Is there a specific reason upgrade-cluster.yml can't be fixed instead?

@Hector295
Author

@VannTen the approach you described did work on 1.25, but it was not the correct approach according to kubeadm's design. The upgrade-cluster functionality was being used to reconfigure the control plane, even though its actual purpose is to manage cluster version upgrades.
Reconfiguration of the control plane should not be done as part of a Kubernetes upgrade, because upgrades include additional tasks that are not relevant to reconfiguration.
The official Kubernetes documentation suggests manually updating the static manifests of the control plane located in /etc/kubernetes/manifests/. This PR introduces the reconfigure-control-plane.yml playbook, ensuring that changes are correctly applied to the control plane pods.
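
For readers unfamiliar with the manual process, here is a dry-run sketch of one possible per-node sequence using kubeadm's `init phase` subcommands. It only prints the commands and executes nothing. It is an assumption about what such a reconfiguration might look like, not a description of what reconfigure-control-plane.yml runs, and the function name `plan_reconfigure` is hypothetical. The config path is the one shown in the test output earlier in this thread.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints kubeadm commands, executes none of them.
# Assumes the kubeadm config file has already been re-rendered with the
# new values. Not taken from the PR's playbook.

# $1: path to the kubeadm config file
# $2: component to regenerate: all, apiserver, controller-manager, scheduler
plan_reconfigure() {
  local config="$1" component="${2:-all}"
  case "$component" in
    all|apiserver|controller-manager|scheduler) ;;
    *) echo "unknown component: $component" >&2; return 2 ;;
  esac
  # Upload the ClusterConfiguration to the kubeadm-config ConfigMap.
  echo "kubeadm init phase upload-config kubeadm --config $config"
  # Rewrite the static pod manifest(s) under /etc/kubernetes/manifests;
  # the kubelet restarts a static pod when its manifest changes.
  echo "kubeadm init phase control-plane $component --config $config"
}

plan_reconfigure /etc/kubernetes/kubeadm-config.yaml apiserver
```

Printing rather than executing keeps the sketch safe to run anywhere; the printed lines would need review before being applied to a real control plane node.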

Successfully merging this pull request may close these issues.

upgrade: variables modified in kubeadm-config.yaml are not reflected in static manifests anymore
4 participants