WIP: write RFC about certificate handling
Initial commit, not done yet, no proofreading

Signed-off-by: Flavio Castelli <[email protected]>
flavio committed Aug 11, 2023
1 parent e92216b commit 307e72e
Showing 1 changed file with 199 additions and 0 deletions.
199 changes: 199 additions & 0 deletions rfc/0018-certificate-handling.md
@@ -0,0 +1,199 @@
| | |
| :----------- | :------------------------------ |
| Feature Name | Certificate Handling |
| Start Date | Aug 11 2023 |
| Category | [Category] |
| RFC PR | [fill this in after opening PR] |
| State | **ACCEPTED** |

# Summary

[summary]: #summary

This RFC summarizes how certificates are used inside of the Kubewarden stack
and proposes a way to manage them without having to resort to external
dependencies like CertManager.

# Motivation

The first goal of this RFC is to provide a picture of the current state of
certificate handling inside of Kubewarden: how many certificates are used,
by which components, which CA signed them, and how they have been generated.

The second goal is to come up with an architecture that removes the external
dependency on CertManager. The new solution should also take certificate
rotation into account.

## Examples / User Stories

> As a Kubernetes operator, I don't want to install CertManager
> in order to deploy Kubewarden.

> As a Kubernetes operator,
> I want Kubewarden certificates to be automatically rotated
> before they reach their expiration date.

# Detailed design

## Components that make use of TLS certificates

All the certificates used by the Kubewarden stack are internal to the cluster,
meaning they are not visible outside of Kubernetes. All of them are used to
secure the communication between the API Server and different HTTPS endpoints.

These are the HTTP servers managed by Kubewarden that require TLS termination:

- kubewarden-controller: this is a Validating and Mutating webhook endpoint that
  processes the Kubewarden CRDs (like `PolicyServer`, `ClusterAdmissionPolicy`
  and `AdmissionPolicy`)
- Policy Server: each Deployment has its own dedicated certificate

> **Note:** all these endpoints are either Validating or Mutating webhook
> endpoints for Kubernetes

Currently, as of Kubewarden 1.6, the certificate used by the kubewarden-controller
is generated by CertManager. The certificates used by the Policy Server
instances, on the other hand, are generated by our controller.

The controller generates a self-signed Certificate Authority (CA) that
is then used to sign all the Policy Server certificates.

## Webhook Configuration and Certificates

Both `ValidatingWebhookConfiguration` and `MutatingWebhookConfiguration`
resources have a
[`clientConfig`](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.27/#webhookclientconfig-v1-admissionregistration-k8s-io)
object inside of their definition.
The `clientConfig.caBundle` field holds the bundle of CA certificates required to
verify the TLS certificate presented by the webhook endpoint:

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
webhooks:
- name: my-webhook.example.com
  clientConfig:
    caBundle: <CA_BUNDLE>
    service:
      namespace: my-service-namespace
      name: my-service-name
      path: /my-path
      port: 1234
```

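
The `<CA_BUNDLE>` placeholder stands for a PEM bundle that is base64-encoded once the object is serialized, because the field is a byte slice in the Kubernetes API types. The following is a minimal sketch of that encoding using only the Go standard library. The `clientConfig` struct is a hypothetical, trimmed-down stand-in for the real `WebhookClientConfig` type, and the certificate payloads are placeholder bytes rather than real certificates.

```go
package main

import (
	"encoding/json"
	"encoding/pem"
	"fmt"
)

// clientConfig is a hypothetical stand-in for the Kubernetes
// WebhookClientConfig type; only the caBundle field is modelled.
type clientConfig struct {
	CABundle []byte `json:"caBundle"`
}

// pemBundle concatenates DER-encoded certificates into a single PEM bundle.
func pemBundle(ders ...[]byte) []byte {
	var bundle []byte
	for _, der := range ders {
		block := &pem.Block{Type: "CERTIFICATE", Bytes: der}
		bundle = append(bundle, pem.EncodeToMemory(block)...)
	}
	return bundle
}

// countCerts returns the number of CERTIFICATE blocks found in a PEM bundle.
func countCerts(bundle []byte) int {
	count := 0
	for {
		block, rest := pem.Decode(bundle)
		if block == nil {
			return count
		}
		if block.Type == "CERTIFICATE" {
			count++
		}
		bundle = rest
	}
}

// roundTrip serializes the bundle the way it would appear in a JSON manifest
// and decodes it again, returning the document and the recovered bytes.
func roundTrip(bundle []byte) (string, []byte, error) {
	doc, err := json.Marshal(clientConfig{CABundle: bundle})
	if err != nil {
		return "", nil, err
	}
	var decoded clientConfig
	if err := json.Unmarshal(doc, &decoded); err != nil {
		return "", nil, err
	}
	return string(doc), decoded.CABundle, nil
}

func main() {
	// Placeholder payloads, not real DER-encoded certificates.
	bundle := pemBundle([]byte("first CA"), []byte("second CA"))
	doc, decoded, err := roundTrip(bundle)
	if err != nil {
		panic(err)
	}
	fmt.Println(doc) // caBundle shows up as one base64 string
	fmt.Println("certificates in bundle:", countCerts(decoded))
}
```

A bundle can carry more than one CA certificate, which is why the field is called a bundle rather than a certificate.
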
We have one `MutatingWebhookConfiguration` used by the kubewarden-controller. Its
`clientConfig.caBundle` references the CA used by CertManager.

Each `ClusterAdmissionPolicy` and `AdmissionPolicy` has a dedicated `(Validating|Mutating)WebhookConfiguration`
object defined. For all of them, the `clientConfig.caBundle` contains the certificate of
the CA generated by our controller.

## Proposed Design

We would like to get rid of CertManager. To do that we need to change how certificates are
managed for the `kubewarden-controller`.
Going forward, the controller will generate the certificate used by its own webhook endpoint.
Moreover, this certificate is going to be signed by the same internal CA the controller already creates
for the Policy Server certificates.

```mermaid
flowchart TD
A[KW Internal CA]
A -->B(KW controller)
A -->C(Policy Server #1)
A -->D(Policy Server #2)
```

## Reconciliation Loops

This section aims to cover the different scenarios. For each one of them we start by defining an event
and then outline the different actions that have to be done.

### Kubewarden Fresh Install

The Kubewarden stack has just been installed. The following actions have to be done:

- Generate Certificate Authority
- Generate certificate for `kubewarden-controller` and sign it with our CA
- Start the Kubewarden controller HTTPS server using this certificate
- Register the webhook against the Kubernetes API, use our CA inside of `clientConfig.caBundle`
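
The first two steps, plus the check the API server performs against `clientConfig.caBundle`, can be sketched with the Go standard library. This is an illustration under stated assumptions, not the controller's actual implementation: the helper names (`pair`, `issue`, `must`, `accepted`), the CA name, the ECDSA P-256 keys and the one-year lifetimes are all hypothetical. The host name is built from the Service name and namespace of the YAML example earlier in this document.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// pair bundles a certificate with its private key.
type pair struct {
	cert *x509.Certificate
	key  *ecdsa.PrivateKey
}

// issue creates a self-signed CA when signer is nil, otherwise a TLS server
// certificate for DNS name `name` signed by signer. Lifetimes are placeholders.
func issue(name string, signer *pair) (*pair, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	now := time.Now()
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(now.UnixNano()),
		Subject:      pkix.Name{CommonName: name},
		NotBefore:    now.Add(-time.Minute),
		NotAfter:     now.AddDate(1, 0, 0),
	}
	parent, parentKey := tmpl, key
	if signer == nil {
		tmpl.IsCA, tmpl.BasicConstraintsValid = true, true
		tmpl.KeyUsage = x509.KeyUsageCertSign
	} else {
		tmpl.DNSNames = []string{name}
		tmpl.KeyUsage = x509.KeyUsageDigitalSignature
		tmpl.ExtKeyUsage = []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth}
		parent, parentKey = signer.cert, signer.key
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, parent, &key.PublicKey, parentKey)
	if err != nil {
		return nil, err
	}
	cert, err := x509.ParseCertificate(der)
	if err != nil {
		return nil, err
	}
	return &pair{cert: cert, key: key}, nil
}

// must panics on error; good enough for a sketch.
func must(p *pair, err error) *pair {
	if err != nil {
		panic(err)
	}
	return p
}

// accepted reports whether a caBundle made of the given CAs verifies the
// server certificate for dnsName.
func accepted(server *pair, dnsName string, cas ...*pair) bool {
	roots := x509.NewCertPool()
	for _, ca := range cas {
		roots.AddCert(ca.cert)
	}
	_, err := server.cert.Verify(x509.VerifyOptions{Roots: roots, DNSName: dnsName})
	return err == nil
}

func main() {
	const host = "my-service-name.my-service-namespace.svc"
	ca := must(issue("kubewarden-internal-ca", nil)) // hypothetical CA name
	controller := must(issue(host, ca))
	unrelated := must(issue("some-other-ca", nil))

	fmt.Println("our CA in caBundle:", accepted(controller, host, ca))
	fmt.Println("unrelated CA in caBundle:", accepted(controller, host, unrelated))
}
```

The sketch stops before the last two steps: the returned key and certificate would be handed to the HTTPS server, and the CA certificate would be PEM-encoded into `clientConfig.caBundle`.
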

### New Policy Server Defined

A new `PolicyServer` is defined. The following actions have to be done:

- Generate new certificate, sign it with our CA
- Create Policy Server Deployment, ensure the certificate is mounted inside of the Pods

### New `AdmissionPolicy`/`ClusterAdmissionPolicy` is defined

A new policy is defined inside of the cluster. The following action has to be done:

- Register the webhook against the Kubernetes API, use our CA inside of `clientConfig.caBundle`

> **Note:** it doesn't matter which Policy Server is going to host the policy. All the Policy Server
> instances use a certificate that is signed by our CA. Hence we always put our CA
> inside of the `clientConfig.caBundle`.

### Policy Server Certificate Renewed

From time to time we need to renew the certificate issued to a Policy Server instance (to avoid
things like certificate expiration errors).

To renew a Policy Server certificate the following actions have to be done:

- Generate new certificate, sign it with our CA
- Force a rollout of the Policy Server Deployment. This will ensure the HTTPS server uses the new certificate

> **Note:** there's no need to touch the webhook configuration objects related to the policies
> hosted by the Policy Server. The new certificate is still signed by the same CA, hence nothing has
> to be changed.
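
This RFC does not yet say when a renewal is triggered. A minimal sketch of one possible rule, assuming a certificate is renewed once it enters a fixed window before its `NotAfter` date, is shown below. The `needsRenewal` function, the `day` constant, the 30-day window and the `exampleCert` stand-in (which only has its validity period filled in) are hypothetical and not part of the proposal.

```go
package main

import (
	"crypto/x509"
	"fmt"
	"time"
)

const day = 24 * time.Hour

// needsRenewal reports whether cert is inside its renewal window at time now.
// The window is a hypothetical tuning knob; the RFC does not define one.
func needsRenewal(cert *x509.Certificate, now time.Time, window time.Duration) bool {
	return !now.Before(cert.NotAfter.Add(-window))
}

// exampleCert returns a stand-in certificate valid for 365 days.
// Only the validity period is populated.
func exampleCert() *x509.Certificate {
	notBefore := time.Date(2023, time.August, 11, 0, 0, 0, 0, time.UTC)
	return &x509.Certificate{NotBefore: notBefore, NotAfter: notBefore.Add(365 * day)}
}

func main() {
	cert := exampleCert()
	for _, daysLeft := range []int{60, 30, 1} {
		now := cert.NotAfter.Add(-time.Duration(daysLeft) * day)
		fmt.Printf("%d days left: renew=%v\n", daysLeft, needsRenewal(cert, now, 30*day))
	}
}
```

A reconciliation loop could evaluate such a check for every Policy Server certificate and, when it returns true, perform the two actions listed above.
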

### Internal CA changes

The internal CA managed by Kubewarden might be recreated (for example, to avoid its expiration). This is the
most disruptive event that might happen, especially once some policies are deployed.

Once the new CA is generated, the following actions have to be performed:

- For each `PolicyServer` defined:
  - Generate a new certificate, sign it with the new CA
  - Force a rollout of the PolicyServer Deployment
- Kubewarden Controller:
  - Generate a new certificate, sign it with the new CA
  - Ensure the HTTPS server uses the new certificate
- For each webhook configuration managed by Kubewarden (meaning all the policies deployed, plus the kubewarden-controller):
  - Update the `clientConfig.caBundle`: ensure it contains the new CA

This change is massive and could lead to connection errors for as long as any PolicyServer is using a certificate
signed by a CA that is not the one listed inside of the `clientConfig.caBundle`.
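
The mismatch described in the paragraph above can be reproduced with the Go standard library. The sketch below creates an old and a new CA, issues one server certificate from each, and checks which combinations of `caBundle` content and server certificate verify. All helper names (`pair`, `issue`, `must`, `accepted`), the CA names and the Service host name are hypothetical. The last check only shows that a bundle holding both CAs accepts either certificate; the RFC does not prescribe this as a mitigation.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// pair bundles a certificate with its private key.
type pair struct {
	cert *x509.Certificate
	key  *ecdsa.PrivateKey
}

// issue creates a self-signed CA when signer is nil, otherwise a TLS server
// certificate for DNS name `name` signed by signer. Lifetimes are placeholders.
func issue(name string, signer *pair) (*pair, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	now := time.Now()
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(now.UnixNano()),
		Subject:      pkix.Name{CommonName: name},
		NotBefore:    now.Add(-time.Minute),
		NotAfter:     now.AddDate(1, 0, 0),
	}
	parent, parentKey := tmpl, key
	if signer == nil {
		tmpl.IsCA, tmpl.BasicConstraintsValid = true, true
		tmpl.KeyUsage = x509.KeyUsageCertSign
	} else {
		tmpl.DNSNames = []string{name}
		tmpl.KeyUsage = x509.KeyUsageDigitalSignature
		tmpl.ExtKeyUsage = []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth}
		parent, parentKey = signer.cert, signer.key
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, parent, &key.PublicKey, parentKey)
	if err != nil {
		return nil, err
	}
	cert, err := x509.ParseCertificate(der)
	if err != nil {
		return nil, err
	}
	return &pair{cert: cert, key: key}, nil
}

// must panics on error; good enough for a sketch.
func must(p *pair, err error) *pair {
	if err != nil {
		panic(err)
	}
	return p
}

// accepted reports whether a caBundle made of the given CAs verifies the
// server certificate for dnsName.
func accepted(server *pair, dnsName string, cas ...*pair) bool {
	roots := x509.NewCertPool()
	for _, ca := range cas {
		roots.AddCert(ca.cert)
	}
	_, err := server.cert.Verify(x509.VerifyOptions{Roots: roots, DNSName: dnsName})
	return err == nil
}

func main() {
	const host = "policy-server-default.kubewarden.svc" // hypothetical Service name
	oldCA := must(issue("internal-ca-old", nil))
	nextCA := must(issue("internal-ca-new", nil))
	stale := must(issue(host, oldCA))  // Policy Server not rolled out yet
	fresh := must(issue(host, nextCA)) // Policy Server already rolled out

	fmt.Println("caBundle=new, stale server:", accepted(stale, host, nextCA))
	fmt.Println("caBundle=new, fresh server:", accepted(fresh, host, nextCA))
	fmt.Println("caBundle=old, fresh server:", accepted(fresh, host, oldCA))
	fmt.Println("caBundle=old+new, stale server:", accepted(stale, host, oldCA, nextCA))
}
```

Whichever order the rollout and the webhook configuration update happen in, there is a combination that fails verification as long as the bundle holds a single CA, which is what makes this event disruptive.
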

# Drawbacks

[drawbacks]: #drawbacks

<!---
Why should we **not** do this?

* obscure corner cases
* will it impact performance?
* what other parts of the product will be affected?
* will the solution be hard to maintain in the future?
--->

# Alternatives

[alternatives]: #alternatives

<!---
- What other designs/options have been considered?
- What is the impact of not doing this?
--->

# Unresolved questions

[unresolved]: #unresolved-questions

<!---
- What are the unknowns?
- What can happen if Murphy's law holds true?
--->
