diff --git a/docs/src/capi/explanation/index.md b/docs/src/capi/explanation/index.md index 155fe19bd..54e851440 100644 --- a/docs/src/capi/explanation/index.md +++ b/docs/src/capi/explanation/index.md @@ -15,9 +15,9 @@ Overview about installation-methods.md +capi-ck8s.md ingress load-balancer -capi-ck8s.md in-place-upgrades.md security ``` diff --git a/docs/src/capi/howto/custom-ck8s.md b/docs/src/capi/howto/custom-ck8s.md index 99bd26f56..437b19032 100644 --- a/docs/src/capi/howto/custom-ck8s.md +++ b/docs/src/capi/howto/custom-ck8s.md @@ -1,15 +1,15 @@ -# Install custom {{product}} on machines +# How to install custom {{product}} on machines -By default, the `version` field in the machine specifications will determine +By default, the `version` field in the machine specifications will determine which {{product}} **version** is downloaded from the `stable` risk level. -This guide walks you through the process of installing {{product}} +This guide walks you through the process of installing {{product}} with a specific **risk level**, **revision**, or from a **local path**. ## Prerequisites To follow this guide, you will need: -- A Kubernetes management cluster with Cluster API and providers installed +- A Kubernetes management cluster with Cluster API and providers installed and configured. - A generated cluster spec manifest @@ -20,8 +20,8 @@ This guide will call the generated cluster spec manifest `cluster.yaml`. ## Using the configuration specification -{{product}} can be installed on machines using a specific `channel`, -`revision` or `localPath` by specifying the respective field in the spec +{{product}} can be installed on machines using a specific `channel`, +`revision` or `localPath` by specifying the respective field in the spec of the machine. ```yaml @@ -38,14 +38,14 @@ spec: localPath: /path/to/snap/on/machine ``` -Note that for the `localPath` to work the snap must be available on the +Note that for the `localPath` to work the snap must be available on the machine at the specified path on boot. ## Overwrite the existing `install.sh` script Running the `install.sh` script is one of the steps that `cloud-init` performs -on machines and can be overwritten to install a custom {{product}} -snap. This can be done by adding a `files` field to the +on machines and can be overwritten to install a custom {{product}} +snap. This can be done by adding a `files` field to the `spec` of the machine with a specific `path`. ```yaml @@ -68,8 +68,8 @@ Now the new control plane nodes that are created using this manifest will have the `1.31-classic/candidate` {{product}} snap installed on them! ```{note} -[Use the configuration specification](#using-config-spec), -if you're only interested in installing a specific channel, revision, or +[Use the configuration specification](#using-config-spec), +if you're only interested in installing a specific channel, revision, or form the local path. 
``` diff --git a/docs/src/capi/howto/external-etcd.md b/docs/src/capi/howto/external-etcd.md index 8e25f0364..d856d1305 100644 --- a/docs/src/capi/howto/external-etcd.md +++ b/docs/src/capi/howto/external-etcd.md @@ -1,4 +1,4 @@ -# Use external etcd with Cluster API +# How to use external etcd with Cluster API To replace the built-in datastore with an external etcd to manage the Kubernetes state in the Cluster API (CAPI) workload cluster follow diff --git a/docs/src/capi/howto/in-place-upgrades.md b/docs/src/capi/howto/in-place-upgrades.md index 7b20c9dab..f545f4058 100644 --- a/docs/src/capi/howto/in-place-upgrades.md +++ b/docs/src/capi/howto/in-place-upgrades.md @@ -1,4 +1,4 @@ -# Perform an in-place upgrade for a machine +# How to perform an in-place upgrade for a machine This guide walks you through the steps to perform an in-place upgrade for a Cluster API managed machine. @@ -35,7 +35,7 @@ kubectl --kubeconfig c1-kubeconfig.yaml get nodes -o wide ## Annotate the machine -In this first step, annotate the Machine resource with +In this first step, annotate the Machine resource with the in-place upgrade annotation. In this example, the machine is called `c1-control-plane-xyzbw`. @@ -49,7 +49,7 @@ kubectl annotate machine c1-control-plane-xyzbw "v1beta2.k8sd.io/in-place-upgrad e.g. `channel=1.30-classic/stable` * `revision=` which refreshes k8s to the given revision. e.g. `revision=123` -* `localPath=` which refreshes k8s with the snap file from +* `localPath=` which refreshes k8s with the snap file from the given absolute path. e.g. `localPath=full/path/to/k8s.snap` Please refer to the [ClusterAPI Annotations Reference][annotations-reference] diff --git a/docs/src/capi/howto/index.md b/docs/src/capi/howto/index.md index 3bf0cca3a..bf4926bee 100644 --- a/docs/src/capi/howto/index.md +++ b/docs/src/capi/howto/index.md @@ -14,13 +14,13 @@ Overview :glob: :titlesonly: +Install custom Canonical Kubernetes Use external etcd -rollout-upgrades -in-place-upgrades -upgrade-providers -migrate-management -custom-ck8s -refresh-certs +Upgrade the Kubernetes version +Perform an in-place upgrade +Upgrade the providers of a management cluster +Migrate the management cluster +Refresh workload cluster certificates ``` --- diff --git a/docs/src/capi/howto/migrate-management.md b/docs/src/capi/howto/migrate-management.md index e7c4113c1..b6a28f83f 100644 --- a/docs/src/capi/howto/migrate-management.md +++ b/docs/src/capi/howto/migrate-management.md @@ -1,18 +1,18 @@ -# Migrate the management cluster +# How to migrate the management cluster -Management cluster migration allows admins to move the management cluster +Management cluster migration allows admins to move the management cluster to a different substrate or perform maintenance tasks without disruptions. This guide walks you through the migration of a management cluster. ## Prerequisites -- A {{product}} CAPI management cluster with Cluster API and providers +- A {{product}} CAPI management cluster with Cluster API and providers installed and configured. ## Configure the target cluster -Before migrating a cluster, ensure that both the target and source management -clusters run the same version of providers (infrastructure, bootstrap, +Before migrating a cluster, ensure that both the target and source management +clusters run the same version of providers (infrastructure, bootstrap, control plane). 
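A quick way to compare provider versions on the source and target management clusters is to inspect the provider inventory that `clusterctl` records in each cluster. This is a minimal sketch, assuming `clusterctl`-managed providers and that the target cluster's kubeconfig is available as `target-kubeconfig.yaml` (both the file name and the resource short form are illustrative):

```bash
# Provider inventory on the source (current) management cluster
kubectl get providers.clusterctl.cluster.x-k8s.io -A

# Provider inventory on the target management cluster
kubectl --kubeconfig target-kubeconfig.yaml get providers.clusterctl.cluster.x-k8s.io -A
```

The reported provider versions should match on both clusters before continuing with the migration.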
Use `clusterctl init` to target the cluster:: ``` diff --git a/docs/src/capi/howto/refresh-certs.md b/docs/src/capi/howto/refresh-certs.md index f51b244fc..7b5f0e46a 100644 --- a/docs/src/capi/howto/refresh-certs.md +++ b/docs/src/capi/howto/refresh-certs.md @@ -1,4 +1,4 @@ -# Refreshing workload cluster certificates +# How to refresh workload cluster certificates This how-to will walk you through the steps to refresh the certificates for both control plane and worker nodes in your {{product}} Cluster API cluster. diff --git a/docs/src/capi/howto/rollout-upgrades.md b/docs/src/capi/howto/rollout-upgrades.md index 8fdc0b679..5a4648420 100644 --- a/docs/src/capi/howto/rollout-upgrades.md +++ b/docs/src/capi/howto/rollout-upgrades.md @@ -1,4 +1,4 @@ -# Upgrade the Kubernetes version of a cluster +# How to upgrade the Kubernetes version of a cluster This guide walks you through the steps to rollout an upgrade for a Cluster API managed Kubernetes cluster. The upgrade process includes updating @@ -21,7 +21,7 @@ This guide refers to the workload cluster as `c1` and its kubeconfig as `c1-kubeconfig.yaml`. ```{note} -Rollout upgrades are recommended for HA clusters. For non-HA clusters, please +Rollout upgrades are recommended for HA clusters. For non-HA clusters, please refer to the [in-place upgrade guide]. ``` @@ -127,4 +127,3 @@ kubectl get machines -A [getting-started]: ../tutorial/getting-started.md [in-place upgrade guide]: ./in-place-upgrades.md -``` diff --git a/docs/src/capi/howto/upgrade-providers.md b/docs/src/capi/howto/upgrade-providers.md index 03ed51afd..6b221dc0d 100644 --- a/docs/src/capi/howto/upgrade-providers.md +++ b/docs/src/capi/howto/upgrade-providers.md @@ -1,23 +1,23 @@ -# Upgrading the providers of a management cluster +# How to upgrade the providers of a management cluster -This guide will walk you through the process of upgrading the +This guide will walk you through the process of upgrading the providers of a management cluster. ## Prerequisites -- A {{product}} CAPI management cluster with installed and +- A {{product}} CAPI management cluster with installed and configured providers. ## Check for updates -Check whether there are any new versions of your running +Check whether there are any new versions of your running providers: ``` clusterctl upgrade plan ``` -The output shows the existing version of each provider as well +The output shows the existing version of each provider as well as the next available version: ```text diff --git a/docs/src/capi/reference/index.md b/docs/src/capi/reference/index.md index af706111e..773a7cdb7 100644 --- a/docs/src/capi/reference/index.md +++ b/docs/src/capi/reference/index.md @@ -12,7 +12,7 @@ Overview :titlesonly: releases annotations -Ports and Services +Ports and services Community configs diff --git a/docs/src/capi/tutorial/index.md b/docs/src/capi/tutorial/index.md index 63142d5f2..acbee5f49 100644 --- a/docs/src/capi/tutorial/index.md +++ b/docs/src/capi/tutorial/index.md @@ -11,7 +11,7 @@ Overview ```{toctree} :glob: :titlesonly: -getting-started +Getting started ``` --- diff --git a/docs/src/charm/explanation/index.md b/docs/src/charm/explanation/index.md index 2547997a1..0123e9641 100644 --- a/docs/src/charm/explanation/index.md +++ b/docs/src/charm/explanation/index.md @@ -17,7 +17,7 @@ channels ingress load-balancer security -Upgrading +Upgrades ``` This page covers both general and charm-related topics. 
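To act on the `clusterctl upgrade plan` output shown earlier, the upgrade is typically applied per API contract. A hedged sketch, assuming the plan reports the `v1beta1` contract (substitute whichever contract `clusterctl upgrade plan` actually lists for your providers):

```bash
# Upgrade all installed providers to the latest versions for the given API contract
clusterctl upgrade apply --contract v1beta1
```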
diff --git a/docs/src/charm/howto/configure-cluster.md b/docs/src/charm/howto/configure-cluster.md index 3824c8651..e0c28251b 100644 --- a/docs/src/charm/howto/configure-cluster.md +++ b/docs/src/charm/howto/configure-cluster.md @@ -1,4 +1,4 @@ -# Configure a {{ product }} cluster using Juju +# How to configure a {{ product }} cluster using Juju This guide provides instructions for configuring a {{ product }} cluster using Juju. The DNS feature is used as an example to demonstrate the various diff --git a/docs/src/charm/howto/cos-lite.md b/docs/src/charm/howto/cos-lite.md index 91ccebb7f..060fccf7d 100644 --- a/docs/src/charm/howto/cos-lite.md +++ b/docs/src/charm/howto/cos-lite.md @@ -1,4 +1,4 @@ -# Integrating with COS Lite +# How to integrate with COS Lite It is often advisable to have a monitoring solution which will run whether the cluster itself is running or not. It may also be useful to integrate monitoring diff --git a/docs/src/charm/howto/custom-registry.md b/docs/src/charm/howto/custom-registry.md index c9d39b047..49d0e3deb 100644 --- a/docs/src/charm/howto/custom-registry.md +++ b/docs/src/charm/howto/custom-registry.md @@ -1,4 +1,4 @@ -# Configure a custom registry +# How to configure a custom registry The `k8s` charm can be configured to use a custom container registry for its container images. This is particularly useful if you have a private registry or @@ -12,7 +12,7 @@ charm to pull images from a custom registry. - Access to a custom container registry from the cluster (e.g., docker registry or Harbor). -## Configure the Charm +## Configure the charm To configure the charm to use a custom registry, you need to set the `containerd_custom_registries` configuration option. This options allows @@ -43,7 +43,7 @@ progress by running: juju status --watch 2s ``` -## Verify the Configuration +## Verify the configuration Once the charm is configured and active, verify that the custom registry is configured correctly by creating a new workload and ensuring that the images diff --git a/docs/src/charm/howto/etcd.md b/docs/src/charm/howto/etcd.md index 815de55ba..3405f81fd 100644 --- a/docs/src/charm/howto/etcd.md +++ b/docs/src/charm/howto/etcd.md @@ -17,9 +17,9 @@ post-deployment. Planning for your datastore needs ahead of time is crucial, particularly if you opt for an external datastore like **etcd**. ``` -## Preparing the Deployment +## Prepare the Deployment -1. **Creating the Deployment Model**: +1. **Create the Deployment model**: Begin by creating a Juju model specifically for your {{product}} cluster deployment. @@ -27,7 +27,7 @@ crucial, particularly if you opt for an external datastore like **etcd**. juju add-model my-cluster ``` -2. **Deploying Certificate Authority**: +2. **Deploy the Certificate Authority**: etcd requires a secure means of communication between its components. Therefore, we require a certificate authority such as [EasyRSA][easyrsa-charm] or [Vault][vault-charm]. Check the respective charm documentation for detailed @@ -38,9 +38,9 @@ crucial, particularly if you opt for an external datastore like **etcd**. juju deploy easyrsa ``` -## Deploying etcd +## Deploy etcd -- **Single Node Deployment**: +- **Single node Deployment**: - To deploy a basic etcd instance on a single node, use the command: ```bash @@ -50,7 +50,7 @@ crucial, particularly if you opt for an external datastore like **etcd**. This setup is straightforward but not recommended for production environments due to a lack of high availability. 
-- **High Availability Setup**: +- **High Availability setup**: - For environments where high availability is crucial, deploy etcd across at least three nodes: @@ -61,7 +61,7 @@ crucial, particularly if you opt for an external datastore like **etcd**. This ensures that your etcd cluster remains available even if one node fails. -## Integrating etcd with EasyRSA +## Integrate etcd with EasyRSA Now you have to integrate etcd with your certificate authority. This will issue the required certificates for secure communication between etcd and your @@ -71,7 +71,7 @@ the required certificates for secure communication between etcd and your juju integrate etcd easyrsa ``` -## Deploying {{product}} +## Deploy {{product}} Deploy the control plane units of {{product}} with the command: @@ -88,7 +88,7 @@ Remember to run `juju expose k8s`. This will open the required ports to reach your cluster from outside. ``` -## Integrating {{product}} with etcd +## Integrate {{product}} with etcd Now that we have both the etcd datastore deployed alongside our Canonical Kubernetes cluster, it is time to integrate our cluster with our etcd datastore. diff --git a/docs/src/charm/howto/index.md b/docs/src/charm/howto/index.md index c8de33704..66db80aa9 100644 --- a/docs/src/charm/howto/index.md +++ b/docs/src/charm/howto/index.md @@ -21,12 +21,12 @@ Integrate with etcd Integrate with ceph-csi Integrate with COS Lite Configure proxy settings -custom-registry +Configure a custom registry Upgrade minor version Upgrade patch version Validate the cluster Troubleshooting -contribute +Contribute to Canonical Kubernetes ``` --- diff --git a/docs/src/charm/howto/install/charm.md b/docs/src/charm/howto/install/charm.md index a1d7ec449..d2367632c 100644 --- a/docs/src/charm/howto/install/charm.md +++ b/docs/src/charm/howto/install/charm.md @@ -1,4 +1,4 @@ -# Install {{product}} from a charm +# How to install {{product}} from a charm {{product}} is packaged as a [charm], available from Charmhub for all supported platforms. diff --git a/docs/src/charm/howto/install/custom-workers.md b/docs/src/charm/howto/install/custom-workers.md index fd17b9a9b..a6126f677 100644 --- a/docs/src/charm/howto/install/custom-workers.md +++ b/docs/src/charm/howto/install/custom-workers.md @@ -1,4 +1,4 @@ -# Adding worker nodes with custom configurations +# How to add worker nodes with custom configurations This guide will walk you through how to deploy multiple `k8s-worker` applications with different configurations, to create node groups with specific @@ -7,6 +7,7 @@ capabilities or requirements. ## Prerequisites This guide assumes the following: + - A working Kubernetes cluster deployed with the `k8s` charm ## Example worker configuration @@ -24,6 +25,7 @@ your worker nodes. ``` 1. Workers for memory-intensive workloads (`worker-memory-config.yaml`): + ```yaml memory-workers: bootstrap-node-taints: "workload=memory:NoSchedule" @@ -31,6 +33,7 @@ memory-workers: ``` 2. Workers for GPU workloads (`worker-gpu-config.yaml`): + ```yaml gpu-workers: bootstrap-node-taints: "accelerator=nvidia:NoSchedule" diff --git a/docs/src/charm/howto/install/index.md b/docs/src/charm/howto/install/index.md index 67457520b..e5e50cd39 100644 --- a/docs/src/charm/howto/install/index.md +++ b/docs/src/charm/howto/install/index.md @@ -1,4 +1,4 @@ -# Installing {{product}} +# Install {{product}} ```{toctree} :hidden: @@ -12,9 +12,9 @@ the current How-to guides below. 
:glob: :titlesonly: -Install from a charm -Install in LXD -Install with Terraform -Install with a custom configuration -Install with custom worker configuration +... from a charm +... in LXD +... with Terraform +... with a custom configuration +... with custom worker configuration ``` diff --git a/docs/src/charm/howto/install/install-custom.md b/docs/src/charm/howto/install/install-custom.md index 2cdd69746..f0fccdaef 100644 --- a/docs/src/charm/howto/install/install-custom.md +++ b/docs/src/charm/howto/install/install-custom.md @@ -1,4 +1,4 @@ -# Installing with custom configuration +# How to install with custom configuration This guide will walk you through deploying {{product}} using Juju with custom configuration options. diff --git a/docs/src/charm/howto/install/install-lxd.md b/docs/src/charm/howto/install/install-lxd.md index 524260161..18b3deafb 100644 --- a/docs/src/charm/howto/install/install-lxd.md +++ b/docs/src/charm/howto/install/install-lxd.md @@ -1,4 +1,4 @@ -# Installing to localhost/LXD +# How to install to localhost/LXD The main [install instructions][install] cover most situations for installing {{product}} using a charm. However, there are two scenarios which diff --git a/docs/src/charm/howto/install/install-terraform.md b/docs/src/charm/howto/install/install-terraform.md index 88b5bff58..93ff56ca1 100644 --- a/docs/src/charm/howto/install/install-terraform.md +++ b/docs/src/charm/howto/install/install-terraform.md @@ -1,13 +1,13 @@ -# Installing with Terraform +# How to install with Terraform This guide walks you through the process of installing {{ product }} using -the [Terraform Juju Provider][juju-provider-tf]. +the [Terraform Juju Provider][juju-provider-tf]. ## Prerequisites This guide requires the following: -- A Juju controller and model +- A Juju controller and model - The Terraform cli, which can be installed via the [snap store][terraform] @@ -17,7 +17,7 @@ As a first step, authenticate the Terraform Juju Provider with the Juju controller. Choose one of the options outlined in the [provider documentation][auth]. -## Terraform Module creation +## Terraform module creation The Terraform deployment is done using a root module that specifies the Juju model to deploy the submodules to. The root module also references @@ -159,7 +159,7 @@ terraform plan terraform apply ``` -```{note} +```{note} Make sure the deployment [channel] is set with:  export TF_VAR_channel={{channel}} @@ -178,4 +178,3 @@ juju status --watch 5s [auth]: https://registry.terraform.io/providers/juju/juju/latest/docs#authentication [channel]: ../../explanation/channels.md [terraform]: https://snapcraft.io/terraform - diff --git a/docs/src/charm/howto/openstack.md b/docs/src/charm/howto/openstack.md index 7c4a84b97..a54244c64 100644 --- a/docs/src/charm/howto/openstack.md +++ b/docs/src/charm/howto/openstack.md @@ -1,4 +1,4 @@ -# Integrating with OpenStack +# How to integrate with OpenStack This guide explains how to integrate {{product}} with the OpenStack cloud platform. 
The `openstack-integrator` charm simplifies working with {{product}} @@ -49,7 +49,7 @@ relations: - [cinder-csi:openstack, openstack-integrator:clients] ``` -### Deploying the overlay template +### Deploy the overlay template Deploy the {{product}} bundle on OpenStack using the modified overlay: diff --git a/docs/src/charm/howto/proxy.md b/docs/src/charm/howto/proxy.md index 696ee59bd..db2e02671 100644 --- a/docs/src/charm/howto/proxy.md +++ b/docs/src/charm/howto/proxy.md @@ -1,10 +1,10 @@ -# Configuring proxy settings for K8s +# How to configure proxy settings for K8s {{product}} packages a number of utilities (for example curl, helm) which need to fetch resources they expect to find on the internet. In a constrained network environment, such access is usually controlled through proxies. -## Adding proxy configuration for the k8s charms +## Add proxy configuration for the k8s charms For the charm deployments of {{product}}, Juju manages proxy configuration through the [Juju model]. diff --git a/docs/src/charm/howto/troubleshooting.md b/docs/src/charm/howto/troubleshooting.md index 509ab0cb6..02feffd82 100644 --- a/docs/src/charm/howto/troubleshooting.md +++ b/docs/src/charm/howto/troubleshooting.md @@ -1,10 +1,16 @@ # How to troubleshoot {{product}} -Identifying issues in a Kubernetes cluster can be difficult, especially to new users. With {{product}} we aim to make deploying and managing your cluster as easy as possible. This how-to guide will walk you through the steps to troubleshoot your {{product}} cluster. +Identifying issues in a Kubernetes cluster can be difficult, especially to new +users. With {{product}} we aim to make deploying and managing your cluster as +easy as possible. This how-to guide will walk you through the steps to +troubleshoot your {{product}} cluster. ## Common issues -Maybe your issue has already been solved? Check out the [troubleshooting reference][charm-troubleshooting-reference] page to see a list of common issues and their solutions. Otherwise continue with this guide to help troubleshoot your {{product}} cluster. +Maybe your issue has already been solved? Check out the +[troubleshooting reference][charm-troubleshooting-reference] page to see a list +of common issues and their solutions. Otherwise continue with this guide to +help troubleshoot your {{product}} cluster. ## Check the cluster status @@ -15,6 +21,7 @@ juju status ``` You should see a command output similar to the following: + ``` Model Controller Cloud/Region Version SLA Timestamp k8s-testing localhost-localhost localhost/localhost 3.6.1 unsupported 09:06:50Z @@ -31,14 +38,20 @@ Machine State Address Inst id Base AZ Message 0 started 10.94.106.136 juju-380ff2-0 ubuntu@24.04 Running 1 started 10.94.106.154 juju-380ff2-1 ubuntu@24.04 Running ``` + Interpreting the Output: + - The `Workload` column shows the status of a given service. -- The `Message` section details the health of a given service in the cluster. +- The `Message` section details the health of a given service in the cluster. - The `Agent` column reflects any activity of the Juju agent. -During deployment and maintenance the workload status will reflect the node's activity. An example workload may display `maintenance` along with the message details: `Ensuring snap installation`. +During deployment and maintenance the workload status will reflect the node's +activity. An example workload may display `maintenance` along with the message +details: `Ensuring snap installation`. 
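When the tabular output is hard to scan on larger models, the same workload and agent status fields can be read in structured form. A minimal sketch, assuming the control plane application is named `k8s`:

```bash
# Show only the k8s application, including full workload-status and juju-status details
juju status k8s --format=yaml
```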
-During normal cluster operation the `Workload` column reads `active`, the `Agent` column shows `idle`, and the messages will either read `Ready` or another descriptive term. +During normal cluster operation the `Workload` column reads `active`, the +`Agent` column shows `idle`, and the messages will either read `Ready` or +another descriptive term. ## Test the API server health @@ -58,7 +71,8 @@ Verify that the API server is healthy and reachable by running: kubectl --kubeconfig cluster-kubeconfig.yaml get all ``` -This command lists resources that exist under the default namespace. If the API server is healthy you should see a command output similar to the following: +This command lists resources that exist under the default namespace. If the API +server is healthy you should see a command output similar to the following: ``` NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE @@ -84,15 +98,19 @@ juju exec --unit k8s/0 -- journalctl -u snap.k8s.kube-apiserver ``` A failure can mean that: + * The API server is not reachable due to network issues or firewall limitations * The API server on the particular node is unhealthy * The control-plane node that's being reached is down -Try reaching the API server on a different unit by retrieving the kubeconfig file with `juju run get-kubeconfig`. Please replace `#` with the desired unit's number. +Try reaching the API server on a different unit by retrieving the kubeconfig +file with `juju run get-kubeconfig`. Please replace `#` with the +desired unit's number. ## Check the cluster nodes' health -Confirm that the nodes in the cluster are healthy by looking for the `Ready` status: +Confirm that the nodes in the cluster are healthy by looking for the `Ready` +status: ``` kubectl --kubeconfig cluster-kubeconfig.yaml get nodes @@ -107,23 +125,27 @@ juju-380ff2-1 Ready worker 77s v1.32.0 ``` -## Troubleshooting an unhealthy node +## Troubleshoot an unhealthy node -Every healthy {{ product }} node has certain services up and running. The required services depend on the type of node. +Every healthy {{ product }} node has certain services up and running. The +required services depend on the type of node. Services running on both the control plane and worker nodes: + * `k8sd` * `kubelet` * `containerd` * `kube-proxy` Services running only on the control-plane nodes: + * `kube-apiserver` * `kube-controller-manager` * `kube-scheduler` * `k8s-dqlite` Services running only on the worker nodes: + * `k8s-apiserver-proxy` SSH into the unhealthy node by running: @@ -144,11 +166,13 @@ Check the logs of a failing service by executing: sudo journalctl -xe -u snap.k8s. ``` -If the issue indicates a problem with the configuration of the services on the node, examine the arguments used to run these services. +If the issue indicates a problem with the configuration of the services on the +node, examine the arguments used to run these services. -The arguments of a service on the failing node can be examined by reading the file located at `/var/snap/k8s/common/args/`. +The arguments of a service on the failing node can be examined by reading the +file located at `/var/snap/k8s/common/args/`. -## Investigating system pods' health +## Investigate system pods' health Check whether all of the cluster's pods are `Running` and `Ready`: @@ -156,9 +180,11 @@ Check whether all of the cluster's pods are `Running` and `Ready`: kubectl --kubeconfig cluster-kubeconfig.yaml get pods -n kube-system ``` -The pods in the `kube-system` namespace belong to {{product}}' features such as `network`. 
Unhealthy pods could be related to configuration issues or nodes not meeting certain requirements. +The pods in the `kube-system` namespace belong to {{product}}' features such as +`network`. Unhealthy pods could be related to configuration issues or nodes not +meeting certain requirements. -## Troubleshooting a failing pod +## Troubleshoot a failing pod Look at the events on a failing pod by running: @@ -172,13 +198,17 @@ Check the logs on a failing pod by executing: kubectl --kubeconfig cluster-kubeconfig.yaml logs -n ``` -You can check out the upstream [debug pods documentation][] for more information. +You can check out the upstream [debug pods documentation][] for more +information. -## Using the built-in inspection script +## Use the built-in inspection script -{{product}} ships with a script to compile a complete report on {{product}} and its underlying system. This is an essential tool for bug reports and for investigating whether a system is (or isn’t) working. +{{product}} ships with a script to compile a complete report on {{product}} and +its underlying system. This is an essential tool for bug reports and for +investigating whether a system is (or isn’t) working. -Inspection script can be executed on a specific unit by running the following commands: +Inspection script can be executed on a specific unit by running the following +commands: ``` juju exec --unit -- sudo /snap/k8s/current/k8s/scripts/inspect.sh /home/ubuntu/inspection-report.tar.gz @@ -186,6 +216,7 @@ juju scp :/home/ubuntu/inspection-report.tar.gz ./ ``` The command output is similar to the following: + ``` Collecting service information Running inspection on a control-plane node @@ -214,25 +245,37 @@ Building the report tarball SUCCESS: Report tarball is at /home/ubuntu/inspection-report.tar.gz ``` -Use the report to ensure that all necessary services are running and dive into every aspect of the system. +Use the report to ensure that all necessary services are running and dive into +every aspect of the system. -## Collecting debug information +## Collect debug information -To collect comprehensive debug output from your {{product}} cluster, install and run [juju-crashdump][] on a computer that has the Juju client installed. Please ensure that the current controller and model are pointing at your {{product}} deployment. +To collect comprehensive debug output from your {{product}} cluster, install +and run [juju-crashdump][] on a computer that has the Juju client installed. +Please ensure that the current controller and model are pointing at your +{{product}} deployment. ``` sudo snap install juju-crashdump --classic --channel edge juju-crashdump -a debug-layer -a config ``` -Running the `juju-crashdump` script will generate a tarball of debug information that includes [systemd][] unit status and logs, Juju logs, charm unit data, and Kubernetes cluster information. Please include the generated tarball when filing a bug. +Running the `juju-crashdump` script will generate a tarball of debug +information that includes [systemd][] unit status and logs, Juju logs, charm +unit data, and Kubernetes cluster information. Please include the generated +tarball when filing a bug. + +## Report a bug -## Reporting a bug -If you cannot solve your issue and believe that the fault may lie in {{product}}, please [file an issue on the project repository][]. +If you cannot solve your issue and believe that the fault may lie in +{{product}}, please [file an issue on the project repository][]. 
-Help us deal effectively with issues by including the report obtained from the inspect script, the tarball obtained from `juju-crashdump`, as well as any additional logs, and a summary of the issue. +Help us deal effectively with issues by including the report obtained from the +inspect script, the tarball obtained from `juju-crashdump`, as well as any +additional logs, and a summary of the issue. -You can check out the upstream [debug documentation][] for more details on troubleshooting a Kubernetes cluster. +You can check out the upstream [debug documentation][] for more details on +troubleshooting a Kubernetes cluster. diff --git a/docs/src/charm/howto/upgrade-minor.md b/docs/src/charm/howto/upgrade-minor.md index b3960c6da..3a3822051 100644 --- a/docs/src/charm/howto/upgrade-minor.md +++ b/docs/src/charm/howto/upgrade-minor.md @@ -1,4 +1,4 @@ -# How to Upgrade {{product}} to the next minor revision +# How to upgrade {{product}} to the next minor revision It is recommended that you keep your Kubernetes deployment updated to the latest available stable version. You should @@ -82,7 +82,7 @@ rely on deprecated APIs. ## Specific upgrade instructions -### Deciding if an upgrade is available +### Decide if an upgrade is available Juju will contact [charmhub] daily to find new revisions of charms deployed in your models. To see if the `k8s` or `k8s-worker` charms @@ -136,7 +136,7 @@ juju run k8s/leader pre-upgrade-check If no error appears, the `pre-upgrade-check` completed successfully. -### Refreshing charm applications +### Refresh charm applications #### Control Plane units (k8s) @@ -189,7 +189,7 @@ application using the k8s-worker charm, if multiple k8s-worker applications appear in the same model. ``` -## Verify an Upgrade +## Verify an upgrade Once an upgrade is complete, confirm the successful upgrade by running: diff --git a/docs/src/charm/howto/upgrade-patch.md b/docs/src/charm/howto/upgrade-patch.md index 9e25f366f..37aa3e169 100644 --- a/docs/src/charm/howto/upgrade-patch.md +++ b/docs/src/charm/howto/upgrade-patch.md @@ -1,4 +1,4 @@ -# How to Upgrade {{product}} to the next patch revision +# How to upgrade {{product}} to the next patch revision It is recommended that you keep your Kubernetes deployment updated to the latest available stable version. You should @@ -82,7 +82,7 @@ rely on deprecated APIs. ## Specific upgrade instructions -### Deciding if an upgrade is available +### Decide if an upgrade is available Juju will contact [charmhub] daily to find new revisions of charms deployed in your models. To see if the `k8s` or `k8s-worker` charms @@ -129,7 +129,7 @@ juju run k8s/leader pre-upgrade-check If no error appears, the `pre-upgrade-check` completed successfully. -### Refreshing charm applications +### Refresh charm applications #### Control Plane units (k8s) @@ -181,7 +181,7 @@ applications appear in the same model. ``` -## Verify an Upgrade +## Verify an upgrade Once an upgrade is complete, confirm the successful upgrade by running: diff --git a/docs/src/charm/howto/validate.md b/docs/src/charm/howto/validate.md index bef5ae8d3..6d206160e 100644 --- a/docs/src/charm/howto/validate.md +++ b/docs/src/charm/howto/validate.md @@ -1,20 +1,20 @@ -# How to Validate {{product}} +# How to validate {{product}} End-to-end (e2e) tests for **Kubernetes** provide a mechanism to test the behavior of the system. 
This is a useful indicator -that the cluster is performing properly, as well as a good +that the cluster is performing properly, as well as a good validation of any code changes. -For **{{product}}**, these tests are encapsulated +For **{{product}}**, these tests are encapsulated in an additional **Juju** charm which can be added to your cluster. Actual testing is then run through the charm's actions. -```{caution} Your cluster will need to have at least two +```{caution} Your cluster will need to have at least two running worker units for the `e2e` test to run properly. ``` -## Deploying the kubernetes-e2e charm +## Deploy the kubernetes-e2e charm Add the charm to your cluster: @@ -27,7 +27,7 @@ This relates to the installed version of Kubernetes. You can check which version your cluster is set to by running: ```bash -juju status k8s +juju status k8s ``` The output will be in the form of `version.number/risk`, e.g `1.31/stable`. You should set @@ -43,9 +43,10 @@ Finally we relate the charm to `k8s`: juju integrate kubernetes-e2e:kube-control k8s:kube-control ``` -It may take some moments for these relations to establish. Once the connections are made, the charm will update its status to "Ready to test." +It may take some moments for these relations to establish. Once the connections +are made, the charm will update its status to "Ready to test." -## Running the default tests +## Run the default tests The tests are configured as a **Juju** _action_. To run the default tests: @@ -53,7 +54,8 @@ The tests are configured as a **Juju** _action_. To run the default tests: juju run kubernetes-e2e/0 test --background ``` -The command will return with a number for that specific action operation. +The command will return with a number for that specific action operation. + ```console Scheduled operation 25 with task 26 Check operation status with 'juju show-operation 25' @@ -62,7 +64,7 @@ Check task status with 'juju show-task 26' See the section on [Test output](#test-output) below for details. -## Running specific tests +## Run specific tests The complete set of **Kubernetes** e2e tests is more fully described in the [upstream Kubernetes documentation][e2e-upstream]. In some cases you may wish @@ -78,7 +80,8 @@ juju run kubernetes-e2e/0 test skip='\[(Flaky|Slow|Serial)\]' --background Note that the brackets for the regex need to be escaped as shown. -Running this command will return a uuid for that specific test run, as with the default case. +Running this command will return a uuid for that specific test run, as with the +default case. ## Test output @@ -89,7 +92,7 @@ juju show-operation 25 ``` where `25` is the id of the scheduled operation when the test was initiated. -This will return YAML output indicating the current status, +This will return YAML output indicating the current status, which can be either `running`, `completed` or `failed`. ```yaml @@ -141,10 +144,10 @@ tasks: log: /home/ubuntu/26.log.tar.gz ``` -If the tests fail, or you want to look through the detail of each test, you can examine the -detailed log. +If the tests fail, or you want to look through the detail of each test, you can +examine the detailed log. -## Viewing test logs +## View test logs The test logfile is stored as a file on the test instance. The filename corresponds to the id of the action which created it, with a '.log' @@ -173,7 +176,7 @@ The uncompressed logs in particular can be very large and quickly fill up storage. 
``` -## Upgrading the e2e tests +## Upgrade the e2e tests When an update is available, the `kubernetes-e2e` charm can be upgraded with the command: @@ -184,4 +187,3 @@ juju refresh kubernetes-e2e --channel=${release} [e2e-upstream]: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-testing/e2e-tests.md - diff --git a/docs/src/charm/reference/index.md b/docs/src/charm/reference/index.md index 8bd552515..b29521719 100644 --- a/docs/src/charm/reference/index.md +++ b/docs/src/charm/reference/index.md @@ -14,7 +14,7 @@ Overview Charms Architecture Configuration options -Ports and Services +Ports and services proxy Actions troubleshooting diff --git a/docs/src/charm/reference/releases.md b/docs/src/charm/reference/releases.md index ebb1323e9..fa7990dbf 100644 --- a/docs/src/charm/reference/releases.md +++ b/docs/src/charm/reference/releases.md @@ -1,7 +1,7 @@ # Release notes -This is an index page for all the available releases of the {{product}} -charms. Each entry will take you to version specific information including +This is an index page for all the available releases of the {{product}} +charms. Each entry will take you to version specific information including new features, bug fixes and backwards-incompatible changes. ## Releases @@ -16,12 +16,12 @@ new features, bug fixes and backwards-incompatible changes. ## Release policy and schedule -Our release cadence and support window for all Kubernetes-related products are +Our release cadence and support window for all Kubernetes-related products are available on the main Ubuntu website, on the [release cycle page][]. -## {{product}} Releases +## {{product}} releases -{{product}} charms are Juju operators for the {{product}} snap. +{{product}} charms are Juju operators for the {{product}} snap. For the latest changes in the snap, see the [snap release page][]. diff --git a/docs/src/snap/explanation/cis.md b/docs/src/snap/explanation/cis.md index 2abcd4896..32fe3bb3b 100644 --- a/docs/src/snap/explanation/cis.md +++ b/docs/src/snap/explanation/cis.md @@ -1,20 +1,20 @@ -# CIS Hardening +# CIS hardening CIS Hardening refers to the process of implementing security configurations that align with the benchmarks set forth by the [Center for Internet Security] (CIS). These [benchmarks] are a set of best practices and guidelines designed to secure -various software and hardware systems, including Kubernetes clusters. The -primary goal of CIS hardening is to reduce the attack surface and enhance the +various software and hardware systems, including Kubernetes clusters. The +primary goal of CIS hardening is to reduce the attack surface and enhance the overall security posture of an environment by enforcing configurations that are known to protect against common vulnerabilities and threats. ## Why is CIS hardening important for Kubernetes? -Kubernetes, by its nature, is a complex system with many components interacting -in a distributed environment. This complexity can introduce numerous security -risks if not properly managed such as unauthorised access, data breaches and +Kubernetes, by its nature, is a complex system with many components interacting +in a distributed environment. This complexity can introduce numerous security +risks if not properly managed such as unauthorised access, data breaches and service disruption. 
CIS hardening for Kubernetes focuses on configuring various -components of a Kubernetes cluster to meet the security standards specified in +components of a Kubernetes cluster to meet the security standards specified in the [CIS Kubernetes Benchmark]. ## Apply CIS hardening to {{product}} @@ -22,7 +22,7 @@ the [CIS Kubernetes Benchmark]. If you would like to apply CIS hardening to your cluster see our [how-to guide]. -[benchmarks]: https://www.cisecurity.org/cis-benchmarks -[Center for Internet Security]: https://www.cisecurity.org/ +[benchmarks]: https://www.cisecurity.org/cis-benchmarks +[Center for Internet Security]: https://www.cisecurity.org/ [CIS Kubernetes Benchmark]: https://www.cisecurity.org/benchmark/kubernetes [how-to guide]: ../howto/cis-hardening.md diff --git a/docs/src/snap/explanation/index.md b/docs/src/snap/explanation/index.md index 654594931..3cf3ddf63 100644 --- a/docs/src/snap/explanation/index.md +++ b/docs/src/snap/explanation/index.md @@ -19,7 +19,7 @@ high-availability certificates ingress load-balancer -Upgrading +Upgrades epa security cis diff --git a/docs/src/snap/explanation/installation-methods.md b/docs/src/snap/explanation/installation-methods.md index cbf9c988d..b68bf156e 100644 --- a/docs/src/snap/explanation/installation-methods.md +++ b/docs/src/snap/explanation/installation-methods.md @@ -1,4 +1,4 @@ -# Choosing an installation method +# Choose an installation method {{ product }} can be installed in a variety of ways, depending on your needs and preferences. All installation methods provide a fully functional cluster with diff --git a/docs/src/snap/howto/backup-restore.md b/docs/src/snap/howto/backup-restore.md index dc54a9cab..bd429c802 100644 --- a/docs/src/snap/howto/backup-restore.md +++ b/docs/src/snap/howto/backup-restore.md @@ -1,4 +1,4 @@ -# Backup and restore +# How to backup and restore a {{product}} cluster [Velero][] is a popular open source backup solution for Kubernetes. Its core implementation is a controller running in the cluster that oversees the backup @@ -63,12 +63,12 @@ sudo k8s kubectl expose deployment nginx -n workloads --port 80 ## Install Velero -Download the Velero binary from the +Download the Velero binary from the [releases page on GitHub][releases] and place it in our `PATH`. In this case we install the v1.14.1 Linux binary for AMD64 under `/usr/local/bin`: ```bash -wget https://github.com/vmware-tanzu/velero/releases/download/v1.14.0/velero-v1.14.0-linux-amd64.tar.gz +wget https://github.com/vmware-tanzu/velero/releases/download/v1.14.0/velero-v1.14.0-linux-amd64.tar.gz tar -xzf velero-v1.14.0-linux-amd64.tar.gz chmod +x velero-v1.14.0-linux-amd64/velero sudo chown root:root velero-v1.14.0-linux-amd64/velero @@ -119,12 +119,12 @@ velero install \ ## Backup workloads To backup the `workloads` namespace we use the `--include-namespaces` argument: - + ```bash velero backup create workloads-backup --include-namespaces=workloads ``` -```{note} Please see the +```{note} Please see the [official Velero documentation](https://velero.io/docs/v1.14/file-system-backup/#to-back-up) for details on how to backup persistent volumes and the supported volume types. 
``` @@ -133,10 +133,10 @@ To check the progress of a backup operation we use `describe`, providing the backup name: ```bash -velero backup describe workloads-backup +velero backup describe workloads-backup ``` -In the output you should see this operation completed: +In the output you should see this operation completed: ```bash Name: workloads-backup @@ -286,7 +286,7 @@ options it is also very easy to use. You are required to set up a backup strategy based on the backend that will hold the backups and the scheduling of the backups. The rest is taken care of by the tool itself. - + [Velero]: https://velero.io/ @@ -294,4 +294,4 @@ the backups. The rest is taken care of by the tool itself. [MinIO]: https://min.io/ [MinIO Charts]: https://charts.min.io/ [releases]: https://github.com/vmware-tanzu/velero/releases -[aws-plugin-matching]: https://github.com/vmware-tanzu/velero-plugin-for-aws?tab=readme-ov-file#compatibility \ No newline at end of file +[aws-plugin-matching]: https://github.com/vmware-tanzu/velero-plugin-for-aws?tab=readme-ov-file#compatibility diff --git a/docs/src/snap/howto/epa.md b/docs/src/snap/howto/epa.md index f595a486c..7da71528a 100644 --- a/docs/src/snap/howto/epa.md +++ b/docs/src/snap/howto/epa.md @@ -16,7 +16,7 @@ SR-IOV/DPDK. - Ubuntu instances **or** a MAAS environment to run {{product}} on -## Prepare the Environment +## Prepare the environment `````{tabs} @@ -329,9 +329,10 @@ EPA capabilities. :end-before: ``` -2. Create a file called *configuration.yaml* or download it {download}`here `. In this configuration file we let - the snap start with its default CNI (calico), with CoreDNS deployed and we - also point k8s to the external etcd. +2. Create a file called *configuration.yaml* or download it +{download}`here `. In this configuration file +we let the snap start with its default CNI (calico), with CoreDNS deployed and +we also point k8s to the external etcd. ```{literalinclude} /src/assets/configuration.yaml :language: yaml @@ -697,7 +698,7 @@ T: 0 ( 2965) P:80 I:1000 C: 241486 Min: 3 Act: 4 Avg: 3 Max: 18 ``` -### Test CPU Pinning and NUMA +### Test CPU pinning and NUMA First check if CPU Manager and NUMA Topology Manager is set up in the worker node: diff --git a/docs/src/snap/howto/index.md b/docs/src/snap/howto/index.md index afb3fb9c0..ff90877f3 100644 --- a/docs/src/snap/howto/index.md +++ b/docs/src/snap/howto/index.md @@ -20,14 +20,14 @@ storage/index security/index Use an external datastore Set up cluster observability -backup-restore -refresh-certs -restore-quorum +Back up and restore +Refresh Kubernetes Certificates +Recover a cluster after quorum loss two-node-ha -Managing upgrades +Manage upgrades Set up Enhanced Platform Awareness Manage images -contribute +Contribute to Canonical Kubernetes Get support Troubleshooting ``` diff --git a/docs/src/snap/howto/install/custom-bootstrap-config.md b/docs/src/snap/howto/install/custom-bootstrap-config.md index edda88b66..2b7c07266 100644 --- a/docs/src/snap/howto/install/custom-bootstrap-config.md +++ b/docs/src/snap/howto/install/custom-bootstrap-config.md @@ -1,14 +1,14 @@ -# Custom bootstrap configuration +# How to install {{product}} with a custom bootstrap configuration When creating a {{ product }} cluster that differs from the default configuration you can choose to use a custom bootstrap configuration. The CLI's interactive mode or a custom bootstrap configuration file allow you -to modify the configuration of the first node of your cluster. 
+to modify the configuration of the first node of your cluster. ## Configuration options Please consult the [reference page] for all of the -available configuration options and their defaults. +available configuration options and their defaults. ``` {note} Most of these configuration options are set during the initial bootstrapping @@ -47,7 +47,7 @@ The node will be 'Ready' to host workloads after the CNI is deployed successfull ## Bootstrap configuration file If your deployment requires a more fine-tuned configuration, use the bootstrap -configuration file. +configuration file. ``` {note} When using the custom configuration file on bootstrap, all features including diff --git a/docs/src/snap/howto/install/index.md b/docs/src/snap/howto/install/index.md index 24833b9db..7ea5eff64 100644 --- a/docs/src/snap/howto/install/index.md +++ b/docs/src/snap/howto/install/index.md @@ -1,4 +1,4 @@ -# Installing {{product}} +# Install {{product}} ```{toctree} :hidden: @@ -12,10 +12,10 @@ the current How-to guides below. :glob: :titlesonly: -Install from a snap -custom-bootstrap-config -multipass -Install in LXD -Install in air-gapped environments +... from a snap +... with a custom bootstrap configuration +... with Multipass +... in LXD +... in air-gapped environments Uninstall the snap ``` diff --git a/docs/src/snap/howto/install/lxd.md b/docs/src/snap/howto/install/lxd.md index 84dc55a67..6833ad5a7 100644 --- a/docs/src/snap/howto/install/lxd.md +++ b/docs/src/snap/howto/install/lxd.md @@ -1,10 +1,10 @@ -# Install {{product}} in LXD +# How to install {{product}} in LXD {{product}} can also be installed inside an LXD container. This is a great way, for example, to test out clustered {{product}} without the need for multiple physical hosts. -## Installing LXD +## Install LXD You can install [LXD] via snaps: @@ -178,7 +178,7 @@ address that we noted earlier. curl 10.122.174.30:32750 ``` -## Stop/Remove the container +## Stop/remove the container The `k8s` container you created will keep running in the background until it is either stopped or the host computer is shut down. You can stop the running diff --git a/docs/src/snap/howto/install/multipass.md b/docs/src/snap/howto/install/multipass.md index 23fcab92e..48aec79ce 100644 --- a/docs/src/snap/howto/install/multipass.md +++ b/docs/src/snap/howto/install/multipass.md @@ -1,4 +1,4 @@ -# Install with Multipass (Ubuntu/macOS/Windows) +# How to install {{product}} with Multipass (Ubuntu/macOS/Windows) [Multipass] provides an easy way to run Ubuntu in a virtual machine, regardless of your underlying operating system. It is the recommended way to run @@ -92,7 +92,7 @@ exit ...and you will be returned to the original terminal session. -## Stop/Remove the instance +## Stop/remove the instance The instance you created will keep running in the background until it is either stopped or the host computer is shut down. You can stop the running instance at diff --git a/docs/src/snap/howto/install/offline.md b/docs/src/snap/howto/install/offline.md index 7d93fc985..8ef698df4 100644 --- a/docs/src/snap/howto/install/offline.md +++ b/docs/src/snap/howto/install/offline.md @@ -1,4 +1,4 @@ -# Installing {{product}} in air-gapped environments +# How to install {{product}} in air-gapped environments There are situations where it is necessary or desirable to run {{product}} on a machine that is not connected to the internet. Based on different degrees @@ -55,7 +55,7 @@ ports used by {{product}}. 
--> #### Default Gateway -In cases where the air-gap environment does not have a default gateway, +In cases where the air-gap environment does not have a default Gateway, add a dummy default route on the `eth0` interface using the following command: ``` @@ -130,7 +130,7 @@ downloaded `k8s` snap is unsquashed. Please ensure that the images used by workloads are tracked as well. -#### Images Option A: via an HTTP proxy +#### Images option A: via an HTTP proxy In many cases, the nodes of the air-gap deployment may not have direct access to upstream registries, but can reach them through the [use of an HTTP @@ -140,7 +140,7 @@ The configuration of the proxy is out of the scope of this documentation. (private-registry)= -#### Images Option B: private registry mirror +#### Images option B: private registry mirror In case regulations and/or network constraints do not permit the cluster nodes @@ -195,7 +195,7 @@ this are described in [Side-load images](#side-load). (side-load)= -#### Images Option C: Side-load images +#### Images option C: Side-load images Image side-loading is the process of loading all required OCI images directly @@ -240,12 +240,12 @@ sudo snap ack k8s.assert && sudo snap install ./k8s.snap --classic Repeat the above for all nodes of the cluster. -### Step 2: Container Runtime +### Step 2: Container runtime The container runtime must be configured to fetch images properly. Choose one of the following options: -#### Container Runtime Option A: Configure HTTP proxy for registries +#### Container runtime option A: Configure HTTP proxy for registries Create or edit the `/etc/systemd/system/snap.k8s.containerd.service.d/http-proxy.conf` @@ -253,7 +253,7 @@ file on each node and set the appropriate `http_proxy`, `https_proxy` and `no_proxy` variables as described in the [adding proxy configuration section][proxy]. -#### Container Runtime Option B: Configure registry mirrors +#### Container runtime option B: Configure registry mirrors This requires having already set up a registry mirror, as explained in the preparation section on the private registry mirror. Complete the following @@ -285,7 +285,7 @@ capabilities = ["pull", "resolve"] ca = "/var/snap/k8s/common/etc/containerd/hosts.d/ghcr.io/ca.crt" ``` -#### Container Runtime Option C: Side-load images +#### Container runtime option C: Side-load images This is only required if choosing to [side-load images](#side-load). Make sure that the directory `/var/snap/k8s/common/images` directory exists, then copy diff --git a/docs/src/snap/howto/install/snap.md b/docs/src/snap/howto/install/snap.md index 7839075ba..8db2c959d 100644 --- a/docs/src/snap/howto/install/snap.md +++ b/docs/src/snap/howto/install/snap.md @@ -1,4 +1,4 @@ -# Install {{product}} from a snap +# How to install {{product}} from a snap {{product}} is packaged as a [snap], available from the snap store for all supported platforms. diff --git a/docs/src/snap/howto/networking/default-ingress.md b/docs/src/snap/howto/networking/default-ingress.md index c5a5bcb9e..9bfe8be3f 100644 --- a/docs/src/snap/howto/networking/default-ingress.md +++ b/docs/src/snap/howto/networking/default-ingress.md @@ -53,7 +53,7 @@ You should see three options: - `enable-proxy-protocol`: If set, proxy protocol will be enabled for the Ingress -### TLS Secret +### TLS secret You can create a TLS secret by following the official [Kubernetes documentation][kubectl-create-secret-TLS/]. 
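As a concrete illustration of the step above, a TLS secret can be created from an existing certificate and key pair and then referenced by the ingress feature. A minimal sketch, assuming the files `server.crt` and `server.key` already exist and using the illustrative secret name `my-tls-secret`:

```bash
# Create the TLS secret from an existing certificate/key pair
sudo k8s kubectl create secret tls my-tls-secret --cert=server.crt --key=server.key

# Point the default ingress at that secret (as shown in the following step)
sudo k8s set ingress.default-tls-secret=my-tls-secret
```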
@@ -68,7 +68,7 @@ sudo k8s set ingress.default-tls-secret= Replace `` with the desired value for your Ingress configuration. -### Proxy Protocol +### Proxy protocol Enabling the proxy protocol allows passing client connection information to the backend service. diff --git a/docs/src/snap/howto/networking/default-loadbalancer.md b/docs/src/snap/howto/networking/default-loadbalancer.md index 49a0126c2..da63f3c43 100644 --- a/docs/src/snap/howto/networking/default-loadbalancer.md +++ b/docs/src/snap/howto/networking/default-loadbalancer.md @@ -1,8 +1,8 @@ -# How to use the default load-balancer +# How to use the default load balancer -{{product}} includes a default load-balancer. As this is not an +{{product}} includes a default load balancer. As this is not an essential service for all deployments, it is not enabled by default. This guide -explains how to configure and enable the load-balancer. +explains how to configure and enable the `load-balancer`. ## Prerequisites @@ -14,17 +14,17 @@ This guide assumes the following: ## Check the status and configuration -Find out whether load-balancer is enabled or disabled with the following +Find out whether load balancer is enabled or disabled with the following command: ``` sudo k8s status ``` -The load-balancer is not enabled by default, it won't be listed on the status +The load balancer is not enabled by default, it won't be listed on the status output unless it has been subsequently enabled. -To check the current configuration of the load-balancer, run the following: +To check the current configuration of the `load-balancer`, run the following: ``` sudo k8s get load-balancer @@ -57,9 +57,9 @@ simultaneously. E.g. sudo k8s set load-balancer.bgp-mode=true load-balancer.bgp-local-asn=64512 load-balancer.bgp-peer-address=10.0.10.63 load-balancer.bgp-peer-asn=64512 load-balancer.bgp-peer-port=7012 ``` -## Enable the load-balancer +## Enable the load balancer -To enable the load-balancer, run: +To enable the load balancer, run: ``` sudo k8s enable load-balancer @@ -72,13 +72,13 @@ sudo k8s status ``` ```{important} -If you run `k8s status` soon after enabling the load-balancer in BGP mode, -`k8s status` might report errors. Please wait a few moments for the load-balancer to finish deploying and try again. +If you run `k8s status` soon after enabling the load balancer in BGP mode, +`k8s status` might report errors. Please wait a few moments for the load balancer to finish deploying and try again. ``` -## Disable the load-balancer +## Disable the load balancer -The default load-balancer can be disabled again with: +The default load balancer can be disabled again with: ``` sudo k8s disable load-balancer @@ -86,7 +86,7 @@ sudo k8s disable load-balancer ## Next Step -- Learn more in the [Load-Balancer] explanation page. +- Learn more in the [Load Balancer] explanation page. [CIDR]: https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing diff --git a/docs/src/snap/howto/networking/dualstack.md b/docs/src/snap/howto/networking/dualstack.md index 227f59278..5e3269286 100644 --- a/docs/src/snap/howto/networking/dualstack.md +++ b/docs/src/snap/howto/networking/dualstack.md @@ -69,7 +69,7 @@ cluster bootstrap process. The key configuration parameters are: sudo k8s kubectl apply -f https://raw.githubusercontent.com/canonical/k8s-snap/main/docs/src/assets/how-to-dualstack-manifest.yaml ``` -1. **Check IPv6 Connectivity** +1. 
**Check IPv6 connectivity** Retrieve the service details and ensure that an IPv6 address is assigned: @@ -94,7 +94,7 @@ cluster bootstrap process. The key configuration parameters are: working. -## CIDR Size Limitations +## CIDR size limitations When setting up dual-stack networking, it is important to consider the limitations regarding CIDR size: diff --git a/docs/src/snap/howto/networking/index.md b/docs/src/snap/howto/networking/index.md index 15aa44450..5ec7f4590 100644 --- a/docs/src/snap/howto/networking/index.md +++ b/docs/src/snap/howto/networking/index.md @@ -14,7 +14,7 @@ how to configure and use key capabilities of {{product}}. Use default DNS Use default network Use default Ingress -Use default load-balancer +Use default load balancer Use an alternative CNI Enable Dual-Stack networking Set up an IPv6-only cluster diff --git a/docs/src/snap/howto/networking/ipv6.md b/docs/src/snap/howto/networking/ipv6.md index b4a5e7350..b6d74ceb9 100644 --- a/docs/src/snap/howto/networking/ipv6.md +++ b/docs/src/snap/howto/networking/ipv6.md @@ -1,4 +1,4 @@ -# How to set up an IPv6-Only Cluster +# How to set up an IPv6-only cluster An IPv6-only Kubernetes cluster operates exclusively using IPv6 addresses, without support for IPv4. This configuration is ideal for environments that @@ -17,7 +17,7 @@ to handle IPv6 traffic. - Any underlying infrastructure (e.g. cloud providers, bare metal setups) must be IPv6-compatible. -## Setting Up an IPv6-Only Cluster +## Set up an IPv6-only cluster The process of creating an IPv6-only cluster involves specifying only IPv6 CIDRs for pods and services during the bootstrap process. Unlike dual-stack, @@ -63,7 +63,7 @@ Specify the configuration file during the bootstrapping process: sudo k8s bootstrap --file bootstrap-config.yaml ``` -2. **Verify Pod and Service Creation** +2. **Verify Pod and Service creation** Once the cluster is up, verify that all pods are running: @@ -78,7 +78,7 @@ connectivity of the IPv6-only cluster: sudo k8s kubectl apply -f https://raw.githubusercontent.com/canonical/k8s-snap/main/docs/src/assets/how-to-ipv6-only-manifest.yaml ``` -3. **Verify IPv6 Connectivity** +3. **Verify IPv6 connectivity** Retrieve the service details to confirm an IPv6 address is assigned: @@ -102,9 +102,9 @@ curl http://[fd98::7534]/ A welcome message from the nginx web-server is displayed when IPv6 connectivity is set up correctly. -## IPv6-Only Cluster Considerations +## IPv6-only cluster considerations -**Service and Pod CIDR Sizing** +**Service and Pod CIDR sizing** Use `/108` as the maximum size for Service CIDRs. Larger ranges (e.g., `/64`) may lead to allocation errors or Kubernetes failing to initialise the IPv6 diff --git a/docs/src/snap/howto/networking/proxy.md b/docs/src/snap/howto/networking/proxy.md index 7b695155a..3fb98cafd 100644 --- a/docs/src/snap/howto/networking/proxy.md +++ b/docs/src/snap/howto/networking/proxy.md @@ -1,4 +1,4 @@ -# Configure proxy settings for K8s +# How to configure proxy settings for K8s {{product}} packages a number of utilities (for example curl, Helm) which need to fetch resources they expect to find on the internet. 
In a restricted diff --git a/docs/src/snap/howto/refresh-certs.md b/docs/src/snap/howto/refresh-certs.md index fb80f370f..f6284af2e 100644 --- a/docs/src/snap/howto/refresh-certs.md +++ b/docs/src/snap/howto/refresh-certs.md @@ -1,4 +1,4 @@ -# Refreshing Kubernetes Certificates +# How to refresh Kubernetes certificates To keep your {{product}} cluster secure and functional, it is essential to regularly refresh its certificates. Certificates in Kubernetes ensure @@ -16,7 +16,7 @@ nodes in your {{product}} cluster. initially set up with self-signed certificates during the bootstrap process. ``` -### Refresh Control Plane Node Certificates +### Refresh Control Plane node certificates 1. To refresh the certificates on control plane nodes, perform the following steps on each control plane node in your cluster: @@ -51,7 +51,7 @@ displayed in the command output: Certificates have been successfully refreshed, and will expire at 2025-08-27 21:00:00 +0000 UTC. ``` -### Refresh Worker Node Certificates +### Refresh Worker node certificates 1. To refresh the certificates on worker nodes, perform the following steps on each worker node in your cluster: diff --git a/docs/src/snap/howto/restore-quorum.md b/docs/src/snap/howto/restore-quorum.md index f08389699..992607854 100755 --- a/docs/src/snap/howto/restore-quorum.md +++ b/docs/src/snap/howto/restore-quorum.md @@ -1,4 +1,4 @@ -# Recovering a cluster after quorum loss +# How to recover a cluster after quorum loss Highly available {{product}} clusters can survive losing one or more nodes. [Dqlite], the default datastore, implements a [Raft] based protocol diff --git a/docs/src/snap/howto/security/cis-assessment.md b/docs/src/snap/howto/security/cis-assessment.md index 153657a9f..9ae535baa 100644 --- a/docs/src/snap/howto/security/cis-assessment.md +++ b/docs/src/snap/howto/security/cis-assessment.md @@ -1,4 +1,4 @@ -# CIS compliance +# How to assess CIS compliance CIS Hardening refers to the process of implementing security configurations that align with the benchmarks set by the [Center for Internet Security (CIS)]. diff --git a/docs/src/snap/howto/security/disa-stig-assessment.md b/docs/src/snap/howto/security/disa-stig-assessment.md index d3a97f651..2f77841eb 100644 --- a/docs/src/snap/howto/security/disa-stig-assessment.md +++ b/docs/src/snap/howto/security/disa-stig-assessment.md @@ -1,5 +1,4 @@ - -# Assess DISA STIG for {{product}} +# How to assess DISA STIG for {{product}} Security Technical Implementation Guides (STIGs) are developed by the Defense Information System Agency (DISA) for the U.S. Department of Defense (DoD). diff --git a/docs/src/snap/howto/security/hardening.md b/docs/src/snap/howto/security/hardening.md index 96fe7e59e..27950ea0f 100644 --- a/docs/src/snap/howto/security/hardening.md +++ b/docs/src/snap/howto/security/hardening.md @@ -1,4 +1,4 @@ -# {{product}} hardening guide +# How to harden your {{product}} cluster The {{product}} hardening guide provides actionable steps to enhance the security posture of your deployment. These steps are designed to help you align diff --git a/docs/src/snap/howto/storage/ceph.md b/docs/src/snap/howto/storage/ceph.md index baa4affe7..f0a0831dc 100644 --- a/docs/src/snap/howto/storage/ceph.md +++ b/docs/src/snap/howto/storage/ceph.md @@ -1,4 +1,4 @@ -# How to use Ceph storage with Canonical K8s +# How to use Ceph storage with {{product}} Distributed, redundant storage is a must-have when you want to develop reliable applications. 
[Ceph] is a storage solution which provides exactly that, and is @@ -48,7 +48,8 @@ capabilities to administer your Ceph cluster: ceph auth get-or-create client.kubernetes mon 'profile rbd' osd 'profile rbd pool=kubernetes' mgr 'profile rbd pool=kubernetes' ``` -For more information on user capabilities in Ceph, see the [authorisation capabilities page][] +For more information on user capabilities in Ceph, see the +[authorisation capabilities page][] ``` [client.kubernetes] @@ -130,8 +131,8 @@ Then apply: kubectl apply -f csi-kms-config-map.yaml ``` -If you do need to configure a KMS provider, an [example ConfigMap][] is available -in the Ceph repository. +If you do need to configure a KMS provider, an [example ConfigMap][] is +available in the Ceph repository. Create the `ceph-config-map.yaml` which will be stored inside a `ceph.conf` file in the CSI containers. This `ceph.conf` file will be used by Ceph daemons on @@ -311,9 +312,9 @@ kubectl apply -f pod.yaml ## Verify that the pod is using the RBD PV -To verify that the `csi-rbd-demo-pod` is indeed using a RBD Persistent Volume, run -the following commands, you should see information related to attached volumes -in both of their outputs: +To verify that the `csi-rbd-demo-pod` is indeed using a RBD Persistent Volume, +run the following commands, you should see information related to attached +volumes in both of their outputs: ``` kubectl describe pvc rbd-pvc diff --git a/docs/src/snap/howto/storage/cloud.md b/docs/src/snap/howto/storage/cloud.md index bbe309b9c..756ef5e85 100644 --- a/docs/src/snap/howto/storage/cloud.md +++ b/docs/src/snap/howto/storage/cloud.md @@ -13,7 +13,7 @@ This guide is for AWS and assumes the following: - You can create roles and policies in AWS -## Set IAM Policies +## Set IAM policies Your instance will need a few IAM policies to be able to communicate with the AWS APIs. The policies provided here are quite open and should be scoped down diff --git a/docs/src/snap/howto/storage/storage.md b/docs/src/snap/howto/storage/storage.md index db833deaf..e979d06e4 100644 --- a/docs/src/snap/howto/storage/storage.md +++ b/docs/src/snap/howto/storage/storage.md @@ -17,7 +17,7 @@ This guide assumes the following: - You have a bootstrapped {{product}} cluster (see the [getting-started-guide]) -## Enable Local Storage +## Enable local storage The storage feature is enabled by default when bootstrapping the snap. However, if you have used a custom bootstrap configuration or disabled the feature, you @@ -27,7 +27,7 @@ can enable it using the following command: sudo k8s enable local-storage ``` -## Configure Local Storage +## Configure local storage While the storage option comes with sensible defaults, you can customise it to meet your requirements. Obtain the current configuration by running: @@ -53,7 +53,7 @@ The local-storage feature provides the following configuration options: is not changed. If this flag is not set and the cluster does not have a default class set then the class from the local-storage becomes the default. -## Disable Local Storage +## Disable local storage The local storage option is only suitable for single-node clusters and development environments as it has no multi node data replication. 
For a diff --git a/docs/src/snap/howto/troubleshooting.md b/docs/src/snap/howto/troubleshooting.md index a7da7a979..1009a6319 100644 --- a/docs/src/snap/howto/troubleshooting.md +++ b/docs/src/snap/howto/troubleshooting.md @@ -1,10 +1,16 @@ # How to troubleshoot {{product}} -Identifying issues in a Kubernetes cluster can be difficult, especially to new users. With {{product}} we aim to make deploying and managing your cluster as easy as possible. This how-to guide will walk you through the steps to troubleshoot your {{product}} cluster. +Identifying issues in a Kubernetes cluster can be difficult, especially to new +users. With {{product}} we aim to make deploying and managing your cluster as +easy as possible. This how-to guide will walk you through the steps to +troubleshoot your {{product}} cluster. ## Common issues -Maybe your issue has already been solved? Check out the [troubleshooting reference][snap-troubleshooting-reference] page to see a list of common issues and their solutions. Otherwise continue with this guide to help troubleshoot your {{product}} cluster. +Maybe your issue has already been solved? Check out the +[troubleshooting reference][snap-troubleshooting-reference] page to see a list +of common issues and their solutions. Otherwise continue with this guide to +help troubleshoot your {{product}} cluster. ## Check the cluster status @@ -15,6 +21,7 @@ sudo k8s status ``` You should see a command output similar to the following: + ``` cluster status: ready control plane nodes: 10.94.106.249:6400 (voter), 10.94.106.208:6400 (voter), 10.94.106.99:6400 (voter) @@ -31,34 +38,43 @@ gateway enabled ## Test the API server health -Verify that the API server is healthy and reachable by running the following command on a control-plane node: +Verify that the API server is healthy and reachable by running the following +command on a control-plane node: ``` sudo k8s kubectl get all ``` -This command lists resources that exist under the default namespace. You should see a command output similar to the following if the API server is healthy: +This command lists resources that exist under the default namespace. You should +see a command output similar to the following if the API server is healthy: + ``` NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/kubernetes ClusterIP 10.152.183.1 443/TCP 29m ``` A typical error message may look like this if the API server can not be reached: + ``` The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port? ``` -A failure can mean that the API server on the particular node is unhealthy. Check the status of the API server service: +A failure can mean that the API server on the particular node is unhealthy. 
+Check the status of the API server service: + ``` sudo systemctl status snap.k8s.kube-apiserver ``` Access the logs of the API server service by running the following command: + ``` sudo journalctl -xe -u snap.k8s.kube-apiserver ``` -If you are trying to reach the API server from a host that is not a control-plane node, a failure could mean that: +If you are trying to reach the API server from a host that is not a +control-plane node, a failure could mean that: + * The API server is not reachable due to network issues or firewall limitations * The API server is failing on the control-plane node that's being reached * The control-plane node that's being reached is down @@ -67,11 +83,13 @@ If you are trying to reach the API server from a host that is not a control-plan When running `sudo k8s config` on a control-plane node you retrieve the kubeconfig file that uses this node's IP address. ``` -Try reaching the API server on a different control-plane node by updating the IP address that's used in the kubeconfig file. +Try reaching the API server on a different control-plane node by updating the +IP address that's used in the kubeconfig file. ## Check the cluster nodes' health -Confirm that the nodes in the cluster are healthy by looking for the `Ready` status: +Confirm that the nodes in the cluster are healthy by looking for the `Ready` +status: ``` sudo k8s kubectl get nodes @@ -88,24 +106,29 @@ node-3 Ready control-plane,worker 6m21s v1.32.0 ## Troubleshooting an unhealthy node -Every healthy {{ product }} node has certain services up and running. The required services depend on the type of node. +Every healthy {{ product }} node has certain services up and running. The +required services depend on the type of node. Services running on both control plane and worker nodes: + * `k8sd` * `kubelet` * `containerd` * `kube-proxy` Services running only on control-plane nodes: + * `kube-apiserver` * `kube-controller-manager` * `kube-scheduler` * `k8s-dqlite` Services running only on worker nodes: + * `k8s-apiserver-proxy` -Check the status of these services on the failing node by running the following command: +Check the status of these services on the failing node by running the following +command: ``` sudo systemctl status snap.k8s. @@ -117,9 +140,11 @@ The logs of a failing service can be checked by running the following command: sudo journalctl -xe -u snap.k8s. ``` -If the issue indicates a problem with the configuration of the services on the node, examine the arguments used to run these services. +If the issue indicates a problem with the configuration of the services on the +node, examine the arguments used to run these services. -The arguments of a service on the failing node can be examined by reading the file located at `/var/snap/k8s/common/args/`. +The arguments of a service on the failing node can be examined by reading the +file located at `/var/snap/k8s/common/args/`. ## Investigating system pods' health @@ -129,7 +154,9 @@ Check whether all of the cluster's pods are `Running` and `Ready`: sudo k8s kubectl get pods -n kube-system ``` -The pods in the `kube-system` namespace belong to {{product}} features such as `network`. Unhealthy pods could be related to configuration issues or nodes not meeting certain requirements. +The pods in the `kube-system` namespace belong to {{product}} features such as +`network`. Unhealthy pods could be related to configuration issues or nodes not +meeting certain requirements. 
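When scanning the `kube-system` namespace, it can help to filter for pods that are not in the `Running` phase and then describe any that stand out. A brief sketch (the pod name is a placeholder, not one taken from the guide):

```
# Show only kube-system pods that are not currently Running
sudo k8s kubectl get pods -n kube-system --field-selector=status.phase!=Running

# Inspect the conditions and recent events of a suspect pod
# (the pod name here is an example)
sudo k8s kubectl describe pod -n kube-system cilium-operator-5b9c7d8f6d-x2x7q
```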
## Troubleshooting a failing pod @@ -145,19 +172,24 @@ Check the logs on a failing pod by running the following command: sudo k8s kubectl logs -n ``` -You can check out the upstream [debug pods documentation][] for more information. +You can check out the upstream [debug pods documentation][] for more +information. ## Using the built-in inspection script -{{product}} ships with a script to compile a complete report on {{product}} and its underlying system. This is an essential tool for bug reports and for investigating whether a system is (or isn’t) working. +{{product}} ships with a script to compile a complete report on {{product}} and +its underlying system. This is an essential tool for bug reports and for +investigating whether a system is (or isn’t) working. -Run the inspection script, by entering the command (admin privileges are required to collect all the data): +Run the inspection script, by entering the command (admin privileges are +required to collect all the data): ``` sudo /snap/k8s/current/k8s/scripts/inspect.sh ``` The command output is similar to the following: + ``` Collecting service information Running inspection on a control-plane node @@ -186,14 +218,19 @@ Building the report tarball SUCCESS: Report tarball is at /root/inspection-report-20250109_132806.tar.gz ``` -Use the report to ensure that all necessary services are running and dive into every aspect of the system. +Use the report to ensure that all necessary services are running and dive into +every aspect of the system. + +## Report a bug -## Reporting a bug -If you cannot solve your issue and believe that the fault may lie in {{product}}, please [file an issue on the project repository][]. +If you cannot solve your issue and believe that the fault may lie in +{{product}}, please [file an issue on the project repository][]. -Help us deal effectively with issues by including the report obtained from the inspect script, any additional logs, and a summary of the issue. +Help us deal effectively with issues by including the report obtained from the +inspect script, any additional logs, and a summary of the issue. -You can check out the upstream [debug documentation][] for more details on troubleshooting a Kubernetes cluster. +You can check out the upstream [debug documentation][] for more details on +troubleshooting a Kubernetes cluster. diff --git a/docs/src/snap/howto/two-node-ha.md b/docs/src/snap/howto/two-node-ha.md index 51dad5090..3bda0aabb 100644 --- a/docs/src/snap/howto/two-node-ha.md +++ b/docs/src/snap/howto/two-node-ha.md @@ -145,8 +145,8 @@ sudo mkfs.ext4 /dev/drbd0 sudo drbdadm down r0 ``` -Add the DRBD device to the ``multipathd`` blacklist, ensuring that the `multipath` -service will not attempt to manage this device: +Add the DRBD device to the ``multipathd`` blacklist, ensuring that the +`multipath`service will not attempt to manage this device: ``` sudo cat <]` holds snap refreshes for a specified +To prevent automatic updates, the snap can be tied to a specific revision. +`snap refresh --hold[=]` holds snap refreshes for a specified duration (or forever, if no value is specified). 
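As a sketch of how such a hold can be applied in practice (the 72-hour window is only an example value):

```
# Hold automatic refreshes of the k8s snap indefinitely
sudo snap refresh --hold k8s

# Or hold refreshes for a fixed duration, e.g. 72 hours
sudo snap refresh --hold=72h k8s
```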
``` diff --git a/docs/src/snap/reference/certificates.md b/docs/src/snap/reference/certificates.md index 996d80f6c..05ebb9fc2 100644 --- a/docs/src/snap/reference/certificates.md +++ b/docs/src/snap/reference/certificates.md @@ -1,4 +1,4 @@ -# Cluster Certificates and Configuration Directories +# Cluster certificates and configuration directories This reference page provides an overview of certificate authorities (CAs), certificates and configuration directories in use by a {{ product }} cluster. @@ -37,7 +37,7 @@ their issuance. | `root@$hostname` | Client | `/var/snap/k8s/common/var/lib/k8s-dqlite/cluster.crt` | Communication between k8sd nodes | `self-signed` | -## Configuration Files for Kubernetes Components +## Configuration files for Kubernetes components The following tables provide an overview of the configuration files used to communicate with the cluster services. diff --git a/docs/src/snap/tutorial/add-remove-nodes.md b/docs/src/snap/tutorial/add-remove-nodes.md index 88b22fe39..e9442c46c 100644 --- a/docs/src/snap/tutorial/add-remove-nodes.md +++ b/docs/src/snap/tutorial/add-remove-nodes.md @@ -1,4 +1,4 @@ -# Adding and removing nodes +# Add and remove nodes Typical production clusters are hosted across multiple data centres and cloud environments, enabling them to leverage geographical distribution for improved