The RKE2 install script installs RKE2 in a form suitable for both server and agent nodes, following the upstream RKE2 air-gapped install guide. The script currently performs these steps:
- Stage image tarballs: Image tarballs are downloaded and placed in the location RKE2 expects for air-gapped use (see here)
- Run the RKE2 install script from upstream: This is pulled directly from the RKE2 docs here
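The staging step above can be sketched roughly as follows. This is a minimal illustration, not the script used in this build: the `stage_images` function and the `ARTIFACT_DIR` staging path are hypothetical names, and the `IMAGES_DIR` default assumes the image directory described in the upstream air-gap guide.

```shell
#!/bin/sh
# Hypothetical sketch: copy RKE2 image tarballs into the directory RKE2 reads at startup.
# ARTIFACT_DIR is an assumed staging path; IMAGES_DIR assumes the upstream air-gap layout.
ARTIFACT_DIR="${ARTIFACT_DIR:-/root/rke2-artifacts}"
IMAGES_DIR="${IMAGES_DIR:-/var/lib/rancher/rke2/agent/images}"

stage_images() {
  src="$1"
  dest="$2"
  mkdir -p "$dest" || return 1
  found=0
  # Only image bundles are copied; the rke2 binary tarball stays in the artifact directory.
  for tarball in "$src"/rke2-images*.tar*; do
    [ -f "$tarball" ] || continue   # skip the literal pattern when nothing matches
    cp -f "$tarball" "$dest"/ || return 1
    found=$((found + 1))
  done
  echo "staged $found image tarball(s) into $dest"
  [ "$found" -gt 0 ]
}
```

A caller would then run `stage_images "$ARTIFACT_DIR" "$IMAGES_DIR"` and pass the same artifact directory to the upstream installer through its `INSTALL_RKE2_ARTIFACT_PATH` environment variable.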
The RKE2 Configuration script adds configurations for spinning up the RKE2 cluster in the STIG'd environment. Additionally, it injects an RKE2 Startup script that allows for simple, initial cluster bootstrapping via CLI.
The final portion of the build copies a few files into the image and ensures they have proper ownership for usage at runtime. The RKE2 STIG is the basis for these files. The files added are:
- An audit policy adhering to this STIG rule
- An RKE2 config file pre-configured to meet STIG rules (note that some STIG rules are met by default with RKE2 and not included in this configuration explicitly)
- A default pod security config - this allows full privileges for running pods and is added with the expectation that a policy enforcement engine like Kyverno or Gatekeeper is being used to restrict the same things, with exceptions as necessary
- A helper script for RKE2 startup - while RKE2 can certainly be run without this, this script can be used to add the RKE2 join address, token, and other properties to the RKE2 config file. It also corrects file permissions according to the STIG guide for files that do not exist until RKE2 startup has occurred.
Additionally, the etcd user and a sysctl config are added for RKE2. This follows the process documented in the RKE2 CIS Hardening guide.
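As a rough sketch of that host preparation, the commands below follow the general shape given in the RKE2 CIS hardening guide. The `run` and `prepare_host_for_rke2` helpers are hypothetical, the sysctl source path assumes a tarball install under `/usr/local`, and `DRY_RUN` defaults to `1` so nothing is changed unless it is explicitly disabled.

```shell
#!/bin/sh
# Sketch of the etcd user and sysctl setup; not the exact commands used in this build.
# DRY_RUN=1 (the default) only prints commands. Set DRY_RUN=0 and run as root to apply.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

prepare_host_for_rke2() {
  # etcd runs as a dedicated system user with no login shell and no home directory.
  if ! id etcd >/dev/null 2>&1; then
    run useradd -r -c "etcd user" -s /sbin/nologin -M etcd -U
  fi
  # Kernel parameters shipped with RKE2 (assumed path for a tarball install).
  run cp -f /usr/local/share/rke2/rke2-cis-sysctl.conf /etc/sysctl.d/60-rke2-cis.conf
  run systemctl restart systemd-sysctl
}
```

Calling `prepare_host_for_rke2` with the default settings prints the commands it would execute, which makes it easy to review them before applying.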
Finally, configuration of the cluster's networking and default services is provided so that the RKE2 cluster's defaults are compatible with, or can be replaced by, the components and services set up by UDS Core.
Important
Due to an upstream error in RKE2 and K3s, containerd is misconfigured, leading to image pull errors from 127.0.0.1 (local registry). See the rke2-startup.sh script and the containerd CRI docs for details.
RKE2 provides excellent tooling to build an RKE2 cluster, but meeting the STIG guides for RKE2 and deploying via IaC requires additional runtime configuration. The RKE2 Startup script injected during OS preparation is not required for startup, but it abstracts away some setup complexity.
This script provides a number of optional parameters depending on your desired configuration:
-t <token>
: RKE2 uses a secret token to join nodes to the cluster securely. This can be generated with something like openssl to create a secure random string.
-s <join address>
: RKE2 initializes on a "bootstrap" node. The '-s' argument is the IP address or hostname of the bootstrap node or cluster control plane and is used by new nodes to join the cluster. When this is either unset or matches the IP of the host RKE2 is being started on, RKE2 will initialize as the bootstrap node.
-a
: RKE2 has server and agent nodes. Agent nodes are Kubernetes worker nodes and do not host critical services like etcd or control-plane deployments. Pass '-a' to start the node as an agent.
-T <dns address>
: By default, the cluster-generated certificate is only valid for the loopback address and the private IPs RKE2 can find on the host's interfaces. When the cluster is accessed through a hostname or public IP, that address must be provided so it can be added to the cluster certificate.
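For the token, one possible approach is shown below. It is only a sketch: the `generate_token` helper is a hypothetical name, and the 32-byte length is an arbitrary choice rather than a requirement stated by RKE2.

```shell
#!/bin/sh
# Sketch: produce a random 64-character hex string to pass as the -t <token> value.
generate_token() {
  if command -v openssl >/dev/null 2>&1; then
    openssl rand -hex 32
  else
    # Fallback when openssl is unavailable: read 32 bytes from /dev/urandom.
    od -An -tx1 -N32 /dev/urandom | tr -d ' \n'
    echo
  fi
}

token="$(generate_token)"
# Report the length only, so the secret itself does not end up in logs.
echo "generated a ${#token}-character token"
```

The same token value must be supplied to every node that joins the cluster, so it should be generated once and distributed through whatever secret store the deployment already uses.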
The startup script should be run on each node, with a minimum of 3 server nodes for an HA setup plus additional agent nodes as needed. Ideally you should also set up load balancing for the server nodes (at minimum round-robin DNS) so that a single node failure does not cause access issues.
An example setup is provided below:
- Node1:
/root/rke2-startup.sh -t <token> -s <node1_ip> -T <rke2_lb_address>
- Node2:
/root/rke2-startup.sh -t <token> -s <rke2_lb_address> -T <rke2_dns_address>
- Node3:
/root/rke2-startup.sh -t <token> -s <rke2_lb_address> -T <rke2_dns_address>
- NodeN (agent nodes):
/root/rke2-startup.sh -t <token> -s <rke2_lb_address> -a
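To illustrate how arguments like these could map onto RKE2 configuration, the sketch below renders `config.yaml` content from a token, join address, and TLS SAN. It is not the contents of rke2-startup.sh: `is_bootstrap_node` and `render_rke2_config` are hypothetical helpers, and only the `token`, `server`, and `tls-san` keys and the supervisor port 9345 come from RKE2 itself.

```shell
#!/bin/sh
# Hypothetical sketch of mapping startup arguments onto RKE2 config keys.

# Succeeds when the node should initialize the cluster: no join address was given,
# or the join address matches one of this host's IPs (space-separated list).
is_bootstrap_node() {
  join_addr="$1"
  host_ips="$2"
  [ -z "$join_addr" ] && return 0
  for ip in $host_ips; do
    [ "$ip" = "$join_addr" ] && return 0
  done
  return 1
}

# Print config.yaml content for the given token, join address, TLS SAN, and host IPs.
render_rke2_config() {
  token="$1"
  join_addr="$2"
  tls_san="$3"
  host_ips="$4"
  echo "token: ${token}"
  if ! is_bootstrap_node "$join_addr" "$host_ips"; then
    # Joining nodes register with an existing server on the supervisor port.
    echo "server: https://${join_addr}:9345"
  fi
  if [ -n "$tls_san" ]; then
    echo "tls-san:"
    echo "  - ${tls_san}"
  fi
}
```

With this shape, a node whose own IP equals the join address gets no `server` entry and therefore bootstraps the cluster, while every other node is pointed at the join address.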
The containerd configuration template is within the startup script. This template adds options to the host's containerd configuration, as seen in the example below:
# /var/lib/rancher/rke2/agent/etc/containerd/config.toml.tmpl
version = 2
[plugins."io.containerd.internal.v1.opt"]
path = "/var/lib/rancher/rke2/agent/containerd"
[plugins."io.containerd.grpc.v1.cri"]
stream_server_address = "127.0.0.1"
stream_server_port = "10010"
enable_selinux = false
enable_unprivileged_ports = true
enable_unprivileged_icmp = true
sandbox_image = "registry1.dso.mil/ironbank/opensource/pause/pause:3.9"
[plugins."io.containerd.grpc.v1.cri".containerd]
snapshotter = "overlayfs"
disable_snapshot_annotations = true
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
runtime_type = "io.containerd.runc.v2"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
SystemdCgroup = true
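After RKE2 has started, it can be useful to confirm that settings from the template made it into the rendered containerd configuration. The check below is a minimal sketch: `has_setting` and `check_containerd_config` are hypothetical helpers, the list of settings is just a sample from the template above, and the default path assumes RKE2 renders `config.toml` next to the template in its standard data directory.

```shell
#!/bin/sh
# Sketch: verify that selected template settings appear in a containerd config file.
CONTAINERD_CONFIG="${CONTAINERD_CONFIG:-/var/lib/rancher/rke2/agent/etc/containerd/config.toml}"

# $1 = file, $2 = exact "key = value" text expected on its own line.
has_setting() {
  # Trim surrounding whitespace, then require a literal whole-line match.
  sed 's/^[[:space:]]*//; s/[[:space:]]*$//' "$1" 2>/dev/null | grep -Fxq -- "$2"
}

check_containerd_config() {
  file="$1"
  for setting in \
    'SystemdCgroup = true' \
    'enable_unprivileged_ports = true' \
    'snapshotter = "overlayfs"'; do
    if has_setting "$file" "$setting"; then
      echo "ok: $setting"
    else
      echo "missing: $setting"
      return 1
    fi
  done
}
```

Running `check_containerd_config "$CONTAINERD_CONFIG"` on a node returns a non-zero status at the first setting that is absent, which makes it usable in a post-boot health check.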