docs: Improve security docs (#980)
HomayoonAlimohammadi authored Jan 17, 2025
1 parent 5a7cca2 commit 7ed5864
Showing 5 changed files with 126 additions and 125 deletions.
23 changes: 12 additions & 11 deletions docs/src/snap/howto/restore-quorum.md
@@ -1,4 +1,4 @@
-# Recovering a Cluster After Quorum Loss
+# Recovering a cluster after quorum loss

Highly available {{product}} clusters can survive losing one or more
nodes. [Dqlite], the default datastore, implements a [Raft] based protocol
@@ -11,14 +11,14 @@ steps outlined in this document.

```{note}
This guide can be used to recover the default {{product}} datastore,
-dqlite. Persistent volumes on the lost nodes are *not* recovered.
+Dqlite. Persistent volumes on the lost nodes are *not* recovered.
```

-## Dqlite Configuration
+## Dqlite configuration

-Be aware that {{product}} uses not one, but two dqlite databases:
+Be aware that {{product}} uses not one, but two Dqlite databases:

-* k8s-dqlite - used by Kubernetes itself
+* k8s-dqlite - used by Kubernetes itself (as an ETCD replacement)
* k8sd - Kubernetes cluster management data

Each database has its own state directory:
@@ -29,8 +29,8 @@ Each database has its own state directory:
The state directory normally contains:

* ``info.yaml`` - the id, address and cluster role of this node
-* ``cluster.yaml`` - the state of the cluster, as seen by this dqlite node.
-It includes the same information as info.yaml, but for all cluster nodes.
+* ``cluster.yaml`` - the state of the cluster, as seen by this Dqlite node.
+It includes the same information as info.yaml, but for all cluster nodes
* ``00000abcxx-00000abcxx``, ``open-abc`` - database segments
* ``cluster.crt``, ``cluster.key`` - node certificates
* ``snapshot-abc-abc-abc.meta``
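The shape of these files can be illustrated with a short, self-contained sketch. The node ID, address, role value, and the `/tmp` path below are all made up for illustration; on a real node the files live in the Dqlite state directories described above:

```shell
# Create a sample info.yaml with the fields described above
# (id, address, cluster role). All values here are hypothetical.
cat > /tmp/info.yaml <<'EOF'
Address: 10.0.0.11:9000
ID: 3297041220608546238
Role: 0
EOF

# Role 0 marks a voter; see the roles table for the full mapping.
grep '^Role:' /tmp/info.yaml
```

`cluster.yaml` repeats the same fields once per cluster member, as a YAML list.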
@@ -53,7 +53,7 @@ Dqlite cluster members have one of the following roles:
| 1 | stand-by | yes | no |
| 2 | spare | no | no |

-## Stop {{product}} Services on All Nodes
+## Stop {{product}} services on all nodes

Before recovering the cluster, all remaining {{product}} services
must be stopped. Use the following command on every node:
@@ -62,7 +62,7 @@ must be stopped. Use the following command on every node:
```
sudo snap stop k8s
```

-## Recover the Database
+## Recover the database

Choose one of the remaining alive cluster nodes that has the most recent
version of the Raft log.
@@ -73,7 +73,7 @@ Update the ``cluster.yaml`` files, changing the role of the lost nodes to
files were moved across nodes.
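As a sketch of this edit, the following works on a made-up two-node `cluster.yaml`. The node IDs, addresses, and the `/tmp` path are hypothetical; on a real node you would edit the `cluster.yaml` inside each Dqlite state directory:

```shell
# Hypothetical cluster.yaml: node 10.0.0.11 survived, node 10.0.0.12
# was lost. Both start out as voters (role 0).
cat > /tmp/cluster.yaml <<'EOF'
- Address: 10.0.0.11:9000
  ID: 1297041220608546238
  Role: 0
- Address: 10.0.0.12:9000
  ID: 8876314560433697596
  Role: 0
EOF

# Demote the lost node to "spare" (role 2) so it no longer counts
# towards the Raft quorum; the surviving node stays a voter (role 0).
sed -i '/10.0.0.12/,/Role/ s/Role: 0/Role: 2/' /tmp/cluster.yaml

grep -c 'Role: 2' /tmp/cluster.yaml
```

The same change must be made consistently in every surviving node's copy of the file.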

The following command guides us through the recovery process, prompting a text
-editor with informative inline comments for each of the dqlite configuration
+editor with informative inline comments for each of the Dqlite configuration
files.

```
@@ -112,11 +112,12 @@ sudo snap start k8s
```

Ensure that the services started successfully by using
-``sudo snap services k8s``. Use ``k8s status --wait-ready`` to wait for the
+``sudo snap services k8s``. Use ``sudo k8s status --wait-ready`` to wait for the
cluster to become ready.

You may notice that we have not returned to an HA cluster yet:
``high availability: no``. This is expected as we need to recover
the remaining nodes.

## Recover the remaining nodes

40 changes: 20 additions & 20 deletions docs/src/snap/howto/security/cis-assessment.md
@@ -4,7 +4,7 @@ CIS Hardening refers to the process of implementing security configurations that
align with the benchmarks set by the [Center for Internet Security (CIS)].
Out of the box {{product}} complies with the majority of the recommended
CIS security configurations. Since implementing all security recommendations
-would comes at the expense of compatibility and/or performance we expect
+would come at the expense of compatibility and/or performance we expect
cluster administrators to follow post deployment hardening steps based on their
needs. This guide covers:

@@ -38,7 +38,7 @@ Download the latest [kube-bench release] on your Kubernetes nodes. Make sure
to select the appropriate binary version.

For example, to download the Linux binary, use the following command. Replace
-`KB` by the version listed in the releases page.
+`KB` by the version listed in the releases page:

```
KB=8.0
@@ -54,15 +54,15 @@ tar -xvf kube-bench_0.$KB\_linux_amd64.tar.gz
sudo mv kube-bench /usr/local/bin/
```

-Verify kube-bench installation.
+Verify kube-bench installation:

```
kube-bench version
```

The output should list the version installed.

-Install `kubectl` and configure it to interact with the cluster.
+Install `kubectl` and configure it to interact with the cluster:

```{warning}
This will override your ~/.kube/config if you already have kubectl installed in your cluster.
```
@@ -144,9 +144,9 @@ and, when possible, provide information on how to comply with each
one manually. This can be used for manually auditing the CIS
hardening state of a cluster.

-### Control Plane Security Configuration
+### Control plane security configuration

-#### Control Plane Node Configuration Files
+#### Control plane node configuration files

##### CIS Control 1.1.1

@@ -1662,7 +1662,7 @@ AES_128_CBC_SHA,TLS_RSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES
_256_CBC_SHA,TLS_RSA_WITH_AES_256_GCM_SHA384
```

-#### Controller Manager
+#### Controller manager

##### CIS Control 1.3.1

@@ -1928,9 +1928,9 @@ and restart the kube-scheduler service
---bind-address=127.0.0.1
```

-### Datastore Node Configuration
+### Datastore node configuration

-#### Datastore Node Configuration
+#### Datastore node configuration

##### CIS Control 2.1

@@ -2065,9 +2065,9 @@ communication uses certificates
created upon cluster setup.


-### Control Plane Configuration
+### Control plane configuration

-#### Authentication and Authorization
+#### Authentication and authorization

##### CIS Control 3.1.1

@@ -2137,9 +2137,9 @@ is recommended
(the most basic level of logging).


-### Worker Node Security Configuration
+### Worker node security configuration

-#### Worker Node Configuration Files
+#### Worker node configuration files

##### CIS Control 4.1.1

@@ -2868,9 +2868,9 @@ H_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_
RSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_128_GCM_SHA256
```

-### Kubernetes Policies
+### Kubernetes policies

-#### RBAC and Service Accounts
+#### RBAC and service accounts

##### CIS Control 5.1.1

@@ -2991,7 +2991,7 @@ Where possible, remove the impersonate, bind and escalate
rights from subjects.


-#### Pod Security Standards
+#### Pod security standards

##### CIS Control 5.2.1 / DISA STIG V-254800

@@ -3188,7 +3188,7 @@ workloads to restrict the
admission of containers which use `hostPort` sections.


-#### Network Policies and CNI
+#### Network policies and CNI

##### CIS Control 5.3.1

@@ -3219,7 +3219,7 @@ Follow the documentation and create NetworkPolicy objects as
you need them.


-#### Secrets Management
+#### Secrets management

##### CIS Control 5.4.1

@@ -3250,7 +3250,7 @@ cloud provider or a third-party
secrets management solution.


-#### Extensible Admission Control
+#### Extensible admission control

##### CIS Control 5.5.1

@@ -3266,7 +3266,7 @@ Follow the Kubernetes documentation and setup image
provenance.


-#### General Policies
+#### General policies

##### CIS Control 5.7.1
