docs: Adding NodeClaims concept page to docs #6470

Merged 12 commits on Aug 7, 2024
2 changes: 1 addition & 1 deletion website/content/en/preview/concepts/disruption.md
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
---
title: "Disruption"
linkTitle: "Disruption"
weight: 4
weight: 50
description: >
Understand different ways Karpenter disrupts nodes
---
Expand Down
125 changes: 125 additions & 0 deletions website/content/en/preview/concepts/nodeclaims.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,125 @@
---
title: "NodeClaims"
linkTitle: "NodeClaims"
weight: 30
description: >
Understand NodeClaims
---

Karpenter uses NodeClaims to perform node autoscaling based on the requests of unschedulable pods and the
requirements of existing [NodePools]({{<ref "./nodepools" >}}) and associated [NodeClasses]({{<ref "./nodeclasses" >}}).
While NodeClaims require no direct user input, as a Karpenter user you can monitor NodeClaims to track
the status of your nodes when something goes wrong or a node drifts from its intended state.

Karpenter uses a NodeClaim as a merged representation of a cloud provider instance and the corresponding node on the cluster.
Karpenter creates NodeClaims in response to provisioning and disruption needs (pre-spin). Whenever Karpenter
creates a NodeClaim, it asks the cloud provider to create the instance (launch), registers the created node
with the NodeClaim (registration), and waits for the node and its resources to be ready (initialization).

This page describes how NodeClaims relate to other components of Karpenter and the related cloud provider.

To learn more about the nodes Karpenter is managing, you can look either directly at the NodeClaims or at the nodes they are associated with, depending on what you are interested in:

* Checking NodeClaims: If something goes wrong in the process of creating a node, you can look at the NodeClaim
to see where the node creation process might have stalled. Using `kubectl get nodeclaims` you can see the NodeClaims
for the cluster and using `kubectl describe nodeclaim <nodeclaim>` you can see the status of a particular NodeClaim.
For example, if the node is not available, you might see statuses indicating that the NodeClaim failed to launch, register, or initialize.

* Checking nodes: Use commands such as `kubectl get node` and `kubectl describe node <nodename>` to see the actual resources,
labels, and other attributes associated with a particular node.
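The two approaches can be combined. A minimal sketch, where the NodeClaim name `default-sfpsl` is illustrative (substitute a name from your own cluster):

```bash
# List all NodeClaims with their readiness and other summary columns.
kubectl get nodeclaims

# Inspect a specific NodeClaim; the Status section shows conditions for the
# launch, registration, and initialization phases, plus any failure messages.
kubectl describe nodeclaim default-sfpsl

# Jump from a NodeClaim to its node once registration has completed.
kubectl get node "$(kubectl get nodeclaim default-sfpsl -o jsonpath='{.status.nodeName}')"
```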

## NodeClaim roles in node creation

NodeClaims play a critical role in the Karpenter workflow, first for creating instances and registering them as nodes, and later for responding to node disruptions.

The following diagram illustrates how NodeClaims interact with other components during Karpenter-driven node creation.

![nodeclaim-node-creation](/nodeclaims.png)

{{% alert title="Note" color="primary" %}}
Set the `KARPENTER_NAMESPACE` environment variable to the namespace where you've installed Karpenter (`kube-system` is the default). Follow along with the Karpenter logs in your cluster by running the following:

```bash
export KARPENTER_NAMESPACE="kube-system"
kubectl logs -f -n "${KARPENTER_NAMESPACE}" \
-l app.kubernetes.io/name=karpenter -c controller
```
In a separate terminal, start some pods that would require Karpenter to create nodes to handle those pods.
For example, start up some inflate pods as described in [Scale up deployment]({{< ref "../getting-started/getting-started-with-karpenter/#6-scale-up-deployment" >}}).
{{% /alert %}}

As illustrated in the previous diagram, Karpenter interacts with NodeClaims and related components when creating a node:

1. Watches for pods and monitors NodePools and NodeClasses:
* Checks what the pod needs, such as requests for CPU, memory, architecture, and so on.
* Checks the constraints imposed by existing NodePools and NodeClasses, such as allowing pods to only run in specific zones, on certain architectures, or on particular operating systems.
Example of log messages at this stage:
```
{"level":"INFO","time":"2024-06-22T02:24:16.114Z","logger":"controller","message":
"found provisionable pod(s)","commit":"490ef94","controller":"provisioner",
"Pods":"default/inflate-66fb68585c-xvs86, default/inflate-66fb68585c-hpcdz,
default/inflate-66fb68585c-8xztf, default/inflate-66fb68585c-t29d8,
default/inflate-66fb68585c-nxflz","duration":"100.761702ms"}
```

2. Asks the Kubernetes API server to create a NodeClaim object to satisfy the pod and NodePool needs.
Example of log messages at this stage:
```
{"level":"INFO","time":"2024-06-22T02:24:16.114Z","logger":"controller","message":
"computed new nodeclaim(s) to fit pod(s)","commit":"490ef94","controller":
"provisioner","nodeclaims":1,"pods":5}
```
3. Finds the new NodeClaim and checks its requirements (pre-spin).
Example of log messages at this stage:
```
{"level":"INFO","time":"2024-06-22T02:24:16.128Z","logger":"controller","message": "created nodeclaim","commit":"490ef94","controller":"provisioner","NodePool":
{"name":"default"},"NodeClaim":{"name":"default-sfpsl"},"requests":
{"cpu":"5150m","pods":"8"},"instance-types":"c3.2xlarge, c4.2xlarge, c4.4xlarge,
c5.2xlarge, c5.4xlarge and 55 other(s)"}
```
4. Based on the NodeClaim’s requirements, directs the cloud provider to create an instance that meets those requirements (launch):
Example of log messages at this stage:
```
{"level":"INFO","time":"2024-06-22T02:24:19.028Z","logger":"controller","message": "launched nodeclaim","commit":"490ef94","controller":"nodeclaim.lifecycle",
"controllerGroup":"karpenter.sh","controllerKind":"NodeClaim","NodeClaim":
{"name":"default-sfpsl"},"namespace":"","name":"default-sfpsl","reconcileID":
"9c9dbc80-3f0f-43ab-b01d-faac6c29e979","provider-id":
"aws:///us-west-2b/i-08a3bf1cadb205c7e","instance-type":"c3.2xlarge","zone":
"us-west-2b","capacity-type":"spot","allocatable":{"cpu":"7910m",
"ephemeral-storage":"17Gi","memory":"13215Mi","pods":"58"}}
```

5. Gathers the necessary metadata from the NodeClaim to check and set up the node.
6. Checks that the Node exists, and prepares it for use. This includes:
* Waiting for the instance to return a provider ID, instance type, zone, capacity type,
and an allocatable status, indicating that the instance is ready.
* Checking to see if the node has been synced, adding a finalizer to the node (this provides the same
termination guarantees that all Karpenter nodes have), passing in labels, and updating the node owner references.
Example of log messages at this stage:
```
{"level":"INFO","time":"2024-06-22T02:24:52.642Z","logger":"controller","message":
"initialized nodeclaim","commit":"490ef94","controller":
"nodeclaim.lifecycle","controllerGroup":"karpenter.sh","controllerKind":
"NodeClaim","NodeClaim":{"name":"default-sfpsl"},"namespace":
"","name":"default-sfpsl","reconcileID":
"7e7a671d-887f-428d-bd79-ddf603290f0a",
"provider-id":"aws:///us-west-2b/i-08a3bf1cadb205c7e",
"Node":{"name":"ip-192-168-170-220.us-west-2.compute.internal"},
"allocatable":{"cpu":"7910m","ephemeral-storage":"18242267924",
"hugepages-2Mi":"0","memory":"14320468Ki","pods":"58"}}
```
7. Makes sure that the Node is ready to use, for example by confirming that resources are registered and start-up taints are removed.
8. Checks the node for liveness.

At this point, the node is considered ready to go.
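The lifecycle steps above are reflected as status conditions on the NodeClaim itself. The following is an illustrative sketch of the `status` stanza you might see from `kubectl get nodeclaim <nodeclaim> -o yaml` once a node is ready; the field values are taken from the log examples above, and the exact condition names can vary between Karpenter versions:

```yaml
status:
  nodeName: ip-192-168-170-220.us-west-2.compute.internal
  providerID: aws:///us-west-2b/i-08a3bf1cadb205c7e
  conditions:
    - type: Launched      # instance created by the cloud provider
      status: "True"
    - type: Registered    # node joined the cluster and was matched to the NodeClaim
      status: "True"
    - type: Initialized   # resources registered and start-up taints removed
      status: "True"
```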

If a node has not registered within 15 minutes of its creation, Karpenter assumes there is a problem
that isn’t going to be fixed and deletes the NodeClaim; no pods are deployed to that node.
After the NodeClaim is deleted, Karpenter will try to create a new one.

## NodeClaim drift and disruption

Although NodeClaims play a role in replacing nodes that drift or are disrupted,
as a Karpenter user you should not modify NodeClaims directly; treat them only as a tool for monitoring.

For details on Karpenter disruption, see [Disruption]({{< ref "./disruption" >}}).
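One way to observe Karpenter's drift and disruption activity without intervening in it is to follow the events recorded against NodeClaim objects. A sketch, with the caveat that the event contents vary by Karpenter version:

```bash
# Stream cluster events for NodeClaim objects, such as drift detection or
# disruption decisions recorded by Karpenter's controllers.
kubectl get events --field-selector involvedObject.kind=NodeClaim --watch
```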
2 changes: 1 addition & 1 deletion website/content/en/preview/concepts/nodeclasses.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
---
---
title: "NodeClasses"
linkTitle: "NodeClasses"
weight: 2
Expand Down
2 changes: 1 addition & 1 deletion website/content/en/preview/concepts/nodepools.md
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
---
title: "NodePools"
linkTitle: "NodePools"
weight: 1
weight: 10
description: >
Configure Karpenter with NodePools
---
Expand Down
2 changes: 1 addition & 1 deletion website/content/en/preview/concepts/scheduling.md
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
---
title: "Scheduling"
linkTitle: "Scheduling"
weight: 3
weight: 40
description: >
Learn about scheduling workloads with Karpenter
---
Expand Down
Binary file added website/static/nodeclaims.png