
feat(chart): expose health port via svc #6037

Closed
wants to merge 3 commits

Conversation


@owais-maswala owais-maswala commented Apr 14, 2024

Description

Expose health port via Service

apiVersion: v1
kind: Service
metadata:
  name: karpenter
  namespace: dev
  labels:
    helm.sh/chart: karpenter-0.36.0
    app.kubernetes.io/name: karpenter
    app.kubernetes.io/instance: karpenter
    app.kubernetes.io/version: "0.36.0"
    app.kubernetes.io/managed-by: Helm
spec:
  type: ClusterIP
  ports:
    - name: http
      port: 8081
      targetPort: http
      protocol: TCP
    - name: http-metrics
      port: 8000
      targetPort: http-metrics
      protocol: TCP
  selector:
    app.kubernetes.io/name: karpenter
    app.kubernetes.io/instance: karpenter

Tested on a k8s cluster; the health endpoint is reachable at karpenter.dev.svc.cluster.local:8081/healthz
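A quick way to verify the new Service port from inside the cluster is a throwaway curl pod (a sketch; the `dev` namespace and the curl image are assumptions, matching the manifest above):

```shell
# Run a temporary pod and hit the health endpoint through the Service.
# Assumes the chart is installed in the "dev" namespace as in the manifest above.
kubectl run curl-check --rm -it --restart=Never \
  --image=curlimages/curl -- \
  curl -s -o /dev/null -w "%{http_code}\n" \
  http://karpenter.dev.svc.cluster.local:8081/healthz
```

A healthy controller should return a 200 status code.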

How was this change tested?

Does this change impact docs?

  • Yes, PR includes docs updates
  • Yes, issue opened: #
  • No

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@owais-maswala owais-maswala requested a review from a team as a code owner April 14, 2024 09:28

netlify bot commented Apr 14, 2024

Deploy Preview for karpenter-docs-prod canceled.

  • Latest commit: 043ce05
  • Latest deploy log: https://app.netlify.com/sites/karpenter-docs-prod/deploys/661fa7951cd5c10008f093ca

Contributor

@engedaam engedaam left a comment

LGTM 🚀

Contributor

@engedaam engedaam left a comment

/karpenter snapshot

@coveralls

Pull Request Test Coverage Report for Build 8720520076

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 82.015%

Totals

  • Change from base Build 8713470070: 0.0%
  • Covered Lines: 5372
  • Relevant Lines: 6550

💛 - Coveralls

Contributor

Snapshot successfully published to oci://021119463062.dkr.ecr.us-east-1.amazonaws.com/karpenter/snapshot/karpenter:0-043ce05b9e5b233c1e1f5b47e154efcda43c9b59.
To install, you must log in to the ECR repo with an AWS account:

aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin 021119463062.dkr.ecr.us-east-1.amazonaws.com

helm upgrade --install karpenter oci://021119463062.dkr.ecr.us-east-1.amazonaws.com/karpenter/snapshot/karpenter --version "0-043ce05b9e5b233c1e1f5b47e154efcda43c9b59" --namespace "kube-system" --create-namespace \
  --set "settings.clusterName=${CLUSTER_NAME}" \
  --set "settings.interruptionQueue=${CLUSTER_NAME}" \
  --set controller.resources.requests.cpu=1 \
  --set controller.resources.requests.memory=1Gi \
  --set controller.resources.limits.cpu=1 \
  --set controller.resources.limits.memory=1Gi \
  --wait

@ellistarn
Contributor

What use case does this meet? It's a bit odd to me that we would load-balance a health port.

@owais-maswala
Author

> What use case does this meet? It's a bit odd to me that we would load-balance a health port.

I'd like to monitor the health/availability of Karpenter using the Blackbox Exporter/Prometheus Probes

@ellistarn
Contributor

Is it not possible to scrape the pods directly?

@owais-maswala
Author

owais-maswala commented Apr 22, 2024

> Is it not possible to scrape the pods directly?

Sure, it's possible, but pods are ephemeral, and it would involve using Endpoints to dynamically update our Probe config at regular intervals.

Probing the Service instead provides a stable endpoint and an abstraction away from individual pods. I am setting up SLOs, as opposed to alerting (which we already have at the pod level).

We have the same setup for 20+ other services on our clusters (hoping for some consistency), but Karpenter is the exception, as it does not currently have its health port exposed via the Service.

It's worth noting that Prometheus Probes only accept static targets or dynamic Ingress objects, so Endpoints would not work in this scenario.
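For context, the setup described above could look something like the following Prometheus Operator Probe, pointing a Blackbox Exporter at the Service's health endpoint as a static target (a sketch; the `monitoring` namespace, exporter address, and `http_2xx` module are assumptions, not part of this PR):

```yaml
# Hypothetical Probe manifest: checks Karpenter's health endpoint through
# the Service via a Blackbox Exporter. The prober URL and module name are
# illustrative and depend on how the exporter is deployed.
apiVersion: monitoring.coreos.com/v1
kind: Probe
metadata:
  name: karpenter-health
  namespace: dev
spec:
  jobName: karpenter-health
  prober:
    url: blackbox-exporter.monitoring.svc.cluster.local:9115  # assumed exporter address
  module: http_2xx  # assumed Blackbox Exporter module
  targets:
    staticConfig:
      static:
        - http://karpenter.dev.svc.cluster.local:8081/healthz
```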

@jonathan-innis
Contributor

jonathan-innis commented Apr 23, 2024

> We have the same setup for 20+ other services on our clusters (hoping for some consistency), but Karpenter is the exception as it does not currently have its health port exposed via the Service

Just so I have a little more context: can you point to a few upstream projects that expose their health ports through the Service?

> I'd like to monitor the health/availability of Karpenter using the Blackbox Exporter/Prometheus Probes

Can you do this through kube-state? Doesn't the health of the pod get exposed back up through status conditions? Or is there something that you gain from the health probe port that doesn't get exposed through kube-state?

Contributor

github-actions bot commented May 7, 2024

This PR has been inactive for 14 days. StaleBot will close this stale PR after 14 more days of inactivity.


5 participants