Installing Kueue by applying manifest fails on GKE #2675

Open
cortadocodes opened this issue Jan 21, 2025 · 1 comment

Terraform Version, Provider Version and Kubernetes Version

Terraform version: 1.10.4
Kubernetes provider version: 2.35.1
Kubernetes version: 1.30.8-gke.1051000
Kueue version: 0.10.1

Affected Resource(s)

  • kubernetes_manifest

Terraform Configuration Files

terraform {
  required_version = ">= 1.8.0"

  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~>6.12"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "~>2.35.1"
    }
    kubectl = {
      source  = "gavinbunney/kubectl"
      version = "~>1.19.0"
    }
  }

  cloud {
    REDACTED
  }
}


provider "google" {
  credentials = file(var.google_cloud_credentials_file)
  project     = var.google_cloud_project_id
  region      = var.google_cloud_region
}


data "google_client_config" "default" {}


provider "kubernetes" {
  host                   = "https://${google_container_cluster.primary.endpoint}"
  token                  = data.google_client_config.default.access_token
  cluster_ca_certificate = base64decode(google_container_cluster.primary.master_auth[0].cluster_ca_certificate)
}


provider "kubectl" {
  load_config_file       = false
  host                   = "https://${google_container_cluster.primary.endpoint}"
  token                  = data.google_client_config.default.access_token
  cluster_ca_certificate = base64decode(google_container_cluster.primary.master_auth[0].cluster_ca_certificate)
}


resource "google_project_iam_binding" "default_node_service_account" {
  project = var.google_cloud_project_id
  role    = "roles/container.defaultNodeServiceAccount"
  members = ["serviceAccount:${var.google_cloud_project_number}-compute@developer.gserviceaccount.com"]
}


resource "google_container_cluster" "primary" {
  name                = "${terraform.workspace}-cluster"
  location            = var.google_cloud_region
  enable_autopilot    = true
  deletion_protection = var.deletion_protection
  depends_on          = [time_sleep.wait_for_google_apis_to_enable, google_project_iam_binding.default_node_service_account]
}


resource "time_sleep" "wait_for_cluster_to_be_ready" {
  depends_on      = [google_container_cluster.primary]
  create_duration = "2m"
}


# Get the Kueue installation manifests.
data "http" "kueue_installation_manifests" {
  url = "https://github.com/kubernetes-sigs/kueue/releases/download/${var.kueue_version}/manifests.yaml"
}


# Split the multi-document YAML manifest into separate manifests.
locals {
  kueue_manifests = provider::kubernetes::manifest_decode_multi(data.http.kueue_installation_manifests.response_body)
}


# Install Kueue on the cluster by applying the installation manifests.
resource "kubernetes_manifest" "install_kueue" {
  for_each = {
    for manifest in local.kueue_manifests :
    "${manifest.kind}--${manifest.metadata.name}" => manifest
  }
  manifest   = each.value
  depends_on = [time_sleep.wait_for_cluster_to_be_ready]
}

Debug Output

https://gist.github.com/cortadocodes/da6e5e55a5590224f085f9e17abecae8

Panic Output

N/A

Steps to Reproduce

  1. terraform apply
  2. Approve apply

Expected Behavior

All of the YAML documents in the Kueue v0.10.1 manifest should have been applied.

Actual Behavior

All resources were applied successfully apart from:

  • MutatingWebhookConfiguration (kueue-mutating-webhook-configuration)
  • Deployment (kueue-controller-manager)

Important Factoids

Normally Kueue is installed using one multi-document YAML manifest. To make this work in Terraform, I've had to split this manifest into single-document manifests.
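
As a rough illustration of the splitting step described above, here is a minimal, self-contained sketch. The inline YAML, the `example_*` names, and the `"cluster"` placeholder are hypothetical and not taken from the Kueue manifest. It also adds the namespace to each key, on the assumption that two namespaced objects of the same kind could otherwise share a name and collide in `for_each`; whether that can happen with the Kueue manifest has not been checked here.

```hcl
terraform {
  # Provider-defined functions such as manifest_decode_multi need Terraform 1.8+.
  required_version = ">= 1.8.0"

  required_providers {
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "~>2.35.1"
    }
  }
}

locals {
  # Hypothetical two-document manifest standing in for the real Kueue YAML.
  example_yaml = <<-YAML
    apiVersion: v1
    kind: Namespace
    metadata:
      name: example-system
    ---
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: example-controller
      namespace: example-system
  YAML

  # Decode the multi-document YAML into a tuple of single-document objects.
  example_manifests = provider::kubernetes::manifest_decode_multi(local.example_yaml)

  # Key each document by kind, namespace, and name. Cluster-scoped objects
  # have no metadata.namespace, so try() falls back to a placeholder.
  example_by_key = {
    for m in local.example_manifests :
    "${m.kind}--${try(m.metadata.namespace, "cluster")}--${m.metadata.name}" => m
  }
}

# Print the generated keys so they can be inspected with `terraform plan`.
output "example_keys" {
  value = keys(local.example_by_key)
}
```

Note that changing the key format of an existing `for_each` changes the resource addresses, so Terraform would plan to destroy and recreate resources already in state.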

References

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

@cortadocodes (Author) commented:

For reference, I've now got this working by swapping the kubernetes_manifest block for a kubectl_manifest block:

# Install Kueue on the cluster by applying the installation manifests.
resource "kubectl_manifest" "install_kueue" {
  for_each = {
    for manifest in local.kueue_manifests :
    "${manifest.kind}--${manifest.metadata.name}" => manifest
  }
  yaml_body         = yamlencode(each.value)
  server_side_apply = true
  depends_on        = [time_sleep.wait_for_cluster_to_be_ready]
}
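
A possible variation on this workaround, sketched here and untested against this cluster: the `gavinbunney/kubectl` provider ships a `kubectl_file_documents` data source that splits multi-document YAML itself, which would avoid decoding with the kubernetes provider function and re-encoding with `yamlencode`. The data source name `kueue` is made up for this example; `data.http.kueue_installation_manifests` and `time_sleep.wait_for_cluster_to_be_ready` are assumed to be the ones defined in the configuration above.

```hcl
# Split the downloaded multi-document YAML using the kubectl provider.
data "kubectl_file_documents" "kueue" {
  content = data.http.kueue_installation_manifests.response_body
}

# Apply each document; `manifests` is a map of document ID => YAML string,
# so each.value can be passed to yaml_body without re-encoding.
resource "kubectl_manifest" "install_kueue" {
  for_each          = data.kubectl_file_documents.kueue.manifests
  yaml_body         = each.value
  server_side_apply = true
  depends_on        = [time_sleep.wait_for_cluster_to_be_ready]
}
```

The `for_each` keys here differ from the `kind--name` keys used above, so switching between the two forms would change the resource addresses in state.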
