
Introduce Cluster reference field in the ProviderServiceAccount API type #2284

Open · srm09 opened this issue Aug 27, 2023 · 14 comments · May be fixed by #2623
Labels: area/supervisor, kind/feature

Comments

srm09 (Contributor) commented Aug 27, 2023

/kind feature

Describe the solution you'd like
The ProviderServiceAccount API type uses a VSphereCluster reference to sync the service account secret back to the workload cluster. Before ClusterClass, the VSphereCluster reference was easy to determine, since the object was expected to be created by the end user. With the advent and popularity of ClusterClass, the VSphereCluster name is generated dynamically by the topology controller and is not easily determined up front.
Since the internals of the controller use the VSphereCluster reference only to obtain the name of the Cluster object itself, and then use the remote client of that cluster to sync the service account secret back to the workload cluster, it makes sense to update the API type to reference the name of the Cluster instead of the VSphereCluster.

Anything else you would like to add:

  • ProviderServiceAccount API type referencing the VSphereCluster object:
  • ProviderServiceAccount controller logic which deduces the name of the Cluster object from the VSphereCluster object reference:
    cluster, err := clusterutilv1.GetClusterFromMetadata(r, r.Client, vsphereCluster.ObjectMeta)
    if err != nil {
        r.Logger.Info("unable to get capi cluster from vsphereCluster", "err", err)
        return reconcile.Result{}, nil
    }
    // Pause reconciliation if entire vSphereCluster or Cluster is paused
    if annotations.IsPaused(cluster, vsphereCluster) {
        // Logged with structured key/value pairs; logr's Info does not accept
        // printf-style format arguments.
        r.Logger.V(4).Info("VSphereCluster linked to a cluster that is paused",
            "namespace", vsphereCluster.Namespace, "name", vsphereCluster.Name)
        return reconcile.Result{}, nil
    }
    // We cannot proceed until we are able to access the target cluster. Until
    // then just return a no-op and wait for the next sync. This will occur when
    // the Cluster's status is updated with a reference to the secret that has
    // the Kubeconfig data used to access the target cluster.
    guestClient, err := r.remoteClusterCacheTracker.GetClient(clusterContext, client.ObjectKeyFromObject(cluster))
    if err != nil {
        if errors.Is(err, remote.ErrClusterLocked) {
            r.Logger.V(5).Info("Requeuing because another worker has the lock on the ClusterCacheTracker")
            return ctrl.Result{Requeue: true}, nil
        }
        clusterContext.Logger.Info("The control plane is not ready yet", "err", err)
        return reconcile.Result{RequeueAfter: clusterNotReadyRequeueTime}, nil
    }

The proposal is to update the API type as follows:

// ProviderServiceAccountSpec defines the desired state of ProviderServiceAccount.
type ProviderServiceAccountSpec struct {
	// Ref specifies the reference to the VSphereCluster for which the ProviderServiceAccount needs to be realized.
	//
	// Deprecated: Use ClusterRef instead, which references the CAPI `Cluster` object.
	Ref *corev1.ObjectReference `json:"ref"`

	// ClusterRef specifies the reference to the Cluster object for which the ProviderServiceAccount needs to be realized.
	ClusterRef *corev1.ObjectReference `json:"clusterRef"` // <<======== new field

	// Rules specifies the privileges that need to be granted to the service account.
	Rules []rbacv1.PolicyRule `json:"rules"`

	// TargetNamespace is the namespace in the target cluster where the secret containing the generated service account
	// token needs to be created.
	TargetNamespace string `json:"targetNamespace"`

	// TargetSecretName is the name of the secret in the target cluster that contains the generated service account
	// token.
	TargetSecretName string `json:"targetSecretName"`
}

Add a new validation webhook that allows setting only one of Ref or ClusterRef in the ProviderServiceAccount object; a sketch of the rule follows. We could go one step further and introduce logic to automatically set ClusterRef when Ref is set in the spec. This would make the transition to the new API field much smoother.
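A minimal sketch of what that validation rule could look like (the helper name and error messages are assumptions, not a merged implementation; the real webhook would call this on both create and update):

package webhooks

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// validateRefs is a hypothetical helper enforcing the proposed rule that
// exactly one of spec.ref and spec.clusterRef is set.
func validateRefs(ref, clusterRef *corev1.ObjectReference) error {
	switch {
	case ref != nil && clusterRef != nil:
		return fmt.Errorf("only one of spec.ref and spec.clusterRef may be set")
	case ref == nil && clusterRef == nil:
		return fmt.Errorf("one of spec.ref or spec.clusterRef must be set")
	}
	return nil
}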

Environment:

  • Cluster-api-provider-vsphere version:
  • Kubernetes version: (use kubectl version):
  • OS (e.g. from /etc/os-release):
k8s-ci-robot added the kind/feature label Aug 27, 2023
sbueringer (Member) commented Aug 28, 2023

Sounds good to me overall.

I would:

  • with v1beta1: (now)
    • add the new clusterRef field
    • deprecate the ref field (assuming we want to)
    • add the webhook to allow only setting one of them
  • with v1beta2:
    • remove the ref field
    • implement conversion to convert from ref to clusterRef
      • I think this probably requires one of:
        • changes to controller-runtime, as ConvertTo / ConvertFrom can today only be implemented on the API type directly (so we don't have a client available to retrieve the VSphereCluster); see the interface excerpt after this list
        • alternatively some other way to map a ProviderServiceAccount to a Cluster, are we setting the cluster-name label? (xref: CAPV created resources should have the CAPI cluster-name label #2218)
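For reference, this is the conversion contract from sigs.k8s.io/controller-runtime/pkg/conversion that makes the first option necessary: both methods live on the API type and receive only the counterpart object (no client), so a conversion webhook cannot resolve a VSphereCluster reference into its owning Cluster.

package conversion // excerpted from sigs.k8s.io/controller-runtime/pkg/conversion

import "k8s.io/apimachinery/pkg/runtime"

// Convertible is implemented by spoke API versions. ConvertTo / ConvertFrom
// receive only the hub object, so no lookups against the API server are
// possible during conversion.
type Convertible interface {
	runtime.Object
	ConvertTo(dst Hub) error
	ConvertFrom(src Hub) error
}

// Hub marks the storage version that spoke versions convert to and from.
type Hub interface {
	runtime.Object
	Hub()
}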

srm09 (Contributor, Author) commented Aug 28, 2023

implement conversion to convert from ref to clusterRef

My original thought was to have it as part of the conversion webhook, but since that isn't possible, I was proposing to do this as part of the controller reconcile logic itself.

srm09 (Contributor, Author) commented Aug 28, 2023

alternatively some other way to map a ProviderServiceAccount to a Cluster, are we setting the cluster-name label?

We are not setting the cluster label, just as we do not for VSphereVM right now. It should be straightforward to set this label as well, and IMO it is a good idea to do it. Since no such CAPI contract exists, it is the responsibility of the provider to set the label; see the sketch below.
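A minimal sketch of such a helper, assuming the provider stamps the standard CAPI label key directly (the function name is hypothetical):

package util

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// ensureClusterNameLabel is a hypothetical helper that stamps the CAPI
// cluster-name label on a provider-created object (e.g. a
// ProviderServiceAccount or VSphereVM) so it can be mapped back to its
// owning Cluster without following a VSphereCluster reference.
func ensureClusterNameLabel(obj metav1.Object, clusterName string) {
	labels := obj.GetLabels()
	if labels == nil {
		labels = map[string]string{}
	}
	labels["cluster.x-k8s.io/cluster-name"] = clusterName
	obj.SetLabels(labels)
}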

Do you want to create an issue to track adding the CAPI cluster.x-k8s.io/cluster-name label to both the CRs mentioned above?

sbueringer (Member) commented Aug 29, 2023

Do you want to create an issue to track adding the CAPI cluster.x-k8s.io/cluster-name label to both the CRs mentioned above?

Let's add it to the issue you already created for other CRDs?

My original thought was to have it as part of the conversion webhook, but since it isn't possible, I was proposing of doing this as part of the controller reconcile logic itself.

Yup, conversion only works between API versions. I think the problem with rewriting it in the controller is that it's easy to end up in infinite reconciles between CAPV and other controllers (e.g. GitOps or whoever creates the objects). It also means that our controller would own the field (managed fields / SSA).

Also, I'm not sure how it makes the transition smoother 😀. Folks have to change it on their side anyway, and we would support the old field in v1beta1 in any case (and drop support in v1beta2).

sbueringer (Member) commented

@srm09 If you want, we can now also return a warning via the validation webhook when ref is set instead of clusterRef. But I think that is optional.

srm09 (Contributor, Author) commented Nov 15, 2023

/retitle Introduce Cluster reference field in the ProviderServiceAccount API type

k8s-ci-robot changed the title from "Introduce Cluster reference filed in the ProviderServiceAccount API type" to "Introduce Cluster reference field in the ProviderServiceAccount API type" on Nov 15, 2023
zhanggbj (Contributor) commented Jan 5, 2024

/assign
I would like to work on it if this is still valid :-)

sbueringer (Member) commented

@zhanggbj Please wait a bit on this issue. I think it is still valid in general; I just recently noticed some weirdness in the corresponding controller and would like to refactor it first. I should get around to it in the next few days.

I think this refactoring will have an impact on this change.

k8s-triage-robot commented

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-ci-robot added the lifecycle/stale label Apr 4, 2024
sbueringer (Member) commented

/remove-lifecycle stale

k8s-ci-robot removed the lifecycle/stale label Apr 4, 2024
sbueringer added the area/supervisor label Apr 18, 2024
k8s-triage-robot commented

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-ci-robot added the lifecycle/stale label Jul 17, 2024
chrischdi (Member) commented

/remove-lifecycle stale

Will revisit this soon

k8s-ci-robot removed the lifecycle/stale label Jul 18, 2024
k8s-triage-robot commented

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-ci-robot added the lifecycle/stale label Oct 16, 2024
sbueringer (Member) commented

/remove-lifecycle stale

k8s-ci-robot removed the lifecycle/stale label Oct 16, 2024