Workload Identity reconciler integration using SPIFFE #809

Merged: 6 commits, Feb 18, 2025


PrimalPimmy
Contributor

@PrimalPimmy PrimalPimmy commented Sep 17, 2024

This PR initiates the work to implement workload identity in the Nephio ecosystem. More documentation can be found here:
Design Document: https://docs.google.com/document/d/1k8Hcd7tJKPIsyiYZX6hpRECuJ4IIxVnaESghU5bLNVQ/edit?usp=sharing
User Story: https://docs.google.com/document/d/1nkh7tTItwii1bY877PfzjFCBtmRos4IDh5EOJxWXRdg/edit?usp=sharing
Updating-Kubeconfigs

Contributor

nephio-prow bot commented Sep 17, 2024

Hi @PrimalPimmy. Thanks for your PR.

I'm waiting for a nephio-project member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@liamfallon
Member

/ok-to-test

@liamfallon
Member

/assign @tliron @efiacor @kispaljr

@liamfallon
Member

@PrimalPimmy I think maybe we should take another look at this in SIG-Auto. Would you be able to schedule it on the agenda for one of the upcoming meetings?

@PrimalPimmy
Contributor Author

Sure @liamfallon . Where do I post about this to schedule it?

cc: @nyrahul

@liamfallon
Member

Please add an item to the agenda for a forthcoming meeting. The agenda document is here:
https://docs.google.com/document/d/1SW4acc0950QdDNEvmeHArfNsgKPeY_EIwHBqGBky2CY/edit#heading=h.k7dq727kte8c

@efiacor
Collaborator

efiacor commented Jan 22, 2025

@PrimalPimmy is this PR ready for review?
We will need to add some testing, remove any dead/commented out code and get the presubmit jobs passing.

@liamfallon
Member

/retest

@PrimalPimmy
Contributor Author

I'm not sure why the CI is unable to check go.sum for the latest spiffe packages, even though I had updated using go mod tidy in the controllers/pkg directory

@efiacor
Collaborator

efiacor commented Feb 11, 2025

> I'm not sure why the CI is unable to check go.sum for the latest spiffe packages, even though I had updated using go mod tidy in the controllers/pkg directory

Some of the make targets should be executed from the root of the project.

make tidy
make fmt
etc

Can you run "make test" successfully locally?

@PrimalPimmy
Contributor Author

Thanks @efiacor , the tests pass now

Member

@liamfallon liamfallon left a comment

/approve

@nephio-prow nephio-prow bot added the approved label Feb 11, 2025
@efiacor efiacor added the enhancement and sig/automation labels Feb 12, 2025
Collaborator

@efiacor efiacor left a comment

We really should have at least some degree of unit testing. At least on code that doesn't require heavy mocking/stubbing.

@liamfallon
Member

/remove-approve

@nephio-prow nephio-prow bot removed the approved label Feb 18, 2025
@PrimalPimmy
Contributor Author

PrimalPimmy commented Feb 18, 2025

A rough flow of how it is working now:

  1. When a new cluster is created, make sure the secret nephio/optional/spire-restrictedSA/Secret.yaml (https://github.com/nephio-project/catalog/pull/84/files#diff-884d3b1eb631f5d33f1346355a00a9fc38767dc982ec4d00ba7017d4022bbab6) and all of its RoleBindings are applied.
  2. The reconciler looks for the SA token created above and uses it to generate a kubeconfig ConfigMap. This kubeconfig ConfigMap is then used for cluster/node attestation.

return ctrl.Result{RequeueAfter: 30 * time.Second}, errors.Wrap(err, msg)
}
if !ready {
log.Info("cluster not ready")

Collaborator

Log error instead of Info

Contributor Author

I was following this:
https://github.com/nephio-project/nephio/blob/main/controllers/pkg/reconcilers/bootstrap-secret/reconciler.go#L129

Other reconcilers didn't use log.error for cluster readiness either.

Collaborator

Ok, leave it at Info so.

Another quick one: I see that in most of the following steps we only log the error but continue on. Should we be exiting if any of the "logged" errors below occur?

Contributor Author

Yep, I think we should exit on them. Do we exit with return ctrl.Result{}, errors.Wrap(err, msg)?

Collaborator

Not sure tbh. That would be down to the design. Do you want to use Result.RequeueAfter, or let the default backoff handle it? Question for someone with more k8s knowledge.

token := string(secret.Data["token"])

// Retrieve the cluster's CA certificate
configMap, err := clientset.CoreV1().ConfigMaps("kube-system").Get(ctx, "kube-root-ca.crt", metav1.GetOptions{})

Collaborator

A lot of the inputs here are hardcoded. This is bad practice.
All the inputs should be configurable via configmap or args.
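
One way the "args" option could look is sketched below, using only the standard library flag package. The `Config` type, `parseConfig`, and the flag names are hypothetical. The defaults simply reuse the values that appear hardcoded in the reviewed snippets ("spire", "kube-system", "kube-root-ca.crt").

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// Config collects the values that the reviewed snippets hardcode.
// Field names and flag names are hypothetical.
type Config struct {
	SpireNamespace string
	CANamespace    string
	CAConfigMap    string
}

// parseConfig reads the settings from command-line arguments and falls back
// to the previously hardcoded values when a flag is not given.
func parseConfig(args []string) (Config, error) {
	var c Config
	fs := flag.NewFlagSet("spire-reconciler", flag.ContinueOnError)
	fs.StringVar(&c.SpireNamespace, "spire-namespace", "spire", "namespace holding the SPIRE ConfigMaps")
	fs.StringVar(&c.CANamespace, "ca-namespace", "kube-system", "namespace of the cluster CA ConfigMap")
	fs.StringVar(&c.CAConfigMap, "ca-configmap", "kube-root-ca.crt", "name of the cluster CA ConfigMap")
	if err := fs.Parse(args); err != nil {
		return Config{}, err
	}
	return c, nil
}

func main() {
	cfg, err := parseConfig(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	fmt.Printf("%+v\n", cfg)
}
```

The same struct could equally be filled from a ConfigMap. Flags are used here only because they keep the sketch self-contained.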

// Get the ConfigMap
cm := &v1.ConfigMap{}
if err := r.Get(ctx, types.NamespacedName{
Namespace: "spire",

Collaborator

Again, hardcoded inputs should be avoided.

Collaborator

This can prob go beside the reconciler tbh. No real use putting it in the resources dir.

Contributor Author

Done

metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func CreateSpireAgentConfigMap(name string, namespace string, cluster string, serverAddress string, serverPort string) (*v1.ConfigMap, error) {

Collaborator

These functions should be relatively easy to unit test.

Signed-off-by: PrimalPimmy <[email protected]>
Signed-off-by: PrimalPimmy <[email protected]>
Signed-off-by: PrimalPimmy <[email protected]>
Signed-off-by: PrimalPimmy <[email protected]>
Signed-off-by: PrimalPimmy <[email protected]>
@PrimalPimmy
Contributor Author

/test presubmit-nephio-golangci-lint

@liamfallon
Member

If you add a comment here referencing the issue that tracks the refactoring actions for R5, we can merge this PR.

@PrimalPimmy
Contributor Author

Future work in R5 related to SPIFFE:
#859

These are the immediate things to be handled; I might add more in the future.

@efiacor
Collaborator

efiacor commented Feb 18, 2025

/approve
/lgtm

@nephio-prow nephio-prow bot added the lgtm label Feb 18, 2025
Contributor

nephio-prow bot commented Feb 18, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: efiacor

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@nephio-prow nephio-prow bot added the approved label Feb 18, 2025
@nephio-prow nephio-prow bot merged commit c034605 into nephio-project:main Feb 18, 2025
11 checks passed
5 participants