Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Checkpoint adding observability feature #1143

Draft
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

david-martin
Copy link
Member

@david-martin david-martin commented Jan 29, 2025

Closes #1065

TODO:

  • Verify pattern for setting up the reconcile function and monitoring client
  • Create Gateway PodMonitor (istio)
  • Refactor duplicate creation blocks
  • Add controller test

Copy link
Contributor

@eguzki eguzki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks like going in a correct direction.

Few notes to think about:

  • Have you considered using a regular dynamic client to do the monitoring resources reconciliation (create, update, delete)?? I do not see any real need to use the monitoring client. I might be missing somehting, though
  • When implementing the reconciliation, try to answer the following questions: What happens when the user changes the switch between enabled: true to enabled: false and vice versa? What if somebody externally updates the monitoring resources manually (or some external controller)?? Should kuadrant controller watch the reconciled resources and revert the changes to make sure it is what kuadrant wants to be? For instance, somebody (let's hope accidentally) removes the montoring resources

I will keep watching this PR and providing review, but I will defer the approval to the current experts @Boomatang and @KevFan as I am no longer as familiar as them with the current codebase.

@@ -54,6 +54,11 @@ func (p *Kuadrant) GetLocator() string {

// KuadrantSpec defines the desired state of Kuadrant
type KuadrantSpec struct {
Observability Observability `json:"observability,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would make this pointer, so you can discriminate between
a) not set
b) set to false
c) set to true

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood.
From the reconcile function point of view, I'm wondering if there would be a difference between a and b.
I think the behaviour would be the same for both i.e. remove the SerivceMonitors/PodMonitors if they exist.

controllers/observability_reconciler.go Outdated Show resolved Hide resolved
Copy link
Contributor

@Boomatang Boomatang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hay Dav, I have left a few comments around the use of the policy machinery and the structure the code should be in. If there is any questions, please feel free to ask. The policy machinery works differently to the controller runtime and I have a lot of assumed knowledge, so some of my comments might be vague.

// Check if observability is enabled
if !kObj.Spec.Observability.Enable {
logger.Info("observability enable is false")
// TODO: Remove monitors if they exist
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this TODO would not remove service monitors that the user may have created.

As a note, the linking function will add the ServiceMonitors to the topology linked to the parent, which I assume would be the Kuadrant CR. So the questions becomes what do you do if the ServiceMonitor has no parent in the topology, as in kuadrant CR was removed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this TODO would not remove service monitors that the user may have created.

That would be the idea, yes.
Targeting specific labels should be sufficient.

what do you do if the ServiceMonitor has no parent in the topology, as in kuadrant CR was removed

I'm not sure yet, but a couple of thoughts:

  • Would it matter if we can remove the ServiceMonitor based on a label when the Kuadrant CR is removed?
  • Is this a problem already when the Kuadrant CR is removed, and it was linked to other resources like authorino and limitador? (I haven't dived into that code yet)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One why the topology can change the clean up code is based around the parents of a resource. If the monitors are children of resources, let it be the kuadrant CR or autorino, limitador, what you can do is scan the root of the topology for any monitors. The clean up question becomes has this resource got any parents? If no, then it can be removed from the cluster.

controllers/observability_reconciler.go Outdated Show resolved Hide resolved
controllers/state_of_the_world.go Outdated Show resolved Hide resolved
controllers/observability_reconciler.go Show resolved Hide resolved
MonClient monclient.Interface
}

func NewObservabilityReconciler(client *dynamic.DynamicClient, rm meta.RESTMapper, mc monclient.Interface) *ObservabilityReconciler {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: when it comes to naming, are we using this initialism as a standard in this code base?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you mean the param names like rm for the RESTMapper param?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Observability Configuration Phase 1: Create ServiceMonitors for components in the kuadrant operator
4 participants