OSD-22337 route some customer alerts to null receiver #342

Open
wants to merge 2 commits into
base: master
2 changes: 1 addition & 1 deletion build/Dockerfile
@@ -8,7 +8,7 @@ COPY . .
RUN make go-build

####
-FROM registry.access.redhat.com/ubi8/ubi-minimal:8.10-1052.1724178568
+FROM registry.access.redhat.com/ubi8/ubi-minimal:8.10-1086

ENV USER_UID=1001 \
USER_NAME=configure-alertmanager-operator
2 changes: 1 addition & 1 deletion build/Dockerfile.olm-registry
@@ -4,7 +4,7 @@ COPY ${SAAS_OPERATOR_DIR} manifests
RUN initializer --permissive

# ubi-micro does not work for clusters with fips enabled unless we make OpenSSL available
-FROM registry.access.redhat.com/ubi8/ubi-minimal:8.10-1052.1724178568
+FROM registry.access.redhat.com/ubi8/ubi-minimal:8.10-1086

COPY --from=builder /bin/registry-server /bin/registry-server
COPY --from=builder /bin/grpc_health_probe /bin/grpc_health_probe
7 changes: 7 additions & 0 deletions controllers/secret_controller.go
@@ -457,6 +457,13 @@ func createSubroutes(namespaceList []string, receiver receiverType) *alertmanage
// Route ClusterOperatorDown for insights to null receiver https://issues.redhat.com/browse/OSD-19800
// Also needs to be silenced for FedRAMP until its made available in the environment https://issues.redhat.com/browse/OSD-13685
{Receiver: receiverNull, Match: map[string]string{"alertname": "ClusterOperatorDown", "name": "insights"}},
+// Route some customer defined alerts to null receiver
+// https://issues.redhat.com/browse/OSD-22337
+{Receiver: receiverNull, Match: map[string]string{"alertname": "memory-request"}},
Member:

Filtering on alertname is not a great long-term solution. Is there a better way to identify these "user defined" alerts?

Author:

This is a short-term solution. There is no easy way to identify those "user defined" alerts. We have had several discussions about this, and there is a doc with our findings: https://docs.google.com/document/d/1OscbdlZ-aBuwY7YKJsU5URyDvfS9A4-VV4yBhpowfy4/edit#heading=h.i7a6b1h441g

Author:

We should use better labels to distinguish Red Hat defined alerts from user defined alerts. But that requires more discussion and would affect nearly all existing alerts. That's why we put a short-term solution here.

Another commenter:

> This is not a great long term solution to filter on alertname, is there a better way to identify these "user defined" alerts?

Agreed. This PR's current logic relies on matching specific alertname values to identify user-defined alerts and route them to the null receiver. However, if an alert is renamed, or a new customer-defined alert is not explicitly added to this list, it will bypass the null receiver.

Could we match on a label such as user_alert=true instead of relying on alert names?

Author:

TBH, I would prefer the long-term solution: adding common labels so that we can easily distinguish user defined alerts from Red Hat defined alerts. That would make our lives easier. Unfortunately, none of our docs define such labels, so there are no standard labels to use ATM. It will take a lot of effort to discuss with the BU and wider teams which labels to use and to reach an agreement.
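For illustration only, the label-based routing suggested in this thread could look roughly like the sketch below. It is not part of this PR. The `Route` struct is a simplified stand-in for the operator's route type, the `"null"` value of `receiverNull` is assumed, and the `user_alert=true` label is hypothetical, since no such label has been agreed on. The `matches` and `firstReceiver` helpers only mimic exact-match routing to show the effect.

```go
package main

import "fmt"

// Route is a simplified, hypothetical stand-in for the operator's route type;
// only the two fields used in the diff are modeled.
type Route struct {
	Receiver string
	Match    map[string]string
}

// receiverNull's value is assumed here for the sake of the example.
const receiverNull = "null"

// Hypothetical label that would mark customer-defined alerts.
// No such label is standardized today.
const (
	userAlertLabel = "user_alert"
	userAlertValue = "true"
)

// matches reports whether every key/value pair in the route's Match map
// equals the corresponding alert label (a missing label counts as "").
func (r Route) matches(labels map[string]string) bool {
	for k, v := range r.Match {
		if labels[k] != v {
			return false
		}
	}
	return true
}

// firstReceiver returns the receiver of the first matching route,
// or fallback when no route matches.
func firstReceiver(routes []Route, labels map[string]string, fallback string) string {
	for _, r := range routes {
		if r.matches(labels) {
			return r.Receiver
		}
	}
	return fallback
}

func main() {
	// A single label-based route replaces the per-alertname entries.
	routes := []Route{
		{Receiver: receiverNull, Match: map[string]string{userAlertLabel: userAlertValue}},
	}

	// A renamed customer alert is still caught because it carries the label.
	renamed := map[string]string{"alertname": "memory-request-v2", userAlertLabel: userAlertValue}
	// An alert without the label falls through to the fallback receiver.
	unlabeled := map[string]string{"alertname": "ClusterOperatorDown", "name": "etcd"}

	fmt.Println(firstReceiver(routes, renamed, "default"))
	fmt.Println(firstReceiver(routes, unlabeled, "default"))
}
```

With a label like this, one route would cover renamed and newly added customer alerts, but only if every customer-defined rule actually sets the label. That agreement is exactly what the thread says is still missing.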

+{Receiver: receiverNull, Match: map[string]string{"alertname": "cpu-request"}},
+{Receiver: receiverNull, Match: map[string]string{"alertname": "daemonset-availability"}},
+{Receiver: receiverNull, Match: map[string]string{"alertname": "sonarqube-ErrorBudgetBurn"}},
+{Receiver: receiverNull, Match: map[string]string{"alertname": "DnsDefaultPodRestarted"}},
}

if !config.IsFedramp() {