Prometheus query to determine if a MonoVertex has healthy data movement #2361

Open
juliev0 opened this issue Jan 24, 2025 · 5 comments
Labels
enhancement New feature or request

Comments

@juliev0
Contributor

juliev0 commented Jan 24, 2025

Summary

Assuming Numaplane uses Prometheus queries, we need a query that determines whether a MonoVertex has healthy data movement. It should also return a "healthy" state when no data is flowing into the source.
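
As a rough illustration of the rule described above, here is a minimal Python sketch of the decision only. It assumes two numbers are already available for the same time window: how much data was read at the source and how much was acknowledged. The function name and both inputs are hypothetical; nothing in this issue specifies where the "read" figure would come from.

```python
def is_data_movement_healthy(acked_increase: float, read_increase: float) -> bool:
    """Decide health for one sampling window (hypothetical helper).

    acked_increase: messages acknowledged during the window.
    read_increase:  messages read at the source during the window
                    (assumed to be available from some metric).
    """
    if read_increase <= 0:
        # No data flowed into the source, so there was nothing to acknowledge.
        # The summary asks for this case to count as healthy.
        return True
    # Data arrived, so healthy means at least some of it was acknowledged.
    return acked_increase > 0
```

With this rule an idle MonoVertex is not reported as unhealthy, while one that reads data without acknowledging any of it is.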


Message from the maintainers:

If you wish to see this enhancement implemented please add a 👍 reaction to this issue! We often sort issues this way to know what to prioritize.

@juliev0
Contributor Author

juliev0 commented Feb 14, 2025

Received query from @whynowy:

increase(numaflow_monovtx_ack_total{namespace="oss-analytics-sampledataflow-usw2-prd", mvtx_name="numa-endurance", mvtx_replica="0"}[1m])
(check this for > 0)

so I will close this
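
For anyone who wants to try the "check this for > 0" step, here is a minimal sketch in Python. It assumes a Prometheus server reachable at some base URL and uses Prometheus's standard instant-query endpoint, `/api/v1/query`. The function names, the base URL argument, and the choice to treat an empty result as "not passing" are assumptions for illustration, not anything decided in this issue.

```python
import json
import urllib.parse
import urllib.request

# The example query from this comment, unchanged.
QUERY = (
    'increase(numaflow_monovtx_ack_total{namespace="oss-analytics-sampledataflow-usw2-prd", '
    'mvtx_name="numa-endurance", mvtx_replica="0"}[1m])'
)


def instant_query(base_url: str, query: str, timeout: float = 10.0) -> dict:
    """Run an instant query via the Prometheus HTTP API and return the parsed JSON."""
    url = f"{base_url.rstrip('/')}/api/v1/query?" + urllib.parse.urlencode({"query": query})
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def any_sample_above(response: dict, threshold: float = 0.0) -> bool:
    """Return True if any series in an instant-vector response exceeds the threshold."""
    if response.get("status") != "success":
        return False
    results = response.get("data", {}).get("result", [])
    # Each instant-vector entry carries "value": [timestamp, "<value as string>"].
    return any(float(sample["value"][1]) > threshold for sample in results)
```

A call might look like `any_sample_above(instant_query("http://prometheus.example:9090", QUERY))`, where the host is a placeholder. Note that an empty result (no matching series) returns False here, which may or may not be the desired behavior.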

juliev0 closed this as completed Feb 14, 2025
@whynowy
Member

whynowy commented Feb 14, 2025

> Received query from @whynowy:
>
> increase(numaflow_monovtx_ack_total{namespace="oss-analytics-sampledataflow-usw2-prd", mvtx_name="numa-endurance", mvtx_replica="0"}[1m]) (check this for > 0)
>
> so I will close this

Be aware this is just an example query, not the final one.

@juliev0
Contributor Author

juliev0 commented Feb 14, 2025

> Be aware this is just an example query, not the final one.

ahh, got it, thanks - in that case I will reopen this

juliev0 reopened this Feb 14, 2025
@whynowy
Member

whynowy commented Feb 16, 2025

> ahh, got it, thanks - in that case I will reopen this

@juliev0 - Be aware those queries should be configurable in Numaplane, so it really doesn't matter what they are. As long as all the queries are executed and their conditions are met, Numaplane can consider it successful.
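
To make the "configurable queries, all conditions met" idea concrete, here is a minimal Python sketch. The `QueryCheck` shape, the operator strings, and the injected `run_query` callable are all hypothetical; this is not Numaplane's actual configuration schema, just one way the abstraction could look.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Iterable

# Comparison operators a configured check may use (illustrative set).
_OPS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
}


@dataclass(frozen=True)
class QueryCheck:
    """One configured check: a query plus the condition its result must meet."""
    query: str
    op: str
    threshold: float


def all_checks_pass(checks: Iterable[QueryCheck], run_query: Callable[[str], float]) -> bool:
    """Execute every configured query and require every condition to hold."""
    # Build the full list first so each query runs even if an earlier one fails.
    results = [_OPS[check.op](run_query(check.query), check.threshold) for check in checks]
    return all(results)
```

Because `run_query` is passed in, the caller decides how a query string becomes a number, so the checks themselves stay opaque to the evaluation logic.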

@juliev0
Contributor Author

juliev0 commented Feb 16, 2025

> @juliev0 - Be aware those queries should be configurable in Numaplane, so it really doesn't matter what they are. As long as all the queries are executed and their conditions are met, Numaplane can consider it successful.

Yes, of course it's an abstraction. But if we're going to do end-to-end testing, we need it to actually work as well. As long as the query we have only runs the risk of false negatives and not many false positives (for detecting errors), that meets our need. In any case, we have some time.

3 participants