
How to continuously keep retrying a step as long as it does not pass? #613

Closed
sudhindra95 opened this issue Dec 20, 2023 · 8 comments

@sudhindra95

sudhindra95 commented Dec 20, 2023

While testing a Kafka consume operation in a dev/prod environment, the Kafka topic of interest might be receiving several messages. I want to keep retrying my assertions as long as they fail. Specifically, I would like to implement this without specifying a maximum number of attempts (max:n property) in the retry block.
Please help me understand how I can do this.

@a1shadows
Collaborator

Would it not be better to use some variation of seek here?
#619 might be useful here

@authorjapps
Copy link
Owner

without specifying a maximum number of attempts (max:n property)

The "without specifying" bit sounds a little unclear, @sudhindra95. In this case your test (consumer) will hang. Just think of your CI pipeline: it goes into an infinite retry loop and will never finish that CI job.
IMO it is better to keep a definite (deterministic) limit.

What are your thoughts, @sudhindra95?
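A deterministic limit can be expressed directly in a scenario step via the retry block. Below is a minimal sketch of such a step; the step name, topic name, counts, and assertion body are illustrative, not taken from this thread, so verify the exact shape against the Zerocode docs:

```json
{
    "name": "consume_from_topic",
    "url": "kafka-topic:demo-topic",
    "operation": "consume",
    "request": {},
    "retry": {
        "max": 10,
        "delay": 2000
    },
    "assertions": {
        "size": 1
    }
}
```

With `max: 10` and `delay: 2000` (milliseconds), the step re-runs until the assertions pass or roughly 20 seconds have elapsed, so the CI job always terminates.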

@sudhindra95
Author

sudhindra95 commented Feb 11, 2024

Yeah, I agree. The goal of using Zerocode in my org is to identify the failure of any component in the data pipeline. But sometimes, while testing in the dev environment, it takes a long time for messages to appear in the Kafka topics if there is any lag due to a high volume of data. That is why I thought it would be better to try something like indefinite retrying.

@a1shadows
Collaborator

@sudhindra95 won't putting a sufficiently high number of retries also achieve what you are suggesting?

Having unbounded/infinite retries is generally not good practice and has the potential of blocking whatever CI this test runs on indefinitely. No matter what the test is, it should have bounded termination for all scenarios it runs on.

@authorjapps what do you think?

@sudhindra95
Author

sudhindra95 commented Feb 15, 2024

I was thinking: if I am sure that my test case will definitely pass, why not keep retrying? I think a high number of retries might solve it.

@authorjapps
Owner

authorjapps commented Feb 15, 2024

I was thinking: if I am sure that my test case will definitely pass, why not keep retrying? I think a high number of retries might solve it.

@sudhindra95, if you're sure it will pass, there is no need to test it. You can put it straight to live :).

On the other hand, how can you be sure it will even pass without testing it first?

I mean, your test might have other variable parameters which you might not have considered or thought of until you see the outcome in a CI job.

Have another think about what you are trying to solve,
or
could you explain the problem you are trying to solve in a bit more detail?

@sudhindra95
Author

@authorjapps, in my org, Zerocode is mainly used to test data pipelines which are already live, wherein we send some data for end-to-end testing. When a very high volume of data is being processed, lag builds up in the pipelines, and I am not able to see the expected data soon.
However, I am sure that the data will eventually reach the point where I am testing. So I was thinking: why not keep retrying at that point indefinitely, as long as I don't see the data?

One quick fix, as @a1shadows suggested, is to try using a bigger number for retries.

@nirmalchandra
Collaborator

In my org, Zerocode is mainly used to test data pipelines which are already live, wherein we send some data for end-to-end testing.

@sudhindra95, it sounds like you are using the framework for a slightly different purpose, i.e. you're trying to do a bit of live testing (basically a kind of post-live operational capability testing), which is totally fine.
For this, you can actually use Zerocode's filtering mechanism on the consumed messages and (optionally) assert on those messages (check the documentation or the kafka-testing module in the main repo for some example scenarios). Or simply observe the response in the console or log (instead of asserting), which might help in your use case.
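As a sketch of the observe-instead-of-assert approach, a consume step might look like the one below. The field names follow the Zerocode Kafka consumer configuration as I understand it, and the topic name and values are illustrative; verify against the kafka-testing module before relying on them:

```json
{
    "name": "observe_consumed_records",
    "url": "kafka-topic:demo-topic",
    "operation": "consume",
    "request": {
        "consumerLocalConfigs": {
            "recordType": "JSON",
            "commitSync": true,
            "showRecordsConsumed": true,
            "maxNoOfRetryPollsOrTimeouts": 5
        }
    },
    "assertions": {}
}
```

With `showRecordsConsumed: true` the consumed records are echoed to the console/log, so you can inspect what arrived without a strict assertion failing the step while the pipeline is lagging.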

One quick fix, as @a1shadows suggested, is to try using a bigger number for retries.

Or imo you can use the seekTimestamp mechanism which @a1shadows recently implemented.
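If the seek-by-time route fits, the step might look like the sketch below. The `seekTimestamp` property name comes from the mechanism referenced above (#619); its exact placement, shape, and units (epoch milliseconds here) are assumptions to verify against that PR, and the `$GT.0` size assertion is likewise illustrative:

```json
{
    "name": "consume_after_timestamp",
    "url": "kafka-topic:demo-topic",
    "operation": "consume",
    "request": {
        "consumerLocalConfigs": {
            "recordType": "JSON",
            "seekTimestamp": 1708000000000
        }
    },
    "assertions": {
        "size": "$GT.0"
    }
}
```

The idea is to seek the consumer past the backlog to records produced after your test data was sent, so the step does not have to wade through the lagging volume at all.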

And if the above proposals still don't help, it would be great if you could explain, with an example scenario JSON, step by step, what you want to achieve.
