Retryable operation failure should crash the app on startup as well #595

Open
4TT1L4 opened this issue May 15, 2023 · 1 comment

Comments

4TT1L4 commented May 15, 2023

Problem:

  • When Oura cannot connect to the Cardano Node (source type: N2C), it does not crash on startup after the last retry fails. [NOK]

Solution:

  • When the retryable operation ultimately fails after the last attempt on startup, Oura should crash so that it can try to recover from the error state. At the moment this is not the case: Oura freezes and simply stops processing. [NOK]

Notes:

  • When Oura loses the connection to the Cardano Node while it is already running, the application crashes. This is good, because Oura can try to recover on the next start. However, if the Cardano Node is still unavailable after the restart, Oura stops processing once the last retry fails, and nothing further happens.
  • Workaround: change the Oura config to retry for long enough that the Cardano Node has time to recover.
  • Version information: v1.8.1

Logs:

image

retry_policy config:

[source]
type = "N2C"

 < < < some more config > > >

[sink.retry_policy]
max_retries = 5
backoff_unit = 1000
backoff_factor = 2
max_backoff = 100000
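
To judge whether a policy like the one above retries "long enough" for the workaround, it can help to estimate the total wait time. The following is only a sketch: it assumes the delay before retry `n` is `backoff_unit * backoff_factor^n`, capped at `max_backoff`, and that both values are in milliseconds. Oura's actual formula and units may differ; `RetryPolicy`, `backoff_delay` and `total_wait` are hypothetical names used for illustration.

```rust
use std::time::Duration;

/// Mirrors the four fields of the `retry_policy` table (hypothetical type).
struct RetryPolicy {
    max_retries: u32,
    backoff_unit: u64,   // assumed to be milliseconds
    backoff_factor: u64,
    max_backoff: u64,    // assumed to be milliseconds
}

/// Delay before retry number `attempt` (0-based), assuming
/// delay = backoff_unit * backoff_factor^attempt, capped at max_backoff.
fn backoff_delay(policy: &RetryPolicy, attempt: u32) -> Duration {
    let factor = policy.backoff_factor.saturating_pow(attempt);
    let millis = policy
        .backoff_unit
        .saturating_mul(factor)
        .min(policy.max_backoff);
    Duration::from_millis(millis)
}

/// Total time spent waiting if every retry fails.
fn total_wait(policy: &RetryPolicy) -> Duration {
    (0..policy.max_retries)
        .map(|attempt| backoff_delay(policy, attempt))
        .sum()
}
```

Under these assumptions, the values above give delays of 1, 2, 4, 8 and 16 seconds, so the node would have roughly 31 seconds to come back before the last retry fails.
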

4TT1L4 commented Jul 1, 2023

This issue is still present. When the connection to the Node cannot be established right after start, Oura gets stuck after the last retry.

When the connection to the node cannot be established, there is some retry logic in place, but it is not configurable:

oura/src/sources/n2c.rs

Lines 138 to 140 in 2fb43c4

let mut peer_session = NodeClient::connect(&stage.config.socket_path, stage.chain.magic)
    .await
    .or_retry()?;

It would be nice if this could be configured, so that Oura would try to connect for longer and, if it still fails, crash.

This is not the case at the moment. The binary keeps running after the last retry. It would be more useful if the binary crashed instead.

This would make Oura more resilient against Cardano Node outages.
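
The requested behavior can be sketched as follows. This is not Oura's implementation and does not show what `or_retry()` does internally; `RetryPolicy` and `connect_with_retry` are hypothetical names, and the blocking `sleep` stands in for whatever async timer the real code would use. The point is only that the last error is returned to the caller instead of being swallowed.

```rust
use std::fmt::Debug;
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical, configurable retry settings for the source connection.
struct RetryPolicy {
    max_retries: u32,
    backoff_unit: Duration,
    backoff_factor: u32,
    max_backoff: Duration,
}

/// Calls `connect` until it succeeds or the retries are used up.
/// The last error is returned so the caller can terminate the process.
fn connect_with_retry<T, E: Debug>(
    policy: &RetryPolicy,
    mut connect: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut delay = policy.backoff_unit;
    let mut attempt = 0;
    loop {
        match connect() {
            Ok(session) => return Ok(session),
            // Retries exhausted: give up and hand the error to the caller.
            Err(err) if attempt >= policy.max_retries => return Err(err),
            Err(err) => {
                eprintln!(
                    "connect attempt {} failed: {:?}; retrying in {:?}",
                    attempt + 1,
                    err,
                    delay
                );
                sleep(delay);
                // Grow the delay exponentially, capped at max_backoff.
                delay = (delay * policy.backoff_factor).min(policy.max_backoff);
                attempt += 1;
            }
        }
    }
}
```

If the error is propagated with `?` up to a `main` that returns `Result`, the process exits with a non-zero status after the final retry, which lets whatever restarts Oura notice the failure.
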
