[PECO-729] Improve retry behavior #230

kravets-levko · 2024-02-14T21:26:45Z

PECO-729

Respect Retry-After header and update backoff algorithm
Update list of HTTP status codes that could be retried
Retry only idempotent requests (a restricted set of Thrift operations)
Add/update tests
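
A minimal sketch of how the second and third items could combine into a single retry decision. This is not the code from this PR: the status codes, the Thrift operation names, and the `OutgoingRequest`, `isIdempotent` and `shouldRetry` names are all hypothetical and chosen only for illustration. The real lists live in the PR diff.

```typescript
// Hypothetical lists for illustration only; the PR's actual lists are not
// reproduced in this conversation.
const RETRYABLE_STATUS_CODES: ReadonlySet<number> = new Set([429, 503]);
const IDEMPOTENT_THRIFT_OPERATIONS: ReadonlySet<string> = new Set([
  'GetOperationStatus',
  'CancelOperation',
  'CloseOperation',
  'CloseSession',
]);

// Hypothetical description of an outgoing request.
interface OutgoingRequest {
  method: string;
  // Name of the Thrift operation carried by the request, if any.
  thriftOperation?: string;
}

// A request counts as idempotent if it is an HTTP GET or carries one of the
// allow-listed Thrift operations.
function isIdempotent(request: OutgoingRequest): boolean {
  if (request.method.toUpperCase() === 'GET') {
    return true;
  }
  return (
    request.thriftOperation !== undefined &&
    IDEMPOTENT_THRIFT_OPERATIONS.has(request.thriftOperation)
  );
}

// Retry only when the request is idempotent AND the status code is retryable.
function shouldRetry(request: OutgoingRequest, statusCode: number): boolean {
  return isIdempotent(request) && RETRYABLE_STATUS_CODES.has(statusCode);
}
```

In this sketch a non-idempotent request is never retried, whatever the status code, because repeating it could apply the same side effect twice.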

It should (hopefully) be easier to review commit by commit.

Note: this PR doesn't include logic for retrying on network errors. That part will be covered in a follow-up.

codecov-commenter · 2024-02-19T18:14:30Z

Codecov Report

Attention: Patch coverage is 96.66667%, with 2 lines in your changes missing coverage. Please review.

Project coverage is 93.19%. Comparing base (17112c7) to head (7303b9a).

Files	Patch %	Lines
lib/connection/connections/ThriftHttpConnection.ts	95.23%	1 Missing ⚠️
lib/hive/Commands/BaseCommand.ts	66.66%	0 Missing and 1 partial ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #230      +/-   ##
==========================================
+ Coverage   93.09%   93.19%   +0.09%     
==========================================
  Files          62       63       +1     
  Lines        1478     1513      +35     
  Branches      256      262       +6     
==========================================
+ Hits         1376     1410      +34     
- Misses         40       41       +1     
  Partials       62       62

andrefurlan-db · 2024-03-08T19:49:05Z

tests/unit/connection/connections/HttpRetryPolicy.test.js

+      expect(result.retryAfter).to.equal(200);
+    });
+
+    it('should use backoff when `Retry-After` header is missing', async () => {


@benc-db, was there an issue in serverless where the Retry-After was too short and we ended up running out of retries while waiting for the cluster to start up?

benc-db · 2024-03-08

Yeah, the issue we were hitting was that Retry-After is always either 1s for serverless or 5s for classic. For pysql, we moved back to always using this Retry-After as a floor, and otherwise using exponential backoff, because 2.5 minutes (or 30s for serverless!) is often not enough time for compute to become available.

kravets-levko · 2024-03-11

@andrefurlan-db @benc-db I updated the code so that Retry-After is now used as a lower bound for the backoff algorithm (not instead of backoff): eac7d77. Please take one more look. Thank you!
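
For readers following the thread, here is a rough sketch of "Retry-After as a lower bound for backoff". It is an illustration under stated assumptions, not the code in eac7d77: `BackoffOptions`, `parseRetryAfterMs`, `getRetryDelayMs` and the doubling factor are hypothetical, and only the delay-seconds form of `Retry-After` is handled.

```typescript
// Hypothetical options; names and units are assumptions for this sketch.
interface BackoffOptions {
  minDelayMs: number; // smallest delay used when no Retry-After is given
  maxDelayMs: number; // cap applied to the exponential backoff
}

// Parse a Retry-After value given as a whole number of seconds.
// HTTP-date values and malformed input yield undefined in this sketch.
function parseRetryAfterMs(header: string | undefined): number | undefined {
  if (header === undefined) {
    return undefined;
  }
  const value = header.trim();
  if (!/^\d+$/.test(value)) {
    return undefined;
  }
  return parseInt(value, 10) * 1000;
}

// attempt is 1-based. Retry-After (when present) raises the base of the
// exponential backoff, and the result never drops below Retry-After.
function getRetryDelayMs(
  attempt: number,
  retryAfterHeader: string | undefined,
  options: BackoffOptions,
): number {
  const retryAfterMs = parseRetryAfterMs(retryAfterHeader) ?? 0;
  const base = Math.max(options.minDelayMs, retryAfterMs);
  const backoff = base * Math.pow(2, attempt - 1);
  const capped = Math.min(backoff, options.maxDelayMs);
  return Math.max(capped, retryAfterMs);
}
```

With a `Retry-After` of 5 seconds, the first three waits in this sketch are 5 s, 10 s and 20 s, so the total waiting time grows with each attempt instead of staying fixed at the server's hint.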

kravets-levko added 4 commits February 14, 2024 20:48

[PECO-729] Respect Retry-After header with falling back to backoff …

08cce04

…algorithm Signed-off-by: Levko Kravets <[email protected]>

[PECO-729] Extend list of HTTP status codes that could be retried

9c947ef

Signed-off-by: Levko Kravets <[email protected]>

Pass Request object in addition to Response to HttpRetryPolicy

cdadcff

Signed-off-by: Levko Kravets <[email protected]>

[PECO-729] Retry only idempotent requests (HTTP GET + restricted set …

fe3a798

…of Thrift operations) Signed-off-by: Levko Kravets <[email protected]>

kravets-levko temporarily deployed to azure-prod February 14, 2024 21:26 — with GitHub Actions Inactive

Merge branch 'main' into PECO-729-improve-retry-strategy

0236034

kravets-levko temporarily deployed to azure-prod February 19, 2024 15:08 — with GitHub Actions Inactive

databricks deleted a comment from codecov-commenter Feb 19, 2024

Update HttpRetryPolicy logic; add/update tests

8e4aa41

Signed-off-by: Levko Kravets <[email protected]>

kravets-levko temporarily deployed to azure-prod February 19, 2024 17:51 — with GitHub Actions Inactive

Reduce max retry attempts to 5

dad5cfd

Signed-off-by: Levko Kravets <[email protected]>

kravets-levko had a problem deploying to azure-prod February 19, 2024 18:04 — with GitHub Actions Failure

kravets-levko temporarily deployed to azure-prod February 19, 2024 18:11 — with GitHub Actions Inactive

kravets-levko marked this pull request as ready for review February 19, 2024 18:15

kravets-levko requested review from arikfr, superdupershant, yunbodeng-db, andrefurlan-db and rcypher-databricks as code owners February 19, 2024 18:15

kravets-levko requested a review from benc-db February 28, 2024 19:50

Merge branch 'main' into PECO-729-improve-retry-strategy

dc17031

kravets-levko temporarily deployed to azure-prod March 5, 2024 17:54 — with GitHub Actions Inactive

andrefurlan-db reviewed Mar 8, 2024

kravets-levko added 2 commits March 11, 2024 18:46

Use Retry-After as a base for backoff, not instead of it

eac7d77

Signed-off-by: Levko Kravets <[email protected]>

Merge branch 'PECO-729-improve-retry-strategy' of github.com:databric…

ea28604

…ks/databricks-sql-nodejs into PECO-729-improve-retry-strategy

kravets-levko temporarily deployed to azure-prod March 11, 2024 16:47 — with GitHub Actions Inactive

Merge branch 'main' into PECO-729-improve-retry-strategy

44f7253

kravets-levko temporarily deployed to azure-prod March 11, 2024 16:52 — with GitHub Actions Inactive

Merge branch 'main' into PECO-729-improve-retry-strategy

7303b9a

kravets-levko requested a review from jackyhu-db as a code owner March 12, 2024 19:27

kravets-levko temporarily deployed to azure-prod March 12, 2024 19:27 — with GitHub Actions Inactive

kravets-levko requested a review from andrefurlan-db March 14, 2024 18:22

benc-db approved these changes Mar 18, 2024

kravets-levko merged commit 6673660 into main Mar 18, 2024
8 checks passed

kravets-levko deleted the PECO-729-improve-retry-strategy branch March 18, 2024 22:19

andrefurlan-db Mar 8, 2024

benc-db Mar 8, 2024

kravets-levko Mar 11, 2024
