fix: wrap future streams also in io_runtime #3162

Open
wants to merge 3 commits into
base: main
from

Conversation

ion-elgreco
Collaborator

@alamb pointed out that the get stream would not be executed on the IO runtime unless it was wrapped. Thanks to his example PR, I've now added this wrapping for the other streams as well :) Thanks @alamb!

Also addressed some other missing areas where I should have enabled the IO runtime:

  • DeltaFileSystemHandler
  • Creating a table
  • Is Delta Table check
  • Writing a new table (used by pyarrow engine)
  • Writing a new table (rust engine)

I've also made it possible to pass an optional runtime to DeltaOps::try_from_uri. This was missing, and it seems to be a commonly used entry point for opening a Delta table.
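
For readers following along, here is a minimal sketch of why wrapping matters. It is not the code from this PR: it uses only the standard library, with a named thread standing in for the IO runtime and a lazy iterator standing in for the stream. The names `run_on_io_thread` and `chunks` are hypothetical. The point it illustrates is that a lazy stream does its work wherever it is driven, not where it is created, so the driving loop itself has to be moved.

```rust
use std::sync::mpsc::{sync_channel, Receiver};
use std::thread;

/// Drives `source` to completion on a dedicated thread named "io-runtime"
/// and forwards each item over a bounded channel. The caller only reads
/// from the receiver, so none of the producer's work runs on its thread.
fn run_on_io_thread<I, T>(source: I, capacity: usize) -> Receiver<T>
where
    I: Iterator<Item = T> + Send + 'static,
    T: Send + 'static,
{
    let (tx, rx) = sync_channel(capacity);
    thread::Builder::new()
        .name("io-runtime".to_string())
        .spawn(move || {
            for item in source {
                // Stop early if the consumer dropped the receiver.
                if tx.send(item).is_err() {
                    break;
                }
            }
        })
        .expect("failed to spawn IO thread");
    rx
}

/// Hypothetical stand-in for a byte stream. The closure is lazy: it runs on
/// whichever thread pulls the next item, and records that thread's name.
fn chunks(n: usize) -> impl Iterator<Item = (usize, String)> + Send + 'static {
    (0..n).map(|i| {
        let name = thread::current().name().unwrap_or("unnamed").to_string();
        (i, name)
    })
}
```

Collecting `run_on_io_thread(chunks(3), 1)` on the calling thread yields items tagged with the `io-runtime` thread name, whereas iterating `chunks(3)` directly would tag them with the caller's thread.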

@github-actions github-actions bot added binding/python Issues for the Python package binding/rust Issues for the Rust crate labels Jan 25, 2025
@ion-elgreco
Collaborator Author

@Tom-Newton can you give this a spin?


codecov bot commented Jan 25, 2025

Codecov Report

Attention: Patch coverage is 12.57143% with 153 lines in your changes missing coverage. Please review.

Project coverage is 72.01%. Comparing base (5113aea) to head (e25b6e4).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| crates/core/src/storage/mod.rs | 0.00% | 133 Missing ⚠️ |
| crates/core/src/operations/mod.rs | 40.00% | 7 Missing and 2 partials ⚠️ |
| python/src/lib.rs | 0.00% | 7 Missing ⚠️ |
| crates/benchmarks/src/bin/merge.rs | 0.00% | 2 Missing ⚠️ |
| crates/core/src/protocol/checkpoints.rs | 66.66% | 1 Missing ⚠️ |
| python/src/filesystem.rs | 0.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3162      +/-   ##
==========================================
- Coverage   72.22%   72.01%   -0.22%     
==========================================
  Files         138      138              
  Lines       45251    45381     +130     
  Branches    45251    45381     +130     
==========================================
- Hits        32683    32680       -3     
- Misses      10498    10622     +124     
- Partials     2070     2079       +9     


Signed-off-by: Ion Koutsouris <[email protected]>
@alamb
Contributor

alamb commented Jan 27, 2025

I plan to give this a look later today

@Tom-Newton
Contributor

@Tom-Newton can you give this a spin?

I gave it a try, but I can definitely still reproduce _internal.DeltaError: Failed to parse parquet: External: Generic MicrosoftAzure error: error decoding response body when running at scale, cross region. I would guess that the cause is network or Azure flakiness, where object-store is not retrying, rather than a tokio runtime issue. I'm quite interested in https://github.com/apache/arrow-rs/pull/6519/files.

@ion-elgreco
Collaborator Author

@Tom-Newton what is your timeout set to?

@Tom-Newton
Contributor

@Tom-Newton what is your timeout set to?

deltalake.DeltaTable(path, storage_options={"timeout": "120s"})

@ion-elgreco
Collaborator Author

@Tom-Newton can you set it to a very large interval, 30 minutes or something?

@Tom-Newton
Contributor

Tom-Newton commented Jan 28, 2025

@Tom-Newton can you set it to a very large interval, 30 minutes or something?

I gave it a try with 30 minutes. The success rate is not noticeably better but we do get a different error message.

      File "/redacted/pip-svc_deltalake/site-packages/deltalake/table.py", line 1270, in update_incremental
        self._table.update_incremental()
    OSError: Generic MicrosoftAzure error: Error after 0 retries in 993.500001046s, max_retries:10, retry_timeout:180s, source:error sending request for url (https://redacted.blob.core.windows.net/redacted/_delta_log/00000000000000146433.json)

Potentially this is useful information, but I have not yet had a chance to investigate further. It sounds like in this case Azure is returning an error response, which would likely be informative.
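
One possible reading of "0 retries in 993.500001046s" with "retry_timeout:180s" is that the first attempt alone outlasted the retry budget, so no retry was ever started. The sketch below only illustrates that reading. It assumes a policy that gives up once either limit is exceeded; this thread does not confirm that object-store behaves this way, and `should_retry` is a hypothetical helper, not a library function.

```rust
use std::time::Duration;

/// Hypothetical retry gate (an assumption for illustration only):
/// another attempt is allowed only while both budgets still have room.
fn should_retry(
    retries_so_far: usize,
    elapsed: Duration,
    max_retries: usize,
    retry_timeout: Duration,
) -> bool {
    retries_so_far < max_retries && elapsed <= retry_timeout
}
```

Under this assumption, `should_retry(0, Duration::from_secs_f64(993.5), 10, Duration::from_secs(180))` is false: a long request timeout lets a single stalled request consume the whole retry window.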

@ion-elgreco
Collaborator Author

@Tom-Newton this is during reading I presume?

Member

@rtyler rtyler left a comment

I would like to hold off on this change being merged for the time being. I need to understand the consequences of the changes to the core storage module here.

The 🍝 we have floating around to accommodate asynchronous behaviors with Python is creeping further into core here, and that gives me a lot of concern so I need to spend a fair bit more time evaluating this change

@@ -124,7 +124,7 @@ Let’s write 100 hours worth of data to the Delta table.

=== "Rust"
```rust
-let mut table = DeltaOps::try_from_uri("observation_data")
+let mut table = DeltaOps::try_from_uri("observation_data", IORuntime::default())
Member

I believe this and the other documentation change are supposed to be wrapped in a `Some`.
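
To illustrate the point about `Some`: if the new parameter is typed as an `Option`, which the PR description's "optional runtime" suggests but this thread does not show, a bare runtime value would not type-check at the call site. The types and function below are hypothetical stand-ins, not the delta-rs API.

```rust
/// Hypothetical stand-in for the PR's `IORuntime` type.
#[derive(Debug, Clone, Default, PartialEq)]
struct IoRuntimeSketch {
    label: String,
}

/// Hypothetical stand-in for the value returned by the entry point.
#[derive(Debug, PartialEq)]
struct OpsSketch {
    uri: String,
    runtime: Option<IoRuntimeSketch>,
}

/// Sketch of an entry point that takes an optional runtime. Callers must
/// pass `Some(runtime)` or `None`; a bare `IoRuntimeSketch` is a type error.
fn try_from_uri_sketch(
    uri: &str,
    runtime: Option<IoRuntimeSketch>,
) -> Result<OpsSketch, String> {
    if uri.is_empty() {
        return Err("empty table URI".to_string());
    }
    Ok(OpsSketch {
        uri: uri.to_string(),
        runtime,
    })
}
```

With this shape, the documentation call would read `try_from_uri_sketch("observation_data", Some(IoRuntimeSketch::default()))`, and callers that do not need a dedicated runtime would pass `None`.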

@ion-elgreco
Collaborator Author

I would like to hold off on this change being merged for the time being. I need to understand the consequences of the changes to the core storage module here.

The 🍝 we have floating around to accommodate asynchronous behaviors with Python is creeping further into core here, and that gives me a lot of concern so I need to spend a fair bit more time evaluating this change

Yeah, I'm fine with that. The discussion this started from is also still ongoing: apache/datafusion#14286 (comment), so I don't think there is full consensus yet on how to solve this, right @alamb?


4 participants