-
Notifications
You must be signed in to change notification settings - Fork 438
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: wrap future streams also in io_runtime #3162
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Ion Koutsouris <[email protected]>
@Tom-Newton can you give this a spin? |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #3162 +/- ##
==========================================
- Coverage 72.22% 72.01% -0.22%
==========================================
Files 138 138
Lines 45251 45381 +130
Branches 45251 45381 +130
==========================================
- Hits 32683 32680 -3
- Misses 10498 10622 +124
- Partials 2070 2079 +9 ☔ View full report in Codecov by Sentry. |
Signed-off-by: Ion Koutsouris <[email protected]>
36fccee
to
01da20b
Compare
I plan to give this a look later today |
I gave it a try, but I can definitely still reproduce |
@Tom-Newton what is your timeout set to? |
|
@Tom-Newton can you set it to a very large interval, 30 minutes or something? |
I gave it a try with 30 minutes. The success rate is not noticeably better but we do get a different error message.
Potentially this is useful information, but I have not yet had a chance to investigate further. It sounds like in this case Azure is returning an error response, which would likely be informative. |
@Tom-Newton this is during reading I presume? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to hold off on this change being merged for the time being. I need to understand the consequences of the changes to the core storage module here.
The 🍝 we have floating around to accommodate asynchronous behaviors with Python is creeping further into core here, and that gives me a lot of concern so I need to spend a fair bit more time evaluating this change
@@ -124,7 +124,7 @@ Let’s write 100 hours worth of data to the Delta table. | |||
|
|||
=== "Rust" | |||
```rust | |||
let mut table = DeltaOps::try_from_uri("observation_data") | |||
let mut table = DeltaOps::try_from_uri("observation_data", IORuntime::default()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this and the other documentation change are supposed to be wrapped in a Some
Yeah I'm fine with that, the discussion where this started from is also still ongoing: apache/datafusion#14286 (comment) so I don't think there is a full consensus on how to solve this topic yet, right @alamb? |
@alamb pointed out that the get stream would be not executed on the IO runtime unless it was wrapped. So thanks to his example PR I've added this now for the other streams :) Thanks @alamb!
Also addressed some other missing areas where I should have enabled the IO runtime:
I've also made it possible to pass an optional runtime in DeltaOps::try_from_uri, since this was missing but it seems a common used entry point to open a delta table