Support ingesting objects that support the Arrow PyCapsule API #498
Comments
Yes, though it does require the user to have a relatively recent version of pyarrow. Let me know if I can help with pyo3-arrow at all! I've only published a version support
Thanks for chiming in @kylebarron
Yeah, I need to update DataFusion and Arrow soon anyway.
That's correct.
Thanks for the callout, that's a good point.
I started a PR for polars PyCapsule export here: pola-rs/polars#17676
If you point me to where the Arrow ingest happens, I could probably make a PR for this if you'd like.
Thanks for the offer! Here is where the pyarrow tables are imported: vegafusion/vegafusion-common/src/data/table.rs, lines 270 to 286 in 007bd44.
This is invoked from the PyO3 Rust code in vegafusion/vegafusion-python-embed/src/lib.rs, lines 189 to 196 in 007bd44.
I'm imagining there would be a If this is blocked by updating arrow-rs, I can ping this thread once that's done. |
I think that's primarily a question of whether you're OK with vendoring the relevant PyCapsule code (on top of arrow-rs's FFI code). It's a relatively small amount of code (see the polars PR for reference), and then you don't have to add a dependency on pyo3-arrow if you don't want to.
We could support ingesting objects that implement the Arrow PyCapsule API.
Compared to the current support for the DataFrame Interchange Protocol, accepting objects that implement the Arrow PyCapsule API wouldn't require pyarrow, and wouldn't require converting to a pyarrow Table on the Python side.
I think we could use @kylebarron's new pyo3-arrow crate for this (since it doesn't require the pyarrow dependency). In fact, I think we could drop pyarrow as a hard dependency using this approach, since pyarrow itself supports the PyCapsule API.
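As a rough illustration of the difference between the two routes, here is a minimal Python sketch of how a consumer might decide which ingestion path to take. The helper `pick_ingest_path` and the two stand-in producer classes are hypothetical and are not part of VegaFusion or pyo3-arrow; only the dunder method names (`__arrow_c_stream__`, `__arrow_c_array__`, `__dataframe__`) come from the Arrow PyCapsule Interface and the DataFrame Interchange Protocol. The sketch only detects which protocol is available; it does not import any data.

```python
def pick_ingest_path(obj):
    """Return a label for the ingestion route a consumer might prefer.

    Hypothetical helper for illustration only; not VegaFusion's actual logic.
    """
    # Arrow PyCapsule Interface: stream-like producers (tables, readers)
    # expose __arrow_c_stream__, so no pyarrow import is needed to detect it.
    if hasattr(obj, "__arrow_c_stream__"):
        return "arrow_c_stream"
    # Array-like producers (e.g. a single record batch) expose __arrow_c_array__.
    if hasattr(obj, "__arrow_c_array__"):
        return "arrow_c_array"
    # Fallback: DataFrame Interchange Protocol, which in the current setup
    # means converting to a pyarrow Table on the Python side.
    if hasattr(obj, "__dataframe__"):
        return "dataframe_interchange"
    raise TypeError(f"Unsupported data object: {type(obj).__name__}")


class FakeCapsuleProducer:
    """Stand-in for a library object that implements the PyCapsule stream export."""

    def __arrow_c_stream__(self, requested_schema=None):
        raise NotImplementedError("illustration only")


class FakeInterchangeProducer:
    """Stand-in for a library object that only implements __dataframe__."""

    def __dataframe__(self, nan_as_null=False, allow_copy=True):
        raise NotImplementedError("illustration only")


print(pick_ingest_path(FakeCapsuleProducer()))
print(pick_ingest_path(FakeInterchangeProducer()))
```

Checking for the PyCapsule methods first reflects the preference described above: an object that supports both protocols would be ingested through Arrow's C interface without a pyarrow round trip.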
cc @MarcoGorelli based on comment in vega/altair#3452 (comment)
In order for VegaFusion (which powers Vega-Altair's optional "vegafusion" data transformer) to support polars without pyarrow (so that operations like Vega-Altair's histogram binning and aggregation are performed in the Python kernel rather than in the browser), I think we'll need polars to support the PyCapsule API as discussed in pola-rs/polars#12530.