Ibis backend #53
Comments
😄 thanks for the suggestion, but I'd rather look into converting to Substrait directly. Ibis is too heavy; I'd rather avoid it even as an optional dependency.
closing then, but I appreciate your interest!
hello! just to add a bit more flavor on this from the Ibis perspective:
we would be interested in what we've called "API skins" on top of Ibis for pandas, PySpark, and Polars. the main issue here is time and effort. with pandas, you're always going to end up with an operation support matrix, but there's already a pretty good start with BigQuery dataframes -- a pandas clone built on top of Ibis. one could take that project and make it work for any generic backend. for PySpark/Polars, the approach would be similar. the Polars API has been relatively unstable and this probably wouldn't make sense to do until 1.0, though we'd welcome any contributions in this direction! it's just not asked for frequently enough for us to put in the effort for any of these "skins". you could also probably get a long way by simply aliasing a few things, like
the "heavy" installation size of Ibis is nearly all from a few packages:
Narwhals has (or would have) the same issues today -- if you're running it on pandas, you must install pandas and numpy (and probably should install PyArrow), so your size is about the same. for any other backend, you have to install that backend and its dependencies, so sizes may vary.
Ibis has been working to make pandas/pyarrow optional dependencies, and that work is pretty close to done. if you or anyone else is eager to see that work get over the finish line, we welcome contributions! but "heavy" installation isn't a very frequent complaint. pandas/pyarrow dependencies are all over the place and it's not a big issue for most users.
we can already go from Ibis to Substrait -- the main issue here is that Substrait is still fairly nascent, and unsupported by most backends. it does look promising that in a few years we will simply need dataframe API -> Substrait -> backend, but for now, if Narwhals wants to support 20+ backends, I'm afraid it'll end up duplicating most of the work done in Ibis.
Thanks for your input! I have a client for whom I'm running the following on AWS Lambda:
It all fits, and it all works wonderfully. Including a package which uses Ibis would be a non-starter.
Narwhals doesn't aim to support 20+ backends. My only objective here is to provide a compatibility layer that allows for writing dataframe-agnostic code which:
The target audience is library developers, not end users. In that sense, I think its goals differ from Ibis', which is why I don't consider them competitors.
I don't even know if anyone's going to use Narwhals. At the moment it's just a fun experiment, and it's working out better than I was expecting it to.
this is also out of scope for now, see https://github.com/MarcoGorelli/narwhals/issues/60 for more explanation
Quick update: I'd like to support Ibis, but not for the full Narwhals API, see #566.
For anyone interested in running DuckDB with a Python API, I'd suggest sticking with Ibis. Realistically, that will always be out of scope for Narwhals.
I have a little library that provides an ipywidget for better Ibis table exploration in Jupyter. I am interested in modifying it to work with all dataframe libs, e.g. Polars and pandas. I thought Narwhals would be a good fit for this. But I want it to still work with Ibis. Am I correct in understanding that I am blocked on that until Narwhals supports Ibis as a backend?
hey @NickCrews nice library! thanks for your question - for now I'd suggest having separate codepaths: a narwhals/dataframe one, and an ibis/sql one. I'll let you know when we progress with the lazy-only layer of support (possibly some time in 2025; right now the priority is on helping some integrations go from 80% to 100% of the way)
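One way the "separate codepaths" suggestion could look in practice is sketched below. This is only an illustration, not code from Narwhals or Ibis: `root_package` and `dispatch` are hypothetical helper names, and the check assumes that Ibis table expressions are instances of classes defined somewhere under the `ibis` package. Inspecting the module name rather than calling `isinstance` means Ibis does not need to be imported (or even installed) when the input is a pandas or Polars object.

```python
from typing import Any, Callable


def root_package(obj: Any) -> str:
    """Return the top-level package in which the object's class is defined."""
    return type(obj).__module__.split(".")[0]


def dispatch(
    table: Any,
    dataframe_path: Callable[[Any], Any],
    ibis_path: Callable[[Any], Any],
) -> Any:
    """Send Ibis objects to the Ibis/SQL path, everything else to the dataframe path."""
    # Assumption: Ibis expression classes live under the `ibis` package, so the
    # module name is enough to recognise them without importing Ibis here.
    if root_package(table) == "ibis":
        return ibis_path(table)
    return dataframe_path(table)
```

Here `dataframe_path` would be the function written against Narwhals, and `ibis_path` the one written against Ibis directly; both are supplied by the caller, so the helper itself depends on neither library.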
Could be interesting, since you could then run any of their 20 backends with a Polars-like API 👀