WIP: VegaFusion runtime in a Worker for wasm support #529

Closed
jonmmease wants to merge 27 commits

jonmmease
Collaborator

Experimental alternative to #527 that splits the VegaFusion runtime into a separate wasm package, hosting it inside a web worker. I was curious if this approach would reduce stuttering on the main thread, but it didn't end up making a difference, so I'm not planning to move forward with it right now. Creating and closing a PR here for future reference.
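For future reference, here is a minimal sketch of the general pattern described above: the main thread tags each request with an id, the worker side dispatches it to the runtime, and the reply is matched back to the caller. This is not the code from this PR. The names `RpcRequest`, `RpcReply`, `handleRequest`, `RuntimeClient`, and the `echo_upper` method are all hypothetical, and an in-memory loopback stands in for `postMessage` so the sketch runs without a browser.

```typescript
// Hypothetical message shapes; not VegaFusion's actual protocol.
type RpcRequest = { id: number; method: string; payload: unknown };
type RpcReply =
  | { id: number; ok: true; result: unknown }
  | { id: number; ok: false; error: string };

// Stand-in for the runtime that would live inside the worker.
type RuntimeMethods = Map<string, (payload: unknown) => unknown>;

// Worker side: dispatch one request to the runtime and build a reply.
function handleRequest(methods: RuntimeMethods, req: RpcRequest): RpcReply {
  const fn = methods.get(req.method);
  if (fn === undefined) {
    return { id: req.id, ok: false, error: `unknown method: ${req.method}` };
  }
  try {
    return { id: req.id, ok: true, result: fn(req.payload) };
  } catch (e) {
    return { id: req.id, ok: false, error: String(e) };
  }
}

// Main-thread side: assign ids to requests and route replies to callers.
class RuntimeClient {
  private nextId = 0;
  private pending = new Map<number, (reply: RpcReply) => void>();
  private post: (req: RpcRequest) => void;

  constructor(post: (req: RpcRequest) => void) {
    this.post = post;
  }

  call(method: string, payload: unknown, onReply: (reply: RpcReply) => void): void {
    const id = this.nextId++;
    this.pending.set(id, onReply);
    this.post({ id, method, payload });
  }

  // Feed every message coming back from the worker into this method.
  receive(reply: RpcReply): void {
    const callback = this.pending.get(reply.id);
    if (callback !== undefined) {
      this.pending.delete(reply.id);
      callback(reply);
    }
  }
}

// Loopback wiring (assumption: replaces the real worker boundary for illustration).
const demoMethods: RuntimeMethods = new Map<string, (payload: unknown) => unknown>();
demoMethods.set("echo_upper", (payload: unknown) => String(payload).toUpperCase());

const demoClient: RuntimeClient = new RuntimeClient((req: RpcRequest) => {
  demoClient.receive(handleRequest(demoMethods, req));
});
```

In a real page, `post` would wrap `worker.postMessage` and `receive` would be called from `worker.onmessage`. Moving the runtime off the main thread this way adds a serialization hop per request, which may be part of why it did not reduce the stuttering here.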

* Remove vegafusion-jupyter package

* port altair mock tests to use JupyterChart and

* relock, update pixi version in GA

* update ci

* install chrome, install test deps

* Don't run selenium tests on osx arm

* debugging

* disable selenium tests for windows

* remove unneeded windows deps

* comment other jobs

* add ipykernel

* fix selection

* restore all tests

* restore all tests

* comment out test with slider position issue
* Remove altair functionality from vegafusion package

* remove altair dep

* Install altair when testing
* combine vegafusion and vegafusion-python-embed

* fix exclude

* action fixes

* exclude based on crate name not directory name

* Fix pip install paths

* don't try to import embed
* drop Java library

* update workspace
…Python logic (#513)

* Initial update to DataFusion 42 and Arrow 53.1

* Remove protobuf python code

* Remove python protobuf instructions

* fix jsonwriter tests

* fix select tests

* more sort test fixes

* more test fixes

* test / clippy fixes

* more test/lint fixes

* clippy/fmt fixes

* Remove order col after evaluation

* clippy fix

* fixes

* update baselines

* fix sql tests

* update baselines

* sort

* fmt

* use published arrow crates
* Add VegaFusionWidget based on AnyWidget

* remove comment

* add vegafusion_widget test

* Rename test file

* fix ignore path for rename
* ruff format

* Initial lint fix

* Initial ruff lint passing

* Add mypy type checking

* Add ci check

* Add py.typed

* ignore mypy errors in jupyter widget

* fix imports

* bump min version to Python 3.9

* numpy < 2
* Add get_column_usage utility function

* Add Python function for column usage
* Add initial Arrow PyCapsule support

* Bump python in actions to 3.11

* fmt

* update hang test to use polars with pycapsule path

* Add system python of 3.11

* Add more efficient hashing and rechunk for DataFusion

* fmt

* use narwhals to remove unused columns.

* toward removing arrow/pyarrow flag

* Remove arrow-rs pyarrow flag, use pyo3-arrow

* Rename pyarrow feature flag to `py`

* toward using narwhals to process transformed data

* Remove python datasource

* Handle dict and non-narwhals pycapsule types

* update default extraction to arro3

* fix type checking

* build py before type checking

* skip empty fields in window transform

* Try normalize category order

* lower min Python back to 3.9

* clear wheel build dir first

* try rename artifacts

* cache?
* remove dep on psutil, pandas, pyarrow

* Add test that polars usage doesn't import pandas/pyarrow

* Add pandas/pyarrow to test deps
* Move ChartState to vegafusion-core, add VegaFusionRuntimeTrait

* format

* fmt

* type fixes
* Reapply "Refactor to move ChartState to vegafusion core (#519)" (#520)

This reverts commit 713b159.

* Use ChartState in vegafusion-wasm

* format
* Add grpc VegaFusionRuntimeTrait implementation, use from Python

* include inline datasets in query_request

* fix tests

* fix tests

* Add unimplemented stubs in wasm implementation

* fmt

* Move methods up to VegaFusionRuntimeTrait

* fmt/fix

* handle inline datasets in vegafusion-wasm

* warning / format

* fix requested indices

* fix python lint

* fix expected error message

* relock, mypy fixes

* bring back ignores for CI?
* Add release-opt profile, remove unused flags/deps

* add profile for minimizing size

* Fix tests
* Remove Python sql connection / sql dataset

* Remove rust python connection logic

* Fix python tests

* fmt
…Fs (#525)

* wip refactor to use DataFusion's DataFrame

* Add aggregate support

* Port additional transforms

* Port additional transforms

* Port window transform

* Port fold transform

* Port impute transform

* Port pivot transform

* port timeunit

* get schema from first batch

* Don't require metadata match

* start porting stack

* finish stack transform port

* Use object-store with DataFusion to load from http

* wip time functions

* parse %Y-%m-%d in UTC like the browser

* Update timeunit transform to use datafusion operations

* remove unused UDFs

* json fallback to reqwest

* Fix timezone parsing

* Fix selection_test

* all custom spec tests passing

* get image_comparison tests passing

* Get all vegafusion-runtime tests passing

* fix

* fix

* remove more udfs

* remove vegafusion-datafusion-udfs, vegafusion-dataframe, and vegafusion-sql crates

* fix tests

* clippy fix

* format

* warnings / format

* python test updates

* Update to datafusion main

* fmt

* re-enable format millis test, fix substr args

* Support Utf8View in json writer

* fix remaining python tests

* fmt

* clippy fix

* fmt

* work around wasm-pack error

* add call to update-pkg.js

* remove some stale comments
…526)

* Use vega-embed in vegafusion-wasm, drop vegafusion-embed package

* expose view, remove unneeded methods
@jonmmease jonmmease closed this Nov 5, 2024