Python UDFs are incompatible with elasticity #895

Open
senderista opened this issue May 26, 2017 · 0 comments

The DbCreateFunction operator, which the REST API uses to register Python UDFs, stores the pickled form of each UDF in the local Postgres database of every worker that exists at registration time, but it does not register the implementation in the Myria system catalog. Workers added later therefore never receive the implementation in their local Postgres database, and queries that use previously registered UDFs fail on those new workers.
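To make the failure mode concrete, here is a toy simulation in plain Python. It is not Myria code: `Worker`, `register_udf`, and the `local_udfs` dict are hypothetical stand-ins for a worker, the registration path, and the worker's local Postgres database, and a builtin stands in for the UDF so the sketch runs with the standard library alone.

```python
import pickle


class Worker:
    """Hypothetical stand-in for a worker and its local Postgres database."""

    def __init__(self, worker_id):
        self.worker_id = worker_id
        self.local_udfs = {}  # registered name -> pickled bytes

    def call_udf(self, name, *args):
        # Raises KeyError if this worker never received the UDF.
        return pickle.loads(self.local_udfs[name])(*args)


def register_udf(workers, name, func):
    """Mimic the reported behavior: write only to workers that exist right now."""
    blob = pickle.dumps(func)
    for worker in workers:
        worker.local_udfs[name] = blob


workers = [Worker(1), Worker(2)]
register_udf(workers, "absolute", abs)  # builtin used as the "UDF"

# Elastic scale-up after registration: nothing tells the new worker about the UDF.
workers.append(Worker(3))

try:
    workers[2].call_udf("absolute", -3)
    new_worker_failed = False
except KeyError:
    new_worker_failed = True
```

The two original workers can evaluate the function, while the worker added afterwards cannot, because the only copy of the implementation lives in per-worker state.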

Based on discussion with @BrandonHaynes, I think the registration API needs to be redesigned to be compatible with elasticity. It would be relatively simple to store the pickled form of each Python UDF as a file in a well-known directory on the master, with the filename corresponding to the function's registered name in the catalog (this is roughly the design we're using for Java UDFs). REEF will be responsible for copying the pickled function files to each worker on cluster startup, and each worker will register the pickled functions in its local Postgres database during its initialization stage. We can use the same approach for Postgres UDFs if we store them as script files in a well-known directory on the master.
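A minimal sketch of the file-based design described above, assuming one file per UDF named after its registered name. The helper names `save_udf` and `load_udfs` are hypothetical, the directory is a temporary one rather than a real well-known path, the REEF copy step is skipped (the "worker" reads the same directory), and a dict stands in for registration in the worker's local Postgres database.

```python
import pickle
import tempfile
from pathlib import Path


def save_udf(udf_dir, name, func):
    """Master side (hypothetical): write the pickled UDF to <udf_dir>/<name>."""
    udf_dir = Path(udf_dir)
    udf_dir.mkdir(parents=True, exist_ok=True)
    path = udf_dir / name
    path.write_bytes(pickle.dumps(func))
    return path


def load_udfs(udf_dir):
    """Worker side (hypothetical): load every pickled UDF during initialization.

    A real worker would register each one in its local Postgres database;
    here a dict keyed by registered name stands in for that step.
    """
    registry = {}
    for path in sorted(Path(udf_dir).iterdir()):
        if path.is_file():
            registry[path.name] = pickle.loads(path.read_bytes())
    return registry


with tempfile.TemporaryDirectory() as udf_dir:
    # Builtins stand in for user-defined functions so stdlib pickle suffices.
    save_udf(udf_dir, "absolute", abs)
    save_udf(udf_dir, "length", len)

    # A worker that joins later only needs a copy of the directory.
    new_worker_registry = load_udfs(udf_dir)
```

Because the directory on the master is the source of truth, a worker that joins at any time can rebuild its full set of UDFs from the files alone, which is the property the current per-worker registration lacks.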
