The `DbCreateFunction` operator, which the REST API uses to register Python UDFs, stores the pickled form of the UDF in the local Postgres database of each worker that exists at registration time, but it does not register the implementation in the Myria system catalog. As a result, workers added later never receive the implementation in their local Postgres database, and queries that use previously registered UDFs fail on those workers.
Based on discussion with @BrandonHaynes, I think the registration API needs to be redesigned to be compatible with elasticity. It would be relatively simple to store the pickled form of each Python UDF as a file in a well-known directory on the master, with the filename corresponding to the function's registered name in the catalog (this is roughly the design we're using for Java UDFs). REEF would be responsible for copying the pickled function files to each worker on cluster startup, and each worker would register the pickled functions in its local Postgres database during its initialization stage. We can use the same approach for Postgres UDFs if we store them as script files in a well-known directory on the master.
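To make the proposal concrete, here is a minimal Python sketch of the file-per-UDF layout. It is only an illustration, not Myria code: `store_udf`, `register_all`, the `.pkl` suffix, and the `register` callback are all hypothetical names, the pickled UDF is treated as opaque bytes, and the copy step that REEF would perform is not modeled.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable

UDF_SUFFIX = ".pkl"  # hypothetical file extension for pickled UDFs


def store_udf(udf_dir: Path, name: str, pickled: bytes) -> Path:
    """Master side: persist an already-pickled UDF under its registered name."""
    # Reject empty names and names containing path separators, so the
    # file always lands directly inside udf_dir.
    if not name or Path(name).name != name:
        raise ValueError(f"invalid UDF name: {name!r}")
    udf_dir.mkdir(parents=True, exist_ok=True)
    target = udf_dir / (name + UDF_SUFFIX)
    # Write to a temporary file and rename it, so a reader never sees
    # a partially written UDF file.
    fd, tmp = tempfile.mkstemp(dir=udf_dir)
    with os.fdopen(fd, "wb") as f:
        f.write(pickled)
    os.replace(tmp, target)
    return target


def register_all(udf_dir: Path, register: Callable[[str, bytes], None]) -> int:
    """Worker side: during initialization, register every UDF file found.

    `register` stands in for whatever installs the function in the
    worker's local Postgres database; it is left abstract here.
    """
    count = 0
    for path in sorted(udf_dir.glob("*" + UDF_SUFFIX)):
        register(path.stem, path.read_bytes())
        count += 1
    return count
```

With this layout, a worker that joins after registration runs the same `register_all` pass over its copy of the directory as every other worker, so it does not depend on having been present when the UDF was first registered.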