Basically, we do the following steps to generate the result vector:
1. Validate the input columns: the number of columns and the length of each column
2. Initialise the result vector
3. Extract values by row from the columns
4. Call the Rust function and handle errors, if any
5. Fill the result vector
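The five steps can be sketched as follows. This is only an illustration under stated assumptions: `Column`, `safe_div`, and `eval_div` are hypothetical names, and a plain `Vec<f64>` stands in for the real column vector types, so it does not reflect the actual `Function::eval` signature.

```rust
// Hypothetical stand-in for a column vector; not the real GreptimeDB type.
type Column = Vec<f64>;

/// The actual logic: a plain Rust function that may fail.
fn safe_div(a: f64, b: f64) -> Result<f64, String> {
    if b == 0.0 {
        Err("division by zero".to_string())
    } else {
        Ok(a / b)
    }
}

/// Hand-written evaluation that follows the five steps.
fn eval_div(columns: &[Column]) -> Result<Column, String> {
    // Step 1: validate the number of columns and the length of each column.
    if columns.len() != 2 {
        return Err(format!("expected 2 columns, got {}", columns.len()));
    }
    let rows = columns[0].len();
    if columns[1].len() != rows {
        return Err("columns have different lengths".to_string());
    }

    // Step 2: initialise the result vector.
    let mut result = Vec::with_capacity(rows);

    for i in 0..rows {
        // Step 3: extract values by row.
        let (a, b) = (columns[0][i], columns[1][i]);
        // Step 4: call the Rust function and propagate any error.
        let value = safe_div(a, b)?;
        // Step 5: fill the result vector.
        result.push(value);
    }
    Ok(result)
}
```

Only step 4 is specific to this function; everything else is boilerplate that would be repeated in every implementation.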
Desired state
Because every implementation has to do steps 1, 2, 3, and 5, an ergonomic solution
is to provide a declarative way to extract Rust data types from column vectors,
so that the user simply focuses on calling the Rust function. The implementation of a UDF
should be stateless, so until we have a real use case, we don't need to provide any
type of execution context other than the original FunctionContext.
The API is inspired by how axum designed its web handlers.
FunctionExtN will provide a default implementation for Function::eval.
TODO: think about how to deal with R
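As a rough sketch of what an axum-style extractor API could look like, the following assumes a hypothetical `FromColumn` extractor trait and a hypothetical `FunctionExt2` trait (the N = 2 case) whose default `eval` handles validation, extraction, and result filling. All names are illustrative, `Vec<f64>` stands in for real column vectors, and the return type is fixed to `f64`, which sidesteps the open question about R.

```rust
// Hypothetical stand-in for a column vector; not the real GreptimeDB type.
type Column = Vec<f64>;

/// Hypothetical extractor trait: builds a typed Rust value from one cell.
trait FromColumn: Sized {
    fn from_column(column: &Column, row: usize) -> Result<Self, String>;
}

/// Read one cell and check that its absolute value is within `limit`.
fn bounded(column: &Column, row: usize, limit: f64, name: &str) -> Result<f64, String> {
    let value = *column
        .get(row)
        .ok_or_else(|| format!("row {row} is out of bounds"))?;
    if value.abs() <= limit {
        Ok(value)
    } else {
        Err(format!("{name} {value} is out of range"))
    }
}

/// Extractors that carry a particular meaning.
struct Latitude(f64);
struct Longitude(f64);

impl FromColumn for Latitude {
    fn from_column(column: &Column, row: usize) -> Result<Self, String> {
        bounded(column, row, 90.0, "latitude").map(Latitude)
    }
}

impl FromColumn for Longitude {
    fn from_column(column: &Column, row: usize) -> Result<Self, String> {
        bounded(column, row, 180.0, "longitude").map(Longitude)
    }
}

/// Hypothetical two-argument variant of FunctionExtN.
trait FunctionExt2 {
    type A: FromColumn;
    type B: FromColumn;

    /// The only method an implementor writes: the per-row Rust call.
    fn call(&self, a: Self::A, b: Self::B) -> Result<f64, String>;

    /// Default implementation covering validation, extraction, and filling.
    fn eval(&self, columns: &[Column]) -> Result<Column, String> {
        if columns.len() != 2 {
            return Err(format!("expected 2 columns, got {}", columns.len()));
        }
        let rows = columns[0].len();
        if columns[1].len() != rows {
            return Err("columns have different lengths".to_string());
        }
        let mut result = Vec::with_capacity(rows);
        for row in 0..rows {
            let a = <Self::A as FromColumn>::from_column(&columns[0], row)?;
            let b = <Self::B as FromColumn>::from_column(&columns[1], row)?;
            result.push(self.call(a, b)?);
        }
        Ok(result)
    }
}

/// A toy UDF: the implementor declares its argument types and writes `call`.
struct LatLngSum;

impl FunctionExt2 for LatLngSum {
    type A = Latitude;
    type B = Longitude;

    fn call(&self, lat: Latitude, lng: Longitude) -> Result<f64, String> {
        Ok(lat.0 + lng.0)
    }
}
```

With this shape, calling `LatLngSum.eval(&columns)` runs the shared default logic, and an out-of-range latitude is rejected by the extractor before `call` is reached.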
Limitation
- Extractors have to be defined manually. However, most extractors can be shared when
  they don't carry a particular meaning such as a coordinate or resolution. In
  some cases, they can be generic numbers or strings.
- Variadic arguments are not supported with this design.
Documentation
A procedural macro is preferred in this case, for two types of usage:
- As a marker for compile-time tools to extract Rust docstrings into Markdown
  files that can be hosted on docs.greptime.com
- As a code-generation macro that generates a doc function returning the
  docstring at runtime, so that we can have a SQL statement like SHOW DOC
  function to return the docstring.
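To illustrate the second usage only, here is a minimal sketch that uses a declarative `macro_rules!` macro instead of a procedural one, so that it fits in a single file. The macro name `documented_udf` and the struct `ExampleUdf` are hypothetical. The macro captures the doc comments on a unit struct and generates a `doc` function that returns them at runtime.

```rust
/// Hypothetical helper: re-emits the struct with its doc comments and adds
/// a `doc()` function that returns the same text at runtime.
macro_rules! documented_udf {
    ($(#[doc = $doc:literal])* $vis:vis struct $name:ident;) => {
        $(#[doc = $doc])*
        $vis struct $name;

        impl $name {
            /// Returns the docstring, one source line per output line.
            pub fn doc() -> &'static str {
                concat!($($doc, "\n"),*)
            }
        }
    };
}

documented_udf! {
    /// Adds a latitude and a longitude.
    /// Arguments: latitude, longitude.
    pub struct ExampleUdf;
}
```

A statement such as SHOW DOC could then be served by looking up the function and returning `ExampleUdf::doc()`. A real procedural macro could do the same on an `impl` block or function, but it has to live in its own crate.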
Implementation challenges
No response
@sunng87 We are going to remove the wrapper layer of our UDF/UDAF and use datafusion's UDF API in the future. Not sure if this issue can benefit from it.
What type of enhancement is this?
Refactor
What does the enhancement do?
The idea is to create a high-level framework for UDF development (not UDAF) to
remove boilerplate code and improve ergonomics.
The core responsibility of this framework is to provide:
Current status
At the moment, a typical implementation of a UDF looks like this one:
https://github.com/GreptimeTeam/greptimedb/blob/main/src/common/function/src/scalars/geo/h3.rs#L95