forked from alibaba/feathub
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Showing 13 changed files with 149 additions and 129 deletions.
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
# FeatHub SDK | ||
|
||
FeatHub empowers users to define features using a Pythonic and declarative SDK | ||
with built-in functions, allowing recursive feature creation based on existing | ||
definitions. | ||
|
||
Compared to SQL-based SDKs, FeatHub's Pythonic SDK seamlessly integrates with | ||
Python-focused machine learning libraries like scikit-learn and PyTorch. | ||
Python's expressiveness, including for-loops and if/else statements, leads to | ||
more concise and readable feature definition code, especially for a multitude | ||
of similar patterned features. | ||
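
As a minimal sketch of the point about loops, the snippet below generates several similarly patterned feature definitions from one template. It uses plain dictionaries rather than FeatHub classes. The field names (`name`, `expr`, `agg_func`, `window_size`) and the helper `build_window_features` are hypothetical and chosen only for illustration.

```python
from datetime import timedelta


def build_window_features(column, agg_funcs, window_sizes):
    """Return one feature spec per (aggregation, window) combination.

    The spec layout is an assumption for illustration, not FeatHub's API.
    """
    specs = []
    for agg in agg_funcs:
        for window in window_sizes:
            minutes = int(window.total_seconds() // 60)
            specs.append(
                {
                    "name": f"{column}_{agg.lower()}_last_{minutes}_minutes",
                    "expr": column,
                    "agg_func": agg,
                    "window_size": window,
                }
            )
    return specs


# Two aggregations times three windows yields six feature specs.
specs = build_window_features(
    column="price",
    agg_funcs=["SUM", "AVG"],
    window_sizes=[timedelta(minutes=m) for m in (1, 5, 60)],
)
for spec in specs:
    print(spec["name"])
```

Writing the same six definitions by hand would repeat the template six times, which is the duplication a loop avoids.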
|
||
- [FeatureView and Transformation](feature-view.md) | ||
- [Data Types and Reserved Keywords](dtypes.md) | ||
- [Built-in Operators and Functions](functions.md) | ||
- [Built-in Aggregation Functions](aggregation_functions.md) | ||
|
File renamed without changes.
File renamed without changes.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,59 @@ | ||
# FeatureView - A Table of Features | ||
|
||
A `FeatureView` provides metadata to derive a table of feature values from | ||
other tables. FeatHub currently supports the following types of FeatureViews. | ||
|
||
- [DerivedFeatureView](https://github.com/alibaba/feathub/blob/master/python/feathub/feature_views/derived_feature_view.py) | ||
derives features by applying the given transformations on an existing table. | ||
It supports per-row transformation, over window transformation and table join. | ||
It does not support sliding window transformation. | ||
- [SlidingFeatureView](https://github.com/alibaba/feathub/blob/master/python/feathub/feature_views/sliding_feature_view.py) | ||
derives features by applying the given transformations on an existing table. | ||
It supports per-row transformation and sliding window transformation. It does | ||
not support join or over window transformation. | ||
- [OnDemandFeatureView](https://github.com/alibaba/feathub/blob/master/python/feathub/feature_views/on_demand_feature_view.py) | ||
derives features by joining an online request with features from tables in online | ||
feature stores. It supports per-row transformation and join with tables in | ||
online stores. It does not support over window transformation or sliding window | ||
transformation. | ||
- [SqlFeatureView](https://github.com/alibaba/feathub/blob/master/python/feathub/feature_views/sql_feature_view.py) | ||
derives features by evaluating a given SQL statement. Currently, its | ||
semantics depend on the processor used during deployment. We plan to make it | ||
processor-agnostic in the future to ensure consistent semantics regardless of | ||
processor choice. | ||
|
||
`FeatureView` provides APIs to specify and access `Feature`s. Each `Feature` is | ||
defined by the following metadata: | ||
- `name`: a string that uniquely identifies this feature in the parent table. | ||
- `dtype`: the data type of this feature's values. | ||
- `transform`: A declarative definition of how to derive this feature's values. | ||
- `keys`: an optional list of strings, corresponding to the names of fields in | ||
the parent table necessary to interpret this feature's values. If it is | ||
specified, it is used as the join key when FeatHub joins this feature onto | ||
another table. | ||
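
The metadata fields above can be pictured with a small stand-in class. This is a sketch only: `FeatureSpec` and `check_unique_names` are hypothetical names, not FeatHub's `Feature` class, and the string values used for `dtype` and `transform` are placeholders.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class FeatureSpec:
    """Hypothetical stand-in for the metadata listed above (not FeatHub's class)."""

    name: str                         # unique within the parent table
    dtype: str                        # data type of the feature's values
    transform: str                    # declarative description of the computation
    keys: Optional[List[str]] = None  # optional join keys


def check_unique_names(features):
    """Raise ValueError if two features in the same table share a name."""
    seen = set()
    for feature in features:
        if feature.name in seen:
            raise ValueError(f"duplicate feature name: {feature.name}")
        seen.add(feature.name)


features = [
    FeatureSpec(name="cost", dtype="float64", transform="item_count * price"),
    FeatureSpec(
        name="total_cost",
        dtype="float64",
        transform="SUM(cost)",
        keys=["user_id"],
    ),
]
check_unique_names(features)  # passes: the names are distinct
```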
|
||
# Transformation - Declarative Definition of Feature Computation | ||
|
||
A `Transformation` defines how to derive a new feature from existing features. | ||
FeatHub currently supports the following types of Transformations. | ||
|
||
- [ExpressionTransform](https://github.com/alibaba/feathub/blob/master/python/feathub/feature_views/transforms/expression_transform.py) | ||
derives feature values by applying a FeatHub expression to one row of the | ||
parent table at a time. The FeatHub expression language is a declarative | ||
language, sharing a syntax and grammar reminiscent of the SQL SELECT clause. | ||
See [here](./) for a comprehensive list of built-in data types, functions | ||
and operators. | ||
- [OverWindowTransform](https://github.com/alibaba/feathub/blob/master/python/feathub/feature_views/transforms/over_window_transform.py) | ||
derives feature values by applying a FeatHub expression and an aggregation | ||
function to multiple rows of a table at a time. | ||
- [SlidingWindowTransform](https://github.com/alibaba/feathub/blob/master/python/feathub/feature_views/transforms/sliding_window_transform.py) | ||
derives feature values by applying a FeatHub expression and an aggregation | ||
function to multiple rows in a sliding window. | ||
- [JoinTransform](https://github.com/alibaba/feathub/blob/master/python/feathub/feature_views/transforms/join_transform.py) | ||
derives feature values by joining the parent table with a feature from another | ||
table. | ||
- [PythonUdfTransform](https://github.com/alibaba/feathub/blob/master/python/feathub/feature_views/transforms/python_udf_transform.py) | ||
derives feature values by applying a Python UDF on one row of the parent table | ||
at a time. | ||
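
To make the difference between a per-row transformation and an over-window aggregation concrete, here is a plain-Python sketch that does not use FeatHub. The column names, the two-minute window, and the boundary convention (a row is included when its timestamp lies in `(ts - window, ts]`) are assumptions made for this illustration and may differ from FeatHub's actual semantics.

```python
from datetime import datetime, timedelta

rows = [
    {"user_id": "u1", "price": 2.0, "count": 3, "ts": datetime(2023, 1, 1, 0, 0)},
    {"user_id": "u1", "price": 1.0, "count": 5, "ts": datetime(2023, 1, 1, 0, 1)},
    {"user_id": "u2", "price": 4.0, "count": 1, "ts": datetime(2023, 1, 1, 0, 1)},
    {"user_id": "u1", "price": 3.0, "count": 1, "ts": datetime(2023, 1, 1, 0, 5)},
]


def per_row_cost(row):
    """Per-row transformation: uses only the fields of a single row."""
    return row["price"] * row["count"]


def over_window_sum(all_rows, current, window):
    """Over-window aggregation: sum the per-row cost over rows that share the
    current row's key and fall in (current ts - window, current ts]."""
    start = current["ts"] - window
    return sum(
        per_row_cost(r)
        for r in all_rows
        if r["user_id"] == current["user_id"] and start < r["ts"] <= current["ts"]
    )


costs = [per_row_cost(r) for r in rows]
totals = [over_window_sum(rows, r, timedelta(minutes=2)) for r in rows]
print(costs)   # one value per row, computed from that row alone
print(totals)  # one value per row, computed from several rows
```

Both produce one output per input row. The difference is how many input rows contribute to each output.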
|
||
|
File renamed without changes.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,5 @@ | ||
# Metric Stores | ||
|
||
- [Overview](overview.md) | ||
- [Built-in Metrics](metrics.md) | ||
- [Prometheus](prometheus.md) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,50 @@ | ||
# Built-in Metrics | ||
|
||
Below are FeatHub's built-in metric types, their parameters, and their | ||
exposed tags. | ||
|
||
## Count | ||
|
||
Count is a metric that shows the number of features. It has the following | ||
parameters: | ||
|
||
- filter_expr: Optional, with None as the default value. If it is not None, it | ||
represents a partial FeatHub expression that evaluates to a boolean value. | ||
The partial FeatHub expression should be a binary operator whose left child is | ||
absent and will be filled in with the host feature name. For example, "IS | ||
NULL" will be enriched into "{feature_name} IS NULL". Only features for which | ||
this expression evaluates to True are considered when computing the | ||
metric. | ||
- window_size: Optional, with 0 as the default value. The time range over which | ||
the metric is computed. It should be zero or a positive time span. If it is | ||
zero, the metric is computed from all feature values that have been processed | ||
since the FeatHub job was created. | ||
|
||
It exposes the following metric-specific tags: | ||
|
||
- metric_type: "count" | ||
- filter_expr: The value of the filter_expr parameter. | ||
- window_size_sec: The value of the window_size parameter in seconds. | ||
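
The Count parameters and tags described above can be sketched in plain Python as follows. This is not FeatHub's implementation: the function names are hypothetical, the filter is modeled as a Python predicate instead of a parsed FeatHub expression, and the window is assumed to cover timestamps in `(now - window_size, now]`.

```python
from datetime import datetime, timedelta


def enrich_filter_expr(feature_name, filter_expr):
    """Fill in the absent left operand with the host feature name."""
    return f"{feature_name} {filter_expr}"


def count_metric(values, now, window_size=timedelta(0), predicate=None):
    """Count (timestamp, value) pairs, optionally filtered and windowed.

    A zero window_size means every value seen so far is counted.
    """
    count = 0
    for ts, value in values:
        if window_size > timedelta(0) and ts <= now - window_size:
            continue  # outside the assumed time range
        if predicate is not None and not predicate(value):
            continue  # does not satisfy the filter
        count += 1
    return count


def count_tags(filter_expr, window_size):
    """Build the metric-specific tags listed above (value formats assumed)."""
    return {
        "metric_type": "count",
        "filter_expr": filter_expr,
        "window_size_sec": window_size.total_seconds(),
    }


now = datetime(2023, 1, 1, 0, 10)
values = [
    (datetime(2023, 1, 1, 0, 0), 1.0),
    (datetime(2023, 1, 1, 0, 8), None),
    (datetime(2023, 1, 1, 0, 9), 2.0),
]
is_null = lambda v: v is None  # stands in for the "IS NULL" filter

print(enrich_filter_expr("price", "IS NULL"))
print(count_metric(values, now))
print(count_metric(values, now, predicate=is_null))
print(count_metric(values, now, window_size=timedelta(minutes=5)))
```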
|
||
## Ratio | ||
|
||
Ratio is a metric that shows the ratio of the number of features that satisfy | ||
filter_expr to the number of all features. It has the following parameters: | ||
|
||
- filter_expr: A partial FeatHub expression that evaluates to a boolean value. | ||
The partial FeatHub expression should be a binary operator whose left child is | ||
absent and will be filled in with the host feature name. For example, "IS | ||
NULL" will be enriched into "{feature_name} IS NULL". Only features for which | ||
this expression evaluates to True are considered when computing the | ||
metric. | ||
- window_size: Optional, with 0 as the default value. The time range over which | ||
the metric is computed. It should be zero or a positive time span. If it is | ||
zero, the metric is computed from all feature values that have been processed | ||
since the FeatHub job was created. | ||
|
||
It exposes the following metric-specific tags: | ||
|
||
- metric_type: "ratio" | ||
- filter_expr: The value of the filter_expr parameter. | ||
- window_size_sec: The value of the window_size parameter in seconds. | ||
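
A plain-Python sketch of the Ratio computation follows. As before, it is an illustration rather than FeatHub's code: `ratio_metric` is a hypothetical name, the filter is a Python predicate, the window is assumed to cover `(now - window_size, now]`, and returning `None` when no values are in range is an assumption about the empty case.

```python
from datetime import datetime, timedelta


def ratio_metric(values, now, predicate, window_size=timedelta(0)):
    """Return matching values divided by all values in the time range.

    A zero window_size means every value seen so far is included.
    """
    in_window = [
        value
        for ts, value in values
        if window_size == timedelta(0) or ts > now - window_size
    ]
    if not in_window:
        return None  # assumption: the ratio is undefined without data
    matched = sum(1 for value in in_window if predicate(value))
    return matched / len(in_window)


now = datetime(2023, 1, 1, 0, 10)
values = [
    (datetime(2023, 1, 1, 0, 0), 1.0),
    (datetime(2023, 1, 1, 0, 4), 3.0),
    (datetime(2023, 1, 1, 0, 8), None),
    (datetime(2023, 1, 1, 0, 9), 2.0),
]
is_null = lambda v: v is None  # stands in for the "IS NULL" filter

print(ratio_metric(values, now, is_null))
print(ratio_metric(values, now, is_null, window_size=timedelta(minutes=5)))
```

Unlike Count, the filter is required here, because without it the numerator and denominator would always be equal.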
|