Problem Description
Currently, the ScalarRange, ScalarInequality, Positive, and Negative constraints all scale the data using log-based transformers. Since the SDV already enforces min/max values by simpler means, these constraints can be deprecated. However, it would be helpful to move the scaling logic into the RDT library, so that it may be used independently of the constraints in the future.
Expected behavior
Create a new RDT transformer called LogitScaler. This transformer scales the data by applying a logit function. Its functionality is equivalent to the ScalarRange constraint.
Parameters
missing_value_replacement (object): Same as FloatFormatter
missing_value_generation (str or None): Same as FloatFormatter
min_value (float): The lower bound of the range used by the logit function. Defaults to 0.0.
max_value (float): The upper bound of the range used by the logit function. Defaults to 1.0.
learn_rounding_scheme (bool): Same as FloatFormatter
Implementation Notes
Apply logic similar to that used in ScalarRange.
During fit:
Learn needed information about missing values and rounding scheme.
Validate that we can actually apply the logit function to the data.
During transform:
Fill in missing values
Transform the data
Use the logit function to transform the data from range [min_value, max_value] into range (-inf, +inf)
If there are any issues with taking the logit, raise a descriptive error explaining what went wrong:
Example: Error: Unable to apply the logit function to column 'output-value' due to an out-of-range value (101.0).
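The fit and transform steps above could be sketched roughly as follows. This is only an illustration in plain Python, not the RDT implementation: the class name LogitScalerSketch, its method signatures, and the list-based input are all hypothetical, and the rounding scheme and missing_value_generation are left out. It assumes the logit is taken after rescaling the value to [0, 1], that is log(s / (1 - s)) with s = (x - min_value) / (max_value - min_value), and that values exactly at the bounds map to -inf or +inf.

```python
import math


class LogitScalerSketch:
    """Hypothetical stand-in for the proposed LogitScaler (not the RDT API)."""

    def __init__(self, min_value=0.0, max_value=1.0, missing_value_replacement="mean"):
        if not min_value < max_value:
            raise ValueError("min_value must be smaller than max_value.")
        self.min_value = min_value
        self.max_value = max_value
        self.missing_value_replacement = missing_value_replacement
        self._fill_value = None
        self._column_name = None

    @staticmethod
    def _is_missing(value):
        # Treat both None and NaN as missing.
        return value is None or (isinstance(value, float) and math.isnan(value))

    def _validate(self, values):
        # Raise a descriptive error for the first value outside [min_value, max_value].
        for value in values:
            if not self.min_value <= value <= self.max_value:
                raise ValueError(
                    f"Unable to apply the logit function to column "
                    f"'{self._column_name}' due to an out-of-range value ({value})."
                )

    def _logit(self, value):
        # Rescale to [0, 1], then apply log(s / (1 - s)).
        scaled = (value - self.min_value) / (self.max_value - self.min_value)
        if scaled == 0.0:
            return float("-inf")  # assumption: lower bound maps to -inf
        if scaled == 1.0:
            return float("inf")  # assumption: upper bound maps to +inf
        return math.log(scaled / (1.0 - scaled))

    def fit(self, values, column_name):
        # Learn the fill value for missing entries and check the logit is applicable.
        self._column_name = column_name
        present = [v for v in values if not self._is_missing(v)]
        self._validate(present)
        if self.missing_value_replacement == "mean":
            self._fill_value = sum(present) / len(present)
        else:
            self._fill_value = self.missing_value_replacement

    def transform(self, values):
        # Fill in missing values, re-validate, then apply the logit.
        filled = [self._fill_value if self._is_missing(v) else v for v in values]
        self._validate(filled)
        return [self._logit(v) for v in filled]
```

With min_value=0.0 and max_value=100.0, a value of 50.0 sits at the midpoint of the range and transforms to 0.0, while 101.0 triggers the out-of-range error described above.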
Intended usage: A user would call update_transformers to replace the FloatFormatter with the LogitScaler for any columns that would benefit from this.