Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add cardinality_rule parameter to AnonymizedFaker #762

Merged
merged 13 commits into from
Feb 27, 2024

Conversation

fealho
Copy link
Member

@fealho fealho commented Feb 22, 2024

CU-86az4dgdr, Resolve #756.

@sdv-team
Copy link
Contributor

@codecov-commenter
Copy link

codecov-commenter commented Feb 22, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (11bbe51) to head (da30c37).

❗ Current head da30c37 differs from pull request most recent head 87d14bf. Consider uploading reports for the commit 87d14bf to get more accurate results

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff            @@
##              main      #762   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           18        18           
  Lines         2031      2061   +30     
=========================================
+ Hits          2031      2061   +30     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@fealho fealho marked this pull request as ready for review February 22, 2024 15:08
@fealho fealho requested a review from a team as a code owner February 22, 2024 15:08
@fealho fealho requested review from gsheni, rwedge and R-Palazzo and removed request for a team and gsheni February 22, 2024 15:08
Copy link
Contributor

@R-Palazzo R-Palazzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Can we add an integration test for the case cardinality_rule=match to check that we generate the same cardinality or lower than inside the real data

@@ -474,7 +474,6 @@ def test_one_hot_categoricals():

# Run
transformed_data = transformer.fit_transform(test_data, column='A')

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we keep this blank line?

rdt/transformers/pii/anonymizer.py Outdated Show resolved Hide resolved
rdt/transformers/pii/anonymizer.py Outdated Show resolved Hide resolved
@fealho fealho requested a review from R-Palazzo February 22, 2024 21:34
Comment on lines +116 to +121
if enforce_uniqueness:
warnings.warn(
"The 'enforce_uniqueness' parameter is no longer supported. "
"Please use the 'cardinality_rule' parameter instead.",
FutureWarning
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After the warning, if self.cardinality_rule is None, we could set self.cardinality_rule to 'unique' to match the intent of enforce_uniqueness=True. If self.cardinality_rule isn't None then the user definitely set both parameters and we should probably ignore enforce_uniqueness

@fealho fealho changed the base branch from main to Updating-readme February 26, 2024 19:00
@fealho fealho changed the base branch from Updating-readme to main February 26, 2024 19:00
@fealho fealho merged commit 1ae11e8 into main Feb 27, 2024
45 checks passed
@fealho fealho deleted the issue-756-anonymize-faker branch February 27, 2024 17:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow AnonymizedFaker to learn cardinality from the real data
6 participants