Add infer_sdtypes and infer_keys parameters to detect_from_dataframes method #2363

Merged
merged 5 commits into from
Feb 11, 2025

@fealho fealho commented Feb 6, 2025

CU-86b3f92p5, Resolve #2341.

A few notes about the PR:

  • Setting infer_sdtypes=False means neither primary nor foreign keys are detected. Primary keys can't be detected because every sdtype is unknown, and foreign keys can't be detected because there are no primary keys to reference.
  • Setting infer_sdtypes=False also sets pii=True on all columns. This is the default behavior for columns with an unknown sdtype.
  • Support for these parameters is added only to the Metadata class methods, not to MultiTableMetadata or SingleTableMetadata.
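
To make the dependency chain in the notes above concrete, here is a minimal, self-contained sketch. It is not SDV's detection code: `detect_toy_metadata`, its `"_id"` heuristic, the `"categorical"` fallback, and the boolean form of `infer_keys` are all assumptions made for illustration. Only the parameter names `infer_sdtypes` and `infer_keys`, the `unknown` sdtype, and the `pii=True` default come from this PR.

```python
def detect_toy_metadata(tables, infer_sdtypes=True, infer_keys=True):
    """Toy stand-in for metadata detection (hypothetical, NOT SDV's implementation).

    `tables` maps a table name to a list of row dicts.
    """
    metadata = {"tables": {}, "relationships": []}

    for name, rows in tables.items():
        column_names = list(rows[0]) if rows else []
        columns = {}
        for col in column_names:
            if not infer_sdtypes:
                # No inference: every column is unknown and flagged as PII.
                columns[col] = {"sdtype": "unknown", "pii": True}
                continue
            values = [row[col] for row in rows]
            # Assumed heuristic: a unique column named "*_id" is an id column.
            looks_like_id = col.endswith("_id") and len(set(values)) == len(values)
            columns[col] = {"sdtype": "id" if looks_like_id else "categorical"}

        table_meta = {"columns": columns}
        if infer_keys:
            # Primary keys can only come from columns already typed as "id".
            id_columns = [c for c, m in columns.items() if m["sdtype"] == "id"]
            if id_columns:
                table_meta["primary_key"] = id_columns[0]
        metadata["tables"][name] = table_meta

    if infer_keys:
        # Foreign keys can only point at tables that have a primary key.
        for parent, parent_meta in metadata["tables"].items():
            key = parent_meta.get("primary_key")
            if key is None:
                continue
            for child, child_meta in metadata["tables"].items():
                if child != parent and key in child_meta["columns"]:
                    metadata["relationships"].append({
                        "parent_table_name": parent,
                        "child_table_name": child,
                        "parent_primary_key": key,
                        "child_foreign_key": key,
                    })
    return metadata


tables = {
    "users": [{"user_id": 1, "country": "BR"}, {"user_id": 2, "country": "US"}],
    "orders": [{"order_id": 10, "user_id": 1}, {"order_id": 11, "user_id": 1}],
}

full = detect_toy_metadata(tables)
no_sdtypes = detect_toy_metadata(tables, infer_sdtypes=False)
```

With inference on, the toy finds a primary key for each table and one relationship from `users` to `orders`. With `infer_sdtypes=False`, every column ends up as `unknown` with `pii=True`, so no primary keys are found and the relationship list stays empty, even though `infer_keys` was left on.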

@fealho fealho marked this pull request as ready for review February 7, 2025 18:02
@fealho fealho requested a review from a team as a code owner February 7, 2025 18:02
@fealho fealho requested review from gsheni and removed request for a team February 7, 2025 18:02
@fealho fealho requested review from lajohn4747 and frances-h and removed request for gsheni February 7, 2025 18:03

codecov bot commented Feb 7, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.58%. Comparing base (363d8bd) to head (563358d).
Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2363   +/-   ##
=======================================
  Coverage   98.58%   98.58%           
=======================================
  Files          59       59           
  Lines        6130     6144   +14     
=======================================
+ Hits         6043     6057   +14     
  Misses         87       87           
Flag Coverage Δ
integration 82.21% <84.44%> (-0.01%) ⬇️
unit 97.44% <100.00%> (+<0.01%) ⬆️


@fealho fealho force-pushed the issue-2341-detection branch from 4b6dba4 to 1209687 Compare February 10, 2025 16:42

@lajohn4747 lajohn4747 left a comment

Looks good, just had one comment.

@fealho fealho merged commit c1c5a19 into main Feb 11, 2025
41 checks passed
@fealho fealho deleted the issue-2341-detection branch February 11, 2025 17:43
Successfully merging this pull request may close these issues.

When detecting metadata from dataframes, allow me the option to turn on/off sdtype and relationship detection