-
Notifications
You must be signed in to change notification settings - Fork 326
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add primary key auto-detection for semantic sdtypes #1731
Conversation
Codecov ReportAttention:
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted files@@ Coverage Diff @@
## main #1731 +/- ##
==========================================
- Coverage 97.13% 97.11% -0.02%
==========================================
Files 48 48
Lines 4507 4512 +5
==========================================
+ Hits 4378 4382 +4
- Misses 129 130 +1 ☔ View full report in Codecov by Sentry. |
# When no primary key column was set, choose the first pii field | ||
if self.primary_key is None and first_pii_field: | ||
self.primary_key = first_pii_field |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to clarify, is the logic we want here to set the primary key to the first ID or PII column we find? Or is it to fall back on PII columns only if we don't find an ID column in the table?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue doesn't specify, so I implemented the second approach. I could change this if we have evidence that PII columns are just as likely as ID ones to be selected as primary keys (in real world datasets).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
CU-86ayxgv0h, Resolve #1724