seemingly incorrect "ValueError: Duplicate primary keys" error #527
Comments
This is being caused by a disconnect between dbt's definition of "unique" and data-diff's definition. Dbt's uniqueness test (note that it excludes nulls):
Data-diff (nulls are counted):
If "total" != "total_distinct" -> throw a duplicate PK error.
Options I see:
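To make the disconnect above concrete, here is a minimal sketch using an in-memory SQLite table. The table name `orders`, the column `id`, and both queries are assumptions for illustration: they approximate a null-excluding uniqueness test and a null-counting "total" vs "total_distinct" comparison, and are not copied from dbt or data-diff.

```python
import sqlite3

# Hypothetical table: two valid keys plus two rows with a NULL key.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER)")
conn.executemany(
    "INSERT INTO orders (id) VALUES (?)",
    [(1,), (2,), (None,), (None,)],
)

# Assumed shape of a dbt-style uniqueness test: NULL keys are filtered out
# before grouping, so they can never be reported as duplicates.
dbt_failures = conn.execute(
    """
    SELECT id, COUNT(*) AS n_records
    FROM orders
    WHERE id IS NOT NULL
    GROUP BY id
    HAVING COUNT(*) > 1
    """
).fetchall()

# Assumed shape of a null-counting check: COUNT(*) includes the NULL rows,
# while COUNT(DISTINCT id) skips them, so the two totals diverge.
total, total_distinct = conn.execute(
    "SELECT COUNT(*), COUNT(DISTINCT id) FROM orders"
).fetchone()

dbt_passes = dbt_failures == []
null_counting_check_passes = total == total_distinct

print("dbt-style test passes:", dbt_passes)
print("total:", total, "total_distinct:", total_distinct)
print("null-counting check passes:", null_counting_check_passes)
```

In this sketch the same table passes the null-excluding test and fails the null-counting comparison, which is the kind of mismatch that would surface as a duplicate-PK error on a model whose uniqueness test is green.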
I might vote for something like:
and include a count of records with null keys in the output
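One way to picture this suggestion is the sketch below. It is an illustration only: `summarize_keys` is a hypothetical helper, not part of data-diff, and treating a compound key as null when any of its components is `None` is an assumption made here for simplicity.

```python
from collections import Counter
from typing import Iterable, List, Optional, Tuple

# A key is a tuple so that single-column and compound keys share one shape.
Key = Tuple[Optional[object], ...]


def summarize_keys(keys: Iterable[Key]) -> Tuple[int, List[Key]]:
    """Return (number of rows with a null key, duplicated non-null keys)."""
    null_count = 0
    counts: Counter = Counter()
    for key in keys:
        if any(part is None for part in key):
            # Assumption: null keys are counted and reported, not treated as duplicates.
            null_count += 1
        else:
            counts[key] += 1
    duplicates = [key for key, n in counts.items() if n > 1]
    return null_count, duplicates


# Null keys only: nothing to fail on, but the count can be surfaced as a warning.
nulls, dupes = summarize_keys([(1,), (2,), (None,), (None,)])
print("null keys:", nulls, "duplicates:", dupes)

# Compound key with a genuine duplicate and one partially null row.
nulls2, dupes2 = summarize_keys([(1, "a"), (1, "a"), (1, None)])
print("null keys:", nulls2, "duplicates:", dupes2)
```

With this split, a caller could raise only when `duplicates` is non-empty and otherwise continue the diff while printing the null-key count.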
My only hesitation with that is modifying the existing behavior for non Probably fairly edge case, but I think we should add a
FWIW in the
@mariahjrogers That makes sense to me and I think I agree. I'd rather get some data diffed (with a warning about my nulls) than have the whole thing fail. @dlawin I agree about making this
Have a couple of users reporting this isn't working for compound keys
Describe the bug
This was reported in the dbt Slack: https://getdbt.slack.com/archives/C03D25A92UU/p1682373672200739
I'll request the user add additional info.
Make sure to include the following (minus sensitive information):
-d switch for extra debug information.
If possible, please paste these as text, and not a screenshot.
Describe the environment
Describe which OS you're using, which data-diff version, and any other information that might be relevant to this bug.