-
Notifications
You must be signed in to change notification settings - Fork 2.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(ingestion/snowflake): use user email-id in urn generation for top users stat #8513
feat(ingestion/snowflake): use user email-id in urn generation for top users stat #8513
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LGTM. Just some docs and code structure questions / requests
@@ -120,6 +120,11 @@ class SnowflakeV2Config( | |||
"upstreams_deny_pattern", "temporary_tables_pattern" | |||
) | |||
|
|||
email_as_user_identifier: bool = Field( | |||
default=False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's make this default True and add a breaking change in docs/how/updating-datahub.md
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -120,6 +120,11 @@ class SnowflakeV2Config( | |||
"upstreams_deny_pattern", "temporary_tables_pattern" | |||
) | |||
|
|||
email_as_user_identifier: bool = Field( | |||
default=False, | |||
description="When enabled user email id will be used in user urn generation for stat", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does stat
mean here?
description="When enabled user email id will be used in user urn generation for stat", | |
description="Format user urns as an email, if the snowflake user's email is set. If `email_domain` is provided, generates email addresses for snowflake users with unset emails, based on their username.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -247,7 +249,9 @@ def _map_top_sql_queries(self, top_sql_queries: Dict) -> List[str]: | |||
] | |||
) | |||
|
|||
def _map_user_counts(self, user_counts: Dict) -> List[DatasetUserUsageCounts]: | |||
def _map_user_counts( | |||
self, user_counts: Dict, email_as_user_identifier: bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we just use self.email_as_user_identifier
in here and not pass the parameter? Don't feel strongly here at all
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
user_urn = make_user_urn(self.get_user_identifier(user_name, user_email)) | ||
user_urn = make_user_urn( | ||
self.get_user_identifier( | ||
user_name, user_email, self.config.email_as_user_identifier |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add logic to generate an email from self.config.email_domain
if it's set, like we do in _map_user_counts
? Perhaps it'd be best to put that logic in get_user_identifier
instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's keep it as is. The current method uses the user_email to determine whether to skip the current metadata (i.e continue in for loop)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
…p users stat (#8513) Co-authored-by: MohdSiddiqueBagwan <[email protected]>
No description provided.