feat(ingest/snowflake): initialize schema resolver from datahub for l… #8903
…ineage-only ingestion
def _init_schema_resolver(self) -> SchemaResolver:
    if not self.config.include_technical_schema and self.config.parse_view_ddl:
        if self.ctx.graph:
            return self.ctx.graph.initialize_schema_resolver_from_datahub(
Instead of initialize_schema_resolver_from_datahub returning Tuple[SchemaResolver, set[urns]], can we change it so that it only returns SchemaResolver, and in turn SchemaResolver has a method to get the set of loaded urns?
Makes sense. What would we call this method in SchemaResolver? get_urns?
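A minimal sketch of the shape suggested in this thread, using a stand-in class rather than DataHub's actual SchemaResolver. The resolver remembers every urn it was given a schema for and exposes them through get_urns (the name floated above), so the initializer can return the resolver alone instead of a tuple. ToySchemaResolver, add_schema, and init_resolver_from_rows are hypothetical names used only for illustration.

```python
from typing import Dict, Iterable, Set, Tuple

SchemaInfo = Dict[str, str]  # column name -> column type


class ToySchemaResolver:
    """Hypothetical stand-in, not DataHub's SchemaResolver."""

    def __init__(self) -> None:
        self._schemas: Dict[str, SchemaInfo] = {}

    def add_schema(self, urn: str, schema: SchemaInfo) -> None:
        # Store the schema keyed by dataset urn.
        self._schemas[urn] = schema

    def get_urns(self) -> Set[str]:
        # Return a copy so callers cannot mutate internal state.
        return set(self._schemas)


def init_resolver_from_rows(
    rows: Iterable[Tuple[str, SchemaInfo]],
) -> ToySchemaResolver:
    # Assumed loader: 'rows' stands in for whatever the graph client returns.
    resolver = ToySchemaResolver()
    for urn, schema in rows:
        resolver.add_schema(urn, schema)
    return resolver
```

With this shape, a caller that previously unpacked a (resolver, urns) pair would call resolver.get_urns() wherever it needs the loaded set.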
@@ -1004,8 +1017,6 @@ def _process_view(
yield from self._process_tag(tag)

yield from self.gen_dataset_workunits(view, schema_name, db_name)
elif self.config.parse_view_ddl:
These changes confuse me a bit. Are we dropping parse_view_ddl, or is that functionality handled elsewhere?
These conditions were part of the schema extraction process: even if schema ingestion was disabled, having parse_view_ddl enabled made the code that fetches and generates schema metadata from Snowflake run, so that the schemas could be used for column-level lineage (cll) extraction from view definitions.
The actual view definition parsing logic is in snowflake_lineage_v2.py, and the only thing it requires is for view_definitions to be populated. That code is still intact.
ok that broadly makes sense
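To make the thread above concrete, here is a rough sketch of the decision it describes: where schemas for view parsing come from once Snowflake schema extraction no longer runs just because parse_view_ddl is set. This is an illustration under assumptions, not the PR's code. ToyConfig, SchemaSource, and pick_schema_source are hypothetical, and the final fallback branch is a guess because the diff shown here is truncated.

```python
from dataclasses import dataclass
from enum import Enum


class SchemaSource(Enum):
    SNOWFLAKE = "snowflake"  # schemas extracted during this run
    DATAHUB = "datahub"      # schemas loaded from previously ingested metadata
    NONE = "none"            # nothing to resolve against


@dataclass
class ToyConfig:
    """Hypothetical config holding only the two flags discussed above."""

    include_technical_schema: bool
    parse_view_ddl: bool


def pick_schema_source(config: ToyConfig, has_graph: bool) -> SchemaSource:
    if config.include_technical_schema:
        # Schema ingestion is on, so the run already has fresh schemas.
        return SchemaSource.SNOWFLAKE
    if config.parse_view_ddl and has_graph:
        # Lineage-only run: reuse schemas already stored in DataHub.
        return SchemaSource.DATAHUB
    # Assumption: the source does not show what happens in this case.
    return SchemaSource.NONE
```

For example, pick_schema_source(ToyConfig(False, True), has_graph=True) selects SchemaSource.DATAHUB, which corresponds to the branch visible in _init_schema_resolver.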
d28b157 to 04ea7cb
04ea7cb to 3eb8189