-
Notifications
You must be signed in to change notification settings - Fork 2.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(ingestion/powerbi): support advance sql construct in native query parsing #8251
feat(ingestion/powerbi): support advance sql construct in native query parsing #8251
Conversation
<br/> | ||
|
||
|
||
Use full-table-name in `from` clause. for example `operations_analytics.transformed_prod.v_unit_targets` in below native SQL query. The table `operations_analytics.transformed_prod.v_unit_targets` will be ingested as upstream table. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these docs are getting kinda unwieldy and hard to read - could you put high level stuff here + create a new section below that contains the mquery parsing details + code blocks and things
table: str = tokens[from_index].value.split("as")[0].strip() | ||
|
||
if len(table.split(".")) == 1: | ||
table = f"{default_schema}.{table}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this definitely won't create valid urns?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not creating URN. It is only returning the table name in the form of .. it is old code and existing test cases are working. The difference is; added default_schema
as an argument.
""" | ||
tables: List[str] = [] | ||
|
||
parser = LineageRunner(native_query) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we have logic around LineageRunner in a ton of places in our codebase
does it make sense to refactor those? I'm also working on some sql parsing stuff, so having a cleanly defined interface without all this copy-pasted code would make my life a lot easier
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, check this thread as well https://acryldata.slack.com/archives/C02ENS87APK/p1685715808871169?thread_ts=1685715193.359649&cid=C02ENS87APK
So Shall I use that interface in this connector initially and then have a separate PR for all of the connectors?
@@ -352,7 +362,7 @@ def resolve_to_data_platform_table_list(self) -> List[DataPlatformTable]: | |||
continue | |||
|
|||
table_full_name_creator: AbstractDataPlatformTableCreator = ( | |||
supported_resolver.get_table_full_name_creator()() | |||
supported_resolver.get_table_full_name_creator()(config=self.config) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what does this do here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need config enable_advance_lineage_sql_construct
in full-table-name creator to decide whether to go for advanced native query parsing.
Please check config.py: https://github.com/datahub-project/datahub/pull/8251/files#diff-f32896b0964b087e35e8a171a0b941c6aa0e53a5e5d42cb243b7fdd6e4eab3bcR384
It is duplicate of #8592 |
No description provided.