Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(ingestion/powerbi): support advance sql construct in native query parsing #8251

Closed

Conversation

siddiquebagwan
Copy link
Contributor

No description provided.

@github-actions github-actions bot added the ingestion PR or Issue related to the ingestion of metadata label Jun 16, 2023
<br/>


Use full-table-name in `from` clause. for example `operations_analytics.transformed_prod.v_unit_targets` in below native SQL query. The table `operations_analytics.transformed_prod.v_unit_targets` will be ingested as upstream table.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these docs are getting kinda unwieldy and hard to read - could you put high level stuff here + create a new section below that contains the mquery parsing details + code blocks and things

table: str = tokens[from_index].value.split("as")[0].strip()

if len(table.split(".")) == 1:
table = f"{default_schema}.{table}"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this definitely won't create valid urns?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not creating URN. It is only returning the table name in the form of .. it is old code and existing test cases are working. The difference is; added default_schema as an argument.

"""
tables: List[str] = []

parser = LineageRunner(native_query)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have logic around LineageRunner in a ton of places in our codebase

does it make sense to refactor those? I'm also working on some sql parsing stuff, so having a cleanly defined interface without all this copy-pasted code would make my life a lot easier

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, check this thread as well https://acryldata.slack.com/archives/C02ENS87APK/p1685715808871169?thread_ts=1685715193.359649&cid=C02ENS87APK

So Shall I use that interface in this connector initially and then have a separate PR for all of the connectors?

@@ -352,7 +362,7 @@ def resolve_to_data_platform_table_list(self) -> List[DataPlatformTable]:
continue

table_full_name_creator: AbstractDataPlatformTableCreator = (
supported_resolver.get_table_full_name_creator()()
supported_resolver.get_table_full_name_creator()(config=self.config)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does this do here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need config enable_advance_lineage_sql_construct in full-table-name creator to decide whether to go for advanced native query parsing.
Please check config.py: https://github.com/datahub-project/datahub/pull/8251/files#diff-f32896b0964b087e35e8a171a0b941c6aa0e53a5e5d42cb243b7fdd6e4eab3bcR384

@siddiquebagwan-gslab
Copy link
Contributor

It is duplicate of #8592

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ingestion PR or Issue related to the ingestion of metadata
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants