Issues with Redshift Lineage for many Queries #8917

Closed
gky249 opened this issue Sep 28, 2023 · 4 comments
Labels
bug (Bug report), stale

Comments


gky249 commented Sep 28, 2023

Hi, we are seeing issues with Redshift lineage. For many queries, this error is thrown:

Error was An Identifier is expected, got Token[value: (] instead..

Could this be related to dependencies on external tables?
The problematic tables depend on tables in the "stage_external" schema, e.g.:

Error parsing query create table "DATABASE"."SCHEMA"."TABLE" as ( with /* Starting points is .... from "DATABASE"."stage_external"."TABLE_X" ....
Error was An Identifier is expected, got Token[value: (] instead..

According to our Redshift admin, the user we were given has access to the stage_external schema, but no datasets are ingested from it. When I search for "stage_external" in global search, I see a couple of containers, but they are empty.
Is it that the tables in stage_external are external tables and therefore aren't ingested into DataHub?
The DataHub user queries SVV_TABLE_INFO, and external tables aren't available in that view. A separate view, SVV_EXTERNAL_TABLES, holds the information about external tables.
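
One way to check this hypothesis is to compare what each system view reports for the schema. The sketch below only builds the SQL text; the helper name `external_vs_local_queries` is made up for illustration, and actually running the queries (with whichever Redshift client you use) is left out. It assumes the `"schema"` column of SVV_TABLE_INFO and the `schemaname` / `tablename` columns of SVV_EXTERNAL_TABLES.

```python
def external_vs_local_queries(schema: str) -> dict[str, str]:
    """Build one query per system view for the given schema (hypothetical helper)."""
    # Double any single quotes so the schema name is safe inside a SQL literal.
    literal = schema.replace("'", "''")
    return {
        # Regular tables: "schema" must be quoted because it is a keyword.
        "local": (
            "SELECT COUNT(*) FROM svv_table_info "
            f"WHERE \"schema\" = '{literal}'"
        ),
        # External tables live in a separate view.
        "external": (
            "SELECT schemaname, tablename FROM svv_external_tables "
            f"WHERE schemaname = '{literal}'"
        ),
    }


queries = external_vs_local_queries("stage_external")
for name, sql in queries.items():
    print(f"-- {name}\n{sql}")
```

If the first query returns zero and the second returns rows, the schema contains only external tables, which would match the empty containers seen in the UI.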

For table lineage (parsing CREATE and INSERT queries), I see these two types of error message:

for getting lineage. Error was An Identifier is expected, got Token[value: (] instead..
for getting lineage. Error was An Identifier is expected, got Token[value: #] instead..

For view lineage (CREATE MATERIALIZED VIEW), the errors I see are:

Error was 'NoneType' object has no attribute 'raw_name'.
Error was An Identifier is expected, got Comparison[value: contracts_raw.contract_sk = events.contract_sk] instead..
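
To see how often each failure shape occurs across an ingestion log, the error lines can be bucketed with a small script. This is only a sketch: the regular expressions are written against the four messages quoted above, and the function name `classify` and the bucket labels are made up for illustration.

```python
import re
from collections import Counter

# Patterns for the two error shapes quoted in this report.
UNEXPECTED = re.compile(r"An Identifier is expected, got (\w+)\[value: (.*)\] instead")
NONE_ATTR = re.compile(r"'NoneType' object has no attribute '(\w+)'")


def classify(line: str) -> str:
    """Return a short bucket name for one log line, or 'other'."""
    m = UNEXPECTED.search(line)
    if m:
        kind, value = m.groups()
        # Keep single-character tokens verbatim; collapse longer values
        # (such as whole join conditions) into one shared bucket.
        return f"{kind}:{value}" if len(value) == 1 else f"{kind}:<expr>"
    m = NONE_ATTR.search(line)
    if m:
        return f"NoneType.{m.group(1)}"
    return "other"


sample = [
    "for getting lineage. Error was An Identifier is expected, got Token[value: (] instead..",
    "for getting lineage. Error was An Identifier is expected, got Token[value: #] instead..",
    "Error was 'NoneType' object has no attribute 'raw_name'.",
    "Error was An Identifier is expected, got Comparison[value: "
    "contracts_raw.contract_sk = events.contract_sk] instead..",
]

counts = Counter(classify(line) for line in sample)
for bucket, n in counts.most_common():
    print(n, bucket)
```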

I mentioned the stage_external schema earlier. I can now see in the logs that the plugin tries to ingest tables/views from the stage_external schema but doesn't find any:

Processing schema: lt_phoenix.stage_external
Process tables in schema lt_phoenix.stage_external
No tables in cache for lt_phoenix.stage_external, skipping
Process views in schema lt_phoenix.stage_external
No views in cache for lt_phoenix.stage_external, skipping
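
The same log lines can be scanned to list every schema for which both tables and views were skipped. A minimal sketch, assuming the log wording shown above; the function name `empty_schemas` is made up for illustration.

```python
import re

# Matches "No tables in cache for <schema>, skipping" and the views variant.
SKIPPED = re.compile(r"No (tables|views) in cache for (\S+), skipping")


def empty_schemas(log_lines):
    """Return schemas for which both tables and views were reported as skipped."""
    seen = {}
    for line in log_lines:
        m = SKIPPED.search(line)
        if m:
            kind, schema = m.groups()
            seen.setdefault(schema, set()).add(kind)
    return sorted(s for s, kinds in seen.items() if kinds == {"tables", "views"})


log = [
    "Processing schema: lt_phoenix.stage_external",
    "Process tables in schema lt_phoenix.stage_external",
    "No tables in cache for lt_phoenix.stage_external, skipping",
    "Process views in schema lt_phoenix.stage_external",
    "No views in cache for lt_phoenix.stage_external, skipping",
]

print(empty_schemas(log))
```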

DataHub version: v10.5
Follow the discussion here: https://datahubspace.slack.com/archives/C029A3M079U/p1687961778614119

gky249 added the bug (Bug report) label Sep 28, 2023
Contributor

treff7es commented Oct 2, 2023

@gky249 we are working on replacing the sqllineage SQL parser in Redshift as well (we have already replaced it in Snowflake and BigQuery); see #8921.
This change will also add column-level lineage support and will hopefully resolve your SQL parser issues too.


shifanm commented Oct 4, 2023

Hi @treff7es, this is great to know, thanks. Is there an ETA for this being completed for Redshift?


github-actions bot commented Nov 4, 2023

This issue is stale because it has been open for 30 days with no activity. If you believe this is still an issue on the latest DataHub release please leave a comment with the version that you tested it with. If this is a question/discussion please head to https://slack.datahubproject.io. For feature requests please use https://feature-requests.datahubproject.io

github-actions bot added the stale label Nov 4, 2023
Collaborator

hsheth2 commented Nov 6, 2023

#8921 was merged, so I'm going to go ahead and close this issue.

hsheth2 closed this as completed Nov 6, 2023