fix(ingest/dremio): update dremio sql query to retrieve queried datasets in sql jobs #11801
@@ -251,7 +251,7 @@ class DremioSQLQueries:
     SELECT
         *
     FROM
-        SYS.PROJECT.HISTORY.JOBS
+        sys.project.history.jobs
Is this case change necessary? Is it also required for the other SQL queries in this file?
@dremio-brock has been doing some testing and found that this query specifically throws a null pointer exception in Dremio unless the table name is changed to lowercase. This is a workaround for a Dremio Cloud bug.
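For illustration, the workaround could be recorded next to the query roughly as follows. This is a minimal sketch: the constant name `QUERY_ALL_JOBS_CLOUD` is hypothetical, and only the fragment visible in the diff is reproduced.

```python
# Hypothetical constant name; only the part of the query shown in the diff
# is reproduced here.
QUERY_ALL_JOBS_CLOUD = """
    SELECT
        *
    FROM
        sys.project.history.jobs
"""
# NOTE: the table name is deliberately lowercase. Per the review discussion,
# the uppercase form triggered a null pointer exception (Dremio Cloud bug).


def references_lowercase_table(query: str) -> bool:
    """Return True if the query uses the lowercase table name."""
    return "sys.project.history.jobs" in query
```

A check like `references_lowercase_table` could guard against someone re-uppercasing the identifier while the bug is still open.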
@@ -242,7 +242,7 @@ class DremioSQLQueries:
         SYS.JOBS_RECENT
     WHERE
         STATUS = 'COMPLETED'
-        AND LENGTH(queried_datasets)>0
+        AND ARRAY_SIZE(queried_datasets)>0
On Dremio Software, the datatype is a varchar, so LENGTH should be used.
The Dremio Software docs suggest queried_datasets is an array of varchar, [varchar]:
https://docs.dremio.com/current/reference/sql/system-tables/jobs_recent/#fields
Is the type in the docs incorrect?
For Dremio Software this is a varchar. The docs are incorrect: they state an array of strings, i.e. [varchar] rather than varchar. In Software the value comes through as a varchar (a string containing an array representation), whereas on Cloud it correctly comes through as an array of strings. There is a PR open to fix this here
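To make the difference concrete, the predicate could be chosen per edition along these lines. This is only a sketch: `DremioEdition` and `queried_datasets_filter` are hypothetical names, not taken from the PR. The two predicates are the ones shown in the diff.

```python
from enum import Enum


class DremioEdition(Enum):
    """Hypothetical enum distinguishing the two Dremio deployments."""

    CLOUD = "cloud"
    SOFTWARE = "software"


def queried_datasets_filter(edition: DremioEdition) -> str:
    """Return the predicate that keeps jobs with a non-empty queried_datasets."""
    if edition is DremioEdition.CLOUD:
        # Cloud: queried_datasets comes through as an array of strings.
        return "ARRAY_SIZE(queried_datasets)>0"
    # Software: queried_datasets comes through as a varchar holding an
    # array representation, so LENGTH applies.
    return "LENGTH(queried_datasets)>0"
```

For example, `queried_datasets_filter(DremioEdition.SOFTWARE)` yields the original `LENGTH(...)` predicate, while the Cloud variant yields the `ARRAY_SIZE(...)` one introduced in this change.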
Thanks for confirming. Can you please add a NOTE comment about this queried_datasets type inconsistency on the fix PR?