While discussing contributing this work into the Apache Airflow repo with @alexott, he gave the following feedback:

We need to talk about integrating your work with the JobsCreate operator, which is now being developed by @Sri Tikkireddy (PR: Add DatabricksJobsCreateOperator, apache/airflow#32221). From an analysis of that code, there is a lot of overlap with your work, but it also has some valuable things, like the use of data classes from the Databricks Python SDK.

As you mentioned, you're using the SDK from the Databricks CLI - it's already considered deprecated and has been replaced by the Databricks Python SDK. The new SDK has a big advantage over the old one, as it evolves together with the REST APIs.

If your code doesn't provide asynchronous execution, then use of either SDK could be the best way forward, or we can switch to using DatabricksHook functions.

In your code, instead of a JSON payload for tasks and a dedicated operator for notebooks, we can switch to the data classes from the new SDK - that will give us self-documenting capabilities and type safety.
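For illustration, the switch alexott describes, from a raw JSON payload to the SDK's data classes, might look roughly like the sketch below (a minimal example assuming the jobs data classes shipped with databricks-sdk; the job name, notebook path, and cluster id are placeholders):

```python
# Sketch only: contrasting a raw JSON payload with the typed data classes
# from the Databricks Python SDK (databricks-sdk). All names, paths, and
# cluster ids below are illustrative placeholders.
from databricks.sdk.service.jobs import JobSettings, NotebookTask, Task

# Raw JSON payload: field names are plain strings, so typos only fail at runtime.
job_json = {
    "name": "example-job",
    "tasks": [
        {
            "task_key": "run_notebook",
            "existing_cluster_id": "1234-567890-abcde123",
            "notebook_task": {"notebook_path": "/Shared/example"},
        }
    ],
}

# Equivalent definition with the SDK data classes: fields are discoverable
# in an IDE and checked by a type checker.
job_settings = JobSettings(
    name="example-job",
    tasks=[
        Task(
            task_key="run_notebook",
            existing_cluster_id="1234-567890-abcde123",
            notebook_task=NotebookTask(notebook_path="/Shared/example"),
        )
    ],
)

# The data classes serialize back to the REST API shape when needed.
payload = job_settings.as_dict()
```

With the data classes, the available fields are self-documenting and mistakes surface before the payload ever reaches the Jobs API.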
I've contacted @stikkireddy to support him in getting apache/airflow#32221 merged. We're making significant progress, and I expect it to be merged soon.
Given that the Databricks SDK interfaces are still changing (it hasn't had a stable 1.0 release yet), we agreed not to make it a dependency of the Airflow Databricks provider itself until it becomes stable.
On our migration task, I've been testing the DatabricksCreateJobsOperator, and I'm making changes to get DatabricksWorkflowTaskGroup working with this operator, based on @stikkireddy's branch.
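As a rough illustration of that migration work, wiring DatabricksCreateJobsOperator into a DAG might look like the sketch below (assuming the operator from apache-airflow-providers-databricks and its json payload parameter; the connection id and job fields are placeholders, and the DatabricksWorkflowTaskGroup wiring is omitted):

```python
# Sketch only: a minimal DAG that registers a Databricks job definition.
# Assumes apache-airflow-providers-databricks ships DatabricksCreateJobsOperator
# (added in apache/airflow#32221); connection id and job fields are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.providers.databricks.operators.databricks import DatabricksCreateJobsOperator

with DAG(
    dag_id="databricks_create_job_example",
    start_date=datetime(2023, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    create_job = DatabricksCreateJobsOperator(
        task_id="create_job",
        databricks_conn_id="databricks_default",
        # Jobs API 2.1 style payload; this could also be built from the SDK
        # data classes and serialized before being passed to the operator.
        json={
            "name": "example-job",
            "tasks": [
                {
                    "task_key": "run_notebook",
                    "existing_cluster_id": "1234-567890-abcde123",
                    "notebook_task": {"notebook_path": "/Shared/example"},
                }
            ],
        },
    )
```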