This repository has been archived by the owner on Sep 4, 2024. It is now read-only.

Address feedback to get this work into Apache Airflow #57

Open
tatiana opened this issue Oct 10, 2023 · 1 comment

Comments

@tatiana
Collaborator

tatiana commented Oct 10, 2023

While discussing contributing this work to the Apache Airflow repo with @alexott, he gave the following feedback:

  • We need to talk about integrating your work with the JobsCreate operator, which is now being developed by @Sri Tikkireddy (PR: Add DatabricksJobsCreateOperator apache/airflow#32221).

  • From analysis of your code, it has a lot of overlap with your work, but has some valuable things, like the use of Data Classes from the Databricks Python SDK.

  • As you mentioned, you're using the SDK from the Databricks CLI. It's already considered deprecated and has been replaced by the Databricks Python SDK, which has a big advantage over the old SDK: it evolves together with the REST APIs.

  • If your code doesn't provide asynchronous execution, then either using the SDK could be the best way forward, or we can switch to using DatabricksHook functions.

  • In your code, instead of a JSON payload for tasks and a dedicated operator for notebooks, we can switch to using data classes from the new SDK, which would give self-documenting capabilities and type safety.
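To make the last point concrete, here is a minimal sketch of the difference between a raw JSON-style payload and a data-class description of the same task. It uses only the standard library: `NotebookTaskSpec`, `TaskSpec`, and `to_payload` are hypothetical stand-ins written for illustration, not the classes shipped by the Databricks Python SDK, and only the field names follow the Jobs REST API.

```python
from dataclasses import asdict, dataclass
from typing import Dict, Optional


@dataclass
class NotebookTaskSpec:
    """Hypothetical stand-in for an SDK data class describing a notebook task."""

    notebook_path: str
    base_parameters: Optional[Dict[str, str]] = None


@dataclass
class TaskSpec:
    """Hypothetical stand-in for an SDK data class describing one job task."""

    task_key: str
    notebook_task: NotebookTaskSpec
    existing_cluster_id: Optional[str] = None


def to_payload(task: TaskSpec) -> dict:
    """Convert the data class to a JSON-ready dict, dropping unset (None) fields."""

    def prune(value):
        if isinstance(value, dict):
            return {k: prune(v) for k, v in value.items() if v is not None}
        return value

    return prune(asdict(task))


# Untyped approach: a typo in a key is only discovered when the API rejects it.
raw_payload = {
    "task_key": "ingest",
    "notebook_task": {"notebook_path": "/Shared/ingest"},
}

# Typed approach: field names are documented by the class and checked at construction.
typed_task = TaskSpec(
    task_key="ingest",
    notebook_task=NotebookTaskSpec(notebook_path="/Shared/ingest"),
)

typed_payload = to_payload(typed_task)
```

With the typed version, a misspelled field such as `notebok_path` raises a `TypeError` when the object is built, whereas the dict version would pass it through silently.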

@tatiana
Collaborator Author

tatiana commented Oct 12, 2023

I've contacted @stikkireddy to support him in getting apache/airflow#32221 merged. We're making significant progress, and I expect it to be merged soon.

Given that the Databricks SDK interfaces are still changing (it hasn't had a stable 1.0 release yet), we agreed not to make it a dependency of the Airflow Databricks provider itself until it becomes stable.

On our migration task, I've been testing the DatabricksCreateJobsOperator, and I'm making changes so that DatabricksWorkflowTaskGroup works with this operator. I'm basing the changes on @stikkireddy's branch.
