Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Always prepend bundle remote paths with /Workspace #1724

Merged
merged 16 commits into from
Oct 2, 2024

Conversation

andrewnester
Copy link
Contributor

@andrewnester andrewnester commented Aug 27, 2024

Changes

Due to platform changes, all libraries, notebooks and etc. paths used in Databricks must be started with either /Workspace or /Volumes prefix.

This PR makes sure that all bundle paths are correctly prefixed.

Note: this change is a breaking change if user previously configured and used /Workspace/Workspace folder in their workspace file system or having /Workspace/${workspace.root_path}... pattern configured anywhere in their bundle config

Fixes: #1751

AI:

  • Scan DABs config and error out on /Workspace/${workspace.root_path}... pattern usage

Tests

Added unit tests

Copy link
Contributor

@pietern pietern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to make the warning on composition with /Workspace part of this PR?

bundle/config/mutator/prepend_workspace_prefix.go Outdated Show resolved Hide resolved
bundle/config/mutator/prepend_workspace_prefix.go Outdated Show resolved Hide resolved
bundle/config/validate/no_workspace_prefix_used.go Outdated Show resolved Hide resolved
bundle/config/validate/no_workspace_prefix_used.go Outdated Show resolved Hide resolved
bundle/config/validate/no_workspace_prefix_used.go Outdated Show resolved Hide resolved
bundle/phases/initialize.go Outdated Show resolved Hide resolved
Copy link
Contributor

@pietern pietern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is possible for users to perform this prefix step in their code. For example, they pass along ${workspace.file_path} as a job parameter and then prefix it with /Workspace in their code when accessing a file.

We cannot intercept this at all so we have to call this out in the release notes.

bundle/config/validate/no_workspace_prefix_used.go Outdated Show resolved Hide resolved
bundle/config/mutator/rewrite_workspace_prefix.go Outdated Show resolved Hide resolved
bundle/config/mutator/rewrite_workspace_prefix.go Outdated Show resolved Hide resolved
docs/release_notes/workspace_prefix.md Outdated Show resolved Hide resolved
Copy link
Contributor

@pietern pietern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering, does TF gracefully deal with these paths or does it show up as a permanent diff? I recall seeing some service removing this prefix on the backend.

bundle/config/mutator/rewrite_workspace_prefix.go Outdated Show resolved Hide resolved
@andrewnester
Copy link
Contributor Author

@pietern verified how changing the paths from /Users to /Workspace/Users affects all the resources deployable by DABs. Out of everything it causes only permanent drift for ML experiments and Quality monitors (in name and assets_dir params correspondingly) but does not lead to recreation of these resources.

Also confirmed it's not possible to use paths in ML model names (A workspace registered model name must be a non-empty UTF-8 string and cannot contain forward slashes(/), periods(.), or colons(:))

newPath := strings.Replace(vv, path, replacePath, 1)
diags = append(diags, diag.Diagnostic{
Severity: diag.Warning,
Summary: fmt.Sprintf("substring %q found in %q. Please update this to %q. For more information, please refer to: https://docs.databricks.com/en/release-notes/dev-tools/bundles.html#workspace-paths", path, vv, newPath),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: the instructions + link should appear in the detail.

@andrewnester andrewnester added this pull request to the merge queue Oct 2, 2024
@andrewnester andrewnester removed this pull request from the merge queue due to a manual request Oct 2, 2024
@andrewnester andrewnester added this pull request to the merge queue Oct 2, 2024
Merged via the queue into main with commit a8cff48 Oct 2, 2024
5 checks passed
@andrewnester andrewnester deleted the fix/always-workspace-prefix branch October 2, 2024 15:43
andrewnester added a commit that referenced this pull request Oct 9, 2024
Important changes:

Workspace paths are automatically prefixed with `/Workspace`. In addition, all usage of path strings such as `/Workspace/${workspace.root_path}/...` in bundle configuration is automatically replaced with `${workspace.root_path}/...` and generates a warning as part of bundle validate.

If you have specified a custom `workspace.root_path`, `workspace.artifact_path`, or `workspace.file_path`, Databricks Asset Bundles automatically prefixes it with `/Workspace`, but if you use any of these as variables (for example, `my_config_path: /Workspace/${workspace.file_path}/config`), you need to update those entries to remove the /Workspace prefix to avoid the warning.

If you pass one of these as variables and prefix them in your code, you need to update your code to not do this.

This change is required because originally when the workspace file system was rooted at `/` and home directories were under `/Users`, to access workspace paths through the Databricks REST API you would use these paths directly. To access workspace paths from your code, you could use the `/Workspace` file path and home directories were also available under `/Workspace/Users`. To avoid this duality of workspace paths, as well as the ambiguity between workspace paths and Unity Catalog `/Volumes` paths, all workspace paths are prefixed with `/Workspace`.

Bundles:
 * Add an error if state files grow bigger than the export limit ([#1795](#1795)).
 * Always prepend bundle remote paths with /Workspace ([#1724](#1724)).
 * Add resource path field to bundle workspace configuration ([#1800](#1800)).
 * Add validation for files with a `.(resource-name).yml` extension ([#1780](#1780)).

Internal:
 * Remove deprecated or readonly fields from the bundle schema ([#1809](#1809)).

API Changes:
 * Changed `databricks git-credentials create` command . New request type is .
 * Changed `databricks git-credentials delete` command . New request type is .
 * Changed `databricks git-credentials delete` command to return .
 * Changed `databricks git-credentials get` command . New request type is .
 * Changed `databricks git-credentials get` command to return .
 * Changed `databricks git-credentials list` command to return .
 * Changed `databricks git-credentials update` command . New request type is .
 * Changed `databricks git-credentials update` command to return .
 * Changed `databricks repos create` command . New request type is .
 * Changed `databricks repos create` command to return .
 * Changed `databricks repos delete` command to return .
 * Changed `databricks repos get` command to return .
 * Changed `databricks repos update` command . New request type is .
 * Changed `databricks repos update` command to return .

OpenAPI commit 0c86ea6dbd9a730c24ff0d4e509603e476955ac5 (2024-10-02)
Dependency updates:
 * Upgrade TF provider to 1.53.0 ([#1815](#1815)).
 * Bump golang.org/x/term from 0.24.0 to 0.25.0 ([#1811](#1811)).
 * Bump golang.org/x/text from 0.18.0 to 0.19.0 ([#1812](#1812)).
 * Bump github.com/databricks/databricks-sdk-go from 0.47.0 to 0.48.0 ([#1810](#1810)).
github-merge-queue bot pushed a commit that referenced this pull request Oct 9, 2024
Notable changes for Databricks Asset Bundles:

Workspace paths are automatically prefixed with `/Workspace`. In
addition, all usage of path strings such as
`/Workspace/${workspace.root_path}/...` in bundle configuration is
automatically replaced with `${workspace.root_path}/...` and generates a
warning as part of bundle validate.

More details can be find here:
https://docs.databricks.com/en/release-notes/dev-tools/bundles.html#workspace-paths-will-be-automatically-prefixed

Bundles:
* Add an error if state files grow bigger than the export limit
([#1795](#1795)).
* Always prepend bundle remote paths with /Workspace
([#1724](#1724)).
* Add resource path field to bundle workspace configuration
([#1800](#1800)).
* Add validation for files with a `.(resource-name).yml` extension
([#1780](#1780)).

Internal:
* Remove deprecated or readonly fields from the bundle schema
([#1809](#1809)).

API Changes:
* Changed `databricks git-credentials create`, `databricks
git-credentials delete`, `databricks git-credentials get`, `databricks
git-credentials list`, `databricks git-credentials update` commands .
* Changed `databricks repos create`, `databricks repos delete`,
`databricks repos get`, `databricks repos update` command .

OpenAPI commit 0c86ea6dbd9a730c24ff0d4e509603e476955ac5 (2024-10-02)
Dependency updates:
* Upgrade TF provider to 1.53.0
([#1815](#1815)).
* Bump golang.org/x/term from 0.24.0 to 0.25.0
([#1811](#1811)).
* Bump golang.org/x/text from 0.18.0 to 0.19.0
([#1812](#1812)).
* Bump github.com/databricks/databricks-sdk-go from 0.47.0 to 0.48.0
([#1810](#1810)).

---------

Co-authored-by: Pieter Noordhuis <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Libraries requirements not correctly replaced by Workspace path
4 participants