Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Encourage the use of root_path in production to ensure single deployment #1712

Open
wants to merge 7 commits into
base: main
Choose a base branch
from
29 changes: 29 additions & 0 deletions bundle/config/mutator/cleanup_targets.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
package mutator

import (
"context"
"fmt"

"github.com/databricks/cli/bundle"
"github.com/databricks/cli/libs/diag"
)

type cleanupTargets struct {
name string
}

// CleanupTargets cleans up configuration properties before the configuration
// is reported by the 'bundle summary' command.
lennartkats-db marked this conversation as resolved.
Show resolved Hide resolved
func CleanupTargets() bundle.Mutator {
return &cleanupTargets{}
}

func (m *cleanupTargets) Name() string {
return fmt.Sprintf("Cleanup(%s)", m.name)
}

func (m *cleanupTargets) Apply(ctx context.Context, b *bundle.Bundle) diag.Diagnostics {
b.Config.Targets = nil
b.Config.Environments = nil
return nil
}
24 changes: 22 additions & 2 deletions bundle/config/mutator/process_target_mode.go
Original file line number Diff line number Diff line change
Expand Up @@ -130,8 +130,17 @@ func validateProductionMode(ctx context.Context, b *bundle.Bundle, isPrincipalUs
}
}

if !isPrincipalUsed && !isRunAsSet(r) {
return diag.Errorf("'run_as' must be set for all jobs when using 'mode: production'")
// We need to verify that there is only a single deployment of the current target.
// The best way to enforce this is to explicitly set root_path.
if !isExplicitRootSet(b) {
if isRunAsSet(r) || isPrincipalUsed {
// Just setting run_as is not enough to guarantee a single deployment,
// and neither is setting a principal.
// We only show a warning for these cases since we didn't historically
// report an error for them.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this is breaking IFF:

  1. You're a regular user
  2. You don't have run_as set (i.e. it runs as self)

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We previously showed an error under those conditions ('run_as' must be set). We now show an error about root_path.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think we should continue to comment setting run_as, even if root_path is configured?

If a user goes in and looks at defaults and configures it to ~/some/path, it'll still have multiple deployments if multiple people deploy it, even though the warning is silenced.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do think it's a good practice to set run_as despite its limitations and it seems okay to set it in the templates. Providing a warning when you only set run_as and don't set root_path seems like a sweet spot to me. Providing an error would be a breaking change and doesn't really seem warranted.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think we should de-emphasize run_as in general?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

run_as still seems helpful to me and it will become much more usable once we add support for it in pipelines.

return diag.Warningf("target with 'mode: production' should specify explicit 'workspace.root_path' to make sure only one copy is deployed")
}
return diag.Errorf("target with 'mode: production' must specify explicit 'workspace.root_path' to make sure only one copy is deployed")
lennartkats-db marked this conversation as resolved.
Show resolved Hide resolved
}
return nil
}
Expand All @@ -148,6 +157,17 @@ func isRunAsSet(r config.Resources) bool {
return true
}

func isExplicitRootSet(b *bundle.Bundle) bool {
lennartkats-db marked this conversation as resolved.
Show resolved Hide resolved
targetConfig, ok := b.Config.Targets[b.Config.Bundle.Target]
if !ok || targetConfig == nil {
return false
}
if targetConfig.Workspace == nil {
return false
}
return targetConfig.Workspace.RootPath != ""
}

func (m *processTargetMode) Apply(ctx context.Context, b *bundle.Bundle) diag.Diagnostics {
switch b.Config.Bundle.Mode {
case config.Development:
Expand Down
22 changes: 20 additions & 2 deletions bundle/config/mutator/process_target_mode_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,9 @@ func mockBundle(mode config.Mode) *bundle.Bundle {
Branch: "main",
},
},
Targets: map[string]*config.Target{
"": {},
},
lennartkats-db marked this conversation as resolved.
Show resolved Hide resolved
Workspace: config.Workspace{
CurrentUser: &config.User{
ShortName: "lennart",
Expand Down Expand Up @@ -277,14 +280,14 @@ func TestProcessTargetModeProduction(t *testing.T) {
b := mockBundle(config.Production)

diags := validateProductionMode(context.Background(), b, false)
require.ErrorContains(t, diags.Error(), "run_as")
require.ErrorContains(t, diags.Error(), "target with 'mode: production' must specify explicit 'workspace.root_path' to make sure only one copy is deployed")

b.Config.Workspace.StatePath = "/Shared/.bundle/x/y/state"
b.Config.Workspace.ArtifactPath = "/Shared/.bundle/x/y/artifacts"
b.Config.Workspace.FilePath = "/Shared/.bundle/x/y/files"

diags = validateProductionMode(context.Background(), b, false)
require.ErrorContains(t, diags.Error(), "production")
require.ErrorContains(t, diags.Error(), "target with 'mode: production' must specify explicit 'workspace.root_path' to make sure only one copy is deployed")

permissions := []resources.Permission{
{
Expand Down Expand Up @@ -326,6 +329,21 @@ func TestProcessTargetModeProductionOkForPrincipal(t *testing.T) {
require.NoError(t, diags.Error())
}

func TestProcessTargetModeProductionOkWithRootPath(t *testing.T) {
b := mockBundle(config.Production)

// Our target has all kinds of problems when not using service principals ...
diags := validateProductionMode(context.Background(), b, false)
require.Error(t, diags.Error())

// ... but we're okay if we specify a root path
b.Config.Targets[""].Workspace = &config.Workspace{
RootPath: "some-root-path",
}
diags = validateProductionMode(context.Background(), b, false)
require.NoError(t, diags.Error())
}

// Make sure that we have test coverage for all resource types
func TestAllResourcesMocked(t *testing.T) {
b := mockBundle(config.Development)
Expand Down
4 changes: 0 additions & 4 deletions bundle/config/mutator/select_target.go
Original file line number Diff line number Diff line change
Expand Up @@ -49,9 +49,5 @@ func (m *selectTarget) Apply(_ context.Context, b *bundle.Bundle) diag.Diagnosti
// TODO: remove when Environments section is not supported anymore.
b.Config.Bundle.Environment = b.Config.Bundle.Target

// Clear targets after loading.
b.Config.Targets = nil
b.Config.Environments = nil

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of keeping these around here, could you break out a field on the bundle struct where we can keep a snapshot of the selected target? Then you can interrogate it and there's no risk of other mutators changing it after selection. The targets in the configuration have no significance beyond this point.

E.g.

// Target stores a snapshot of the target configuration when it was selected.
Target *config.Target

Copy link
Contributor Author

@lennartkats-db lennartkats-db Sep 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it cleaner to just remove the side effect from select_target? Instead of just recording which target is selected, the mutator removes fields, which is a bit hard to discover and not really motivated in the code. It's a bit surprising if you want to build a new mutator that consumes this value. Based on your comments, the motivation is to clean things up in order for consumption by summary/validate; shouldn't we just make that a separate step?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I see as a risk is that keeping them around means another location that new mutators can go and look at, even though everything under targets no longer has any effect. Variable interpolation won't run either, so values under it shouldn't be used.

I see how this is most convenient though. @andrewnester What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just that over the past year I ran into this problem twice. You need to use a step through debugger to find where this property is secretly deleted. And the code that deletes it includes no rationale and is just meant to select the default target.

Copy link
Contributor Author

@lennartkats-db lennartkats-db Oct 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I don't want to leave this PR open just because we can't make a decision here. I added 6ea5306 which merges the cleanup behavior back into SelectTarget and adds a few comments about the behavior for maintainers.

return nil
}
3 changes: 2 additions & 1 deletion cmd/bundle/summary.go
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@ import (
"path/filepath"

"github.com/databricks/cli/bundle"
"github.com/databricks/cli/bundle/config/mutator"
"github.com/databricks/cli/bundle/deploy/terraform"
"github.com/databricks/cli/bundle/phases"
"github.com/databricks/cli/cmd/bundle/utils"
Expand Down Expand Up @@ -60,7 +61,7 @@ func newSummaryCommand() *cobra.Command {
}
}

diags = bundle.Apply(ctx, b, terraform.Load())
diags = bundle.Apply(ctx, b, bundle.Seq(terraform.Load(), mutator.CleanupTargets()))
if err := diags.Error(); err != nil {
return err
}
Expand Down
2 changes: 2 additions & 0 deletions cmd/bundle/validate.go
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@ import (
"fmt"

"github.com/databricks/cli/bundle"
"github.com/databricks/cli/bundle/config/mutator"
"github.com/databricks/cli/bundle/config/validate"
"github.com/databricks/cli/bundle/phases"
"github.com/databricks/cli/bundle/render"
Expand Down Expand Up @@ -65,6 +66,7 @@ func newValidateCommand() *cobra.Command {

return nil
case flags.OutputJSON:
bundle.Apply(ctx, b, mutator.CleanupTargets())
return renderJsonOutput(cmd, b, diags)
default:
return fmt.Errorf("unknown output type %s", root.OutputType(cmd))
Expand Down