[ISSUE] Issue with databricks_notebook, databricks_secret_scope, databricks_cluster, etc resources #4113

Open
kjanos0502 opened this issue Oct 15, 2024 · 5 comments

Comments

@kjanos0502

Deployed configuration

locals {
  resource_group_name     = "test-rg"
  resource_group_location = "westeurope"
}

provider "databricks" {
  host                        = azurerm_databricks_workspace.dataanalytics.workspace_url
  azure_workspace_resource_id = azurerm_databricks_workspace.dataanalytics.id
}

resource "azurerm_databricks_workspace" "dataanalytics" {
  name                          = "test-dataanalytics-dbws"
  resource_group_name           = local.resource_group_name
  location                      = local.resource_group_location
  sku                           = "standard"
  managed_resource_group_name   = "${local.resource_group_name}-databricks"
  public_network_access_enabled = true

  custom_parameters {
    no_public_ip             = true
    storage_account_sku_name = "Standard_LRS"
  }
}

resource "databricks_notebook" "da_test" {
  path     = "/Shared/Test"
  language = "PYTHON"
  source   = "${path.module}/../databricks-notebooks/Test.py"

  depends_on = [azurerm_databricks_workspace.dataanalytics]
}

Updated configuration - to be deployed

locals {
  resource_group_name     = "test-rg"
  resource_group_location = "westeurope"
  vnet_name               = "test-rg-vnet"
}

provider "databricks" {
  host                        = azurerm_databricks_workspace.dataanalytics.workspace_url
  azure_workspace_resource_id = azurerm_databricks_workspace.dataanalytics.id
}

data "azurerm_virtual_network" "vnet" {
  name                = local.vnet_name
  resource_group_name = local.resource_group_name
}

resource "azurerm_subnet" "databricks_public" {
  name                 = "${local.resource_group_name}-da-databricks-public-subnet"
  resource_group_name  = local.resource_group_name
  virtual_network_name = local.vnet_name
  address_prefixes     = [cidrsubnet(cidrsubnet("10.1.0.0/16", 8, 14), 2, 0)]

  delegation {
    name = "databricks"
    service_delegation {
      name = "Microsoft.Databricks/workspaces"
      actions = [
        "Microsoft.Network/virtualNetworks/subnets/join/action",
        "Microsoft.Network/virtualNetworks/subnets/prepareNetworkPolicies/action",
        "Microsoft.Network/virtualNetworks/subnets/unprepareNetworkPolicies/action"
      ]
    }
  }
}

resource "azurerm_subnet" "databricks_private" {
  name                 = "${local.resource_group_name}-da-databricks-private-subnet"
  resource_group_name  = local.resource_group_name
  virtual_network_name = local.vnet_name
  address_prefixes     = [cidrsubnet(cidrsubnet("10.1.0.0/16", 8, 14), 2, 1)]

  delegation {
    name = "databricks"
    service_delegation {
      name = "Microsoft.Databricks/workspaces"
      actions = [
        "Microsoft.Network/virtualNetworks/subnets/join/action",
        "Microsoft.Network/virtualNetworks/subnets/prepareNetworkPolicies/action",
        "Microsoft.Network/virtualNetworks/subnets/unprepareNetworkPolicies/action"
      ]
    }
  }
}

resource "azurerm_network_security_group" "databricks_nsg" {
  name                = "${local.resource_group_name}-da-databricks-nsg"
  location            = local.resource_group_location
  resource_group_name = local.resource_group_name
}

resource "azurerm_subnet_network_security_group_association" "databricks_public" {
  subnet_id                 = azurerm_subnet.databricks_public.id
  network_security_group_id = azurerm_network_security_group.databricks_nsg.id
}

resource "azurerm_subnet_network_security_group_association" "databricks_private" {
  subnet_id                 = azurerm_subnet.databricks_private.id
  network_security_group_id = azurerm_network_security_group.databricks_nsg.id
}

resource "azurerm_databricks_workspace" "dataanalytics" {
  name                          = "test-dataanalytics-dbws"
  resource_group_name           = local.resource_group_name
  location                      = local.resource_group_location
  sku                           = "standard"
  managed_resource_group_name   = "${local.resource_group_name}-databricks"
  public_network_access_enabled = true

  custom_parameters {
    no_public_ip                                         = true
    storage_account_sku_name                             = "Standard_LRS"
    virtual_network_id                                   = data.azurerm_virtual_network.vnet.id
    private_subnet_name                                  = azurerm_subnet.databricks_private.name
    public_subnet_name                                   = azurerm_subnet.databricks_public.name
    public_subnet_network_security_group_association_id  = azurerm_subnet_network_security_group_association.databricks_public.id
    private_subnet_network_security_group_association_id = azurerm_subnet_network_security_group_association.databricks_private.id
  }
}

resource "databricks_notebook" "da_test" {
  path     = "/Shared/Test"
  language = "PYTHON"
  source   = "${path.module}/../databricks-notebooks/Test.py"

  depends_on = [azurerm_databricks_workspace.dataanalytics]
}

Expected Behavior

Terraform recognizes that the workspace needs to be recreated because vnet integration is being turned on.
Plan generation completes without errors.

Actual Behavior

terraform plan fails with an authentication error while refreshing the state of the notebooks, clusters and secret scopes.
It seems as if the provider tries to access these resources differently once it realizes that the Databricks workspace is going to be switched to vnet integration.

Error: cannot read notebook: failed during request visitor: default auth: cannot configure default credentials, please check https://docs.databricks.com/en/dev-tools/auth.html#databricks-client-unified-authentication to configure credentials for your preferred authentication method. Config: azure_client_secret=, azure_client_id=, azure_tenant_id=omitted now. Env: ARM_CLIENT_SECRET, ARM_CLIENT_ID, ARM_TENANT_ID

Steps to Reproduce

  1. Deploy initial configuration
  2. Update code to contain vnet-integration
  3. terraform plan

Terraform and provider versions

Terraform: v1.5.5 (windows_amd64)
Databricks provider: v1.53.0

Is it a regression?

We first tried v1.49.0 and saw the same behavior.

Workaround

If all the affected resources in the configuration (notebooks, clusters, etc.) are removed from the state before running terraform plan, everything works fine.

@alexott
Contributor

alexott commented Oct 17, 2024

@alexott
Contributor

alexott commented Oct 17, 2024

You can't update an existing workspace without recreating it...

@kjanos0502
Author

I'm not sure how the first item in the troubleshooting guide relates to this: we're not using a data resource for the workspace, it is defined in the same module. We have also added depends_on to the notebook.

Recreating the workspace would be absolutely fine, but the plan step already fails.

Please elaborate if I'm misunderstanding the point you're trying to make.

@alexott
Contributor

alexott commented Oct 20, 2024

It's more about the behavior of Terraform itself: depends_on works only on create/delete. In your case the change is an update that recreates the workspace resource, and as a result the host that refers to the workspace URL is empty, because the workspace hasn't been recreated yet.

If you're recreating a workspace anyway, why not do terraform destroy followed by terraform apply?
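
One layout that can avoid the empty host at plan time is to keep the workspace and the workspace-level resources in separate root configurations, so that the databricks provider is always configured from a workspace that already exists. The following is only a sketch under that assumption and is not something proposed in this thread. The two-configuration split and the data source lookup are illustrative, and the names are copied from the configuration in the report.

```hcl
# Second root configuration (hypothetical layout): applied only after the
# configuration that owns azurerm_databricks_workspace.dataanalytics.

provider "azurerm" {
  features {}
}

# Look up the already-created workspace instead of referencing a resource
# that may be pending replacement in the same plan.
data "azurerm_databricks_workspace" "dataanalytics" {
  name                = "test-dataanalytics-dbws"
  resource_group_name = "test-rg"
}

provider "databricks" {
  host                        = data.azurerm_databricks_workspace.dataanalytics.workspace_url
  azure_workspace_resource_id = data.azurerm_databricks_workspace.dataanalytics.id
}

resource "databricks_notebook" "da_test" {
  path     = "/Shared/Test"
  language = "PYTHON"
  source   = "${path.module}/../databricks-notebooks/Test.py"
}
```

With such a split, recreating the workspace in the first configuration would still remove its notebooks, so the second configuration would presumably have to be applied again afterwards.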

@kjanos0502
Author

I wouldn't have expected to need an additional terraform destroy. I would expect that, because of the vnet integration change, everything is recreated during the apply, just like for other Terraform resources, and that the plan does not fail. But if I understand correctly, this is not so easy in this case because of the azurerm/databricks boundary between the workspace and the notebooks, etc.
