add databricks-workspace module (#529)
jayengee authored Oct 31, 2023
1 parent e527df1 commit 67c02c7
Showing 8 changed files with 545 additions and 0 deletions.
64 changes: 64 additions & 0 deletions databricks-workspace-e2/README.md
@@ -0,0 +1,64 @@
## References
* The [Databricks Terraform provider documentation](https://databrickslabs.github.io/terraform-provider-databricks/overview/).

<!-- START -->
## Requirements

| Name | Version |
|------|---------|
| <a name="requirement_terraform"></a> [terraform](#requirement\_terraform) | >= 0.13 |

## Providers

| Name | Version |
|------|---------|
| <a name="provider_aws"></a> [aws](#provider\_aws) | n/a |
| <a name="provider_databricks"></a> [databricks](#provider\_databricks) | n/a |

## Modules

| Name | Source | Version |
|------|--------|---------|
| <a name="module_databricks_bucket"></a> [databricks\_bucket](#module\_databricks\_bucket) | github.com/chanzuckerberg/cztack//aws-s3-private-bucket | v0.60.1 |

## Resources

| Name | Type |
|------|------|
| [aws_iam_role.databricks](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_role) | resource |
| [aws_iam_role_policy.policy](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_role_policy) | resource |
| [aws_security_group.databricks](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/security_group) | resource |
| [databricks_mws_credentials.databricks](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_credentials) | resource |
| [databricks_mws_networks.networking](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_networks) | resource |
| [databricks_mws_storage_configurations.databricks](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_storage_configurations) | resource |
| [databricks_mws_workspaces.databricks](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_workspaces) | resource |
| [aws_caller_identity.current](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/caller_identity) | data source |
| [aws_iam_policy_document.databricks-s3](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/iam_policy_document) | data source |
| [aws_iam_policy_document.databricks-setup-assume-role](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/iam_policy_document) | data source |
| [aws_iam_policy_document.policy](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/iam_policy_document) | data source |
| [aws_region.current](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/region) | data source |

## Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| <a name="input_audit_log_bucket_name"></a> [audit\_log\_bucket\_name](#input\_audit\_log\_bucket\_name) | Name of the bucket that receives cluster logs; audit logs are written there as well. | `string` | `"czi-audit-logs"` | no |
| <a name="input_databricks_external_id"></a> [databricks\_external\_id](#input\_databricks\_external\_id) | The ID of a Databricks root account. | `string` | n/a | yes |
| <a name="input_env"></a> [env](#input\_env) | The environment or stage, e.g. staging, dev, or prod. | `string` | n/a | yes |
| <a name="input_object_ownership"></a> [object\_ownership](#input\_object\_ownership) | Set default owner of all objects within bucket (e.g., bucket vs. object owner) | `string` | `null` | no |
| <a name="input_owner"></a> [owner](#input\_owner) | n/a | `string` | n/a | yes |
| <a name="input_passable_role_arn"></a> [passable\_role\_arn](#input\_passable\_role\_arn) | ARN of an additional role that the cross-account role is allowed to pass (`iam:PassRole`). Leave empty to skip. | `string` | `""` | no |
| <a name="input_private_subnets"></a> [private\_subnets](#input\_private\_subnets) | List of private subnets. | `list(string)` | n/a | yes |
| <a name="input_project"></a> [project](#input\_project) | A high level name, typically the name of the site. | `string` | n/a | yes |
| <a name="input_service"></a> [service](#input\_service) | The service name, e.g. databricks-workspace. | `string` | n/a | yes |
| <a name="input_vpc_id"></a> [vpc\_id](#input\_vpc\_id) | ID of the VPC. | `string` | n/a | yes |
| <a name="input_workspace_name_override"></a> [workspace\_name\_override](#input\_workspace\_name\_override) | Override the workspace name. If not set, the workspace name will be set to the project, env, and service. | `string` | `null` | no |

## Outputs

| Name | Description |
|------|-------------|
| <a name="output_role_arn"></a> [role\_arn](#output\_role\_arn) | ARN of the AWS IAM role. |
| <a name="output_workspace_id"></a> [workspace\_id](#output\_workspace\_id) | ID of the workspace. |
| <a name="output_workspace_url"></a> [workspace\_url](#output\_workspace\_url) | URL of the deployed workspace. |
<!-- END -->
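
A minimal sketch of how a caller might consume this module, based only on the Inputs and Outputs tables above. The module label, the `<release-tag>` ref, and every value are hypothetical placeholders, and the sketch assumes the `aws` and `databricks` providers are already configured by the caller.

```hcl
# Hypothetical caller configuration; every value below is a placeholder.
module "databricks_workspace" {
  # Assumption: pin the ref to whichever cztack release contains this module.
  source = "github.com/chanzuckerberg/cztack//databricks-workspace-e2?ref=<release-tag>"

  # Required inputs, per the Inputs table.
  project                = "my-project"
  env                    = "dev"
  service                = "databricks-workspace"
  owner                  = "data-platform@example.com"
  databricks_external_id = "00000000-0000-0000-0000-000000000000"
  vpc_id                 = "vpc-0123456789abcdef0"
  private_subnets        = ["subnet-0123456789abcdef0", "subnet-0fedcba9876543210"]

  # Optional inputs (audit_log_bucket_name, object_ownership, passable_role_arn,
  # workspace_name_override) are omitted here and fall back to their defaults.
}

# Surface the workspace URL documented in the Outputs table.
output "databricks_workspace_url" {
  value = module.databricks_workspace.workspace_url
}
```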
282 changes: 282 additions & 0 deletions databricks-workspace-e2/aws_iam_role.tf
@@ -0,0 +1,282 @@
locals {
  cluster_log_bucket_prefix = "databricks-cluster-logs"
}

data "aws_iam_policy_document" "databricks-setup-assume-role" {
  statement {
    principals {
      type        = "AWS"
      identifiers = ["arn:aws:iam::${local.databricks_aws_account}:root"]
    }

    actions = ["sts:AssumeRole"]
    condition {
      test     = "StringLike"
      variable = "sts:ExternalId"
      values   = [var.databricks_external_id]
    }
  }
}

resource "aws_iam_role" "databricks" {
  name               = local.name
  assume_role_policy = data.aws_iam_policy_document.databricks-setup-assume-role.json
  tags               = local.tags
}

data "aws_iam_policy_document" "policy" {
  statement {
    sid = "NonResourceBasedPermissions"
    actions = [
      "ec2:CancelSpotInstanceRequests",
      "ec2:DescribeAvailabilityZones",
      "ec2:DescribeIamInstanceProfileAssociations",
      "ec2:DescribeInstanceStatus",
      "ec2:DescribeInstances",
      "ec2:DescribeInternetGateways",
      "ec2:DescribeNatGateways",
      "ec2:DescribeNetworkAcls",
      "ec2:DescribePlacementGroups",
      "ec2:DescribePrefixLists",
      "ec2:DescribeReservedInstancesOfferings",
      "ec2:DescribeRouteTables",
      "ec2:DescribeSecurityGroups",
      "ec2:DescribeSpotInstanceRequests",
      "ec2:DescribeSpotPriceHistory",
      "ec2:DescribeSubnets",
      "ec2:DescribeVolumes",
      "ec2:DescribeVpcAttribute",
      "ec2:DescribeVpcs",
      "ec2:CreatePlacementGroup",
      "ec2:DeletePlacementGroup",
      "ec2:CreateKeyPair",
      "ec2:DeleteKeyPair",
      "ec2:CreateTags",
      "ec2:DeleteTags",
      "ec2:RequestSpotInstances",
    ]
    resources = ["*"]
    effect    = "Allow"
  }

  statement {
    effect    = "Allow"
    actions   = ["iam:PassRole"]
    resources = ["arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/databricks/*"]
  }

  dynamic "statement" {
    for_each = length(var.passable_role_arn) > 0 ? [1] : []

    content {
      actions = [
        "iam:PassRole"
      ]
      resources = [
        var.passable_role_arn
      ]
    }
  }

  statement {
    sid = "InstancePoolsSupport"
    actions = [
      "ec2:AssociateIamInstanceProfile",
      "ec2:DisassociateIamInstanceProfile",
      "ec2:ReplaceIamInstanceProfileAssociation",
    ]

    resources = ["${local.ec2_arn_base}:instance/*"]

    condition {
      test     = "StringEquals"
      variable = "ec2:ResourceTag/Vendor"
      values   = ["Databricks"]
    }
  }

  statement {
    sid = "AllowEc2RunInstancePerTag"
    actions = [
      "ec2:RunInstances",
    ]

    resources = [
      "${local.ec2_arn_base}:instance/*",
      "${local.ec2_arn_base}:volume/*",
    ]

    condition {
      test     = "StringEquals"
      variable = "aws:RequestTag/Vendor"
      values   = ["Databricks"]
    }
  }

  statement {
    sid = "AllowEc2RunInstanceImagePerTag"
    actions = [
      "ec2:RunInstances",
    ]

    resources = [
      "${local.ec2_arn_base}:image/*",
    ]

    condition {
      test     = "StringEquals"
      variable = "aws:ResourceTag/Vendor"
      values   = ["Databricks"]
    }
  }

  statement {
    sid = "AllowEc2RunInstancePerVPCid"
    actions = [
      "ec2:RunInstances",
    ]

    resources = [
      "${local.ec2_arn_base}:network-interface/*",
      "${local.ec2_arn_base}:subnet/*",
      "${local.ec2_arn_base}:security-group/*",
    ]

    condition {
      test     = "StringEquals"
      variable = "ec2:vpc"
      values   = ["${local.ec2_arn_base}:vpc/${var.vpc_id}"]
    }
  }

  statement {
    sid = "AllowEc2RunInstanceOtherResources"
    actions = [
      "ec2:RunInstances",
    ]

    not_resources = [
      "${local.ec2_arn_base}:image/*",
      "${local.ec2_arn_base}:network-interface/*",
      "${local.ec2_arn_base}:subnet/*",
      "${local.ec2_arn_base}:security-group/*",
      "${local.ec2_arn_base}:volume/*",
      "${local.ec2_arn_base}:instance/*"
    ]
  }

  statement {
    sid = "EC2TerminateInstancesTag"
    actions = [
      "ec2:TerminateInstances",
    ]

    resources = [
      "${local.ec2_arn_base}:instance/*",
    ]

    condition {
      test     = "StringEquals"
      variable = "ec2:ResourceTag/Vendor"
      values   = ["Databricks"]
    }
  }

  statement {
    sid = "EC2AttachDetachVolumeTag"
    actions = [
      "ec2:AttachVolume",
      "ec2:DetachVolume",
    ]

    resources = [
      "${local.ec2_arn_base}:instance/*",
      "${local.ec2_arn_base}:volume/*",
    ]

    condition {
      test     = "StringEquals"
      variable = "ec2:ResourceTag/Vendor"
      values   = ["Databricks"]
    }
  }

  statement {
    sid = "EC2CreateVolumeByTag"
    actions = [
      "ec2:CreateVolume",
    ]

    resources = [
      "${local.ec2_arn_base}:volume/*",
    ]

    condition {
      test     = "StringEquals"
      variable = "aws:RequestTag/Vendor"
      values   = ["Databricks"]
    }
  }

  statement {
    sid = "EC2DeleteVolumeByTag"
    actions = [
      "ec2:DeleteVolume",
    ]

    resources = [
      "${local.ec2_arn_base}:volume/*",
    ]

    condition {
      test     = "StringEquals"
      variable = "ec2:ResourceTag/Vendor"
      values   = ["Databricks"]
    }
  }

  statement {
    actions = [
      "iam:CreateServiceLinkedRole",
      "iam:PutRolePolicy",
    ]

    resources = [
      "arn:aws:iam::*:role/aws-service-role/spot.amazonaws.com/AWSServiceRoleForEC2Spot",
    ]

    condition {
      test     = "StringLike"
      variable = "iam:AWSServiceName"
      values   = ["spot.amazonaws.com"]
    }

    effect = "Allow"
  }

  statement {
    sid = "VpcNonresourceSpecificActions"
    actions = [
      "ec2:AuthorizeSecurityGroupEgress",
      "ec2:AuthorizeSecurityGroupIngress",
      "ec2:RevokeSecurityGroupEgress",
      "ec2:RevokeSecurityGroupIngress",
    ]

    resources = [
      "${local.ec2_arn_base}:security-group/${aws_security_group.databricks.id}",
    ]

    condition {
      test     = "StringEquals"
      variable = "ec2:vpc"
      values   = ["${local.ec2_arn_base}:vpc/${var.vpc_id}"]
    }
  }
}

resource "aws_iam_role_policy" "policy" {
  name = "extras"
  role = aws_iam_role.databricks.id
  policy = data.aws_iam_policy_document.policy.json
}
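
The policy above references `local.name`, `local.tags`, `local.ec2_arn_base`, and `local.databricks_aws_account`, none of which are defined in the files shown here. The following is only a sketch of what such locals could look like, inferred from the README's data sources and from the description of `workspace_name_override`. The account ID is a placeholder and the tag keys are hypothetical; the commit's actual definitions may differ.

```hcl
# Sketch only: these definitions are assumptions, not the commit's actual code.
locals {
  # Falls back to project-env-service when no override is supplied,
  # as the README describes for workspace_name_override.
  name = coalesce(var.workspace_name_override, "${var.project}-${var.env}-${var.service}")

  # Hypothetical tag set built from the module's standard inputs.
  tags = {
    project = var.project
    env     = var.env
    service = var.service
    owner   = var.owner
  }

  # Placeholder: substitute the AWS account ID that Databricks documents
  # for the control plane that assumes the cross-account role.
  databricks_aws_account = "000000000000"

  # Common prefix for EC2 ARNs in the caller's region and account.
  ec2_arn_base = "arn:aws:ec2:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}"
}
```

Scoping the EC2 statements with a shared prefix like `ec2_arn_base` keeps every resource ARN pinned to one region and account instead of relying on wildcards.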
33 changes: 33 additions & 0 deletions databricks-workspace-e2/bucket.tf
@@ -0,0 +1,33 @@
data "aws_iam_policy_document" "databricks-s3" {
  statement {
    sid    = "grant databricks access"
    effect = "Allow"
    principals {
      type        = "AWS"
      identifiers = ["arn:aws:iam::${local.databricks_aws_account}:root"]
    }
    actions = [
      "s3:GetObject",
      "s3:GetObjectVersion",
      "s3:PutObject",
      "s3:DeleteObject",
      "s3:ListBucket",
      "s3:GetBucketLocation",
    ]
    resources = [
      "arn:aws:s3:::${local.name}/*",
      "arn:aws:s3:::${local.name}",
    ]
  }
}

module "databricks_bucket" {
  source           = "github.com/chanzuckerberg/cztack//aws-s3-private-bucket?ref=v0.60.1"
  bucket_name      = local.name
  bucket_policy    = data.aws_iam_policy_document.databricks-s3.json
  project          = var.project
  env              = var.env
  service          = var.service
  owner            = var.owner
  object_ownership = var.object_ownership
}
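
The README lists `databricks_mws_credentials.databricks` and `databricks_mws_storage_configurations.databricks` among the module's resources, but their definitions are not part of the files shown here. As a rough sketch, the IAM role and the bucket above might be registered with the Databricks account along these lines. The argument values are assumptions: in particular, using `var.databricks_external_id` as the account ID is inferred from that input's description, and reusing `local.name` for the display names is a guess.

```hcl
# Sketch only: the commit's actual resource definitions may differ.
resource "databricks_mws_credentials" "databricks" {
  account_id       = var.databricks_external_id # assumption: the Databricks account ID
  credentials_name = local.name                 # assumption: reuse the shared name
  role_arn         = aws_iam_role.databricks.arn
}

resource "databricks_mws_storage_configurations" "databricks" {
  account_id                 = var.databricks_external_id # assumption, as above
  storage_configuration_name = local.name                 # assumption
  bucket_name                = local.name                 # same name passed to the bucket module
}
```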