Skip to content

Commit

Permalink
feat: ok and alert actions can be configured
Browse files Browse the repository at this point in the history
  • Loading branch information
stefanfreitagrwe authored and stefanfreitag committed Jul 22, 2024
1 parent 0b5018d commit 7760f26
Show file tree
Hide file tree
Showing 5 changed files with 48 additions and 7 deletions.
4 changes: 2 additions & 2 deletions .devcontainer/devcontainer.json
Original file line number Diff line number Diff line change
Expand Up @@ -12,8 +12,8 @@
"version": "latest"
},
"ghcr.io/devcontainers/features/terraform:1": {
"version": "1.6.2",
"tflint": "0.48.0",
"version": "1.9.0",
"tflint": "0.51.1",
"installTFsec": "true",
"installTerraformDocs": "true"
},
Expand Down
23 changes: 23 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,26 @@

This module deploys a Lambda function that checks the health of MSK cluster and sends a notification if a cluster is unhealthy.

If the target for `ok_actions`, `alarm_actions` or `insufficient_data_actions` is an SNS topic using a KMS key, ensure
that CloudWatch Alarms has sufficient permissions to publish messages.
For example:
```shell
statement {
sid = "Allow access for CloudWatch Alarms"
effect = "Allow"
principals {
type = "Service"
identifiers = ["cloudwatch.amazonaws.com"]
}
actions = [
"kms:Decrypt",
"kms:GenerateDataKey"
]
resources = ["*"]

}
```

<!-- BEGIN_TF_DOCS -->
## Requirements

Expand Down Expand Up @@ -49,15 +69,18 @@ No modules.

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| <a name="input_alarm_actions"></a> [alarm\_actions](#input\_alarm\_actions) | The list of actions to execute when this alarm transitions into an ALARM state from any other state. Each action is specified as an Amazon Resource Name (ARN). Default is `null`. | `list(string)` | `null` | no |
| <a name="input_cloudwatch_alarms_treat_missing_data"></a> [cloudwatch\_alarms\_treat\_missing\_data](#input\_cloudwatch\_alarms\_treat\_missing\_data) | Sets how the alarms handle missing data points. The following values are supported: `missing`, `ignore`, `breaching` and `notBreaching`. Default is `breaching`. | `string` | `"breaching"` | no |
| <a name="input_cluster_arns"></a> [cluster\_arns](#input\_cluster\_arns) | List of MSK cluster ARNs. Default is `[]`. | `list(string)` | `[]` | no |
| <a name="input_email"></a> [email](#input\_email) | List of e-mail addresses subscribing to the SNS topic. Default is `[]`. | `list(string)` | `[]` | no |
| <a name="input_enable_cloudwatch_alarms"></a> [enable\_cloudwatch\_alarms](#input\_enable\_cloudwatch\_alarms) | Setup CloudWatch alarms for the MSK clusters state. For each state a separate alarm will be created. Default is `false`. | `bool` | `false` | no |
| <a name="input_enable_sns_notifications"></a> [enable\_sns\_notifications](#input\_enable\_sns\_notifications) | Setup SNS notifications for the MSK clusters state. Default is `false`. | `bool` | `false` | no |
| <a name="input_ignore_states"></a> [ignore\_states](#input\_ignore\_states) | Suppress warnings for the listed MSK states. Default: ['MAINTENANCE'] | `list(string)` | <pre>[<br> "MAINTENANCE"<br>]</pre> | no |
| <a name="input_insufficient_data_actions"></a> [insufficient\_data\_actions](#input\_insufficient\_data\_actions) | The list of actions to execute when this alarm transitions into an INSUFFICIENT\_DATA state from any other state. Each action is specified as an Amazon Resource Name (ARN). Default is `null`. | `list(string)` | `null` | no |
| <a name="input_log_retion_period_in_days"></a> [log\_retion\_period\_in\_days](#input\_log\_retion\_period\_in\_days) | Number of days logs will be retained. Default is `365`. | `number` | `365` | no |
| <a name="input_memory_size"></a> [memory\_size](#input\_memory\_size) | Amount of memory in MByte that the Lambda function can use at runtime. Default is `160`. | `number` | `160` | no |
| <a name="input_name"></a> [name](#input\_name) | Name of the health monitor. Default is `msk_status_monitor`. | `string` | `"msk_status_monitor"` | no |
| <a name="input_ok_actions"></a> [ok\_actions](#input\_ok\_actions) | The list of actions to execute when this alarm transitions into an OK state from any other state. Each action is specified as an Amazon Resource Name (ARN). | `list(string)` | `null` | no |
| <a name="input_schedule_expression"></a> [schedule\_expression](#input\_schedule\_expression) | The schedule expression for the CloudWatch event rule. Default is `rate(5 minutes)`. | `string` | `"rate(5 minutes)"` | no |
| <a name="input_tags"></a> [tags](#input\_tags) | A map of tags to add to all resources. Default is `{}`. | `map(string)` | `{}` | no |

Expand Down
4 changes: 2 additions & 2 deletions examples/01_default_configuration/versions.tf
Original file line number Diff line number Diff line change
@@ -1,9 +1,9 @@
terraform {
required_version = "~>1.8"
required_version = "~>1.9"
required_providers {
aws = {
source = "hashicorp/aws"
version = "~>5.32"
version = "~>5.59"
}
}
}
6 changes: 3 additions & 3 deletions main.tf
Original file line number Diff line number Diff line change
Expand Up @@ -178,9 +178,9 @@ resource "aws_cloudwatch_metric_alarm" "this" {
statistic = "Average"
threshold = 0
treat_missing_data = var.cloudwatch_alarms_treat_missing_data
alarm_actions = []
insufficient_data_actions = []
# TODO: ok_actions = [var.sns_topic_alarms_arn]
ok_actions = var.ok_actions
alarm_actions = var.alarm_actions
insufficient_data_actions = var.insufficient_data_actions
dimensions = {
ClusterName = each.key
}
Expand Down
18 changes: 18 additions & 0 deletions variables.tf
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,24 @@ variable "cloudwatch_alarms_treat_missing_data" {
}
}

variable "alarm_actions" {
description = "The list of actions to execute when this alarm transitions into an ALARM state from any other state. Each action is specified as an Amazon Resource Name (ARN). Default is `null`."
type = list(string)
default = null
}

variable "insufficient_data_actions" {
description = "The list of actions to execute when this alarm transitions into an INSUFFICIENT_DATA state from any other state. Each action is specified as an Amazon Resource Name (ARN). Default is `null`."
type = list(string)
default = null
}

variable "ok_actions" {
description = "The list of actions to execute when this alarm transitions into an OK state from any other state. Each action is specified as an Amazon Resource Name (ARN)."
type = list(string)
default = null
}

variable "enable_sns_notifications" {
description = "Setup SNS notifications for the MSK clusters state. Default is `false`."
type = bool
Expand Down

0 comments on commit 7760f26

Please sign in to comment.