Add ADRs for terraform decisions (#3096)
Add one ADR to explain why we use Terraform and another one to explain
the main infrastructure changes that we are making
TheOneFromNorway authored Feb 27, 2025
2 parents 9785b01 + 6e586a2 commit e4e7123
Showing 4 changed files with 85 additions and 0 deletions.
44 changes: 44 additions & 0 deletions adr/00008-infrastructure-management.md
@@ -0,0 +1,44 @@
# 8. Infrastructure management tooling

Date: 2025-02-26

## Status

Accepted

## Context

So far, the infrastructure for Mavis has been managed with AWS Copilot. While it worked well for the initial phase
of the project, AWS Copilot also has some drawbacks:

#### Opinionated Defaults

AWS Copilot imposes a certain service architecture and default values.
While it's possible to customize the generated configuration to some extent by overriding the CloudFormation resources,
this is cumbersome, and AWS Copilot provides no benefit in that case.

#### Integration of NHS Terraform modules

NHSDigital provides Terraform modules for common infrastructure components. In particular, there is a cloud backup
module (https://github.com/NHSDigital/terraform-aws-backup/) which, according to the Red Lines document, must be used for backups.
It's not possible to integrate a Terraform module with AWS Copilot.

#### Uncertain future of AWS Copilot

Although there is currently no official announcement, AWS Copilot no longer appears to be maintained.
The last release was in June 2024, 8 months ago. Before that, releases occurred roughly monthly. According to
https://github.com/aws/copilot-cli/issues/5987, an official end-of-support announcement was published and then removed again.

For this reason, we would want to replace AWS Copilot as the infrastructure management tool in any case.

## Decision

We will use Terraform to manage our infrastructure. This is based on the following:

- Terraform is a logical choice as it allows us to use the NHS Terraform modules easily.
- Terraform is widely used and has a large community.
- The future of AWS Copilot is uncertain.

## Consequences

A proof of concept with Terraform has already been created and accepted. Each environment must now be migrated to Terraform.
41 changes: 41 additions & 0 deletions adr/00009-cloud-architecture.md
@@ -0,0 +1,41 @@
# 9. Cloud architecture

Date: 2025-02-26

## Status

Accepted

## Context

Terraform allows much greater flexibility in defining the cloud architecture than AWS Copilot.
This is the architecture as it was created by AWS Copilot:

![AWS Copilot Architecture](architecture_copilot.png)

- The ECS tasks are running in public subnets, which is not ideal from a security perspective
  - Each of the tasks has a public IP address assigned, allowing tasks to initiate outgoing connections to the internet
  - This means there are multiple routes of traffic out of the service
  - This is a security risk as it exposes another potential point of attack
- Security group rules for the ECS service are designed to limit incoming traffic only
  - Egress is unrestricted
  - Ingress only allows incoming connections from the load balancer

## Decision

We will implement the following architecture with Terraform:

![Terraform Architecture](architecture_terraform.png)

## Consequences

Overall, the architecture remains similar to the existing setup, with core components such as the load balancer,
database, and VPC persisting. The main differences between the two architectures are:

1. Tasks are moved into the private subnets to restrict the ability to communicate with the containers directly.
   1. As a result, there is no longer any public IP associated with the containers.
   2. We can be more restrictive with the allowed communication flows.
2. To allow outgoing connections to NHS services such as PDS, CIS2, and Splunk, a NAT gateway is introduced.
   1. This will also make it seamless to introduce further safeguards like network firewalls in the future.
3. More restrictive ingress/egress rules are implemented, explicitly allowing only pair-wise communication between services.
   1. E.g. no "allow all" ingress or egress rules are used for internal communication.
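
The three changes above could be sketched in Terraform roughly as follows. This is a minimal illustration and not the project's actual configuration: every resource name, the port numbers in `locals`, and the referenced VPC, subnets, route table, ECS cluster, task definition, and load balancer and database security groups are assumptions taken to be defined elsewhere.

```hcl
# Illustrative sketch only. All names and ports below are hypothetical.
locals {
  app_port = 4000 # assumed container port
  db_port  = 5432 # assumed database port
}

# NAT gateway in a public subnet: the single outbound path for private tasks.
resource "aws_eip" "nat" {
  domain = "vpc"
}

resource "aws_nat_gateway" "this" {
  allocation_id = aws_eip.nat.id
  subnet_id     = aws_subnet.public_a.id
}

resource "aws_route" "private_default" {
  route_table_id         = aws_route_table.private.id
  destination_cidr_block = "0.0.0.0/0"
  nat_gateway_id         = aws_nat_gateway.this.id
}

# Security group for the tasks, with no inline "allow all" rules.
resource "aws_security_group" "app" {
  name   = "app-tasks"
  vpc_id = aws_vpc.main.id
}

# Pair-wise ingress: only the load balancer may reach the tasks.
resource "aws_vpc_security_group_ingress_rule" "app_from_lb" {
  security_group_id            = aws_security_group.app.id
  referenced_security_group_id = aws_security_group.lb.id
  ip_protocol                  = "tcp"
  from_port                    = local.app_port
  to_port                      = local.app_port
}

# Pair-wise egress: tasks may reach the database.
resource "aws_vpc_security_group_egress_rule" "app_to_db" {
  security_group_id            = aws_security_group.app.id
  referenced_security_group_id = aws_security_group.db.id
  ip_protocol                  = "tcp"
  from_port                    = local.db_port
  to_port                      = local.db_port
}

# Outbound HTTPS to external services, routed via the NAT gateway.
resource "aws_vpc_security_group_egress_rule" "app_https_out" {
  security_group_id = aws_security_group.app.id
  cidr_ipv4         = "0.0.0.0/0"
  ip_protocol       = "tcp"
  from_port         = 443
  to_port           = 443
}

# Tasks run in private subnets and receive no public IP.
resource "aws_ecs_service" "app" {
  name            = "app"
  cluster         = aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.app.arn
  desired_count   = 2
  launch_type     = "FARGATE"

  network_configuration {
    subnets          = [aws_subnet.private_a.id, aws_subnet.private_b.id]
    security_groups  = [aws_security_group.app.id]
    assign_public_ip = false
  }
}
```

Using separate rule resources that reference another security group, rather than CIDR ranges, keeps each internal flow tied to a specific pair of services. The external HTTPS rule is the one place where a CIDR is used, since those destinations sit outside the VPC.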
Binary file added adr/architecture_copilot.png
Binary file added adr/architecture_terraform.png