Add ADRs for terraform decisions (#3096)
Add one ADR to explain why we use terraform and another one to explain the main infrastructure changes that we are doing
Showing 4 changed files with 85 additions and 0 deletions.
@@ -0,0 +1,44 @@
# 8. Infrastructure management tooling

Date: 2025-02-26

## Status

Accepted

## Context

So far, the infrastructure for Mavis has been managed with AWS Copilot. While it worked well for the initial phase
of the project, AWS Copilot also has some drawbacks:

#### Opinionated Defaults

AWS Copilot imposes a certain service architecture and default values.
While it's possible to customize the generated configuration to some extent by overriding the CloudFormation resources,
this is cumbersome, and AWS Copilot offers no benefit for such customizations.

#### Integration of NHS Terraform modules

NHSDigital provides Terraform modules for common infrastructure components. In particular, there is a cloud backup
module (https://github.com/NHSDigital/terraform-aws-backup/) which, according to the Red Lines document, must be used for backups.
It's not possible to integrate a Terraform module with AWS Copilot.

#### Uncertain future of AWS Copilot

Although there is currently no official announcement, AWS Copilot no longer appears to be maintained.
The last release was in June 2024, 8 months ago at the time of writing. Before that, releases occurred roughly monthly. According to
https://github.com/aws/copilot-cli/issues/5987, an official end-of-support announcement had already been published and was then removed again.

For this reason, we would want to replace AWS Copilot as the infrastructure management tool in any case.

## Decision

We will use Terraform to manage our infrastructure. This decision is based on the following:

- Terraform is a logical choice, as it makes it easy to use the NHS Terraform modules.
- Terraform is widely used and has a large community.
- The future of AWS Copilot is uncertain.

## Consequences

A proof of concept with Terraform has already been created and accepted. Each environment must now be migrated to Terraform.
@@ -0,0 +1,41 @@
# 9. Cloud architecture

Date: 2025-02-26

## Status

Accepted

## Context

Terraform allows much greater flexibility in defining the cloud architecture than AWS Copilot does.
This is the architecture as it was created by AWS Copilot:

![Current Architecture](../images/copilot-aws-architecture.png)

- The ECS tasks run in public subnets, which is not ideal from a security perspective
- Each task has a public IP address assigned, allowing tasks to initiate outgoing connections to the internet
  - This means there are multiple routes for traffic out of the service
  - This is a security risk, as it exposes another potential point of attack
- Security group rules for the ECS service are designed to limit incoming traffic only
  - Egress is unrestricted
  - Ingress only allows incoming connections from the load balancer
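
The security group behaviour described in the list above can be sketched as a small Python model. This is only an illustration, not the project's configuration: the `Rule` class, the group name `load-balancer-sg`, and port 4000 are assumptions introduced for the example and do not come from the ADR or from any AWS API.

```python
from dataclasses import dataclass
from typing import Optional

ANYWHERE = "0.0.0.0/0"  # CIDR block matching every IPv4 address


@dataclass(frozen=True)
class Rule:
    """One security group rule (illustrative model, not an AWS API type)."""
    direction: str        # "ingress" or "egress"
    peer: str             # CIDR block or name of another security group
    port: Optional[int]   # None means all ports


# Hypothetical rule set mirroring the description above:
# ingress only from the load balancer, egress open to everything.
copilot_task_rules = [
    Rule("ingress", "load-balancer-sg", 4000),  # port 4000 is an assumption
    Rule("egress", ANYWHERE, None),
]


def unrestricted_egress(rules: list) -> list:
    """Return the egress rules that allow any destination on any port."""
    return [
        r for r in rules
        if r.direction == "egress" and r.peer == ANYWHERE and r.port is None
    ]


def ingress_peers(rules: list) -> list:
    """Return the sorted, de-duplicated peers that may open connections."""
    return sorted({r.peer for r in rules if r.direction == "ingress"})
```

Calling `unrestricted_egress(copilot_task_rules)` returns the single open egress rule, while `ingress_peers(copilot_task_rules)` lists only the load balancer group, which matches the asymmetry noted above.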

## Decision

We will implement the following architecture with Terraform:

![New Architecture](../images/terraform-aws-architecture.png)

## Consequences

Overall the architecture remains similar to the existing setup, with core components such as the load balancer,
database, and VPC persisting. The main differences between the two architectures are:

1. Tasks are moved into the private subnets to restrict the ability to communicate with the containers directly.
   1. As a result, there is no longer any public IP associated with the containers.
   2. We can be more restrictive with the allowed communication flows.
2. A NAT gateway is introduced to allow outgoing connections to NHS services such as PDS, CIS2, and Splunk.
   1. This also makes it straightforward to introduce further safeguards, such as network firewalls, in the future.
3. More restrictive ingress/egress rules are implemented, explicitly allowing only pair-wise communication between services.
   1. For example, no "allow all" ingress or egress rules are used for internal communication.
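
The idea of pair-wise rules in item 3 can be sketched as an explicit allow-list in Python. This is a hedged illustration rather than the actual Terraform configuration: the component labels (`load-balancer`, `ecs-tasks`, `database`, `internet-via-nat`) and all port numbers are assumptions made up for the example.

```python
from typing import NamedTuple


class Flow(NamedTuple):
    """A single permitted connection: who may talk to whom, on which port."""
    source: str
    destination: str
    port: int


# Hypothetical allow-list. Labels and ports are assumptions, not project values.
# "internet-via-nat" stands for external endpoints reached through the NAT gateway.
ALLOWED_FLOWS = frozenset({
    Flow("load-balancer", "ecs-tasks", 4000),
    Flow("ecs-tasks", "database", 5432),
    Flow("ecs-tasks", "internet-via-nat", 443),
})


def is_allowed(flow: Flow) -> bool:
    """A flow is permitted only if that exact pair and port is listed."""
    return flow in ALLOWED_FLOWS


def egress_rules_for(component: str) -> list:
    """Return the flows a component may initiate, sorted for stable output."""
    return sorted(f for f in ALLOWED_FLOWS if f.source == component)
```

With this model, a direct connection from the load balancer to the database is rejected because no such pair is listed, which is the effect that avoiding "allow all" rules is meant to achieve.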