Commit: first commit
zewenwu committed May 24, 2024
1 parent b0a1938 commit 145aa86

Showing 132 changed files with 5,357 additions and 0 deletions.
117 changes: 117 additions & 0 deletions .github/workflows/cicd-lambda-code.yml
@@ -0,0 +1,117 @@
name: CI/CD Lambda Code

on: [push]

jobs:
  ci-lambda-code:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v3
      - name: Add conda to system path
        run: |
          # $CONDA is an environment variable pointing to the root of the miniconda directory
          echo $CONDA/bin >> $GITHUB_PATH
      - name: Set up linting environment via Conda
        run: |
          cd src/application
          make setup-lint
      - name: Lint with Black/Flake8/isort
        run: |
          cd src/application
          make lint

  cd-lambda-code:
    needs: ci-lambda-code
    runs-on: ubuntu-latest
    # if: github.ref == 'refs/heads/main'

    steps:
      - uses: actions/checkout@v4
      - name: Get changed files
        id: changed-files
        uses: tj-actions/changed-files@v44
        # To compare changes between the current commit and the last pushed
        # remote commit, set `since_last_remote_commit: true`, e.g.
        # with:
        #   since_last_remote_commit: true
      - name: List all changed files
        env:
          ALL_CHANGED_FILES: ${{ steps.changed-files.outputs.all_changed_files }}
        run: |
          for file in ${ALL_CHANGED_FILES}; do
            echo "$file was changed"
          done
      - name: Configure AWS Credentials
        run: |
          aws configure set aws_access_key_id ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws configure set aws_secret_access_key ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws configure set aws_session_token ${{ secrets.AWS_SESSION_TOKEN }}
          aws configure set default.region ${{ secrets.AWS_REGION }}
      - name: Deploy all Lambda code in lambda/ subfolders
        env:
          ALL_CHANGED_FILES: ${{ steps.changed-files.outputs.all_changed_files }}
          DEPLOY_CONFIG_FILE_NAME: deploy-config.yml
        run: |
          for dir in src/application/lambda/*/
          do
            echo "======================="
            ### Check for the deployment config file and for changes in the Lambda directory
            dir=${dir%*/}
            # Check if there is a deployment config file in the current directory
            if [ ! -f "$dir/$DEPLOY_CONFIG_FILE_NAME" ]; then
              echo "UPDATE FOLDER $dir: No $DEPLOY_CONFIG_FILE_NAME found. Skipping..."
              continue
            fi
            # Check if the 'enabled' key is true
            DEPLOY_ENABLED=$(yq e '.cd-deploy.enabled' "$dir/$DEPLOY_CONFIG_FILE_NAME")
            if [ "$DEPLOY_ENABLED" != "true" ]; then
              echo "UPDATE FOLDER $dir: 'enabled' key in $DEPLOY_CONFIG_FILE_NAME is not true. Skipping..."
              continue
            fi
            # Check if the 'always-deploy' key is true
            ALWAYS_DEPLOY=$(yq e '.cd-deploy.always-deploy' "$dir/$DEPLOY_CONFIG_FILE_NAME")
            if [ "$ALWAYS_DEPLOY" == "true" ]; then
              echo "UPDATE FOLDER $dir: 'always-deploy' key in $DEPLOY_CONFIG_FILE_NAME is true."
            else
              # Check if there are changes in the current Lambda directory
              changes_detected=false
              for file in ${ALL_CHANGED_FILES[@]}; do
                if [[ "$file" == "$dir"* ]]; then
                  changes_detected=true
                  break
                fi
              done
              if [ "$changes_detected" = false ]; then
                echo "UPDATE FOLDER $dir: No file changes detected. Skipping..."
                continue
              else
                echo "UPDATE FOLDER $dir: Changes detected."
              fi
            fi
            ### Docker build and push to ECR
            cd $dir
            ecr_repo_name=$(yq e '.cd-deploy.ecr-repo-name' "./$DEPLOY_CONFIG_FILE_NAME")
            lambda_name=$(yq e '.cd-deploy.lambda-name' "./$DEPLOY_CONFIG_FILE_NAME")
            echo "Build Lambda image and push to ECR repository for Lambda: $lambda_name..."
            # Build the Docker image
            docker build --platform linux/amd64 -t lambda:test . --no-cache
            # Get the ECR repository URL
            ECR_URL=$(aws ecr describe-repositories --repository-names $ecr_repo_name --query 'repositories[0].repositoryUri' --output text)
            # Authenticate and push to the ECR repository
            aws ecr get-login-password --region ${{ secrets.AWS_REGION }} | docker login --username AWS --password-stdin $ECR_URL
            docker tag lambda:test $ECR_URL:latest
            docker push $ECR_URL:latest
            ### Update the Lambda function with the new Docker image
            aws lambda update-function-code --function-name $lambda_name --image-uri $ECR_URL:latest
            ### Return to the root repository directory
            cd ../../../..
          done
44 changes: 44 additions & 0 deletions .gitignore
@@ -0,0 +1,44 @@
# Local .terraform directories
**/.terraform/*

# .tfstate files
*.tfstate
*.tfstate.*

# Crash log files
crash.log
crash.*.log

# Exclude all .tfvars files, which are likely to contain sensitive data, such as
# passwords, private keys, and other secrets. These should not be part of version
# control as they are data points which are potentially sensitive and subject
# to change depending on the environment.
*.tfvars
*.tfvars.json

# Ignore override files as they are usually used to override resources locally and so
# are not checked in
override.tf
override.tf.json
*_override.tf
*_override.tf.json

# Include override files you do wish to add to version control using negated pattern
# !example_override.tf

# Include tfplan files to ignore the plan output of command: terraform plan -out=tfplan
# example: *tfplan*

# Ignore CLI configuration files
.terraformrc
terraform.rc
.terraform.lock.hcl

# Mac
.DS_Store

# Data
# data/

# Ignore pycache
__pycache__/
148 changes: 148 additions & 0 deletions README.md
@@ -0,0 +1,148 @@
# OpenAQ DEMO: AWS Data Streaming Platform for Live Air Quality Measurements in Belgium

## Introduction

<img src="./img/openaq-demo.png" alt="alt text" width="800"/>

This repository contains a live streaming DEMO solution architecture and Python data pipelines to ingest, process, and visualize streaming data from the OpenAQ API. The solution architecture is built on AWS and uses the AWS services displayed in the diagram above.

<img src="./img/results-map-index.png" alt="alt text" width="600"/>
<img src="./img/results-data-index.png" alt="alt text" width="600"/>

We use AWS Lambda functions to ingest and process streaming data live from the OpenAQ API. The final refined data and visualisations are stored in Amazon S3 and hosted as a public static website. The static website consists of several HTML pages that display the latest air quality data measured in Belgium from the OpenAQ API.

### Terraform Blueprints/Components

We use Terraform to automatically deploy the solution architecture. The Terraform blueprints are located in the `src/platform/live-sandbox` folder. **The Terraform blueprints are use-case-specific Terraform files that reference Terraform components.** For our use case, the blueprints deploy an end-to-end solution architecture for ingesting, processing, and visualizing streaming data from the OpenAQ API.

Terraform components are located in the `src/platform/terraform-components` folder. **The Terraform components are reusable Terraform code that can be used to deploy a specific AWS resource.** Each component not only deploys its specific AWS resource but does so following best practices for reusability, security, and scalability. A sketch of how a blueprint references a component is shown below.
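
As an illustration, a blueprint file might call a component like this. This is a minimal sketch only; the component name, module path, and variables are hypothetical and not the repository's actual ones:

```terraform
# Hypothetical blueprint snippet in src/platform/live-sandbox:
# it wires up a reusable component from src/platform/terraform-components.
module "website_bucket" {
  source = "../terraform-components/aws-s3-bucket" # illustrative component name

  bucket_name = "openaq-demo-website" # illustrative value
  tags        = local.tags
}
```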

For more info on Terraform, please refer to the [Terraform documentation](https://www.terraform.io/docs/language/index.html).

## GitHub Action CI/CD Lambda Pipeline

We use GitHub Actions to implement a CI/CD pipeline for integrating and deploying Lambda functions. The workflow is located in the `.github/workflows` folder and is triggered on each push to any branch in the repository.

The GitHub Actions workflow contains the following jobs:
- **ci-lambda-code**: This job runs linting checks on the Lambda application code located in the `src/application` folder, using Black, Flake8, and isort.
- **cd-lambda-code** (depends on ci-lambda-code): This job builds the Docker image for each Lambda function and deploys it to AWS Lambda. Deployment configuration files are located at `src/application/lambda/lambda-<zone>/deploy-config.yml` and specify the Lambda function to update and the ECR repository to upload the Docker image to. By default, deployment is triggered only when the Lambda application code in the respective folder changes.

## Tutorial

Please follow the below tutorials to deploy the solution architecture using Terraform and GitHub Actions CI/CD pipeline:

1. Set up Terraform with AWS Cloud account
2. Deploy our AWS infrastructure and Lambda pipelines

### 1. Set up Terraform with AWS Cloud account

The following tools are required to deploy the solution architecture using Terraform. Please ensure they are available on your local machine (a quick version check is shown after this list):

- [OpenAQ API key](https://docs.openaq.org/docs/getting-started): You need to have an OpenAQ API key to access the OpenAQ API. The OpenAQ API key is used in the Lambda functions to ingest and process streaming data from the OpenAQ API.
- [AWS account](https://aws.amazon.com/): You need to have an AWS account to deploy resources on AWS.
- [Terraform](https://learn.hashicorp.com/tutorials/terraform/install-cli): You need to have Terraform installed on your local machine to deploy the Terraform blueprints.
- [Docker](https://docs.docker.com/get-docker/): You need to have Docker installed on your local machine to build and push Docker images to Amazon ECR using Terraform.
- [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-install.html): Optional. The AWS CLI lets you view and update AWS resources programmatically.
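
To quickly verify that the tools are available on your PATH, you can run the following (output will vary by installed version):

```bash
# Print installed versions of the required tools
terraform -version
docker --version
aws --version   # optional
```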

Follow the below steps to configure local Terraform with your AWS account:

**Step 1.** Configure Terraform to use your AWS access key, secret key, and session token by exporting them in a Terminal:

```bash
export AWS_ACCESS_KEY_ID="xxx"
export AWS_SECRET_ACCESS_KEY="xxx"
export AWS_SESSION_TOKEN="xxx"
```
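
If you installed the optional AWS CLI, you can sanity-check that the exported credentials are valid before running Terraform:

```bash
# Prints the account ID and ARN associated with the exported credentials
aws sts get-caller-identity
```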

**Step 2.** Change directory to `live-sandbox`, which contains the Terraform blueprints. Set up and validate the blueprints by running the below commands:

```bash
cd src/platform/live-sandbox
terraform init
terraform validate
```

![alt text](./img/terraform-init-example.png)

> **Remark:** In a multi-engineer environment, it is recommended to store Terraform state files in a remote backend, such as AWS S3, to allow multiple engineers to work on the same Terraform codebase. For more info on Terraform backends, please refer to the [Terraform documentation](https://www.terraform.io/docs/language/settings/backends/index.html).
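
As a sketch, an S3 backend block in the blueprint folder could look like the following; the bucket and lock-table names are placeholders, and both resources must exist before running `terraform init`:

```terraform
terraform {
  backend "s3" {
    bucket         = "my-terraform-state-bucket" # placeholder bucket name
    key            = "openaq-demo/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-state-lock" # placeholder lock table
  }
}
```
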
### 2. Deploy our AWS infrastructure and Lambda pipelines

#### Terraform

To deploy the above solution architecture using Terraform:

**Step 1.** Adjust the Terraform file `src/platform/live-sandbox/meta.tf` to reference your OpenAQ API key in plain text:

```terraform
locals {
  openaq_api_key_file_path = "../../../data/01_raw/openaq-api-key.txt"
  tags = {
    Organisation = "DemoOrg"
    Department   = "DataEngineering"
    Environment  = "Sandbox"
    Management   = "Terraform"
  }
}
```
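
The path above is resolved relative to the blueprint folder, so from the repository root the key file can be created as follows (replace the placeholder with your actual OpenAQ API key):

```bash
# Create the API key file at the path referenced in meta.tf;
# data files like this should stay out of version control
mkdir -p data/01_raw
echo "your-openaq-api-key" > data/01_raw/openaq-api-key.txt
```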

**Step 2.** Change directory to `live-sandbox`, which contains the Terraform blueprints, and deploy the solution architecture by running the below commands:

```bash
cd src/platform/live-sandbox
terraform apply
```

**Step 3.** Review the Terraform resources to be deployed in your AWS account and confirm by typing `yes` in the Terminal.

![alt text](./img/terraform-apply-example.png)

#### GitHub Actions CI/CD Lambda Pipeline

Once the Terraform apply is successful, build and deploy the Lambda functions for the first time using the GitHub Actions CI/CD pipeline:

**Step 4.** In GitHub, navigate to the repository and click on the `Settings` tab to add the following secrets, which allow GitHub Actions to deploy the Lambda functions to your AWS account:

![alt text](./img/github-secrets.png)
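
Alternatively, the same secrets can be set with the GitHub CLI if you use it; the secret names must match those referenced in the workflow file:

```bash
# Values are placeholders; run inside a clone of the repository
gh secret set AWS_ACCESS_KEY_ID --body "xxx"
gh secret set AWS_SECRET_ACCESS_KEY --body "xxx"
gh secret set AWS_SESSION_TOKEN --body "xxx"
gh secret set AWS_REGION --body "us-east-1"
```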

**Step 5.** In each of the three `src/application/lambda/lambda-<zone>/deploy-config.yml` files, set `always-deploy` to `true`:

```yaml
cd-deploy:
  enabled: true
  always-deploy: true # <-- CHANGE HERE
  lambda-name: lambda-<zone>
  ecr-repo-name: lambda-<zone>
  aws-region: us-east-1
```

**Step 6.** Push the changes from your local clone to the remote repository to trigger the GitHub Actions CI/CD pipeline, which builds and deploys the Lambda functions. In GitHub, navigate to the repository and click on the `Actions` tab to view the pipeline run:

![alt text](./img/github-cd-example.png)

> **Remark:** Remember to set `always-deploy` back to `false` after the first deployment to avoid unnecessary Lambda deployments on each commit; see the one-liner below.
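
Since the workflow already reads these files with yq, one way to flip the flag back is with the same tool (a sketch, assuming yq v4 is installed locally):

```bash
# Set always-deploy back to false in a Lambda's deploy config
yq e -i '.cd-deploy.always-deploy = false' src/application/lambda/lambda-<zone>/deploy-config.yml
```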

#### Destroy the AWS infrastructure

To destroy the AWS infrastructure deployed using Terraform, run the below command from the `src/platform/live-sandbox` directory:

```bash
terraform destroy
```

Review the Terraform resources to be destroyed in your AWS account and confirm by typing `yes` in the Terminal.

## Troubleshooting

If you encounter any issues during the deployment of the solution architecture, please refer to the below troubleshooting cases.

### creating Lambda function: image does not exist. Provide a valid source image.

During Terraform apply, you may encounter the below error message:

![alt text](./img/error-image-not-exist.png)

**Problem:** This error occurs when the Docker image for the Lambda function does not yet exist in the Amazon ECR repository. Terraform pushes a dummy Docker image to the repository, but this can take some time, and the Lambda function may try to pull the image before the push has completed.

**Resolution:** Run `terraform apply` again.
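
If the error persists, you can check with the AWS CLI whether the dummy image has landed in the repository before re-running the apply; the repository name comes from your deploy config, e.g. `lambda-<zone>`:

```bash
# Lists the images currently stored in the ECR repository
aws ecr describe-images --repository-name lambda-<zone>
```
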
Empty file added data/00_test/.gitkeep
Empty file added data/01_raw/.gitkeep
1 change: 1 addition & 0 deletions data/01_raw/example-s3-json.json
Large diffs are not rendered by default.
Binary file added img/error-image-not-exist.png
Binary file added img/github-cd-example.png
Binary file added img/github-secrets.png