This is a simple serverless application experiment that:

- Builds and pushes images of the specified subdirectories to ECR for Lambda
- Uses CloudFormation and GitHub Actions to deploy the serverless infra
- Contains Jupyter notebooks on possible ML solutions for the fraud detection problem
- Includes notes on the problem
- Clone the repository
- Install the prerequisites
- Run `uv run pytest`; this will automatically create an environment and install all dependencies listed in the `pyproject.toml` file.
- Create a GitHub Actions workflow that will trigger on every push to the main branch.
You can deploy from local by running GitHub Actions locally:

- Set up nektos/act, which lets you run GitHub Actions locally.
- Install Docker; nektos/act uses Docker to run your workflows.
- If you are using VS Code, you can install GitHub Local Actions for a helpful UI to run this (this is how I was testing my workflows locally).
Alternatively, you can run with your own GitHub Actions:

- Fork the repo and add `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` to Actions secrets.
- Run the workflow using your GitHub Actions.
- A CloudFormation application `fraud-detection` with the following resources:
You can use `utils/simulate_events.py` to simulate events and test the CloudFormation application.
- Activate the environment with `source .venv/bin/activate`; if you don't have a virtualenv, create one with `uv run pytest`.
- Add AWS account information to the session using aws-vault: `aws-vault exec <profile name> --region <region name>`. For me this turns out to be `aws-vault exec personal --region ca-central-1`, personal being my profile and ca-central-1 being my region.
- Now run `uv run python utils/simulate_events.py --stream-name fraud-detection-TransactionIngestion-mHXRRNzZAWJV --num-transactions 1000`.
The simulate_events.py script pushes random transaction events, some with an error in the transaction-id format and some in the correct format; you should observe data being pushed into both your valid and invalid S3 buckets.
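For reference, here is a minimal sketch of what a simulator along these lines could look like. It is not the repo's actual `utils/simulate_events.py`: the Kinesis ingestion, the `TXN-<digits>` transaction-id format, and the record fields are assumptions made for illustration.

```python
# Illustrative sketch only -- the real utils/simulate_events.py may differ.
# Assumes a Kinesis stream as the ingestion point and a transaction-id
# format of "TXN-<digits>" for valid records; both are assumptions.
import argparse
import json
import random
import uuid

import boto3


def make_transaction(valid: bool) -> dict:
    """Build one fake transaction; invalid ones get a malformed transaction-id."""
    txn_id = f"TXN-{random.randint(100000, 999999)}" if valid else str(uuid.uuid4())
    return {
        "transaction_id": txn_id,
        "amount": round(random.uniform(1.0, 5000.0), 2),
        "currency": "CAD",
    }


def main() -> None:
    parser = argparse.ArgumentParser()
    parser.add_argument("--stream-name", required=True)
    parser.add_argument("--num-transactions", type=int, default=100)
    args = parser.parse_args()

    kinesis = boto3.client("kinesis")
    for _ in range(args.num_transactions):
        # Roughly 80% of events get a well-formed id, the rest a malformed one.
        record = make_transaction(valid=random.random() > 0.2)
        kinesis.put_record(
            StreamName=args.stream_name,
            Data=json.dumps(record).encode("utf-8"),
            PartitionKey=record["transaction_id"],
        )


if __name__ == "__main__":
    main()
```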
This is a mono repo with multiple sub-projects: each Lambda function's logic lives in its own folder in the `src`
directory, and the build workflow builds each of them and pushes the image to ECR, where it can then be used by AWS Lambda (a rough sketch of one such handler follows the directory tree below).
├── src
│ ├── fraud_detection_model
│ │ ├── app.py
│ │ ├── Dockerfile
│ │ ├── fraud_detection_model.pkl
│ │ └── requirements.txt
│ └── validation_lambda
│ ├── app.py
│ ├── Dockerfile
│ └── requirements.txt
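As an illustration of the per-folder Lambda layout, here is a minimal sketch of what a validation handler such as `src/validation_lambda/app.py` might contain. The handler name, the Kinesis event shape, the bucket environment variables, and the transaction-id check are all assumptions, not the repo's actual code.

```python
# Illustrative sketch only -- not the repo's actual src/validation_lambda/app.py.
# Assumes records arrive via a Kinesis trigger, that valid transaction ids match
# "TXN-<digits>", and that bucket names come from environment variables.
import base64
import json
import os
import re

import boto3

s3 = boto3.client("s3")
VALID_BUCKET = os.environ.get("VALID_BUCKET", "fraud-detection-valid")        # assumed name
INVALID_BUCKET = os.environ.get("INVALID_BUCKET", "fraud-detection-invalid")  # assumed name
TXN_ID_PATTERN = re.compile(r"^TXN-\d+$")  # assumed transaction-id format


def handler(event, context):
    """Route each incoming transaction to the valid or invalid bucket."""
    records = event.get("Records", [])
    for record in records:
        payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
        txn_id = str(payload.get("transaction_id", ""))
        bucket = VALID_BUCKET if TXN_ID_PATTERN.match(txn_id) else INVALID_BUCKET
        s3.put_object(
            Bucket=bucket,
            Key=f"transactions/{txn_id or 'missing-id'}.json",
            Body=json.dumps(payload).encode("utf-8"),
        )
    return {"processed": len(records)}
```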
The serverless infra is managed by the `template.yaml` file using CloudFormation. The best process I have found so far for maintaining it is:

Edit in Infrastructure Composer on AWS using the UI -> copy over locally -> update code and workflow as needed -> test with act -> deploy

I believe this is not a good way; ideally I would want the template split into separate folders and loaded dynamically based on the environment, but that was taking too much time to get right. I would also prefer Terraform over this, but due to the time limit I am sticking to the above process.