an event based fraud detection based on aws serverless tech

heaven00/aws-serverless-experiment

AWS serverless fraud detection experiment

This is a simple serverless application experiment that:

  • builds container images from the specified subdirectories and pushes them to ECR for use by Lambda
  • uses CloudFormation and GitHub Actions to deploy the serverless infrastructure
  • contains Jupyter notebooks exploring possible ML solutions to the fraud detection problem
  • includes notes on the problem

Prerequisites

Steps to run the experiment

Setting up the project

  • Clone the repository
  • Install the prerequisites
  • Run uv run pytest; this automatically creates a virtual environment and installs all dependencies listed in the pyproject.toml file.

Deploy the application

  • Create a GitHub Actions workflow that will trigger on every push to the main branch.

Deploying to your own AWS account

You can deploy from your local machine by running GitHub Actions locally:

  • Set up nektos/act, which lets you run GitHub Actions locally.
  • Install Docker; nektos/act uses Docker to run your workflows.
  • If you use VS Code, you can also install GitHub Local Actions (Docs) for a helpful UI on top of act (this is how I tested my workflows locally).

Alternatively, you can run the workflow with your own GitHub Actions:

  • Fork the repo and add AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY to the Actions secrets.
  • Run the workflow from your fork's GitHub Actions.

What will it deploy?

  • A CloudFormation application named fraud-detection with the following resources:

[Figure: architecture diagram]

Testing the CloudFormation application

You can use utils/simulate_events.py to simulate events and test the CloudFormation application.

  • Activate the environment with source .venv/bin/activate; if you don't have a virtualenv yet, create one with uv run pytest.
  • Add your AWS account credentials to the session using aws-vault: aws-vault exec <profile name> --region <region name>. For me this is aws-vault exec personal --region ca-central-1, where personal is my profile and ca-central-1 is my region.
  • Now run uv run python utils/simulate_events.py --stream-name fraud-detection-TransactionIngestion-mHXRRNzZAWJV --num-transactions 1000 (the stream name shown here is the one from my deployment).

utils/simulate_events.py pushes random transaction events, some with a malformed transaction-id and some in the correct format, so you should see data arriving in both your valid and invalid S3 buckets.
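To make the valid/invalid split concrete, here is a minimal sketch of how such a mix of events could be generated and sorted. It is not the repository's script: the field names, the UUID-style transaction ID format, the invalid_rate parameter, and the helper names make_transaction, is_valid_transaction_id and partition are all assumptions made for illustration.

```python
import random
import uuid


def make_transaction(rng: random.Random, invalid_rate: float = 0.2) -> dict:
    """Build one fake transaction; a fraction get a deliberately malformed ID.

    Assumption: a well-formed transaction ID is a canonical UUID string.
    The real format used by the project may differ.
    """
    txn_id = str(uuid.UUID(int=rng.getrandbits(128), version=4))
    if rng.random() < invalid_rate:
        # Hypothetical corruption: drop the hyphens and add a junk prefix.
        txn_id = "bad-" + txn_id.replace("-", "")
    return {"transaction_id": txn_id, "amount": round(rng.uniform(1, 5000), 2)}


def is_valid_transaction_id(txn_id: str) -> bool:
    """Return True only for canonical, lowercase, hyphenated UUID strings."""
    try:
        return str(uuid.UUID(txn_id)) == txn_id
    except ValueError:
        return False


def partition(events: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split events into (valid, invalid) by transaction ID format."""
    valid, invalid = [], []
    for event in events:
        target = valid if is_valid_transaction_id(event["transaction_id"]) else invalid
        target.append(event)
    return valid, invalid


rng = random.Random(42)  # seeded so repeated runs produce the same events
events = [make_transaction(rng) for _ in range(1000)]
valid, invalid = partition(events)
print(f"valid={len(valid)} invalid={len(invalid)}")
```

In a deployed setup, the two lists would correspond to what ends up in the valid and invalid buckets; here they are only kept in memory.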

The project structure

This is a monorepo with multiple sub-projects: the logic for each Lambda function lives in its own folder under the src directory, and the build workflow builds each one and pushes the image to ECR, where it can then be used by AWS Lambda.

├── src
│   ├── fraud_detection_model
│   │   ├── app.py
│   │   ├── Dockerfile
│   │   ├── fraud_detection_model.pkl
│   │   └── requirements.txt
│   └── validation_lambda
│       ├── app.py
│       ├── Dockerfile
│       └── requirements.txt
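Given the layout above, a build step needs to know which folders to turn into images. The following is a rough sketch of one way to discover them, assuming every buildable sub-project is an immediate child of src that contains a Dockerfile. The helper name find_image_dirs is hypothetical, and the actual workflow may select directories differently.

```python
from pathlib import Path


def find_image_dirs(src_root: Path) -> list[Path]:
    """Return immediate subdirectories of src_root that contain a Dockerfile.

    Assumption: one Dockerfile per sub-project, located at its top level.
    """
    if not src_root.is_dir():
        return []
    return sorted(
        child
        for child in src_root.iterdir()
        if child.is_dir() and (child / "Dockerfile").is_file()
    )
```

Calling find_image_dirs(Path("src")) from the repository root would, for the tree shown above, yield the fraud_detection_model and validation_lambda folders, each of which could then be built and pushed as its own image.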

The serverless infrastructure is managed with CloudFormation through the template.yaml file. The best process I have found so far for maintaining it is:

Edit in Infrastructure Composer on AWS using the UI -> copy the template over locally -> update code and workflow as needed -> test with act -> deploy

I don't think this is a good approach. Ideally the template would be split into separate folders and loaded dynamically based on the environment, but that was taking too long to get right. I would also prefer to use Terraform, which I have more experience with, but given the time limit I am sticking to the process above.
