martsec/big-data-infrastructure-exercises

Big Data Infrastructure Exercises

We’ll use this repository throughout the course, starting from a small application and growing it into something bigger and more complex.

You’ll learn to develop an API (the common interface between software systems), package it, deploy it to the cloud (AWS), and finally scale and better automate the data-intensive parts.

Installation

  • Use Python 3.11 or 3.12

  • Install pipx

  • Install poetry (the dependency manager): pipx install poetry

  • Run poetry install
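As a quick sanity check before running poetry install, the version requirement above can be verified from Python itself. This is only a sketch: the helper name `is_supported` and the `SUPPORTED` constant are hypothetical and not part of this repository.

```python
import sys

# The two minor versions named in the installation steps (3.11 and 3.12).
SUPPORTED = {(3, 11), (3, 12)}


def is_supported(version_info=sys.version_info) -> bool:
    """Return True if the (major, minor) pair is one of the supported versions."""
    return tuple(version_info[:2]) in SUPPORTED


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "supported" if is_supported() else "not supported"
    print(f"Python {current} is {status} for this project")
```

Passing an explicit tuple, for example `is_supported((3, 10, 0))`, lets you check a version other than the running interpreter.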

If you need to add a package, either run poetry add the_dependency, or add it to the [tool.poetry.dependencies] section of pyproject.toml and then run poetry update.

Tip
Avoid developing directly on Windows; use WSL2 instead, a Linux "layer" for Windows. It is available on both Windows 11 and Windows 10.

Running the app

poetry run fastapi dev bdi_api/app.py


# And the tests
poetry run pytest

How will the exercises be evaluated?

The exercises differ from one another, so the evaluation criteria are specified in each bdi_api/sX/README.adoc file.

Libraries

FastAPI is one of the best libraries for building an API with Python. It has great documentation and lets us build type-safe, documented APIs easily.

Data

Use the data/ folder to store your data. It will not be uploaded to any git repository.
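One way to keep all file access inside that folder is a small path helper. This is a sketch under assumptions: the `data_path` function and the `DATA_DIR` constant are hypothetical names, and the relative path assumes the code runs from the repository root.

```python
from pathlib import Path

# Assumed location: the data/ folder at the repository root.
DATA_DIR = Path("data")


def data_path(*parts: str, base: Path = DATA_DIR) -> Path:
    """Build a path under the data folder, creating parent directories if needed."""
    path = base.joinpath(*parts)
    path.parent.mkdir(parents=True, exist_ok=True)
    return path
```

For example, `data_path("raw", "sample.json")` yields `data/raw/sample.json` and makes sure `data/raw/` exists before anything is written to it.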
