Skip to content

The purpose of this repository is to apply a data ingestion with Amazon Kinesis Firehose saving that data to S3 using the boto3. After that I use AWS Glue to catalog and Athena to query the data.

Notifications You must be signed in to change notification settings

harmatiuk/ingestion_with_kinesis_firehose

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

9 Commits
 
 
 
 

Repository files navigation

Data Ingestion with Kinesis Firehose

The purpose of this repository is to apply a data ingestion with Amazon Kinesis Firehose saving that data to S3. After that I use AWS Glue to catalog and Athena to query the data. Basically I created this scenario:

image


The Data

For the data, I used the Fake Web Events lib (https://github.com/andresionek91/fake-web-events) which is a fake web event generator. Where you can use to generate semi-random web events for your study.


Tools

  • Kinesis Data Firehose:
    Amazon Kinesis Data Firehose is an extract, transform, and load (ETL) service that reliably captures, transforms, and delivers streaming data to data lakes, data stores, and analytics services.
  • S3 Bucket:
    Amazon S3 buckets, which are similar to file folders, store objects, which consist of data and its descriptive metadata.
  • Glue Data Catalog:
    Amazon Glue Data Catalog provides a uniform repository where disparate systems can store and find metadata to keep track of data in data silos. You can then use the metadata to query and transform that data in a consistent manner across a wide variety of applications.
  • Athena:
    Amazon Athena is a serverless, interactive query service to query data and analyze big data in Amazon S3 using standard SQL.
  • Boto3:
    Boto3 makes it easy to integrate your Python application, library, or script with AWS services including Amazon S3, Amazon EC2, Amazon DynamoDB, and more.

About

The purpose of this repository is to apply a data ingestion with Amazon Kinesis Firehose saving that data to S3 using the boto3. After that I use AWS Glue to catalog and Athena to query the data.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages