Data Engineering Zoomcamp is a free nine-week course that covers the fundamentals of data engineering.

DataTalksClub/data-engineering-zoomcamp

Data Engineering Zoomcamp: A Free 9-Week Course on Data Engineering Fundamentals

Master the fundamentals of data engineering by building an end-to-end data pipeline from scratch. Gain hands-on experience with industry-standard tools and best practices.

  • Join Slack
  • #course-data-engineering Channel
  • Telegram Announcements
  • Course Playlist
  • FAQ

How to Enroll

2025 Cohort

Self-Paced Learning

All course materials are freely available for independent study. Follow these steps:

  1. Watch the course videos.
  2. Join the Slack community.
  3. Refer to the FAQ document for guidance.

Syllabus Overview

The course consists of structured modules, hands-on workshops, and a final project to reinforce your learning.

Prerequisites

To get the most out of this course, you should have:

  • Basic coding experience
  • Familiarity with SQL
  • Experience with Python (helpful but not required)

No prior data engineering experience is necessary.

Modules

  • Introduction to GCP
  • Docker and Docker Compose
  • Running PostgreSQL with Docker
  • Infrastructure setup with Terraform
  • Homework
  • Data Lakes and Workflow Orchestration
  • Workflow orchestration with Kestra
  • Homework
  • API reading and pipeline scalability
  • Data normalization and incremental loading
  • Homework
  • Introduction to BigQuery
  • Partitioning, clustering, and best practices
  • Machine learning in BigQuery
  • dbt (data build tool) with PostgreSQL & BigQuery
  • Testing, documentation, and deployment
  • Data visualization with Metabase
  • Introduction to Apache Spark
  • DataFrames and SQL
  • Internals of GroupBy and Joins
  • Introduction to Kafka
  • Kafka Streams and KSQL
  • Schema management with Avro
  • Apply all concepts learned in a real-world scenario
  • Peer review and feedback process

Community & Support

Getting Help on Slack

Join the #course-data-engineering channel on DataTalks.Club Slack for discussions, troubleshooting, and networking.

To keep discussions organized:

Meet the Instructors

Past instructors:

Sponsors & Supporters

A special thanks to our course sponsors for making this initiative possible!

Interested in supporting our community? Reach out to [email protected].

About DataTalks.Club

DataTalks.Club is a global online community of data enthusiasts. It's a place to discuss data, learn, share knowledge, ask and answer questions, and support each other.

  • Website
  • Join Slack Community
  • Newsletter
  • Upcoming Events
  • Google Calendar
  • YouTube
  • GitHub
  • LinkedIn
  • Twitter

Most activity at DataTalks.Club happens on Slack. We post updates there and discuss different aspects of data, career questions, and more.

At DataTalks.Club, we organize online events, community activities, and free courses. You can learn more about what we do at DataTalksClub Community Navigation.
