Portfolio of my projects at WBS Coding School Data Science Bootcamp

tvogel/datascience-bootcamp

WBS Data Science Bootcamp Portfolio

Primer: A conversation with ChatGPT 3.5 about SQL challenges

It's interesting to see how well ChatGPT works and where its (current) limitations are, so I chatted with it about some of our SQL challenges. I found it a fun read!

Primer: No-Hangman

At the end of the two-week primer course on SQL, Tableau and Python, everybody builds a simple text-based Hangman (click on the image for my take on it):

No-Hangman Screenshot

Chapter 1: Eniac expansion from Spain to Brazil

In this case study, the company Eniac wants to expand its business to Brazil and evaluates the potential after-sales fulfillment partner Magist for its suitability.

scatter plot of all sellers with the x-axis saying what fraction of products was sold in tech categories and the y-axis depicting the average monthly sales

Chapter 2: Introduction to pandas

As a second basic data-handling tool after SQL, we were introduced to the Python pandas library. Read about our challenges here!

Chapter 3: Data Cleaning and Storytelling

In this quite intense two-week chapter, we dove deep into a sales database with severe data-quality problems and learned how to extract useful conclusions from it nonetheless. Take a look!

Chapter 4: A/B Testing

In this one-week chapter, we learned a lot about the foundations of inferential statistics and about deciding whether the outcomes of UI experiments are statistically significant. Because the methods apply to a much larger class of problems, I found the material very helpful.
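As an illustration of the kind of significance test used for UI experiments, here is a minimal sketch of a two-proportion z-test using only the standard library. It is not the code from the chapter: the function name `two_proportion_z_test` and the visitor and conversion counts are made up for this example, and the choice of this particular test is an assumption.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for H0: both variants convert equally."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up numbers: 100 of 1000 visitors convert on A, 150 of 1000 on B.
z, p = two_proportion_z_test(100, 1000, 150, 1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

With a significance level of 0.05 chosen before the experiment, a p-value below that threshold would lead to rejecting the hypothesis that both variants convert equally well.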


Chapter 5: Data Pipelines on the Cloud

In this two-week project, we learned and practiced ETL data-engineering skills, i.e. extracting, transforming and loading data into storage for comprehensive analysis. We scraped the web, used public APIs, transformed and augmented the data, and stored it in an SQL database. The finished ETL process was then wrapped into a Google Cloud Function for automatic execution, and I went a step further by producing automatically updated reports on the data.
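The extract, transform and load steps can be sketched as follows. This is a minimal, self-contained illustration and not the project code: the records in `RAW_RECORDS` are hypothetical stand-ins for scraped or API-provided data, the `weather` table is invented, and an in-memory SQLite database is assumed in place of the project's SQL database.

```python
import sqlite3

# Hypothetical raw records, standing in for scraped or API-provided data.
RAW_RECORDS = [
    {"city": " Berlin ", "temp_c": "21.5"},
    {"city": "Hamburg", "temp_c": "18.0"},
    {"city": "Munich", "temp_c": None},  # incomplete record
]


def extract():
    """Extract: fetch the raw data (here just a constant)."""
    return RAW_RECORDS


def transform(records):
    """Transform: drop incomplete records, clean names, convert types."""
    rows = []
    for rec in records:
        if rec["temp_c"] is None:
            continue
        rows.append((rec["city"].strip(), float(rec["temp_c"])))
    return rows


def load(rows, conn):
    """Load: write the cleaned rows into an SQL table."""
    conn.execute("CREATE TABLE IF NOT EXISTS weather (city TEXT, temp_c REAL)")
    conn.executemany("INSERT INTO weather VALUES (?, ?)", rows)
    conn.commit()


def run_pipeline(conn):
    load(transform(extract()), conn)


conn = sqlite3.connect(":memory:")  # assumption: SQLite instead of a cloud database
run_pipeline(conn)
rows = conn.execute("SELECT city, temp_c FROM weather ORDER BY city").fetchall()
print(rows)
```

Keeping the three steps in separate functions makes it easy to call a single entry point such as `run_pipeline` from a scheduled cloud function.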

One of the deliverables was a blog post which I wrote on dev.to.

AI generated: Sky with sun, clouds and airplanes, pipelines running through the clouds and a hand holding a drawing pencil

Chapter 6: Unsupervised ML - Clustering Songs

In this one-week project, we learned about distances in high-dimensional spaces, feature scaling, PCA, k-means clustering, the inertia elbow method and the silhouette score, as well as the Spotify API.

My special treat was to apply harmony theory to order songs by harmonic distance.
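One common way to measure harmonic distance is the number of steps between two keys on the circle of fifths. The sketch below illustrates that idea only; it is an assumption and not necessarily the method used in the project. Keys are given as pitch classes 0 to 11 with C = 0 (standard pitch-class notation), major and minor modes are ignored, and the song list is made up.

```python
def fifths_position(pitch_class):
    """Map a pitch class (0 = C, 1 = C#, ..., 11 = B) to its circle-of-fifths index."""
    # A perfect fifth is 7 semitones, and 7 * 7 = 49 = 1 (mod 12),
    # so multiplying by 7 converts semitone order to fifths order.
    return (pitch_class * 7) % 12


def harmonic_distance(key_a, key_b):
    """Number of steps between two keys on the circle of fifths (0 to 6)."""
    steps = abs(fifths_position(key_a) - fifths_position(key_b))
    return min(steps, 12 - steps)


# Hypothetical songs as (title, key) pairs: C, F#, G and F.
songs = [("Song A", 0), ("Song B", 6), ("Song C", 7), ("Song D", 5)]

# Order the songs by harmonic distance from C (pitch class 0).
ordered = sorted(songs, key=lambda song: harmonic_distance(0, song[1]))
print([title for title, _ in ordered])
```

Here G and F are each one step away from C, while F# sits on the opposite side of the circle at the maximum distance of six steps.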

Chapter 7: Supervised ML - Housing Prices and Mushroom classification

confusion matrix display

Two weeks were devoted to supervised machine learning and crammed with insights. We learned about

  • training data preparation
  • classification, regression
  • prediction metrics
  • decision trees, gradient boosted random forests
  • linear and logistic regression
  • support vector classifiers
  • one-hot and ordinal encoding
  • parameter optimization and cross-validation

and even pickling data and deploying classifiers as web apps with Streamlit! Our model datasets were selling prices of houses 🏰 and poisonous vs. edible mushrooms 🍄.

Chapter 8: Recommender systems

WBSFLIX logo

This week, we learned about different ways to derive movie recommendations for the fictitious WBSFLIX online DVD rental shop from previous movie ratings. Read all about it here and check out the recommendation app!
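One way to derive recommendations from previous ratings is item-based collaborative filtering with cosine similarity. The following minimal sketch illustrates the idea and is not the WBSFLIX implementation: the users and ratings in `RATINGS` and all function names are hypothetical, and missing ratings are treated as zero.

```python
import math

# Hypothetical ratings: user -> {movie: rating on a 1-5 scale}
RATINGS = {
    "ann": {"Alien": 5, "Blade Runner": 4, "Casablanca": 1},
    "bob": {"Alien": 4, "Blade Runner": 5},
    "cat": {"Alien": 1, "Casablanca": 5},
}


def item_vector(movie):
    """Ratings of one movie, keyed by user (missing ratings count as 0)."""
    return {user: r[movie] for user, r in RATINGS.items() if movie in r}


def cosine_similarity(movie_a, movie_b):
    """Cosine similarity between the rating vectors of two movies."""
    va, vb = item_vector(movie_a), item_vector(movie_b)
    shared = set(va) & set(vb)
    if not shared:
        return 0.0
    dot = sum(va[u] * vb[u] for u in shared)
    norm_a = math.sqrt(sum(v * v for v in va.values()))
    norm_b = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (norm_a * norm_b)


def most_similar(movie):
    """Return the other movie whose ratings are most similar to `movie`."""
    others = {m for r in RATINGS.values() for m in r} - {movie}
    return max(others, key=lambda m: cosine_similarity(movie, m))


print(most_similar("Alien"))
```

A user who liked one movie would then be recommended the movies with the most similar rating patterns that they have not rated yet.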

Chapter 9: Advanced SQL 🗄️

I actually like SQL a lot, so this one-week reinforcement on advanced SQL topics was good fun!

Final project: Wave-energy converter power optimization with deep learning 🌊

coming soon
