Learn to Code - Python for Data Science

What and How?

  • 4-hour intro workshop on Python for data science: heavier on the Python, with a light intro to data science libraries and methods and plenty of exercises and examples
  • 4 page Handout
    • Setup instructions (fork this link on Kaggle to start, then fork exercise #2, etc.)
    • Data types, variables, operators, assignment, etc.
    • Sequences, collections, and operating on them
    • Jupyter notebook
  • Deliver notebooks on Kaggle for students to fork: 3 notebooks, one for each section of the 4-hour workshop
  • slide deck

Preparation

  • Request Data Science students or folks proficient in python to volunteer as assistants.
  • Print out handouts for participants 1-3 days prior to the workshop, so we don't need to worry about the printer the morning of the workshop.

Setup

  • Be sure to have music going when people show up.
  • Have the contents of welcome.md showing on the displays for when people show up. Participants should get on the WiFi and create a Kaggle account before the workshop begins.

Deliverables for attendees:

  • 4 page handout
  • Jupyter notebooks w/ lesson material, examples, and exercises
  • a 3 act play (intro to python, intro to , do some analysis on lemonade.csv)

What is a data scientist?

  • someone who knows more stats than the software engineers
  • someone who knows more software engineering than the statisticians

3 act play

  1. Overview of Data Science (what we're doing today, why we're here) (maybe)

    • Review
    • Activity: Game called conversation with everybody
    • Take 5 minutes to visit with a person close to you
    • collect the data on the podium and present it
  2. Intro to python fundamentals w/ lots of examples

    • Lecture
    • Exercise
    • Review
  3. Intro to stats and mathy bits with python & libraries

    • just enough numpy and pandas to be dangerous
    • multiple working examples w/ descriptive stats and data manipulation
    • break for lunch
  4. Walk through a larger exercise like lemonade.csv

  • how does price impact revenue?
  • how does temperature impact revenue?
  • how does rainfall impact revenue?
  • does distributing more flyers tend to occur with more sales?
  • linear regression - If it's 100 degrees and I distribute 100 flyers, what will revenue be?
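Predicting a number such as revenue from a number such as temperature is a regression question. Below is a minimal sketch of a one-predictor least-squares fit using only plain Python. The data values and the helper name `fit_line` are made up for illustration and do not come from lemonade.csv. Using both temperature and flyers at once would need a multiple regression, which this sketch does not cover.

```python
# Made-up numbers for illustration only; they are NOT taken from lemonade.csv.
temperatures = [60, 70, 80, 90]
revenues = [20, 25, 30, 35]


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel out).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(temperatures, revenues)

# Predicted revenue at 100 degrees, according to the fitted line.
predicted = slope * 100 + intercept
print(slope, intercept, predicted)
```

In the workshop itself, a library such as numpy or pandas would likely do this work; the hand-written version is only meant to show what the fit computes.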

Intro to Python using Notebooks

  • Intro to data types, variables, and operators
  • Show assignment, reassignment
  • working with sequences and collections
    • indexing
    • slicing
    • iterating
  • How to look up the documentation or "how to do XYZ in python"
  • Intro to writing your own functions

Data Sets for the workshop

  • mpg.csv
  • lemonade.csv

Exercises

  • what's the maximum in a collection?
  • what's the most frequently occurring observation?
  • what's the least frequently occurring observation?
  • what's the average?
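One possible set of answers to these exercises, using only the standard library. The list `observations` is made-up sample data, and this is one approach among several.

```python
from collections import Counter
from statistics import mean

# Made-up sample data for illustration.
observations = [1, 9, 9, 4, 9, 4]

# Maximum in a collection
largest = max(observations)

# Count how many times each value occurs
counts = Counter(observations)

# Most frequently occurring observation
most_frequent = max(counts, key=counts.get)

# Least frequently occurring observation
# (if several values tie, this returns the first one encountered)
least_frequent = min(counts, key=counts.get)

# Average
average = mean(observations)

print(largest, most_frequent, least_frequent, average)
```

Beginners can also solve each of these with a plain `for` loop first, then compare against the built-ins.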

Math notation explained in Python

  • or
  • and
  • union
  • intersection
  • empty_set
  • is a subset of
  • element is a member of
  • element is not a member of
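Each of these notation terms has a direct counterpart in Python's boolean operators and built-in `set` type. A short sketch, with the sets `a` and `b` invented for illustration:

```python
a = {1, 2, 3}
b = {3, 4, 5}

# Logical "or" and "and"
either = True or False           # True
both = True and False            # False

# Set operations
union = a | b                    # every element in a or b
intersection = a & b             # only elements in both a and b
empty_set = set()                # note: {} creates an empty dict, not a set

# Subset and membership tests
is_subset = {1, 2} <= a          # True: every element of {1, 2} is in a
is_member = 2 in a               # True
is_not_member = 4 not in a       # True

print(union, intersection, empty_set)
```

The method forms `a.union(b)`, `a.intersection(b)`, and `{1, 2}.issubset(a)` behave the same way and may read more naturally to beginners.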

Actual statistics, yo!

  • descriptive statistics of lemonade.csv
  • maybe visualization of lemonade.csv?
  • hypothesis testing on lemonade.csv
  • maybe a regression?
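As a rough sketch of the descriptive statistics step, the snippet below reads a small CSV and summarizes one column with the standard library. The inline text stands in for lemonade.csv; its column names (`temperature`, `sales`) and values are assumptions made up for this example, not the real file's contents.

```python
import csv
import io
from statistics import mean, median, stdev

# Hypothetical stand-in for lemonade.csv; column names and values are made up.
sample_csv = io.StringIO(
    "temperature,sales\n"
    "60,20\n"
    "70,25\n"
    "80,30\n"
    "90,35\n"
)

# Each row becomes a dict keyed by the header line; values arrive as strings.
rows = list(csv.DictReader(sample_csv))
sales = [float(row["sales"]) for row in rows]

summary = {
    "count": len(sales),
    "min": min(sales),
    "max": max(sales),
    "mean": mean(sales),
    "median": median(sales),
    "stdev": stdev(sales),   # sample standard deviation
}
print(summary)
```

With pandas, `pd.read_csv(...)` followed by `.describe()` on the resulting DataFrame gives a similar summary in one call.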

Storyboard and script

Before things officially start

  • hang out and visit with people
  • Handouts printed and on every seat (backup handouts ready)
  • Setup instructions on display (wifi info, go to Kaggle, fork this, read this)

Kickoff or transition from Dimitri/introducer

  • Introduce material
  • Introduce self
  • Outline the process for the day, set expectations
    • Part I is lecture, exercise, review
    • Part II is lecture, exercise, review
    • Part III is a small project combining

Slides

  • Welcome slide with salient info:
    • wifi network names and password
    • instructions for getting started
  • The "about codeup" part should have already been covered
  • About your instructor
  • Ground rules