Generalized Machine Learning

This repository contains notebooks, data, and slides for a survey training on generalized machine learning and distributed computing, held from September 14 to September 28, 2018. During this three-day course, we will cover the following topics:

Day 1:

ML Review: Generalized ML and Spatial Learning, Bias/Variance Tradeoff, Model Selection Triple
Regularized Regression: LASSO vs Ridge; ElasticNet and more
Clustering: Partitive vs Agglomerative Clustering; clustering evaluation methods, visualization
Classification I: Instance and Inductive Models (kNN, Decision Trees, Ensembles of Trees)
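
To make the LASSO vs Ridge contrast concrete, here is a minimal sketch in plain Python. It assumes an orthonormal design, so each coefficient can be penalized on its own; the function names, the coefficient values, and the penalty strength are hypothetical and not taken from the course notebooks.

```python
def ridge_shrink(b_ols, lam):
    """Ridge solution for one coefficient under an orthonormal design.

    Minimizes 0.5 * (b - b_ols)**2 + 0.5 * lam * b**2.
    """
    return b_ols / (1.0 + lam)


def lasso_shrink(b_ols, lam):
    """LASSO solution (soft-thresholding) under the same assumption.

    Minimizes 0.5 * (b - b_ols)**2 + lam * abs(b).
    """
    if abs(b_ols) <= lam:
        return 0.0  # small coefficients are set exactly to zero
    return b_ols - lam if b_ols > 0 else b_ols + lam


ols_coefs = [3.0, -0.4, 0.1]  # hypothetical unpenalized coefficients
lam = 0.5                     # hypothetical penalty strength

ridge_coefs = [ridge_shrink(b, lam) for b in ols_coefs]
lasso_coefs = [lasso_shrink(b, lam) for b in ols_coefs]

print("ridge:", ridge_coefs)
print("lasso:", lasso_coefs)
```

Ridge scales every coefficient toward zero without eliminating any, while LASSO zeroes out the two small coefficients, which is why LASSO is often used for feature selection.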

Day 2:

Classification II: Parametric Models: SVMs, Bayesian Models, Logistic Regression
Dimensionality Reduction and Manifolds: PCA, SVD, t-SNE, Isomap
Neural Networks I: Multi-Layer Perceptrons
Neural Networks II: Deep Learning and TensorFlow

Day 3:

Introduction to Spark: RDDs and Architecture
Programming Spark: interactive analysis and distributed jobs
Using Spark for data analysis: Spark SQL and Spark DataFrames
Spark for distributed ML: Spark MLlib

Notes:

class experience with Logistic Regression and ANNs
background is mostly math and stats, not computational
don't rely on Python or coding knowledge; do exercises as live demos
focus on feature analysis and hyperparameter tuning
visual analysis with Yellowbrick (YB) is a big help!
for distributed computing, focus on high level computing issues, not mechanisms
no need for a cluster or workshops on the distributed computing day
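
As a live-demo style illustration of the hyperparameter tuning note, the following sketch picks k for a k-nearest-neighbors classifier using leave-one-out cross-validation. It uses only the standard library; the toy data, the candidate values of k, and the helper names are hypothetical and are not drawn from the course materials.

```python
import math
from collections import Counter


def knn_predict(train, query, k):
    """Majority vote among the k training points closest to `query`."""
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def loo_accuracy(data, k):
    """Leave-one-out accuracy: hold out each point in turn and predict it."""
    hits = 0
    for i, (point, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        hits += knn_predict(rest, point, k) == label
    return hits / len(data)


# Hypothetical toy data: two clusters plus one mislabeled point at (1.5, 1.5).
data = [
    ((0.0, 0.0), "a"), ((0.0, 1.0), "a"), ((1.0, 0.0), "a"), ((1.0, 1.0), "a"),
    ((5.0, 5.0), "b"), ((5.0, 6.0), "b"), ((6.0, 5.0), "b"), ((6.0, 6.0), "b"),
    ((1.5, 1.5), "b"),
]

candidates = [1, 3, 5]  # hypothetical search grid for k
scores = {k: loo_accuracy(data, k) for k in candidates}
best_k = max(candidates, key=lambda k: scores[k])  # ties go to the smaller k

print(scores, "best k:", best_k)
```

On this toy data, k = 1 is misled by the single mislabeled point and scores 7/9, while k = 3 and k = 5 both score 8/9. In practice a library grid search would do the same job with more folds and more candidates.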

Other Notes:

Classification Metrics II to follow (ROC/AUC, DecisionThreshold, PR Curves, Class Balance issues)
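
As a preview of these metrics, here is a minimal sketch of ROC AUC and of how precision and recall move with the decision threshold on imbalanced data. It is plain Python under stated assumptions: the labels, scores, thresholds, and function names are hypothetical, and AUC is computed by direct pair counting rather than with any particular library.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def precision_recall(labels, scores, threshold):
    """Precision and recall when scores >= threshold are predicted positive."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for y, p in zip(labels, predicted) if p and y == 1)
    fp = sum(1 for y, p in zip(labels, predicted) if p and y == 0)
    fn = sum(1 for y, p in zip(labels, predicted) if not p and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical imbalanced data: 2 positives among 8 examples.
labels = [0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.35, 0.4, 0.7, 0.6, 0.9]

auc = roc_auc(labels, scores)
strict = precision_recall(labels, scores, threshold=0.8)
lenient = precision_recall(labels, scores, threshold=0.5)

print("AUC:", auc)
print("threshold 0.8 (precision, recall):", strict)
print("threshold 0.5 (precision, recall):", lenient)
```

Raising the threshold trades recall for precision: at 0.8 only one of the two positives is found but nothing is flagged incorrectly, while at 0.5 both positives are found at the cost of one false positive.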

rebeccabilbro/navyfcu-ml
