This is the code repository for scikit-learn Cookbook - Second Edition, published by Packt. It contains all the supporting project files necessary to work through the book from start to finish.
Python is quickly becoming the go-to language for analysts and data scientists due to its simplicity and flexibility, and within the Python data space, scikit-learn is the unequivocal choice for machine learning. This book includes walkthroughs and solutions to common as well as not-so-common problems in machine learning, and shows how scikit-learn can be leveraged to perform various machine learning tasks effectively.
The second edition begins with recipes on evaluating the statistical properties of data and generating synthetic data for machine learning modelling. As you progress through the chapters, you will come across recipes that teach you to implement techniques such as data pre-processing, linear regression, logistic regression, K-NN, Naïve Bayes, classification, decision trees, ensembles, and much more. Furthermore, you’ll learn to optimize your models with multi-class classification, cross validation, and model evaluation, and dive deeper into implementing deep learning with scikit-learn. Along with covering the enhanced features on model selection, the API, and new features such as classifiers, regressors, and estimators, the book also contains recipes on evaluating and fine-tuning the performance of your model.
All of the code is organized into folders. Each folder is named Chapter followed by the chapter number. For example, Chapter02.
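The code will look like the following (%matplotlib inline is an IPython magic, so this snippet runs in a Jupyter notebook rather than as a plain Python script):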
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
You will need to install the following libraries:
- anaconda 4.1.1
- numba 0.26.0
- numpy 1.12.1
- pandas 0.20.3
- pandas-datareader 0.4.0
- patsy 0.4.1
- scikit-learn 0.19.0
- scipy 0.19.1
- statsmodels 0.8.0
- sympy 1.0
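To see how an existing environment compares with the versions listed above, a small check along these lines may help. It is only a sketch: it assumes Python 3.8 or later (for `importlib.metadata`), the `REQUIRED` dictionary and the `check_versions` helper are hypothetical names introduced here, and anaconda is left out on the assumption that it is managed through conda rather than installed as an ordinary Python package.

```python
from importlib import metadata

# Versions copied from the list above. "anaconda" is omitted on the
# assumption that it is managed through conda, not as a Python package.
REQUIRED = {
    "numba": "0.26.0",
    "numpy": "1.12.1",
    "pandas": "0.20.3",
    "pandas-datareader": "0.4.0",
    "patsy": "0.4.1",
    "scikit-learn": "0.19.0",
    "scipy": "0.19.1",
    "statsmodels": "0.8.0",
    "sympy": "1.0",
}


def check_versions(required):
    """Return {package: (wanted, installed)} for every package whose
    installed version differs from the wanted one. `installed` is None
    when the package is not installed at all."""
    mismatches = {}
    for name, wanted in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, installed) in check_versions(REQUIRED).items():
        print(f"{name}: wanted {wanted}, found {installed or 'not installed'}")
```

The check only reports differences and does not install or change anything. A newer installed version is reported as a mismatch too, so the output is a starting point for deciding whether a recipe needs adjusting, not a verdict.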