Commit

Add some section headers
ageron committed Oct 2, 2021
1 parent 38e4cce commit 4bf39cc
Showing 3 changed files with 239 additions and 26 deletions.
138 changes: 133 additions & 5 deletions 02_end_to_end_machine_learning_project.ipynb
@@ -83,7 +83,14 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# Get the data"
+"# Get the Data"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Download the Data"
 ]
 },
 {
@@ -132,6 +139,13 @@
 " return pd.read_csv(csv_path)"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Take a Quick Look at the Data Structure"
+]
+},
 {
 "cell_type": "code",
 "execution_count": 5,
@@ -536,6 +550,13 @@
 "plt.show()"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Create a Test Set"
+]
+},
 {
 "cell_type": "code",
 "execution_count": 10,
@@ -1274,7 +1295,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# Discover and visualize the data to gain insights"
+"# Discover and Visualize the Data to Gain Insights"
 ]
 },
 {
@@ -1286,6 +1307,13 @@
 "housing = strat_train_set.copy()"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Visualizing Geographical Data"
+]
+},
 {
 "cell_type": "code",
 "execution_count": 33,
@@ -1470,6 +1498,13 @@
 "plt.show()"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Looking for Correlations"
+]
+},
 {
 "cell_type": "code",
 "execution_count": 38,
@@ -1575,6 +1610,13 @@
 "save_fig(\"income_vs_house_value_scatterplot\")"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Experimenting with Attribute Combinations"
+]
+},
 {
 "cell_type": "code",
 "execution_count": 42,
@@ -1864,7 +1906,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# Prepare the data for Machine Learning algorithms"
+"# Prepare the Data for Machine Learning Algorithms"
 ]
 },
 {
@@ -1877,6 +1919,29 @@
 "housing_labels = strat_train_set[\"median_house_value\"].copy()"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Data Cleaning"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"In the book 3 options are listed:\n",
+"\n",
+"```python\n",
+"housing.dropna(subset=[\"total_bedrooms\"]) # option 1\n",
+"housing.drop(\"total_bedrooms\", axis=1) # option 2\n",
+"median = housing[\"total_bedrooms\"].median() # option 3\n",
+"housing[\"total_bedrooms\"].fillna(median, inplace=True)\n",
+"```\n",
+"\n",
+"To demonstrate each of them, let's create a copy of the housing dataset, but keeping only the rows that contain at least one null. Then it will be easier to visualize exactly what each option does:"
+]
+},
 {
 "cell_type": "code",
 "execution_count": 47,
@@ -2714,6 +2779,13 @@
 "housing_tr.head()"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Handling Text and Categorical Attributes"
+]
+},
 {
 "cell_type": "markdown",
 "metadata": {},
@@ -2987,6 +3059,13 @@
 "cat_encoder.categories_"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Custom Transformers"
+]
+},
 {
 "cell_type": "markdown",
 "metadata": {},
@@ -3199,6 +3278,13 @@
 "housing_extra_attribs.head()"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Transformation Pipelines"
+]
+},
 {
 "cell_type": "markdown",
 "metadata": {},
@@ -3459,7 +3545,14 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# Select and train a model "
+"# Select and Train a Model"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Training and Evaluating on the Training Set"
 ]
 },
 {
@@ -3676,7 +3769,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# Fine-tune your model"
+"## Better Evaluation Using Cross-Validation"
 ]
 },
 {
@@ -3877,6 +3970,20 @@
 "svm_rmse"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"# Fine-Tune Your Model"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Grid Search"
+]
+},
 {
 "cell_type": "code",
 "execution_count": 99,
@@ -4626,6 +4733,13 @@
 "pd.DataFrame(grid_search.cv_results_)"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Randomized Search"
+]
+},
 {
 "cell_type": "code",
 "execution_count": 104,
@@ -4688,6 +4802,13 @@
 " print(np.sqrt(-mean_score), params)"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Analyze the Best Models and Their Errors"
+]
+},
 {
 "cell_type": "code",
 "execution_count": 106,
@@ -4752,6 +4873,13 @@
 "sorted(zip(feature_importances, attributes), reverse=True)"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Evaluate Your System on the Test Set"
+]
+},
 {
 "cell_type": "code",
 "execution_count": 108,
47 changes: 41 additions & 6 deletions 03_classification.ipynb
@@ -351,7 +351,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# Binary classifier"
+"# Training a Binary Classifier"
 ]
 },
 {
@@ -435,6 +435,20 @@
 "cross_val_score(sgd_clf, X_train, y_train_5, cv=3, scoring=\"accuracy\")"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"# Performance Measures"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Measuring Accuracy Using Cross-Validation"
+]
+},
 {
 "cell_type": "code",
 "execution_count": 18,
@@ -522,6 +536,13 @@
 "* lastly, other things may prevent perfect reproducibility, such as Python dicts and sets whose order is not guaranteed to be stable across sessions, or the order of files in a directory which is also not guaranteed."
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Confusion Matrix"
+]
+},
 {
 "cell_type": "code",
 "execution_count": 21,
@@ -578,6 +599,13 @@
 "confusion_matrix(y_train_5, y_train_perfect_predictions)"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Precision and Recall"
+]
+},
 {
 "cell_type": "code",
 "execution_count": 24,
@@ -703,6 +731,13 @@
 "cm[1, 1] / (cm[1, 1] + (cm[1, 0] + cm[0, 1]) / 2)"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Precision/Recall Trade-off"
+]
+},
 {
 "cell_type": "code",
 "execution_count": 30,
@@ -992,7 +1027,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# ROC curves"
+"## The ROC Curve"
 ]
 },
 {
@@ -1208,7 +1243,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# Multiclass classification"
+"# Multiclass Classification"
 ]
 },
 {
@@ -1458,7 +1493,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# Error analysis"
+"# Error Analysis"
 ]
 },
 {
@@ -1625,7 +1660,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# Multilabel classification"
+"# Multilabel Classification"
 ]
 },
 {
@@ -1707,7 +1742,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# Multioutput classification"
+"# Multioutput Classification"
 ]
 },
 {
