Add some section headers

Terminaldienst · Oct 2, 2021 · 4bf39cc
1 parent 38e4cce
commit 4bf39cc

Showing 3 changed files with 239 additions and 26 deletions.
diff --git a/02_end_to_end_machine_learning_project.ipynb b/02_end_to_end_machine_learning_project.ipynb
@@ -83,7 +83,14 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "# Get the data"
+    "# Get the Data"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Download the Data"
    ]
   },
   {
@@ -132,6 +139,13 @@
     "    return pd.read_csv(csv_path)"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Take a Quick Look at the Data Structure"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 5,
@@ -536,6 +550,13 @@
     "plt.show()"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Create a Test Set"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 10,
@@ -1274,7 +1295,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "# Discover and visualize the data to gain insights"
+    "# Discover and Visualize the Data to Gain Insights"
    ]
   },
   {
@@ -1286,6 +1307,13 @@
     "housing = strat_train_set.copy()"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Visualizing Geographical Data"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 33,
@@ -1470,6 +1498,13 @@
     "plt.show()"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Looking for Correlations"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 38,
@@ -1575,6 +1610,13 @@
     "save_fig(\"income_vs_house_value_scatterplot\")"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Experimenting with Attribute Combinations"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 42,
@@ -1864,7 +1906,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "# Prepare the data for Machine Learning algorithms"
+    "# Prepare the Data for Machine Learning Algorithms"
    ]
   },
   {
@@ -1877,6 +1919,29 @@
     "housing_labels = strat_train_set[\"median_house_value\"].copy()"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Data Cleaning"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "In the book 3 options are listed:\n",
+    "\n",
+    "```python\n",
+    "housing.dropna(subset=[\"total_bedrooms\"])    # option 1\n",
+    "housing.drop(\"total_bedrooms\", axis=1)       # option 2\n",
+    "median = housing[\"total_bedrooms\"].median()  # option 3\n",
+    "housing[\"total_bedrooms\"].fillna(median, inplace=True)\n",
+    "```\n",
+    "\n",
+    "To demonstrate each of them, let's create a copy of the housing dataset, but keeping only the rows that contain at least one null. Then it will be easier to visualize exactly what each option does:"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 47,
@@ -2714,6 +2779,13 @@
     "housing_tr.head()"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Handling Text and Categorical Attributes"
+   ]
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -2987,6 +3059,13 @@
     "cat_encoder.categories_"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Custom Transformers"
+   ]
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -3199,6 +3278,13 @@
     "housing_extra_attribs.head()"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Transformation Pipelines"
+   ]
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -3459,7 +3545,14 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "# Select and train a model "
+    "# Select and Train a Model"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Training and Evaluating on the Training Set"
    ]
   },
   {
@@ -3676,7 +3769,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "# Fine-tune your model"
+    "## Better Evaluation Using Cross-Validation"
    ]
   },
   {
@@ -3877,6 +3970,20 @@
     "svm_rmse"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Fine-Tune Your Model"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Grid Search"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 99,
@@ -4626,6 +4733,13 @@
     "pd.DataFrame(grid_search.cv_results_)"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Randomized Search"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 104,
@@ -4688,6 +4802,13 @@
     "    print(np.sqrt(-mean_score), params)"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Analyze the Best Models and Their Errors"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 106,
@@ -4752,6 +4873,13 @@
     "sorted(zip(feature_importances, attributes), reverse=True)"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Evaluate Your System on the Test Set"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 108,
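
Reviewer note on the new "Data Cleaning" markdown cell in `02_end_to_end_machine_learning_project.ipynb`: the three missing-value options it lists can be tried out on a small frame. The following is a minimal sketch, not the notebook's own code. The toy values and the names `sample_incomplete_rows`, `option1`, `option2`, and `option3` are assumptions made for illustration, and option 3 uses plain assignment instead of `inplace=True` on a selected column, which is a substitution that sidesteps chained-assignment concerns in recent pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame standing in for the housing dataset.
housing = pd.DataFrame({
    "total_rooms": [880.0, 7099.0, 1467.0, 1274.0],
    "total_bedrooms": [129.0, np.nan, 190.0, np.nan],
})

# Keep only the rows that contain at least one null, as the cell describes,
# so the effect of each option is easy to see.
sample_incomplete_rows = housing[housing.isnull().any(axis=1)]

# Option 1: drop the rows where total_bedrooms is missing.
option1 = sample_incomplete_rows.dropna(subset=["total_bedrooms"])

# Option 2: drop the whole total_bedrooms column.
option2 = sample_incomplete_rows.drop("total_bedrooms", axis=1)

# Option 3: fill missing values with the median of the full column.
median = housing["total_bedrooms"].median()
option3 = sample_incomplete_rows.copy()
option3["total_bedrooms"] = option3["total_bedrooms"].fillna(median)

print(len(option1), list(option2.columns), option3["total_bedrooms"].tolist())
```

Because every row in `sample_incomplete_rows` is missing `total_bedrooms`, option 1 leaves nothing behind, option 2 keeps both rows but loses the column, and option 3 keeps everything with the gaps filled.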

diff --git a/03_classification.ipynb b/03_classification.ipynb
@@ -351,7 +351,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "# Binary classifier"
+    "# Training a Binary Classifier"
    ]
   },
   {
@@ -435,6 +435,20 @@
     "cross_val_score(sgd_clf, X_train, y_train_5, cv=3, scoring=\"accuracy\")"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Performance Measures"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Measuring Accuracy Using Cross-Validation"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 18,
@@ -522,6 +536,13 @@
     "* lastly, other things may prevent perfect reproducibility, such as Python dicts and sets whose order is not guaranteed to be stable across sessions, or the order of files in a directory which is also not guaranteed."
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Confusion Matrix"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 21,
@@ -578,6 +599,13 @@
     "confusion_matrix(y_train_5, y_train_perfect_predictions)"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Precision and Recall"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 24,
@@ -703,6 +731,13 @@
     "cm[1, 1] / (cm[1, 1] + (cm[1, 0] + cm[0, 1]) / 2)"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Precision/Recall Trade-off"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 30,
@@ -992,7 +1027,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "# ROC curves"
+    "## The ROC Curve"
    ]
   },
   {
@@ -1208,7 +1243,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "# Multiclass classification"
+    "# Multiclass Classification"
    ]
   },
   {
@@ -1458,7 +1493,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "# Error analysis"
+    "# Error Analysis"
    ]
   },
   {
@@ -1625,7 +1660,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "# Multilabel classification"
+    "# Multilabel Classification"
    ]
   },
   {
@@ -1707,7 +1742,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "# Multioutput classification"
+    "# Multioutput Classification"
    ]
   },
   {