Commit

Merge pull request #105 from kk-Syuer/main
iacopomasi authored May 2, 2024
2 parents c4b0958 + cc3f47b commit 4e91bc1
Showing 1 changed file with 14 additions and 14 deletions.
28 changes: 14 additions & 14 deletions AA2324/course/09_decision_trees/09_decision_trees.ipynb
@@ -204,7 +204,7 @@
"source": [
"# This lecture material is taken from\n",
"- Information Theory part - (Entropy etc) is taken from __Chapter 1 - Bishop__.\n",
"- Decision Trees are very briefly covered in __Bishop at page 663__.\n",
"- Decision Trees are very briefly covered in __Bishop on page 663__.\n",
"- [Cimi Book - Chapter 01](http://ciml.info/dl/v0_99/ciml-v0_99-ch01.pdf)\n",
"- [CSC411: Introduction to Machine Learning](https://www.cs.toronto.edu/~urtasun/courses/CSC411_Fall16/06_trees_handout.pdf)\n",
"- [CSC411: Introduction to Machine Learning - Tutorial](https://www.cs.toronto.edu/~urtasun/courses/CSC411_Fall16/tutorial3.pdf)\n",
@@ -299,7 +299,7 @@
}
},
"source": [
"# What is the the training error of $k$-NN? 🤔"
"# What is the training error of $k$-NN? 🤔"
]
},
{
@@ -310,7 +310,7 @@
}
},
"source": [
"- In $k$-NN there there is no explicit cost/loss, how can we measure the training error? \n"
"- In $k$-NN there is no explicit cost/loss, how can we measure the training error? \n"
]
},
{
@@ -630,7 +630,7 @@
"source": [
"# When $k=1$ we perfectly classify the training set! 100% accuracy!\n",
"\n",
"It is easy to show that this follow by definition **(each point is neighbour to itself).**\n",
"It is easy to show that this follows by definition **(each point is neighbour to itself).**\n",
"\n",
"but will this hold for $K \\gt 1$?"
]
@@ -643,7 +643,7 @@
}
},
"source": [
"# We record the training accuracy in function of increasing $k$"
"# We record the training accuracy in the function of increasing $k$"
]
},
{
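This hunk records the training accuracy of $k$-NN as $k$ grows. A minimal sketch of that experiment with scikit-learn, using the Iris data as a stand-in (the notebook's dataset and plotting code may differ):

```python
# Minimal sketch: k-NN training accuracy as a function of k.
# Iris is used as a stand-in dataset; the notebook may use different data.
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

for k in range(1, 16):
    knn = KNeighborsClassifier(n_neighbors=k).fit(X, y)
    acc = knn.score(X, y)  # accuracy measured on the same data used for fitting
    print(f"k={k:2d}  training accuracy={acc:.3f}")
# k=1 gives 100% by construction (each point is its own nearest neighbour);
# training accuracy typically drops as k increases and the boundary smooths out.
```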
@@ -912,7 +912,7 @@
"source": [
"# Remember to estimate scaling on the training set only!\n",
"\n",
"- In theory this is part below is an error.\n",
"- In theory this part below is an error.\n",
"- I took the code from sklearn documentation but in practice you have to estimate the scale parameters **ONLY** in the training set.\n",
"- Then applying it directly to the test set. \n",
"- If you work in inductive settings, you cannot do it jointly like the code above.\n",
@@ -1268,7 +1268,7 @@
}
},
"source": [
"# Plot Miclassification function for binary case\n",
"# Plot Misclassification function for binary case\n",
"\n",
"```python\n",
"pk = np.arange(0, 1.1, 0.1)\n",
@@ -1772,7 +1772,7 @@
"source": [
"# This lecture material is taken from\n",
"- Information Theory part - (Entropy etc) is taken from __Chapter 1 - Bishop__.\n",
"- Decision Trees are very briefly covered in __Bishop at page 663__.\n",
"- Decision Trees are very briefly covered in __Bishop on page 663__.\n",
"- [Cimi Book - Chapter 01](http://ciml.info/dl/v0_99/ciml-v0_99-ch01.pdf)\n",
"- [CSC411: Introduction to Machine Learning](https://www.cs.toronto.edu/~urtasun/courses/CSC411_Fall16/06_trees_handout.pdf)\n",
"- [CSC411: Introduction to Machine Learning - Tutorial](https://www.cs.toronto.edu/~urtasun/courses/CSC411_Fall16/tutorial3.pdf)\n",
@@ -2010,8 +2010,8 @@
"\n",
" \n",
"- **Termination**:\n",
" 1. if no examples – return **majority** from parent (Voting such as in k-NN).\n",
" 2. else if all examples in same class – return the class **(pure node)**.\n",
" 1. if no examples – return **majority** from the parent (Voting such as in k-NN).\n",
" 2. else if all examples are in the same class – return the class **(pure node)**.\n",
" 3. else we are not in a termination node (keep recursing)\n",
" 4. **[Optional]** we could also terminate for some **regularization** parameters"
]
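The termination rules listed in this hunk are the base cases of the recursive tree-building procedure. A minimal self-contained sketch (binary splits on numeric features, Gini impurity; illustrative code, not the notebook's implementation):

```python
# Minimal sketch of recursive tree building with the termination rules above.
import numpy as np
from collections import Counter

def majority(y):
    return Counter(y).most_common(1)[0][0]

def gini(y):
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def build_tree(X, y, parent_y=None, depth=0, max_depth=3):
    if len(y) == 0:                        # 1. no examples: majority vote from the parent
        return {"leaf": majority(parent_y)}
    if len(np.unique(y)) == 1:             # 2. pure node: return the class
        return {"leaf": y[0]}
    if depth >= max_depth:                 # 4. optional regularization stop
        return {"leaf": majority(y)}
    best = None                            # 3. otherwise pick the best split and recurse
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j])[:-1]:  # thresholds that leave both sides non-empty
            mask = X[:, j] <= t
            G = mask.mean() * gini(y[mask]) + (~mask).mean() * gini(y[~mask])
            if best is None or G < best[0]:
                best = (G, j, t, mask)
    if best is None:                       # no valid split left: stop with a majority leaf
        return {"leaf": majority(y)}
    _, j, t, mask = best
    return {"feature": j, "threshold": t,
            "left":  build_tree(X[mask],  y[mask],  y, depth + 1, max_depth),
            "right": build_tree(X[~mask], y[~mask], y, depth + 1, max_depth)}
```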
@@ -3276,7 +3276,7 @@
"\n",
"$$G(Q, \\theta) = \\frac{N^{L}}{N} H(Q^{L}(\\theta)) + \\frac{N^{R}}{N} H(Q^{R}(\\theta))\n",
"$$\n",
"Select the parameters that minimises the impurity\n",
"Select the parameters that minimize the impurity\n",
"\n",
"$$\n",
"\\boldsymbol{\\theta}^* = \\operatorname{argmin}_\\boldsymbol{\\theta} G(Q_m, \\theta)\n",
@@ -3483,7 +3483,7 @@
"source": [
"# Quick Remedies\n",
"\n",
"However, even if we have these **on-hand weapon to avoid overfitting**, it is **still hard to train a single decision tree to perform well generally**. Thus, we will use another useful training technique called **ensemble methods or bagging**, which leads to random-forest."
"However, even if we have these **on-hand weapons to avoid overfitting**, it is **still hard to train a single decision tree to perform well generally**. Thus, we will use another useful training technique called **ensemble methods or bagging**, which leads to random-forest."
]
},
{
@@ -3725,7 +3725,7 @@
"- $K=\\sqrt{D}$ so it is a fixed hyper-param.\n",
"- You have to tune $M$ but in general it needs to be large.\n",
"- DT are **very interpretable**; DT/RF could be used for **feature selection**\n",
" - To answer the question: __which feature contribute more to the label?__\n",
" - To answer the question: __which feature contributes more to the label?__\n",
"- You can evaluate them **without a validation split** (Out of Bag Generalization - OOB)"
]
},
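A minimal sketch of those points with scikit-learn's RandomForestClassifier: $K=\sqrt{D}$ via `max_features`, out-of-bag evaluation instead of a validation split, and impurity-based feature importances (Iris as a stand-in dataset):

```python
# Minimal sketch: random forest with K = sqrt(D) features per split,
# out-of-bag (OOB) evaluation, and feature importances for feature selection.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)

rf = RandomForestClassifier(
    n_estimators=500,      # M: in general needs to be large
    max_features="sqrt",   # K = sqrt(D) features considered at each split
    oob_score=True,        # evaluate on out-of-bag samples, no validation split needed
    random_state=0,
).fit(X, y)

print("OOB accuracy:", rf.oob_score_)
print("Feature importances:", rf.feature_importances_)
```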
@@ -4052,7 +4052,7 @@
"\n",
"[Link to the Microsoft paper](https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/BodyPartRecognition.pdf)\n",
"\n",
"_To keep the training times down we employ a distributed implementation. Training 3 trees to depth 20 from 1 million images takes about a day on a 1000 core cluster._"
"_To keep the training times down we employ a distributed implementation. Training 3 trees to depth 20 from 1 million images takes about a day on a 1000-core cluster._"
]
},
{