diff --git a/slides/supervised-classification/slides-classification-discranalysis.tex b/slides/supervised-classification/slides-classification-discranalysis.tex
index a3a52ae41..f72f35888 100644
--- a/slides/supervised-classification/slides-classification-discranalysis.tex
+++ b/slides/supervised-classification/slides-classification-discranalysis.tex
@@ -239,7 +239,12 @@
 \begin{vbframe}{Discriminant analysis comparison}
 
 \begin{small}
-Measuring the classification error on a toy binary classification task of increasing dimension, where data for each class are drawn from a multivariate normal distribution with the same mean and slight variations in covariance, followed by a small shift in up to 5 features for one class to create structure:
+\begin{itemize}
+\item We benchmark on simple toy data set(s)
+\item Normally distributed data per class, but unequal cov matrices
+\item And then increase dimensionality
+\item We might assume that QDA always wins here ...
+\end{itemize}
 \end{small}
 
 \begin{center}
diff --git a/slides/supervised-classification/slides-classification-naivebayes.tex b/slides/supervised-classification/slides-classification-naivebayes.tex
index e5fef9383..71745af68 100644
--- a/slides/supervised-classification/slides-classification-naivebayes.tex
+++ b/slides/supervised-classification/slides-classification-naivebayes.tex
@@ -13,28 +13,32 @@
 }{% Relative path to title page image: Can be empty but must not start with slides/
 figure/nb-db
 }{% Learning goals, wrapped inside itemize environment
- \item Understand the idea of Naive Bayes
- \item Understand in which sense Naive Bayes is a special QDA model
+ \item Construction principle of NB
+ \item Conditional independence assumption
+ \item Numerical and categorical features
+ \item Similarity to QDA, quadratic decision boundaries
+ \item Laplace smoothing
 }
 
 \framebreak
 
 \begin{vbframe}{Naive Bayes classifier}
 
-NB is a generative multiclass technique. Remember: We use Bayes' theorem and only need $\pdfxyk$ to compute the posterior as:
+Generative multiclass technique. Remember: We use Bayes' theorem and only need $\pdfxyk$ to compute the posterior as:
 
 $$\pikx \approx \postk = \frac{\P(\xv | y = k) \P(y = k)}{\P(\xv)} = \frac{\pdfxyk \pik}{\sumjg \pdfxyk[j] \pi_j} $$
 
-NB is based on a simple \textbf{conditional independence assumption}: the features are conditionally independent given class $y$.
+NB is based on a simple \textbf{conditional independence assumption}: \\
+the features are conditionally independent given class $y$.
 
 $$ \pdfxyk = p((x_1, x_2, ..., x_p)|y = k)=\prodjp p(x_j|y = k). $$
 
-So we only need to specify and estimate the distribution $p(x_j|y = k)$, which is considerably simpler as this is univariate.
+So we only need to specify and estimate the distributions $p(x_j|y = k)$, which is considerably simpler as these are univariate.
 
 \end{vbframe}
 
-\begin{vbframe}{NB: Numerical Features}
+\begin{vbframe}{Numerical Features}
 
 We use a univariate Gaussian for $p(x_j | y=k)$, and estimate $(\mu_{kj}, \sigma^2_{kj})$ in the standard manner.
 
 Because of $\pdfxyk = \prodjp p(x_j|y = k)$, the joint conditional density is Gaussian with diagonal but non-isotropic covariance structure, and potentially different across classes.
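
Reviewer note (not part of the diff): the revised "Numerical Features" slide describes fitting a univariate Gaussian per class and per feature and combining them under the conditional independence assumption. A minimal NumPy sketch of that construction is given below; it is only illustrative, and the class name and attributes (GaussianNaiveBayes, means_, vars_, priors_) are assumptions, not code from this repo.

```python
# Sketch of the Gaussian NB construction from the slides: per class k and
# feature j fit N(mu_kj, sigma2_kj), then multiply the univariate densities
# (conditional independence) and weight by the class prior pi_k.
import numpy as np

class GaussianNaiveBayes:
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.priors_ = np.array([np.mean(y == k) for k in self.classes_])        # pi_k
        self.means_ = np.array([X[y == k].mean(axis=0) for k in self.classes_])  # mu_kj
        self.vars_ = np.array([X[y == k].var(axis=0) for k in self.classes_])    # sigma^2_kj
        return self

    def predict_proba(self, X):
        # log p(x | y = k) = sum_j log N(x_j | mu_kj, sigma^2_kj)
        log_lik = np.stack([
            -0.5 * np.sum(np.log(2 * np.pi * v) + (X - m) ** 2 / v, axis=1)
            for m, v in zip(self.means_, self.vars_)
        ], axis=1)
        log_post = log_lik + np.log(self.priors_)           # + log pi_k
        log_post -= log_post.max(axis=1, keepdims=True)     # stabilize before exp
        post = np.exp(log_post)
        return post / post.sum(axis=1, keepdims=True)       # normalize over classes

    def predict(self, X):
        return self.classes_[np.argmax(self.predict_proba(X), axis=1)]
```

Because each class gets its own diagonal covariance, this is exactly the "special QDA" view mentioned in the learning goals: the resulting decision boundaries are quadratic, just with the off-diagonal covariance terms forced to zero.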