\chapter{Introduction}
\yinipar{\fontsize{60pt}{72pt}\usefont{U}{Kramer}{xl}{n}T}his note aims to present the three most common forms of neural network architectures. It does so in a technical though hopefully pedagogical way, building up in complexity as one progresses through the chapters.
\vspace{0.2cm}
Chapter \ref{sec:chapterFNN} starts with the first type of network introduced historically: a regular feedforward neural network, itself an evolution of the original perceptron \cite{Rosenblatt58theperceptron:} algorithm. One should see the latter as a non-linear regression, and feedforward networks schematically stack perceptron layers on top of one another.
\vspace{0.2cm}
We will thus introduce in chapter \ref{sec:chapterFNN} the fundamental building blocks of the simplest neural network layers: weight averaging and activation functions. We will also introduce gradient descent, combined with the backpropagation algorithm, as a way to train the network by minimizing a loss function adapted to the task at hand (classification or regression). The more technical details of the backpropagation algorithm can be found in the appendix of this chapter, alongside an introduction to the state-of-the-art feedforward network, the ResNet. There one can also find a short matrix description of the feedforward network.
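\vspace{0.2cm}
As a schematic preview, in generic notation (the precise conventions are fixed in chapter \ref{sec:chapterFNN}), a feedforward layer performs a weight averaging of its input $x$ followed by an activation function $g$,
\begin{equation*}
h = g\left(Wx + b\right),
\end{equation*}
where the weight matrix $W$ and the bias $b$ are the parameters to be learned. Gradient descent then nudges each parameter against the gradient of the loss function $J$,
\begin{equation*}
W \longleftarrow W - \eta\, \frac{\partial J}{\partial W}\;,
\end{equation*}
with $\eta$ a small learning rate, the gradient itself being computed by backpropagation.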
\vspace{0.2cm}
In chapter \ref{sec:chapterCNN}, we present the second type of neural network studied: the convolutional network, particularly suited to processing and labeling images. This calls for presenting the mathematical tools specific to this architecture (convolution, pooling, stride), as well as the modifications of the building blocks introduced in chapter \ref{sec:chapterFNN}. Several convolutional architectures are then presented, and the appendices once again detail the difficult steps of the main text.
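\vspace{0.2cm}
Again as a schematic preview in generic notation (chapter \ref{sec:chapterCNN} fixes the exact conventions), a convolution slides a small weight kernel $w$ across the input feature map $x$ with step size (stride) $s$,
\begin{equation*}
y_{ij} = g\Big(\sum_{m}\sum_{n} w_{mn}\, x_{si+m,\,sj+n} + b\Big)\;,
\end{equation*}
while pooling replaces each small patch of the feature map by a summary value, for instance $y_{ij} = \max_{m,n} x_{si+m,\,sj+n}$ for max pooling.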
\vspace{0.2cm}
Chapter \ref{sec:chapterRNN} finally presents the network architecture suited to data with a temporal structure, such as time series: the recurrent neural network. There again, the novelties and modifications with respect to the material introduced in the two previous chapters are detailed in the main text, while the appendices give everything one needs to understand the most cumbersome formulas of this kind of network architecture.
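\vspace{0.2cm}
Schematically, and once more in generic notation made precise in chapter \ref{sec:chapterRNN}, a recurrent layer carries a hidden state $h_t$ updated at each time step $t$ from the current input $x_t$ and the previous state,
\begin{equation*}
h_t = g\left(W x_t + U h_{t-1} + b\right)\;,
\end{equation*}
the weight matrices $W$ and $U$ being shared across time steps, which is what allows the network to handle sequences of arbitrary length.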