Empirical results showing that a generic convolutional model (the Temporal Convolutional Network; TCN) outperforms recurrent networks (RNNs) on several sequence modeling tasks.
Abstract
Convolutional networks should be regarded as a natural starting point for sequence modeling tasks
1. Introduction
Recurrent models are the conventional first approach to sequence modeling
However, there is research showing that convolutional models can reach state-of-the-art results on sequence tasks
This paper presents a single, general TCN architecture that is applied across all tasks
TCN is
simpler and clearer than canonical recurrent networks
combines elements of modern convolutional architectures (dilated convolutions, residual connections)
outperforms baseline recurrent architectures across a broad range of tasks
retains a longer effective memory/history than recurrent networks
3. Temporal Convolutional Networks
The paper aims for a simple yet powerful architecture
Characteristics of TCNs are
there is no information leakage from the future to the past
the architecture can take a sequence of any length and map it to an output sequence of the same length
uses residual layers and dilated convolutions
3.1. Sequence Modeling
The goal of learning in sequence modeling setting is to find a network f that minimizes some expected loss between the actual outputs and the predictions.
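Formally (Section 3.1 of the paper), a sequence modeling network is any function f : X^{T+1} → Y^{T+1} that obeys the causal constraint that the prediction ŷ_t depends only on x_0, ..., x_t. Training then seeks

$$
f^{*} = \arg\min_{f} \; \mathbb{E}_{(x,y)} \left[ L\left(y_0, \dots, y_T,\; f(x_0, \dots, x_T)\right) \right]
$$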
3.2. Causal Convolutions
The TCN is built on two principles
the output has the same length as the input
there is no leakage from the future into the past
To achieve the second principle, the TCN uses causal convolutions
convolutions where the output at time t is computed only from elements at time t and earlier in the previous layer
this is essentially the same idea as masked convolution (van den Oord et al., 2016)
TCN = 1D FCN + causal convolutions
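A minimal sketch of causal convolution via left-only padding (assuming PyTorch; `CausalConv1d` is an illustrative name, not from the paper's code):

```python
import torch
import torch.nn as nn

class CausalConv1d(nn.Module):
    """1D convolution whose output at time t sees only inputs at time <= t."""
    def __init__(self, in_ch, out_ch, kernel_size, dilation=1):
        super().__init__()
        # Pad (k-1)*d on both sides, then trim the right side in forward(),
        # so the effective padding is left-only (causal).
        self.pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(in_ch, out_ch, kernel_size,
                              padding=self.pad, dilation=dilation)

    def forward(self, x):  # x: (batch, channels, time)
        out = self.conv(x)
        return out[:, :, :-self.pad] if self.pad > 0 else out

# Output length equals input length, and no future positions are used.
x = torch.randn(1, 3, 10)
y = CausalConv1d(3, 8, kernel_size=2)(x)
assert y.shape == (1, 8, 10)
```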
3.3. Dilated Convolutions
A simple causal convolution can only look back over a history linear in the depth of the network
The paper employs dilated convolutions (van den Oord et al., 2016) to enable an exponentially large receptive field (Yu & Koltun, 2016)
The dilation factor grows exponentially with depth (d = 1, 2, 4, ...); see the receptive-field sketch below
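A minimal sketch of why this gives exponential history, assuming one dilated conv per level (the paper's residual blocks contain two convolutions each, which doubles the (k-1)·Σd term):

```python
# Receptive field of a stack of dilated causal convolutions with
# kernel size k and dilations d_i = 2**i:
#   R = 1 + (k - 1) * sum(d_i) = 1 + (k - 1) * (2**n - 1)
def receptive_field(kernel_size: int, num_layers: int) -> int:
    return 1 + (kernel_size - 1) * sum(2 ** i for i in range(num_layers))

print(receptive_field(3, 8))  # k=3, 8 levels -> 511 time steps of history
```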
3.4. Residual Connections
Each residual block contains two layers of dilated causal convolution, each followed by weight normalization, ReLU, and dropout; a 1x1 convolution on the skip path matches input/output widths when needed (see Figure 1 (b) and (c))
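A sketch of such a residual block (assuming PyTorch; `TemporalBlock` is an illustrative name, not the authors' reference implementation):

```python
import torch
import torch.nn as nn

class TemporalBlock(nn.Module):
    """Two dilated causal conv layers, each with weight norm, ReLU and
    dropout, plus a residual connection (cf. Figure 1(b) of the paper)."""
    def __init__(self, in_ch, out_ch, kernel_size, dilation, dropout=0.2):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation
        self.conv1 = nn.utils.weight_norm(nn.Conv1d(
            in_ch, out_ch, kernel_size, padding=self.pad, dilation=dilation))
        self.conv2 = nn.utils.weight_norm(nn.Conv1d(
            out_ch, out_ch, kernel_size, padding=self.pad, dilation=dilation))
        self.relu = nn.ReLU()
        self.drop = nn.Dropout(dropout)
        # 1x1 conv so the skip connection matches the output width.
        self.downsample = nn.Conv1d(in_ch, out_ch, 1) if in_ch != out_ch else None

    def _chomp(self, x):
        # Trim the right-side padding to keep the convolution causal.
        return x[:, :, :-self.pad] if self.pad > 0 else x

    def forward(self, x):  # x: (batch, channels, time)
        out = self.drop(self.relu(self._chomp(self.conv1(x))))
        out = self.drop(self.relu(self._chomp(self.conv2(out))))
        res = x if self.downsample is None else self.downsample(x)
        return self.relu(out + res)
```

Stacking these blocks with dilation 2**i at level i yields the full TCN.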
3.5. Discussion
Advantages
Parallelism
Flexible receptive field size
Stable gradients compared to RNNs
a TCN avoids exploding/vanishing gradients
because its backpropagation path differs from the temporal direction of the sequence
Low memory requirement for training
Variable length inputs
Disadvantages
Data storage during evaluation
an RNN only needs to keep a hidden state at evaluation time (less memory than during training), while a TCN must keep the raw sequence up to its effective history length
Potential parameter change for a transfer of domain
a TCN may not transfer well between domains
e.g., from a domain that requires little memory to one that requires long memory
because its receptive field may not be sufficiently large
Paper: https://arxiv.org/abs/1803.01271 (Bai, Kolter & Koltun, 2018)
This paper introduces the Temporal Convolutional Network (TCN)