This is the paper presentation repo for "Attention Is All You Need".
- The paper proposes a new network architecture, the Transformer, based solely on attention mechanisms. The Transformer is more parallelizable and requires less time to train, demonstrated on the WMT machine translation benchmarks.
- The Transformer contains an Encoder and a Decoder. Each of them contains a stack of 6 identical layers.
- For the Encoder, each layer has 2 sub-layers. The first is a multi-head self-attention mechanism, and the second is a simple, position-wise fully connected feed-forward network (see the layer sketch after this list).
- The Decoder is also composed of 6 identical layers. In addition to the two sub-layers in each encoder layer, the decoder inserts a third sub-layer, which performs multi-head attention over the output of the encoder stack.
- An attention function can be described as mapping a query and a set of key-value pairs to an output, where the query, keys, values, and output are all vectors. The Transformer model in this paper uses "Scaled Dot-Product Attention" (a minimal sketch follows below).
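The paper's Scaled Dot-Product Attention is Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. Below is a minimal PyTorch sketch of that formula; the tensor shapes and the `mask` argument are illustration choices of mine, not code from the paper.

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    # q, k, v: (batch, heads, seq_len, d_k)
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5              # similarity of each query with each key
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))  # blocked positions get ~zero weight
    weights = F.softmax(scores, dim=-1)                        # attention distribution over the keys
    return weights @ v, weights                                # weighted sum of values, plus the weights
```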
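And a rough sketch of the layer structure described above (two sub-layers in the encoder, three in the decoder), using PyTorch's built-in `nn.MultiheadAttention` for brevity. The residual-plus-LayerNorm wrapping and the defaults (d_model=512, 8 heads, d_ff=2048) follow the paper's base configuration; everything else (no dropout, no positional encoding) is a simplification, not the authors' code.

```python
import torch
import torch.nn as nn

class EncoderLayer(nn.Module):
    """One encoder layer: (1) multi-head self-attention, (2) position-wise feed-forward,
    each wrapped in a residual connection followed by layer normalization."""
    def __init__(self, d_model=512, num_heads=8, d_ff=2048):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):                                # x: (batch, src_len, d_model)
        attn_out, _ = self.self_attn(x, x, x)            # sub-layer 1: self-attention
        x = self.norm1(x + attn_out)                     # residual + LayerNorm
        x = self.norm2(x + self.ffn(x))                  # sub-layer 2: feed-forward
        return x

class DecoderLayer(nn.Module):
    """One decoder layer adds a third sub-layer: multi-head attention over the
    encoder stack's output (queries from the decoder, keys/values from the encoder)."""
    def __init__(self, d_model=512, num_heads=8, d_ff=2048):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.norm3 = nn.LayerNorm(d_model)

    def forward(self, x, enc_out, tgt_mask=None):
        # tgt_mask: optional mask (nn.MultiheadAttention treats True entries as blocked)
        attn_out, _ = self.self_attn(x, x, x, attn_mask=tgt_mask)   # masked self-attention
        x = self.norm1(x + attn_out)
        attn_out, _ = self.cross_attn(x, enc_out, enc_out)          # attention over encoder output
        x = self.norm2(x + attn_out)
        x = self.norm3(x + self.ffn(x))
        return x
```

For example, `EncoderLayer()(torch.randn(2, 10, 512))` returns a tensor of the same shape, and stacking 6 of these layers gives the encoder described above.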
Why does the Masked Multi-Head Attention need a mask? (A small mask sketch follows the questions below.)
Why is there Multi-Head Attention?
Why use attention at all?
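On the first question: the decoder's self-attention is "masked" so that position i can only attend to positions up to i, which preserves the auto-regressive property during training. Below is a tiny sketch of such a causal (look-ahead) mask; building it with `torch.tril` is my own illustration. Passed as `mask` to the `scaled_dot_product_attention` sketch above, the `False` entries are set to -inf before the softmax.

```python
import torch

# Causal mask for a 4-token target sequence: row i is the query position,
# and only columns j <= i are True (allowed), so no token can "see" the future.
seq_len = 4
causal = torch.tril(torch.ones(seq_len, seq_len)).bool()
print(causal)
# tensor([[ True, False, False, False],
#         [ True,  True, False, False],
#         [ True,  True,  True, False],
#         [ True,  True,  True,  True]])
```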
- The paper proposed a new model architecture, but it is only evaluated on translation tasks.
- I think every part of the Transformer could be explained in more detail.
- The Transformer in this paper contains 2 parts, the encoder and the decoder. They can also be used separately, as in BERT (encoder-only) and GPT (decoder-only).
- I think the attention mechanism could make deep learning models more explainable.
- Original Article: https://arxiv.org/abs/1706.03762v5
- A TensorFlow tutorial with Transformer code: https://www.tensorflow.org/text/tutorials/transformer
- The Illustrated Transformer, a very intuitive explanation: http://jalammar.github.io/illustrated-transformer/
Link to the notebook: https://colab.research.google.com/github/bentrevett/pytorch-seq2seq/blob/master/6%20-%20Attention%20is%20All%20You%20Need.ipynb#scrollTo=8hOLjW7rJKJL
Link to video recording: https://drive.google.com/file/d/1td5ZYbuOeuZ8Hta0rR_TuCtCNS9CQTWB/view?usp=sharing