The paper says, "...the same weight matrix is shared between the two embedding layers..." referring to the encoder and decoder embedding layers respectively. However, in the lines below I can see that the encoder initializes its own embedding matrix, separate from the one in the decoder. Can you explain why this is so?
Referenced lines:

- attention-is-all-you-need-pytorch/transformer/Models.py, line 57 (commit 132907d)
- attention-is-all-you-need-pytorch/transformer/Models.py, line 96 (commit 132907d)
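For context, weight sharing between the two embedding layers is usually enforced *after* both modules have been constructed, by pointing one table's weight at the other. The sketch below illustrates that pattern in PyTorch; it is a minimal illustration, and the names (`TinyTransformer`, `src_word_emb`, `trg_word_emb`, `share_emb`) are assumptions, not necessarily what this repository uses:

```python
import torch.nn as nn


class TinyTransformer(nn.Module):
    """Minimal sketch of encoder/decoder embedding weight sharing.

    Assumes a shared source/target vocabulary; attribute names
    (src_word_emb, trg_word_emb) are illustrative only.
    """

    def __init__(self, n_vocab, d_model, share_emb=True):
        super().__init__()
        # Each side first builds its own embedding table ...
        self.src_word_emb = nn.Embedding(n_vocab, d_model)
        self.trg_word_emb = nn.Embedding(n_vocab, d_model)
        if share_emb:
            # ... and sharing is then enforced by re-pointing one
            # table's weight Parameter at the other, so both layers
            # read and update the same underlying tensor.
            self.trg_word_emb.weight = self.src_word_emb.weight
```

If the repository exposes a sharing option on its top-level `Transformer` wrapper, a separate `nn.Embedding` inside the encoder would be reconciled in this way: the independently initialized tensor is simply replaced by a reference to the shared one, so only a single weight matrix is actually trained.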