if trg_emb_prj_weight_sharing:
# Share the weight between target word embedding & last dense layer
self.trg_word_prj.weight = self.decoder.trg_word_emb.weight
if emb_src_trg_weight_sharing:
    # Share the weight between source and target word embeddings
    self.encoder.src_word_emb.weight = self.decoder.trg_word_emb.weight
The code above is meant to implement weight sharing, but I'm confused because the embedding layer and the linear layer seem to have weights of different shapes. How can this assignment work?
I just found the answer in the PyTorch docs (see the attached picture): for fc = nn.Linear(d_model, n_trg_vocab), the weight of fc actually has shape (n_trg_vocab, d_model)!
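In other words, nn.Linear stores its weight as (out_features, in_features), which matches nn.Embedding's (num_embeddings, embedding_dim). Here is a minimal sketch checking this, with hypothetical sizes (d_model and n_trg_vocab below are just example values, not taken from the repo):

    import torch.nn as nn

    d_model, n_trg_vocab = 512, 32000  # hypothetical example sizes

    emb = nn.Embedding(n_trg_vocab, d_model)           # weight shape: (n_trg_vocab, d_model)
    prj = nn.Linear(d_model, n_trg_vocab, bias=False)  # weight shape: (n_trg_vocab, d_model)

    print(emb.weight.shape)  # torch.Size([32000, 512])
    print(prj.weight.shape)  # torch.Size([32000, 512])

    # Shapes match, so tying the weights is a plain assignment:
    prj.weight = emb.weight
    assert prj.weight is emb.weight  # both layers now share one Parameter

So the assignment in the model works because the projection layer's weight is already stored transposed relative to how the layer is written, and after the assignment both modules point to the same Parameter object.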