Is it correct that the embedding layers for tokens and positions have the same input size, equal to vocab_size? E.g., token_embedding_layer = torch.nn.Embedding(vocab_size, output_dim)
Good question. No, the input size of the positional embedding layer should be equal to the maximum context length, which is usually smaller than the vocabulary size. E.g., for GPT-2 that would be 1024, while for modern LLMs it's usually somewhere above 2048; I think in the recent GPT-4 models it's >100k now. I will modify the code to use a separate parameter to make this clearer, e.g.,
token_embedding_layer = torch.nn.Embedding(vocab_size, output_dim)
pos_embedding_layer = torch.nn.Embedding(context_len, output_dim)
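To illustrate how the two layers fit together, here's a minimal sketch; the concrete values for vocab_size, context_len, output_dim, and the batch of token IDs are just placeholders for illustration:

import torch

vocab_size = 50257   # tokenizer vocabulary size (GPT-2-like, illustrative)
context_len = 1024   # maximum context length (GPT-2)
output_dim = 256     # embedding dimension (illustrative)

token_embedding_layer = torch.nn.Embedding(vocab_size, output_dim)
pos_embedding_layer = torch.nn.Embedding(context_len, output_dim)

# A batch of token IDs with shape (batch_size, seq_len)
token_ids = torch.randint(0, vocab_size, (8, 4))

token_embeddings = token_embedding_layer(token_ids)                     # (8, 4, 256)
pos_embeddings = pos_embedding_layer(torch.arange(token_ids.shape[1]))  # (4, 256)

# Positional embeddings broadcast across the batch dimension
input_embeddings = token_embeddings + pos_embeddings                    # (8, 4, 256)
print(input_embeddings.shape)  # torch.Size([8, 4, 256])

So the token embedding layer has one row per vocabulary entry, while the positional embedding layer only needs one row per position up to the maximum context length.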