
What is head_indexes_2d used for? #5

Open
Hanlard opened this issue Jan 8, 2020 · 2 comments

Comments


Hanlard commented Jan 8, 2020

x is the BERT output of shape [batch_size, SEQ_LEN, 768]. There is this line of code:

```python
for i in range(batch_size):
    x[i] = torch.index_select(x[i], 0, head_indexes_2d[i])
```

What is it doing?


meisin commented Jul 12, 2020

The words are tokenized with BERT's wordpiece tokenizer, so some words are split into more than one token, e.g. lamb -> 'la', '##mb'. `head_indexes_2d` captures the 'head' tokens; in other words, it holds the index of 'la' only, for downstream tasks such as token classification (e.g. POS tagging, NER).

`x[i] = torch.index_select(x[i], 0, head_indexes_2d[i])` selects only the head-token rows and ignores the subsequent pieces when a word is split into more than one token.

This is how I interpret the code. Please do correct me if I got this wrong, anyone? :D
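For concreteness, here is a minimal self-contained sketch of that interpretation. The hard-coded wordpiece splits and the names `words_as_pieces`, `SEQ_LEN`, and `HIDDEN` are illustrative assumptions rather than code from this repo, and a random tensor stands in for the BERT output; only `head_indexes_2d` and the `index_select` loop mirror the lines in the question:

```python
import torch

SEQ_LEN, HIDDEN = 8, 768
# Assumed wordpiece splits, e.g. "lamb" -> "la", "##mb"
words_as_pieces = [["the"], ["la", "##mb"], ["sleep", "##s"]]

head_indexes, n_pieces = [], 0
for pieces in words_as_pieces:
    head_indexes.append(n_pieces)   # index of the word's first ("head") piece
    n_pieces += len(pieces)
head_indexes += [0] * (SEQ_LEN - len(head_indexes))  # pad with zeros to SEQ_LEN

x = torch.randn(1, SEQ_LEN, HIDDEN)             # stand-in for the BERT output
head_indexes_2d = torch.tensor([head_indexes])  # shape [batch_size, SEQ_LEN]

batch_size = x.shape[0]
for i in range(batch_size):
    # Row j of x[i] becomes the head-piece vector of word j; the '##mb'
    # and '##s' rows are dropped, and padded positions repeat row 0.
    x[i] = torch.index_select(x[i], 0, head_indexes_2d[i])
```

Note that `head_indexes_2d` is padded to SEQ_LEN so the `index_select` result has the same shape as `x[i]` and the assignment works; the padded positions just repeat row 0 and are presumably ignored downstream.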

@scarydemon2

It is the index of the token corresponding to each word, because only each word's first token is kept for training. The rest are simply discarded.
