
What is head_indexes_2d used for? #5

Open
Hanlard opened this issue Jan 8, 2020 · 2 comments

Comments


Hanlard commented Jan 8, 2020

x is the BERT output of shape [batch_size, SEQ_LEN, 768]. There is this line of code:

```python
for i in range(batch_size):
    x[i] = torch.index_select(x[i], 0, head_indexes_2d[i])
```

What is it doing?


meisin commented Jul 12, 2020

The words are tokenized with BERT's wordpiece tokenizer, so some words are split into more than one token, e.g. lamb -> 'la', '##mb'. `head_indexes_2d` captures the 'head' tokens; in other words, it holds the index of 'la' only, for downstream tasks such as token classification (e.g. POS tagging, NER).

`x[i] = torch.index_select(x[i], 0, head_indexes_2d[i])` selects only the head-token rows and ignores the subsequent pieces when a word is split into more than one token.

This is how I interpret the code. Please do correct me if I got this wrong, anyone? :D
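For concreteness, here is a minimal self-contained sketch of that interpretation. The hard-coded wordpiece splits and the names `words_as_pieces`, `SEQ_LEN`, and `HIDDEN` are illustrative assumptions rather than code from this repo, and a random tensor stands in for the BERT output; only `head_indexes_2d` and the `index_select` loop mirror the lines in the question:

```python
import torch

SEQ_LEN, HIDDEN = 8, 768
# Assumed wordpiece splits, e.g. "lamb" -> "la", "##mb"
words_as_pieces = [["the"], ["la", "##mb"], ["sleep", "##s"]]

head_indexes, n_pieces = [], 0
for pieces in words_as_pieces:
    head_indexes.append(n_pieces)   # index of the word's first ("head") piece
    n_pieces += len(pieces)
head_indexes += [0] * (SEQ_LEN - len(head_indexes))  # pad with zeros to SEQ_LEN

x = torch.randn(1, SEQ_LEN, HIDDEN)             # stand-in for the BERT output
head_indexes_2d = torch.tensor([head_indexes])  # shape [batch_size, SEQ_LEN]

batch_size = x.shape[0]
for i in range(batch_size):
    # Row j of x[i] becomes the head-piece vector of word j; the '##mb'
    # and '##s' rows are dropped, and padded positions repeat row 0.
    x[i] = torch.index_select(x[i], 0, head_indexes_2d[i])
```

Note that `head_indexes_2d` is padded to SEQ_LEN so the `index_select` result has the same shape as `x[i]` and the assignment works; the padded positions just repeat row 0 and are presumably ignored downstream.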

@scarydemon2

It is the index of the token corresponding to each word, because only each word's first token is kept for training. The rest are simply discarded.
