
In patch_trg, I can't understand why you change the data shape like that #205

Open
kwanhoP opened this issue Mar 8, 2023 · 3 comments

Comments


kwanhoP commented Mar 8, 2023

My dataset is laid out horizontally (one sample per row), so I didn't use transpose(0, 1).

I changed your code as follows:

def patch_trg(trg, pad_idx):
    # trg is already batch-first here, (batch_size, seq_len), so no transpose is needed
    trg, gold = trg[:, :-1], trg[:, 1:].contiguous().view(-1)
    return trg, gold

My dataset looks like this:
sample_1 = bos, 346, 32, 124, 214, eos
sample_2 = bos, 346, 124, 214, eos
...
sample_N = bos, 346, 32, 32, 32, 124, 214, eos

Every sample has a different length.

So this is my question.
If I run your code, the eos token of the longest sample is deleted when the trg parameter is built.
That means that in every batch, the longest sample is trained without its eos token.

So I want to know the actual role of this code (trg[:, :-1] and trg[:, 1:]).

I think gold is built that way to get rid of the bos token, but I don't understand the trg part.


Gi-gigi commented Mar 19, 2023

Hi bro, how did you get the program to work? The dataset doesn't download, and the preprocess.py file doesn't work.

Author

kwanhoP commented Mar 28, 2023

@Gi-gigi Actually, I didn't use the dataset that @jadore801120 prepared. I just used my own dataset, so I had to change some things in preprocess.py. Basically, I customized the transformer code.


TIanCat commented Jun 16, 2023

Perhaps I can answer your question.
'trg' is used as the input to the decoder, and at each position the decoder predicts the next word from the words it has seen so far. 'gold' is used as the label for each predicted word. Let me give you an example.
sample_1 =
trg: bos, 346, 32, 124, 214
gold: 346, 32, 124, 214, eos

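So the 'eos' in trg is meaningless as a decoder input, because nothing has to be predicted after it, and the loss function does not include a prediction made from it. The 'eos' itself is still learned, since it remains the last label in gold.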
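The shift can be checked on a small padded batch. This is only a sketch: plain Python lists stand in for tensors so that it runs without PyTorch, the ids chosen for BOS, EOS and PAD are hypothetical, the helper names pad_batch and patch_trg_lists are made up for illustration, and it assumes the batch is right-padded and that the loss skips positions whose label is the pad index (for example via an ignore_index setting).

```python
# Hypothetical token ids, chosen only for illustration.
BOS, EOS, PAD = 1, 2, 0


def pad_batch(samples, pad_idx):
    """Right-pad every sample to the length of the longest one."""
    max_len = max(len(s) for s in samples)
    return [s + [pad_idx] * (max_len - len(s)) for s in samples]


def patch_trg_lists(batch):
    """List-based stand-in for trg[:, :-1] and trg[:, 1:].contiguous().view(-1)."""
    trg = [row[:-1] for row in batch]                 # decoder input: drop the last column
    gold = [tok for row in batch for tok in row[1:]]  # labels: drop the first column, flatten
    return trg, gold


samples = [
    [BOS, 346, 32, 124, 214, EOS],
    [BOS, 346, 124, 214, EOS],
    [BOS, 346, 32, 32, 32, 124, 214, EOS],  # longest sample in the batch
]
batch = pad_batch(samples, PAD)
trg, gold = patch_trg_lists(batch)

# Un-flatten gold again so each row can be compared with its decoder input.
seq_len = len(batch[0]) - 1
gold_rows = [gold[i * seq_len:(i + 1) * seq_len] for i in range(len(batch))]

for inp, lab in zip(trg, gold_rows):
    # Keep only the (input, label) pairs that a pad-ignoring loss would score.
    scored = [(x, y) for x, y in zip(inp, lab) if y != PAD]
    print(scored)
```

For the longest sample, EOS disappears from the decoder input but is still the final label in gold. For the shorter samples, EOS stays in the input, but the label at that position is PAD, so under the pad-ignoring assumption it contributes nothing to the loss.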
