
about the DR4SR+ training process #9

Open
ComeFromTheMars opened this issue Jan 8, 2025 · 1 comment

@ComeFromTheMars

I need to train DR4SR+ using SASRec, but these tensors have no gradient during training (`.grad` prints `None`). Is something wrong?
```python
print(meta_loss.grad)
print(meta_train_loss.grad)
self.meta_optimizer.step(
    val_loss=meta_loss,
    train_loss=meta_train_loss,
    aux_params=list(self.meta_module.parameters()),
    parameters=list(self.sub_model.parameters()),
    return_grads=False,
)
```

@ShadowTinker
Contributor

It is normal for meta_loss and meta_train_loss in DR4SR+ to have no gradients. In the bi-level optimization framework, DR4SR+ does not update the model through conventional gradient descent (i.e., calling `backward()` on the loss and letting the optimizer read the resulting `.grad` attributes). Instead, it computes implicit gradients (hypergradients) and uses them to update the parameters directly. For more details, please refer to the step code in the MetaOptimizer:

DR4SR/utils/utils.py, lines 235 to 240 (commit 67aa7ad):

```python
hyper_gards = self.hypergrad.grad(
    loss_val=val_loss,
    loss_train=train_loss,
    aux_params=aux_params,
    params=parameters
)
```
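
To make "implicit gradient calculations" concrete: in bi-level optimization, the hypergradient of the validation loss with respect to the meta parameters can be derived from the implicit function theorem, with the inverse Hessian approximated by, for example, a truncated Neumann series. The sketch below is not the `hypergrad` implementation used in this repository; it is a minimal, self-contained illustration of that general technique, and the names (`approx_inverse_hvp`, `implicit_hypergrad`) and hyperparameters are purely illustrative:

```python
import torch

def approx_inverse_hvp(train_loss, params, v, k=5, alpha=0.1):
    """Approximate H^{-1} v with a truncated Neumann series, where H is the
    Hessian of train_loss w.r.t. params:
    H^{-1} ≈ alpha * sum_{j=0..k} (I - alpha * H)^j."""
    grads = torch.autograd.grad(train_loss, params, create_graph=True)
    p = [vi.clone() for vi in v]
    cur = [vi.clone() for vi in v]
    for _ in range(k):
        # Hessian-vector product H @ cur via double backprop
        hvp = torch.autograd.grad(grads, params, grad_outputs=cur,
                                  retain_graph=True, allow_unused=True)
        hvp = [torch.zeros_like(c) if h is None else h for h, c in zip(hvp, cur)]
        cur = [c - alpha * h for c, h in zip(cur, hvp)]
        p = [pi + c for pi, c in zip(p, cur)]
    return [alpha * pi for pi in p]

def implicit_hypergrad(val_loss, train_loss, params, aux_params):
    """Hypergradient of val_loss w.r.t. aux_params, exploiting the implicit
    dependence of the (locally optimal) params on aux_params."""
    # d val_loss / d params
    v1 = torch.autograd.grad(val_loss, params, retain_graph=True)
    # direct term d val_loss / d aux_params (often zero, because val_loss
    # usually depends on aux_params only through params)
    direct = torch.autograd.grad(val_loss, aux_params, allow_unused=True)
    # p ≈ H^{-1} (d val_loss / d params)
    p = approx_inverse_hvp(train_loss, params, v1)
    # mixed term: d/d aux_params [ (d train_loss / d params) · p ]
    dtrain_dw = torch.autograd.grad(train_loss, params, create_graph=True)
    dot = sum((g * pi).sum() for g, pi in zip(dtrain_dw, p))
    mixed = torch.autograd.grad(dot, aux_params, allow_unused=True)
    hyper = []
    for d, m, a in zip(direct, mixed, aux_params):
        d = torch.zeros_like(a) if d is None else d
        m = torch.zeros_like(a) if m is None else m
        hyper.append(d - m)
    return hyper
```

Note that nothing here ever calls `backward()`, so no `.grad` attribute is populated as a side effect; the hypergradients are simply returned as plain tensors.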

In the repository snippet above, you will notice that hyper_grads are computed from meta_loss and meta_train_loss (passed in as val_loss and train_loss), and these hypergradients are then bound to the parameters that need optimization:

DR4SR/utils/utils.py, lines 242 to 243 (commit 67aa7ad):

```python
for p, g in zip(aux_params, hyper_gards):
    p.grad = g
```

Finally, through the step operation, the corresponding parameters are updated using the bound hyper_grads (p = p - lr * g):

```python
self.meta_optimizer.step()
```
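
As for why `meta_loss.grad` prints `None` even though training works: PyTorch only populates `.grad` on leaf tensors, and only when `backward()` is called; a loss is a non-leaf tensor, and, judging from the snippets above, the MetaOptimizer writes the hypergradients into `p.grad` itself rather than relying on `backward()`. Below is a minimal, self-contained sketch of this "assign `.grad` manually, then call `optimizer.step()`" pattern, with toy names and values that are not taken from this repository:

```python
import torch

# A toy "meta" parameter, standing in for one of the aux_params of the meta module.
aux_param = torch.nn.Parameter(torch.tensor([1.0, 2.0]))
optimizer = torch.optim.SGD([aux_param], lr=0.1)

# A loss that depends on the parameter.
meta_loss = (aux_param ** 2).sum()

# Compute the gradient explicitly instead of calling meta_loss.backward().
(g,) = torch.autograd.grad(meta_loss, [aux_param])

print(meta_loss.grad)  # None (with a warning): meta_loss is a non-leaf tensor
print(aux_param.grad)  # None: autograd.grad returns gradients, it does not populate .grad

# Bind the externally computed gradient, like `p.grad = g` in MetaOptimizer.step,
# then let the optimizer apply p = p - lr * g.
aux_param.grad = g
optimizer.step()

print(aux_param)  # the parameter is now [0.8, 1.6], so the update happened anyway
```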
