
about the DR4SR+ training process #9

Open
ComeFromTheMars opened this issue Jan 8, 2025 · 1 comment

@ComeFromTheMars

I need to train DR4SR+ using SASRec, but these tensors have no gradient during training (`.grad` prints `None`). Is something wrong?
```python
print(meta_loss.grad)
print(meta_train_loss.grad)
self.meta_optimizer.step(
    val_loss=meta_loss,
    train_loss=meta_train_loss,
    aux_params=list(self.meta_module.parameters()),
    parameters=list(self.sub_model.parameters()),
    return_grads=False,
)
```

@ShadowTinker
Contributor

It is normal for meta_loss and meta_train_loss in DR4SR+ to have no gradients. In the bi-level optimization framework, DR4SR+ does not update the model through conventional gradient descent (i.e., calling `backward()` on the loss and letting the optimizer read the resulting `.grad` attributes). Instead, it computes implicit gradients (hypergradients) and uses them to update the parameters directly. For more details, please refer to the step code in the MetaOptimizer:

DR4SR/utils/utils.py, lines 235 to 240 (commit 67aa7ad):

```python
hyper_gards = self.hypergrad.grad(
    loss_val=val_loss,
    loss_train=train_loss,
    aux_params=aux_params,
    params=parameters
)
```
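
To make "implicit gradient calculations" concrete: in bi-level optimization, the hypergradient of the validation loss with respect to the meta parameters can be derived from the implicit function theorem, with the inverse Hessian approximated by, for example, a truncated Neumann series. The sketch below is not the `hypergrad` implementation used in this repository; it is a minimal, self-contained illustration of that general technique, and the names (`approx_inverse_hvp`, `implicit_hypergrad`) and hyperparameters are purely illustrative:

```python
import torch

def approx_inverse_hvp(train_loss, params, v, k=5, alpha=0.1):
    """Approximate H^{-1} v with a truncated Neumann series, where H is the
    Hessian of train_loss w.r.t. params:
    H^{-1} ≈ alpha * sum_{j=0..k} (I - alpha * H)^j."""
    grads = torch.autograd.grad(train_loss, params, create_graph=True)
    p = [vi.clone() for vi in v]
    cur = [vi.clone() for vi in v]
    for _ in range(k):
        # Hessian-vector product H @ cur via double backprop
        hvp = torch.autograd.grad(grads, params, grad_outputs=cur,
                                  retain_graph=True, allow_unused=True)
        hvp = [torch.zeros_like(c) if h is None else h for h, c in zip(hvp, cur)]
        cur = [c - alpha * h for c, h in zip(cur, hvp)]
        p = [pi + c for pi, c in zip(p, cur)]
    return [alpha * pi for pi in p]

def implicit_hypergrad(val_loss, train_loss, params, aux_params):
    """Hypergradient of val_loss w.r.t. aux_params, exploiting the implicit
    dependence of the (locally optimal) params on aux_params."""
    # d val_loss / d params
    v1 = torch.autograd.grad(val_loss, params, retain_graph=True)
    # direct term d val_loss / d aux_params (often zero, because val_loss
    # usually depends on aux_params only through params)
    direct = torch.autograd.grad(val_loss, aux_params, allow_unused=True)
    # p ≈ H^{-1} (d val_loss / d params)
    p = approx_inverse_hvp(train_loss, params, v1)
    # mixed term: d/d aux_params [ (d train_loss / d params) · p ]
    dtrain_dw = torch.autograd.grad(train_loss, params, create_graph=True)
    dot = sum((g * pi).sum() for g, pi in zip(dtrain_dw, p))
    mixed = torch.autograd.grad(dot, aux_params, allow_unused=True)
    hyper = []
    for d, m, a in zip(direct, mixed, aux_params):
        d = torch.zeros_like(a) if d is None else d
        m = torch.zeros_like(a) if m is None else m
        hyper.append(d - m)
    return hyper
```

Note that nothing here ever calls `backward()`, so no `.grad` attribute is populated as a side effect; the hypergradients are simply returned as plain tensors.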

In the repository snippet above, you will notice that hyper_grads are computed from meta_loss and meta_train_loss (passed in as val_loss and train_loss), and these hypergradients are then bound to the parameters that need optimization:

DR4SR/utils/utils.py, lines 242 to 243 (commit 67aa7ad):

```python
for p, g in zip(aux_params, hyper_gards):
    p.grad = g
```

Finally, through the step operation, the corresponding parameters are updated using the bound hyper_grads (p = p - lr * g):

```python
self.meta_optimizer.step()
```
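
As for why `meta_loss.grad` prints `None` even though training works: PyTorch only populates `.grad` on leaf tensors, and only when `backward()` is called; a loss is a non-leaf tensor, and, judging from the snippets above, the MetaOptimizer writes the hypergradients into `p.grad` itself rather than relying on `backward()`. Below is a minimal, self-contained sketch of this "assign `.grad` manually, then call `optimizer.step()`" pattern, with toy names and values that are not taken from this repository:

```python
import torch

# A toy "meta" parameter, standing in for one of the aux_params of the meta module.
aux_param = torch.nn.Parameter(torch.tensor([1.0, 2.0]))
optimizer = torch.optim.SGD([aux_param], lr=0.1)

# A loss that depends on the parameter.
meta_loss = (aux_param ** 2).sum()

# Compute the gradient explicitly instead of calling meta_loss.backward().
(g,) = torch.autograd.grad(meta_loss, [aux_param])

print(meta_loss.grad)  # None (with a warning): meta_loss is a non-leaf tensor
print(aux_param.grad)  # None: autograd.grad returns gradients, it does not populate .grad

# Bind the externally computed gradient, like `p.grad = g` in MetaOptimizer.step,
# then let the optimizer apply p = p - lr * g.
aux_param.grad = g
optimizer.step()

print(aux_param)  # the parameter is now [0.8, 1.6], so the update happened anyway
```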
