
Are customized loss functions/layers supported? #119

Open
zzpustc opened this issue Dec 22, 2022 · 2 comments

zzpustc commented Dec 22, 2022
Hi!

Thanks for the great work. I'm trying to use the Laplace approximation in my own project, but your package only supports the MSE and cross-entropy losses, and only layers that are standard nn.Module subclasses. Is there any way to use your package with customized loss functions/layers?

Best

aleximmer (Owner) commented

Hi,

Thanks for your interest. Unfortunately, only basic nn.Module layers are supported, since that is what allows us to compute Hessian approximations (see, for example, the extensions that were necessary in ASDL). The same applies to losses.

However, in some cases it is not that complicated to extend the corresponding backend, so if you have a specific use case, we can try to suggest how to get it done with the help of the library, if possible.

wiseodd (Collaborator) commented Mar 12, 2024

The CurvlinopsGGN/EF backends with the diagonal Hessian structure should be able to handle non-standard layers (as long as their parameters are included in model.parameters()), since they are implemented purely with torch.func; see the rough sketch below.
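
For instance, something along these lines should work. This is an untested sketch: it assumes CurvlinopsGGN is importable from laplace.curvature, that the usual Laplace(...) factory accepts the backend keyword, and train_loader is a placeholder for a standard PyTorch DataLoader.

import torch
import torch.nn as nn
from laplace import Laplace
from laplace.curvature import CurvlinopsGGN

class CustomBlock(nn.Module):
    # Not a standard layer: the forward pass is hand-written and the weights are
    # plain nn.Parameters, but they still show up in model.parameters().
    def __init__(self, d_in, d_out):
        super().__init__()
        self.W = nn.Parameter(0.1 * torch.randn(d_out, d_in))
        self.b = nn.Parameter(torch.zeros(d_out))

    def forward(self, x):
        return torch.tanh(x @ self.W.T + self.b)

model = nn.Sequential(CustomBlock(10, 20), nn.Linear(20, 2))

la = Laplace(model, likelihood='classification',
             subset_of_weights='all', hessian_structure='diag',
             backend=CurvlinopsGGN)
la.fit(train_loader)  # train_loader: your usual DataLoader of (x, y) batches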

For custom loss functions, the requirement is that they correspond to a log-likelihood. Then we need to know how to sample from it, or how to compute the second derivative w.r.t. the network's output. E.g., here:

def _get_mc_functional_fisher(self, f):
    """Approximate the Fisher's middle matrix (expected outer product of the functional
    gradient) using an MC integral with `self.num_samples` many samples.
    """
    F = 0

    for _ in range(self.num_samples):
        if self.likelihood == 'regression':
            y_sample = f + torch.randn(f.shape, device=f.device)  # N(y | f, 1)
            grad_sample = f - y_sample  # functional MSE grad
        else:  # classification with softmax
            y_sample = torch.distributions.Multinomial(logits=f).sample()
            # First functional derivative of the loglik is p - y
            p = torch.softmax(f, dim=-1)
            grad_sample = p - y_sample

        F += 1 / self.num_samples * torch.einsum('bc,bk->bck', grad_sample, grad_sample)

    return F

We plan to support more likelihoods, e.g. BCE (#130), after milestone 0.2.
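
For a Bernoulli (BCE) likelihood, the same recipe would look roughly like the standalone sketch below. This is only an illustration, not the library's implementation: the function name mc_functional_fisher_bernoulli is made up, and f is assumed to be the logits with shape (batch, 1).

import torch

def mc_functional_fisher_bernoulli(f, num_samples=10):
    """MC estimate of the functional Fisher for a Bernoulli likelihood (sketch only)."""
    F = 0
    p = torch.sigmoid(f)  # Bernoulli mean, i.e. the predicted probability of class 1
    for _ in range(num_samples):
        y_sample = torch.distributions.Bernoulli(probs=p).sample()
        # Functional gradient of the negative log-lik (BCE) w.r.t. the logit f is p - y
        grad_sample = p - y_sample
        F += 1 / num_samples * torch.einsum('bc,bk->bck', grad_sample, grad_sample)
    return F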

wiseodd added this to the 0.3 milestone Jul 8, 2024