I've read the dkpd paper, and the experimental results show that dkpd works, but I don't really see why DPO should be applied to KD in the first place, or how it is supposed to improve on the traditional KLD or reverse KLD methods. Could you explain that to me?