Some original ideas will likely be needed, but in general this should be achievable by measuring the dot-product similarities between F1 and F2 (and softmaxing them) to see which tokens the model is attending to, then factoring in the querying aspect.
A good entry point would be to reproduce a variation of the pairwise attention diagrams found in the appendix of "Attention Is All You Need".
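As a rough sketch of the core computation (assuming F1 and F2 are token feature matrices of shape `[num_tokens, d_model]`, and using random data in place of real model activations), the softmaxed dot-product weights could be computed and plotted something like this:

```python
# Minimal sketch: pairwise attention heatmap from two feature matrices.
# F1/F2 are assumed to be query- and key-side token features; the tokens
# and random data below are hypothetical stand-ins for real activations.
import numpy as np
import matplotlib.pyplot as plt

def attention_weights(F1, F2):
    """Softmaxed (scaled) dot-product similarities between token features."""
    scores = F1 @ F2.T / np.sqrt(F1.shape[-1])    # scaled dot products
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(scores)
    return exp / exp.sum(axis=-1, keepdims=True)  # each row sums to 1

# Hypothetical usage: which tokens in `tokens2` does each token in
# `tokens1` attend to?
tokens1 = ["the", "cat", "sat"]
tokens2 = ["le", "chat", "s'est", "assis"]
rng = np.random.default_rng(0)
F1 = rng.normal(size=(len(tokens1), 64))
F2 = rng.normal(size=(len(tokens2), 64))

W = attention_weights(F1, F2)
plt.imshow(W, cmap="viridis")
plt.xticks(range(len(tokens2)), tokens2, rotation=45)
plt.yticks(range(len(tokens1)), tokens1)
plt.colorbar(label="attention weight")
plt.show()
```

From there, the querying aspect could be layered on by recomputing and re-rendering the heatmap for a user-selected token or head.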