I checked the source code in causal_linear_attention.py.
I do not understand why 'attn_mask' is not used. Any hints?
Thanks very much.
def forward(self, queries, keys, values, attn_mask, query_lengths,
            key_lengths):
    # Apply the feature map to the queries and keys
    self.feature_map.new_feature_map(queries.device)
    Q = self.feature_map.forward_queries(queries)
    K = self.feature_map.forward_keys(keys)

    # Apply the key padding mask and make sure the attn_mask is a
    # lower triangular causal mask
    if not attn_mask.lower_triangular:
        raise RuntimeError(("CausalLinearAttention only supports full "
                            "lower triangular masks"))
    K = K * key_lengths.float_matrix[:, :, None, None]

    # Ensure that Q and K have compatible sizes for the following
    # computations, namely L == S
    Q, K = self._make_sizes_compatible(Q, K)

    # TODO: Shall we divide the Q and K with a relatively large number to
    #       avoid numerical instabilities in computing the denominator?
    #       We used to divide each with the max norm of all q and k but
    #       that seems relatively costly for a simple normalization.

    # Compute the normalizers
    Z = 1 / (torch.einsum("nlhi,nlhi->nlh", Q, K.cumsum(1)) + self.eps)

    # Compute the unnormalized result
    V = causal_linear(
        Q,
        K,
        values
    )

    return V * Z[:, :, :, None]
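For reference, here is a minimal sketch (not the library's actual causal_linear kernel, and the function name causal_linear_reference is mine) of the quantity that causal_linear(Q, K, values) appears to compute, written in plain PyTorch. Causality comes from the cumulative sum over the sequence axis, so no explicit attn_mask is applied beyond the lower-triangular check above:

import torch

def causal_linear_reference(Q, K, V):
    # Q, K: (N, L, H, D) feature-mapped queries/keys; V: (N, L, H, M) values.
    # Prefix sums of the outer products K_j (x) V_j up to each position i.
    KV = torch.einsum("nlhd,nlhm->nlhdm", K, V).cumsum(dim=1)   # (N, L, H, D, M)
    # Contract each query with the prefix sum at its own position,
    # so position i only ever sees keys/values with j <= i.
    return torch.einsum("nlhd,nlhdm->nlhm", Q, KV)              # (N, L, H, M)

This reference version materializes the (N, L, H, D, M) prefix-sum tensor and is therefore memory-hungry; the point is only to show that the causal structure is built into the computation itself.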
Maybe the CUDA version of the attention score computation in causal_product_cuda.cu handles this by only controlling the loop access bounds, without explicitly applying the mask. :)
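To make that concrete, here is a hypothetical loop-based illustration (not code taken from causal_product_cuda.cu; the name causal_product_loops is mine): restricting the accumulation to positions j <= i gives the same result as applying a lower-triangular mask, so no mask tensor is ever needed.

import torch

def causal_product_loops(Q, K, V):
    # Q, K: (N, L, H, D); V: (N, L, H, M)
    N, L, H, D = Q.shape
    M = V.shape[-1]
    out = torch.zeros(N, L, H, M, dtype=V.dtype, device=V.device)
    state = torch.zeros(N, H, D, M, dtype=V.dtype, device=V.device)
    for i in range(L):
        # Only keys/values with index j <= i ever enter the running state,
        # so the lower-triangular (causal) mask is implicit in the loop bound.
        state = state + torch.einsum("nhd,nhm->nhdm", K[:, i], V[:, i])
        out[:, i] = torch.einsum("nhd,nhdm->nhm", Q[:, i], state)
    return out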