Add support for MambaIR #265

vorselon8 · 2024-05-20T15:03:36Z

RunDevelopment · 2024-05-20T17:53:56Z

While interesting, I don't think we'll be able to easily support it.

The problem is that MambaIR depends on mamba-ssm. mamba-ssm is pretty difficult to install, because it requires that your system has an Nvidia GPU and specific versions of PyTorch and CUDA installed.

We could say that mamba-ssm is an optional dependency, so people who have it installed can use MambaIR. However, this would make it difficult to verify that our implementation of MambaIR actually works, because our GH CI test runners don't have Nvidia GPUs.

All in all, we unfortunately can't support MambaIR right now. Once mamba-ssm becomes compatible with more machines, we can add MambaIR.
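For illustration, the optional-dependency idea mentioned above could look roughly like the sketch below. It only checks whether a module can be imported, without importing it. The import name `mamba_ssm` and the helpers `has_module` and `require_mamba_ssm` are assumptions made for this example, not part of this project's API.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module `name` can be found, without importing it."""
    return importlib.util.find_spec(name) is not None


# Assumption: the mamba-ssm package is importable as `mamba_ssm`.
HAS_MAMBA_SSM = has_module("mamba_ssm")


def require_mamba_ssm(arch_name: str = "MambaIR") -> None:
    """Raise a clear error if the optional dependency is missing (hypothetical helper)."""
    if not HAS_MAMBA_SSM:
        raise RuntimeError(
            f"{arch_name} requires the optional package mamba-ssm, "
            "which is not installed."
        )
```

A check like this only gives users a readable error message. It does not address the testing concern: CI runners without an Nvidia GPU still cannot exercise the code path that needs mamba-ssm.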

umzi2 · 2024-06-27T20:37:36Z

@RunDevelopment
It will be very long, but you can get rid of the dependency by replacing the function with a pure-PyTorch equivalent:

import torch
import torch.nn.functional as F
from einops import rearrange, repeat


def selective_scan_fn(u, delta, A, B, C, D=None, z=None, delta_bias=None, delta_softplus=False,
                      return_last_state=False):
    """if return_last_state is True, returns (out, last_state)
    last_state has shape (batch, dim, dstate). Note that the gradient of the last state is
    not considered in the backward pass.
    """
    # return SelectiveScanFn.apply(u, delta, A, B, C, D, z, delta_bias, delta_softplus, return_last_state)
    return selective_scan_ref(u, delta, A, B, C, D, z, delta_bias, delta_softplus, return_last_state)


def selective_scan_ref(u, delta, A, B, C, D=None, z=None, delta_bias=None, delta_softplus=False,
                       return_last_state=False):
    """
    u: r(B D L)
    delta: r(B D L)
    A: c(D N) or r(D N)
    B: c(D N) or r(B N L) or r(B N 2L) or r(B G N L) or (B G N L)
    C: c(D N) or r(B N L) or r(B N 2L) or r(B G N L) or (B G N L)
    D: r(D)
    z: r(B D L)
    delta_bias: r(D), fp32

    out: r(B D L)
    last_state (optional): r(B D dstate) or c(B D dstate)
    """
    dtype_in = u.dtype
    u = u.float()
    delta = delta.float()
    if delta_bias is not None:
        delta = delta + delta_bias[..., None].float()
    if delta_softplus:
        delta = F.softplus(delta)
    batch, dim, dstate = u.shape[0], A.shape[0], A.shape[1]
    is_variable_B = B.dim() >= 3
    is_variable_C = C.dim() >= 3
    if A.is_complex():
        if is_variable_B:
            B = torch.view_as_complex(rearrange(B.float(), "... (L two) -> ... L two", two=2))
        if is_variable_C:
            C = torch.view_as_complex(rearrange(C.float(), "... (L two) -> ... L two", two=2))
    else:
        B = B.float()
        C = C.float()
    x = A.new_zeros((batch, dim, dstate))
    ys = []
    deltaA = torch.exp(torch.einsum('bdl,dn->bdln', delta, A))
    if not is_variable_B:
        deltaB_u = torch.einsum('bdl,dn,bdl->bdln', delta, B, u)
    else:
        if B.dim() == 3:
            deltaB_u = torch.einsum('bdl,bnl,bdl->bdln', delta, B, u)
        else:
            B = repeat(B, "B G N L -> B (G H) N L", H=dim // B.shape[1])
            deltaB_u = torch.einsum('bdl,bdnl,bdl->bdln', delta, B, u)
    if is_variable_C and C.dim() == 4:
        C = repeat(C, "B G N L -> B (G H) N L", H=dim // C.shape[1])
    last_state = None
    for i in range(u.shape[2]):
        x = deltaA[:, :, i] * x + deltaB_u[:, :, i]
        if not is_variable_C:
            y = torch.einsum('bdn,dn->bd', x, C)
        else:
            if C.dim() == 3:
                y = torch.einsum('bdn,bn->bd', x, C[:, :, i])
            else:
                y = torch.einsum('bdn,bdn->bd', x, C[:, :, :, i])
        if i == u.shape[2] - 1:
            last_state = x
        if y.is_complex():
            y = y.real * 2
        ys.append(y)
    y = torch.stack(ys, dim=2)  # (batch dim L)
    out = y if D is None else y + u * rearrange(D, "d -> d 1")
    if z is not None:
        out = out * F.silu(z)
    out = out.to(dtype=dtype_in)
    return out if not return_last_state else (out, last_state)

This is the code from mambadrct.
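To make the recurrence in the reference code easier to follow, here is a stripped-down sketch of the same sequential scan. It assumes real-valued inputs and per-step `B` and `C` of shape `(batch, N, L)`, and it leaves out `z`, `delta_bias`, grouping, and complex support. The name `simple_selective_scan` is made up for this example; it is not part of mamba-ssm or mambadrct.

```python
import torch
import torch.nn.functional as F


def simple_selective_scan(u, delta, A, B, C, D=None):
    """Sequential selective scan, real-valued, with per-step B and C.

    u, delta: (batch, dim, L); A: (dim, N); B, C: (batch, N, L); D: (dim,)
    Returns y with shape (batch, dim, L).
    """
    batch, dim, length = u.shape
    n = A.shape[1]
    x = u.new_zeros(batch, dim, n)  # hidden state, starts at zero
    ys = []
    for t in range(length):
        d_t = delta[:, :, t, None]                      # (batch, dim, 1)
        dA = torch.exp(d_t * A)                         # (batch, dim, N)
        dBu = d_t * B[:, None, :, t] * u[:, :, t, None]  # (batch, dim, N)
        x = dA * x + dBu                                # state update
        ys.append((x * C[:, None, :, t]).sum(dim=-1))   # (batch, dim)
    y = torch.stack(ys, dim=2)                          # (batch, dim, L)
    if D is not None:
        y = y + u * D[:, None]                          # skip connection
    return y


torch.manual_seed(0)
u = torch.randn(2, 4, 8)
delta = F.softplus(torch.randn(2, 4, 8))  # keep step sizes positive
A = -torch.rand(4, 3)                     # negative A gives a decaying state
B = torch.randn(2, 3, 8)
C = torch.randn(2, 3, 8)
y = simple_selective_scan(u, delta, A, B, C, D=torch.ones(4))
print(y.shape)  # torch.Size([2, 4, 8])
```

The Python loop over the sequence length is what makes this approach slow compared to a fused CUDA kernel: every time step launches several small tensor operations, one after another.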

RunDevelopment added the "Arch request" label (Request to add support for an architecture) · May 20, 2024
