Unable to fully install xFormers in auxiliary notebook #1335

Open
probit2011 opened this issue Dec 1, 2023 · 1 comment
Labels: bug, help wanted

Comments

@probit2011

🐛 Bug

I'm trying to install xFormers and a few other packages in an auxiliary notebook in order to use it as a utility script in a submission notebook I am preparing for the UBC-OCEAN competition. However, after installation, running my submission notebook produces the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[6], line 37
     35 coords = coords.squeeze(0)
     36 X = tiles.float().to(device=device, non_blocking=True)
---> 37 y_prob, pred, features = model(X, coords)
     38 query_preds.append((image_id.item(), labels[pred.to(device='cpu').item()]))
     39 query_features.append(features.view(-1).to(device='cpu'))

File /kaggle/usr/lib/ubc_ocean_packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
   1496 # If we don't have any hooks, we want to skip the rest of the logic in
   1497 # this function, and just call forward.
   1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1499         or _global_backward_pre_hooks or _global_backward_hooks
   1500         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501     return forward_call(*args, **kwargs)
   1502 # Do not call functions when jit is used
   1503 full_backward_hooks, non_full_backward_hooks = [], []

Cell In[4], line 227, in WSINet.forward(self, x, coords)
    225 def forward(self, x, coords):
    226     features = self.encoder(x).unsqueeze(0)
--> 227     features, mask = self.roformer(features, coords)
    228     y_prob, y_hat, attention = self.attention(features)
    230     return y_prob, y_hat, attention

File /kaggle/usr/lib/ubc_ocean_packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
   1496 # If we don't have any hooks, we want to skip the rest of the logic in
   1497 # this function, and just call forward.
   1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1499         or _global_backward_pre_hooks or _global_backward_hooks
   1500         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501     return forward_call(*args, **kwargs)
   1502 # Do not call functions when jit is used
   1503 full_backward_hooks, non_full_backward_hooks = [], []

Cell In[4], line 97, in RoFormerLayer.forward(self, x, coords)
     95 q, k = apply_rotary_position_embeddings(self.rope(h, grid_h, grid_w), q, k)
     96 q, k, v = q.reshape(bs, n, self.heads, self.head_dim), k.reshape(bs, n, self.heads, self.head_dim), v.reshape(bs, n, self.heads, self.head_dim)
---> 97 att = fmha.memory_efficient_attention(q, k, v, attn_bias=mask, p = self.dropout, op=(fmha.cutlass.FwOp, fmha.cutlass.BwOp))
     98 o = self.norm2(h + att.reshape(bs, n, h.size(-1)))
     99 ff = self.mlp(o)

File /kaggle/usr/lib/ubc_ocean_packages/xformers/ops/fmha/__init__.py:223, in memory_efficient_attention(query, key, value, attn_bias, p, scale, op)
    116 def memory_efficient_attention(
    117     query: torch.Tensor,
    118     key: torch.Tensor,
   (...)
    124     op: Optional[AttentionOp] = None,
    125 ) -> torch.Tensor:
    126     """Implements the memory-efficient attention mechanism following
    127     `"Self-Attention Does Not Need O(n^2) Memory" <http://arxiv.org/abs/2112.05682>`_.
    128 
   (...)
    221     :return: multi-head attention Tensor with shape ``[B, Mq, H, Kv]``
    222     """
--> 223     return _memory_efficient_attention(
    224         Inputs(
    225             query=query, key=key, value=value, p=p, attn_bias=attn_bias, scale=scale
    226         ),
    227         op=op,
    228     )

File /kaggle/usr/lib/ubc_ocean_packages/xformers/ops/fmha/__init__.py:321, in _memory_efficient_attention(inp, op)
    316 def _memory_efficient_attention(
    317     inp: Inputs, op: Optional[AttentionOp] = None
    318 ) -> torch.Tensor:
    319     # fast-path that doesn't require computing the logsumexp for backward computation
    320     if all(x.requires_grad is False for x in [inp.query, inp.key, inp.value]):
--> 321         return _memory_efficient_attention_forward(
    322             inp, op=op[0] if op is not None else None
    323         )
    325     output_shape = inp.normalize_bmhk()
    326     return _fMHA.apply(
    327         op, inp.query, inp.key, inp.value, inp.attn_bias, inp.p, inp.scale
    328     ).reshape(output_shape)

File /kaggle/usr/lib/ubc_ocean_packages/xformers/ops/fmha/__init__.py:339, in _memory_efficient_attention_forward(inp, op)
    337     op = _dispatch_fw(inp, False)
    338 else:
--> 339     _ensure_op_supports_or_raise(ValueError, "memory_efficient_attention", op, inp)
    341 out, *_ = op.apply(inp, needs_gradient=False)
    342 return out.reshape(output_shape)

File /kaggle/usr/lib/ubc_ocean_packages/xformers/ops/fmha/dispatch.py:39, in _ensure_op_supports_or_raise(exc_type, name, op, inp)
     37     if not reasons:
     38         return
---> 39     raise exc_type(
     40         f"""Operator `{name}` does not support inputs:
     41 {textwrap.indent(_format_inputs_description(inp), '     ')}
     42 {_format_not_supported_reasons(op, reasons)}"""
     43     )

ValueError: Operator `memory_efficient_attention` does not support inputs:
     query       : shape=(1, 7040, 8, 96) (torch.float32)
     key         : shape=(1, 7040, 8, 96) (torch.float32)
     value       : shape=(1, 7040, 8, 96) (torch.float32)
     attn_bias   : <class 'xformers.ops.fmha.attn_bias.BlockDiagonalMask'>
     p           : 0.25
`cutlassF` is not supported because:
    xFormers wasn't build with CUDA support
    operator wasn't built - see `python -m xformers.info` for more info

The 'python -m xformers.info' command returns the following (note the unavailability of the memory_efficient_attention methods; I need these):

xFormers 0.0.22.post7+cu118
memory_efficient_attention.cutlassF:               unavailable
memory_efficient_attention.cutlassB:               unavailable
memory_efficient_attention.decoderF:               unavailable
memory_efficient_attention.flshattF@…:             unavailable
memory_efficient_attention.flshattB@…:             unavailable
memory_efficient_attention.smallkF:                unavailable
memory_efficient_attention.smallkB:                unavailable
memory_efficient_attention.tritonflashattF:        unavailable
memory_efficient_attention.tritonflashattB:        unavailable
memory_efficient_attention.triton_splitKF:         unavailable
indexing.scaled_index_addF:                        available
indexing.scaled_index_addB:                        available
indexing.index_select:                             available
swiglu.dual_gemm_silu:                             unavailable
swiglu.gemm_fused_operand_sum:                     unavailable
swiglu.fused.p.cpp:                                not built
is_triton_available:                               True
pytorch.version:                                   2.0.1+cu118
pytorch.cuda:                                      available
gpu.compute_capability:                            6.0
gpu.name:                                          Tesla P100-PCIE-16GB
build.info:                                        available
build.cuda_version:                                1108
build.python_version:                              3.10.13
build.torch_version:                               2.1.0+cu118
build.env.TORCH_CUDA_ARCH_LIST:                    5.0+PTX 6.0 6.1 7.0 7.5 8.0+PTX 9.0
build.env.XFORMERS_BUILD_TYPE:                     Release
build.env.XFORMERS_ENABLE_DEBUG_ASSERTIONS:        None
build.env.NVCC_FLAGS:                              None
build.env.XFORMERS_PACKAGE_FROM:                   wheel-v0.0.22.post7
source.privacy:                                    open source

Also, I have noticed the following warnings in the xFormers part of the install log for the auxiliary notebook, despite having successfully installed the required cupy-cuda11x:

cudf 23.8.0 requires cupy-cuda11x>=12.0.0, which is not installed.
cuml 23.8.0 requires cupy-cuda11x>=12.0.0, which is not installed.
dask-cudf 23.8.0 requires cupy-cuda11x>=12.0.0, which is not installed.

I believe the missing cupy-cuda11x is responsible for the fact that CUDA is not available for this xFormers install.
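
As a sanity check, something like the following can confirm whether the copy of cupy-cuda11x installed under /kaggle/working is actually importable and can reach the GPU (a minimal sketch, assuming the --target directory has been put on sys.path by the utility-script mechanism):

import sys
sys.path.insert(0, "/kaggle/working")  # assumption: the --target directory used for the install

import cupy

print(cupy.__version__)                       # e.g. 12.2.0
print(cupy.cuda.runtime.runtimeGetVersion())  # CUDA runtime version, e.g. 11080 for CUDA 11.8
print(cupy.arange(4).sum())                   # forces a real kernel launch on the GPU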

In short, I need one or both of the following:
1. the package cupy-cuda11x (the current version is 12.2.0 and works with CUDA 11.2-11.8) installed in the environment for GPUs
2. (even better) the package xFormers cu118 installed in the environment for GPUs (however, I am aware that the P100s are on CUDA 11.4 only, so an upgrade of the NVIDIA driver might be required)

To Reproduce

On a P100 GPU notebook with Persistence set to "Files only" and set up as a utility script, run and save the following:

!pip install cupy-cuda11x --target=/kaggle/working/
!pip install torchstain --target=/kaggle/working
!pip install faiss-cpu --target=/kaggle/working/
!pip install torch==2.0.1+cu118 torchvision==0.15.2+cu118 torchaudio==2.0.2 --target=/kaggle/working/ --index-url https://download.pytorch.org/whl/cu118
!pip3 install xformers --target=/kaggle/working/ --index-url https://download.pytorch.org/whl/cu118

Then, in another P100 GPU notebook, add the utility script built above and run '!python -m xformers.info'.
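
A short, self-contained snippet like the one below also reproduces the failure in that second notebook without needing the full WSINet model (a minimal sketch; the tensor shapes mirror the traceback above, and nothing beyond xformers.ops.memory_efficient_attention itself is assumed):

import torch
import xformers.ops as xops

# same layout as in the traceback: (batch, sequence, heads, head_dim), float32 on the GPU
q = torch.randn(1, 128, 8, 96, device="cuda", dtype=torch.float32)

try:
    out = xops.memory_efficient_attention(q, q, q)  # self-attention with query == key == value
    print("memory_efficient_attention OK, output shape:", tuple(out.shape))
except ValueError as err:
    # currently raises: Operator `memory_efficient_attention` does not support inputs ...
    print("memory_efficient_attention unavailable:", err)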

Expected behavior

The notebook should be able to contain and execute any correctly written code that calls the 'memory_efficient_attention' component of xFormers, at least via the base CUTLASS operators (the other operators rely on the Triton and Flash-Attention packages being installed; for the purposes of this request, I do not need them).
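
Until the CUTLASS kernels are usable, one possible stop-gap (a workaround sketch, not a fix for this request) is to fall back to PyTorch's built-in torch.nn.functional.scaled_dot_product_attention, which exists in torch 2.0.1. Note that it expects a (batch, heads, seq, head_dim) layout rather than xFormers' (batch, seq, heads, head_dim), and the BlockDiagonalMask would have to be converted to a dense additive bias, which gives up the memory savings:

import torch
import torch.nn.functional as F

def attention_fallback(q, k, v, dropout_p=0.0):
    # q, k, v arrive in the xFormers layout (B, N, H, D); SDPA expects (B, H, N, D)
    q, k, v = (t.transpose(1, 2) for t in (q, k, v))
    out = F.scaled_dot_product_attention(q, k, v, dropout_p=dropout_p)
    return out.transpose(1, 2)  # back to (B, N, H, D)

On a P100 (compute capability 6.0) the fast flash kernels are not available, so depending on which backend PyTorch selects this may fall back to the plain math implementation with quadratic memory use; it is only a functional stand-in while the xFormers operators remain unbuilt.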

probit2011 added the bug and help wanted labels on Dec 1, 2023
@ycchen-tw

I'm encountering the same issue as well. Does anyone know how to resolve this?
