Add cupy support + CI #1066

Merged
merged 42 commits into from
Jul 31, 2023

Zethson
Member

@Zethson Zethson commented Jul 21, 2023


@ivirshup says:

TODO:

  • Figure out if pytest logic can be simplified: Add cupy support + CI #1066 (comment)
  • Concatenation
  • IO (at least writing, probably via CPU memory)
  • Indexing
  • Views
  • Release note
    - Consider how much can be done with array_api
  • Benchmark concatenation to be sure CPU stuff didn't get slower

@codecov

codecov bot commented Jul 21, 2023

Codecov Report

Merging #1066 (c476a5d) into main (0c4c0b0) will decrease coverage by 1.72%.
The diff coverage is 41.29%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1066      +/-   ##
==========================================
- Coverage   84.28%   82.57%   -1.72%     
==========================================
  Files          35       35              
  Lines        4932     5112     +180     
==========================================
+ Hits         4157     4221      +64     
- Misses        775      891     +116     
Files Changed Coverage Δ
anndata/_core/raw.py 79.56% <25.00%> (-1.04%) ⬇️
anndata/tests/helpers.py 86.15% <25.00%> (-9.86%) ⬇️
anndata/_core/merge.py 82.50% <28.88%> (-10.76%) ⬇️
anndata/_core/views.py 83.41% <51.85%> (-4.86%) ⬇️
anndata/utils.py 84.39% <71.42%> (-0.55%) ⬇️
anndata/compat/__init__.py 80.44% <75.00%> (-0.69%) ⬇️
anndata/_io/specs/methods.py 87.59% <90.90%> (-0.22%) ⬇️
anndata/_core/anndata.py 82.79% <100.00%> (+0.04%) ⬆️

... and 1 file with indirect coverage changes
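
The headline figures in the report follow from the Hits and Lines rows: coverage is hits divided by lines, and misses are the remainder. A minimal sketch of that arithmetic, assuming plain line coverage with no partially covered lines (the function name is illustrative and not part of Codecov):

```python
def coverage_percent(hits: int, lines: int) -> float:
    """Line coverage as a percentage of tracked lines."""
    return 100.0 * hits / lines


# Values copied from the report above.
main_hits, main_lines = 4157, 4932
pr_hits, pr_lines = 4221, 5112

main_cov = coverage_percent(main_hits, main_lines)
pr_cov = coverage_percent(pr_hits, pr_lines)
delta = pr_cov - main_cov  # negative means coverage dropped

# Misses are simply the lines that were never hit.
main_misses = main_lines - main_hits
pr_misses = pr_lines - pr_hits

print(f"main: {main_cov:.2f}%  PR: {pr_cov:.2f}%  change: {delta:+.2f}")
```

The last printed decimal may differ from the report, depending on whether values are rounded or truncated.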

Signed-off-by: zethson <[email protected]>
Zethson and others added 3 commits July 21, 2023 15:33
Co-authored-by: Isaac Virshup <[email protected]>
Signed-off-by: zethson <[email protected]>
Signed-off-by: zethson <[email protected]>
@ivirshup ivirshup mentioned this pull request Jul 21, 2023
6 tasks
@ivirshup
Member

So, it runs. But I see some issues:

  • micromamba list does not report the names of pip installed packages. I do not actually know if this feature is supported.
  • pip install is trying to overwrite the micromamba installed numpy. This is probably bad, but does not seem to be causing an import issue...

@flying-sheep
Member

flying-sheep commented Jul 25, 2023

micromamba list does not report the names of pip installed packages. I do not actually know if this feature is supported.

That’s mamba-org/mamba#2059

pip install is trying to overwrite the micromamba installed numpy. This is probably bad, but does not seem to be causing an import issue...

Why is it bad?

Both are dependency resolvers. The ideal way to use package managers would be to pick a single one, and let it install everything in a single step. Since we don’t do that (and probably can’t?), what pip does is correct. Note that it avoids upgrading packages when --upgrade/-U is not specified, unless that’s necessary:

Controlling what gets installed

[…]

the “default” upgrade strategy when --upgrade is not set [is that] packages are not upgraded (not even direct requirements) unless the currently installed version fails to satisfy a requirement (either explicitly specified or a dependency).

The fact that pip updates numpy therefore means that something requires a minimum numpy version greater than the one that micromamba installed.
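
To find out which installed package carries that requirement, one could list every installed distribution that declares a dependency on numpy, together with its version specifier. The following is a rough sketch using only the standard library; the helper names `requires_package` and `requirers_of` are made up for illustration, and the name matching is simplified (it does not normalize `-`, `_` and `.` the way pip does):

```python
import re
from importlib import metadata


def requires_package(requirement: str, package: str) -> bool:
    """True if a requirement string such as 'numpy>=1.23' names `package`."""
    # The lookahead rejects longer names such as 'numpydoc' or 'numpy-quaternion'.
    pattern = rf"^{re.escape(package)}(?![A-Za-z0-9._-])"
    return re.match(pattern, requirement.strip(), re.IGNORECASE) is not None


def requirers_of(package: str) -> list[tuple[str, str]]:
    """Return (distribution name, requirement string) pairs that depend on `package`."""
    found = []
    for dist in metadata.distributions():
        name = str(dist.metadata.get("Name", "<unknown>"))
        # dist.requires is None when a distribution declares no dependencies.
        for requirement in dist.requires or []:
            if requires_package(requirement, package):
                found.append((name, requirement))
    return sorted(found)


if __name__ == "__main__":
    for name, requirement in requirers_of("numpy"):
        print(f"{name}: {requirement}")
```

Comparing the printed specifiers with the numpy version that micromamba installed should point to the package that forces the upgrade.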

@ivirshup
Member

Because pip can't actually uninstall conda packages.

@Intron7
Member

Intron7 commented Jul 26, 2023

I tested the preprocessing from rapids-singlecell with cupy anndata. There are no issues except for the way views interact with .X replacements. This happens when I want to put a dense matrix in after regress_out.
In any case, it works super well so far.

@Intron7
Member

Intron7 commented Jul 27, 2023

I have a few more small ideas that I think would be good. I can also start implementing some of them:

  • Check whether .nnz for cpx is less than 2^31-1, since cupy only supports int32 indptr.
  • Add a .todevice() method, like in torch, to move .X or .layers to and from the GPU. Maybe add an "all" option to dump everything into RAM.
  • Add a flag/property on .X and .layers that says whether the data is in RAM or VRAM.
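
The first idea could look roughly like the check below. This is only a sketch: the function name and error message are hypothetical, and it assumes the limit is exactly what the comment states (int32 index arrays). It uses `<=` because the last entry of indptr equals nnz; whether the bound should be strict is an assumption.

```python
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit index can hold


def check_nnz_fits_int32(nnz: int) -> None:
    """Raise if a sparse matrix with `nnz` stored values cannot use int32 indices.

    Hypothetical helper: meant to run on the CPU matrix (e.g. `X.nnz`)
    before it is transferred to the GPU.
    """
    if nnz > INT32_MAX:
        raise ValueError(
            f"Matrix has {nnz} stored values, but int32 indptr can only "
            f"address {INT32_MAX}."
        )


# Example: the reproducer-sized matrix (100000 x 20000 at 5% density) is fine.
check_nnz_fits_int32(int(100_000 * 20_000 * 0.05))
```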

@ivirshup ivirshup mentioned this pull request Jul 27, 2023
2 tasks
@ivirshup
Member

I think this is getting pretty close to mergeable, so I'd leave extra features beyond the actual cupy support out for now. I've opened #1080 to discuss follow-up PRs.

@ivirshup
Member

@Intron7, you had mentioned some rules for controlling when this CI is run. Could you link to them, or maybe help set this up?

@Intron7
Member

Intron7 commented Jul 27, 2023

So far I only know that cuML uses that solution. I haven't checked how it works yet, but I will investigate.

@Intron7
Member

Intron7 commented Jul 28, 2023

If you try to set a dense array for a sparse matrix in a view on the GPU, the CUDA error below happens. Can we handle this a bit more gracefully?

Right now I have to copy before calling the function or use _init_as_actual. It would be amazing to have a method that converts adata from a view to an actual object in place.

---------------------------------------------------------------------------
MemoryError                               Traceback (most recent call last)
File cupy/cuda/memory.pyx:742, in cupy.cuda.memory.alloc()

File ~/miniconda3/envs/anndata_test/lib/python3.10/site-packages/rmm/allocators/cupy.py:37, in rmm_cupy_allocator(nbytes)
     34     raise ModuleNotFoundError("No module named 'cupy'")
     36 stream = Stream(obj=cupy.cuda.get_current_stream())
---> 37 buf = librmm.device_buffer.DeviceBuffer(size=nbytes, stream=stream)
     38 dev_id = -1 if buf.ptr else cupy.cuda.device.get_device_id()
     39 mem = cupy.cuda.UnownedMemory(
     40     ptr=buf.ptr, size=buf.size, owner=buf, device_id=dev_id
     41 )

File device_buffer.pyx:85, in rmm._lib.device_buffer.DeviceBuffer.__cinit__()

MemoryError: std::bad_alloc: out_of_memory: CUDA error at: /home/sdicks/miniconda3/envs/anndata_test/include/rmm/mr/device/cuda_memory_resource.hpp:70: cudaErrorMemoryAllocation out of memory
Exception ignored in: 'cupy.cuda.thrust.cupy_malloc'
Traceback (most recent call last):
  File "cupy/cuda/memory.pyx", line 742, in cupy.cuda.memory.alloc
  File "/home/sdicks/miniconda3/envs/anndata_test/lib/python3.10/site-packages/rmm/allocators/cupy.py", line 37, in rmm_cupy_allocator
    buf = librmm.device_buffer.DeviceBuffer(size=nbytes, stream=stream)
  File "device_buffer.pyx", line 85, in rmm._lib.device_buffer.DeviceBuffer.__cinit__
MemoryError: std::bad_alloc: out_of_memory: CUDA error at: /home/sdicks/miniconda3/envs/anndata_test/include/rmm/mr/device/cuda_memory_resource.hpp:70: cudaErrorMemoryAllocation out of memory
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
File <timed eval>:1

File ~/miniconda3/envs/anndata_test/lib/python3.10/site-packages/rapids_singlecell/cunnData_funcs/_regress_out.py:115, in regress_out(cudata, keys, layer, inplace, batchsize, verbose)
    113         cudata.layers[layer] = outputs
    114     else:
--> 115         cudata.X = outputs
    116 else:
    117     return outputs

File ~/git/anndata/anndata/_core/anndata.py:682, in AnnData.X(self, value)
    678     if sparse.issparse(self._adata_ref._X) and isinstance(
    679         value, np.ndarray
    680     ):
    681         value = sparse.coo_matrix(value)
--> 682     self._adata_ref._X[oidx, vidx] = value
    683 else:
    684     self._X = value

File ~/miniconda3/envs/anndata_test/lib/python3.10/site-packages/cupyx/scipy/sparse/_index.py:446, in IndexMixin.__setitem__(self, key, x)
    444     return
    445 x = x.reshape(i.shape)
--> 446 self._set_arrayXarray(i, j, x)

File ~/miniconda3/envs/anndata_test/lib/python3.10/site-packages/cupyx/scipy/sparse/_compressed.py:480, in _compressed_sparse_matrix._set_arrayXarray(self, row, col, x)
    478 def _set_arrayXarray(self, row, col, x):
    479     i, j = self._swap(row, col)
--> 480     self._set_many(i, j, x)

File ~/miniconda3/envs/anndata_test/lib/python3.10/site-packages/cupyx/scipy/sparse/_compressed.py:557, in _compressed_sparse_matrix._set_many(self, i, j, x)
    555 j = j[mask]
    556 j[j < 0] += N
--> 557 self._insert_many(i, j, x[mask])

File ~/miniconda3/envs/anndata_test/lib/python3.10/site-packages/cupyx/scipy/sparse/_compressed.py:616, in _compressed_sparse_matrix._insert_many(self, i, j, x)
    607 def _insert_many(self, i, j, x):
    608     """Inserts new nonzero at each (i, j) with value x
    609     Here (i,j) index major and minor respectively.
    610     i, j and x must be non-empty, 1d arrays.
   (...)
    613     Modifies i, j, x in place.
    614     """
--> 616     order = cupy.argsort(i)  # stable for duplicates
    617     i = i.take(order)
    618     j = j.take(order)

File ~/miniconda3/envs/anndata_test/lib/python3.10/site-packages/cupy/_sorting/sort.py:116, in argsort(a, axis, kind)
    114 if kind is not None and kind != 'stable':
    115     raise ValueError("kind can only be None or 'stable'")
--> 116 return a.argsort(axis=axis)

File cupy/_core/core.pyx:879, in cupy._core.core._ndarray_base.argsort()

File cupy/_core/core.pyx:896, in cupy._core.core._ndarray_base.argsort()

File cupy/_core/_routines_sorting.pyx:96, in cupy._core._routines_sorting._ndarray_argsort()

File cupy/cuda/thrust.pyx:117, in cupy.cuda.thrust.argsort()

RuntimeError: transform: failed to synchronize: cudaErrorIllegalAddress: an illegal memory access was encountered

@ivirshup
Member

Could you share some code that throws this?

@Intron7
Member

Intron7 commented Jul 28, 2023

Could you share some code that throws this?

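from anndata import AnnData
from cupyx.scipy import sparse as cpsparse
from scipy import sparse
import cupy as cp
import numpy as np

# Build a sparse matrix on the CPU, then move it to the GPU
rand = sparse.random(100000, 20000, density=0.05, dtype=np.float32, format="csr")
adata = AnnData(X=cpsparse.csr_matrix(rand))
# Subsetting returns a view of the original object
adata = adata[:, :5000]
X = cp.random.rand(100000, 5000, dtype=cp.float32)
# Assigning a dense array to .X of the view triggers the error
adata.X = X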

This works though

adata = AnnData(X=cpsparse.csr_matrix(rand))
adata = adata[:, :5000].copy()
X = cp.random.rand(100000, 5000, dtype=cp.float32)
adata.X = X

@ivirshup
Member

In this instance I think you can do:

rand = sparse.random(100000, 20000, density=0.05, dtype=np.float32, format="csr")
adata = AnnData(X=cpsparse.csr_matrix(rand))
adata = adata[:, :5000]
X = cp.random.rand(100000, 5000, dtype=cp.float32)

del adata.X
adata.X = X

The behavior is weird, yes, but it is not a bug. I'm not really sure what a more graceful way to handle this would be.

I think I could be up for an in-place conversion to actual, though. Can you open an issue for this?
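
Until something like that exists, the copy-first workaround from earlier in the thread can be wrapped in a small helper. This is a sketch only: `ensure_actual` is a hypothetical name, it relies on AnnData's `is_view` property and `copy()` method, and it returns a new object rather than converting in place. The stand-in class exists only so that the example runs without anndata or a GPU.

```python
def ensure_actual(adata):
    """Return `adata` unchanged if it owns its data, otherwise an independent copy."""
    return adata.copy() if adata.is_view else adata


class FakeAnnData:
    """Minimal stand-in with the two attributes the helper uses (not the real class)."""

    def __init__(self, is_view: bool):
        self.is_view = is_view

    def copy(self):
        # A copy of a view is an actual object, as with AnnData.copy().
        return FakeAnnData(is_view=False)


view = FakeAnnData(is_view=True)
actual = ensure_actual(view)      # copied, because `view` is a view
same = ensure_actual(actual)      # returned as-is, no extra copy
```

With a real AnnData object, the call would go right before the assignment, for example `adata = ensure_actual(adata)` followed by `adata.X = X`.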

@Intron7
Member

Intron7 commented Jul 28, 2023

For very small matrices it works, but it is super slow.

Opened an issue for the feature: #1082

Member

@flying-sheep flying-sheep left a comment


Looking good! Once the TODOs are done, I think we’re good to go!

@ivirshup ivirshup merged commit 8b1a7e4 into main Jul 31, 2023
14 checks passed
@ivirshup ivirshup deleted the feature/gpu_ci branch July 31, 2023 10:18
5 participants