Cupy support in anndata #1080

Closed
2 tasks done
ivirshup opened this issue Jul 27, 2023 · 8 comments

Comments

@ivirshup
Member

ivirshup commented Jul 27, 2023

Splitting out from:

Mostly implemented in

Opening this issue to discuss what else is needed for cupy support in anndata.

Priority follow up

  • "How to" example of usage

Other

@ivirshup ivirshup added this to the 0.10.0 milestone Jul 27, 2023
@ivirshup
Member Author

Responding to:

So I have some more small ideas that I think would be good. I can also start implementing some of them:

  • Check if .nnz for cpx is less than 2^31-1 since cupy only supports int32 indptr
  • make a .todevice() like torch, to move X or .layers to and from the GPU. Maybe add an all option to dump everything into RAM.
  • add a flag/property for .X and layers if it's in RAM or VRAM

Originally posted by @Intron7 in #1066 (comment)

  • make a .todevice()

What would this do? I was hoping that it would let us move things between CPU and GPU, but I don't think that's consistent with the semantics of cupy and numpy. Maybe it can be handled on the array level for now?

  • add a flag/property for .X and layers if it's in RAM or VRAM

For now could this just be "whether it's a cupy type or not"? I don't think I'd want to add a property to the AnnData object for this if you can do isinstance.
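
A minimal sketch of that check, for illustration only. With cupy installed, the direct form is isinstance(x, cupy.ndarray). The helper below instead looks at the top-level module of the array's type, so it also runs on machines without cupy. The names is_gpu_array and array_location are hypothetical and not part of anndata, and the sketch assumes GPU dense arrays come from the cupy package and GPU sparse matrices from cupyx.

```python
# Assumption: GPU array types are defined under these top-level packages.
GPU_ARRAY_MODULES = {"cupy", "cupyx"}


def is_gpu_array(x) -> bool:
    """Return True if the type of x is defined under cupy or cupyx."""
    top_level = type(x).__module__.split(".")[0]
    return top_level in GPU_ARRAY_MODULES


def array_location(x) -> str:
    """Report "VRAM" for cupy/cupyx types and "RAM" for everything else."""
    return "VRAM" if is_gpu_array(x) else "RAM"
```

For an AnnData object this could be applied to adata.X or to each entry of adata.layers, which avoids adding a dedicated property to the object.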

  • Check if .nnz for cpx is less than 2^31-1 since cupy only supports int32 indptr

Is this in the case where the scipy matrix has 64 bit index types? I think this would make sense there.

@Intron7
Member

Intron7 commented Jul 27, 2023

Is this in the case where the scipy matrix has 64 bit index types? I think this would make sense there.

Yes, scipy matrices can have 64-bit indptr. If you go above the int32 limit, the indptr values wrap around and become negative, and you run into issues when converting the matrix, etc.
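
A rough sketch of the guard being discussed, under stated assumptions. The function names are hypothetical and nothing here is anndata or cupy API. The check only relies on the matrix exposing .nnz, as scipy sparse matrices do. Since the last entry of a CSR indptr equals nnz, nnz has to fit in a signed 32-bit integer. The second helper uses ctypes only to show the wraparound to a negative value.

```python
import ctypes

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit index can hold


def fits_int32_index(matrix) -> bool:
    """Return True if matrix.nnz can be stored in an int32 indptr.

    The last indptr entry of a CSR/CSC matrix equals nnz, so nnz itself
    must not exceed INT32_MAX.
    """
    return int(matrix.nnz) <= INT32_MAX


def wrap_to_int32(value: int) -> int:
    """Show what a value becomes when forced into a signed 32-bit integer."""
    return ctypes.c_int32(value).value
```

A caller could run fits_int32_index on a scipy matrix before moving it to the GPU and raise an error instead of converting when it returns False.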

What would this do? I was hoping that it would let us move things between CPU and GPU, but I don't think that's consistent with the semantics of cupy and numpy. Maybe it can be handled on the array level for now?

Indeed, we could let the user handle this, and the syntax would likely be more similar to pytorch. I've really come to like a simple one-liner that moves everything over to RAM instead of manually converting every layer.
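
One way such a one-liner could look, as a sketch rather than a proposal for the actual API. The helper to_host is hypothetical. It assumes the object has an assignable .X and a dict-like .layers, and that GPU arrays are cupy/cupyx types whose .get() method returns a host copy (cupy arrays and cupyx sparse matrices provide this). Other fields such as .obsm are left out to keep the example short.

```python
def _is_gpu_array(x) -> bool:
    """Heuristic: the type of x is defined under cupy or cupyx."""
    return type(x).__module__.split(".")[0] in {"cupy", "cupyx"}


def to_host(adata):
    """Move .X and every layer from GPU memory to RAM, in place.

    Entries that are already on the CPU are left untouched.
    """
    if _is_gpu_array(adata.X):
        adata.X = adata.X.get()  # .get() copies device data to the host
    for key in list(adata.layers.keys()):
        if _is_gpu_array(adata.layers[key]):
            adata.layers[key] = adata.layers[key].get()
    return adata
```

The module check is used instead of testing for a .get attribute because some CPU containers also define a .get method with a different meaning.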

For now could this just be "whether it's a cupy type or not"? I don't think I'd want to add a property to the AnnData object for this if you can do isinstance.

Using isinstance is entirely sufficient. I was just considering the scenario where someone wants to run a scanpy function on GPU AnnData or an rsc function with a scipy matrix. I'm currently making all my rapids-singlecell functions compatible with CPU AnnData and was contemplating such a flag.

@ivirshup
Member Author

@Intron7, I'm thinking about how we document this here. I think it would be useful to link out to the rapids-singlecell docs. What do you think? I see the latest build still refers to CunnData.

@Intron7
Member

Intron7 commented Aug 22, 2023

@ivirshup I have a PR (scverse/rapids_singlecell#60) ready to go once anndata 0.10.0 comes out. The only things I'll have to do are rerun the notebooks with benchmarks, deprecate cunndata, and update the docs with the usage principles. So I think I can have the update out at most 2 days after you publish.
Edit:
I also updated the tests to work with anndata. That's why I didn't merge yet.

@flying-sheep
Member

That’s awesome!

@ivirshup
Member Author

ivirshup commented Sep 7, 2023

I'll split the others off into new issues, but will point to rapids-singlecell for usage examples. Almost closable.

@Intron7
Member

Intron7 commented Sep 7, 2023

You can point to the documentation notebooks at https://rapids-singlecell.readthedocs.io/en/latest/notebooks.html
The notebooks will work with anndata once 0.10.0 is released.

@ivirshup
Member Author

ivirshup commented Oct 4, 2023

Closing as complete in anndata 0.10

@ivirshup ivirshup closed this as completed Oct 4, 2023

3 participants