
moving tensors back and forth between CPU and GPU? #207

Open
sflc6 opened this issue Apr 4, 2018 · 8 comments

sflc6 commented Apr 4, 2018

Super sorry if this is obvious, but -- how do I copy a tensor from CPU -> GPU and vice versa? I've been looking through the documentation and can't seem to find how to do this.


stites commented Apr 5, 2018

There might be a better way to do this, but ATen compiles the following functions in THCTensorCopy:
https://github.com/zdevito/ATen/blob/master/src/THC/generic/THCTensorCopy.h#L37

There might also be convenience functions, analogous to how PyTorch lets you call tensor.cuda() and tensor.cpu().


c-hofer commented Jun 21, 2018

I'm also running against a wall here. The API is a little confusing to me. It looks like one should do something like

int32_t data[] = ...; // data contains not only zeros
t_cpu = CPU(kInt).tensorFromBlob(&data[0], {10, 10}); // here the content is fine
t_gpu = t_cpu.toType(t_cpu.type().toBackend(kCUDA).toScalarType(kInt)); // contains only zeros

but this seems wrong, as t_gpu contains only zeros after the operation.

@ezyang What am I missing?
@ezyang @zdevito It would be really great if a how-to could be added to ATen's README file, since loading data and then moving it to the GPU is a very common workflow imho :)

Many thanks, chofer

ezyang (Contributor) commented Jun 24, 2018

If you are running reasonably recent master, I think the following should work:

at::Tensor t_gpu = t_cpu.to(at::kCUDA);
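In context, that one-liner might be used roughly like this (a sketch, untested, and requiring a CUDA build; the factory-function signatures varied across ATen versions at the time, so treat at::ones here as illustrative):

```cpp
#include <ATen/ATen.h>

int main() {
  // Build a small CPU tensor (illustrative factory call; the exact
  // signature depends on the ATen version in use).
  at::Tensor t_cpu = at::ones(at::CPU(at::kInt), {10, 10});

  // Device copies: CPU -> GPU, then back again.
  at::Tensor t_gpu  = t_cpu.to(at::kCUDA);
  at::Tensor t_back = t_gpu.to(at::kCPU);
  return 0;
}
```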

We should make t_cpu.cuda() work though...

CC @goldsborough


c-hofer commented Jun 25, 2018

Hi,

my HEAD is 372d1d67356f054db64bdfb4787871ecdbbcbe0b.

to is not yet implemented, it seems.

However, it looks like the problem is the creation with fromBlob(...). If I create a tensor differently, I can move it between CPU and GPU using the toBackend method of the Tensor class, e.g.
my_cpu_tensor.toBackend(Backend::CUDA);

My workaround to get externally allocated CPU data into a tensor on the GPU:

  1. create the array data on the CPU
  2. use cudaMalloc and cudaMemcpy to bring it onto the GPU
  3. create a tensor with fromBlob from the allocated device memory
  4. clone the tensor (in order not to interfere with ATen's memory management?)
  5. cudaFree the allocated space.
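The five steps above might look roughly like this (a sketch, untested, and requiring a CUDA build; CUDA(kInt) and tensorFromBlob follow the ATen API of that era, and error checking on the CUDA calls is omitted for brevity):

```cpp
#include <ATen/ATen.h>
#include <cuda_runtime.h>

at::Tensor cpu_blob_to_gpu(const int32_t* host_data, size_t n) {
  // Step 2: allocate device memory and copy the host data over.
  int32_t* dev_data = nullptr;
  cudaMalloc(&dev_data, n * sizeof(int32_t));
  cudaMemcpy(dev_data, host_data, n * sizeof(int32_t), cudaMemcpyHostToDevice);

  // Step 3: wrap the device pointer in a CUDA tensor (ATen does not own it).
  at::Tensor t = at::CUDA(at::kInt)
                     .tensorFromBlob(dev_data, {static_cast<int64_t>(n)});

  // Step 4: clone so that ATen owns its own copy of the memory.
  at::Tensor owned = t.clone();

  // Step 5: the external allocation can now be released.
  cudaFree(dev_data);
  return owned;
}
```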

So from my point of view it seems there is an issue handing memory over from the wild into the ATen-controlled regime. But it's just a guess ;)

cheers c.hofer

soumith (Collaborator) commented Jun 25, 2018

@c-hofer look at https://github.com/zdevito/ATen/blob/31d00ab7fdf00c258b0fad5b1b05af77e92b64a9/aten/src/ATen/test/dlconvertor_test.cpp

You can use the DLPack format, which is a cross-framework, well-specified and simple format that we support importing from: https://github.com/dmlc/dlpack/
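A minimal sketch of the DLPack round trip, assuming ATen's DLConvertor header with at::toDLPack / at::fromDLPack is available in the build (untested; the at::ones factory signature is illustrative for that era):

```cpp
#include <ATen/ATen.h>
#include <ATen/DLConvertor.h>

int main() {
  // Round-trip a tensor through the DLPack interchange format.
  at::Tensor t = at::ones(at::CPU(at::kFloat), {2, 3});
  DLManagedTensor* dl = at::toDLPack(t);  // export: shares the storage
  at::Tensor t2 = at::fromDLPack(dl);     // import back into ATen
  return 0;
}
```

The same DLManagedTensor could instead come from another framework that speaks DLPack, which is the point of soumith's suggestion.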


c-hofer commented Jun 25, 2018

Thx, that's a valuable hint :)

goldsborough (Contributor) commented
You can also clone on the CPU first and then move it to GPU, if that's feasible: CPU(kInt).tensorFromBlob(&data[0], {10, 10}).clone().toBackend(at::kCUDA).
The to() functions landed 6 days ago and are on master here: https://github.com/zdevito/ATen/blob/master/aten/src/ATen/templates/Tensor.h#L90


c-hofer commented Jun 25, 2018

Thanks, this is surely more elegant. By the way, any plans for when the new ATen API will be more or less stable?
