CUDA Implementation Details

1. Define error handler macro

GPU error handler macro code snippet image
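The original snippet is only available as an image, so the following is a minimal sketch of what such an error handler macro commonly looks like, not the author's exact code. The macro name `CUDA_CHECK` and the helper `device_count` are hypothetical and used here only for illustration.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical macro name. Wraps any CUDA runtime call that returns
// cudaError_t, prints the failure location and message, then aborts.
#define CUDA_CHECK(call)                                                 \
    do {                                                                 \
        cudaError_t err_ = (call);                                       \
        if (err_ != cudaSuccess) {                                       \
            std::fprintf(stderr, "CUDA error %s at %s:%d: %s\n",         \
                         cudaGetErrorName(err_), __FILE__, __LINE__,     \
                         cudaGetErrorString(err_));                      \
            std::exit(EXIT_FAILURE);                                     \
        }                                                                \
    } while (0)

// Illustrative use: query the number of CUDA devices through the macro.
int device_count() {
    int n = 0;
    CUDA_CHECK(cudaGetDeviceCount(&n));
    return n;
}
```

The `do { ... } while (0)` wrapper lets the macro behave like a single statement, so it can be used safely inside an unbraced `if`/`else`.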

2. Define kernel

kernel definition code snippet image
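The source does not say what the kernel computes, so the sketch below assumes a simple per-pixel inversion of an 8-bit grayscale image. The names `ImageArgs` and `invert_kernel` are hypothetical. Because later steps place the kernel arguments in GPU memory, this sketch reads the image dimensions through a device pointer; passing them by value would also work.

```cuda
#include <cuda_runtime.h>

// Hypothetical argument block, assumed to live in GPU memory.
struct ImageArgs {
    int width;
    int height;
};

// Hypothetical kernel: one thread per pixel, row-major 8-bit grayscale.
__global__ void invert_kernel(const unsigned char* in, unsigned char* out,
                              const ImageArgs* args) {
    const int x = blockIdx.x * blockDim.x + threadIdx.x;
    const int y = blockIdx.y * blockDim.y + threadIdx.y;

    // The grid is rounded up to whole blocks, so threads that fall
    // outside the image must exit without touching memory.
    if (x >= args->width || y >= args->height) return;

    const int idx = y * args->width + x;
    out[idx] = 255 - in[idx];
}
```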

3. Allocate space on GPU for input image

GPU space allocation for input image code snippet image

4. Copy input image from RAM to GPU DRAM

copy input image to GPU code snippet image

5. Allocate space on GPU for kernel arguments

allocate space on GPU for kernel arguments code snippet image

6. Copy kernel arguments to GPU DRAM

copy kernel arguments to GPU code snippet image
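Steps 3 to 6 can be sketched roughly as follows. This is an illustration under assumptions, not the original code: the image is taken to be 8-bit grayscale, and `ImageArgs`, `DeviceInputs`, and `upload_inputs` are hypothetical names. Only `cudaMalloc`, `cudaMemcpy`, and `cudaFree` come from the CUDA runtime API.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical argument block copied to the GPU in steps 5 and 6.
struct ImageArgs {
    int width;
    int height;
};

// Hypothetical bundle of the device pointers produced by the upload.
struct DeviceInputs {
    unsigned char* image;
    ImageArgs* args;
};

// Allocates GPU memory for the input image and the kernel arguments and
// copies both from host RAM. Returns the first error encountered.
cudaError_t upload_inputs(const unsigned char* host_image,
                          const ImageArgs& host_args, DeviceInputs* dev) {
    const size_t bytes =
        static_cast<size_t>(host_args.width) * host_args.height;
    dev->image = nullptr;
    dev->args = nullptr;

    // Step 3: allocate space on the GPU for the input image.
    cudaError_t err = cudaMalloc(reinterpret_cast<void**>(&dev->image), bytes);
    // Step 4: copy the input image from RAM to GPU DRAM.
    if (err == cudaSuccess)
        err = cudaMemcpy(dev->image, host_image, bytes, cudaMemcpyHostToDevice);
    // Step 5: allocate space on the GPU for the kernel arguments.
    if (err == cudaSuccess)
        err = cudaMalloc(reinterpret_cast<void**>(&dev->args), sizeof(ImageArgs));
    // Step 6: copy the kernel arguments to GPU DRAM.
    if (err == cudaSuccess)
        err = cudaMemcpy(dev->args, &host_args, sizeof(ImageArgs),
                         cudaMemcpyHostToDevice);

    if (err != cudaSuccess) {
        cudaFree(dev->image);  // cudaFree on a null pointer does nothing
        cudaFree(dev->args);
        dev->image = nullptr;
        dev->args = nullptr;
    }
    return err;
}
```

The caller owns the returned pointers and is expected to release them with `cudaFree` once the kernel has finished.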

7. Allocate space on GPU for output image

GPU space allocation for output image code snippet image

8. Launch the kernel

kernel launch code snippet image

9. Handle kernel launch error

kernel launch error handling code snippet image

10. Wait for GPU to finish

wait for GPU to finish code snippet image

11. Copy the result from GPU DRAM to RAM

copy result from GPU to RAM code snippet image
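Putting the eleven steps together, a host-side routine might look like the sketch below. It assumes an 8-bit grayscale image and a per-pixel inversion kernel, since the source does not state the actual operation. `CUDA_CHECK`, `ImageArgs`, `invert_kernel`, and `invert_on_gpu` are hypothetical names, and the 16x16 block size is an arbitrary choice.

```cuda
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Step 1: error handler macro (hypothetical name).
#define CUDA_CHECK(call)                                                 \
    do {                                                                 \
        cudaError_t err_ = (call);                                       \
        if (err_ != cudaSuccess) {                                       \
            std::fprintf(stderr, "CUDA error at %s:%d: %s\n", __FILE__,  \
                         __LINE__, cudaGetErrorString(err_));            \
            std::exit(EXIT_FAILURE);                                     \
        }                                                                \
    } while (0)

struct ImageArgs { int width; int height; };  // hypothetical argument block

// Step 2: kernel (assumed operation: invert each pixel).
__global__ void invert_kernel(const unsigned char* in, unsigned char* out,
                              const ImageArgs* args) {
    const int x = blockIdx.x * blockDim.x + threadIdx.x;
    const int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= args->width || y >= args->height) return;
    out[y * args->width + x] = 255 - in[y * args->width + x];
}

void invert_on_gpu(const unsigned char* host_in, unsigned char* host_out,
                   int width, int height) {
    const size_t bytes = static_cast<size_t>(width) * height;
    const ImageArgs host_args = {width, height};
    unsigned char* d_in = nullptr;
    unsigned char* d_out = nullptr;
    ImageArgs* d_args = nullptr;

    // Steps 3-4: allocate and upload the input image.
    CUDA_CHECK(cudaMalloc(reinterpret_cast<void**>(&d_in), bytes));
    CUDA_CHECK(cudaMemcpy(d_in, host_in, bytes, cudaMemcpyHostToDevice));
    // Steps 5-6: allocate and upload the kernel arguments.
    CUDA_CHECK(cudaMalloc(reinterpret_cast<void**>(&d_args), sizeof(ImageArgs)));
    CUDA_CHECK(cudaMemcpy(d_args, &host_args, sizeof(ImageArgs),
                          cudaMemcpyHostToDevice));
    // Step 7: allocate the output image.
    CUDA_CHECK(cudaMalloc(reinterpret_cast<void**>(&d_out), bytes));

    // Step 8: launch with enough 16x16 blocks to cover the image.
    const dim3 block(16, 16);
    const dim3 grid((width + block.x - 1) / block.x,
                    (height + block.y - 1) / block.y);
    invert_kernel<<<grid, block>>>(d_in, d_out, d_args);
    // Step 9: a launch returns no status, so query it explicitly.
    CUDA_CHECK(cudaGetLastError());
    // Step 10: wait for the GPU; errors raised during execution surface here.
    CUDA_CHECK(cudaDeviceSynchronize());
    // Step 11: copy the result from GPU DRAM back to RAM.
    CUDA_CHECK(cudaMemcpy(host_out, d_out, bytes, cudaMemcpyDeviceToHost));

    CUDA_CHECK(cudaFree(d_in));
    CUDA_CHECK(cudaFree(d_out));
    CUDA_CHECK(cudaFree(d_args));
}
```

Steps 9 and 10 are checked separately because they catch different failures: `cudaGetLastError` reports an invalid launch configuration right after the launch, while `cudaDeviceSynchronize` reports faults that occur while the kernel is running.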