#GPU_sharing #GPU_time_sharing #API_remoting #kernel_burst

Gemini: Enabling multi-tenant GPU sharing based on kernel burst estimation

Meta Info

Published in IEEE Transactions on Cloud Computing (TCC), 2021.

Authors: Hung-Hsin Chen, En-Te Lin, Yu-Min Chou, Jerry Chou (National Tsing Hua University).

Code: https://github.com/NTHU-LSALAB/Gemini

Understanding the paper

This paper proposes Gemini, a user-space runtime scheduling framework that enables fine-grained GPU allocation control with support for multi-tenancy and elastic allocation.
- Introduces the kernel burst: a group of consecutive kernels launched together without being interrupted by synchronous events.
  - Typical GPU programming model:
    1. copy data to GPU device memory
    2. launch CUDA kernels without data dependency
    3. wait for kernels to complete
    4. copy results back to CPU host memory
- Proposes a low-overhead, event-driven monitor and a dynamic time-sharing scheduler.
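
To make the kernel burst definition concrete, here is a minimal Python sketch that splits a trace of GPU API events into bursts. It is not Gemini's implementation. The `Event` type, the `KERNEL` and `SYNC` kinds, and the event names are hypothetical, chosen only to mirror the four-step programming model listed above. The one assumption taken from the notes is that a synchronous event (a blocking wait or a synchronous memory copy) ends the current burst.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical event kinds for illustration; not Gemini's actual hook names.
KERNEL = "kernel_launch"
SYNC = "sync"  # e.g. a blocking wait or a synchronous memory copy


@dataclass
class Event:
    kind: str
    name: str


def split_into_bursts(trace: List[Event]) -> List[List[str]]:
    """Group consecutive kernel launches; a synchronous event closes the burst."""
    bursts: List[List[str]] = []
    current: List[str] = []
    for ev in trace:
        if ev.kind == KERNEL:
            current.append(ev.name)
        elif ev.kind == SYNC and current:
            # A synchronous event interrupts the run of kernel launches.
            bursts.append(current)
            current = []
    if current:
        # Kernels launched after the last synchronous event form a final burst.
        bursts.append(current)
    return bursts


# A trace following the typical model: copy in, launch kernels, wait, copy out.
trace = [
    Event(SYNC, "copy_host_to_device"),
    Event(KERNEL, "k1"),
    Event(KERNEL, "k2"),
    Event(SYNC, "wait_for_kernels"),
    Event(SYNC, "copy_device_to_host"),
    Event(KERNEL, "k3"),
    Event(SYNC, "wait_for_kernels"),
]

print(split_into_bursts(trace))
```

In this trace, `k1` and `k2` form one burst because no synchronous event separates them, while `k3` starts a new burst after the wait and the copy back to the host.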