CUDA-Learning-Journal

Learning CUDA is both intriguing and extensive. To record my progress and insights, I keep a journal in this repository. My goal is to run every code snippet in Google Colab, so that the examples stay accessible and easy to experiment with.

Steps

  1. Shared memory
  2. Avoiding bank conflicts
  3. Multiple elements per thread
  4. Vectorized memory access
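
The journal does not say which kernel these steps are applied to, so the following is only a minimal sketch that assumes a sum reduction. It shows one way the four steps could fit together in a single kernel. The names `reduce_sum`, `gpu_sum`, and `kBlock` are hypothetical, the input length is assumed to be a multiple of 4, and error checking is left out for brevity.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

constexpr int kBlock = 256;  // threads per block (assumed; must be a power of two)

// Each block writes one partial sum to out[blockIdx.x].
// n4 is the number of float4 elements, i.e. n / 4.
__global__ void reduce_sum(const float4* in, float* out, int n4) {
    __shared__ float tile[kBlock];  // step 1: stage per-thread results in shared memory

    // Steps 3 and 4: every thread accumulates many elements in a register,
    // and each load fetches four floats at once through float4.
    float acc = 0.0f;
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n4;
         i += gridDim.x * blockDim.x) {
        float4 v = in[i];
        acc += v.x + v.y + v.z + v.w;
    }
    tile[threadIdx.x] = acc;
    __syncthreads();

    // Step 2: sequential addressing. The active threads read and write
    // consecutive shared-memory words, so lanes of a warp map to distinct banks.
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (threadIdx.x < s) tile[threadIdx.x] += tile[threadIdx.x + s];
        __syncthreads();
    }
    if (threadIdx.x == 0) out[blockIdx.x] = tile[0];
}

// Hypothetical host helper: sums h on the GPU. h.size() must be a multiple of 4.
float gpu_sum(const std::vector<float>& h) {
    const int n4 = static_cast<int>(h.size() / 4);
    const int blocks = 64;  // arbitrary grid size chosen for the sketch
    float *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, h.size() * sizeof(float));
    cudaMalloc(&d_out, blocks * sizeof(float));
    cudaMemcpy(d_in, h.data(), h.size() * sizeof(float), cudaMemcpyHostToDevice);

    reduce_sum<<<blocks, kBlock>>>(reinterpret_cast<const float4*>(d_in), d_out, n4);

    std::vector<float> partial(blocks);
    cudaMemcpy(partial.data(), d_out, blocks * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);

    float total = 0.0f;  // finish the last 64 partial sums on the host
    for (float p : partial) total += p;
    return total;
}

int main() {
    std::vector<float> h(1 << 20, 1.0f);
    printf("sum = %.0f (expected %zu)\n", gpu_sum(h), h.size());
    return 0;
}
```

In Colab this can be saved to a `.cu` file and compiled with `nvcc` on a GPU runtime. The pointer cast to `float4` relies on `cudaMalloc` returning suitably aligned memory; a buffer with an arbitrary offset would need a scalar fallback.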

Streaming Multiprocessor (SM)

  1. Each SM has its own shared memory, register file, warp schedulers, and so on.
  2. A block runs on exactly one SM, but one SM can hold multiple blocks at the same time.
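
The two points above can be checked on a real device. The sketch below prints the per-SM resources reported by the runtime and records which SM each block ran on by reading the PTX special register `%smid`. The names `sm_id`, `record_sm`, and `block_sm_map` are hypothetical, error checking is omitted, and `%smid` is documented as volatile, so the output is best treated as a diagnostic rather than a guarantee.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Reads the PTX special register %smid: the ID of the SM running this thread.
__device__ unsigned int sm_id() {
    unsigned int id;
    asm volatile("mov.u32 %0, %%smid;" : "=r"(id));
    return id;
}

// Thread 0 of each block records the SM that the block was scheduled on.
__global__ void record_sm(unsigned int* block_to_sm) {
    if (threadIdx.x == 0) block_to_sm[blockIdx.x] = sm_id();
}

// Hypothetical host helper: launches `blocks` blocks and returns the SM ID per block.
std::vector<unsigned int> block_sm_map(int blocks, int threads) {
    unsigned int* d_map = nullptr;
    cudaMalloc(&d_map, blocks * sizeof(unsigned int));
    record_sm<<<blocks, threads>>>(d_map);
    std::vector<unsigned int> map(blocks);
    cudaMemcpy(map.data(), d_map, blocks * sizeof(unsigned int), cudaMemcpyDeviceToHost);
    cudaFree(d_map);
    return map;
}

int main() {
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);
    printf("SMs: %d\n", prop.multiProcessorCount);
    printf("shared memory per SM: %zu bytes\n", prop.sharedMemPerMultiprocessor);
    printf("registers per SM: %d\n", prop.regsPerMultiprocessor);
    printf("max resident threads per SM: %d\n", prop.maxThreadsPerMultiProcessor);

    // How many blocks of this kernel can be resident on one SM at once.
    const int threads = 128;
    int blocks_per_sm = 0;
    cudaOccupancyMaxActiveBlocksPerMultiprocessor(&blocks_per_sm, record_sm, threads, 0);
    printf("resident blocks of record_sm per SM: %d\n", blocks_per_sm);

    // Launch more blocks than there are SMs, so SM IDs have to repeat.
    const int blocks = 4 * prop.multiProcessorCount;
    std::vector<unsigned int> map = block_sm_map(blocks, threads);
    for (int b = 0; b < blocks; ++b) printf("block %d -> SM %u\n", b, map[b]);
    return 0;
}
```

Each block reports a single SM ID, while the same SM ID shows up for several blocks, which matches the one-block-one-SM, one-SM-many-blocks relationship.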