What is CUDA

CUDA is a parallel processing platform model developed by NVIDIA developer for performing general computing operation on their GPU.

CUDA abbrevation is Compute Unified Device Architecture
CUDA platform provide a software layer which give access to the GPU's virtual instruction set along with parallel computing element for execution of computer kernel.
This platform is designed to work with programming language such as C,C++ and Fortan

CUDA 8.0 comes with the following libraries (for compilation & runtime, in alphabetical order):

CUDA 8.0 comes with these other software components:

CUDA 9.0–9.2 comes with these other components:

CUTLASS 1.0 – custom linear algebra algorithms,
~~NVCUVID~~ – NVIDIA Video Decoder was deprecated in CUDA 9.2; it is now available in NVIDIA Video Codec SDK

CUDA 10 comes with these other components:

Advantages[edit]

CUDA has several advantages over traditional general-purpose computation on GPUs (GPGPU) using graphics APIs:

Scattered reads – code can read from arbitrary addresses in memory.
Unified virtual memory (CUDA 4.0 and above)
Unified memory (CUDA 6.0 and above)
Shared memory – CUDA exposes a fast shared memory region that can be shared among threads. This can be used as a user-managed cache, enabling higher bandwidth than is possible using texture lookups.^[16]
Faster downloads and readbacks to and from the GPU
Full support for integer and bitwise operations, including integer texture lookups
On RTX 20 and 30 series cards, the CUDA cores are used for a feature called "RTX IO" Which is where the CUDA cores dramatically decrease game-loading times.