neural-network

Convolutional Neural Network with CUDA, updated to run on CUDA Toolkit 12.3

Layers

  • Linear
  • Conv2D
  • MaxPool2D
  • ReLU
  • Softmax
  • Sigmoid
  • NLLLoss
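
As a rough illustration of how the last two stages of such a network fit together, here is a minimal host-side C++ sketch of a numerically stable softmax followed by a negative log-likelihood loss. It is not taken from this repository's CUDA kernels: the function names `softmax` and `nll_loss` are hypothetical, and it assumes the loss receives probabilities rather than log-probabilities.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: softmax over one row of logits.
// Subtracting the maximum logit first keeps std::exp from overflowing.
std::vector<float> softmax(const std::vector<float>& logits) {
    assert(!logits.empty());
    const float max_logit = *std::max_element(logits.begin(), logits.end());
    std::vector<float> probs(logits.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < logits.size(); ++i) {
        probs[i] = std::exp(logits[i] - max_logit);
        sum += probs[i];
    }
    for (float& p : probs) {
        p /= sum;  // normalize so the row sums to 1
    }
    return probs;
}

// Hypothetical helper: negative log-likelihood of the target class.
// Assumption: `probs` holds probabilities, not log-probabilities.
float nll_loss(const std::vector<float>& probs, std::size_t target) {
    assert(target < probs.size());
    return -std::log(probs[target]);
}
```

For a single sample, `nll_loss(softmax(logits), label)` gives the loss; a batched GPU version would typically run one thread or block per row.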

Optimizer

  • RMSProp

Prerequisites

  • CMake 3.8+
  • MSVC 14.00 / GCC 6+
  • CUDA 12

Run

mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
make -j10
mkdir mnist_data && cd mnist_data
wget -c http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
wget -c http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
wget -c http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
wget -c http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
gunzip train-images-idx3-ubyte.gz 
gunzip train-labels-idx1-ubyte.gz 
gunzip t10k-labels-idx1-ubyte.gz 
gunzip t10k-images-idx3-ubyte.gz 
cd .. && ./mnist

Performance

conv 1 32 5 relu
maxpool 2
conv 32 64 5 relu
maxpool 2
conv 64 128 3 relu
fc 4 * 128 128 relu
fc 128 10 relu
softmax
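
The `4 * 128` input size of the first fully connected layer can be checked by tracing the tensor shape through the layers above. The sketch below does this in plain C++; it assumes 28 x 28 single-channel MNIST input, stride-1 convolutions without padding, and non-overlapping pooling windows. The README does not state these settings, but they are consistent with the `4 * 128` figure. The `Shape` struct and helper names are hypothetical.

```cpp
#include <cassert>

// Hypothetical shape record: channels x height x width.
struct Shape {
    int channels;
    int height;
    int width;
};

// Assumption: stride 1, no padding, square kernel.
Shape conv2d(Shape in, int in_channels, int out_channels, int kernel) {
    assert(in.channels == in_channels);
    return {out_channels, in.height - kernel + 1, in.width - kernel + 1};
}

// Assumption: non-overlapping windows (stride equals window size).
Shape maxpool2d(Shape in, int window) {
    assert(in.height % window == 0 && in.width % window == 0);
    return {in.channels, in.height / window, in.width / window};
}

int flattened_size(Shape s) {
    return s.channels * s.height * s.width;
}

// Traces the convolutional part of the network listed above.
int mnist_features() {
    Shape s{1, 28, 28};          // one MNIST image
    s = conv2d(s, 1, 32, 5);     // 32 x 24 x 24
    s = maxpool2d(s, 2);         // 32 x 12 x 12
    s = conv2d(s, 32, 64, 5);    // 64 x 8 x 8
    s = maxpool2d(s, 2);         // 64 x 4 x 4
    s = conv2d(s, 64, 128, 3);   // 128 x 2 x 2
    return flattened_size(s);    // 2 * 2 * 128 = 4 * 128
}
```

Under these assumptions the final feature map is 128 channels of 2 x 2, which flattens to the 512 inputs expected by `fc 4 * 128 128`.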

shuffle = true
batch_size = 128
learning_rate = 0.003
L2 = 0.0001
beta = 0.99
  • 1 epoch: 93%
  • 10 epochs: 99.12%
  • 30 epochs: 99.23%
  • 10 s / epoch (GTX 1070)
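
To show how the hyperparameters above could enter a single optimizer step, here is a minimal host-side sketch of a common RMSProp formulation. It is an assumption that `beta` is the decay rate of the running squared-gradient average and that `L2` is applied as weight decay added to the gradient; the repository's CUDA kernels may differ. `RMSPropConfig`, `rmsprop_step`, and `epsilon` are hypothetical names not taken from the source.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical configuration mirroring the values listed above.
struct RMSPropConfig {
    float learning_rate = 0.003f;
    float l2 = 0.0001f;      // assumed to act as weight decay
    float beta = 0.99f;      // assumed decay of the squared-gradient average
    float epsilon = 1e-8f;   // hypothetical stabilizer, not from the README
};

// One in-place RMSProp update over a flat parameter vector.
void rmsprop_step(std::vector<float>& weights,
                  const std::vector<float>& grads,
                  std::vector<float>& square_avg,
                  const RMSPropConfig& cfg) {
    assert(weights.size() == grads.size());
    assert(weights.size() == square_avg.size());
    for (std::size_t i = 0; i < weights.size(); ++i) {
        // Add the L2 penalty gradient to the loss gradient.
        const float g = grads[i] + cfg.l2 * weights[i];
        // Exponential moving average of squared gradients.
        square_avg[i] = cfg.beta * square_avg[i] + (1.0f - cfg.beta) * g * g;
        // Scale the step by the root of the running average.
        weights[i] -= cfg.learning_rate * g /
                      (std::sqrt(square_avg[i]) + cfg.epsilon);
    }
}
```

The caller keeps one `square_avg` buffer per parameter tensor, initialized to zero, and calls `rmsprop_step` once per batch; on the GPU the loop body maps naturally to one thread per element.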

TODO

  • Faster matmul kernel function
  • CUDA Streams
