Commit

dvorak
Aayush Bajaj authored and Aayush Bajaj committed Dec 29, 2024
1 parent 86285a9 commit 5ed4ea3
Showing 32 changed files with 7,862 additions and 147 deletions.
7 changes: 4 additions & 3 deletions content/projects/_index.org
Original file line number Diff line number Diff line change
@@ -122,6 +122,7 @@ These projects are all those that have had a lifecycle.
:PROPERTIES:
:CUSTOM_ID: deep-learning
:END:
- [[/projects/dl/benchmarking][Hardware Benchmarking]]
- [[/projects/dl/KiTS19][KiTS19 Kidney and Kidney Tumour Segmentation]]
- [[/projects/dl/llm-tune][Fine Tuning LLM]]
- [[/projects/dl/rag][RAG]]
@@ -130,7 +131,7 @@ These projects are all those that have had a lifecycle.
- [[/projects/dl/Kanye-West-RNN][RNN on the Music of Kanye West]]
- [[/projects/ai/sentiment-analysis][Sentiment Analysis]]
- [[/projects/dl/cartpole][CartPole]]
- Neetcode.io
- [[/projects/dl/neetcode][Neetcode.io]]
- [[/projects/dl/micrograd.org][Micrograd - Andrej Karpathy]]
- minGPT - Karpathy
- nanoGPT - Karpathy
- [[/projects/dl/mingpt][minGPT - Karpathy]]
- [[/projects/dl/nanogpt][nanoGPT - Karpathy]]
73 changes: 73 additions & 0 deletions content/projects/ccs/LANGUAGE-python/mastering-python.ipynb
@@ -0,0 +1,73 @@
{
"cells": [
{
"metadata": {},
"cell_type": "markdown",
"source": "I shouldn't need any libraries for the below 100 exercises.",
"id": "9e47e101da00ec62"
},
{
"metadata": {},
"cell_type": "markdown",
"source": "https://www.w3resource.com/python-exercises/python_100_exercises_with_solutions.php",
"id": "6ad058ae25c2c50a"
},
{
"metadata": {},
"cell_type": "markdown",
"source": [
"exercise 1)\n",
"create a list with values ranging from 0 to 9"
],
"id": "3cadb4f2e40cfe1"
},
{
"metadata": {
"ExecuteTime": {
"end_time": "2024-09-23T06:29:55.440674Z",
"start_time": "2024-09-23T06:29:55.438348Z"
}
},
"cell_type": "code",
"source": "e1 = list(range(10))",
"id": "c368c93a1cb3f2ae",
"outputs": [],
"execution_count": 4
},
{
"metadata": {},
"cell_type": "markdown",
"source": "ex2) convert a list of integers to a list of strings",
"id": "278e6c22023d448a"
},
{
"metadata": {},
"cell_type": "code",
"outputs": [],
"execution_count": null,
"source": "",
"id": "38374e7179eb042d"
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
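The notebook above poses exercise 2 ("convert a list of integers to a list of strings") but leaves its code cell empty. One plausible solution, sketched here rather than taken from the notebook:

```python
# Hypothetical solution to exercise 2 above (the notebook's cell is empty):
# convert a list of integers to a list of strings.
ints = [0, 1, 2, 3]
strings = [str(n) for n in ints]  # str() on each element, order preserved
print(strings)  # -> ['0', '1', '2', '3']
```

`list(map(str, ints))` would be an equally idiomatic alternative.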
42 changes: 23 additions & 19 deletions content/projects/dl/Kanye-West-RNN.org
@@ -16,26 +16,28 @@ This document contains the code to create an RNN chatbot that emulates Kanye West.
I am starting from scratch on this machine:

#+BEGIN_SRC sh
/opt/homebrew/bin/neofetch --stdout
$ neofetch
#+END_SRC

#+RESULTS:
: [email protected]
: -------------------------------------
: OS: macOS 15.2 24C101 arm64
: Host: MacBookPro17,1
: Kernel: 24.2.0
: Uptime: 1 day, 22 hours, 56 mins
: Shell: zsh 5.9
: Resolution: 3840x2160 @ UHDHz, 2560x1600
: DE: Aqua
: WM: Quartz Compositor
: WM Theme: Blue (Dark)
: Terminal: Emacs-arm64-11
: CPU: Apple M1
: GPU: Apple M1
: Memory: 1369MiB / 8192MiB
#+BEGIN_SRC
​​ 'c. [email protected]
,xNMM. -------------------------------------
.OMMMMo OS: macOS 15.2 24C101 arm64
OMMM0, Host: MacBookPro17,1
.;loddo:' loolloddol;. Kernel: 24.2.0
cKMMMMMMMMMMNWMMMMMMMMMM0: Uptime: 2 days, 42 mins
.KMMMMMMMMMMMMMMMMMMMMMMMWd. Packages: 182 (brew)
XMMMMMMMMMMMMMMMMMMMMMMMX. Shell: zsh 5.9
;MMMMMMMMMMMMMMMMMMMMMMMM: Resolution: 1440x900
:MMMMMMMMMMMMMMMMMMMMMMMM: DE: Aqua
.MMMMMMMMMMMMMMMMMMMMMMMMX. WM: Quartz Compositor
kMMMMMMMMMMMMMMMMMMMMMMMMWd. WM Theme: Blue (Dark)
.XMMMMMMMMMMMMMMMMMMMMMMMMMMk Terminal: tmux
.XMMMMMMMMMMMMMMMMMMMMMMMMK. CPU: Apple M1
kMMMMMMMMMMMMMMMMMMMMMMd GPU: Apple M1
;KMMMMMMMWXXWMMMMMMMk. Memory: 1485MiB / 8192MiB
.cooc,. .,coo:.
#+END_SRC

This is why I first needed to install conda. I went with the whole suite from https://www.anaconda.com/download.

Expand All @@ -46,8 +48,10 @@ Then I initialised my environment and installed the correct packages:
conda activate nlp
conda install numpy
conda install pandas
conda install matplotlib
conda install scikit-learn
pip install tensorflow-macos
pip install lyricsgenius
#+END_SRC

* Sourcing data and cleaning:
331 changes: 331 additions & 0 deletions content/projects/dl/YOUTUBE-andrej-karpathy/backprop.ipynb

Large diffs are not rendered by default.


74 changes: 74 additions & 0 deletions content/projects/dl/benchmarking.org
@@ -0,0 +1,74 @@
+++
title = "Benchmarking hardware for ML/DL"
tags = ["benchmark", "ml", "dl", "macos", "m1"]
toc = "true"
+++

{{< collapse >}}

This page contains results and explanations of benchmarking metrics for my hardware:

* 2020 M1 Macbook Pro 8GB

#+ATTR_HTML: :width 300px
[[/images/m1-about.png]]

** Stream

#+BEGIN_SRC sh
gcc -O stream.c -o stream
./stream
#+END_SRC

#+CAPTION: [[https://github.com/jeffhammond/STREAM][STREAM]] output
#+BEGIN_SRC
-------------------------------------------------------------
STREAM version $Revision: 5.10 $
-------------------------------------------------------------
This system uses 8 bytes per array element.
-------------------------------------------------------------
Array size = 10000000 (elements), Offset = 0 (elements)
Memory per array = 76.3 MiB (= 0.1 GiB).
Total memory required = 228.9 MiB (= 0.2 GiB).
Each kernel will be executed 10 times.
The *best* time for each kernel (excluding the first iteration)
will be used to compute the reported bandwidth.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 15047 microseconds.
(= 15047 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function Best Rate MB/s Avg time Min time Max time
Copy: 60496.6 0.002718 0.002645 0.002956
Scale: 49323.0 0.003346 0.003244 0.003575
Add: 56036.1 0.004534 0.004283 0.005507
Triad: 55933.4 0.004587 0.004291 0.006148
-------------------------------------------------------------
Solution Validates: avg error less than 1.000000e-13 on all three arrays
-------------------------------------------------------------
#+END_SRC
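The reported bandwidths can be sanity-checked from STREAM's own parameters. The Triad kernel computes =a[i] = b[i] + scalar*c[i]=, touching three 8-byte arrays per element, so each iteration moves 24 bytes per element. A quick check against the numbers above:

```python
# Sanity-check STREAM's reported Triad bandwidth from its own numbers.
# Triad does a[i] = b[i] + scalar * c[i]: two reads plus one write,
# i.e. 3 arrays x 8 bytes = 24 bytes moved per element.
n = 10_000_000          # "Array size" reported above
bytes_per_elem = 3 * 8  # three 8-byte arrays touched per element
min_time = 0.004291     # best (min) Triad time in seconds, from the table

rate_mb_s = bytes_per_elem * n / min_time / 1e6  # STREAM uses MB = 10^6 bytes
print(f"{rate_mb_s:.1f} MB/s")  # within rounding of the reported 55933.4 MB/s
```

The small discrepancy against the printed 55933.4 MB/s comes from the timing value itself being rounded to six decimal places.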

** GPU Info

#+CAPTION: [[https://github.com/philipturner/applegpuinfo.git][GPUInfo]] output
#+BEGIN_SRC
Build of product 'gpuinfo' complete! (38.30s)
GPU name: Apple M1
GPU vendor: Apple
GPU core count: 8
GPU clock frequency: 1.278 GHz
GPU bandwidth: 68.3 GB/s
GPU FLOPS: 2.617 TFLOPS
GPU IPS: 1.309 TIPS
GPU system level cache: 8 MB
GPU memory: 8 GB
GPU family: Apple 7
#+END_SRC
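The FLOPS figure gpuinfo prints follows from the core count and clock it also reports, under the usual assumption (not printed by the tool itself) that each M1 GPU core has 128 FP32 ALUs, each retiring one fused multiply-add (2 FLOPs) per cycle:

```python
# How gpuinfo's 2.617 TFLOPS figure follows from the other reported numbers.
# Assumption: 128 FP32 ALUs per GPU core, 2 FLOPs/ALU/cycle via FMA.
cores = 8                # "GPU core count" above
alus_per_core = 128      # assumed M1 layout; not printed by gpuinfo
flops_per_cycle = 2      # one FMA = one multiply + one add
clock_ghz = 1.278        # "GPU clock frequency" above

tflops = cores * alus_per_core * flops_per_cycle * clock_ghz / 1000
print(f"{tflops:.3f} TFLOPS")  # -> 2.617 TFLOPS
```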

* NVIDIA Orin Nano Super 8GB
62 changes: 62 additions & 0 deletions content/projects/dl/benchmarking.org~
@@ -0,0 +1,62 @@
+++
title = "Machine Learning on the M1 Macbook Pro"
+++

This page just contains some benchmarking metrics for the 2020 M1 Macbook Pro 8GB:

[[m1.png]]

#+BEGIN_SRC sh
~/Code/STREAM/stream
#+END_SRC

#+CAPTION: [[https://github.com/jeffhammond/STREAM][STREAM]] output
#+BEGIN_SRC
-------------------------------------------------------------
STREAM version $Revision: 5.10 $
-------------------------------------------------------------
This system uses 8 bytes per array element.
-------------------------------------------------------------
Array size = 10000000 (elements), Offset = 0 (elements)
Memory per array = 76.3 MiB (= 0.1 GiB).
Total memory required = 228.9 MiB (= 0.2 GiB).
Each kernel will be executed 10 times.
 The *best* time for each kernel (excluding the first iteration)
 will be used to compute the reported bandwidth.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 14700 microseconds.
   (= 14700 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:           60721.0     0.002672     0.002635     0.002711
Scale:          49384.7     0.003294     0.003240     0.003409
Add:            56428.8     0.004324     0.004253     0.004528
Triad:          56708.5     0.004283     0.004232     0.004332
-------------------------------------------------------------
Solution Validates: avg error less than 1.000000e-13 on all three arrays
-------------------------------------------------------------
#+END_SRC

Output of [[https://github.com/philipturner/applegpuinfo.git][GPUInfo]]
#+BEGIN_SRC
Build of product 'gpuinfo' complete! (38.30s)
GPU name: Apple M1
GPU vendor: Apple
GPU core count: 8
GPU clock frequency: 1.278 GHz
GPU bandwidth: 68.3 GB/s
GPU FLOPS: 2.617 TFLOPS
GPU IPS: 1.309 TIPS
GPU system level cache: 8 MB
GPU memory: 8 GB
GPU family: Apple 7
#+END_SRC
47 changes: 47 additions & 0 deletions content/projects/dl/llm-run.org
@@ -0,0 +1,47 @@
+++
title = "Running LLMs locally"
tags = ["llm", "mac", "ram"]
+++

* Context:
Running LLMs (large language models) locally is now possible [fn:1] thanks to the abundance of highly parallelised compute (GPUs) at affordable prices and the advances in Deep Learning over the past decade.

As such, even modestly powerful consumer devices such as my M1 MacBook Pro with 8GB of RAM can run a small LLM. The purpose of this post is to investigate the token speed and accuracy of a variety of LLMs on my machines.


* Instructions

Courtesy of Alex Ziskind's [[https://www.youtube.com/watch?v=bp2eev21Qfo][tutorial]]:
1. clone the frontend https://github.com/open-webui/open-webui.git
2. install ollama from https://ollama.com
3. in the command line pull whichever model you like: =ollama pull llama3=
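Once ollama is running (step 2), a pulled model can also be queried over ollama's local REST API rather than through the open-webui frontend. A minimal sketch, assuming an ollama server listening on its default =localhost:11434=:

```python
# Sketch: query a locally pulled model via ollama's REST API.
# Assumes an ollama server is running on localhost:11434 (the default).
import json
import urllib.request

def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a POST request for ollama's /api/generate endpoint."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload.encode(),
        headers={"Content-Type": "application/json"},
    )

req = build_request("llama3", "write me a 500 word summary on Dante's divine comedy")
# With a live server: json.load(urllib.request.urlopen(req))["response"]
print(json.loads(req.data)["model"])  # -> llama3
```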

* Results

#+BEGIN_SRC
>>> write me a 500 word summary on Dante's divine comedy
**The Divine Comedy: A Journey Through Hell, Purgatory, and
Paradise**

Dante Alighieri's masterpiece, The Divine Comedy, is an epic poem
#+END_SRC

The above output came from the prompt =write me a 500 word...=; it was trivially easy to configure.

The 7-billion-parameter model (llama3) utilised ~90% of my available RAM.

* Opinion

I did not even have to create a new =conda= environment or run =pip install -r requirements.txt= for the /open-webui/ framework.
It feels as if everything has already been done for me.

* Conclusion

I shall come back and rework this document once I receive my NVIDIA Orin Nano Super. I think this page will have some utility for benchmarking my hardware, but cloning and running pre-implemented LLMs is an intellectually trivial task not worth doing.

I am now implementing Andrej Karpathy's [[https://github.com/karpathy/minGPT][MinGPT]] [[/projects/dl/mingpt.org][here]].


* Footnotes

[fn:1]as of 2024
23 changes: 23 additions & 0 deletions content/projects/dl/llm-run.org~
@@ -0,0 +1,23 @@
+++
title = "Running LLMs locally"
tags = ["llm", "mac", "ram"]
+++

* Context:
Running LLMs (large language models) locally is now possible [fn:1] thanks to the abundance of highly parallelised compute (GPUs) at affordable prices and the advances in Deep Learning over the past decade.

As such, even modestly powerful consumer devices such as my M1 MacBook Pro with 8GB of RAM can run a small LLM. The purpose of this post is to investigate the token speed and accuracy of a variety of LLMs on my machines.


* Instructions

Courtesy of Alex Ziskind's [[https://www.youtube.com/watch?v=bp2eev21Qfo][tutorial]]:
1. clone the frontend https://github.com/open-webui/open-webui.git
2. install ollama from https://ollama.com
3. in the command line pull whichever model you like: =ollama pull llama3=

*

* Footnotes

[fn:1]as of 2024
6 changes: 6 additions & 0 deletions content/projects/dl/mingpt.org
@@ -0,0 +1,6 @@
+++
title = "MinGPT"
tags = ["llm", "andrej-karpathy", "scratch"]
+++

ttt
5 changes: 5 additions & 0 deletions content/projects/dl/mingpt.org~
@@ -0,0 +1,5 @@
+++
title = "MinGPT"
tags = ["llm", "andrej-karpathy", "scratch"]
+++

3 changes: 3 additions & 0 deletions content/projects/dl/nanogpt.org
@@ -0,0 +1,3 @@
+++
title = "NanoGPT - Min with Teeth"
+++
3 changes: 3 additions & 0 deletions content/projects/dl/neetcode.org
@@ -0,0 +1,3 @@
+++
title = "Neetcode Solutions"
+++
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
6 changes: 6 additions & 0 deletions content/projects/ml/m1-macbook.org~
@@ -0,0 +1,6 @@
+++
title = "Machine Learning on the M1 Macbook Pro"
+++

This page just contains some benchmarking metrics for the 2020 M1 Macbook Pro 8GB:
