Merge branch 'main' into pascal-eigval
jlevine18 authored Feb 23, 2024
2 parents a83c6d6 + 6da0478 commit 7d89630
Showing 13 changed files with 220 additions and 44 deletions.
2 changes: 2 additions & 0 deletions .bundle/config
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
---
BUNDLE_DEPLOYMENT: "true"
31 changes: 16 additions & 15 deletions .github/workflows/build.yml
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,7 @@ permissions:
# However, do NOT cancel in-progress runs as we want to allow these production deployments to complete.
concurrency:
group: "build_pr"
cancel-in-progress: true
cancel-in-progress: false

jobs:
# Build job
Expand All @@ -27,19 +27,20 @@ jobs:
uses: actions/checkout@v4
with:
submodules: 'true'
- name: Setup Ruby
uses: ruby/setup-ruby@v1 # v1.161.0
- uses: docker/setup-buildx-action@v2
- uses: docker/build-push-action@v4
with:
ruby-version: '3.1' # Not needed with a .ruby-version file
cache-version: 0 # Increment this number if you need to re-download cached gems
- name: Setup Node
uses: actions/setup-node@v4
context: .
file: "Dockerfile.devel"
tags: cs357:local
load: true
cache-from: type=gha
cache-to: type=gha,mode=max
push: false
- name: Run build
uses: addnab/docker-run-action@v3
with:
node-version: '18.x'
- name: Setup Pages
id: pages
uses: actions/configure-pages@v4
- name: Install
run: make install
- name: Build with Jekyll
run: make build
image: cs357:local
options: -v ${{ github.workspace }}:/srv/jekyll
shell: bash
run: cd /srv/jekyll && ls && make install && make build
26 changes: 26 additions & 0 deletions .github/workflows/cbtfgen.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
name: Build CBTF Artifact from public site

on:
workflow_dispatch:

jobs:
# Build job
download:
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v4
with:
submodules: 'true'
- uses: docker/setup-buildx-action@v2
- name: Run build
uses: addnab/docker-run-action@v3
with:
image: ghcr.io/cs357/textbook-devel:latest
options: -v ${{ github.workspace }}:/srv/jekyll
shell: bash
run: cd /srv/jekyll/_cbtf && ls && ./create.sh
- uses: actions/upload-artifact@v4
with:
name: build
path: /srv/jekyll/_cbtf/output/ # or path/to/artifact
2 changes: 2 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -7,3 +7,5 @@ node_modules/
Gemfile.lock

.DS_Store

_cbtf/output/
2 changes: 1 addition & 1 deletion Dockerfile.devel
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
FROM ruby:2.7.7
ENV NODE_VERSION=20.11.0
RUN apt update && apt install -y curl build-essential
RUN apt update && apt install -y curl build-essential net-tools httrack
RUN curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.7/install.sh | bash
ENV NVM_DIR=/root/.nvm
RUN . "$NVM_DIR/nvm.sh" && nvm install ${NODE_VERSION}
Expand Down
2 changes: 1 addition & 1 deletion Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -45,4 +45,4 @@ build: dist
@${DEBUG} JEKYLL_ENV=production bundle exec jekyll build --safe --profile

server: dist
@${DEBUG} bundle exec jekyll server --safe --livereload
@${DEBUG} bundle exec jekyll server --safe --livereload
4 changes: 3 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,5 @@
# Welcome to the CS 357 Notes
# Welcome to the CS 357 Textbook

For the main course page, visit [{{site.courseUrl}}]({{site.courseUrl}}).

This website is in active development. If you encounter any issues or bugs, please file a bug report [here]({{site.issuesUrl}}).
6 changes: 6 additions & 0 deletions _cbtf/create.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
#!/bin/bash
mkdir -p output/
rm -rf output/
httrack https://cs357.github.io/textbook/ -O output/
echo '<meta HTTP-EQUIV="Refresh" CONTENT="0; URL=cs357.github.io/textbook/index.html">' > output/index.html
echo "Build done! Upload the contents of the 'output/' directory *without modification* to PrairieLearn to be used in the CBTF."
8 changes: 4 additions & 4 deletions _config.yml
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,6 @@ semester: sp2024
courseUrl: https://courses.engr.illinois.edu/cs357

addons:
- github
- gems
- analytics

Expand All @@ -29,12 +28,13 @@ exclude:
- node_modules
- README.txt
- cs357-rtd-theme
- _cbtf
- vendor
readme_index:
with_frontmatter: true

kramdown:
input: GFM
markdown: kramdown

# Sass/SCSS
sass:
style: compressed # https://sass-lang.com/documentation/file.SASS_REFERENCE.html#output_style
style: compressed # https://sass-lang.com/documentation/file.SASS_REFERENCE.html#output_style
2 changes: 1 addition & 1 deletion cs357-rtd-theme
2 changes: 1 addition & 1 deletion notes/random-monte-carlo.md
Original file line number Diff line number Diff line change
Expand Up @@ -62,7 +62,7 @@ A **_linear congruential generator_** (LCG) is pseudorandom number generator of

where <span>\\(a\\) (the multiplier)</span> and <span>\\(c\\) (the increment)</span> are given integers, and <span>\\(x_0\\)</span> is called the **_seed_**. The period of an LCG cannot exceed <span>\\(M\\) (the modulus)</span>. The period may be less than <span>\\(M\\)</span> depending on the values of <span>\\(a\\)</span> and <span>\\(c\\)</span>. The quality depends on both <span>\\(a\\)</span> and <span>\\(c\\)</span>.

### Example of an LCG
### Example of a LCG

Below is the Python code for an example LCG that generates the numbers \\(1,3,7,5,1,3,7,5,\dots\\) given an initial seed of <span>\\(1\\)</span>.
To follow the pattern, we double the previous number, add \\(1\\), and mod by 10, so \\(a=2\\), \\(c=1\\), and \\(M=10\\).
Expand Down
156 changes: 142 additions & 14 deletions notes/sparse.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,35 +2,117 @@
title: Sparse Matrices
description: How to store and solve problems with many zeros
sort: 10
author:
- CS 357 Course Staff
changelog:
-
name: Kriti Chandak
netid: kritic3
date: 2024-02-14
message: added information from slides and additional examples
-
name: Victor Zhao
netid: chenyan4
date: 2022-03-06
message: added instructions on how to interpret COO and CSR
-
name: Peter Sentz
netid: sentz2
date: 2020-03-01
message: extracted material from previous reference pages
---

# Sparse Matrices

* * *

## Dense Matrices

A \\(n \times n\\) matrix is called dense if it has <span>\\(O(n^2)\\)</span> non-zero entries. For example:
<div>\[\mathbf{A} = \begin{bmatrix} 1.0 & 2.0 & 3.0 \\ 4.0 & 5.0 & 6.0 \\ 7.0 & 8.0 & 9.0 \end{bmatrix}.\]</div>
A dense matrix stores all values, including zeros. So, an \\(m \times n\\) matrix has to store <span>\\(O(m \times n)\\)</span> entries. For example:
<div>\[\mathbf{A} = \begin{bmatrix} 1.0 & 0 & 0 & 2.0 & 0 \\ 3.0 & 4.0 & 0 & 5.0 & 0 \\ 6.0 & 0 & 7.0 & 8.0 & 9.0 \\ 0 & 0 & 10.0 & 11.0 & 0 \\ 0 & 0 & 0 & 0 & 12.0 \end{bmatrix}.\]</div>

To store the matrix, all components are saved in row-major order. For <span>\\(\mathbf{A}\\)</span> given above, we would store:
<div>\[AA = \begin{bmatrix} 1.0 & 2.0 & 3.0 & 4.0 & 5.0 & 6.0 & 7.0 & 8.0 & 9.0 \end{bmatrix}.\]</div>
To store the matrix, all components are generally saved in row-major order. For <span>\\(\mathbf{A}\\)</span> given above, we would store:
<div>\[A_{dense} = \begin{bmatrix} 1.0 & 0 & 0 & 2.0 & 0 & 3.0 & 4.0 & 0 & 5.0 & 0 & 6.0 & 0 & 7.0 & 8.0 & 9.0 & 0 & 0 & 10.0 & 11.0 & 0 & 0 & 0 & 0 & 0 & 12.0 \end{bmatrix}.\]</div>

The dimensions of the matrix are stored separately.
The dimensions of the matrix are stored separately:
<div>\[A_{shape} = (nrow, ncol).\]</div>
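
As a minimal sketch of the row-major layout described above, assuming NumPy as the dense container (it is not otherwise used on this page), the flattened array and the separately stored shape can be produced as follows. The names `A_dense` and `A_shape` simply mirror the notation above.

```python
import numpy as np

# The 5 x 5 example matrix, zeros included.
A = np.array([[1., 0., 0., 2., 0.],
              [3., 4., 0., 5., 0.],
              [6., 0., 7., 8., 9.],
              [0., 0., 10., 11., 0.],
              [0., 0., 0., 0., 12.]])

# Row-major ("C order") flattening: rows are laid out one after another.
A_dense = A.flatten(order="C")
A_shape = A.shape

# Entry (i, j) lives at offset i * ncol + j in the flat array.
nrow, ncol = A_shape
i, j = 2, 3
offset = i * ncol + j
```

Because the flat array alone does not say where one row ends and the next begins, the shape has to be kept alongside it.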

## Sparse Matrices

A \\(n \times n\\) matrix is called sparse if it has <span>\\(O(n)\\)</span> non-zero entries. For example:
<div>\[A = \begin{bmatrix} 1.0 & 0 & 0 & 2.0 & 0 \\ 3.0 & 4.0 & 0 & 5.0 & 0 \\ 6.0 & 0 & 7.0 & 8.0 & 9.0 \\ 0 & 0 & 10.0 & 11.0 & 0 \\ 0 & 0 & 0 & 0 & 12.0 \end{bmatrix}.\]</div>
Some matrices contain so many zeros that storing every entry is wasteful. A sparse matrix is a matrix with few non-zero entries.

An \\(m \times n\\) matrix is called sparse if it has <span>\\(O(\min(m, n))\\)</span> non-zero entries.

## Goals

A sparse matrix aims to store large matrices efficiently without storing many zeros, which allows for more economical computations. It also saves storage space, which can reduce memory overhead on a system.

The number of operations required to add two dense matrices <span>\\(\mathbf{P}\\)</span>, <span>\\(\mathbf{Q}\\)</span> of the same shape is <span>\\(O(n(\mathbf{P}))\\)</span>, where <span>\\(n(\mathbf{X})\\)</span> is the number of elements in <span>\\(\mathbf{X}\\)</span>, because every pair of corresponding entries must be added.

The number of operations required to add two sparse matrices <span>\\(\mathbf{P}\\)</span>, <span>\\(\mathbf{Q}\\)</span> is <span>\\(O(nnz(\mathbf{P}) + nnz(\mathbf{Q}))\\)</span>, where <span>\\(nnz(\mathbf{X})\\)</span> is the number of non-zero elements in <span>\\(\mathbf{X}\\)</span>, because only the stored entries need to be visited.
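
To make the cost of sparse addition concrete, here is a small sketch in which each matrix is held as a plain Python dictionary mapping `(row, col)` to a value. That representation and the helper name `add_sparse` are assumptions made for illustration only. The loop touches each stored non-zero of the two inputs once and never visits a zero entry.

```python
def add_sparse(P, Q):
    """Add two sparse matrices stored as {(row, col): value} dictionaries."""
    result = dict(P)                      # copy the nnz(P) stored entries of P
    for key, value in Q.items():          # visit the nnz(Q) stored entries of Q
        result[key] = result.get(key, 0.0) + value
    # Drop entries that cancelled to exactly zero so they are not stored.
    return {k: v for k, v in result.items() if v != 0.0}

P = {(0, 0): 1.0, (1, 2): 3.0}
Q = {(0, 0): 2.0, (2, 1): -4.0}
S = add_sparse(P, Q)
```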

## Storage Solutions

There are many ways to store sparse matrices such as Coordinate (COO), Compressed Sparse Row (CSR), Block Sparse Row (BSR), Dictionary of Keys (DOK), etc.

We will focus on Coordinate and Compressed Sparse Row.

Let's explore ways to store the following example:
<div>\[\mathbf{A}= \begin{bmatrix} 1.0 & 0 & 0 & 2.0 & 0 \\ 3.0 & 4.0 & 0 & 5.0 & 0 \\ 6.0 & 0 & 7.0 & 8.0 & 9.0 \\ 0 & 0 & 10.0 & 11.0 & 0 \\ 0 & 0 & 0 & 0 & 12.0 \end{bmatrix}.\]</div>

### Coordinate Format (COO)

**COO** (Coordinate Format) stores arrays of row indices, column indices and the corresponding non-zero data values in any order. This format provides fast methods to construct sparse matrices and convert to different sparse formats. For <span>\\({\bf A}\\)</span> the COO format is:
COO stores arrays of row indices, column indices and the corresponding non-zero data values in any order. This format provides fast methods to construct sparse matrices and convert to different sparse formats. For <span>\\({\bf A}\\)</span> the COO format is:

$$\textrm{data} = \begin{bmatrix} 12.0 & 9.0 & 7.0 & 5.0 & 1.0 & 2.0 & 11.0 & 3.0 & 6.0 & 4.0 & 8.0 & 10.0\end{bmatrix}$$

$$\textrm{row} = \begin{bmatrix} 4 & 2 & 2 & 1 & 0 & 0 & 3 & 1 & 2 & 1 & 2 & 3 \end{bmatrix}, \\ \textrm{col} = \begin{bmatrix} 4 & 4 & 2 & 3 & 0 & 3 & 3 & 0 & 0 & 1 & 3 & 2 \end{bmatrix} $$
$$\textrm{row} = \begin{bmatrix} 4 & 2 & 2 & 1 & 0 & 0 & 3 & 1 & 2 & 1 & 2 & 3 \end{bmatrix}$$

$$\textrm{col} = \begin{bmatrix} 4 & 4 & 2 & 3 & 0 & 3 & 3 & 0 & 0 & 1 & 3 & 2 \end{bmatrix} $$

\\(\textrm{row}\\) and \\(\textrm{col}\\) are arrays of <span>\\(nnz\\)</span> integers.

\\(\textrm{data}\\) is an array of <span>\\(nnz\\)</span> values of the same data type as the original matrix, in this case doubles.

How to interpret: The first entries of \\(\textrm{data}\\), \\(\textrm{row}\\), \\(\textrm{col}\\) are 12.0, 4, 4, respectively, meaning there is a 12.0 at position (4, 4) of the matrix; the second entries are 9.0, 2, 4, so there is a 9.0 at (2, 4).

**CSR** (Compressed Sparse Row) encodes row offsets, column indices and the corresponding non-zero data values. This format provides fast arithmetic operations between sparse matrices, and fast matrix-vector products. The row offsets are defined by the following recursive relationship (starting with \\(\textrm{rowptr}[0] = 0\\)):
A COO matrix stores \\(3 \times nnz\\) elements. The entries can be sorted or reordered freely, because the same index into the row, col, and data arrays always describes a single matrix element.
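
The interpretation rule above can be sketched by hand, without any sparse library. The helper name `coo_to_dense` is hypothetical, NumPy is assumed only as a container for the dense result, and summing repeated coordinates is a choice made here (this example has no duplicates).

```python
import numpy as np

data = [12.0, 9.0, 7.0, 5.0, 1.0, 2.0, 11.0, 3.0, 6.0, 4.0, 8.0, 10.0]
row = [4, 2, 2, 1, 0, 0, 3, 1, 2, 1, 2, 3]
col = [4, 4, 2, 3, 0, 3, 3, 0, 0, 1, 3, 2]

def coo_to_dense(data, row, col, shape):
    """Scatter each (value, i, j) triple into a dense array of zeros."""
    A = np.zeros(shape)
    for value, i, j in zip(data, row, col):
        A[i, j] += value          # repeated coordinates would be summed
    return A

A = coo_to_dense(data, row, col, (5, 5))

# Storage count: three arrays of length nnz.
nnz = len(data)
stored = len(data) + len(row) + len(col)
```

Shuffling the three arrays with the same permutation gives the same dense matrix, which is why COO places no requirement on the order of its entries.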

Converting this matrix into COO format in Python can be done using the `scipy.sparse` library.

```python
import scipy.sparse as sparse

A = [[1., 0.,  0.,  2.,  0.],
     [3., 4.,  0.,  5.,  0.],
     [6., 0.,  7.,  8.,  9.],
     [0., 0., 10., 11.,  0.],
     [0., 0.,  0.,  0., 12.]]

COO = sparse.coo_matrix(A)
data = COO.data
row = COO.row
col = COO.col
```

You can also build a COO matrix from its arrays and convert it to a dense or CSR matrix in Python using the `scipy.sparse` library.

```python
import scipy.sparse as sparse

data = [12.0, 9.0, 7.0, 5.0, 1.0, 2.0, 11.0, 3.0, 6.0, 4.0, 8.0, 10.0]
row = [4, 2, 2, 1, 0, 0, 3, 1, 2, 1, 2, 3]
col = [4, 4, 2, 3, 0, 3, 3, 0, 0, 1, 3, 2]

COO = sparse.coo_matrix((data, (row, col)))

A = COO.todense()
CSR = COO.tocsr()
```

### Compressed Sparse Row (CSR)

CSR, also commonly known as the Yale format, encodes row offsets, column indices and the corresponding non-zero data values. The row offsets are defined by the following recursive relationship (starting with \\(\textrm{rowptr}[0] = 0\\)):

<div>\[ \textrm{rowptr}[j] = \textrm{rowptr}[j-1] + \mathrm{nnz}(\textrm{row}_{j-1}), \\ \]</div>

Expand All @@ -42,8 +124,48 @@ $$\textrm{col} = \begin{bmatrix} 0 & 3 & 0 & 1 & 3 & 0 & 2 & 3 & 4 & 2 & 3 & 4\e

$$\textrm{rowptr} = \begin{bmatrix} 0 & 2 & 5 & 9 & 11 & 12 \end{bmatrix}$$
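
The recurrence for the row offsets amounts to a running sum of the per-row non-zero counts. A minimal sketch for the example matrix, where the list name `nnz_per_row` is introduced only for illustration:

```python
# Number of non-zeros in each row of the 5 x 5 example matrix.
nnz_per_row = [2, 3, 4, 2, 1]

rowptr = [0]                              # rowptr[0] = 0
for count in nnz_per_row:
    # rowptr[j] = rowptr[j-1] + nnz(row_{j-1})
    rowptr.append(rowptr[-1] + count)
```

The last entry of `rowptr` always equals the total number of non-zeros, which is why the array has one more entry than the matrix has rows.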


\\(\textrm{rowptr}\\) contains the row offsets (array of <span>\\(m + 1\\)</span> integers).

\\(\textrm{col}\\) contains column indices (array of <span>\\(nnz\\)</span> integers).

\\(\textrm{data}\\) contains the non-zero elements (array of <span>\\(nnz\\)</span> values of the matrix's data type, in this case doubles).

How to interpret: The first two entries of \\(\textrm{rowptr}\\) give us the elements in the first row: the interval [0, 2) of \\(\textrm{data}\\) and \\(\textrm{col}\\) corresponds to two (data, column) pairs, (1.0, 0) and (2.0, 3), so the first row has 1.0 at column 0 and 2.0 at column 3. The second and third entries of \\(\textrm{rowptr}\\) tell us that the interval [2, 5) of \\(\textrm{data}\\) and \\(\textrm{col}\\) corresponds to the second row. The three pairs (3.0, 0), (4.0, 1), (5.0, 3) mean that the second row has a 3.0 at column 0, a 4.0 at column 1, and a 5.0 at column 3.
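
The same reading of the three arrays can be written as a short sketch. The helper `csr_row` is a hypothetical name used only here; it slices `data` and `col` with the half-open interval taken from two consecutive entries of `rowptr`.

```python
data = [1., 2., 3., 4., 5., 6., 7., 8., 9., 10., 11., 12.]
col = [0, 3, 0, 1, 3, 0, 2, 3, 4, 2, 3, 4]
rowptr = [0, 2, 5, 9, 11, 12]

def csr_row(i):
    """Return the (value, column) pairs stored for row i."""
    start, end = rowptr[i], rowptr[i + 1]     # half-open interval [start, end)
    return list(zip(data[start:end], col[start:end]))

first_row = csr_row(0)
second_row = csr_row(1)
```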

A CSR matrix stores \\(2 \times nnz + m + 1\\) elements. It also provides fast arithmetic operations between sparse matrices and fast matrix-vector products.
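
To see why the matrix-vector product is cheap in this format, consider the following sketch. It is an illustrative helper with the hypothetical name `csr_matvec_sketch`, operating directly on the three arrays rather than on a `scipy.sparse` object. Each stored non-zero is multiplied exactly once, so the work is proportional to \\(nnz\\) rather than to the full size of the matrix.

```python
def csr_matvec_sketch(data, col, rowptr, x):
    """Compute y = A @ x for a matrix A given by its CSR arrays."""
    nrow = len(rowptr) - 1
    y = [0.0] * nrow
    for i in range(nrow):
        # Entries of row i occupy positions rowptr[i] .. rowptr[i+1]-1.
        for k in range(rowptr[i], rowptr[i + 1]):
            y[i] += data[k] * x[col[k]]
    return y

data = [1., 2., 3., 4., 5., 6., 7., 8., 9., 10., 11., 12.]
col = [0, 3, 0, 1, 3, 0, 2, 3, 4, 2, 3, 4]
rowptr = [0, 2, 5, 9, 11, 12]

y = csr_matvec_sketch(data, col, rowptr, [1.0, 2.0, 3.0, 4.0, 5.0])
```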

Converting this matrix into CSR format in Python can be done using the `scipy.sparse` library.

```python
import scipy.sparse as sparse

A = [[1., 0.,  0.,  2.,  0.],
     [3., 4.,  0.,  5.,  0.],
     [6., 0.,  7.,  8.,  9.],
     [0., 0., 10., 11.,  0.],
     [0., 0.,  0.,  0., 12.]]

CSR = sparse.csr_matrix(A)
data = CSR.data
col = CSR.indices
rowptr = CSR.indptr
```

You can also build a CSR matrix from its arrays and convert it to a dense or COO matrix in Python using the `scipy.sparse` library.

```python
import scipy.sparse as sparse

data = [ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10., 11., 12.]
col = [ 0, 3, 0, 1, 3, 0, 2, 3, 4, 2, 3, 4]
rowptr = [ 0, 2, 5, 9, 11, 12]

CSR = sparse.csr_matrix((data, col, rowptr))

A = CSR.todense()
COO = CSR.tocoo()
```


## CSR Matrix Vector Product Algorithm

The following code snippet performs CSR matrix vector product for square matrices:
Expand All @@ -60,8 +182,14 @@ def csr_mat_vec(A, x):

## Review Questions

- See this [review link](/cs357/fa2020/reviews/rev-11-sparse.html)
## ChangeLog
1. What does it mean for a matrix to be sparse?

2. What factors might you consider when deciding how to store a sparse matrix? (Why would you store a matrix in one format over another?)

3. Given a sparse matrix, put the matrix in CSR format.

4. Given a sparse matrix, put the matrix in COO format.

5. For a given matrix, how many bytes total are required to store it in CSR format?

* 2022-03-06 Victor Zhao: Added instructions on how to interpret COO and CSR
* 2020-03-01 Peter Sentz: extracted material from previous reference pages
6. For a given matrix, how many bytes total are required to store it in COO format?