
Commit

rearrange readings/miscs; add analysis part of math; fixed some typos
ZhouTimeMachine committed Nov 1, 2024
1 parent 15c1d71 commit bdc546d
Showing 19 changed files with 332 additions and 14 deletions.
3 changes: 3 additions & 0 deletions docs/courses/info-theory.md
@@ -0,0 +1,3 @@
# Information Theory

!!! warning "该页面还在施工中"
2 changes: 1 addition & 1 deletion docs/courses/probability/prob_lim.md
@@ -261,7 +261,7 @@ $$

which yields convergence in distribution.

!!! info "依概率收敛 $\nRightarrow$ 依分布收敛"
!!! info "依概率收敛 $\Rightarrow$ 依分布收敛"

??? general "Counterexample"

24 changes: 22 additions & 2 deletions docs/math/DEs/Intro2SDE/Brownian_noise.md
@@ -846,10 +846,30 @@

## Sample Path Properties

!!! warning "本节还在施工中"
布朗运动的采样路径具有一定的 Hölder 连续性,为了详细阐述与证明,需要首先阐明 Hölder 连续性的定义。

> See [Hölder condition - Wikipedia](https://en.wikipedia.org/wiki/H%C3%B6lder_condition); the various notions of continuity are also covered in detail in [Continuities](../../../analysis/continuities) in these notes.

!!! info "Hölder Continuity"
    Consider a function $f:[0, T]\to \mathbb{R}$ and $0 < \gamma \leqslant 1$:

    **(1)** If there exists a constant $K$ such that the following holds, then $f$ is called uniformly $\gamma$-Hölder continuous:

    $$
    |f(t) - f(s)| \leqslant K|t-s|^\gamma, \quad \forall t, s\in [0, T]
    $$

    **(2)** If there exists a constant $K$ such that the following holds, then $f$ is called $\gamma$-Hölder continuous at the point $s$:

    $$
    |f(t) - f(s)| \leqslant K|t-s|^\gamma, \quad \forall t\in [0, T]
    $$

For the Hölder continuity of the sample paths of a stochastic process, the Kolmogorov continuity theorem is the standard tool:

!!! abstract "Kolmogorov continuity theorem"

    Let $X = (X_t)_{t\in [0, T]}$ be a stochastic process. If there exist constants $\alpha, \beta, C > 0$ such that

    $$
    \mathbb{E}\left[|X_t - X_s|^{\alpha}\right] \leqslant C|t-s|^{1+\beta}, \quad \forall t, s\in [0, T],
    $$

    then $X$ admits a modification whose sample paths are almost surely $\gamma$-Hölder continuous for every $0 < \gamma < \beta/\alpha$.

> See [Kolmogorov continuity theorem - Wikipedia](https://en.wikipedia.org/wiki/Kolmogorov_continuity_theorem) for details.

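As an illustration (a standard computation, sketched here): for standard Brownian motion, $B_t - B_s \sim \mathcal{N}(0, |t-s|)$, so the even Gaussian moments give

$$
\mathbb{E}\left[|B_t - B_s|^{2n}\right] = \frac{(2n)!}{2^n n!}\,|t-s|^{n}, \quad \forall n \in \mathbb{N},
$$

which satisfies the Kolmogorov criterion with $\alpha = 2n$ and $1 + \beta = n$. Hence the sample paths are almost surely $\gamma$-Hölder continuous for every $\gamma < (n-1)/(2n)$, and letting $n\to\infty$, for every $\gamma < 1/2$.
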
## Markov Property
File renamed without changes.
5 changes: 5 additions & 0 deletions docs/math/analysis/funcRvar/LebesgueMeasure.md
@@ -0,0 +1,5 @@
<link rel="stylesheet" href="../../../../css/counter.css" />

# Lebesgue Measure

!!! warning "本页面还在施工中"
41 changes: 41 additions & 0 deletions docs/math/analysis/funcRvar/index.md
@@ -0,0 +1,41 @@
# Functions of a Real Variable

> *real analysis* (实分析) is too difficult for me, so I just try to learn *functions of a real variable* (实变函数) first.

!!! warning "This page is still under construction. The current plan is to study the course *Functions of a Real Variable* taught by 贾厚玉 via [智云课堂](https://classroom.zju.edu.cn)"

## Contents

- [Basics of Set Theory](sets.md)
- [Lebesgue Measure](LebesgueMeasure.md)
- ......(to be continued)

## Objective: Lebesgue Integral

Sequences of Riemann-integrable functions have a problem: they are not closed under taking limits, i.e., the limit of a sequence of integrable functions may fail to be integrable.

!!! example "极限运算不封闭"
记 $[0, 1]$ 上的可积函数类为 $R[0, 1]$,则考虑 $[0, 1]$ 上的“二进有理数”数列:

$$
0, 1, \frac{1}{2}, \frac{1}{4}, \frac{3}{4}, \frac{1}{8}, \frac{3}{8}, \frac{5}{8}, \frac{7}{8}, \cdots
$$

即该数列由 $0, 1$ 以及形式为 $1/2^n, 3/2^n, \cdots, (2^n-1)/2^n$ 的数构成,记该数列为 $\{r_n\}$。则定义函数列 $f_k(x)$ 如下:

$$
f_k(x) = \begin{cases}
1, & x = r_n, k\geqslant n \in \mathbb{N} \\
0, & \text{otherwise}
\end{cases}
$$

对于其极限函数 $f(x)$,对于任意细的划分,每个子区间一定有取值为 1 的离散点和取值为 0 的离散点,因此类似于 Dirichlet 函数,有 $f\notin R[0, 1]$。
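    In Darboux terms, a quick check of this claim: for any partition $P$ of $[0, 1]$, every subinterval contains dyadic rationals (where $f = 1$) as well as points that are not (where $f = 0$), so

    $$
    U(f, P) = \sum_k 1 \cdot \Delta x_k = 1, \qquad L(f, P) = \sum_k 0 \cdot \Delta x_k = 0,
    $$

    and the upper and lower Darboux integrals can never agree.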

The class of Lebesgue-integrable functions is introduced because it is closed under taking limits, i.e. (under suitable conditions, such as dominated convergence) integration and limits can be interchanged:

$$
\lim_{n\to \infty} \int_a^b f_n(x)\mathrm{d}x = \int_a^b \lim_{n\to \infty} f_n(x)\mathrm{d}x
$$

Moreover, Lebesgue integrability extends Riemann integrability: a function that is Riemann integrable remains Lebesgue integrable. In this sense the Lebesgue integral completes the Riemann integral, and it is precisely the core content of the theory of functions of a real variable.
38 changes: 38 additions & 0 deletions docs/math/analysis/funcRvar/sets.md
@@ -0,0 +1,38 @@
<link rel="stylesheet" href="../../../../css/counter.css" />

# Basics of Set Theory

## Common Symbols

- Power set: $\mathcal{P}(X) = \{A: A\subseteq X\}$
- A family of subsets of $X$ selected by an index set $\Lambda$: $\mathcal{A} = \{A_{\alpha}\subseteq X: \alpha\in\Lambda\}$
- Index sets conveniently express unions, intersections, and similar operations over many sets:

$$
\begin{gathered}
\bigcup_{\alpha\in\Lambda}A_{\alpha} = \{x: \exists \alpha\in \Lambda, x\in A_{\alpha}\}, \\
\bigcap_{\alpha\in\Lambda}A_{\alpha} = \{x: \forall \alpha\in \Lambda, x\in A_{\alpha}\}. \\
\end{gathered}
$$

- Set difference: $A \backslash B = \{x: x\in A, x\notin B\}$; complement: $A^c = \Omega \backslash A$ (with $\Omega$ denoting the universal set)

??? example "基本集合论练习"
- $\{x: \sup _n f_n(x) > t\} = \bigcup_{n=1}^{\infty}\{ x: f_n(x) > t \}$
- $\{x: \sup _n f_n(x) \leqslant t\} = \bigcap_{n=1}^{\infty}\{ x: f_n(x) \leqslant t \}$

第二行由第一行应用 De Morgan 律容易得到。令 $A=\{x: \sup _n f_n(x) > t\}$, $A_n=\{ x: f_n(x) > t \}$

- $\forall x\in A$,如果不存在 $n_0$ 使得 $f_{n_0}(x) = \sup_n f_n(x)$,则有 $f_n(x) < \sup_n f_n(x)$,两边同取 $\sup_n$ 后发现矛盾,因此有 $f_{n_0}(x) > t$, $x\in A_{n_0}$,也就有 $x\in \bigcup A_n\Rightarrow A\subseteq \bigcup A_n$
- $\forall x\in \bigcup A_n$,存在 $n_0$ 使得 $x\in A_{n_0}$,也就有 $\sup_n f_n(x) \geqslant f_{n_0}(x) > t$,因此 $x\in A\Rightarrow \bigcup A_n\subseteq A$

## Limits of Set Sequences

As with limits of numerical sequences, before defining the limit of a general sequence of sets, first consider the special case of monotone set sequences.

- Increasing: $A_k\subseteq A_{k+1}$, $k\in \mathbb{N}$; the sequence stays inside the universal set $\Omega$, and its limit always exists, namely $\bigcup_{k=1}^{\infty} A_k\subseteq \Omega$
- Decreasing: $A_k\supseteq A_{k+1}$, $k\in \mathbb{N}$; its limit always exists, namely $\bigcap_{k=1}^{\infty} A_k$
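
For a general (not necessarily monotone) set sequence, the monotone case suggests the standard definitions, sketched here as where this section is headed: $\bigcup_{k=n}^{\infty} A_k$ decreases in $n$ while $\bigcap_{k=n}^{\infty} A_k$ increases, so both of the following limits exist and serve as the upper and lower limits of $\{A_k\}$:

$$
\limsup_{k\to\infty} A_k = \bigcap_{n=1}^{\infty}\bigcup_{k=n}^{\infty} A_k, \qquad \liminf_{k\to\infty} A_k = \bigcup_{n=1}^{\infty}\bigcap_{k=n}^{\infty} A_k
$$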



!!! warning "本页面还在施工中"
7 changes: 7 additions & 0 deletions docs/math/analysis/index.md
@@ -0,0 +1,7 @@
# Analysis

- [Functions of a Real Variable](funcRvar/index.md)
- [Basics of Set Theory](funcRvar/sets.md)
- [Lebesgue Measure](funcRvar/LebesgueMeasure.md)
- ......
- [Continuities](continuities.md)
4 changes: 2 additions & 2 deletions docs/math/index.md
@@ -9,5 +9,5 @@
- [Partial Differential Equations](DEs/PDE/index.md)
- [Stochastic Differential Equations](DEs/Intro2SDE/index.md)
- [Probability & Statistics](probability/index.md)
- Misc
- [Continuities](misc/continuities.md)
- [Analysis](analysis/index.md)
- [Functions of a Real Variable](analysis/funcRvar/index.md)
4 changes: 2 additions & 2 deletions docs/readings/ICCV2023/DDPM_latent.md
@@ -83,7 +83,7 @@ $$
- PSNR: Quality
- SSIM: Similarity

Details in [Metrics](../metrics.md).
Details in [Metrics](../miscs/metrics.md).

### Zero-shot Image Editing

@@ -109,7 +109,7 @@ $$
- PSNR: Quality
- SSIM: Similarity between real image and generated image

Details in [Metrics](../metrics.md).
Details in [Metrics](../miscs/metrics.md).

<div style="text-align:center;">
<img src="../../imgs/ICCV2023/DDPM_latent_5.png" alt="DDPM_latent_5" style="zoom:80%;" />
2 changes: 2 additions & 0 deletions docs/readings/diffusion/SGM.md
@@ -1,3 +1,5 @@
<link rel="stylesheet" href="../../../css/counter.css" />

# Score-based Generative Models

!!! info "Reference"
Binary file added docs/readings/imgs/miscs/bn_vs_ln.png
Binary file added docs/readings/imgs/miscs/normalizations.png
3 changes: 1 addition & 2 deletions docs/readings/index.md
@@ -6,5 +6,4 @@

- [Diffusion Models](diffusion/index.md)
- [ICLR2024](ICLR2024/index.md)
- [ICCV2023](ICCV2023/index.md)
- [Metrics](metrics.md)
- [ICCV2023](ICCV2023/index.md)
File renamed without changes.
130 changes: 130 additions & 0 deletions docs/readings/miscs/einsum.md
@@ -0,0 +1,130 @@
<link rel="stylesheet" href="../../../css/counter.css" />

# Einsum

!!! info "Reference: [Einsum Is All You Need: NumPy, PyTorch and TensorFlow](https://www.youtube.com/watch?v=pkVwUVEHmfI)"

## Example Uses in PyTorch

```python hl_lines="4 10 14 18"
>>> x = torch.tensor([[1, 2, 3],
[4, 5, 6]])

# permutation
>>> torch.einsum("ij->ji", x)
tensor([[1, 4],
[2, 5],
[3, 6]])

# summation
>>> torch.einsum("ij->", x)
tensor(21)

# column sum
>>> torch.einsum("ij->j", x)
tensor([5, 7, 9])

# row sum
>>> torch.einsum("ij->i", x)
tensor([ 6, 15])
```

```python
>>> x = torch.tensor([[1, 2, 3],
[4, 5, 6]])
>>> v = torch.tensor([[1, 0, -1]])

# matrix-vector multiplication: xv^T
>>> torch.einsum("ij,kj->ik", x, v)
tensor([[-2],
[-2]])

# matrix-matrix multiplication: xx^T
>>> torch.einsum("ij,kj->ik", x, x) # 2*2: (2*3) @ (3*2)
tensor([[14, 32],
[32, 77]])

# dot product of the first row with itself
>>> torch.einsum("i,i->", x[0], x[0])
tensor(14)
```

```python
>>> x = torch.tensor([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])

# dot product of the matrix with itself (sum of elementwise products)
>>> torch.einsum("ij,ij->", x, x)
tensor(285)

# Hadamard product (element-wise multiplication)
>>> torch.einsum("ij,ij->ij", x, x)
tensor([[ 1, 4, 9],
[16, 25, 36],
[49, 64, 81]])
```

```python
# outer product
>>> a = torch.tensor([1, 0, -1])
>>> b = torch.tensor([1, 2, 3, 4, 5])
>>> torch.einsum("i,j->ij", a, b)
tensor([[ 1, 2, 3, 4, 5],
[ 0, 0, 0, 0, 0],
[-1, -2, -3, -4, -5]])

# batch matrix multiplication
>>> generator = torch.manual_seed(12)
>>> a = torch.rand((3, 2, 5), generator=generator)
>>> a
tensor([[[0.4657, 0.2328, 0.4527, 0.5871, 0.4086],
[0.1272, 0.6373, 0.2421, 0.7312, 0.7224]],

[[0.1992, 0.6948, 0.5830, 0.6318, 0.5559],
[0.1262, 0.9790, 0.8443, 0.1256, 0.4456]],

[[0.6601, 0.0554, 0.1573, 0.8137, 0.7216],
[0.2717, 0.3003, 0.6099, 0.5784, 0.6083]]])
>>> b = torch.rand((3, 5, 3), generator=generator)
>>> b
tensor([[[0.4339, 0.8813, 0.3216],
[0.2604, 0.2566, 0.1872],
[0.6423, 0.1786, 0.1435],
[0.7490, 0.7275, 0.1641],
[0.3273, 0.1239, 0.6138]],

[[0.4535, 0.7659, 0.1800],
[0.3338, 0.9526, 0.8919],
[0.9859, 0.6348, 0.8811],
[0.9391, 0.1173, 0.1342],
[0.9405, 0.6803, 0.5556]],

[[0.8713, 0.0782, 0.8578],
[0.7540, 0.6698, 0.5817],
[0.3829, 0.7163, 0.8930],
[0.5597, 0.2803, 0.2476],
[0.4738, 0.1306, 0.2024]]])
>>> torch.einsum("ijk,ikl->ijl", a, b)
tensor([[[1.1270, 1.0287, 0.6055],
[1.1608, 0.9403, 0.7584]],

[[2.0132, 1.6369, 1.5629],
[1.7535, 1.8831, 1.9043]],

[[1.4744, 0.5236, 1.0864],
[1.3086, 0.9008, 1.2188]]])
```


```python
>>> x = torch.tensor([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])

# matrix diagonal
>>> torch.einsum("ii->i", x)
tensor([1, 5, 9])

# matrix trace
>>> torch.einsum("ii->", x)
tensor(15)
```
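
As a mental model (a minimal sketch, not part of the referenced video): indices shared between the inputs but absent from the output are summed over, while the remaining indices enumerate the output. The helper below, whose name is ours purely for illustration, spells out `"ij,kj->ik"` as explicit loops:

```python
import torch

def einsum_ij_kj_ik(x, y):
    # Explicit-loop version of torch.einsum("ij,kj->ik", x, y):
    # j appears in both inputs but not in the output, so it is summed over;
    # i and k survive and index the output.
    I, J = x.shape
    K, _ = y.shape
    out = torch.zeros(I, K, dtype=x.dtype)
    for i in range(I):
        for k in range(K):
            for j in range(J):
                out[i, k] += x[i, j] * y[k, j]
    return out

x = torch.tensor([[1, 2, 3],
                  [4, 5, 6]])
assert torch.equal(einsum_ij_kj_ik(x, x), torch.einsum("ij,kj->ik", x, x))
```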
File renamed without changes.
64 changes: 64 additions & 0 deletions docs/readings/miscs/normalization.md
@@ -0,0 +1,64 @@
<link rel="stylesheet" href="../../../css/counter.css" />

# Normalization

<div style="text-align:center;">
<img src="../../imgs/miscs/normalizations.png" alt="normalizations" style="50%;" />
</div>

## Batch Normalization

For each mini-batch $(z^{(1)}, \cdots, z^{(b)})$, where $b$ is the batch size, compute the mean $\mu$ and variance $\sigma^2$ of the samples' feature vectors:

$$
\mu = \frac{1}{b}\sum_{i=1}^{b}z^{(i)}, \quad \sigma^2 = \frac{1}{b}\sum_{i=1}^{b}(z^{(i)}-\mu)^2
$$

Then (with a small $\varepsilon>0$ introduced for numerical stability):

$$
z^{(i)}_{\text{norm}}=\frac{z^{(i)}-\mu}{\sqrt{\sigma^2+\varepsilon}}
$$

To obtain the $\tilde{z}^{(i)}$ that replaces $z^{(i)}$, the standardized $z^{(i)}_{\text{norm}}$ is further passed through a linear (affine) transformation:

$$
\tilde{z}^{(i)} = \gamma z^{(i)}_{\text{norm}} + \beta
$$

Here the affine parameters $\gamma$ and $\beta$ act like the network weights $w$: they take part in forward propagation and in the parameter updates of backpropagation. The point of the affine transformation is to let each layer's raw output follow a richer distribution rather than always being constrained by standardization.

> Interestingly, each layer's bias term is redundant with $\beta$, so the bias can simply be dropped.

Thus, with Batch Normalization in place, a layer's raw output $z^{(i)}$ is simply replaced by $\tilde{z}^{(i)}$; the activation function is then applied to obtain $a^{(i)}$, which is passed to the next layer.

> Whether to apply BN before or after the activation function is a real question; Andrew Ng suggests that BN is usually applied before the activation.

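A minimal sketch of the forward pass described above (training-mode statistics only; running averages for inference are omitted, and the function name is ours for illustration):

```python
import torch

def batch_norm_forward(z, gamma, beta, eps=1e-5):
    # z: (b, d) mini-batch of layer outputs; gamma, beta: (d,) learnable parameters
    mu = z.mean(dim=0)                      # per-feature mean over the batch
    var = z.var(dim=0, unbiased=False)      # per-feature (biased) variance
    z_norm = (z - mu) / torch.sqrt(var + eps)
    return gamma * z_norm + beta            # affine transform: z_tilde

z = torch.randn(32, 16)                     # batch of 32 samples, 16 features
z_tilde = batch_norm_forward(z, torch.ones(16), torch.zeros(16))
```
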
## Layer Normalization

Layer Normalization differs from Batch Normalization only in how $\mu$ and $\sigma^2$ are computed: Batch Normalization computes the mean $\mu$ and variance $\sigma^2$ along the mini-batch dimension, whereas Layer Normalization computes $\mu$ and $\sigma^2$ within each individual sample.

!!! warning "该页面还在建设中"

For a single sample $z^{(i)}\in\mathbb{R}^d$ (writing $z^{(i)}_j$ for its $j$-th feature):

$$
\mu^{(i)} = \frac{1}{d}\sum_{j=1}^{d}z^{(i)}_j, \quad (\sigma^{(i)})^2 = \frac{1}{d}\sum_{j=1}^{d}(z^{(i)}_j-\mu^{(i)})^2
$$

The difference in the dimensions along which the statistics are computed can be seen in the figure below:

<div style="text-align:center;">
<img src="../../imgs/miscs/bn_vs_ln.png" alt="bn_vs_ln" style="50%;" />
</div>
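
A minimal sketch of the difference in computation axes (assuming inputs of shape `(b, d)`; these few lines are illustrative, not a full LayerNorm with its affine parameters):

```python
import torch

z = torch.randn(4, 8)                       # (batch size b, feature dim d)

# Batch Normalization: statistics along the batch axis, one (mu, var) per feature
mu_bn  = z.mean(dim=0)                      # shape (d,)
var_bn = z.var(dim=0, unbiased=False)

# Layer Normalization: statistics within each sample, one (mu, var) per sample
mu_ln  = z.mean(dim=1, keepdim=True)        # shape (b, 1)
var_ln = z.var(dim=1, keepdim=True, unbiased=False)
```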

Examples of Batch Normalization and Layer Normalization:

*(figure: examples of Batch Normalization (left) and Layer Normalization (right))*

## Instance Normalization

## Group Normalization