Commit
rearrange readings/miscs; add analysis part of math; fixed some typos
1 parent 15c1d71 · commit bdc546d
Showing 19 changed files with 332 additions and 14 deletions.
@@ -0,0 +1,3 @@
# Information Theory

!!! warning "This page is still under construction"
File renamed without changes.
@@ -0,0 +1,5 @@
<link rel="stylesheet" href="../../../../css/counter.css" />

# Lebesgue Measure

!!! warning "This page is still under construction"
@@ -0,0 +1,41 @@
# Functions of a Real Variable

> *Real analysis* (实分析) is too difficult for me, so I will first try to learn *functions of a real variable* (实变函数) instead.

!!! warning "This page is still under construction. The current plan is to follow Prof. Jia Houyu's course *Functions of a Real Variable* on [智云课堂](https://classroom.zju.edu.cn)."

## Contents

- [Basics of Set Theory](sets.md)
- [Lebesgue Measure](LebesgueMeasure.md)
- ...... (to be continued)

## Objective: Lebesgue Integral

Sequences of Riemann-integrable functions have a problem: they are not closed under the limit operation. That is, the limit of a sequence of integrable functions may fail to be integrable.

!!! example "Non-closure under limits"
    Let $R[0, 1]$ denote the class of Riemann-integrable functions on $[0, 1]$, and consider the sequence of "dyadic rationals" in $[0, 1]$:

    $$
    0, 1, \frac{1}{2}, \frac{1}{4}, \frac{3}{4}, \frac{1}{8}, \frac{3}{8}, \frac{5}{8}, \frac{7}{8}, \cdots
    $$

    That is, the sequence consists of $0$, $1$, and all numbers of the form $1/2^n, 3/2^n, \cdots, (2^n-1)/2^n$; denote it by $\{r_n\}$. Define the function sequence $f_k(x)$ by

    $$
    f_k(x) = \begin{cases}
    1, & x = r_n, \ n \leqslant k, \ n \in \mathbb{N} \\
    0, & \text{otherwise}
    \end{cases}
    $$

    For the limit function $f(x)$: under any partition, however fine, every subinterval contains isolated points where $f$ takes the value 1 and points where it takes the value 0. Therefore, just like the Dirichlet function, $f\notin R[0, 1]$.
The Lebesgue-integrable function class is introduced because it is closed under the limit operation (under suitable convergence conditions), i.e. integration and limits can be interchanged:

$$
\lim_{n\to \infty} \int_a^b f_n(x)\,\mathrm{d}x = \int_a^b \lim_{n\to \infty} f_n(x)\,\mathrm{d}x
$$

Moreover, Lebesgue integrability extends Riemann integrability: every Riemann-integrable function is still Lebesgue integrable. In this sense, the Lebesgue integral is a completion of the Riemann integral, and it is the core topic of functions of a real variable.
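The construction in the example above can be sketched numerically. A minimal illustration (the helper names `dyadic_rationals` and `f` are hypothetical, chosen just for this sketch), enumerating the sequence $\{r_n\}$ and evaluating $f_k$:

```python
from fractions import Fraction

def dyadic_rationals(max_n):
    """First terms of {r_n}: 0, 1, then k/2^n for odd k < 2^n, n = 1..max_n."""
    seq = [Fraction(0), Fraction(1)]
    for n in range(1, max_n + 1):
        seq += [Fraction(k, 2 ** n) for k in range(1, 2 ** n, 2)]
    return seq

r = dyadic_rationals(3)  # [0, 1, 1/2, 1/4, 3/4, 1/8, 3/8, 5/8, 7/8]

def f(k, x):
    """f_k(x): 1 if x is among the first k terms of {r_n}, else 0."""
    return 1 if x in r[:k] else 0

# Each dyadic rational is eventually mapped to 1 and stays there, so the
# pointwise limit f equals 1 on the dyadic rationals and 0 everywhere else.
```

Since the dyadic rationals are dense in $[0, 1]$, every subinterval of any partition contains points of both kinds, which is exactly why $f$ fails to be Riemann integrable.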
@@ -0,0 +1,38 @@
<link rel="stylesheet" href="../../../../css/counter.css" />

# Basics of Set Theory

## Common Symbols

- Power set: $\mathcal{P}(X) = \{A: A\subseteq X\}$
- An index set $\Lambda$ selects subsets of $X$ to form a family of sets: $\mathcal{A} = \{A_{\alpha}\subseteq X: \alpha\in\Lambda\}$
- Index sets give a convenient notation for unions, intersections, etc. of many sets:

$$
\begin{gathered}
\bigcup_{\alpha\in\Lambda}A_{\alpha} = \{x: \exists \alpha\in \Lambda, x\in A_{\alpha}\}, \\
\bigcap_{\alpha\in\Lambda}A_{\alpha} = \{x: \forall \alpha\in \Lambda, x\in A_{\alpha}\}. \\
\end{gathered}
$$

- Set difference: $A \backslash B = \{x: x\in A, x\notin B\}$; complement: $A^c = \Omega \backslash A$ (with $\Omega$ denoting the universal set)

??? example "Basic set theory exercises"
    - $\{x: \sup _n f_n(x) > t\} = \bigcup_{n=1}^{\infty}\{ x: f_n(x) > t \}$
    - $\{x: \sup _n f_n(x) \leqslant t\} = \bigcap_{n=1}^{\infty}\{ x: f_n(x) \leqslant t \}$
    The second identity follows from the first by De Morgan's laws. For the first, let $A=\{x: \sup _n f_n(x) > t\}$ and $A_n=\{ x: f_n(x) > t \}$.

    - For every $x\in A$: since $\sup_n f_n(x) > t$, $t$ is not an upper bound of $\{f_n(x)\}$, so there exists $n_0$ with $f_{n_0}(x) > t$, i.e. $x\in A_{n_0}$, hence $x\in \bigcup A_n$. This gives $A\subseteq \bigcup A_n$.
    - For every $x\in \bigcup A_n$: there exists $n_0$ with $x\in A_{n_0}$, so $\sup_n f_n(x) \geqslant f_{n_0}(x) > t$, hence $x\in A$. This gives $\bigcup A_n\subseteq A$.
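The first identity can be sanity-checked on a finite family of functions, with `max` standing in for the supremum (the functions, sample points, and threshold below are arbitrary choices for illustration):

```python
# Finite sanity check of {x : sup_n f_n(x) > t} = U_n {x : f_n(x) > t},
# with max over a finite family standing in for sup.
fns = [lambda x: x, lambda x: 1 - x, lambda x: x * x]
points = [i / 10 for i in range(11)]  # sample points in [0, 1]
t = 0.6

lhs = {x for x in points if max(fn(x) for fn in fns) > t}
rhs = set().union(*[{x for x in points if fn(x) > t} for fn in fns])
assert lhs == rhs
```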

## Limits of Set Sequences

As with limits of number sequences, before defining the limit of a general set sequence we first consider the special case of monotone set sequences.

- Increasing sequence: $A_k\subseteq A_{k+1}$, $k\in \mathbb{N}$, all contained in the universal set $\Omega$; the limit always exists and equals $\bigcup_{k=1}^{\infty} A_k\subseteq \Omega$
- Decreasing sequence: $A_k\supseteq A_{k+1}$, $k\in \mathbb{N}$; the limit always exists and equals $\bigcap_{k=1}^{\infty} A_k$
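A small illustration of the two monotone cases (finite prefixes only, so the union or intersection of the first few terms plays the role of the limit):

```python
# Increasing sequence A_k = {0, ..., k-1}: the union of a prefix is its last set.
A = [set(range(k)) for k in range(1, 6)]
assert set().union(*A) == A[-1]

# Decreasing sequence B_k = {k, ..., 9}: the intersection of a prefix is its last set.
B = [set(range(k, 10)) for k in range(5)]
assert set.intersection(*B) == B[-1]
```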
!!! warning "This page is still under construction"
@@ -0,0 +1,7 @@
# Analysis

- [Functions of a Real Variable](funcRvar/index.md)
    - [Basics of Set Theory](funcRvar/sets.md)
    - [Lebesgue Measure](funcRvar/LebesgueMeasure.md)
    - ......
- [Continuities](continuities.md)
@@ -1,3 +1,5 @@
<link rel="stylesheet" href="../../../css/counter.css" />

# Score-based Generative Models

!!! info "Reference"
File renamed without changes.
@@ -0,0 +1,130 @@
<link rel="stylesheet" href="../../../css/counter.css" />

# Einsum

!!! info "Reference: [Einsum Is All You Need: NumPy, PyTorch and TensorFlow](https://www.youtube.com/watch?v=pkVwUVEHmfI)"

## Example Uses in PyTorch
```python hl_lines="4 10 14 18"
>>> x = torch.tensor([[1, 2, 3],
                      [4, 5, 6]])

# permutation
>>> torch.einsum("ij->ji", x)
tensor([[1, 4],
        [2, 5],
        [3, 6]])

# summation
>>> torch.einsum("ij->", x)
tensor(21)

# column sum
>>> torch.einsum("ij->j", x)
tensor([5, 7, 9])

# row sum
>>> torch.einsum("ij->i", x)
tensor([ 6, 15])
```

```python
>>> x = torch.tensor([[1, 2, 3],
                      [4, 5, 6]])
>>> v = torch.tensor([[1, 0, -1]])

# matrix-vector multiplication: xv^T
>>> torch.einsum("ij,kj->ik", x, v)
tensor([[-2],
        [-2]])

# matrix-matrix multiplication: xx^T
>>> torch.einsum("ij,kj->ik", x, x)  # 2*2: (2*3) @ (3*2)
tensor([[14, 32],
        [32, 77]])

# dot product of the first row with itself
>>> torch.einsum("i,i->", x[0], x[0])
tensor(14)
```

```python
>>> x = torch.tensor([[1, 2, 3],
                      [4, 5, 6],
                      [7, 8, 9]])

# dot product of the matrix with itself (sum of squared entries)
>>> torch.einsum("ij,ij->", x, x)
tensor(285)

# Hadamard product (element-wise multiplication)
>>> torch.einsum("ij,ij->ij", x, x)
tensor([[ 1,  4,  9],
        [16, 25, 36],
        [49, 64, 81]])
```

```python
# outer product
>>> a = torch.tensor([1, 0, -1])
>>> b = torch.tensor([1, 2, 3, 4, 5])
>>> torch.einsum("i,j->ij", a, b)
tensor([[ 1,  2,  3,  4,  5],
        [ 0,  0,  0,  0,  0],
        [-1, -2, -3, -4, -5]])

# batch matrix multiplication
>>> generator = torch.manual_seed(12)
>>> a = torch.rand((3, 2, 5), generator=generator)
>>> a
tensor([[[0.4657, 0.2328, 0.4527, 0.5871, 0.4086],
         [0.1272, 0.6373, 0.2421, 0.7312, 0.7224]],

        [[0.1992, 0.6948, 0.5830, 0.6318, 0.5559],
         [0.1262, 0.9790, 0.8443, 0.1256, 0.4456]],

        [[0.6601, 0.0554, 0.1573, 0.8137, 0.7216],
         [0.2717, 0.3003, 0.6099, 0.5784, 0.6083]]])
>>> b = torch.rand((3, 5, 3), generator=generator)
>>> b
tensor([[[0.4339, 0.8813, 0.3216],
         [0.2604, 0.2566, 0.1872],
         [0.6423, 0.1786, 0.1435],
         [0.7490, 0.7275, 0.1641],
         [0.3273, 0.1239, 0.6138]],

        [[0.4535, 0.7659, 0.1800],
         [0.3338, 0.9526, 0.8919],
         [0.9859, 0.6348, 0.8811],
         [0.9391, 0.1173, 0.1342],
         [0.9405, 0.6803, 0.5556]],

        [[0.8713, 0.0782, 0.8578],
         [0.7540, 0.6698, 0.5817],
         [0.3829, 0.7163, 0.8930],
         [0.5597, 0.2803, 0.2476],
         [0.4738, 0.1306, 0.2024]]])
>>> torch.einsum("ijk,ikl->ijl", a, b)
tensor([[[1.1270, 1.0287, 0.6055],
         [1.1608, 0.9403, 0.7584]],

        [[2.0132, 1.6369, 1.5629],
         [1.7535, 1.8831, 1.9043]],

        [[1.4744, 0.5236, 1.0864],
         [1.3086, 0.9008, 1.2188]]])
```

```python
>>> x = torch.tensor([[1, 2, 3],
                      [4, 5, 6],
                      [7, 8, 9]])

# matrix diagonal
>>> torch.einsum("ii->i", x)
tensor([1, 5, 9])

# matrix trace
>>> torch.einsum("ii->", x)
tensor(15)
```
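The same einsum strings work unchanged in NumPy. A quick cross-check of a few of the identities above against plain array operators (using `numpy` here so the snippet runs without PyTorch installed):

```python
import numpy as np

x = np.array([[1, 2, 3],
              [4, 5, 6]])
v = np.array([[1, 0, -1]])

assert np.array_equal(np.einsum("ij->ji", x), x.T)                # permutation
assert np.einsum("ij->", x) == 21                                 # summation
assert np.array_equal(np.einsum("ij,kj->ik", x, v), x @ v.T)      # x v^T
assert np.array_equal(np.einsum("ii->i", np.eye(3)), np.ones(3))  # diagonal
assert np.einsum("ii->", np.eye(3)) == 3.0                        # trace
```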
File renamed without changes.
@@ -0,0 +1,64 @@
<link rel="stylesheet" href="../../../css/counter.css" />

# Normalization

<div style="text-align:center;">
    <img src="../../imgs/miscs/normalizations.png" alt="normalizations" style="width:50%;" />
</div>
## Batch Normalization

For each mini-batch $(z^{(1)}, \cdots, z^{(b)})$, where $b$ is the batch size, compute the mean $\mu$ and variance $\sigma^2$ of the samples' feature vectors:

$$
\mu = \frac{1}{b}\sum_{i=1}^{b}z^{(i)}, \quad \sigma^2 = \frac{1}{b}\sum_{i=1}^{b}(z^{(i)}-\mu)^2
$$

Then normalize (a small $\varepsilon>0$ is added for numerical stability):

$$
z^{(i)}_{\text{norm}}=\frac{z^{(i)}-\mu}{\sqrt{\sigma^2+\varepsilon}}
$$

The value $\tilde{z}^{(i)}$ that replaces $z^{(i)}$ is obtained by applying a further linear (affine) transform to the normalized $z^{(i)}_{\text{norm}}$:

$$
\tilde{z}^{(i)} = \gamma z^{(i)}_{\text{norm}} + \beta
$$
The parameters $\gamma$ and $\beta$ of this linear transform act like the network weights $w$: they take part in forward propagation and are updated during backpropagation. The point of the transform is to let each layer's pre-activation distribution stay expressive instead of always being constrained by standardization.

> Interestingly, each layer's bias term duplicates the role of $\beta$, so the bias can be dropped.

With Batch Normalization, then, a layer's pre-activation $z^{(i)}$ is simply replaced by $\tilde{z}^{(i)}$; the activation function is applied to obtain $a^{(i)}$, which is fed to the next layer.

> Whether to apply BN before or after the activation function is debatable; Andrew Ng notes that BN is usually applied before the activation.
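The steps above can be sketched in a few lines. A minimal NumPy version for a mini-batch of feature vectors (scalar `gamma`/`beta` for simplicity, though in practice they are learned per feature):

```python
import numpy as np

def batch_norm(z, gamma=1.0, beta=0.0, eps=1e-5):
    """Batch Normalization over a mini-batch z of shape (b, d): mean and
    variance are taken across the batch (axis 0) for each feature, then
    the learned affine transform gamma * z_norm + beta is applied."""
    mu = z.mean(axis=0)
    var = z.var(axis=0)
    z_norm = (z - mu) / np.sqrt(var + eps)
    return gamma * z_norm + beta

z = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
z_tilde = batch_norm(z)
# with gamma=1, beta=0 each feature (column) now has mean ~0 and variance ~1
```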
## Layer Normalization

Layer Normalization differs from Batch Normalization only in how $\mu$ and $\sigma^2$ are computed: Batch Normalization computes the mean $\mu$ and variance $\sigma^2$ along the mini-batch dimension, whereas Layer Normalization computes $\mu$ and $\sigma^2$ within each individual sample.

!!! warning "This page is still under construction"
$$
\mu^{(i)} = \frac{1}{d}\sum_{j=1}^{d}z^{(i)}_j, \quad \left(\sigma^{(i)}\right)^2 = \frac{1}{d}\sum_{j=1}^{d}\left(z^{(i)}_j-\mu^{(i)}\right)^2
$$

where $d$ is the dimension of the feature vector, so each sample $z^{(i)}$ gets its own $\mu^{(i)}$ and $\left(\sigma^{(i)}\right)^2$.
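Relative to the Batch Normalization sketch, the only change is the axis along which the statistics are taken (again a minimal NumPy sketch with scalar `gamma`/`beta`):

```python
import numpy as np

def layer_norm(z, gamma=1.0, beta=0.0, eps=1e-5):
    """Layer Normalization over z of shape (b, d): mean and variance are
    taken within each sample (axis 1), independently of the other samples."""
    mu = z.mean(axis=1, keepdims=True)
    var = z.var(axis=1, keepdims=True)
    z_norm = (z - mu) / np.sqrt(var + eps)
    return gamma * z_norm + beta

z = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
z_tilde = layer_norm(z)
# with gamma=1, beta=0 each row (sample) now has mean ~0 and variance ~1
```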
The difference in computation axes can be seen in the figure below:

<div style="text-align:center;">
    <img src="../../imgs/miscs/bn_vs_ln.png" alt="bn_vs_ln" style="width:50%;" />
</div>
Examples of Batch Normalization and Layer Normalization:

<div style="text-align:center;">
    <img src="graph/6.2.png" alt="example of Batch Normalization" style="width:40%;" />
    <img src="graph/6.3.png" alt="example of Layer Normalization" style="width:40%;" />
</div>
## Instance Normalization

## Group Normalization