Commit
1 parent 1bf4318, commit 9448db0
Showing 6 changed files with 155 additions and 4 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,2 +1,11 @@ | ||
# Coursework (1) | ||
|
||
## Exercise 1 | ||
|
||
Please provide several main optimization forms in reinforcement learning | ||
|
||
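Do not just list optimization methods: **you must write out the optimization objective and the constraints**. You may pick one algorithm used in RL. The objective and constraints are best expressed in the General Formulation of The Optimization Problem. | ||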
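
As a reference point, here is a minimal sketch of the general formulation. The symbols $f_0$, $f_i$, $h_j$ are generic placeholders introduced here, not taken from the coursework statement:

$$
\begin{aligned}
\min_{x\in\mathbb{R}^n}\quad & f_0(x)\\
\text{s.t.}\quad & f_i(x) \leqslant 0,\quad i=1,\dots,m\\
& h_j(x) = 0,\quad j=1,\dots,p
\end{aligned}
$$

An RL algorithm fits this template once its objective and constraints are named explicitly. As one illustration (chosen here, not prescribed by the exercise), the TRPO policy update is usually stated as a surrogate objective with a KL trust-region constraint of radius $\delta$:

$$
\begin{aligned}
\max_{\theta}\quad & \mathbb{E}_{s,a\sim\pi_{\theta_{\text{old}}}}\left[\frac{\pi_\theta(a\mid s)}{\pi_{\theta_{\text{old}}}(a\mid s)}\,A^{\pi_{\theta_{\text{old}}}}(s,a)\right]\\
\text{s.t.}\quad & \mathbb{E}_{s\sim\pi_{\theta_{\text{old}}}}\left[D_{\mathrm{KL}}\left(\pi_{\theta_{\text{old}}}(\cdot\mid s)\,\|\,\pi_\theta(\cdot\mid s)\right)\right] \leqslant \delta
\end{aligned}
$$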
|
||
## Exercise 3 | ||
|
||
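You cannot simply drop the floor symbol, nor can you just say "approximately equal"; the floor has to be handled by bounding it from above and below. |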
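
The bound that usually drives this kind of argument is the two-sided estimate of the floor function. The sketch below uses a generic real $x>0$ as a placeholder, since the exercise's actual expression is not reproduced here:

$$
\begin{gathered}
x - 1 < \lfloor x \rfloor \leqslant x\\
\frac{x-1}{x} < \frac{\lfloor x \rfloor}{x} \leqslant 1 \quad (x>0)
\end{gathered}
$$

Since the left-hand side tends to $1$ as $x\to\infty$, the squeeze theorem gives $\lfloor x\rfloor / x \to 1$ without ever writing $\lfloor x\rfloor \approx x$.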
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1,16 @@ | ||
# Coursework (2) | ||
# Coursework (2) | ||
|
||
## Exercise 1 | ||
|
||
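- $f$ takes an $n$-dimensional vector as input and returns an $m$-dimensional vector, i.e. it is a vector-valued function of several variables | ||
- Differentiating such a function gives the Jacobian matrix, not a gradient | ||
- A Jacobian matrix cannot be paired with a vector in an inner product | ||
- In the common special case $m=1$ the Jacobian reduces to the gradient; check that the dimensions match | ||
- Integrating a gradient with respect to a scalar variable still gives a vector, so $f(x)=f(x_0)+\int_{0}^1...\mathrm dt$ is already wrong on dimensional grounds | ||
- If you take a vector $t$ as the integration variable and write $\int_{x_0}^x \nabla f(t) \mathrm dt$, then, to begin with, both vectors are column vectors by default and cannot be multiplied directly | ||
- Even written as an inner product or a transposed product it is still wrong: extending from one variable to several is not that simple, and the two differ by a shift of the origin | ||
- Using a scalar as the integration variable is recommended | ||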
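
A sketch of the scalar-parameter form, assuming $f$ is continuously differentiable on the segment between $x_0$ and $x$, and writing $J_f$ for the Jacobian (this notation is an assumption made here):

$$
\begin{aligned}
g(t) &= f\big(x_0 + t(x - x_0)\big),\quad t\in[0,1]\\
f(x) - f(x_0) &= g(1) - g(0) = \int_0^1 g'(t)\,\mathrm{d}t
= \int_0^1 J_f\big(x_0 + t(x - x_0)\big)\,(x - x_0)\,\mathrm{d}t
\end{aligned}
$$

The dimensions match: $J_f\in\mathbb{R}^{m\times n}$ times $(x-x_0)\in\mathbb{R}^n$ gives a vector in $\mathbb{R}^m$, the same as $f(x)-f(x_0)$. For $m=1$ the integrand becomes the scalar $\langle \nabla f(x_0 + t(x-x_0)),\, x-x_0\rangle$.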
|
||
## Exercise 3 | ||
|
||
Using $\sup$ is better than $\max$. |
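
The reason is that $\max$ presupposes the extreme value is attained, while $\sup$ does not. A one-line example (not from the exercise) where the two differ:

$$
\sup_{x\in(0,1)} x = 1, \qquad \max_{x\in(0,1)} x \ \text{does not exist.}
$$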
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,32 @@ | ||
# Coursework (3) | ||
|
||
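Many of you may have trouble with Exercise 3 of the third coursework. Note that taking the inf separately can give something smaller than taking the inf jointly, i.e. inf f(x_1,y) + inf f(x_2,y) <= inf [ f(x_1,y)+f(x_2,y) ] | ||
Recall the real-valued case of inf f(x): if inf f(x) ≠ -∞, then one can pick a sequence x_n with f(x_n) non-increasing such that \lim_{n\to ∞} f(x_n) = inf f(x). This may be a useful reference. | ||
If you really cannot do it, you may assume that there exists x_0 with f(x_0) = inf f(x); that is still better than a bogus or incorrect proof. | ||
## Differentiability conditions | ||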
|
||
Exercise 1-3: $f$ is not necessarily differentiable. Exercise 5-7: assume by default that $f$ is continuously differentiable. | ||
|
||
## Exercise 3 | ||
|
||
Note that taking the inf separately can give something smaller than taking the inf jointly, i.e. | ||
|
||
$$ | ||
\inf_{y\in Y} f(x_1,y) + \inf_{y\in Y} f(x_2,y) \leqslant \inf_{y\in Y} [ f(x_1,y)+f(x_2,y) ] | ||
$$ | ||
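
A short justification of this inequality, assuming both infima on the left are finite: for every $y\in Y$,

$$
f(x_1,y) + f(x_2,y) \geqslant \inf_{y'\in Y} f(x_1,y') + \inf_{y'\in Y} f(x_2,y'),
$$

and the right-hand side does not depend on $y$, so it is a lower bound for the left-hand side and therefore for its infimum over $y$. The inequality can be strict. A toy example (constructed here, not from the exercise) is $Y=\{0,1\}$, $f(x_1,y)=y$, $f(x_2,y)=1-y$, where the separate infima sum to $0$ while the joint infimum is $1$.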
|
||
Over the reals, if $\inf f(x) \neq -\infty$, then one can pick a sequence $x_n$ with $f(x_n)$ non-increasing such that $\lim\limits_{n\to \infty} f(x_n) = \inf f(x)$. This may be a useful reference. | ||
|
||
If this proves too difficult, you may assume $\exists x_0$ such that $f(x_0) = \inf f(x)$, but try the general $\inf$ treatment first. | ||
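
The general $\inf$ treatment typically rests on the following characterization, written here with $m$ as a shorthand introduced for this sketch:

$$
m = \inf_x f(x) > -\infty
\;\Longrightarrow\;
\forall \varepsilon > 0,\ \exists x_\varepsilon:\quad m \leqslant f(x_\varepsilon) < m + \varepsilon .
$$

Taking $\varepsilon = 1/n$ gives points $x_n$ with $f(x_n)\to m$, so the proof can work with near-minimizers and let $\varepsilon\to 0$ at the end, without assuming the infimum is attained.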
|
||
## Exercise 4 | ||
|
||
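- $|x|^{p-2}$ is problematic at $x=0$; consider $p=1$ | ||
- For (2)(3)(4), when working on your own you should think about how to differentiate at $x=0$. As a simplification, the submitted coursework may state the derivative directly, but the non-differentiable cases still have to be addressed | ||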
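
A sketch of the issue, assuming the factor $|x|^{p-2}$ comes from differentiating $|x|^p$ for scalar $x$ (an assumption, since the exercise text is not reproduced here):

$$
\begin{aligned}
\frac{\mathrm{d}}{\mathrm{d}x}|x|^p &= p\,|x|^{p-2}x, \quad x\neq 0\\
p=1:\quad \frac{\mathrm{d}}{\mathrm{d}x}|x| &= \frac{x}{|x|},\quad x\neq 0,
\qquad \lim_{h\to 0^{\pm}} \frac{|h|}{h} = \pm 1\\
p>1:\quad \lim_{h\to 0}\frac{|h|^p}{h} &= 0
\end{aligned}
$$

So for $p=1$ the one-sided derivatives at $0$ disagree and $|x|$ is not differentiable there, while for $p>1$ the derivative at $0$ exists and equals $0$, even though $|x|^{p-2}$ alone is undefined at $0$ when $p<2$.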
|
||
## Exercise 5-7 | ||
|
||
$\mathcal{F}_L^{1, 1}(\mathbb{R})$ means that | ||
|
||
- $f$ is continuously differentiable everywhere | ||
- $f$ is convex | ||
- $\nabla f$ is Lipschitz continuous with constant $L$ | ||
|
||
A complete proof has to establish all three points, though the third is the main one. |
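
Written out, the third point is the inequality below. In the one-dimensional setting the norm is just the absolute value:

$$
\|\nabla f(x) - \nabla f(y)\| \leqslant L\,\|x - y\| \quad \text{for all } x, y .
$$

If $f$ happens to be twice differentiable (an extra assumption, not part of the class definition), a convenient sufficient check in one dimension is $0 \leqslant f''(x) \leqslant L$ for all $x$: the lower bound gives convexity, and the upper bound gives the Lipschitz condition via the mean value theorem.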
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,98 @@ | ||
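# Typing mathematical formulas | ||
|
||
!!! info "You are encouraged to look up further material yourself; only commonly used items are covered here" | ||
|
||
## Math environments | ||
For typesetting in math mode, two environments are briefly introduced here: `aligned` and `gathered` | ||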
|
||
The `aligned` environment is typically used to align formulas: `\\` starts a new line and `&` marks the alignment point. | ||
=== "Rendered" | ||
$$ | ||
\begin{aligned} | ||
\int_0^1 x^2\mathrm{d}x | ||
&= \frac{1}{3}x^3\bigg|_0^1\\ | ||
&= \frac{1}{3} | ||
\end{aligned} | ||
$$ | ||
|
||
=== "LaTeX source" | ||
```latex | ||
$$ | ||
\begin{aligned} | ||
\int_0^1 x^2\mathrm{d}x | ||
&= \frac{1}{3}x^3\bigg|_0^1\\ | ||
&= \frac{1}{3} | ||
\end{aligned} | ||
$$ | ||
``` | ||
|
||
The `gathered` environment is typically used to center several lines of formulas; `\\` starts a new line. | ||
=== "Rendered" | ||
$$ | ||
\begin{gathered} | ||
\int_0^1 x^2\mathrm{d}x = \frac{1}{3}\\ | ||
a^2+b^2=c^2 | ||
\end{gathered} | ||
$$ | ||
|
||
=== "LaTeX source" | ||
```latex | ||
$$ | ||
\begin{gathered} | ||
\int_0^1 x^2\mathrm{d}x = \frac{1}{3}\\ | ||
a^2+b^2=c^2 | ||
\end{gathered} | ||
$$ | ||
``` | ||
|
||
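Replacing `aligned` with `align` and `gathered` with `gather` gives every line an automatic equation number on the right; choose whichever suits your needs. Note that in a LaTeX document `align` and `gather` are standalone display environments from `amsmath` and, unlike `aligned` and `gathered`, should not be wrapped in `$$ ... $$`. | ||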
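
A minimal sketch of the numbered variants, reusing the formulas from the examples above and assuming `amsmath` is loaded in the preamble:

```latex
% align: numbered lines, aligned at &; no surrounding $$
\begin{align}
  \int_0^1 x^2\mathrm{d}x
  &= \frac{1}{3}x^3\bigg|_0^1 \notag\\  % \notag suppresses the number on this line
  &= \frac{1}{3}
\end{align}

% gather: numbered lines, each centered
\begin{gather}
  \int_0^1 x^2\mathrm{d}x = \frac{1}{3}\\
  a^2+b^2=c^2
\end{gather}
```

The starred forms `align*` and `gather*` keep the same layout without any numbering.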
|
||
## Common mathematical symbols | ||
|
||
Symbols that may come up in the coursework but that you may not know: | ||
|
||
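- Greater than or equal $\geqslant$, less than or equal $\leqslant$: `\geqslant`, `\leqslant` | ||
- Angle brackets for inner products $\langle\rangle$: `\langle`, `\rangle` | ||
- Partial derivative symbol $\partial$: `\partial` | ||
- The differential $\mathrm{d}$: `\mathrm{d}` is recommended | ||
- Norm $\|a\|$: can be written as `\| a \|` | ||
- When the content of the norm is tall (for example when it contains a fraction such as `\frac{a}{x}`), use `\left\| \frac{a}{x} \right\|` so that the delimiters scale to fit. | ||
- The same `\left \right` approach also applies to `() [] ||` and so on | ||
- Gradient operator $\nabla$: `\nabla` | ||
- Upright triangle (capital delta) $\Delta$: `\Delta` | ||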
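
To see several of these symbols working together, here is a small illustrative formula. The expression itself is made up for this sketch, and in a plain LaTeX document `\leqslant` and `\geqslant` additionally need the `amssymb` package:

```latex
$$
\left\langle \nabla f(x),\, y - x \right\rangle
\leqslant \left\| \nabla f(x) \right\| \, \| y - x \|,
\qquad
\left\| \frac{a}{x} \right\| \geqslant 0,
\qquad
\int_0^1 \frac{\partial f}{\partial x_1}\,\mathrm{d}x_1
$$
```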
|
||
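## Shortcuts for frequently used symbols | ||
|
||
Some symbols are long to type but used all the time, for example `\mathrm{d}`; typing it out every time wastes effort. | ||
|
||
The usual remedy is to define a custom command, similar to `#define` in C: we can define `\di` to stand for `\mathrm{d}`. After all the `\usepackage` lines, add the following command | ||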
|
||
```latex | ||
\newcommand{\di}{\mathrm{d}} | ||
``` | ||
|
||
You can then write `\di` instead of `\mathrm{d}`. Make sure the name you choose does not clash with an existing command, otherwise compilation will fail. | ||
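
A minimal sketch of how the three standard definition commands behave when a name may already be taken. The document skeleton and the refined `\di` definition are illustrative choices, not part of the course template:

```latex
\documentclass{article}
\usepackage{amsmath}

% \newcommand raises an error if the name already exists,
% e.g. \newcommand{\d}{...} fails because \d is the dot-under accent.
\newcommand{\di}{\mathrm{d}}

% \providecommand defines the command only if the name is still free;
% here \di already exists, so this line is silently ignored.
\providecommand{\di}{\mathrm{d}}

% \renewcommand deliberately overrides an existing definition.
\renewcommand{\di}{\mathop{}\!\mathrm{d}}

\begin{document}
$\int_0^1 x^2 \di x = \frac{1}{3}$
\end{document}
```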
|
||
Here are some abbreviations I use: | ||
```latex | ||
\newcommand{\inprod}[1]{\left\langle#1\right\rangle} | ||
\newcommand{\norm}[1]{\left\|#1\right\|} | ||
\newcommand{\dif}[2]{\frac{\mathrm{d} #1}{\mathrm{d} #2}} | ||
\newcommand{\pard}[2]{\frac{\partial #1}{\partial #2}} | ||
\newcommand{\trans}{^{\top}} | ||
\newcommand{\di}{\mathrm{d}} | ||
\newcommand{\R}{\mathbb{R}} | ||
\newcommand{\F}{\mathcal{F}} | ||
\newcommand{\Sc}{\mathcal{S}} | ||
\newcommand{\xB}{\bm{x}} | ||
\newcommand{\yB}{\bm{y}} | ||
\newcommand{\gB}{\bm{g}} | ||
\newcommand{\dom}{\mathrm{dom}} | ||
\newcommand{\epi}{\mathrm{epi}} | ||
``` | ||
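
Some of the abbreviations above rely on packages. A sketch of the preamble lines they would need, assuming a standard LaTeX setup (these go before the `\newcommand` lines):

```latex
\usepackage{amsmath}  % aligned, gathered, align, gather
\usepackage{amssymb}  % \mathbb (used by \R), \geqslant, \leqslant
\usepackage{bm}       % \bm (used by \xB, \yB, \gB)
```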
|
||
Some of these commands take arguments. For example, `\pard` can be used as `\pard{f}{x_1}`, which renders as | ||
|
||
$$ | ||
\frac{\partial f}{\partial x_1} | ||
$$ |
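
The other commands with arguments work the same way. A short sketch, assuming the definitions listed above are in the preamble:

```latex
$$
\dif{y}{x}, \qquad      % expands to \frac{\mathrm{d} y}{\mathrm{d} x}
\inprod{\xB, \yB}, \qquad  % expands to \left\langle\bm{x}, \bm{y}\right\rangle
\norm{\gB}              % expands to \left\|\bm{g}\right\|
$$
```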