From fe86cb653b87fe0c46bef72f519dbddffb49d305 Mon Sep 17 00:00:00 2001
From: zengchen <1239571995@qq.com>
Date: Mon, 16 Oct 2023 18:07:43 +0800
Subject: [PATCH 1/2] =?UTF-8?q?=E5=AE=8C=E5=96=84Chapter8.5=E7=A9=BA?=
 =?UTF-8?q?=E9=97=B4=E5=A4=8D=E6=9D=82=E5=BA=A6=E8=A1=A8=E7=A4=BA?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 chapter_recurrent-modern/gru.md                  | 1 -
 chapter_recurrent-neural-networks/rnn-scratch.md | 2 +-
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/chapter_recurrent-modern/gru.md b/chapter_recurrent-modern/gru.md
index 46ab4bd16..5c08b75b5 100644
--- a/chapter_recurrent-modern/gru.md
+++ b/chapter_recurrent-modern/gru.md
@@ -70,7 +70,6 @@ $\mathbf{H}_{t-1} \in \mathbb{R}^{n \times h}$ (隐藏单元个数$h$)。
 那么,重置门$\mathbf{R}_t \in \mathbb{R}^{n \times h}$和
 更新门$\mathbf{Z}_t \in \mathbb{R}^{n \times h}$的计算如下所示:
 
-
 $$
 \begin{aligned}
 \mathbf{R}_t = \sigma(\mathbf{X}_t \mathbf{W}_{xr} + \mathbf{H}_{t-1} \mathbf{W}_{hr} + \mathbf{b}_r),\\
diff --git a/chapter_recurrent-neural-networks/rnn-scratch.md b/chapter_recurrent-neural-networks/rnn-scratch.md
index 5d6da152b..8650ebb6d 100644
--- a/chapter_recurrent-neural-networks/rnn-scratch.md
+++ b/chapter_recurrent-neural-networks/rnn-scratch.md
@@ -537,7 +537,7 @@ predict_ch8('time traveller ', 10, net, vocab)
 ## [**梯度裁剪**]
 
 对于长度为$T$的序列,我们在迭代中计算这$T$个时间步上的梯度,
-将会在反向传播过程中产生长度为$\mathcal{O}(T)$的矩阵乘法链。
+将会在反向传播过程中产生长度为$O(T)$的矩阵乘法链。
 如 :numref:`sec_numerical_stability`所述,
 当$T$较大时,它可能导致数值不稳定,
 例如可能导致梯度爆炸或梯度消失。

From 62b272a723c20d98613ec9f3801ba2a39a791746 Mon Sep 17 00:00:00 2001
From: zengchen <1239571995@qq.com>
Date: Mon, 16 Oct 2023 18:15:20 +0800
Subject: [PATCH 2/2] =?UTF-8?q?=E5=AE=8C=E5=96=84Chapter8.5=E7=A9=BA?=
 =?UTF-8?q?=E9=97=B4=E5=A4=8D=E6=9D=82=E5=BA=A6=E8=A1=A8=E7=A4=BA?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 chapter_recurrent-modern/gru.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/chapter_recurrent-modern/gru.md b/chapter_recurrent-modern/gru.md
index 5c08b75b5..93c46afc9 100644
--- a/chapter_recurrent-modern/gru.md
+++ b/chapter_recurrent-modern/gru.md
@@ -70,6 +70,8 @@ $\mathbf{H}_{t-1} \in \mathbb{R}^{n \times h}$ (隐藏单元个数$h$)。
 那么,重置门$\mathbf{R}_t \in \mathbb{R}^{n \times h}$和
 更新门$\mathbf{Z}_t \in \mathbb{R}^{n \times h}$的计算如下所示:
 
+
+
 $$
 \begin{aligned}
 \mathbf{R}_t = \sigma(\mathbf{X}_t \mathbf{W}_{xr} + \mathbf{H}_{t-1} \mathbf{W}_{hr} + \mathbf{b}_r),\\