From 3fc9cabb876074b8804df1fb35d960b1b706b233 Mon Sep 17 00:00:00 2001
From: James MacGlashan
Date: Sun, 17 Nov 2024 14:05:10 -0500
Subject: [PATCH] typos

---
 content/posts/q_learning_doesnt_need_importance_sampling.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/content/posts/q_learning_doesnt_need_importance_sampling.md b/content/posts/q_learning_doesnt_need_importance_sampling.md
index 3f39bbb..dab95e6 100644
--- a/content/posts/q_learning_doesnt_need_importance_sampling.md
+++ b/content/posts/q_learning_doesnt_need_importance_sampling.md
@@ -87,7 +87,7 @@ $$
 \end{align*}
 $$
 
-Ahah! We turned our expected value of Bob's distribution $\pi$ into an expected value of Alice's distribution $\mu$! We just had to weigh our payout values $r$ by $\frac{\pi(a)}{\mu(a)}$. Now we're back to an expected values of a joint distribution and we can approximate it with the mean of our samples:
+Ahah! We turned our expected value of Bob's distribution $\pi$ into an expected value of Alice's distribution $\mu$! We just had to weigh our payout values $r$ by $\frac{\pi(a)}{\mu(a)}$. Now we're back to an expected value of a joint distribution and we can approximate it with the mean of our samples:
 
 $$
 E_{a \sim \mu} \left [ E_{r \sim p} \left[ \frac{\pi(a)}{\mu(a)} r \right] \right]