Typos and minor mistakes #4

Open
wants to merge 19 commits into
base: main
from
30 changes: 17 additions & 13 deletions 2_mathematical_spaces/spaces.qmd
Original file line number Diff line number Diff line change
@@ -36,7 +36,7 @@ sophisticated sets but also sets that are equipped with additional structure.
These combinations of sets and structure are also known as _spaces_.

In this chapter we will discuss the basic conceptual features of general spaces
before reviewing some of prototypical spaces that are particularly common in
before reviewing some of the prototypical spaces that are particularly common in
practical applications. This presentation will include not only the properties
of sets with an arbitrary number of elements but also a survey of some of the
most fundamental structures that we can endow onto those sets.
@@ -104,7 +104,7 @@ In many circumstances we will need to distinguish between variables that refer
to _arbitrary_ elements and variables that refer to _particular_ but unspecified
elements. Following the computer science canon I will refer to these as
**unbound variables** and **bound variables**, respectively. To distinguish
between the two I will decorate bound variables with a tilde; in words $x$
between the two I will decorate bound variables with a tilde; in other words $x$
denotes any element of the space $X$ while $\tilde{x}$ denotes a fixed but
unspecified element.

@@ -154,7 +154,7 @@ single element, and a **full set** consisting of the entire set (@fig-subsets).
Most subsets, however, contain an intermediate number of elements. One of the
key features of uncountable spaces is that most subsets also contain an
uncountable number of elements. To visually represent subsets containing an
uncountable number of elements I will used filled shapes to contrast against
uncountable number of elements I will use filled shapes to contrast against
individual points.

::: {#fig-subsets layout="[ [-5, 45, 45, -5], [-5, 45, 45, -5]]"}
@@ -787,7 +787,7 @@ most subsets will be neither open nor closed.

Unlike open balls these metric-derived open subsets are _closed_ under unions
and intersections. If $\mathsf{x}_{1}$ and $\mathsf{x}_{2}$ are both open
subsets then $\mathsf{x}_{1} \cup \mathsf{x}_{2}$ will also an open subset. In
subsets then $\mathsf{x}_{1} \cup \mathsf{x}_{2}$ will also be an open subset. In
fact the union of _any_ number of open subsets will be open. Likewise if
$\mathsf{x}_{1}$ and $\mathsf{x}_{2}$ are both open subsets then
$\mathsf{x}_{1} \cap \mathsf{x}_{2}$ will also be an open subset. Indeed the
@@ -964,7 +964,7 @@ figures/structures/general_topology/convergence/convergence){
width=90% #fig-general-convergence}

A subtle benefit of this topological definition of convergence is that because
it doesn't require a metric is also doesn't require us to define the positive
it doesn't require a metric, it also doesn't require us to define the positive
real numbers. This can be helpful for avoiding circular logic in more technical
mathematical analyses.

@@ -1021,10 +1021,10 @@ algebra, or metric just builds on top of that foundation.
Mathematically it is much easier to work with structures that are _compatible_
with each other. For example if we want to equip a set with both a topology
and a metric then the resulting space will be particularly well-behaved if
the we use a metric topology. At the same time ambient structure can also
we use a metric topology. At the same time ambient structure can also
distinguish certain compatible subsets.

For example is a set is equipped with an ordering then we can define **interval**
For example if a set is equipped with an ordering then we can define **interval**
subsets that contain all elements above and below two boundary elements. An
**open interval** excludes both boundary elements,
$$
@@ -1045,7 +1045,7 @@ intervals that contain only one boundary,
\end{align*}
Note that these notions of "open" and "closed" subsets are in general distinct
from the open and closed subsets defined by a topology. Only when an ordering
is compatible with a topology will these the open and closed intervals also be
is compatible with a topology will these open and closed intervals also be
topologically open and closed.

As we saw in [Section 1.2.4.1](@sec:open-balls) a metric distinguishes subsets
@@ -1086,8 +1086,8 @@ $\mathsf{x}_{1} \subset X$ is smaller than a subset $\mathsf{x}_{2} \subset X$
if $\mathsf{x}_{1} \subset \mathsf{x}_{2}$, and larger if
$\mathsf{x}_{2} \subset \mathsf{x}_{1}$. Two subsets that only partially
overlap are incomparable, and hence fall into the same place in the sequential
ordering. For any set the empty set will always the smallest subset and the
full set will always the largest.
ordering. For any set the empty set will always be the smallest subset and the
full set will always be the largest.

The union and intersection operations introduce algebraic structure, known as a
**Boolean algebraic structure**, to the power set. They are both commutative,
@@ -1311,6 +1311,10 @@ figures/real_line_grid/real_line_grid){width=50% #fig-real-line-grid}

## Extended Real Lines

<!---
Two weird formulations in the next 2 sentences.
-->

One limitation of a real line is that it does not contains points that
_approach_ either negative or positive infinity, but not points that represent
those limits directly. An **extended real line** resolves introduces two new
@@ -1567,7 +1571,7 @@ Using this notation we can define an inverse function a bit more compactly as
$$
\text{Id} = f^{-1} \circ f.
$$
In words the composition of a bijective function with its inverse function is
In other words the composition of a bijective function with its inverse function is
the identify function.

## Relating Structures
@@ -1852,7 +1856,7 @@ Pushforward and pullback functions allow us to _lift_ a transformation between
sets into a transformation between spaces. For structure that can be pushed
forward along the function $f : X \rightarrow Y$ any input space
$(X, \mathfrak{x})$ automatically defines a compatible output space
$(Y, f_{*}(\mathfrak{x}))$. Similarly for structure that can be pulled back
$(Y, f_{*}(\mathfrak{x}))$. Similarly for a structure that can be pulled back
against $f$ any output space $(Y, \mathfrak{y})$ automatically defines a
compatible input space $(X, f^{*}(\mathfrak{y}))$.

@@ -1923,7 +1927,7 @@ latter not (@fig-monoticity).

Monotonically increasing functions preserve orderings so that larger inputs
always imply larger outputs. The function (a) $f_{1} : x \mapsto x^{3}$ is
monotonic but the function (b) $f_{1} : x \mapsto -x^{3}$ is not.
monotonic but the function (b) $f : x \mapsto -x^{2}$ is not.
:::

#### Algebra-Preserving Relations
8 changes: 4 additions & 4 deletions 3_product_spaces/product_spaces.qmd
@@ -181,7 +181,7 @@ $X_{2}$.

Each element of the product set is uniquely specified by one element of $X_{1}$
and one element of $X_{2}$. Consequently every variable taking values in the
product set $x \in X_{1} \times X_{2}$ is compromised of an ordered pair of
product set $x \in X_{1} \times X_{2}$ is comprised of an ordered pair of
variables from each component space,
$$
x = (x_{1}, x_{2}),
@@ -282,7 +282,7 @@ X_{1} \times \ldots \times X_{i} \times \ldots \times X_{I}
=
\times_{i = 1}^{I} X_{i}
$$
where every product variable $x \in \times_{i = 1}^{I} X_{i}$ is compromised of
where every product variable $x \in \times_{i = 1}^{I} X_{i}$ is comprised of
a ordered collection of component variables
$$
x = ( x_{1}, \ldots, x_{i}, \ldots, x_{I})
@@ -609,7 +609,7 @@ component full sets we will always be able to construct the product empty set
and product full set from this procedure.

Moreover because the component open subsets are finite intersections these
productsubsets will be as well. For example given any finite collection of open
product subsets will be as well. For example given any finite collection of open
component subsets
$$
\{ \mathsf{x}_{1, i}, \ldots, \mathsf{x}_{j, i}, \ldots \mathsf{x}_{J, i} \}
@@ -870,7 +870,7 @@ $$
( 1, \ldots, i_{1} - 1, i_{1} + 1, \ldots, i_{J} - 1, i_{J} + 1, \ldots, J )
$$
define yet another product set. Replicating this second product set once for
each element of first product set defines a collection of **cross sections sets**
each element of the first product set defines a collection of **cross sections sets**
(@fig-conditioning),
\begin{align*}
\times_{i' = 1}^{I} X_{i'} \mid (x_{i_{1}}, \ldots, x_{i_{J}})
Original file line number Diff line number Diff line change
@@ -205,7 +205,7 @@ sufficiently useful for practical application or if we need to consider
countably additive measures, let alone measures that might be additive
over even larger collections of subsets.

For example a common problem that arises is practice is reconstructing
For example a common problem that arises in practice is reconstructing
the measure allocated to a general subset from the measures allocated to
particularly nice subsets that are easier with which to work. If we
could always decompose a generic subset into the disjoint union of a
@@ -217,7 +217,7 @@ Potentially some subsets might be decomposable only into an uncountably
infinite number of subsets in which case we would need even stronger
notions of additivity!

Fortunately for us we don't have to go to that last extreme. In turns
Fortunately for us we don't have to go to that last extreme. It turns
out that on most spaces that we'll encounter in practice, and typical
notions of "nice" subsets, countable additivity is sufficient for
reconstructing the measure allocated to more general subsets.
@@ -228,9 +228,9 @@ allocations to **rectangular** subsets (@fig-disk-decomposition). In
general a non-rectangular subset, in this case a disk, can be crudely
approximated by a single rectangular subset. The disk can be
approximated more precisely as the disjoint union of many different
rectangular subsets, but that will never exact reconstruct the disk.
rectangular subsets, but that will never exactly reconstruct the disk.
Only when we incorporate a countably infinite number of rectangular
subsets can be reconstruct the disk without any error.
subsets can we reconstruct the disk without any error.

![On a two-dimensional real plane $\mathbb{R}^{2}$ a non-rectangular
disc can be approximated, but not exactly reconstructed, by the finite
@@ -339,7 +339,7 @@ Similarly the elements of a $\sigma$-algebra are known as
**measurable subsets** while any subsets in the power set but not in the
$\sigma$-algebra are referred to as **non-measurable** subsets.

When non-measurable subsets are misbehaving subsets they reveals the
When non-measurable subsets are misbehaving subsets they reveal the
subtle, and often counterintuitive, pathologies inherent to that space.
By working with $\sigma$-algebras directly we can avoid these awkward
pathologies entirely.
@@ -415,7 +415,7 @@ behaviors that we have to avoid at all! I will refer to any measurable
space $(X, 2^{X})$ compatible with a discrete topology as
**discrete measurable spaces**.

On the the other hand the Borel $\sigma$-algebra derived from the
On the other hand the Borel $\sigma$-algebra derived from the
topology that defines the real line filters out all of the
non-constructive subsets and their undesired behaviors while keeping all
of the interval subsets and the subsets that we can derive from them.
@@ -973,7 +973,7 @@ figures/interval_partitions/interval_partitions){
width=90% #fig-equal-length-intervals}

The easiest way to accomplish this uniformity is to allocate to each
interval a measure directly equal to the its length,
interval a measure directly equal to its length,
$$
\lambda( \, [x_{1}, x_{2}] \, )
= L( \, [x_{1}, x_{2}] \, )
34 changes: 17 additions & 17 deletions 5_expectation_values/expectation_values.qmd
Original file line number Diff line number Diff line change
Expand Up @@ -42,7 +42,7 @@ from calculus. This measure-informed integration operation summarizes
the interaction between a measure and a given function, allowing us to
use one to learn about the other.

We will being our exploration of measure-informed integration with a
We will begin our exploration of measure-informed integration with a
heuristic construction on finite measure spaces before considering a
more formal, but also more abstract, construction that applies to any
measure space. Next we'll investigate how the specification of
@@ -56,7 +56,7 @@ exceptional measures whose integrals can be computed algorithmically.
# Integration on Finite Measure Spaces {#sec:finite_integration}

To start our discussion of measure-informed integration as simply as
possible let's begin by considering a finite measure space compromised
possible let's begin by considering a finite measure space comprised
of the finite set
$$
X = \{ \Box, \clubsuit, \diamondsuit, \heartsuit, \spadesuit \},
@@ -409,9 +409,9 @@ corresponding simple function decomposition,
In general a non-negative, measurable function can be represented by
more than one simple function decomposition. Fortunately the
measure-informed integral derived from any of them will always be the
same. Consequently there's no worry for ambiguous of otherwise
same. Consequently there's no worry for ambiguous or otherwise
inconsistent answers, and measure-informed integrals for non-negative,
measurable function are completely well-behaved.
measurable functions are completely well-behaved.

This procedure for defining measure-informed integrals through simple
functions representations is known as **Lebesgue integration** in the
@@ -722,7 +722,7 @@ this form.
Because $L(X, \mathcal{X}, \mu)$ contains all of the indicator functions
this functional relationship between $L(X, \mathcal{X}, \mu)$ and
$\mathbb{R}$ determines the allocations to every measurable subset,
and hence full determines the measure $\mu$. At the same time
and hence fully determines the measure $\mu$. At the same time
$L(X, \mathcal{X}, \mu)$ also contains many integrands that are not
indicator functions, and hence quite a bit of redundant information
about $\mu$.
@@ -875,7 +875,7 @@ which is not, in general, equal to $1$. In other words scaling a
probability distribution results not in another probability distribution
but rather a generic measure.

If we want transform one probability distribution into another then we
If we want to transform one probability distribution into another then we
need to correct for the modified normalization, defining
\begin{align*}
\mathbb{E}_{g \ast \pi} [ f ]
@@ -1039,8 +1039,8 @@ important in practice.

### The Mean

If an embedding function is an integrand than we can evaluate its
measure-informed integral, $\mathbb{I}_{\mu}[\iota]$. The ultimately
If an embedding function is an integrand then we can evaluate its
measure-informed integral, $\mathbb{I}_{\mu}[\iota]$. The ultimate
utility of this measure-informed integral, however, depends on what
information about the ambient measure it extracts.

@@ -1272,7 +1272,7 @@ working with spaces like circles, spheres, torii, and more. Many
analyses on these spaces have been undermined by attempts to summarize
measures with moments that don't actually exist!

All of this said we still to take care with the necessary conditions
All of this said we still have to take care with the necessary conditions
when working with more familiar spaces as well. For example in
[Section 5.2.2](@sec:practical_lebesgue) we'll learn that the identify
function from a real line into itself is not integrable with respect to
@@ -1354,7 +1354,7 @@ skewed towards smaller or larger values.

![](figures/histograms/varying_behaviors/multimodal/multimodal){#fig-hist-multimodal}

Histogram are extremely effective at communicating the basic features of
Histograms are extremely effective at communicating the basic features of
a measure. The measure in (a) is diffuse but decaying, allocating more
measure at smaller points than larger points. Conversely the measure in
(b) concentrates around a single point while the measure in (c)
@@ -1409,7 +1409,7 @@ M :\; & X & &\rightarrow& \; &[0, \mu(X)]&
\\
& x & &\mapsto& & M_{\mu}(x) = \mu(\mathsf{I}_{x}) = \mathbb{I}_{\mu}[I_{\mathsf{I}_{x}}] &.
\end{alignat*}
According this mapping is known as a
This mapping is known as a
**cumulative distribution function** (@fig-cdf-basics).

![A cumulative distribution function quantifies how measure is allocated
@@ -1451,7 +1451,7 @@ are any gaps in the allocation, intermediate intervals with zero
allocated measure, then the cumulative distribution function will
flatten out completely (@fig-cdf-gap).

::: {#fig-hist-examples layout="[-5, 30, 30, 30, -5]"}
::: {#fig-cdf-examples layout="[-5, 30, 30, 30, -5]"}
![](figures/cdfs/cdf_behaviors/unimodal/unimodal){#fig-cdf-unimodal}

![](figures/cdfs/cdf_behaviors/narrow_unimodal/narrow_unimodal){#fig-cdf-narrow-unimodal}
@@ -1461,7 +1461,7 @@ flatten out completely (@fig-cdf-gap).
A careful survey of a cumulative distribution function can communicate a
wealth of information about the ambient measure. (a) Here the ambient
measure is unimodal with the cumulative distribution function
appreciably increasingly only one we reach the central neighborhood
appreciably increasing only once we reach the central neighborhood
where the measure allocation is concentrated. (b) A narrower
concentration results in a steeper cumulative distribution function.
(c) A cumulative distribution function flattens if there are any gaps
@@ -1607,7 +1607,7 @@ accumulated measure below $m$,
$$
x_{m-} = \underset{x \in X}{\mathrm{argmax}} M(x) < m,
$$
and bounded above by the point $x_{+}$ that achieves the smallest
and bounded above by the point $x_{m+}$ that achieves the smallest
accumulated measure above $m$ (@fig-quantile-inverse-problems),
$$
x_{m+} = \underset{x \in X}{\mathrm{argmin}} M(x) > m.
@@ -1715,7 +1715,7 @@ $$
$$

The integral of any real-valued function $f: X \rightarrow \mathbb{R}$
with respect to counting measure is given by over summing all of the
with respect to counting measure is given by summing over all of the
output values,
\begin{align*}
\mathbb{I}_{\chi}[f]
@@ -1934,7 +1934,7 @@ When a real-valued function has a well-defined Riemann integral then we
can apply the tools of calculus to evaluate Lebesgue integrals. The
exceptional Riemann integrals that can be evaluated analytically allow
us to compute the corresponding Lebesgue integrals exactly. More
generally we can use to numerical integration techniques to approximate
generally we can use numerical integration techniques to approximate
the Riemann integrals, and hence approximately evaluate Lebesgue
integrals.

@@ -1959,7 +1959,7 @@ the sign of the Riemann integral. In order to properly relate Lebesgue
integrals to Riemann integrals we have to fix the _orientation_ of the
intervals.

Similarly the mean of a Lebesgue measure would by given by the integral
Similarly the mean of a Lebesgue measure would be given by the integral
of the identity function,
\begin{align*}
\mathbb{I}_{\lambda}[\iota]