A random variable $X$ on a sample space $\Omega$ is a function $X:\Omega\to\mathbb{R}$ that assigns to each sample point $\omega\in\Omega$ a real number $X(\omega)$.
In this chapter, we discuss discrete random variables, which take values in a
range that is finite or countably infinite. This means that even though the
random variable is defined on the whole sample space $\Omega$, the set of
values it can take is finite or countable.
The term "random variable" is a bit of a misnomer: it is a function and there is nothing random about it, nor is it a variable! What is random is which sample point of the experiment is realized, and hence the value that the random variable maps the sample point to.
As with a basic probability space, the most important things about any random variable are:
- The set of values it can take.
- The probabilities with which it takes on the values.
Since a random variable is defined on a probability space, we can calculate
these probabilities from the probabilities of the sample points. Let $a$ be any
value that the random variable $X$ can take. Then the set
$\{\omega\in\Omega : X(\omega)=a\}$, which we abbreviate as the event $X=a$,
is an event in the sample space because it is a subset of $\Omega$.
Formally, the distribution of a discrete random variable $X$ is the collection
of values $\{(a,\mathbb{P}[X=a]) : a\in\mathscr{A}\}$, where $\mathscr{A}$ is
the set of all possible values taken by $X$.
Note that the collection of events $X=a$, for $a\in\mathscr{A}$, satisfies two important properties:
- Any two events $X=a_1$ and $X=a_2$ with $a_1\neq a_2$ are disjoint.
- The union of these events is equal to the entire sample space $\Omega$.
The collection of events thus forms a partition of the sample space.
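As a concrete sketch (with an assumed example: two fair dice, and $X$ the sum of the rolls), the distribution of a random variable can be computed directly from the probabilities of the sample points, and the partition property shows up as the probabilities summing to 1:

```python
from collections import defaultdict
from fractions import Fraction

# Assumed example: sample space of two fair dice, uniform probabilities.
omega = [(i, j) for i in range(1, 7) for j in range(1, 7)]
prob = {w: Fraction(1, 36) for w in omega}

def X(w):
    """The random variable: sum of the two rolls."""
    return w[0] + w[1]

# Distribution of X: P[X = a] is the sum of P[w] over points with X(w) = a.
dist = defaultdict(Fraction)
for w in omega:
    dist[X(w)] += prob[w]

# The events {X = a} partition the sample space, so the probabilities sum to 1.
assert sum(dist.values()) == 1
```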
A simple yet useful probability distribution is the Bernoulli distribution
of a random variable that takes values in $\{0,1\}$:
$$\mathbb{P}[X=i]=\begin{cases}p&\text{if }i=1,\\1-p&\text{if }i=0,\end{cases}$$
where $0\le p\le 1$. We write $X\sim\mathrm{Bernoulli}(p)$.
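A minimal sketch of a Bernoulli random variable, with an assumed parameter $p=0.3$ (both the probability mass function and a way to draw samples):

```python
import random

# Assumed parameter for illustration.
p = 0.3
pmf = {1: p, 0: 1 - p}   # P[X = 1] = p, P[X = 0] = 1 - p

def sample_bernoulli(p, rng=random):
    """Draw one Bernoulli(p) value: 1 with probability p, else 0."""
    return 1 if rng.random() < p else 0

# The two probabilities sum to 1.
assert abs(sum(pmf.values()) - 1.0) < 1e-12
```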
The binomial distribution is the discrete probability distribution of the number
of successes in a sequence of $n$ independent trials, each of which succeeds
with probability $p$:
$$\mathbb{P}[X=k]=\binom{n}{k}p^k(1-p)^{n-k},\qquad k=0,1,\dots,n.$$
A random variable with this distribution is called a binomial random variable,
and we write $X\sim\mathrm{Bin}(n,p)$.
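The binomial probability mass function can be computed directly from the formula; here is a sketch with assumed parameters $n=10$, $p=0.5$, again checking that the probabilities over all $k$ sum to 1:

```python
from math import comb

def binomial_pmf(n, p, k):
    """P[X = k] for X ~ Bin(n, p): k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Assumed parameters for illustration: 10 trials, success probability 0.5.
n, p = 10, 0.5
probs = [binomial_pmf(n, p, k) for k in range(n + 1)]

# The events {X = k}, k = 0..n, partition the sample space.
assert abs(sum(probs) - 1.0) < 1e-12
```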
The hypergeometric distribution is a discrete probability distribution that
describes the probability of $k$ successes in $n$ draws, without replacement,
from a finite population of size $N$ containing exactly $B$ successes:
$$\mathbb{P}[X=k]=\frac{\binom{B}{k}\binom{N-B}{n-k}}{\binom{N}{n}}.$$
A random variable with this distribution is called a hypergeometric random
variable with parameters $(N,B,n)$.
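A sketch of the hypergeometric probability mass function, with assumed parameters (a population of 50 items of which 10 are "successes", drawing 5 without replacement); the parameter names follow the notation above:

```python
from math import comb

def hypergeometric_pmf(N, B, n, k):
    """P[X = k]: k successes among n draws without replacement from a
    population of N items containing B successes."""
    return comb(B, k) * comb(N - B, n - k) / comb(N, n)

# Assumed parameters for illustration: N = 50, B = 10, n = 5.
probs = [hypergeometric_pmf(50, 10, 5, k) for k in range(6)]

# The probabilities over all possible k sum to 1.
assert abs(sum(probs) - 1.0) < 1e-12
```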
We are often interested in multiple random variables on the same sample space.
The joint distribution for two discrete random variables $X$ and $Y$ is the
collection of values $\{((a,b),\mathbb{P}[X=a,Y=b]) : a\in\mathscr{A},\,b\in\mathscr{B}\}$,
where $\mathscr{A}$ and $\mathscr{B}$ are the sets of possible values taken by
$X$ and $Y$ respectively.
When we are given a joint distribution for $X$ and $Y$, the distribution
$\mathbb{P}[X=a]$ for $X$ is called the marginal distribution for $X$.
The marginal distribution for $X$ can be recovered from the joint distribution
by summing over the values of $Y$:
$$\mathbb{P}[X=a]=\sum_{b\in\mathscr{B}}\mathbb{P}[X=a,Y=b].$$
Random variables $X$ and $Y$ are said to be independent if
$\mathbb{P}[X=a,Y=b]=\mathbb{P}[X=a]\times\mathbb{P}[Y=b]$ for all values $a,b$.
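The following sketch (on an assumed example: two independent fair coin flips) computes marginal distributions from a joint distribution table and checks the standard independence condition $\mathbb{P}[X=a,Y=b]=\mathbb{P}[X=a]\,\mathbb{P}[Y=b]$:

```python
from fractions import Fraction
from itertools import product

# Assumed joint distribution: two independent fair coin flips,
# X and Y each taking values in {0, 1}.
joint = {(a, b): Fraction(1, 4) for a, b in product((0, 1), repeat=2)}

# Marginal of X: sum P[X = a, Y = b] over all values b of Y (and vice versa).
marginal_X = {a: sum(p for (x, _), p in joint.items() if x == a) for a in (0, 1)}
marginal_Y = {b: sum(p for (_, y), p in joint.items() if y == b) for b in (0, 1)}

# Independence: the joint probability factors into the marginals everywhere.
independent = all(joint[a, b] == marginal_X[a] * marginal_Y[b]
                  for a, b in joint)
assert independent
```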
The expectation of a discrete random variable $X$ is defined as
$$\mathbb{E}[X]=\sum_{a\in\mathscr{A}}a\times\mathbb{P}[X=a],$$
where the sum is over all possible values taken by the random variable.
The expectation can be thought of as summarizing the distribution into a more
compact, convenient form that is also easier to compute. The expectation can be
thought of as a "typical value" for the random variable (though note it may not
be a value that $X$ can actually take).
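A quick sketch of the definition, on an assumed example of a fair six-sided die: the expectation is $7/2 = 3.5$, which is indeed not a value the die can take.

```python
from fractions import Fraction

# Assumed example: a fair six-sided die.
dist = {a: Fraction(1, 6) for a in range(1, 7)}

# E[X] = sum over values a of a * P[X = a].
expectation = sum(a * p for a, p in dist.items())

# 7/2 = 3.5 is a "typical value", but not one the die can actually show.
assert expectation == Fraction(7, 2)
```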
For any two random variables $X$ and $Y$ defined on the same probability space,
$\mathbb{E}[X+Y]=\mathbb{E}[X]+\mathbb{E}[Y]$, and for any constant $c$,
$\mathbb{E}[cX]=c\,\mathbb{E}[X]$. To prove the first equality, recall the
definition of expectation:
$$\mathbb{E}[X]=\sum_{a\in\mathscr{A}}a \times\mathbb{P}[X=a]$$
Consider a particular term $a\times\mathbb{P}[X=a]$ in the above sum. By definition,
$\mathbb{P}[X=a]$ is the sum of $\mathbb{P}[\omega]$ over those sample points
$\omega$ for which $X(\omega)=a$. We know every sample point $\omega\in\Omega$
is in exactly one of those events $X=a$. This means we can write out the above
definition as
$$\mathbb{E}[X]=\sum_{\omega\in\Omega} X(\omega)\times\mathbb{P}[\omega]$$.
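As a sanity check, the two forms of $\mathbb{E}[X]$ agree on a small assumed example (two fair coins, with $X$ the number of heads):

```python
from fractions import Fraction

# Assumed example: two fair coin flips; X counts the heads.
omega = [(0, 0), (0, 1), (1, 0), (1, 1)]
prob = {w: Fraction(1, 4) for w in omega}

def X(w):
    return w[0] + w[1]

# Form 1: sum over values a of a * P[X = a].
values = {X(w) for w in omega}
by_values = sum(a * sum(prob[w] for w in omega if X(w) == a) for a in values)

# Form 2: sum over sample points w of X(w) * P[w].
by_points = sum(X(w) * prob[w] for w in omega)

assert by_values == by_points
```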
Now we apply this to $\mathbb{E}[X+Y]$:
$$\mathbb{E}[X+Y]=\sum_{\omega\in\Omega}(X+Y)(\omega)\times\mathbb{P}[\omega]\\
=\sum_{\omega\in\Omega}(X(\omega)+Y(\omega))\times\mathbb{P}[\omega]\\
=\sum_{\omega\in\Omega}(X(\omega)\times\mathbb{P}[\omega])+
\sum_{\omega\in\Omega}(Y(\omega)\times\mathbb{P}[\omega])\\
=\mathbb{E}[X]+\mathbb{E}[Y]$$.
This completes the proof of the first equality; the proof of the second is left
as an exercise.
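Linearity can also be checked by exact computation on a small assumed example (two fair dice, with $X$ the first roll and $Y$ the second):

```python
from fractions import Fraction

# Assumed example: two fair dice; X = first roll, Y = second roll.
omega = [(i, j) for i in range(1, 7) for j in range(1, 7)]
p = Fraction(1, 36)   # uniform probability of each sample point

# Expectations computed as sums over sample points.
E_X = sum(i * p for i, j in omega)
E_Y = sum(j * p for i, j in omega)
E_sum = sum((i + j) * p for i, j in omega)

# E[X + Y] = E[X] + E[Y], even though each is computed independently.
assert E_sum == E_X + E_Y
```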