Conditional probabilities

Although the marginal distributions of \(X\) and \(Y\) are important, they do not capture the relationship between the two variables. If the value of one variable is known, it may provide information about the likely values of the other variable. This relationship is captured by the conditional probabilities of \(Y\) given \(X\).

\[ P(Y = y \mid X=x) \;\;=\;\; \frac{P(Y = y \text{ and } X=x)}{P(X=x)} \;\;=\;\; \frac{p(x,y)}{p_X(x)} \]

For any fixed value of \(x\) with \(p_X(x) \gt 0\), these conditional probabilities are non-negative and add to 1, so they form a valid discrete probability function.
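To see why the probabilities add to one, sum over \(y\) for a fixed \(x\) and use the fact that the marginal probability \(p_X(x)\) is the sum of the joint probabilities over all \(y\):

\[ \sum_{\text{all }y} \frac{p(x,y)}{p_X(x)} \;\;=\;\; \frac 1{p_X(x)} \sum_{\text{all }y} p(x,y) \;\;=\;\; \frac{p_X(x)}{p_X(x)} \;\;=\;\; 1 \]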

Definition

The conditional distribution of \(Y\) given \(X=x\) is the distribution with probability function

\[ p_{Y \mid X=x}(y) \;\;=\;\; \frac{p(x,y)}{p_X(x)} \]

Note that there is a separate conditional distribution of \(Y\) for each possible value of \(X\).

The conditional distribution of \(X\) given \(Y=y\) can be similarly defined as

\[ p_{X \mid Y=y}(x) \;\;=\;\; \frac{p(x,y)}{p_Y(y)} \]
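The definition can be sketched in a few lines of Python. This is only an illustration: the function name `conditional_pmf_y_given_x`, the dictionary layout `{(x, y): probability}` and the small joint distribution are all assumptions made for the example, not part of the text above.

```python
from fractions import Fraction

def conditional_pmf_y_given_x(joint, x):
    """Return {y: P(Y=y | X=x)} from a joint pmf stored as {(x, y): probability}."""
    # Marginal p_X(x): sum the joint probabilities over all y for this x.
    p_x = sum(p for (xi, _), p in joint.items() if xi == x)
    if p_x == 0:
        raise ValueError("P(X = x) is zero, so the conditional distribution is undefined")
    # Divide each joint probability in the slice X = x by the marginal.
    return {yi: p / p_x for (xi, yi), p in joint.items() if xi == x}

# Hypothetical joint distribution, chosen only for illustration.
joint = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
    (1, 0): Fraction(2, 8), (1, 1): Fraction(2, 8),
}

cond = conditional_pmf_y_given_x(joint, 0)
print(cond)
```

Exact fractions are used so that the check that the conditional probabilities add to one is not affected by rounding. Swapping the roles of the two coordinates gives the conditional distribution of \(X\) given \(Y=y\) in the same way.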

Minimum and maximum of three dice

We previously showed the joint probability function of the minimum, \(Y\), and maximum, \(X\), of the values when three fair six-sided dice are rolled.

\[ p(x,y) \;\;=\;\; \begin{cases} {\frac 1 {6^3}} & \quad\text{if }x = y \;\;\text{ and }\;\; 1 \le x,y \le 6 \\[0.4em] {\frac {x-y}{6^2}} & \quad\text{if } 1 \le y \lt x \le 6 \\[0.4em] 0 & \quad\text{otherwise} \end{cases} \]

These joint probabilities are shown in tabular form below.

\[
\begin{array}{c|cccccc}
 & \text{Maximum, } x & & & & & \\
\text{Minimum, } y & 1 & 2 & 3 & 4 & 5 & 6 \\ \hline
1 & \frac 1{6^3} & \frac 1{6^2} & \frac 2{6^2} & \frac 3{6^2} & \frac 4{6^2} & \frac 5{6^2} \\[0.4em]
2 & 0 & \frac 1{6^3} & \frac 1{6^2} & \frac 2{6^2} & \frac 3{6^2} & \frac 4{6^2} \\[0.4em]
3 & 0 & 0 & \frac 1{6^3} & \frac 1{6^2} & \frac 2{6^2} & \frac 3{6^2} \\[0.4em]
4 & 0 & 0 & 0 & \frac 1{6^3} & \frac 1{6^2} & \frac 2{6^2} \\[0.4em]
5 & 0 & 0 & 0 & 0 & \frac 1{6^3} & \frac 1{6^2} \\[0.4em]
6 & 0 & 0 & 0 & 0 & 0 & \frac 1{6^3}
\end{array}
\]
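The joint probabilities can be checked by listing all \(6^3\) equally likely outcomes. The sketch below is one way to do this in Python; the names `joint_formula`, `counts` and `enumerated` are chosen here for illustration.

```python
from fractions import Fraction
from itertools import product

def joint_formula(x, y):
    """Joint pmf of the maximum x and minimum y, using the formula in the text."""
    if x == y and 1 <= x <= 6:
        return Fraction(1, 6**3)
    if 1 <= y < x <= 6:
        return Fraction(x - y, 6**2)
    return Fraction(0)

# Count each of the 6^3 equally likely outcomes by its (max, min) pair.
counts = {}
for roll in product(range(1, 7), repeat=3):
    key = (max(roll), min(roll))
    counts[key] = counts.get(key, 0) + 1

enumerated = {key: Fraction(n, 6**3) for key, n in counts.items()}

# Compare the enumeration with the formula on the whole 6 x 6 grid.
matches = all(
    enumerated.get((x, y), Fraction(0)) == joint_formula(x, y)
    for x in range(1, 7)
    for y in range(1, 7)
)
print(matches)
```

Pairs with the minimum greater than the maximum never occur in the enumeration, so they are treated as having probability zero when compared with the formula.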

We will now find the conditional distribution of the maximum value, \(X\), if it is known that the minimum is \(Y = 3\). The marginal probability of the minimum being 3 is the sum of the probabilities in the row for \(y = 3\) of the table above.

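\[ p_Y(3) \;\;=\;\; \sum_{x=1}^{6} p(x,3) \;\;=\;\; \frac 1{6^3} + \frac 1{6^2} + \frac 2{6^2} + \frac 3{6^2} \;\;=\;\; \frac 1{6^3} + \frac 1 6 \;\;=\;\; \frac {37}{216} \]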

The conditional probabilities for \(X\) are

\[ p_{X\mid Y=3}(x) \;\;=\;\; \frac {p(x,3)}{p_Y(3)} \]

This divides the \(y = 3\) row in the table of joint probabilities by a constant, \(p_Y(3)\), that is chosen to make the values add to one. As a result, the row of conditional probabilities becomes a valid univariate probability function for \(X\).
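As a worked check, the marginal probability of the minimum being 3 is \(\frac 1{6^3} + \frac 1 6 = \frac{37}{216}\). Writing each joint probability in the \(y = 3\) row over the common denominator 216 and dividing by this value gives

\[ p_{X\mid Y=3}(x) \;\;=\;\; \begin{cases} {\frac {1/216}{37/216}} = {\frac 1{37}} & \quad\text{if } x = 3 \\[0.4em] {\frac {6/216}{37/216}} = {\frac 6{37}} & \quad\text{if } x = 4 \\[0.4em] {\frac {12/216}{37/216}} = {\frac {12}{37}} & \quad\text{if } x = 5 \\[0.4em] {\frac {18/216}{37/216}} = {\frac {18}{37}} & \quad\text{if } x = 6 \\[0.4em] 0 & \quad\text{otherwise} \end{cases} \]

These four probabilities add to \(\frac{1 + 6 + 12 + 18}{37} = 1\), as required.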

Scaling every row of the joint table in the same way gives the conditional distribution of \(X\) for each value of \(Y\), with each row adding to one.

Conditional mean and variance

The conditional distributions of \(X\) given \(Y=y\) and of \(Y\) given \(X=x\) are valid univariate distributions and therefore have means and variances.

Definition

The conditional mean of \(Y\) given \(X=x\) is

\[ E[Y \mid X=x] \;\;=\;\; \sum_{\text{all }y} {y \times p_{Y \mid X=x}(y)} \;\;=\;\; \sum_{\text{all }y} {y \times \frac{p(x,y)}{p_X(x)}} \]

Note that this conditional mean is a function of \(x\): the conditional mean of \(Y\) may depend on the value of \(X\) that we are conditioning on.

The conditional variance of \(Y\) given \(X=x\) is similarly defined as the variance of this conditional distribution.
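Written out, the conditional variance is the sum of \((y - E[Y \mid X=x])^2 \times p_{Y \mid X=x}(y)\) over all \(y\). The sketch below evaluates both the conditional mean and the conditional variance for the three-dice example, where \(Y\) is the minimum and \(X\) is the maximum; the function names `joint` and `conditional_mean_var_of_y` are chosen here for illustration.

```python
from fractions import Fraction

def joint(x, y):
    """Joint pmf of the maximum x and minimum y of three fair dice (formula from the text)."""
    if x == y and 1 <= x <= 6:
        return Fraction(1, 6**3)
    if 1 <= y < x <= 6:
        return Fraction(x - y, 6**2)
    return Fraction(0)

def conditional_mean_var_of_y(x):
    """Return (E[Y | X=x], Var(Y | X=x)) for the dice example, with x between 1 and 6."""
    values = range(1, 7)
    p_x = sum(joint(x, y) for y in values)           # marginal p_X(x)
    cond = {y: joint(x, y) / p_x for y in values}    # conditional pmf of Y given X = x
    mean = sum(y * p for y, p in cond.items())
    var = sum((y - mean) ** 2 * p for y, p in cond.items())
    return mean, var

# The conditional mean and variance of the minimum change with the maximum x.
for x in range(1, 7):
    mean, var = conditional_mean_var_of_y(x)
    print(x, mean, var)
```

For example, when the maximum is \(x = 3\) the conditional mean of the minimum is \(\frac{27}{19}\) and the conditional variance is \(\frac{126}{361}\), whereas when the maximum is 1 all three dice must show 1, so the mean is 1 and the variance is 0.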