Probabilities for two discrete random variables
The probability function of a single discrete random variable gives probabilities for all of its possible values. Probabilities for more complex events about the variable such as \(P(X \lt 8)\) can be found by summing the probability function over the relevant values.
A similar quantity describes the joint distribution of two discrete random variables.
Definition
For two discrete random variables X and Y, the joint probability function gives the probabilities for all possible combinations of values of the two variables,
\[ p(x, y) \;=\; P(X=x \textbf{ and } Y=y) \]
The joint distribution of X and Y is completely defined by their joint probability function.
Example
Consider a weighted six-sided die for which the value "6" has twice the probability of the other values. If the die is rolled twice, with X and Y being the values that appear on the first and second rolls, what is the joint probability function of the two variables?
Let \(c\) denote the probability of each of the values 1 to 5, so that the value 6 has probability \(2c\). Since the probabilities for the first roll of the die must add up to exactly 1, \(5c + 2c = 1\), so \(c = {\small\diagfrac 1 7}\) and
\[ P(X = x) \;\;=\;\; \begin{cases} {\small\diagfrac 1 7} & \quad\text{if }x = 1, 2, 3, 4, 5\\ {\small\diagfrac 2 7} & \quad\text{if }x = 6\\ 0 & \quad\text{otherwise} \end{cases} \]
The second roll has the same probabilities. Assuming that the second roll is independent of the first, each joint probability for \(X\) and \(Y\) is the product of the corresponding probabilities for the two variables,
\[ p(x,y) \;\;=\;\; P(X=x) \times P(Y=y) \]
These are shown in the following table.
Second roll, y / First roll, x | 1 | 2 | 3 | 4 | 5 | 6 |
---|---|---|---|---|---|---|
1 | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 2{49}\) |
2 | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 2{49}\) |
3 | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 2{49}\) |
4 | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 2{49}\) |
5 | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 2{49}\) |
6 | \(\small\diagfrac 2{49}\) | \(\small\diagfrac 2{49}\) | \(\small\diagfrac 2{49}\) | \(\small\diagfrac 2{49}\) | \(\small\diagfrac 2{49}\) | \(\small\diagfrac 4{49}\) |
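The table can also be generated programmatically. The following is a minimal Python sketch, assuming independent rolls as in the example; the names `marginal` and `joint` are illustrative and do not come from the original text. Exact fractions are used so that the values match the table entries.

```python
from fractions import Fraction

# Marginal probabilities for one roll of the weighted die:
# values 1 to 5 have probability 1/7 and the value 6 has probability 2/7.
marginal = {v: Fraction(2 if v == 6 else 1, 7) for v in range(1, 7)}

# Under independence, p(x, y) = P(X = x) * P(Y = y).
joint = {(x, y): marginal[x] * marginal[y]
         for x in marginal for y in marginal}

print(joint[(1, 1)])  # 1/49
print(joint[(6, 3)])  # 2/49
print(joint[(6, 6)])  # 4/49
```

The dictionary is keyed by the pair \((x, y)\), so each entry corresponds to one cell of the table.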
The joint probability function can often be expressed with a mathematical formula.
In order to be the joint probability function of two discrete random variables, a function \(p(x,y)\) must satisfy two properties.
Properties of joint probability functions
\[ p(x,y) \ge 0 \text{ for all } x,y \]
\[ \sum_{\text{all } x,y} p(x,y) = 1 \]
The first property is a consequence of the first axiom of probability: all probabilities must be between 0 and 1, so none can be negative.
The second property arises because \(X\) and \(Y\) are certain to have one of the values in the summation.
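Both properties are easy to check numerically for a function with finitely many possible pairs. The sketch below is one possible way to do this in Python; the helper `is_joint_probability_function` and the dictionary `not_valid` are hypothetical names introduced only for illustration.

```python
from fractions import Fraction

def is_joint_probability_function(p):
    """Check both properties for a dict mapping (x, y) pairs to probabilities."""
    non_negative = all(prob >= 0 for prob in p.values())
    sums_to_one = sum(p.values()) == 1
    return non_negative and sums_to_one

# Joint probability function of the two rolls of the weighted die.
marginal = {v: Fraction(2 if v == 6 else 1, 7) for v in range(1, 7)}
dice = {(x, y): marginal[x] * marginal[y] for x in marginal for y in marginal}

# A function that fails the second property: its values sum to 1/2.
not_valid = {(0, 0): Fraction(1, 4), (0, 1): Fraction(1, 4)}

print(is_joint_probability_function(dice))       # True
print(is_joint_probability_function(not_valid))  # False
```

The exact comparison with 1 works here because the probabilities are stored as fractions; with floating-point values a small tolerance would be needed instead.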
Probabilities for events about \(X\) and \(Y\)
The probabilities of any other events relating to \(X\) and \(Y\) can be found from the joint probability function. Any event, \(A\), is a subset of the sample space, so it corresponds to a set of pairs of values, \((x, y)\). Its probability can therefore be found as the sum of the joint probability function over the pairs \((x, y)\) that comprise \(A\),
\[ P(A) \;\;=\;\; \sum_{(x,y) \in A} {p(x,y)} \]
Example
In the weighted die example above, what is the probability that the sum of the two rolls will be ten or more?
The event \(X + Y \ge 10\) corresponds to the six pairs \((x, y)\) = (4, 6), (5, 5), (5, 6), (6, 4), (6, 5) and (6, 6). These cells are shown in bold in the following table.
Second roll, y / First roll, x | 1 | 2 | 3 | 4 | 5 | 6 |
---|---|---|---|---|---|---|
1 | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 2{49}\) |
2 | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 2{49}\) |
3 | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 2{49}\) |
4 | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 1{49}\) | **\(\small\diagfrac 2{49}\)** |
5 | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 1{49}\) | \(\small\diagfrac 1{49}\) | **\(\small\diagfrac 1{49}\)** | **\(\small\diagfrac 2{49}\)** |
6 | \(\small\diagfrac 2{49}\) | \(\small\diagfrac 2{49}\) | \(\small\diagfrac 2{49}\) | **\(\small\diagfrac 2{49}\)** | **\(\small\diagfrac 2{49}\)** | **\(\small\diagfrac 4{49}\)** |
Therefore \(P(X+Y \ge 10)\) is the sum of these six probabilities, \(\frac {2+1+2+2+2+4}{49} = \frac {13}{49} \approx 0.265\).
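The same calculation can be sketched in Python by summing the joint probability function over the pairs in the event. The helper `event_probability` and its `in_event` argument are hypothetical names used only for this illustration, and the rolls are assumed independent as in the example.

```python
from fractions import Fraction

# Joint probability function of the two rolls of the weighted die.
marginal = {v: Fraction(2 if v == 6 else 1, 7) for v in range(1, 7)}
joint = {(x, y): marginal[x] * marginal[y] for x in marginal for y in marginal}

def event_probability(p, in_event):
    """Sum p(x, y) over the pairs (x, y) for which in_event(x, y) is true."""
    return sum(prob for (x, y), prob in p.items() if in_event(x, y))

prob = event_probability(joint, lambda x, y: x + y >= 10)
print(prob)         # 13/49
print(float(prob))  # about 0.265
```

Describing the event as a condition on \((x, y)\) means any other event, such as \(X = Y\), can be evaluated by changing only the function passed to `event_probability`.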