Independence
Two events are called independent if knowledge that one event has happened does not provide any information about whether or not the other has also happened. The formal definition is:
Definition
Two events A and B are called independent if
\[ P(A \textbf{ and } B) = P(A) \times P(B) \]
We now show two artificial examples, the first of which exhibits independence.
Mathematical performance and weight
Consider the relationship between the weight of students and their performance in a mathematics test. Here are the (artificial) joint probabilities for all combinations of weight group and performance group.
Weight (rows) / Mathematical performance (columns) | Poor | Satisfactory | Above average | Marginal |
---|---|---|---|---|
Underweight | 0.0225 | 0.1125 | 0.0150 | 0.1500 |
Normal | 0.0825 | 0.4125 | 0.0550 | 0.5500 |
Overweight | 0.0300 | 0.1500 | 0.0200 | 0.2000 |
Obese | 0.0150 | 0.0750 | 0.0100 | 0.1000 |
Marginal | 0.1500 | 0.7500 | 0.1000 | 1.0000 |
Are the weight and mathematical performance categories independent?
The marginal probabilities, in the row and column labelled Marginal, are the sums of the joint probabilities across the rows and down the columns of the table.
The joint probability for Underweight and Above average performance satisfies:
\[ P(\text{Underweight} \textbf{ and } \text{Above average}) = 0.0150 = 0.1500 \times 0.1000 = P(\text{Underweight}) \times P(\text{Above average}) \]
so these two events are independent. The same holds for every other combination of weight and performance categories.
Weight and mathematical performance are independent
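The check can be repeated for every cell of the table at once. The following is a minimal sketch in Python, assuming the joint probabilities are typed in from the table above; the function name `is_independent` and the tolerance `tol` are illustrative choices, not part of the original text.

```python
import math

# Joint probabilities copied from the table above
# (rows: weight group; columns: Poor, Satisfactory, Above average).
joint = {
    "Underweight": [0.0225, 0.1125, 0.0150],
    "Normal":      [0.0825, 0.4125, 0.0550],
    "Overweight":  [0.0300, 0.1500, 0.0200],
    "Obese":       [0.0150, 0.0750, 0.0100],
}

def is_independent(joint, tol=1e-9):
    """Return True if every cell equals (row marginal) x (column marginal)."""
    # Marginals are the row sums and column sums of the joint table.
    row_marg = {w: sum(cells) for w, cells in joint.items()}
    n_cols = len(next(iter(joint.values())))
    col_marg = [sum(cells[j] for cells in joint.values()) for j in range(n_cols)]
    # A small tolerance absorbs floating-point rounding in the sums.
    return all(
        math.isclose(cells[j], row_marg[w] * col_marg[j], abs_tol=tol)
        for w, cells in joint.items()
        for j in range(n_cols)
    )

print(is_independent(joint))
```

For this table the function should report that the product rule holds in all twelve cells.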
In the next artificial example, events are not independent.
Athletic performance and weight
If we considered the relationship between athletic performance and weight, the joint probabilities might be as shown in the following table.
Weight (rows) / Athletic performance (columns) | Poor | Satisfactory | Above average | Marginal |
---|---|---|---|---|
Underweight | 0.0450 | 0.0900 | 0.0150 | 0.1500 |
Normal | 0.0825 | 0.3025 | 0.1650 | 0.5500 |
Overweight | 0.0500 | 0.1200 | 0.0300 | 0.2000 |
Obese | 0.0300 | 0.0650 | 0.0050 | 0.1000 |
Marginal | 0.2075 | 0.5775 | 0.2150 | 1.0000 |
Are the weight and athletic performance categories independent?
In this illustration, athletic performance is not independent of weight. None of the joint probabilities are equal to the product of the marginal probabilities.
Weight and athletic performance are not independent
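The claim that no cell satisfies the product rule can be checked numerically. This is a rough sketch, assuming the joint probabilities are copied from the table above and that the marginals are recomputed as row and column sums of those joint probabilities; the variable names are illustrative.

```python
import math

weights = ["Underweight", "Normal", "Overweight", "Obese"]
levels = ["Poor", "Satisfactory", "Above average"]

# Joint probabilities copied from the athletic performance table.
joint = [
    [0.0450, 0.0900, 0.0150],
    [0.0825, 0.3025, 0.1650],
    [0.0500, 0.1200, 0.0300],
    [0.0300, 0.0650, 0.0050],
]

# Marginals recomputed from the joint probabilities.
row_marg = [sum(row) for row in joint]
col_marg = [sum(col) for col in zip(*joint)]

# Collect the cells where the joint probability equals the product
# of its two marginals (up to floating-point tolerance).
matching = [
    (weights[i], levels[j])
    for i in range(len(weights))
    for j in range(len(levels))
    if math.isclose(joint[i][j], row_marg[i] * col_marg[j], abs_tol=1e-9)
]

print(matching)  # an empty list means no cell satisfies the product rule
```

A single failing cell would already be enough to rule out independence of the two classifications; here the list of matching cells comes out empty.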
Independence and conditional probabilities
Since \(P(A \textbf{ and } B) = P(A \mid B) \times P(B)\) from the definition of conditional probability, if two events are independent and \(P(B) > 0\), then \(P(A \mid B) \times P(B) = P(A) \times P(B)\), and dividing by \(P(B)\) gives
\[ P(A \mid B) = P(A) \]
and similarly, if \(P(A) > 0\),
\[ P(B \mid A) = P(B) \]
Our definition of independence therefore means that knowing that either \(A\) or \(B\) happened gives no information about whether or not the other event also occurred.
Mathematical performance and weight
In the above model for mathematical performance and weight, the performance and weight categories were independent. The conditional probabilities for performance, given weight, are:
Weight (rows) / Mathematical performance (columns) | Poor | Satisfactory | Above average | Total |
---|---|---|---|---|
Underweight | 0.15 | 0.75 | 0.10 | 1.0 |
Normal | 0.15 | 0.75 | 0.10 | 1.0 |
Overweight | 0.15 | 0.75 | 0.10 | 1.0 |
Obese | 0.15 | 0.75 | 0.10 | 1.0 |
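The conditional probabilities in this table can be obtained by dividing each joint probability by the marginal probability of its weight group. A short sketch, assuming the joint probabilities from the earlier mathematical performance table (the dictionary layout is an illustrative choice):

```python
# Joint probabilities from the earlier table
# (columns: Poor, Satisfactory, Above average).
joint = {
    "Underweight": [0.0225, 0.1125, 0.0150],
    "Normal":      [0.0825, 0.4125, 0.0550],
    "Overweight":  [0.0300, 0.1500, 0.0200],
    "Obese":       [0.0150, 0.0750, 0.0100],
}

# P(performance | weight) = P(weight and performance) / P(weight),
# where P(weight) is the row sum of the joint probabilities.
conditional = {
    weight: [p / sum(cells) for p in cells]
    for weight, cells in joint.items()
}

for weight, probs in conditional.items():
    print(weight, [round(p, 2) for p in probs])
```

Each row should reproduce the same three conditional probabilities, which are also the marginal probabilities of the performance groups.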
Because weight and performance were independent, the conditional probabilities in each row are the same — knowing that a student is, say, obese does not affect the probability of being above-average in mathematics. The proportional Venn diagram below illustrates this — it consists of a grid of horizontal and vertical lines.
Athletic performance and weight
Athletic performance and weight were not independent in the model above, so the conditional probabilities for performance given weight are different for different weight categories. The proportional Venn diagram below illustrates this.
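The conditional probabilities themselves are not tabulated here, but they can be computed from the joint probabilities in the same way as for mathematical performance. The sketch below assumes the joint values from the athletic performance table above; the flag `rows_identical` is an illustrative name.

```python
import math

# Joint probabilities from the athletic performance table
# (columns: Poor, Satisfactory, Above average).
joint = {
    "Underweight": [0.0450, 0.0900, 0.0150],
    "Normal":      [0.0825, 0.3025, 0.1650],
    "Overweight":  [0.0500, 0.1200, 0.0300],
    "Obese":       [0.0300, 0.0650, 0.0050],
}

# Divide each row by its row sum to get P(performance | weight).
conditional = {
    weight: [p / sum(cells) for p in cells]
    for weight, cells in joint.items()
}

for weight, probs in conditional.items():
    print(weight, [round(p, 4) for p in probs])

# Under independence every row would match the first one.
rows = list(conditional.values())
rows_identical = all(
    math.isclose(a, b, abs_tol=1e-9)
    for row in rows[1:]
    for a, b in zip(rows[0], row)
)
print(rows_identical)
```

With these joint probabilities the rows differ, so knowing a student's weight group changes the probabilities for athletic performance.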