Independence

Two events are called independent if knowledge that one event has happened does not provide any information about whether or not the other has also happened. The formal definition is:

Definition

Two events A and B are called independent if

\[ P(A \textbf{ and } B) = P(A) \times P(B) \]

We now show two artificial examples, the first of which exhibits independence.

Mathematical performance and weight

Consider the relationship between the weight of students and their performance in a mathematics test. Here are the (artificial) joint probabilities for all combinations of weight group and performance group.

Joint Probabilities
                          Mathematical performance
Weight        Poor      Satisfactory   Above average   Marginal
Underweight   0.0225    0.1125         0.0150          0.1500
Normal        0.0825    0.4125         0.0550          0.5500
Overweight    0.0300    0.1500         0.0200          0.2000
Obese         0.0150    0.0750         0.0100          0.1000
Marginal      0.1500    0.7500         0.1000          1.0000

Are the weight and mathematical performance categories independent?

The marginal probabilities, shown in the final row and final column, are the sums of the joint probabilities across rows and down columns of the table.

The joint probability for Underweight and Above average performance satisfies:

\[ P(\text{Underweight} \textbf{ and } \text{Above average}) = 0.0150 = 0.1500 \times 0.1000 = P(\text{Underweight}) \times P(\text{Above average}) \]

so these two events are independent, and the same holds for all other weight and performance categories.
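The check described above can be repeated for every cell of the table. The following is a minimal Python sketch, assuming the joint probabilities are typed in exactly as tabulated; the names `JOINT`, `marginals`, and `is_independent` are illustrative and not part of the original text.

```python
from math import isclose

# Joint probabilities copied from the mathematics table
# (rows: weight group, columns: performance group).
WEIGHTS = ["Underweight", "Normal", "Overweight", "Obese"]
PERFORMANCE = ["Poor", "Satisfactory", "Above average"]
JOINT = [
    [0.0225, 0.1125, 0.0150],
    [0.0825, 0.4125, 0.0550],
    [0.0300, 0.1500, 0.0200],
    [0.0150, 0.0750, 0.0100],
]


def marginals(joint):
    """Return (row sums, column sums) of a joint probability table."""
    row_totals = [sum(row) for row in joint]
    col_totals = [sum(col) for col in zip(*joint)]
    return row_totals, col_totals


def is_independent(joint, tol=1e-9):
    """True if every joint probability equals the product of its marginals."""
    row_totals, col_totals = marginals(joint)
    return all(
        isclose(joint[i][j], row_totals[i] * col_totals[j], abs_tol=tol)
        for i in range(len(joint))
        for j in range(len(joint[0]))
    )


print(is_independent(JOINT))
```

The comparison uses a small tolerance rather than exact equality because decimal probabilities are not stored exactly in floating point.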

Weight and mathematical performance are independent

In the next artificial example, events are not independent.

Athletic performance and weight

If we considered the relationship between athletic performance and weight, the joint probabilities might be as shown in the following table.


Joint Probabilities
                          Athletic performance
Weight        Poor      Satisfactory   Above average   Marginal
Underweight   0.0450    0.0900         0.0150          0.1500
Normal        0.0825    0.3025         0.1650          0.5500
Overweight    0.0500    0.1200         0.0300          0.2000
Obese         0.0300    0.0650         0.0050          0.1000
Marginal      0.2075    0.5775         0.2150          1.0000

Are the weight and athletic performance categories independent?

In this illustration, athletic performance is not independent of weight: none of the joint probabilities equals the product of its row and column marginal probabilities.
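The claim that no cell satisfies the product rule can be checked cell by cell. The sketch below is illustrative only: it assumes the joint probabilities are entered as tabulated and computes the marginals as row and column sums of those entries; `dependent_cells` is a hypothetical helper name.

```python
# Joint probabilities copied from the athletics table
# (rows: weight group, columns: performance group).
WEIGHTS = ["Underweight", "Normal", "Overweight", "Obese"]
PERFORMANCE = ["Poor", "Satisfactory", "Above average"]
JOINT = [
    [0.0450, 0.0900, 0.0150],
    [0.0825, 0.3025, 0.1650],
    [0.0500, 0.1200, 0.0300],
    [0.0300, 0.0650, 0.0050],
]


def dependent_cells(joint, tol=1e-9):
    """List the cells whose joint probability differs from the product of marginals."""
    row_totals = [sum(row) for row in joint]
    col_totals = [sum(col) for col in zip(*joint)]
    cells = []
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            product = row_totals[i] * col_totals[j]
            if abs(p - product) > tol:
                cells.append((WEIGHTS[i], PERFORMANCE[j], p, product))
    return cells


mismatches = dependent_cells(JOINT)
for weight, performance, joint_p, product in mismatches:
    print(f"{weight:<12} {performance:<14} joint={joint_p:.4f} product={product:.4f}")
print(len(mismatches), "of", len(WEIGHTS) * len(PERFORMANCE), "cells differ")
```

A single mismatching cell would already be enough to rule out independence; listing all of them simply shows how far the table is from the independent case.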

Weight and athletic performance are not independent

Independence and conditional probabilities

Since \(P(A \textbf{ and } B) = P(A \mid B) \times P(B) \) from the definition of conditional probability, if two events are independent and \(P(B) > 0\),

\[ P(A \mid B) = P(A) \]

and similarly

\[ P(B \mid A) = P(B) \]
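The algebra behind these two results can be written out explicitly. As a sketch, assuming \(P(B) > 0\) so that the conditional probability is defined, substituting the definition of independence into the definition of conditional probability gives

\[ P(A \mid B) = \frac{P(A \textbf{ and } B)}{P(B)} = \frac{P(A) \times P(B)}{P(B)} = P(A) \]

The same argument with the roles of \(A\) and \(B\) exchanged, assuming \(P(A) > 0\), gives \(P(B \mid A) = P(B)\).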

Our definition of independence therefore means that knowing that either \(A\) or \(B\) happened gives no information about whether or not the other event also occurred.

Mathematical performance and weight

In the above model for mathematical performance and weight, the performance and weight categories were independent. The conditional probabilities for performance, given weight, are:

Conditional Probabilities
                          Mathematical performance
Weight        Poor      Satisfactory   Above average   Total
Underweight   0.15      0.75           0.10            1.0
Normal        0.15      0.75           0.10            1.0
Overweight    0.15      0.75           0.10            1.0
Obese         0.15      0.75           0.10            1.0
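The conditional probabilities above can be recovered from the joint probabilities by dividing each row by its row total. The following Python sketch assumes the joint table for mathematical performance given earlier; the function name `conditional_given_weight` is hypothetical. It also checks that each conditional probability equals the corresponding marginal probability of performance, which is what \(P(A \mid B) = P(A)\) requires.

```python
from math import isclose

# Joint probabilities from the mathematics table
# (rows: weight group, columns: performance group).
WEIGHTS = ["Underweight", "Normal", "Overweight", "Obese"]
JOINT = [
    [0.0225, 0.1125, 0.0150],
    [0.0825, 0.4125, 0.0550],
    [0.0300, 0.1500, 0.0200],
    [0.0150, 0.0750, 0.0100],
]


def conditional_given_weight(joint):
    """Return P(performance | weight): each row divided by its row total."""
    return [[p / sum(row) for p in row] for row in joint]


COND = conditional_given_weight(JOINT)
for weight, row in zip(WEIGHTS, COND):
    print(f"{weight:<12}", "  ".join(f"{p:.2f}" for p in row))

# Marginal probabilities of performance are the column sums of the joint table.
performance_marginals = [sum(col) for col in zip(*JOINT)]

# Under independence, P(performance | weight) = P(performance) in every row.
matches_marginal = all(
    isclose(p, performance_marginals[j], abs_tol=1e-9)
    for row in COND
    for j, p in enumerate(row)
)
print(matches_marginal)
```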

Because weight and performance are independent, the conditional probabilities in each row are the same: knowing that a student is, say, obese does not affect the probability of being above average in mathematics. The proportional Venn diagram below illustrates this; it consists of a grid of horizontal and vertical lines.

Athletic performance and weight

Athletic performance and weight were not independent in the model above, so the conditional probabilities for performance given weight are different for different weight categories. The proportional Venn diagram below illustrates this.
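To see how the conditional probabilities differ between weight categories, each row of the athletics joint table can be divided by its row total. The sketch below assumes the joint probabilities tabulated earlier; `conditional_given_weight` is an illustrative name, not something defined in the original text.

```python
from math import isclose

# Joint probabilities from the athletics table
# (rows: weight group, columns: performance group).
WEIGHTS = ["Underweight", "Normal", "Overweight", "Obese"]
PERFORMANCE = ["Poor", "Satisfactory", "Above average"]
JOINT = [
    [0.0450, 0.0900, 0.0150],
    [0.0825, 0.3025, 0.1650],
    [0.0500, 0.1200, 0.0300],
    [0.0300, 0.0650, 0.0050],
]


def conditional_given_weight(joint):
    """Return P(performance | weight): each row divided by its row total."""
    return [[p / sum(row) for p in row] for row in joint]


COND = conditional_given_weight(JOINT)
for weight, row in zip(WEIGHTS, COND):
    print(f"{weight:<12}", "  ".join(f"{p:.2f}" for p in row))

# Under independence every row would match the first; here the rows differ.
rows_identical = all(
    isclose(p, q, abs_tol=1e-9)
    for row in COND[1:]
    for p, q in zip(row, COND[0])
)
print(rows_identical)
```

With these figures the probability of above-average athletic performance is about 0.10 for underweight, 0.30 for normal, 0.15 for overweight, and 0.05 for obese students, so knowing the weight category does change the probability.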