Testing whether a finals series changes the probability of topping the league
After running 100 simulations of a league in which Team A has probability 0.55 of winning each match, we might obtain the following contingency table of results.
| | Top after finals | Not top after finals | Total |
|---|---|---|---|
| Top of league | 42 | 15 | 57 |
| Not top of league | 7 | 36 | 43 |
| Total | 49 | 51 | 100 |
A 2-sample test of whether the marginal proportions (57/100 and 49/100) are from binomial distributions with the same π should not be used to test whether the probability of Team A winning has changed, since we do not have two independent samples — we really have 100 paired categorical measurements.
Note that the two diagonal cell counts (42 and 36) correspond to runs of the simulation where the position of Team A did not change after the finals series. They therefore do not hold any information about whether Team A's probability of winning has changed after the finals series. We therefore base our test only on the two off-diagonal cell counts (15 and 7).
If the probability of winning is the same before and after the finals series, the table of expected cell counts will be symmetric: both off-diagonal cells will have the same expected count. Each run of the simulation in which the position of Team A changes is therefore equally likely to fall in the top right or the bottom left cell of the table. As a result, the count in the top right cell, 15, should be a random value from a binomial distribution with n = 15 + 7 = 22 and π = 0.5.
To test whether the probability of Team A winning has changed after the finals series, we can therefore refer to this binomial distribution to find the probability of a count of 15 or more in the top right cell. Since we are performing a 2-tailed test, the p-value is double this tail probability.
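The calculation described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `paired_binomial_p_value` is our own, and it uses nothing beyond the standard library. It takes the two off-diagonal counts, finds the upper tail probability of the larger count under a binomial distribution with π = 0.5, and doubles it.

```python
from math import comb

def paired_binomial_p_value(top_right: int, bottom_left: int) -> float:
    """Two-tailed exact p-value based only on the two off-diagonal counts.

    Hypothetical helper written for this illustration.
    """
    n = top_right + bottom_left            # runs in which Team A's position changed
    larger = max(top_right, bottom_left)   # the more extreme of the two counts
    # P(X >= larger) when X ~ Binomial(n, 0.5)
    tail = sum(comb(n, k) for k in range(larger, n + 1)) / 2 ** n
    # Double the tail for a 2-tailed test; a probability cannot exceed 1
    return min(1.0, 2 * tail)

# Off-diagonal counts from the contingency table above
p_value = paired_binomial_p_value(15, 7)
print(f"p-value = {p_value:.4f}")
```

For the table above, the p-value works out to roughly 0.13, so these 100 runs on their own would give only weak evidence of a change.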
The diagram below illustrates.
Click Run League. The ranks of Team A in the ordinary league and after the finals series are shown on the top left. This simulation contributes a '1' to a single cell of the contingency table on the right.
Click Accumulate, then perform another 10 or 20 simulations of the league. The grey cells of the contingency table do not contribute to our test. The bar chart under the table shows a binomial distribution with π = 0.5. The red bars are for counts at least as extreme as the one in the top right cell of the contingency table. Doubling this tail probability gives the p-value for the test.
Hold down the Run League button until about 200 simulations have been performed. You should observe that the p-value is close to zero, giving strong evidence that the probability of Team A winning has changed. (The two off-diagonal cells would be unlikely to be so different if the probability stayed the same.)
If enough simulations were performed, the p-value would become extremely close to zero, leaving virtually no doubt that these probabilities are different.
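The effect of running more simulations can be illustrated with a small sketch. It rests on a loudly stated assumption that is not part of the simulation itself: the off-diagonal counts are simply scaled up while staying in the same 15:7 ratio as in the example table, which real runs would only do approximately. The helper `two_tailed_p` is a hypothetical name used for this illustration.

```python
from math import comb

def two_tailed_p(larger: int, smaller: int) -> float:
    """Doubled upper-tail probability of `larger` under Binomial(n, 0.5)."""
    n = larger + smaller
    tail = sum(comb(n, k) for k in range(larger, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# ASSUMPTION: the off-diagonal counts keep the 15:7 ratio of the example
# table as the number of simulation runs is multiplied by `factor`.
p_values = {}
for factor in (1, 2, 5, 10):
    p_values[factor] = two_tailed_p(15 * factor, 7 * factor)
    print(f"{100 * factor:5d} runs: p-value = {p_values[factor]:.2e}")
```

Under this assumption the p-value shrinks steadily as the number of runs grows, which is the behaviour the exercise above is designed to show.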
Types of hypothesis test based on simulations
Simulations are used to perform two distinct types of hypothesis test.