We now develop a general framework for applying tests to data.
Data and model
Hypothesis tests are based on data that are collected by some random mechanism. We can usually specify some characteristics of this random mechanism — a model for the data.
In this e-book, we assume that the data are a random sample from some distribution. We may even be able to argue that this distribution belongs to a specific family, such as the Poisson distributions. We then concentrate on one characteristic of this family of distributions: a parameter whose value is unknown.
Example: Telepathy experiment
An experiment is conducted to investigate whether one subject can telepathically pass shape information to another subject. A deck of cards containing equal numbers of cards with circles, squares and crosses is shuffled. One subject selects 90 cards at random and attempts to 'send' the shape on the card to the other subject who is seated behind a screen; this second subject reports the shape imagined for the card.
This situation can be modeled as random sampling of 90 values (correct or wrong) from a categorical population in which the probability of correctly identifying the card is \(\pi\). Whether or not there has been telepathy is determined by the value of the parameter \(\pi\).
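The model above can be sketched as a small simulation. This is only an illustration: the function name `simulate_telepathy`, the seed, and the use of Python's standard `random` module are assumptions made here, not part of the experiment's description.

```python
import random

def simulate_telepathy(n_cards=90, pi=1/3, seed=None):
    """Simulate one experiment: count correct identifications in n_cards
    independent trials, each correct with probability pi."""
    rng = random.Random(seed)
    # Each card is identified correctly with probability pi
    return sum(rng.random() < pi for _ in range(n_cards))

# One simulated experiment in which the second subject is only guessing
correct = simulate_telepathy(seed=1)
print(f"{correct} of 90 cards identified correctly")
```

Repeating the simulation with different seeds shows how much the number of correct answers varies from experiment to experiment when \(\pi\) is held fixed.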
Example: Aircraft air-conditioner failures
The table below shows the number of operating hours between successive failures of air-conditioning equipment in ten aircraft.
Aircraft number | Operating hours between successive failures |
---|---|
2 | 413 14 58 37 100 65 9 169 447 184 36 201 118 34 31 18 18 67 57 62 7 22 34 |
3 | 90 10 60 186 61 49 14 24 56 20 79 84 44 59 29 118 25 156 310 76 26 44 23 62 130 208 70 101 208 |
4 | 74 57 48 29 502 12 70 21 29 386 59 27 153 26 326 |
5 | 55 320 65 104 220 239 47 246 176 182 33 15 104 35 |
6 | 23 261 87 7 120 14 62 47 225 71 246 21 42 20 5 12 120 11 3 14 71 11 14 11 16 90 1 16 52 95 |
7 | 97 51 11 4 141 18 142 68 77 80 1 16 106 206 82 54 31 216 46 111 39 63 18 191 18 163 24 |
8 | 50 44 102 72 22 39 3 15 197 188 79 88 46 5 5 36 22 139 210 97 30 23 13 14 |
9 | 359 9 12 270 603 3 104 2 438 |
12 | 487 18 100 7 98 5 85 91 43 230 3 130 |
13 | 102 209 14 57 54 32 67 59 134 152 27 14 230 66 61 34 |
Assume that each aircraft has the same failure rate for its air-conditioning equipment, and that the probability of a failure in any hour is the same however long it has been since the most recent failure and repair. Failures then form a Poisson process with rate \(\lambda\) per hour for each aircraft, and the times above are a random sample of size \(n = 199\) from an \(\ExponDistn(\lambda)\) distribution.
This is a model that we might use for the data and \(\lambda\) is the unknown parameter of interest.
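As a rough illustration of how this model connects to the data, the sketch below pools the times from all ten aircraft and computes the reciprocal of the sample mean, which is the usual maximum likelihood estimate of an exponential rate. The text itself does not estimate \(\lambda\) here, so the variable names and the choice of estimate are assumptions of this sketch.

```python
# Operating hours between successive failures, keyed by aircraft number
hours = {
    2: [413, 14, 58, 37, 100, 65, 9, 169, 447, 184, 36, 201, 118, 34, 31,
        18, 18, 67, 57, 62, 7, 22, 34],
    3: [90, 10, 60, 186, 61, 49, 14, 24, 56, 20, 79, 84, 44, 59, 29, 118,
        25, 156, 310, 76, 26, 44, 23, 62, 130, 208, 70, 101, 208],
    4: [74, 57, 48, 29, 502, 12, 70, 21, 29, 386, 59, 27, 153, 26, 326],
    5: [55, 320, 65, 104, 220, 239, 47, 246, 176, 182, 33, 15, 104, 35],
    6: [23, 261, 87, 7, 120, 14, 62, 47, 225, 71, 246, 21, 42, 20, 5, 12,
        120, 11, 3, 14, 71, 11, 14, 11, 16, 90, 1, 16, 52, 95],
    7: [97, 51, 11, 4, 141, 18, 142, 68, 77, 80, 1, 16, 106, 206, 82, 54,
        31, 216, 46, 111, 39, 63, 18, 191, 18, 163, 24],
    8: [50, 44, 102, 72, 22, 39, 3, 15, 197, 188, 79, 88, 46, 5, 5, 36,
        22, 139, 210, 97, 30, 23, 13, 14],
    9: [359, 9, 12, 270, 603, 3, 104, 2, 438],
    12: [487, 18, 100, 7, 98, 5, 85, 91, 43, 230, 3, 130],
    13: [102, 209, 14, 57, 54, 32, 67, 59, 134, 152, 27, 14, 230, 66, 61, 34],
}

# Pool all aircraft, since the model assumes a common failure rate
all_hours = [h for times in hours.values() for h in times]
n = len(all_hours)

mean_hours = sum(all_hours) / n   # average time between failures
rate_estimate = 1 / mean_hours    # estimated failures per hour

print(f"n = {n}")
print(f"mean time between failures = {mean_hours:.1f} hours")
print(f"estimated rate = {rate_estimate:.5f} failures per hour")
```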
In hypothesis testing, we want to compare two statements about an unknown parameter in the model.
Null hypothesis
This is the more restrictive of the two hypotheses and often specifies a single value for the unknown parameter such as \(\alpha = 0\). It is a 'default' value that can be accepted as holding if there is no evidence against it. A researcher often collects data with the express hope of disproving the null hypothesis.
Alternative hypothesis
If the null hypothesis is not true, we say that the alternative hypothesis holds. (However, you can understand most of hypothesis testing without paying much attention to the alternative hypothesis!)
Either the null hypothesis or the alternative hypothesis must be true.
Example: Telepathy experiment
In the telepathy experiment, the probability of correctly identifying any card is \(\pi\). Since there were three different shapes on the cards, guessing would result in a probability \(\pi = \diagfrac {\small 1} {\small 3}\) of choosing the correct shape.
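To test whether there was telepathy, the two hypotheses would therefore be \(H_0: \pi = \diagfrac {\small 1} {\small 3}\) and \(H_A: \pi \gt \diagfrac {\small 1} {\small 3}\).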
The researchers would need clear evidence against guessing before concluding that there was telepathy. The default position would be that telepathy did not exist, so this should be the null hypothesis.
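To see what "clear evidence against guessing" might look like numerically, the sketch below computes the probability of getting at least a given number of cards correct when the second subject is only guessing (\(\pi = \diagfrac {\small 1} {\small 3}\)), so that the number correct out of 90 is binomial. The function name `upper_tail` and the count of 40 correct are hypothetical; the text does not report the result of the experiment.

```python
from math import comb

def upper_tail(k, n=90, pi=1/3):
    """P(X >= k) when X ~ Binomial(n, pi)."""
    return sum(comb(n, x) * pi**x * (1 - pi)**(n - x) for x in range(k, n + 1))

# Under guessing, about 90 * 1/3 = 30 correct answers are expected
hypothetical_correct = 40  # illustrative value only, not from the experiment
prob = upper_tail(hypothetical_correct)
print(f"P(at least {hypothetical_correct} correct when guessing) = {prob:.4f}")
```

A small probability here would mean that the hypothetical count is hard to explain by guessing alone.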
Example: Aircraft air-conditioner failures
In the aircraft air-conditioner failure data, we might be interested in testing the manufacturer's claim that the rate of failures is no more than one per 110 hours of use. This would correspond to the exponential distribution's parameter \(\lambda\), the rate per hour, being no higher than \(\diagfrac {\small 1} {\small 110}\).
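This would be tested with the following two hypotheses: \(H_0: \lambda \le \diagfrac {\small 1} {\small 110}\) and \(H_A: \lambda \gt \diagfrac {\small 1} {\small 110}\).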
The null hypothesis gives the values that we will accept unless there is strong evidence against them being correct — default values that the data could possibly contradict.
Simplifying the null hypothesis
In some situations, both the null and alternative hypotheses cover ranges of values for the parameter. To simplify the analysis, we do the test as though the null hypothesis specified the single value closest to the alternative hypothesis range.
Example: Aircraft air-conditioner failures
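The two hypotheses were \(H_0: \lambda \le \diagfrac {\small 1} {\small 110}\) and \(H_A: \lambda \gt \diagfrac {\small 1} {\small 110}\).
In practice, we do the test as though the hypotheses were \(H_0: \lambda = \diagfrac {\small 1} {\small 110}\) and \(H_A: \lambda \gt \diagfrac {\small 1} {\small 110}\).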
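The sketch below suggests why the boundary value is the one that matters. It assumes, purely for illustration, that a small sample mean of the 199 times would count as evidence of a failure rate above \(\diagfrac {\small 1} {\small 110}\); the text has not yet specified a test statistic, and the threshold of 100 hours and the function name are hypothetical. It uses the fact that the total of \(n\) exponential times with rate \(\lambda\) is at most \(t\) exactly when a Poisson count with mean \(\lambda t\) is at least \(n\).

```python
from math import exp, lgamma, log

def prob_mean_at_most(c, lam, n=199):
    """P(sample mean <= c) for n independent exponential(lam) times.

    The total time is at most n*c exactly when a Poisson process with
    rate lam has at least n events by time n*c, so the probability is
    1 - P(Poisson(lam * n * c) <= n - 1).
    """
    mu = lam * n * c
    # Poisson probabilities are evaluated on the log scale to avoid overflow
    below = sum(exp(-mu + k * log(mu) - lgamma(k + 1)) for k in range(n))
    return 1 - below

threshold = 100  # hypothetical cut-off for the sample mean, in hours
rates = [1 / 150, 1 / 130, 1 / 110]  # all inside the null hypothesis range
probs = [prob_mean_at_most(threshold, lam) for lam in rates]

for lam, p in zip(rates, probs):
    print(f"lambda = 1/{round(1 / lam)}: P(mean <= {threshold}) = {p:.6f}")
```

Among rates allowed by the null hypothesis, the probability of such a small mean is largest at \(\lambda = \diagfrac {\small 1} {\small 110}\), so data that are implausible at the boundary are even less plausible for every smaller rate.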