We now develop a general framework for applying tests to data.

Data and model

Hypothesis tests are based on data that are collected by some random mechanism. We can usually specify some characteristics of this random mechanism — a model for the data.

In this e-book, we assume that the data are a random sample from some distribution. We may even be able to argue that this distribution belongs to a specific family, such as the Poisson distributions. We concentrate on one specific characteristic of this family of distributions: a parameter whose value is unknown.

Example: Telepathy experiment

An experiment is conducted to investigate whether one subject can telepathically pass shape information to another subject. A deck of cards containing equal numbers of cards with circles, squares and crosses is shuffled. One subject selects 90 cards at random and attempts to 'send' the shape on the card to the other subject who is seated behind a screen; this second subject reports the shape imagined for the card.

This situation can be modeled as random sampling of 90 values (correct or wrong) from a categorical population in which the probability of correctly identifying the card is \(\pi\). Whether or not there has been telepathy is determined by the value of the parameter \(\pi\).
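The sampling model above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `simulate_correct_count` and the seed are hypothetical, and the value 1/3 is the chance of a correct answer if the receiver simply guesses among three equally likely shapes.

```python
import random


def simulate_correct_count(pi, n_cards=90, seed=None):
    """Simulate the number of correct identifications out of n_cards,
    where each card is identified correctly, independently, with
    probability pi."""
    rng = random.Random(seed)
    return sum(rng.random() < pi for _ in range(n_cards))


# Pure guessing among three equally likely shapes corresponds to pi = 1/3.
# The seed is arbitrary and only makes the illustration repeatable.
guessing_count = simulate_correct_count(1 / 3, seed=1)
print(f"Correct identifications out of 90 under guessing: {guessing_count}")
```

Running the function with different values of `pi` shows how the parameter controls the typical number of correct answers, which is why a statement about telepathy becomes a statement about \(\pi\).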

Example: Aircraft air-conditioner failures

The table below shows the number of operating hours between successive failures of air-conditioning equipment in ten aircraft.

Aircraft number: operating hours between successive failures

2: 413, 14, 58, 37, 100, 65, 9, 169, 447, 184, 36, 201, 118, 34, 31, 18, 18, 67, 57, 62, 7, 22, 34
3: 90, 10, 60, 186, 61, 49, 14, 24, 56, 20, 79, 84, 44, 59, 29, 118, 25, 156, 310, 76, 26, 44, 23, 62, 130, 208, 70, 101, 208
4: 74, 57, 48, 29, 502, 12, 70, 21, 29, 386, 59, 27, 153, 26, 326
5: 55, 320, 65, 104, 220, 239, 47, 246, 176, 182, 33, 15, 104, 35
6: 23, 261, 87, 7, 120, 14, 62, 47, 225, 71, 246, 21, 42, 20, 5, 12, 120, 11, 3, 14, 71, 11, 14, 11, 16, 90, 1, 16, 52, 95
7: 97, 51, 11, 4, 141, 18, 142, 68, 77, 80, 1, 16, 106, 206, 82, 54, 31, 216, 46, 111, 39, 63, 18, 191, 18, 163, 24
8: 50, 44, 102, 72, 22, 39, 3, 15, 197, 188, 79, 88, 46, 5, 5, 36, 22, 139, 210, 97, 30, 23, 13, 14
9: 359, 9, 12, 270, 603, 3, 104, 2, 438
12: 487, 18, 100, 7, 98, 5, 85, 91, 43, 230, 3, 130
13: 102, 209, 14, 57, 54, 32, 67, 59, 134, 152, 27, 14, 230, 66, 61, 34

Assume that each aircraft has the same failure rate for its air-conditioning equipment, and that the probability of a failure in any hour is the same however long it has been since the most recent failure and repair. Failures will then be a Poisson process with rate \(\lambda\) per hour for each aircraft, and the times above will be a random sample of size \(n = 199\) from an \(\ExponDistn(\lambda)\) distribution.

This is a model that we might use for the data and \(\lambda\) is the unknown parameter of interest.

In hypothesis testing, we want to compare two statements about an unknown parameter in the model.

Null hypothesis

This is the more restrictive of the two hypotheses and often specifies a single value for the unknown parameter such as \(\alpha = 0\). It is a 'default' value that can be accepted as holding if there is no evidence against it. A researcher often collects data with the express hope of disproving the null hypothesis.

Alternative hypothesis

If the null hypothesis is not true, we say that the alternative hypothesis holds. (However, you can understand most of hypothesis testing without paying much attention to the alternative hypothesis!)

Either the null hypothesis or the alternative hypothesis must be true.


Example: Telepathy experiment

In the telepathy experiment, the probability of correctly identifying any card is \(\pi\). Since there were three different shapes on the cards, guessing would result in a probability \(\pi = \diagfrac {\small 1} {\small 3}\) of choosing the correct shape.

To test whether there was telepathy, the two hypotheses would therefore be

\(H_0: \pi = \diagfrac {\small 1} {\small 3}\)

\(H_A: \pi > \diagfrac {\small 1} {\small 3}\)

The researchers would need clear evidence against guessing before concluding that there was telepathy. The default position would be that telepathy did not exist, so this should be the null hypothesis.
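To give a feel for what "clear evidence against guessing" might mean, the sketch below computes probabilities for the number of correct answers when \(\pi = \diagfrac {\small 1} {\small 3}\), using the binomial distribution implied by the model of 90 independent cards. The observed count used here (`hypothetical_correct = 36`) is invented purely for illustration, since the experiment's result is not given in this section, and the helper names are also assumptions.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success prob p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_at_least(k, n, p):
    """P(at least k successes in n independent trials with success prob p)."""
    return sum(binomial_pmf(j, n, p) for j in range(k, n + 1))


N_CARDS = 90
PI_GUESSING = 1 / 3          # value of pi if the receiver is only guessing

expected_correct = N_CARDS * PI_GUESSING   # 30 cards on average

# Hypothetical result, NOT taken from the experiment described in the text.
hypothetical_correct = 36
tail = prob_at_least(hypothetical_correct, N_CARDS, PI_GUESSING)

print(f"Expected correct under guessing: {expected_correct:.0f}")
print(f"P(at least {hypothetical_correct} correct | guessing) = {tail:.4f}")
```

A count that would be very unlikely under guessing counts against the default position, whereas a count near 30 gives no reason to abandon it.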

Example: Aircraft air-conditioner failures

In the aircraft air-conditioner failure data, we might be interested in testing the manufacturer's claim that the rate of failures is no more than one per 110 hours of use. This would correspond to the exponential distribution's parameter \(\lambda\), the rate per hour, being no higher than \(\diagfrac {\small 1} {\small 110}\).

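This would be tested with the following two hypotheses.

\(H_0: \lambda \le \diagfrac {\small 1} {\small 110}\)

\(H_A: \lambda > \diagfrac {\small 1} {\small 110}\)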

The null hypothesis gives the values that we will accept unless there is strong evidence against them being correct — default values that the data could possibly contradict.
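As a rough first look at the manufacturer's claim, the sketch below puts the 199 times from the table into a Python list and compares the average time between failures with the claimed 110 hours. The variable names are illustrative assumptions, and the reciprocal of the sample mean is used only as a simple summary of the failure rate under the exponential model, not as a formal test.

```python
# Operating hours between successive failures, in the order listed in the table.
times = [
    413, 14, 58, 37, 100, 65, 9, 169, 447, 184, 36, 201, 118, 34, 31, 18,
    18, 67, 57, 62, 7, 22, 34,
    90, 10, 60, 186, 61, 49, 14, 24, 56, 20, 79, 84, 44, 59, 29, 118, 25,
    156, 310, 76, 26, 44, 23, 62, 130, 208, 70, 101, 208,
    74, 57, 48, 29, 502, 12, 70, 21, 29, 386, 59, 27, 153, 26, 326,
    55, 320, 65, 104, 220, 239, 47, 246, 176, 182, 33, 15, 104, 35,
    23, 261, 87, 7, 120, 14, 62, 47, 225, 71, 246, 21, 42, 20, 5, 12, 120,
    11, 3, 14, 71, 11, 14, 11, 16, 90, 1, 16, 52, 95,
    97, 51, 11, 4, 141, 18, 142, 68, 77, 80, 1, 16, 106, 206, 82, 54, 31,
    216, 46, 111, 39, 63, 18, 191, 18, 163, 24,
    50, 44, 102, 72, 22, 39, 3, 15, 197, 188, 79, 88, 46, 5, 5, 36, 22,
    139, 210, 97, 30, 23, 13, 14,
    359, 9, 12, 270, 603, 3, 104, 2, 438,
    487, 18, 100, 7, 98, 5, 85, 91, 43, 230, 3, 130,
    102, 209, 14, 57, 54, 32, 67, 59, 134, 152, 27, 14, 230, 66, 61, 34,
]

CLAIMED_HOURS = 110                     # manufacturer's claim: at most 1 failure per 110 hours

n = len(times)
mean_hours = sum(times) / n             # average hours between failures
rate_estimate = 1 / mean_hours          # simple summary of the rate per hour

print(f"n = {n}")
print(f"mean time between failures = {mean_hours:.1f} hours")
print(f"rate summary = {rate_estimate:.5f} per hour, "
      f"claimed maximum = {1 / CLAIMED_HOURS:.5f} per hour")
```

A sample mean below 110 hours points towards the alternative, but whether it is strong enough evidence to reject the default values is the question that the test has to answer.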

Simplifying the null hypothesis

In some situations, both the null and alternative hypotheses cover ranges of values for the parameter. To simplify the analysis, we do the test as though the null hypothesis specified the single value closest to the alternative hypothesis range.

Example: Aircraft air-conditioner failures

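The two hypotheses were:

\(H_0: \lambda \le \diagfrac {\small 1} {\small 110}\)

\(H_A: \lambda > \diagfrac {\small 1} {\small 110}\)

In practice, we do the test as though the hypotheses were

\(H_0: \lambda = \diagfrac {\small 1} {\small 110}\)

\(H_A: \lambda > \diagfrac {\small 1} {\small 110}\)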
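One way to see why the boundary value is enough is sketched below. A high failure rate shows up as a small total operating time, so the code computes the probability of a total at least as small as some given value for several rates in the null range. It relies on the Poisson process model: the total of \(n\) exponential times is at most \(t\) exactly when at least \(n\) failures occur in \(t\) hours. The total of 18000 hours and the function name are hypothetical, chosen only for illustration.

```python
from math import exp, lgamma, log


def prob_total_at_most(total_hours, n, rate, extra_terms=1000):
    """P(sum of n independent Exponential(rate) times <= total_hours).

    Equals P(Poisson(rate * total_hours) >= n). The infinite sum is
    truncated after extra_terms terms, which are negligible beyond that
    point for the values used in this sketch.
    """
    mu = rate * total_hours
    return sum(
        exp(-mu + k * log(mu) - lgamma(k + 1))
        for k in range(n, n + extra_terms)
    )


N = 199
HYPOTHETICAL_TOTAL = 18000.0            # hypothetical total hours, for illustration only

# Three rates that all satisfy the null hypothesis (rate <= 1/110).
null_rates = [1 / 150, 1 / 130, 1 / 110]
probs = [prob_total_at_most(HYPOTHETICAL_TOTAL, N, r) for r in null_rates]

for r, p in zip(null_rates, probs):
    print(f"rate = 1/{round(1 / r)}: P(total <= {HYPOTHETICAL_TOTAL:.0f}) = {p:.3g}")
```

The probability grows as the rate approaches \(\diagfrac {\small 1} {\small 110}\), so data that look implausible at the boundary value look even less plausible for every other value in the null range. This is why treating the null hypothesis as the single boundary value loses nothing.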