Differences

The key to analysing paired data is to recognise that the differences between X and Y hold all the information about whether their means are the same. Writing

D = Y - X

the hypotheses

H0 :   μX = μY
HA :   μX ≠ μY

can be expressed as

H0 :   μD = 0
HA :   μD ≠ 0

This reduces the paired data set to a univariate data set of differences. The test also becomes a simpler hypothesis test about the mean of these differences.
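As a sketch of what this simpler test usually looks like: the text above does not name a test statistic, so the one-sample t statistic below is an assumption, appropriate when the differences are roughly normally distributed. Here d_1, ..., d_n denote the n observed differences, and the symbols \bar{d} and s_D are introduced only for this illustration.

```latex
\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i, \qquad
s_D = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} \left(d_i - \bar{d}\right)^2}, \qquad
t = \frac{\bar{d} - 0}{s_D / \sqrt{n}}
```

Under H0 : μD = 0, this statistic would be compared with a t distribution with n - 1 degrees of freedom.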

Music and work efficiency

The increase in efficiency for each employee (after the music system was installed) is shown in the final column below.

            Efficiency rating
Employee    Before    After    Difference
    1         21        32         11
    2         35        35          0
    3         40        38         -2
    4         38        57         19
    5         23        37         14
    6         27        30          3
    7         28        39         11
    8         39        28        -11
    9         22        40         18
   10         35        48         13
   11         28        33          5
   12         20        33         13
   13         39        39          0
   14         28        41         13
   15         34        40          6

Is the mean of the differences zero?
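One way to start answering this question is to compute the mean of the differences and see how many standard errors it lies from zero. The sketch below uses only the Python standard library and the efficiency ratings listed above. The choice of a one-sample t statistic is an assumption, not something the text prescribes, and the variable names are illustrative.

```python
import math
import statistics

# Efficiency ratings for the 15 employees, as listed in the table above.
before = [21, 35, 40, 38, 23, 27, 28, 39, 22, 35, 28, 20, 39, 28, 34]
after = [32, 35, 38, 57, 37, 30, 39, 28, 40, 48, 33, 33, 39, 41, 40]

# D = After - Before, one difference per employee.
diffs = [a - b for a, b in zip(after, before)]

n = len(diffs)
mean_d = statistics.mean(diffs)
sd_d = statistics.stdev(diffs)   # sample standard deviation (divisor n - 1)
se_d = sd_d / math.sqrt(n)       # standard error of the mean difference
t_stat = mean_d / se_d           # distance of the mean from zero, in standard errors

print(f"n = {n}, mean = {mean_d:.2f}, sd = {sd_d:.2f}, t = {t_stat:.2f}")
```

For these data the statistic comes out at roughly 3.5 on 14 degrees of freedom, which would usually be read as fairly strong evidence that the mean difference is not zero.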

Twin studies

The final column below shows the difference in IQ for each pair (good minus poor).

                         IQ
Family    Poor environment    Good environment    Difference
   1            100                 125               25
   2             65                  95               30
   3             60                 100               40
   4            125                 120               -5
   5             85                 120               35
   6            145                 185               40
   7             55                  80               25
   8            180                 210               30
   9             60                 105               45
  10            135                 175               40

Is the mean of the differences zero?

Garage repair estimates

The final column shows the amount that garage A overcharges, compared to garage B.

        Estimate for car repair
Car     Garage A    Garage B    Difference
  1        420         380          40
  2        900         760         240
  3       1260        1180          80
  4        630         560          70
  5        240         260         -20
  6       1080        1000          80
  7       1460        1300         160
  8       1900        1720         180
  9       2020        1800         220
 10       1520        1440          80

Is the mean of the differences zero?


Analysis of paired data

Taking differences eliminates much of the variability between individuals. The differences therefore provide considerably more information for assessing the null and alternative hypotheses than the two sets of raw measurements do.
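The reduction in variability can be checked directly on the twin study data from earlier in this page. The sketch below, using only the Python standard library, compares the spread of the raw IQ values with the spread of the within-pair differences. The variable names are illustrative.

```python
import statistics

# IQ values for the 10 twin pairs, as listed in the twin study table.
poor = [100, 65, 60, 125, 85, 145, 55, 180, 60, 135]
good = [125, 95, 100, 120, 120, 185, 80, 210, 105, 175]

# Difference within each pair (good minus poor).
diffs = [g - p for g, p in zip(good, poor)]

sd_poor = statistics.stdev(poor)    # spread between families, poor environment
sd_good = statistics.stdev(good)    # spread between families, good environment
sd_diff = statistics.stdev(diffs)   # spread of the within-pair differences

print(f"SD poor = {sd_poor:.1f}, SD good = {sd_good:.1f}, SD differences = {sd_diff:.1f}")
```

The standard deviation of the differences is far smaller than that of either raw column, because the large family-to-family variation in IQ is common to both twins and cancels when the difference is taken.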

The benefits of pairing will be explained more fully on a later page.

Garage repair estimates

The diagram below shows the repair estimates from garages A and B. The two distributions overlap considerably due to variability in the amounts of damage to the cars, so it initially appears that there will be little evidence against equal means.

Comparing the two estimates for each individual car shows that most estimates are higher for garage A.

Joining each pair of estimates and displaying the differences in a jittered dot plot gives much clearer evidence that the mean estimate is higher for garage A: the mean difference appears to be positive.

Note that it would be wrong to analyse this as two separate samples:

The data are paired because each pair of repair estimates is for the same car.
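To illustrate why the two-sample approach would be misleading here, the sketch below computes a t statistic both ways for the garage estimates. It rests on some assumptions: the unequal-variance two-sample standard error is just one common choice, shown only for contrast, and the differences are recomputed from the two estimate columns as listed, so each value follows those columns rather than the tabulated Difference column.

```python
import math
import statistics

# Repair estimates for the 10 cars, as listed for garages A and B.
garage_a = [420, 900, 1260, 630, 240, 1080, 1460, 1900, 2020, 1520]
garage_b = [380, 760, 1180, 560, 260, 1000, 1300, 1720, 1800, 1440]
n = len(garage_a)

# Paired analysis: one difference per car (A minus B), recomputed from the columns.
diffs = [a - b for a, b in zip(garage_a, garage_b)]
t_paired = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))

# Two-sample analysis (inappropriate here): treats the columns as independent samples,
# so the car-to-car variation in damage inflates the standard error.
se_two = math.sqrt(statistics.variance(garage_a) / n + statistics.variance(garage_b) / n)
t_two = (statistics.mean(garage_a) - statistics.mean(garage_b)) / se_two

print(f"paired t = {t_paired:.2f}, two-sample t = {t_two:.2f}")
```

The numerator is the same in both statistics, since the mean of the differences equals the difference of the means. Only the standard error changes, and the paired version is much smaller because each car's level of damage cancels out of its difference.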