Bad experimental design

We noted earlier that the experimental units often have considerable variability. If the treatments are allocated to experimental units in a way that is associated with their characteristics, these varying characteristics can distort the apparent relationship between the treatments and the response.

In a badly designed experiment, the characteristics of the experimental units act in the same way as lurking variables in observational studies.

Good experimental design

Since variability in the experimental units is usually unavoidable, we cannot prevent its effect on the response. However, in an experiment it is possible to allocate treatments to the experimental units in a way that either eliminates, or at least reduces, the relationship between the treatment, X, and characteristics of the experimental units.

Good experimental design can avoid the potential effect of lurking variables.

Speed of checkout operators — a badly designed experiment

The diagram below relates to the same 16 supermarket checkout operators who were examined on the previous page. In the experiment, the 8 operators who choose to work the Saturday morning shift will be provided with a packer to load customer bags; the other 8 operators, who work the afternoon shift that day, will pack customer bags themselves (the 'control' group). We will conduct a simulation of this experiment in which the packer decreases packing time by exactly 0.1 seconds per dollar.

The circles on the left of the diagram below represent the 16 operators with their initial ages represented by the colours of the circles.

Click Allocate treatments to simulate the selection of eight of the operators to go on the morning shift (and be given the packer). Now click Run experiment to simulate the speeds of the operators over 3 hours.

Repeat the experiment a few times and observe that most runs of the experiment estimate the effect of the packer to be a decrease in packing time of between 0.12 and 0.18 seconds per dollar.

Why does the experiment consistently over-estimate the effect of the packer?

The problem lies in the method of choosing the operators who get a packer. Checkout operators working in the morning tend to be older than those in the afternoon, so it is older operators who tend to get the packer.

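We saw on the previous page that older checkout operators tend to be faster, so the operators getting packers will tend to be faster even if the packer does not affect speed. The experiment therefore credits the packer with both its true effect and the speed advantage that comes from age.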
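The over-estimate can be reproduced with a minimal sketch in Python. Only the 16 operators, the 8/8 split and the true packer effect of 0.1 seconds per dollar come from the text. The age range, the strength of the age effect (AGE_SLOPE), the noise levels and the self-selection rule are assumptions chosen for illustration, so the numbers it prints will not match the diagram exactly.

```python
import random
import statistics

TRUE_EFFECT = 0.1   # packer's true decrease in seconds per dollar (from the text)
AGE_SLOPE = 0.003   # ASSUMED: seconds per dollar saved per year of age
N = 16              # number of operators (from the text)


def run_experiment(rng, packer_effect=TRUE_EFFECT):
    """Return the estimated decrease in time for one self-selected experiment."""
    # ASSUMED ages: uniform between 18 and 60.
    ages = [rng.uniform(18, 60) for _ in range(N)]

    # ASSUMED self-selection: keenness for the morning shift rises with age,
    # plus personal noise; the 8 keenest operators take the morning shift.
    keenness = [age + rng.gauss(0, 10) for age in ages]
    by_keenness = sorted(range(N), key=lambda i: keenness[i], reverse=True)
    morning = set(by_keenness[:8])

    times = []
    for i in range(N):
        # ASSUMED model: older operators need fewer seconds per dollar.
        t = 1.0 - AGE_SLOPE * ages[i] + rng.gauss(0, 0.02)
        if i in morning:
            t -= packer_effect  # only the morning shift gets the packer
        times.append(t)

    packer_mean = statistics.mean(times[i] for i in morning)
    control_mean = statistics.mean(times[i] for i in range(N) if i not in morning)
    return control_mean - packer_mean


rng = random.Random(1)
estimates = [run_experiment(rng) for _ in range(1000)]
average_estimate = statistics.mean(estimates)
print(f"true effect: {TRUE_EFFECT}, average estimate: {average_estimate:.3f}")
```

Under these assumptions the average estimate sits above the true effect, because the morning group is older on average and so is faster before the packer is added.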

Difference in speeds if there was no packer effect

The reason for the misleading results is clearer in the following simulation in which the packer has zero effect.

Repeat the simulation a few times and observe that the packer is usually estimated to increase speed even though we know that it has no effect.

Observe that the operators getting a packer tend to be older: their circles are bluer. The difference between the means of the two groups of operators is caused by the difference in their average ages, not by the packer.
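The zero-effect case can be sketched in the same spirit. The Python below gives the packer no effect at all and records, for each run, the gap in average age between the shifts and the apparent benefit of the packer. As before, the age range, AGE_SLOPE, the noise levels and the self-selection rule are illustrative assumptions, not values taken from the diagram.

```python
import random
import statistics

AGE_SLOPE = 0.003  # ASSUMED: seconds per dollar saved per year of age


def zero_effect_run(rng):
    """Return (age gap, apparent packer benefit) when the packer does nothing."""
    # ASSUMED ages: uniform between 18 and 60.
    ages = [rng.uniform(18, 60) for _ in range(16)]

    # ASSUMED self-selection: older operators lean towards the morning shift.
    ranked = sorted(range(16), key=lambda i: ages[i] + rng.gauss(0, 10), reverse=True)
    morning, afternoon = ranked[:8], ranked[8:]

    # No packer term here: the time per dollar depends only on age and noise.
    times = [1.0 - AGE_SLOPE * age + rng.gauss(0, 0.02) for age in ages]

    age_gap = (statistics.mean(ages[i] for i in morning)
               - statistics.mean(ages[i] for i in afternoon))
    benefit = (statistics.mean(times[i] for i in afternoon)
               - statistics.mean(times[i] for i in morning))
    return age_gap, benefit


rng = random.Random(2)
runs = [zero_effect_run(rng) for _ in range(1000)]
mean_age_gap = statistics.mean(gap for gap, _ in runs)
share_fooled = sum(benefit > 0 for _, benefit in runs) / len(runs)
print(f"average age gap (morning minus afternoon): {mean_age_gap:.1f} years")
print(f"share of runs where the packer appears to help: {share_fooled:.2f}")
```

In this sketch the apparent benefit is driven entirely by the age gap: the morning group is older, so it looks faster in most runs even though the packer term is absent from the model.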

Good experimental design means ensuring that there are no major differences between the two groups of experimental units, other than the treatments themselves.

Later pages in this section describe some strategies to follow when designing experiments that avoid the problem of lurking variables.