Summarising centre and spread

Most data sets exhibit variability: not all values are the same. Two aspects of the distribution of values are particularly important.

Centre
The centre of a distribution is a 'typical' value around which the data are located.
Spread
The spread of a distribution of values describes the distance of the individual values from the centre.

In this section, we examine how to describe centre and spread with numerical values called summary statistics. Numerical summaries of centre and spread give particularly concise and meaningful comparisons of different groups.

A pharmaceutical company is in the final stages of testing a new class of drugs that are effective at reducing high blood pressure. However, some patients have reported side effects; in particular, some felt that their perception of distance had been affected.

The diagram below shows results from an experiment that was conducted to measure whether the ability to assess distance was worse for patients receiving the drugs. A 'control' group of 20 male patients was not given any drug, whereas two other groups were given Drug A and Drug B. Each subject was asked to position himself 3 metres from a wall, and the actual distance was recorded.

There is considerable variation in the estimates of the 3-metre distance from the patients — their estimates were up to 1 metre in error.

  • The centre of the Control group's distribution is close to 3 metres — patients who got no drug were usually close to the correct distance from the wall.
  • Patients getting Drug A tended to choose a position that was too close to the wall. The centre of Drug A's distribution is about 2.5 metres.
  • Patients getting Drug B tended to position themselves too far from the wall. The centre of Drug B's distribution is about 3.5 metres.

A numerical measure of centre should describe this tendency to over- or under-estimate the distance.
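
As a rough sketch of what such a measure could look like, the Python snippet below computes the mean and the median for three small groups. The text above does not name a specific statistic, so these two are used here only as common candidates. The distances are invented for illustration and are not the experiment's measurements; the names `distances`, `TARGET` and `centre_summary` are likewise hypothetical.

```python
from statistics import mean, median

# Hypothetical distances in metres, invented for illustration only.
# These are NOT the measurements from the experiment described above.
distances = {
    "Control": [2.8, 3.1, 3.0, 2.9, 3.2],
    "Drug A": [2.3, 2.6, 2.5, 2.4, 2.7],
    "Drug B": [3.4, 3.6, 3.5, 3.3, 3.7],
}

TARGET = 3.0  # the distance each subject was asked to stand from the wall


def centre_summary(values):
    """Return (mean, median) as two candidate measures of centre."""
    return mean(values), median(values)


for group, values in distances.items():
    avg, med = centre_summary(values)
    # A centre below TARGET suggests standing too close to the wall;
    # a centre above TARGET suggests standing too far away.
    print(f"{group}: mean = {avg:.2f} m, median = {med:.2f} m, "
          f"mean minus target = {avg - TARGET:+.2f} m")
```

With made-up values like these, the sign of "mean minus target" summarises in a single number whether a group tends to stand too close to the wall or too far from it.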


After further development, a similar trial was conducted with two different drugs.

There is no tendency to over- or under-estimate a 3-metre distance with these drugs: the centres of all three distributions are close to the correct distance of 3 metres. However:

  • With Drug C, there is far more variability: patients can be in error by as much as 1 metre in their assessment of a 3-metre distance.
  • Patients getting Drug D can be wildly inaccurate in their assessment of distance: they sometimes over- or under-estimate the distance by as much as 2 metres.

A numerical measure of spread should describe this tendency for greater errors with drugs C and D.
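
A minimal sketch of this idea follows, assuming the range and the sample standard deviation as two possible measures of spread (the text above does not commit to either). The positions are invented for illustration and are not the trial's data; `positions` and `spread_summary` are hypothetical names. All three made-up groups share the same mean, so only their spread differs.

```python
from statistics import mean, stdev

# Hypothetical positions in metres, invented for illustration only.
# These are NOT the measurements from the trial described above.
# Each group is constructed to have a mean of 3 metres.
positions = {
    "Control": [2.9, 3.0, 3.1, 3.0, 3.0],
    "Drug C": [2.0, 3.5, 3.0, 4.0, 2.5],
    "Drug D": [1.0, 5.0, 3.0, 4.0, 2.0],
}


def spread_summary(values):
    """Return (range, sample standard deviation) as two candidate measures of spread."""
    return max(values) - min(values), stdev(values)


for group, values in positions.items():
    value_range, sd = spread_summary(values)
    print(f"{group}: mean = {mean(values):.2f} m, "
          f"range = {value_range:.2f} m, standard deviation = {sd:.2f} m")
```

Because the centres are identical by construction, a measure of centre alone cannot tell these groups apart; both spread measures increase from the control group to Drug C to Drug D.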