Standard deviation and histogram

Students usually find the standard deviation a difficult concept. Luckily, understanding its definition is much less important than knowing its properties and having a feel for what its numerical value means.

If you have understood the 70-95-100 rule, you should be able to make a fairly accurate guess at the standard deviation of a batch of values from a histogram or dot plot (without doing any calculations). About 95% of the values should be within 2 standard deviations of the mean, so after dropping the top 2.5% and bottom 2.5% of the crosses (or area of the histogram), the remainder should span approximately 4 standard deviations. So dividing this range by 4 should approximate the standard deviation.
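The guess described above can also be written as a short calculation. The following is a minimal Python sketch, not part of the original text: the function name `guess_sd` and the simulated batch of 400 values are illustrative assumptions. It drops 2.5% of the values from each end, divides the remaining span by 4, and prints the result next to the calculated standard deviation.

```python
import random
import statistics


def guess_sd(values):
    """Rough guess at the standard deviation: span of the middle 95% divided by 4."""
    ordered = sorted(values)
    n = len(ordered)
    drop = n * 25 // 1000  # 2.5% of the values, rounded down, dropped from each end
    trimmed = ordered[drop:n - drop]
    return (trimmed[-1] - trimmed[0]) / 4


# Hypothetical batch: 400 simulated values, used only to illustrate the guess.
random.seed(1)
batch = [random.gauss(50, 10) for _ in range(400)]

guess = guess_sd(batch)
actual = statistics.stdev(batch)
print(f"guessed sd = {guess:.1f}, calculated sd = {actual:.1f}")
```

For a reasonably symmetric batch the two numbers should be close but not identical, since the guess is only a rule of thumb.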

Similarly, given the mean and standard deviation for a data set, you should be able to draw a rough sketch of a symmetric histogram with that mean and standard deviation. (It would be centred on the mean and 95% of the area would be within 2 standard deviations of this.)

Exercise capacity of the elderly

The table below was published in the Official Journal of the American College of Sports Medicine. It describes 'anthropometric data and maximal exercise capacity' of two groups of elderly men: 10 who continued to do regular exercise (athletes) and 12 who had not continued with exercise into old age (controls). All values are printed in the form (mean ± standard deviation).

                                    Athletes       Controls
                                    (N = 10)       (N = 12)
Age (yr)                            72.8 ± 2.9     74.9 ± 2.4
Height (m)                          1.79 ± 0.06    1.75 ± 0.06
Weight (kg)                         72.5 ± 8.7     78.4 ± 11
BSA (m²)                            1.90 ± 0.13    1.93 ± 0.13
BMI (kg per m²)                     22.6 ± 2.1     25.8 ± 3.5
Systolic BP at rest (mm Hg)         151 ± 26       148 ± 14
Diastolic BP at rest (mm Hg)        78 ± 7         81 ± 7
Max VO2 (L)                         2.91 ± 0.52    2.10 ± 0.29
Max VO2 (mL per kg per min)         41 ± 7         26 ± 5
Max exercise capacity (W)           254 ± 31       172 ± 19
Max exercise capacity (W per kg)    3.5 ± 0.4      2.2 ± 0.4
Max heart rate (bpm)                150 ± 9        153 ± 8
BSA, body surface area; BMI, body mass index; Max VO2, maximal oxygen uptake

The table summarises the main differences between the groups.

Guessed histograms

Using the 70-95-100 rule of thumb, we can sketch a rough histogram to match each mean and standard deviation: about 70% of each histogram's area should be within s of the mean, about 95% within 2s of the mean, and nearly all within 3s. These sketches can be used to compare the 'Athletes' and 'Controls' for any variable.
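To make the rule concrete, the short Python sketch below works out the three intervals for one variable, Max VO2 in mL per kg per min, using the means and standard deviations printed in the table. The function name `rule_intervals` and the interval labels are illustrative assumptions; the percentages are the rule of thumb, not exact values.

```python
def rule_intervals(mean, sd):
    """Intervals expected to hold about 70%, about 95% and nearly all of the values."""
    return {
        "about 70%": (mean - sd, mean + sd),
        "about 95%": (mean - 2 * sd, mean + 2 * sd),
        "nearly all": (mean - 3 * sd, mean + 3 * sd),
    }


# Max VO2 (mL per kg per min) as (mean, standard deviation), taken from the table.
groups = {"Athletes": (41, 7), "Controls": (26, 5)}

for name, (mean, sd) in groups.items():
    print(name)
    for label, (low, high) in rule_intervals(mean, sd).items():
        print(f"  {label}: {low} to {high}")
```

On these figures, about 95% of the athletes would be expected to lie between 27 and 55, and about 95% of the controls between 16 and 36, so the two intervals overlap only between 27 and 36.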

The actual histograms may not be symmetric and would certainly be more 'boxy' than these sketches, but this is the best we can do from the available information.

A few interpretations...

The data set is small, but from the means and standard deviations, we can either use the approximate histograms or use the 70-95-100 rule directly to infer that...

In later chapters of CAST, you will meet statistical methods that allow you to properly compare two groups such as these, but we mention here that these methods will be based on the group means and standard deviations.