Frequency tables

A computer is normally used to draw histograms, but it is instructive to consider how one might be drawn by hand. The data are first summarised in a frequency table.

The histogram classes are defined in the first two columns of the table. The first column describes the range of data values included in each class. The second column extends these ranges so that adjacent classes touch, giving the class boundaries that are actually used to draw the histogram. The final column shows the frequency of each class, that is, the number of values that fall within it.

Provided all classes have the same width, the heights of the histogram rectangles are given by the frequencies.

Ages of patients admitted to cardiac unit

For the hospital admission data described at the start of this chapter, the ages of the patients ranged from 46 to 90, so the data cover 45 years (including both extremes). We will use a class width of 10 years for our initial histogram. Starting the first class at 40 gives 6 classes (40-49 up to 90-99) and leads to the following frequency table.

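Care should be taken when translating the column of data values into classes. For the hospital admissions data, the values are ages, which are truncated to whole years, so a recorded '40' corresponds to any age from 40 up to, but not including, 41. The class of recorded values 40 to 49 therefore covers actual ages from 40 up to 50.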
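The class layout for the ages can be sketched in a few lines of Python. This is only an illustration: the list of ages is invented (only its extremes, 46 and 90, match the text), and the variable names are assumptions rather than anything taken from the original data set.

```python
import math

# Hypothetical ages in whole years; only the extremes 46 and 90 match the text.
ages = [46, 52, 58, 61, 63, 67, 70, 74, 75, 79, 81, 88, 90]

class_width = 10
start = 40  # lower boundary of the first class

# Ages are truncated, so a recorded 90 means "at least 90 but under 91".
# Each class therefore covers the interval [lower, lower + class_width).
n_classes = math.ceil((max(ages) + 1 - start) / class_width)

# Count how many values fall in each class.
frequencies = [0] * n_classes
for age in ages:
    frequencies[(age - start) // class_width] += 1

# Print recorded range, touching range, and frequency for each class.
for i, freq in enumerate(frequencies):
    lower = start + i * class_width
    upper = lower + class_width
    print(f"{lower}-{upper - 1}   [{lower}, {upper})   {freq}")
```

The first printed column plays the role of the recorded-value ranges, and the second gives the touching ranges that would be used as rectangle edges.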

In most other types of data set, the recorded values are rounded, rather than truncated, and the class boundaries should reflect this. Rounded values should never coincide with class boundaries.

Technical support time

For example, the Accounting Software Support data contain values that are rounded to whole numbers (61, 14, ...) so the value 61 could be anywhere in the range 60.5 to 61.5. A histogram might be drawn from the frequency table below.

The diagram below illustrates the problem with allowing data values to fall on class boundaries.

Use the mouse to identify the rectangle in the histogram corresponding to the value 20. You should observe that it has been included in the class (10 to 20), rather than the class (20 to 30). Although a convention can be defined so that boundary values are always placed consistently, the class boundaries should be shifted 0.5 to the left (to 9.5, 19.5, 29.5, and so on) to avoid the visual ambiguity.

Click the checkbox Shift Left under the histogram to redraw the histogram correctly.
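The boundary problem can also be seen in a small sketch. The helper `class_index` and the two boundary lists are hypothetical, written only for this illustration. With whole-number boundaries, the class that receives the value 20 depends on an arbitrary convention. With boundaries shifted by 0.5, no whole-number value can fall on a boundary, so the convention no longer matters.

```python
from bisect import bisect_left, bisect_right

def class_index(value, boundaries, right_closed):
    """Return the index of the class containing value.

    Class i runs from boundaries[i] to boundaries[i + 1]. With
    right_closed=True a value on a boundary joins the class below it;
    otherwise it joins the class above it.
    """
    if right_closed:
        return bisect_left(boundaries, value) - 1
    return bisect_right(boundaries, value) - 1

whole = [0, 10, 20, 30, 40]         # rounded values can land on these
shifted = [b - 0.5 for b in whole]  # -0.5, 9.5, 19.5, 29.5, 39.5

for name, bounds in (("whole", whole), ("shifted", shifted)):
    for right_closed in (True, False):
        i = class_index(20, bounds, right_closed)
        print(f"{name:8} right_closed={right_closed!s:5} "
              f"-> class ({bounds[i]} to {bounds[i + 1]})")
```

With the whole-number boundaries the two conventions disagree about the value 20, whereas with the shifted boundaries both place it in the class from 19.5 to 29.5.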

Drawing histograms with mixed class widths

When the classes do not all have the same width, the rectangle heights cannot be taken directly from the class frequencies. (Otherwise the visual impact of the wider classes would be over-emphasised.) Instead, the rectangle height for a class is its density,

$$\text{density} = \frac{\text{relative frequency}}{\text{class width}}$$

Since the area of a rectangle is given by its height (the density) times the class width, this definition ensures that area equals relative frequency.

If all classes have the same width, using frequency or density results in a histogram of the same shape, so this extra complication is only necessary when there are mixed class widths.
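A short sketch makes the calculation concrete. It takes density to be relative frequency divided by class width, which is what the area statement above implies. The class boundaries and frequencies are invented for illustration and do not come from any data set in this chapter.

```python
# Hypothetical class boundaries and frequencies with mixed class widths.
boundaries = [0, 10, 20, 40, 80]
frequencies = [5, 15, 20, 10]

total = sum(frequencies)
widths = [hi - lo for lo, hi in zip(boundaries, boundaries[1:])]

# Rectangle height: relative frequency divided by class width.
densities = [f / total / w for f, w in zip(frequencies, widths)]

# Rectangle area: height times width, which recovers the relative frequency.
areas = [d * w for d, w in zip(densities, widths)]

for lo, hi, f, d, a in zip(boundaries, boundaries[1:], frequencies, densities, areas):
    print(f"{lo:>2} to {hi:<2}  frequency={f:<2}  density={d:.4f}  area={a:.2f}")
```

In this made-up example the class from 20 to 40 has the largest frequency, but its density is lower than that of the class from 10 to 20 because its values are spread over twice the width. The areas add up to 1, as relative frequencies must.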

Lengths of wood chips

The histogram below shows the lengths of 50 wood chips sampled from a batch delivered to a paper mill. Use the pop-up menu to base the histogram on density. Observe that the shape of the histogram is unchanged, since all classes have the same width.