Flexibility in bin widths and bin starting positions
There is much more freedom in the choice of histogram bins than in the corresponding bins for stem and leaf plots. Indeed, any values can be used for the bin boundaries in a histogram.
We initially restrict attention to histograms where all bins are of the same width, but even then:
- The bin width can be any value.
- The bins can start at any value, not just multiples of the bin width.
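The two freedoms listed above can be sketched in a few lines of Python. This is a minimal illustration, assuming NumPy is available. The helper name `equal_width_edges` and the six data values are hypothetical and do not come from the text.

```python
import numpy as np

def equal_width_edges(values, width, start):
    """Return edges of equal-width bins that begin at `start` and cover all values."""
    values = np.asarray(values, dtype=float)
    if width <= 0:
        raise ValueError("width must be positive")
    if start > values.min():
        raise ValueError("start must not exceed the smallest value")
    # Number of bins needed so that the last edge lies beyond the largest value.
    n_bins = int(np.floor((values.max() - start) / width)) + 1
    return start + width * np.arange(n_bins + 1)

# Hypothetical data, used only to show the mechanics.
values = [12.5, 14.0, 17.2, 21.9, 23.4, 30.1]

# Same bin width, two different starting positions.
edges_a = equal_width_edges(values, width=5.0, start=10.0)  # start is a multiple of the width
edges_b = equal_width_edges(values, width=5.0, start=11.5)  # start is not a multiple of the width

counts_a, _ = np.histogram(values, bins=edges_a)
counts_b, _ = np.histogram(values, bins=edges_b)
print(edges_a, counts_a)
print(edges_b, counts_b)
```

Both sets of bins are valid for a histogram. Only the first would correspond to the kind of bins used in a stem and leaf plot.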
Bins should be chosen for smoothness
As in stem and leaf plots, we aim for smoothness in the outline of the histogram rectangles. The histogram below of the ages when students reached reading age 8 is reasonably smooth — we informally interpret the histogram in the same way as the smooth blue curve that has been superimposed 'by eye' on it.

Histogram bins should therefore be chosen to make the outline of the histogram as smooth as possible. Adjusting bin width is most important in attaining this goal.
- When the bins are too narrow, the outline becomes jagged.
- When the bins are too wide, the outline becomes blocky.
There is no substitute for trial and error in this process!
The histogram below shows the distribution of 200 values.
Use the buttons below the histogram to investigate the effect of narrowing and widening the histogram bins. Which histogram is smoothest (and therefore best)?
- When bin width is less than 4.0, the histogram starts to look jagged.
- When bin width is greater than 8.0, the histogram becomes blocky and shape information between 0 and 10 is lost by the grouping.
The general principle is to use the smallest bin width that is not jagged. This is a subjective judgment and any bin width between 4.0 and 8.0 would be acceptable, though a bin width at the lower end of this range is better.
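The trial-and-error process can be imitated numerically. The sketch below is only an illustration: the 200 values are simulated (the data behind the histogram in the text are not reproduced here), and `outline_turns` is a hypothetical, informal measure of jaggedness that counts how often the outline switches between rising and falling. It is not a standard statistic.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
# ASSUMPTION: simulated stand-in for the 200 values described in the text.
values = rng.normal(loc=30.0, scale=10.0, size=200)

def outline_turns(counts):
    """Count how often the histogram outline switches between rising and falling."""
    steps = np.sign(np.diff(counts))
    steps = steps[steps != 0]  # ignore flat steps between bins of equal height
    return int(np.sum(steps[1:] != steps[:-1]))

def summarise(values, width):
    """Bin the values with equal-width bins and report the counts and outline turns."""
    start = np.floor(values.min() / width) * width
    n_bins = int(np.floor((values.max() - start) / width)) + 1
    edges = start + width * np.arange(n_bins + 1)
    counts, _ = np.histogram(values, bins=edges)
    return counts, outline_turns(counts)

for width in (1.0, 5.0, 20.0):
    counts, turns = summarise(values, width)
    print(f"width={width:5.1f}  bins={len(counts):3d}  turns in outline={turns}")
```

Many turns suggest a jagged outline, while a handful of bins suggests a blocky one. The final choice still rests on looking at the histograms themselves.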
Warning about histograms of small data sets
Adjusting the bin width and the starting position for the first bin can give a surprising amount of variability in histogram shape for small data sets. As a result, you must be wary of over-interpreting features such as clusters or skewness in such histograms.
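This sensitivity is easy to reproduce. The sketch below bins the same 25 values in two ways. The values are made up for illustration and are not the maths test marks discussed next, and `bin_counts` is a hypothetical helper.

```python
import numpy as np

# Made-up values for illustration only; these are NOT the maths test marks.
marks = np.array([12, 15, 18, 19, 21, 22, 24, 25, 26, 28, 29, 31, 33,
                  41, 42, 44, 45, 46, 48, 49, 51, 52, 55, 58, 63])

def bin_counts(values, width, start):
    """Counts in equal-width bins that begin at `start` and cover all values."""
    n_bins = int(np.floor((values.max() - start) / width)) + 1
    edges = start + width * np.arange(n_bins + 1)
    counts, _ = np.histogram(values, bins=edges)
    return counts

two_clusters = bin_counts(marks, width=10, start=10)
one_peak = bin_counts(marks, width=15, start=5)

print(two_clusters)  # [4 7 2 7 4 1]: a dip in the middle suggests two clusters
print(one_peak)      # [4 9 7 5]: the same data now show a single peak
```

Neither set of bins is wrong. With only 25 values, the apparent gap depends on where the bin boundaries happen to fall.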
Maths test mark data
The histogram below shows the 25 maths test marks that we examined earlier.
Use the buttons under the histogram to adjust the bin width and to shift the histogram bins to the left or right. Note that the data appear to split into clusters in some of the histograms but not in others.
Are the clusters real, or are they just an artifact of our choice of bins? Without further supporting evidence, the clusters are not pronounced enough for us to conclude that the students must form two meaningful groups. However, they do give an indication of clustering that a good 'data detective' would investigate further.
Dot plots should be used in preference to histograms for small data sets. They show the size of the data set more clearly and hence give some warning about the risk of over-interpretation.
Histograms of larger data sets are more representative
For large data sets, changes to the bins have less effect on the histogram shape — we would sketch a similar smooth 'canopy' over most resulting histograms. Since they provide a much less cluttered display of the data than dot plots or stem and leaf plots, histograms are good summaries of the distribution of values in a large data set.
Finally, when different data sets are collected from the same underlying process, the shape of the histogram varies less from one data set to the next if the data sets are large.
The histogram below shows the distribution of 300 marks.
Click the button Sample under the histogram to observe the distribution of another 300 marks recorded from similar students. Repeat several times and observe that although details of the distribution's shape vary, the following features are visible in most sample histograms:
- The distribution is fairly smooth and unimodal (with a single peak).
- The distribution is centred around 40.
- There are occasional values almost as low as 0 and as high as 100.
- The distribution is skewed, with a longer 'tail' of higher values.
Use the buttons under the histogram to adjust the bin width and shift the bins left or right, and observe that the above features persist.
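Repeated sampling can also be simulated. The sketch below rests on a loudly stated assumption: the process behind the marks is unknown, so 100 times a Beta(2, 3) variable is used as a stand-in because it lies between 0 and 100, has mean 40 and has a longer upper tail. The function name `draw_marks` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def draw_marks(n=300):
    # ASSUMPTION: 100 * Beta(2, 3) stands in for the unknown process behind the marks.
    return 100.0 * rng.beta(2.0, 3.0, size=n)

edges = np.arange(0, 101, 10)  # ten bins of width 10 covering 0 to 100

# Draw several samples of 300 marks, as the Sample button does in the text.
samples = [draw_marks() for _ in range(4)]
for marks in samples:
    counts, _ = np.histogram(marks, bins=edges)
    peak = int(np.argmax(counts))
    print(counts, f"tallest bin starts at {edges[peak]}", f"mean={marks.mean():.1f}")
```

The individual counts differ from sample to sample, but with 300 values the overall outline should look broadly similar each time.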