Warning about over-interpreting histograms of small data sets
Adjusting the class width and the starting position for the first class can give a surprising amount of variability in histogram shape for small data sets. As a result, you must be extremely wary of over-interpreting features such as clusters or skewness in such histograms.
Indeed, it is probably better to avoid using histograms to display small data sets — stacked dot plots are far less likely to mislead you over minor features.
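To make the effect of the class choice concrete, here is a minimal sketch that counts the same values into classes of equal width while moving only the starting position of the first class. The function name `class_counts` and the ten values in `sample` are assumptions invented for illustration; they are not the technical support data discussed on this page.

```python
import math

def class_counts(values, width, start):
    """Count values in classes [start + k*width, start + (k+1)*width)."""
    if width <= 0:
        raise ValueError("width must be positive")
    if start > min(values):
        raise ValueError("start must not exceed the smallest value")
    n_classes = math.floor((max(values) - start) / width) + 1
    counts = [0] * n_classes
    for v in values:
        counts[math.floor((v - start) / width)] += 1
    return counts

# Hypothetical values, made up for this illustration only.
sample = [2, 3, 3, 4, 5, 7, 8, 9, 9, 10]

# Same data, same class width, three different starting positions.
for start in (0, 1, 2):
    print(f"start={start}: {class_counts(sample, width=3, start=start)}")
# start=0: [1, 4, 2, 3]
# start=1: [3, 2, 4, 1]
# start=2: [4, 2, 4]
```

With only ten values, shifting the classes by one unit changes the picture from a lopsided shape to two apparently equal clusters, which is the kind of variability the warning above describes.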
Accounting Software Support data
The histogram below shows the distribution of times taken to deal with 20 emailed technical support queries about an accounting program.
Use the buttons under the histogram to adjust the class width and to shift the histogram classes to the left or right. Note that the data appear to split into clusters in some of the histograms but not in others.
Are the clusters real, or are they just an artifact of our choice of classes?
Without further supporting evidence, the clusters are not pronounced enough for us to conclude that the emailed queries must form two meaningful groups. However, they do give an indication of clustering that a good 'data detective' would investigate further.
In this data set, the clusters did correspond largely to two different types of query (from new users and from experienced users).
Because the shape of a small data set's histogram is so dependent on the choice of classes,...
Dot plots or stem & leaf plots should be used in preference to histograms for small data sets.
Dot plots and stem & leaf plots show the size of the data set more clearly and hence give some warning about the risk of over-interpretation.
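As a rough illustration of why a dot plot gives this warning, the sketch below draws a text-only stacked dot plot with one symbol per observation, so the number of values is visible at a glance. The function name `stacked_dot_plot` and the values in `sample` are assumptions made up for this example, and the sketch assumes integer values; it is not the plotting method used on this page.

```python
from collections import Counter

def stacked_dot_plot(values):
    """Return a text dot plot: one 'o' per observation, stacked over its value.

    Assumes integer values; each column is one integer from min to max.
    """
    counts = Counter(values)
    lo, hi = min(values), max(values)
    rows = []
    # Build rows from the tallest stack down to the axis.
    for level in range(max(counts.values()), 0, -1):
        row = "".join("o" if counts[v] >= level else " "
                      for v in range(lo, hi + 1))
        rows.append(row.rstrip())
    rows.append("-" * (hi - lo + 1))  # simple axis line
    return "\n".join(rows)

# Hypothetical values, made up for this illustration only.
sample = [2, 3, 3, 4, 5, 7, 8, 9, 9, 10]
print(stacked_dot_plot(sample))
```

Because every observation is drawn individually, a reader can see that there are only ten values here and that the gap between the two groups is a single empty position.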