Warning about over-interpreting histograms of small data sets
Adjusting the class width and the starting position for the first class can give a surprising amount of variability in histogram shape for small data sets. As a result, you must be extremely wary of over-interpreting features such as clusters or skewness in such histograms.
Indeed, it is probably better to avoid using histograms to display small data sets — stacked dot plots are far less likely to mislead you over minor features.
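The effect of class width and starting position can be checked numerically. The sketch below is a minimal illustration, not part of the original page: the function name `class_counts` and the twelve values are hypothetical, invented only to show how the same small data set yields different class counts under different class choices. It assumes the starting position is no greater than the smallest value.

```python
import math

def class_counts(values, start, width):
    """Count values in classes [start + k*width, start + (k+1)*width).

    Assumes start <= min(values) and width > 0.
    """
    n_classes = math.floor((max(values) - start) / width) + 1
    counts = [0] * n_classes
    for v in values:
        counts[math.floor((v - start) / width)] += 1
    return counts

# Hypothetical values, invented for illustration (not the furniture-store data).
values = [12, 14, 15, 17, 18, 19, 21, 22, 24, 26, 27, 29]

# Same data, three different choices of starting position and class width.
for start, width in [(10, 5), (8, 5), (10, 4)]:
    counts = class_counts(values, start, width)
    print(f"start={start}, width={width}: {counts}")
    for k, c in enumerate(counts):
        lower = start + k * width
        # One asterisk per observation gives a rough sideways histogram.
        print(f"  {lower:>3} to <{lower + width:<3} {'*' * c}")
```

With these made-up values, shifting the first class from 10 to 8 changes the counts from a flat-looking 2, 4, 3, 3 to a single-peaked 1, 3, 4, 3, 1, and narrowing the width to 4 gives 1, 3, 3, 2, 3, which hints at a dip that the other two versions do not show.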
Inventory of furniture shops
The histogram below shows the distribution of the value of stock owned by 20 furniture stores in a city.
Use the buttons under the histogram to adjust the class width and to shift the histogram classes to the left or right. Note that the shape of the histogram varies considerably. The appearance of splitting into clusters is clearer in some of the histograms than in others.
Are the clusters real, or are they just an artifact of our choice of classes?
Without further supporting evidence, the clusters are not pronounced enough for us to conclude that the furniture stores must belong to two meaningful groups. However, they do give an indication of clustering that a good 'data detective' would investigate further.
Because the shape of a small data set's histogram is so dependent on the choice of classes, ...
Dot plots or stem & leaf plots should be used in preference to histograms for small data sets.
Dot plots and stem & leaf plots show the size of the data set more clearly and hence give some warning about the risk of over-interpretation.
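As a rough illustration of why a dot plot makes the size of the data set obvious, the sketch below draws a text-only stacked dot plot in which every observation appears as one dot. The function name `stacked_dot_plot` and the values are hypothetical, and the sketch assumes integer-valued data so that each value gets its own column.

```python
from collections import Counter

def stacked_dot_plot(values):
    """Return a text dot plot: one column per integer value, one 'o' per observation."""
    counts = Counter(values)
    lo, hi = min(values), max(values)
    tallest = max(counts.values())
    rows = []
    # Build rows from the top of the tallest stack down to the axis.
    for level in range(tallest, 0, -1):
        row = "".join("o" if counts[x] >= level else " " for x in range(lo, hi + 1))
        rows.append(row.rstrip())
    rows.append("-" * (hi - lo + 1))  # horizontal axis
    return "\n".join(rows)

# Hypothetical integer values, invented for illustration.
values = [12, 14, 14, 15, 17, 18, 18, 18, 21, 22]
print(stacked_dot_plot(values))
```

Because the number of dots equals the number of observations, a reader can see at a glance that only ten values are on display, which a histogram of the same data would not reveal so directly.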