What a box plot can show

Box plots are highly summarised descriptions of the distribution of values in a data set. They capture well the centre, spread and skewness of a distribution.

What a box plot cannot show

These are the most important characteristics of distributions, but some distributions have features that a box plot cannot show. In particular, a box plot cannot give any indication of clusters in a data set.

Before a box plot is used, a dot plot, stem-and-leaf plot or histogram must be examined to check that clusters do not exist.

If clustering is present, a box plot should not be used to summarise the data.
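The check described above can be sketched in a few lines of Python. This is only an illustration: the values are made up, the choice of 8 classes is arbitrary, and the helper names `histogram_counts` and `has_interior_gap` are hypothetical. An empty run of classes between occupied ones is a warning sign, not a formal test for clusters.

```python
def histogram_counts(values, bins):
    """Count how many values fall into each of `bins` equal-width classes."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; no histogram can be drawn")
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value is placed in the last class.
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    return counts


def has_interior_gap(counts):
    """Return True if an empty class lies between two occupied classes."""
    occupied = [i for i, c in enumerate(counts) if c > 0]
    return any(b - a > 1 for a, b in zip(occupied, occupied[1:]))


# Made-up values forming two groups (an assumption for illustration only).
values = [1.0, 1.5, 2.0, 2.5, 3.0, 7.0, 7.5, 8.0, 8.5, 9.0]
counts = histogram_counts(values, bins=8)

# Crude text histogram: one row of stars per class.
for i, c in enumerate(counts):
    print(f"class {i}: {'*' * c}")

if has_interior_gap(counts):
    print("Empty classes between occupied ones: look for clusters before using a box plot.")
```

The result depends on the number of classes chosen, so the plot itself should still be examined by eye.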

The diagram below illustrates the inability of box plots to show clusters.

When the data are separated into two clusters, the box plot gives no clear indication that there is a 'gap' in the middle of the distribution. (The closeness of the quartiles to the extremes, relative to the width of the central box, does give a hint that there could be clusters. However, clusters are an extremely important feature whose existence should be immediately obvious in any good graphical display.)
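The same point can be seen numerically from the five values that a basic box plot displays. The sketch below uses made-up data with an obvious gap. The helper `five_number_summary` is a hypothetical name, and the quartiles come from Python's `statistics.quantiles` with the "inclusive" method, which is only one of several common quartile conventions.

```python
import statistics


def five_number_summary(values):
    """Return (min, lower quartile, median, upper quartile, max)."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (min(values), q1, median, q3, max(values))


# Made-up data: two clusters with no values between 3.0 and 7.0.
clustered = [1.0, 1.5, 2.0, 2.5, 3.0, 7.0, 7.5, 8.0, 8.5, 9.0]

summary = five_number_summary(clustered)
minimum, q1, median, q3, maximum = summary

print("five-number summary:", summary)
print("box width (Q3 - Q1):", q3 - q1)
print("lower whisker length:", q1 - minimum)
print("upper whisker length:", maximum - q3)

# How many observations actually lie inside the central box?
inside_box = [v for v in clustered if q1 <= v <= q3]
print("values inside the box:", inside_box)
```

For these values the median falls inside the gap, where there are no observations at all, and the central box stretches right across it. The only trace of the clusters is that the whiskers are short compared with the box.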

Eruptions of Old Faithful Geyser

The Old Faithful Geyser in Yellowstone National Park in the USA erupts regularly. The dot plot below shows the durations of its eruptions in October 1980.

The dot plot clearly shows two clusters of eruption durations, so there seem to be two different types of eruption. However, the box plot gives no indication of clustering, and you would miss this important feature of the eruptions if you examined only a box plot of the data.