Five-number summary
The distribution of values in many data sets can be effectively summarised by a few numerical values called summary statistics. In this section we describe a graphical display that is based on five summary statistics called the five-number summary.
Box plot
The box plot of a batch of values displays the five values of this summary graphically: the minimum, the lower quartile, the median, the upper quartile and the maximum.
A box plot therefore splits the data set into four quarters with (approximately) equal numbers of values.
The diagram below shows a batch of values as a jittered dot plot and a box plot.
Each of the four regions of the box plot (the two whiskers and the two halves of the central box) contains about a quarter of the values in the batch.
In particular, about half of the values lie in the central box, between the lower and upper quartiles.
Details
We have skipped over some details in our description of the median and quartiles. You should usually rely on a computer to evaluate them, so a precise definition is not strictly necessary. The idea of splitting the data into 4 equal-sized groups allows you to interpret the shape of box plots.
However, for those interested, we fill in the details below.
Definition of median
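If there is an even number of values, any value between the middle two will split the batch into two equal halves. By convention, the median is taken to be midway between the middle two values.
If there is an odd number of values, the batch cannot be split into two equal halves, so the median is defined to be the middle value of the sorted batch.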
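The two cases can be sketched in a few lines of Python. This is only an illustration: the function name `median` and the small batches are made up for the example, and the even case assumes the common convention of taking the midpoint of the middle two values.

```python
def median(values):
    """Median of a non-empty batch of numbers (illustrative helper)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty batch is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the single middle value of the sorted batch.
        return ordered[mid]
    # Even count: assumed convention, midway between the middle two values.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([7, 1, 5, 3, 9]))  # odd count, middle value: 5
print(median([7, 1, 5, 3]))     # even count, midway between 3 and 5: 4.0
```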
Definition of quartiles
To define the lower quartile, we take the half of the values that lie below the median, m. The lower quartile is the median of these values. If there is an odd number of values in the data set, the median itself is excluded from this calculation. The upper quartile is similarly defined as the median of the upper half of the values.
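As a rough sketch of this definition, the five values can be computed as follows. The function name `five_number_summary` and the batch of values are hypothetical, chosen only for illustration. The code relies on `statistics.median` from the Python standard library, which returns the middle value for an odd count and the midpoint of the middle two for an even count.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    lower = ordered[:half]          # lower half of the sorted batch
    upper = ordered[half + n % 2:]  # upper half; skips the median when n is odd
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])


batch = [2, 4, 4, 5, 7, 8, 9, 11, 12]  # made-up batch with nine values
print(five_number_summary(batch))      # (2, 4.0, 7, 10.0, 12)
```

With nine values the median (7) is left out of both halves, so each quartile is the median of four values.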
Different authors give slightly different definitions for the upper and lower quartiles.
Provided you are consistent with your definitions, the box plots that you will draw should lead you to the same conclusions about the differences between groups.
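To see how much the choice of definition can matter, the sketch below compares two quartile conventions that Python's standard library offers through `statistics.quantiles`. These are simply two of the many definitions in use, not necessarily the ones any particular author prefers, and the batch of values is made up for the example.

```python
from statistics import quantiles

batch = [2, 4, 4, 5, 7, 8, 9, 11, 12]  # made-up batch with nine values

# Each call returns [lower quartile, median, upper quartile].
exclusive = quantiles(batch, n=4, method="exclusive")
inclusive = quantiles(batch, n=4, method="inclusive")

print("exclusive:", exclusive)
print("inclusive:", inclusive)
```

For this batch the two methods agree on the lower quartile (4.0) and the median (7.0) but give upper quartiles of 10.0 and 9.0 respectively. The difference is small relative to the spread of the data, so a box plot drawn with either definition would look much the same.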