Cumulative proportions and relationship to box plot
In any data set, approximately 1/4 of the values are lower than the lower quartile, 1/2 are lower than the median and 3/4 are lower than the upper quartile.
Value | Proportion below |
---|---|
Lower quartile | 0.25 |
Median | 0.5 |
Upper quartile | 0.75 |
For any other value, x, we can similarly find the proportion of values in the data set that are less than or equal to x. This is called the cumulative proportion for x.
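As a minimal sketch of this definition, the Python below computes the cumulative proportion for a value x and evaluates it at the three quartiles. The rainfall figures are made-up illustrative values, not the Dodoma data, and the function name `cumulative_proportion` is an assumption of this example. The quartiles come from the standard library's `statistics.quantiles`, whose default interpolation may differ slightly from the method used to draw a box plot.

```python
from statistics import quantiles


def cumulative_proportion(values, x):
    """Return the proportion of values that are less than or equal to x."""
    return sum(1 for v in values if v <= x) / len(values)


# Hypothetical annual rainfall totals in mm (illustrative only, NOT the Dodoma data).
rainfall = [412.5, 455.1, 478.3, 503.9, 521.4, 548.0, 566.7, 589.2,
            604.8, 631.5, 659.3, 684.6, 712.0, 745.9, 790.2, 843.7]

# The three cut points that split the data into four equal-sized groups.
lower_q, median, upper_q = quantiles(rainfall, n=4)

for name, x in [("lower quartile", lower_q),
                ("median", median),
                ("upper quartile", upper_q)]:
    print(f"{name}: x = {x:.1f}, "
          f"cumulative proportion = {cumulative_proportion(rainfall, x):.2f}")
```

For this small sample the three cumulative proportions come out at 0.25, 0.5 and 0.75; with other data sets they will usually only be approximately equal to these values.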
Annual rainfall in Dodoma
In the diagram above, the vertical red line represents the value 'x'. The annual rainfalls that were lower than this are highlighted and the equation shows how the cumulative proportion for this value is obtained.
Drag the vertical red line to change 'x'. Observe that when x is the lower quartile, median and upper quartile (shown in the box plot), the cumulative proportions are approximately 0.25, 0.5 and 0.75.
Proportion greater than x
The proportion of values greater than x is one minus its cumulative proportion:
Pr(values > x) = 1 - Pr(values <= x)
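The rule can be checked by counting the two groups separately, as in the sketch below. The sample values and the helper names `proportion_at_most` and `proportion_greater` are assumptions made for illustration only.

```python
import math


def proportion_at_most(values, x):
    """Cumulative proportion: share of values less than or equal to x."""
    return sum(1 for v in values if v <= x) / len(values)


def proportion_greater(values, x):
    """Share of values strictly greater than x."""
    return sum(1 for v in values if v > x) / len(values)


# Hypothetical sample (illustrative values only).
sample = [2, 5, 5, 7, 9, 12, 14, 20]
x = 7

below = proportion_at_most(sample, x)
above = proportion_greater(sample, x)

# Every value is either <= x or > x, so the two proportions must add to one.
print(f"Pr(values <= {x}) = {below}")
print(f"Pr(values > {x})  = {above}")
print("Sum equals 1:", math.isclose(below + above, 1.0))
```

Because each value falls into exactly one of the two groups, the same check holds for any data set and any x.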
Annual rainfall in Dodoma
In the diagram above, select larger values from the pop-up menu to highlight the values to the right of the red line. Observe that the proportion of highlighted values is one minus the proportion of smaller values to the left.
Equality
We have not distinguished between the proportion of values less than x and the proportion that are less than or equal to x. The two proportions are the same unless there are values at exactly x. For continuous measurements such as rainfall totals, there are rarely any values at exactly x, so
Pr(values < x) ≈ Pr(values <= x)
We therefore do not distinguish between the two terms in the rest of this section.
Note however that for discrete data (counts), it is important to be precise about the terms 'less than' and 'less than or equal to'.
Hurricanes in the North Atlantic
The table below shows the numbers of hurricanes in the North Atlantic each year from 1910 to 2009.
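Rows give the decade and columns give the final digit of the year.
Decade beginning | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
---|---|---|---|---|---|---|---|---|---|---|
1910 | 3 | 3 | 4 | 3 | 0 | 4 | 11 | 2 | 3 | 1 |
1920 | 4 | 4 | 2 | 3 | 5 | 1 | 8 | 4 | 4 | 3 |
1930 | 2 | 2 | 6 | 9 | 6 | 5 | 7 | 3 | 3 | 3 |
1940 | 4 | 4 | 4 | 5 | 7 | 5 | 3 | 5 | 6 | 7 |
1950 | 11 | 8 | 6 | 6 | 6 | 9 | 4 | 3 | 7 | 7 |
1960 | 4 | 8 | 3 | 7 | 6 | 4 | 7 | 6 | 5 | 12 |
1970 | 5 | 6 | 3 | 4 | 4 | 6 | 6 | 5 | 5 | 5 |
1980 | 9 | 7 | 2 | 3 | 5 | 7 | 4 | 3 | 5 | 7 |
1990 | 8 | 4 | 4 | 4 | 3 | 11 | 9 | 3 | 10 | 8 |
2000 | 8 | 9 | 4 | 7 | 9 | 15 | 5 | 5 | 8 | 3 |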
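These data are discrete, so several years can share exactly the same count.
Since there were 18 years with exactly 3 hurricanes, the proportion of years with fewer than 3 hurricanes differs from the proportion with 3 or fewer:
Pr(values < 3) = 8/100 = 0.08
Pr(values <= 3) = 26/100 = 0.26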
Note that the latter is the cumulative proportion for x = 3.
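The difference between 'less than' and 'less than or equal to' can be checked by counting directly, as in the sketch below. It assumes the table is read with one row per decade and one column per final digit of the year; the variable names are illustrative.

```python
# Hurricane counts keyed by the first year of each decade, in year order
# within the decade (assumed reading of the table: rows = decades).
hurricanes = {
    1910: [3, 3, 4, 3, 0, 4, 11, 2, 3, 1],
    1920: [4, 4, 2, 3, 5, 1, 8, 4, 4, 3],
    1930: [2, 2, 6, 9, 6, 5, 7, 3, 3, 3],
    1940: [4, 4, 4, 5, 7, 5, 3, 5, 6, 7],
    1950: [11, 8, 6, 6, 6, 9, 4, 3, 7, 7],
    1960: [4, 8, 3, 7, 6, 4, 7, 6, 5, 12],
    1970: [5, 6, 3, 4, 4, 6, 6, 5, 5, 5],
    1980: [9, 7, 2, 3, 5, 7, 4, 3, 5, 7],
    1990: [8, 4, 4, 4, 3, 11, 9, 3, 10, 8],
    2000: [8, 9, 4, 7, 9, 15, 5, 5, 8, 3],
}

# Flatten the decades into a single list of 100 yearly counts.
counts = [n for decade in hurricanes.values() for n in decade]
total = len(counts)

x = 3
exactly_x = sum(1 for n in counts if n == x)   # years with exactly 3 hurricanes
below_x = sum(1 for n in counts if n < x)      # years with fewer than 3
at_most_x = sum(1 for n in counts if n <= x)   # years with 3 or fewer

print(f"Years with exactly {x}: {exactly_x}")
print(f"Pr(values < {x})  = {below_x}/{total} = {below_x / total}")
print(f"Pr(values <= {x}) = {at_most_x}/{total} = {at_most_x / total}")
```

Under this reading the count of years with exactly 3 hurricanes is 18, which agrees with the text, and the two proportions differ by exactly 18/100.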