Normal curves, histograms and underlying populations

A normal distribution curve is really a histogram: it can be thought of as the histogram, drawn with very narrow bins, of an extremely large population of marks that underlies the available data.

For example, we might consider a single class to be a 'randomly selected' collection of students from a large 'population' of potential students from similar backgrounds who have been taught in the same way. The normal curve therefore approximates the histogram of marks that similar students taught in this way would obtain in general.
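The idea of an underlying population can be sketched with a small simulation. The mean of 65, the standard deviation of 10 and the bin from 60 to 62 are assumptions chosen only for illustration; they do not come from the marks discussed on this page. The sketch compares the height of one histogram bar (scaled so that area equals proportion) with the height of the normal curve at the middle of that bin, first for a class of 30 and then for a simulated 'population' of 200,000.

```python
import random
from statistics import NormalDist

# Hypothetical population parameters and bin (assumed for illustration only).
MEAN, SD = 65.0, 10.0
BIN_LO, BIN_WIDTH = 60.0, 2.0


def bar_height(sample, lo, width):
    """Height of a histogram bar scaled so that its area equals the
    proportion of the sample falling in the bin [lo, lo + width)."""
    in_bin = sum(1 for x in sample if lo <= x < lo + width)
    return in_bin / (len(sample) * width)


rng = random.Random(1)  # fixed seed so that a run is repeatable
small_class = [rng.gauss(MEAN, SD) for _ in range(30)]
large_population = [rng.gauss(MEAN, SD) for _ in range(200_000)]

# Height of the normal curve at the midpoint of the bin.
curve_height = NormalDist(MEAN, SD).pdf(BIN_LO + BIN_WIDTH / 2)

print("class of 30:      ", bar_height(small_class, BIN_LO, BIN_WIDTH))
print("large population: ", bar_height(large_population, BIN_LO, BIN_WIDTH))
print("normal curve:     ", curve_height)
```

With only 30 values the bar height can be far from the curve, but for the large simulated population it should sit very close to it.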

Area = proportion of values

A normal distribution curve therefore has the same properties as a histogram. In particular, the area under the curve above a particular range of values on the axis is equal to the proportion of values in that range.

When a normal distribution is used to describe an 'underlying population', we call the proportion of values in any range the probability of getting a value in that range.

This relationship between area and probability (or proportion of values) is central to the understanding of normal curves.
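As a rough check of the link between area and probability, the sketch below adds up the areas of many thin rectangles under a normal curve and compares the total with the probability given by the normal cumulative distribution function. The mean of 65, the standard deviation of 10 and the range 55 to 75 are assumed values, and the function names are made up for this example.

```python
import math

# Hypothetical normal distribution of marks (assumed values).
MEAN, SD = 65.0, 10.0


def normal_density(x, mean, sd):
    """Height of the normal curve at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def normal_cdf(x, mean, sd):
    """Probability of a value at or below x, using the error function."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))


def area_by_thin_bars(lo, hi, mean, sd, n_bars=10_000):
    """Total area of n_bars thin rectangles under the curve between lo and hi,
    each as tall as the curve at the middle of its base."""
    width = (hi - lo) / n_bars
    return sum(
        normal_density(lo + (i + 0.5) * width, mean, sd) * width
        for i in range(n_bars)
    )


approx = area_by_thin_bars(55, 75, MEAN, SD)
exact = normal_cdf(75, MEAN, SD) - normal_cdf(55, MEAN, SD)
print(round(approx, 4), round(exact, 4))
```

Both numbers come out at about 0.6827, the familiar proportion of values within one standard deviation of the mean.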

The diagram below shows the histogram of 30 marks.

In histograms, each value is represented by a rectangle of the same area. As a result, the proportion of values in any histogram bin is given by the area of the rectangle above that bin.
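The equal-area property can be checked with a few lines of code. The 30 marks below are invented for illustration and are not the marks shown in the diagram; the bins of width 10 starting at 40 are also an assumption. Each bar is given the height count / (n × bin width), so that every mark contributes an area of 1/n.

```python
# Hypothetical marks for 30 students (illustrative data, not from the diagram).
MARKS = [42, 48, 51, 53, 55, 56, 58, 59, 60, 61,
         62, 63, 64, 65, 65, 66, 67, 68, 69, 70,
         71, 72, 73, 74, 76, 78, 80, 82, 85, 91]

# Assumed bins: 40-49, 50-59, ..., 90-99.
BIN_START, BIN_WIDTH, N_BINS = 40, 10, 6


def bin_counts(values):
    """Number of values falling in each bin."""
    counts = [0] * N_BINS
    for v in values:
        counts[(v - BIN_START) // BIN_WIDTH] += 1
    return counts


def bin_areas(values):
    """Area of each bar when its height is count / (n * bin width)."""
    n = len(values)
    heights = [c / (n * BIN_WIDTH) for c in bin_counts(values)]
    return [h * BIN_WIDTH for h in heights]


areas = bin_areas(MARKS)

# 'Highlight' the bins 60-69 and 70-79 (positions 2 and 3 in the list).
selected_area = areas[2] + areas[3]
proportion = sum(1 for m in MARKS if 60 <= m < 80) / len(MARKS)

print(round(selected_area, 3), round(proportion, 3))
print(round(sum(areas), 3))
```

For these invented marks, 18 of the 30 values lie in the highlighted bins, so the highlighted area and the proportion are both 0.6, and the total area of all bars is 1.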

Drag with the mouse over some of the histogram bins to highlight them. The area above these bins is equal to the proportion of students with marks in the selected range.


The same holds for a normal curve. The normal distribution below approximates the distribution of the 30 marks in the previous histogram.

Again, drag with the mouse over the diagram to highlight an interval of values. The probability of getting a value in the interval is equal to the area under the curve above that interval.
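The same comparison can be sketched in code, assuming a normal curve whose mean and standard deviation are estimated from the data. The 30 marks are invented for illustration (they are not the marks in the diagram), and the interval from 60 to 80 is an arbitrary choice. The sketch uses `NormalDist` from Python's standard `statistics` module.

```python
from statistics import NormalDist

# Hypothetical marks for 30 students (illustrative data, not from the diagram).
MARKS = [42, 48, 51, 53, 55, 56, 58, 59, 60, 61,
         62, 63, 64, 65, 65, 66, 67, 68, 69, 70,
         71, 72, 73, 74, 76, 78, 80, 82, 85, 91]

# Normal curve with the same mean and standard deviation as the marks.
fitted = NormalDist.from_samples(MARKS)


def interval_probability(dist, lo, hi):
    """Area under the normal curve between lo and hi."""
    return dist.cdf(hi) - dist.cdf(lo)


probability = interval_probability(fitted, 60, 80)
observed = sum(1 for m in MARKS if 60 <= m < 80) / len(MARKS)

print("area under normal curve, 60 to 80:", round(probability, 3))
print("proportion of the 30 marks:       ", round(observed, 3))
```

The two values should be close but need not be identical: the area under the curve describes the underlying population, whereas the proportion describes only this one class of 30.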