Do the data come from a normal distribution?
Examining a histogram of the data may reveal skewness or show that the distribution separates into clusters. However, if the data set is large, a normal probability plot can indicate more subtle departures from a normal distribution.
Normal probability plot
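A normal probability plot is produced in the following way: the data values are sorted into increasing order, n quantiles q1 < q2 < ... < qn are found that are spaced out as you would expect from a normal distribution, and a cross is drawn for each sorted data value against the corresponding quantile.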
If the data set is from a normal distribution, the data should be spaced out in a similar way to the normal quantiles, so the crosses in the normal probability plot should lie close to a straight line.
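The construction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name normal_probability_points and the five data values are made up, and the choice of the quantile that cuts off area (i - 0.5)/n below it is one common convention rather than something specified above.

```python
from statistics import NormalDist


def normal_probability_points(data):
    """Return (quantile, value) pairs for a normal probability plot."""
    n = len(data)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Assumption: the i-th quantile cuts off area (i - 0.5)/n below it.
    quantiles = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    # Pair the smallest data value with the smallest quantile, and so on.
    return list(zip(quantiles, sorted(data)))


# Hypothetical data values, used only to show the output format.
points = normal_probability_points([4.1, 5.3, 2.8, 6.0, 4.7])
for q, x in points:
    print(f"{q:7.3f}  {x}")
```

Each printed pair gives the horizontal position (normal quantile) and vertical position (data value) of one cross in the plot.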
Examples
Interpreting the shape of a normal probability plot is most easily explained with some examples.
The values on the horizontal axis are q1 < q2 < ... < qn, which are spaced out as you would expect from a normal distribution. Those on the vertical axis are the actual data values.
Observe how the distribution of the data set affects the shape of the probability plot.
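The effect of skewness can also be explored by simulation. The sketch below is an assumption-laden illustration, not part of the original examples: it generates one roughly normal sample and two skewed samples, then compares the average slope of the plot over its lower half with that over its upper half. The helper names, the seed, the sample size of 200 and the (i - 0.5)/n quantile convention are all illustrative choices.

```python
import random
from statistics import NormalDist


def probability_plot_points(data):
    """Normal quantiles (horizontal) and sorted data (vertical)."""
    n = len(data)
    z = NormalDist()
    qs = [z.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return qs, sorted(data)


def half_slopes(data):
    """Average slope of the plot over its lower half and its upper half."""
    qs, xs = probability_plot_points(data)
    mid = len(xs) // 2
    lower = (xs[mid] - xs[0]) / (qs[mid] - qs[0])
    upper = (xs[-1] - xs[mid]) / (qs[-1] - qs[mid])
    return lower, upper


rng = random.Random(1)  # fixed seed so the run is repeatable
samples = {
    "normal": [rng.gauss(0, 1) for _ in range(200)],
    "right-skewed": [rng.expovariate(1.0) for _ in range(200)],
    "left-skewed": [-rng.expovariate(1.0) for _ in range(200)],
}
for name, data in samples.items():
    lower, upper = half_slopes(data)
    print(f"{name:13s} lower slope {lower:.2f}, upper slope {upper:.2f}")
```

With the data on the vertical axis, a right-skewed sample gives a steeper upper half, so the crosses curve upwards. A left-skewed sample gives a steeper lower half, so the crosses flatten off at the top.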
How much curvature is needed to suggest non-normality?
In the examples above, linearity or nonlinearity in the probability plot was clear. In practice, however, the randomness of real data means that the probability plot will not be exactly straight even when the data are sampled from a normal population.
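This sampling variability can be seen in a small simulation. The sketch below draws several samples of 20 values from a normal population and measures how straight each plot is by the correlation between the sorted data and the normal quantiles. The use of a correlation as a straightness measure, the population mean and standard deviation, the seed and the sample size are all assumptions made for illustration. A value of exactly 1 would correspond to crosses lying perfectly on a line.

```python
import random
from statistics import NormalDist


def plot_correlation(data):
    """Pearson correlation between sorted data and normal quantiles."""
    n = len(data)
    z = NormalDist()
    # Assumption: the i-th quantile cuts off area (i - 0.5)/n below it.
    qs = [z.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    xs = sorted(data)
    mq, mx = sum(qs) / n, sum(xs) / n
    sqx = sum((q - mq) * (x - mx) for q, x in zip(qs, xs))
    sqq = sum((q - mq) ** 2 for q in qs)
    sxx = sum((x - mx) ** 2 for x in xs)
    return sqx / (sqq * sxx) ** 0.5


rng = random.Random(2)  # fixed seed so the run is repeatable
# Five samples of size 20, all from the same normal population.
correlations = [
    plot_correlation([rng.gauss(50, 10) for _ in range(20)])
    for _ in range(5)
]
for r in correlations:
    print(f"{r:.3f}")
```

The correlations are close to 1 but differ from sample to sample, even though every sample comes from a normal population. The sketch only illustrates the variability; it does not provide a rule for deciding when a plot is too curved.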
How much curvature is needed to conclude that the underlying distribution is not normal?
This is a difficult question to answer and we will not address it here.