Do the data come from a normal distribution?

A histogram may indicate that a sample is unlikely to come from a normal distribution, but a normal probability plot can indicate more subtle departures from a normal distribution.

  1. Sort the data values into order, x(1) ≤ x(2) ≤ ... ≤ x(n) (tied values are allowed)
  2. Find ordered values that are spaced out as you would expect from a normal distribution, q1 < q2 < ... < qn. The quantiles of the standard normal distribution corresponding to probabilities 1/(n+1), 2/(n+1), ..., n/(n+1) are commonly used.
  3. Plot x(i) against qi
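
The three steps above can be sketched in Python as follows. This is only an illustration, not a definitive implementation: the function name normal_plot_points and the sample values are hypothetical, and the quantiles use the probabilities i/(n+1) mentioned in step 2.

```python
from statistics import NormalDist

def normal_plot_points(data):
    """Return (q_i, x_(i)) pairs for a normal probability plot."""
    # Step 1: sort the data values into order.
    xs = sorted(data)
    n = len(xs)
    # Step 2: standard normal quantiles for probabilities i/(n+1), i = 1..n.
    std_normal = NormalDist()  # mean 0, standard deviation 1
    qs = [std_normal.inv_cdf(i / (n + 1)) for i in range(1, n + 1)]
    # Step 3: pair each ordered data value with its quantile, ready for plotting.
    return list(zip(qs, xs))

# Hypothetical sample, used only to show the output format.
sample = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.6]
for q, x in normal_plot_points(sample):
    print(f"{q:7.3f}  {x:5.1f}")
```

Each printed row gives the coordinates of one point on the plot. Any plotting tool can then be used to draw the ordered data values against the quantiles.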

If the data set is from a normal distribution with mean μ and standard deviation σ, the data should be spaced out in a similar way to the normal quantiles, with x(i) approximately equal to μ + σ qi. The points in the normal probability plot should therefore lie close to a straight line.
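
As a rough illustration of this straight-line behaviour, the sketch below simulates data from a normal distribution and fits a least-squares line through the points of its normal probability plot. The simulated sample, its mean of 10 and standard deviation of 2, and the function name fit_line_to_normal_plot are all assumptions made for this example. The fitted line is a descriptive aid only, not a formal test of normality.

```python
from statistics import NormalDist, fmean

def fit_line_to_normal_plot(data):
    """Least-squares line through the points (q_i, x_(i)); returns (intercept, slope)."""
    xs = sorted(data)
    n = len(xs)
    std_normal = NormalDist()
    # Standard normal quantiles for probabilities i/(n+1).
    qs = [std_normal.inv_cdf(i / (n + 1)) for i in range(1, n + 1)]
    q_bar, x_bar = fmean(qs), fmean(xs)
    slope = (sum((q - q_bar) * (x - x_bar) for q, x in zip(qs, xs))
             / sum((q - q_bar) ** 2 for q in qs))
    return x_bar - slope * q_bar, slope

# Hypothetical data: 500 simulated values from a normal distribution (mean 10, sd 2).
sample = NormalDist(mu=10, sigma=2).samples(500, seed=1)
intercept, slope = fit_line_to_normal_plot(sample)
print(f"intercept = {intercept:.2f}, slope = {slope:.2f}")
```

For normal data such as these, the intercept and slope should come out close to the mean and standard deviation used in the simulation.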

How much curvature is needed to suggest non-normality?

This is a difficult question to answer and we will not address it here.