Test statistic and p-value

The statistical distance of an estimate to a hypothesised value of the underlying parameter is

z  =  (estimate - hypothesised value) / (standard error of the estimate)

This can be used as a test statistic. If the null hypothesis holds, it has approximately a standard normal distribution, that is, a normal distribution with zero mean and unit standard deviation.

The p-value for a test can be determined from the tail areas of this standard normal distribution.

In the above diagram, the null hypothesis is consistent with estimates close to the hypothesised value, and the alternative hypothesis is suggested by estimates that are either much bigger or much smaller than this value (called a two-tailed test). For a two-tailed test, the p-value is the red tail area and can be looked up either in normal tables or in Excel.
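As an alternative to tables or Excel, the tail areas can be computed directly. The following is a minimal sketch in Python using only the standard library; the function names statistical_distance and p_value, and the labels "less", "greater" and "two-sided", are chosen here for illustration and do not come from this e-book.

```python
from statistics import NormalDist


def statistical_distance(estimate, hypothesised, std_error):
    """z = (estimate - hypothesised value) / standard error."""
    return (estimate - hypothesised) / std_error


def p_value(z, alternative="two-sided"):
    """Tail area of the standard normal distribution for a z statistic.

    The 'alternative' labels are an assumption of this sketch:
      "less"      -> area below z
      "greater"   -> area above z
      "two-sided" -> area in both tails beyond |z|
    """
    std_normal = NormalDist(mu=0, sigma=1)
    if alternative == "less":
        return std_normal.cdf(z)
    if alternative == "greater":
        return 1 - std_normal.cdf(z)
    if alternative == "two-sided":
        return 2 * std_normal.cdf(-abs(z))
    raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")


# A z statistic of 1.96 gives a two-tailed p-value close to 0.05.
print(round(p_value(1.96), 3))
```

For a two-tailed test the area is doubled because estimates far from the hypothesised value in either direction count as evidence against the null hypothesis.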

Refinements

If the standard error of the estimate must itself be estimated from the sample data, the above test statistic is only approximately normally distributed. In some tests that we will describe in later sections, the test statistic has a t distribution (which has slightly longer tails than the standard normal distribution). This refinement will be described fully in the next section.

Home-based businesses owned by women

The diagram below repeats the simulation that we used earlier to test whether the proportion of home-based businesses owned by women was less than 0.5:

H0 :   π  =  0.5
HA :   π  <  0.5

The proportion owned by women in a sample of n = 899 businesses was 369/899 = 0.410.

Again click Accumulate, then hold down the Simulate button until about 100 samples of 899 businesses have been generated, each with a probability of 0.5 that a business is owned by a woman.

Select Statistical distance from 0.5 from the top pop-up menu to translate the proportions of female owners in the simulated samples into z-scores. Observe that most of these 'statistical distances from 0.5' are between -1 and +1.
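The simulation above can be imitated outside the diagram. The sketch below is an illustration only, not the code behind the diagram: it generates 100 samples of 899 businesses with a population probability of 0.5, converts each sample proportion into a statistical distance from 0.5, and counts how many lie between -1 and +1. The seed value is arbitrary and is set only so that the run can be repeated.

```python
import random
from math import sqrt

random.seed(1)  # arbitrary seed, an assumption of this sketch

n = 899          # businesses per sample
pi_0 = 0.5       # population probability under the null hypothesis
n_samples = 100  # number of simulated samples

# Standard error of a sample proportion when the null hypothesis holds
std_error = sqrt(pi_0 * (1 - pi_0) / n)

z_scores = []
for _ in range(n_samples):
    # Number of simulated businesses owned by women in one sample
    owned_by_women = sum(random.random() < pi_0 for _ in range(n))
    proportion = owned_by_women / n
    z_scores.append((proportion - pi_0) / std_error)

within_one = sum(-1 <= z <= 1 for z in z_scores)
print(f"{within_one} of {n_samples} statistical distances are between -1 and +1")
print(f"smallest = {min(z_scores):.2f}, largest = {max(z_scores):.2f}")
```

Under the standard normal distribution about 68% of values lie between -1 and +1, so a count somewhere near 68 out of 100 would be expected, with some variation from run to run.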

The observed proportion owned by females was 0.410, corresponding to a statistical distance of z = -5.37, an unlikely value if the population proportion were 0.5.

Select Normal distribution from the lower pop-up menu to show the theoretical distribution of the z-scores. The p-value for the test is the tail area of this normal(0, 1) distribution below -5.37 and is virtually zero, so we again conclude that:

It is almost certain that π is less than 0.5.
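The arithmetic behind this test can be checked with a short calculation. This sketch assumes that the standard error is evaluated at the hypothesised value of 0.5, which agrees with the quoted statistical distance of -5.37; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 899     # sample size
x = 369     # businesses in the sample owned by women
pi_0 = 0.5  # hypothesised proportion under H0

p_hat = x / n                            # observed sample proportion
std_error = sqrt(pi_0 * (1 - pi_0) / n)  # standard error under H0
z = (p_hat - pi_0) / std_error           # statistical distance from 0.5

# One-tailed test (HA: pi < 0.5), so the p-value is the lower tail area
p_value = NormalDist(mu=0, sigma=1).cdf(z)

print(f"sample proportion = {p_hat:.3f}")
print(f"z = {z:.2f}")
print(f"p-value = {p_value:.1e}")
```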


Relation to previous test

The p-value obtained in this way using a 'statistical distance' as the test statistic is identical to the p-value that was found from a normal approximation to the number of successes without a continuity correction. (The p-value is slightly different if a continuity correction is used.)
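The equivalence can be verified numerically. The sketch below computes the p-value in three ways: from the statistical distance, from a normal approximation to the number of successes, and from the same approximation with a continuity correction. It assumes that the continuity correction adds 0.5 to the observed count for this lower-tail test.

```python
from math import sqrt
from statistics import NormalDist

n, x, pi_0 = 899, 369, 0.5

# 1. Statistical distance of the sample proportion from 0.5
z = (x / n - pi_0) / sqrt(pi_0 * (1 - pi_0) / n)
p_from_z = NormalDist().cdf(z)

# 2. Normal approximation to the number of successes
count_dist = NormalDist(mu=n * pi_0, sigma=sqrt(n * pi_0 * (1 - pi_0)))
p_from_count = count_dist.cdf(x)

# 3. The same approximation with a continuity correction
p_with_correction = count_dist.cdf(x + 0.5)

print(f"from z-score:               {p_from_z:.3e}")
print(f"from count, no correction:  {p_from_count:.3e}")
print(f"from count, with correction: {p_with_correction:.3e}")
```

The first two values agree because dividing the count and its standard deviation by n changes the scale but not the standardised value.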

The use of 'statistical distances' does not add anything when testing a sample proportion, but it is a general method that will be used to obtain test statistics in many other situations later in this e-book.