Test statistic

When testing the value of a probability, π, the obvious statistic to use from our random sample is the corresponding sample proportion, p.

It is, however, more convenient to use the number of successes, x, rather than p, since we know that X has a binomial distribution with parameters n (the sample size) and π.

X ~ binomial(n, π)

When we know the distribution of the test statistic (at least after the null hypothesis has fixed the value of the parameters of interest), it becomes much easier to obtain the p-value for the test.
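As a minimal sketch of what "knowing the distribution" means in practice, the Python below evaluates the binomial probabilities once the null hypothesis has fixed π. The function name binomial_pmf and the values n = 10 and π = 0.5 are assumptions chosen purely for illustration; they do not come from the text.

```python
from math import comb


def binomial_pmf(k, n, pi):
    """Return P(X = k) when X ~ binomial(n, pi)."""
    return comb(n, k) * pi**k * (1 - pi) ** (n - k)


# Hypothetical illustration: n = 10 trials and a null value of pi = 0.5.
n, pi0 = 10, 0.5
probabilities = [binomial_pmf(k, n, pi0) for k in range(n + 1)]

print(probabilities[5])    # probability of exactly 5 successes
print(sum(probabilities))  # the probabilities over k = 0..n add up to 1
```

Every possible value of X has a known probability under the null hypothesis, so any tail probability is a finite sum of these terms.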

P-value

As in all other tests, the p-value is the probability of getting data at least as 'extreme' as those observed if the null hypothesis is true. Depending on the null and alternative hypotheses, the p-value is therefore the probability that X is at least as big (or sometimes at least as small) as the recorded value.

Since we know the binomial distribution of X when the null hypothesis holds, the p-value can therefore be obtained by adding binomial probabilities.

The p-value is a sum of binomial probabilities

Note that the p-value can be obtained exactly without need for simulations or randomisation.
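Written out for a one-tailed test, the sums take the following form. Here x is the recorded number of successes and π₀ (written \pi_0) denotes the value of π specified by the null hypothesis; the symbol π₀ is notation introduced here for illustration and is not used elsewhere in the text.

```latex
% Alternative hypothesis of the form  \pi < \pi_0  (small values of X are extreme)
\text{p-value} = P(X \le x) = \sum_{k=0}^{x} \binom{n}{k}\, \pi_0^{\,k}\, (1 - \pi_0)^{\,n-k}

% Alternative hypothesis of the form  \pi > \pi_0  (large values of X are extreme)
\text{p-value} = P(X \ge x) = \sum_{k=x}^{n} \binom{n}{k}\, \pi_0^{\,k}\, (1 - \pi_0)^{\,n-k}
```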

Australia Post example

A newspaper trying to assess Australia Post's assertion that 96 percent of letters arrive 'on time' posted 59 letters and observed that only 52 arrived on time.

H0:   π = 0.96

HA:   π < 0.96

In the diagram below, click Accumulate then hold down Simulate until about 100 samples of 59 letters have been generated. The proportion of these simulated samples in which 52 or fewer letters arrived on time is an approximation to the p-value for the test.
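For readers without access to the diagram, the same simulation can be sketched in Python. This is only an approximation of what the diagram does, written under stated assumptions: the function name simulate_p_value, the seed and the use of Python's random module are all choices made for this illustration.

```python
import random


def simulate_p_value(n, pi0, observed, num_samples, seed=None):
    """Estimate P(X <= observed) under H0 by simulating binomial samples."""
    rng = random.Random(seed)
    extreme = 0
    for _ in range(num_samples):
        # One simulated sample: each of the n letters arrives on time
        # independently with probability pi0.
        on_time = sum(rng.random() < pi0 for _ in range(n))
        if on_time <= observed:
            extreme += 1
    return extreme / num_samples


# About 100 samples of 59 letters, as in the text (the seed is arbitrary).
estimate = simulate_p_value(n=59, pi0=0.96, observed=52, num_samples=100, seed=1)
print(estimate)
```

With only 100 samples the estimate is coarse, since it can only be a multiple of 0.01; increasing num_samples gives a steadier approximation.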

Since we know that the number arriving on time has a binomial(59, 0.96) distribution when the null hypothesis holds, the simulation is unnecessary. Select Binomial distribution from the pop-up menu. This binomial distribution is displayed, and the probability of 52 or fewer letters being delivered on time, P(X ≤ 52), is shown to be 0.009. This is the p-value for the test.
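The exact calculation can also be checked directly by adding the binomial probabilities for 0, 1, ..., 52 letters arriving on time. The sketch below uses only Python's standard library; the variable names are chosen for this illustration.

```python
from math import comb

n = 59         # letters posted
pi0 = 0.96     # probability of arriving on time under H0
observed = 52  # letters that actually arrived on time

# P(X <= 52) when X ~ binomial(59, 0.96): add the probabilities of
# 0, 1, ..., 52 letters arriving on time.
p_value = sum(
    comb(n, k) * pi0**k * (1 - pi0) ** (n - k)
    for k in range(observed + 1)
)

print(round(p_value, 3))  # 0.009
```

The result agrees with the value of 0.009 quoted above, and no random sampling is involved.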

Since the p-value is so small, there would have been very little chance of as few as 52 of the 59 letters arriving on time if Australia Post's assertion had been correct. We can therefore conclude that there is strong evidence against their assertion. Note that this conclusion was reached without any simulations.