We now give a second example of the use of pivots to find confidence intervals, but one that also involves an approximation.

Pivot for binomial distribution

Consider a single value, \(X\), from a \(\BinomDistn(n, \pi)\) distribution. The maximum likelihood estimator of \(\pi\) is the observed proportion of successes.

\[ \hat{\pi} \;\;=\;\; \frac X n \]

We know that the binomial variable \(X\) has mean and variance

\[ E[X] = n\pi \spaced{and} \Var(X) = n\pi(1 - \pi) \]

Standardising \(X\) (subtracting its mean and dividing by its standard deviation) gives a random variable

\[ Y \;\;=\;\; \frac{X - E[X]}{\sqrt{\Var(X)}} \;\;=\;\; \frac{X - n\pi}{\sqrt{n \pi(1 - \pi)}} \]

that has mean zero and variance one. We showed earlier that the shape of the binomial distribution approaches that of a normal distribution as the sample size, \(n\), increases. Therefore, provided \(n\) is large enough,

\[ \frac{X - n\pi}{\sqrt{n \pi(1 - \pi)}} \;\;\underset {\text{approx}}{\sim} \;\; \NormalDistn(0, 1) \]
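The quality of this approximation can be checked numerically. The following is a minimal sketch, not part of the original text: the function name `pivot_coverage` and the trial values of \(n\) and \(\pi\) are chosen only for illustration. It sums exact binomial probabilities to find the probability that the standardised variable lies between -1.96 and 1.96, which would be close to 0.95 if the normal approximation were exact.

```python
from math import comb, sqrt

def pivot_coverage(n, pi, z=1.96):
    """Exact P(-z < (X - n*pi) / sqrt(n*pi*(1-pi)) < z) for X ~ Binomial(n, pi)."""
    mean = n * pi
    sd = sqrt(n * pi * (1 - pi))
    total = 0.0
    for x in range(n + 1):
        # Keep only the outcomes whose standardised value lies inside (-z, z).
        if abs(x - mean) < z * sd:
            total += comb(n, x) * pi**x * (1 - pi)**(n - x)
    return total

# Illustrative values only: pi = 0.3 with increasing sample sizes.
for n in (10, 50, 500):
    print(n, round(pivot_coverage(n, 0.3), 4))
```

Because \(X\) is discrete, the probability moves in small jumps as \(n\) changes rather than converging smoothly to 0.95.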

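This quantity depends only on the data and \(\pi\), and its approximate distribution does not involve \(\pi\), so it can be treated as a pivot to find a confidence interval for \(\pi\).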

Confidence interval

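A standard normal variable lies between -1.96 and 1.96 with probability 0.95. An approximate 95% confidence interval for \(\pi\) can therefore be found by replacing \(X\) with its observed value, \(x\), and solving the following inequalities for \(\pi\).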

\[ -1.96 \;\;\lt\;\; \frac{x - n\pi}{\sqrt{n \pi(1 - \pi)}} \;\;\lt\;\; 1.96 \]
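As a sketch of the general algebra, write \(z = 1.96\) (a shorthand introduced here, not used elsewhere in this section). Both inequalities hold exactly when the square of the middle expression is less than \(z^2\), so

\[ (x - n\pi)^2 \;\;\lt\;\; z^2 n \pi(1 - \pi) \]

\[ (n^2 + n z^2) \pi^2 - (2xn + n z^2) \pi + x^2 \;\;\lt\;\; 0 \]

Setting the left-hand side to zero and solving for \(\pi\) gives the two limits of the interval, which is often called the Wilson score interval.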

Example

A retail clothing outlet has collected the following data from random sampling of invoices for T-shirts over the past month.

               Small   Medium   Large   XL   Total
North Island     2       15       24     9     50
South Island     4       17       23     6     50

Find a 95% confidence interval for the probability that a T-shirt purchased from one of the store's North Island shops is Small.

In this example, \(x = 2\) and \(n = 50\), so the confidence interval is found by solving

\[ -1.96 \;\;\lt\;\; \frac{2 - 50\pi}{\sqrt{50 \pi(1 - \pi)}} \;\;\lt\;\; 1.96 \]

Squaring and collecting terms in \(\pi\) gives

\[ \frac{(2 - 50\pi)^2}{50\pi(1 - \pi)} \;\;\lt\;\; 1.96^2 \]

\[ (2 - 50\pi)^2 \;\;\lt\;\; 1.96^2 \times 50 \pi(1 - \pi) \]

\[ (50^2 + 50 \times 1.96^2) \pi^2 - (2 \times 2 \times 50 + 50 \times 1.96^2) \pi + 2^2 \;\;\lt\;\; 0 \]

\[ 2692.08 \pi^2 - 392.08 \pi + 4 \;\;\lt\;\; 0 \]

The inequality holds for values of \(\pi\) between the two roots of this quadratic, so solving it gives the 95% confidence interval

\[ 0.011 \;\;\lt\;\; \pi \;\;\lt\;\; 0.135 \]
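The arithmetic can be reproduced with a few lines of Python. This is a minimal sketch rather than a definitive implementation: the function name `pivot_interval` and its parameter `z` are assumptions made for illustration. It applies the quadratic formula to the inequality above, with general \(x\) and \(n\) in place of 2 and 50.

```python
from math import sqrt

def pivot_interval(x, n, z=1.96):
    """Roots of (n^2 + n z^2) pi^2 - (2 x n + n z^2) pi + x^2 = 0."""
    a = n**2 + n * z**2            # coefficient of pi^2
    b = -(2 * x * n + n * z**2)    # coefficient of pi
    c = x**2                       # constant term
    root = sqrt(b**2 - 4 * a * c)
    return (-b - root) / (2 * a), (-b + root) / (2 * a)

# North Island, Small: x = 2 successes out of n = 50 invoices.
lower, upper = pivot_interval(2, 50)
print(f"{lower:.3f} < pi < {upper:.3f}")   # 0.011 < pi < 0.135
```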

Note that the conventional Wald-type 95% confidence interval in this example is

\[ \hat{\pi} - 1.96 \sqrt{\frac{\hat{\pi}(1 - \hat{\pi})} n} \;\;\lt\;\; \pi \;\;\lt\;\; \hat{\pi} + 1.96 \sqrt{\frac{\hat{\pi}(1 - \hat{\pi})} n} \]

which evaluates to

\[ -0.014 \;\;\lt\;\; \pi \;\;\lt\;\; 0.094 \]
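For comparison, the Wald-type limits can be checked in the same way. Again this is only a sketch, and the function name `wald_interval` is an assumption made for illustration.

```python
from math import sqrt

def wald_interval(x, n, z=1.96):
    """Wald-type interval: p_hat -/+ z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = x / n
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

# North Island, Small: x = 2 successes out of n = 50 invoices.
lower, upper = wald_interval(2, 50)
print(f"{lower:.3f} < pi < {upper:.3f}")   # -0.014 < pi < 0.094
```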

Its lower limit is negative, which is impossible for a probability, so the confidence interval found by solving the pivot inequalities exactly is better in this example.

(When the sample size and proportion of successes are larger, there is less difference between the two types of confidence interval.)
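The following sketch illustrates this by computing both intervals for three cases: the Small count for the North Island (2 out of 50), the Large count for the South Island (23 out of 50), and a hypothetical larger sample of 230 out of 500 that is not in the data above. The function names are assumptions made for illustration.

```python
from math import sqrt

def pivot_interval(x, n, z=1.96):
    """Roots of (n^2 + n z^2) pi^2 - (2 x n + n z^2) pi + x^2 = 0."""
    a, b, c = n**2 + n * z**2, -(2 * x * n + n * z**2), x**2
    root = sqrt(b**2 - 4 * a * c)
    return (-b - root) / (2 * a), (-b + root) / (2 * a)

def wald_interval(x, n, z=1.96):
    """Wald-type interval: p_hat -/+ z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = x / n
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

# (2, 50) and (23, 50) come from the table; (230, 500) is hypothetical.
cases = [(2, 50), (23, 50), (230, 500)]
for x, n in cases:
    p_lo, p_hi = pivot_interval(x, n)
    w_lo, w_hi = wald_interval(x, n)
    print(f"x={x:3d}, n={n:3d}  pivot: ({p_lo:.3f}, {p_hi:.3f})  "
          f"Wald: ({w_lo:.3f}, {w_hi:.3f})")
```

The limits of the two intervals lie much closer together in the second and third cases than in the first.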