Consider a single value, \(X\), from a \(\BinomDistn(n, \pi)\) distribution. The MLE of \(\pi\) is

\[ \hat{\pi} \;\;=\;\; \frac X n \]

The binomial variable \(X\) has mean and variance

\[ E[X] = n\pi \spaced{and} \Var(X) = n\pi(1 - \pi) \]

Standardising \(X\) (subtracting its mean and dividing by its standard deviation) gives a variable whose distribution approaches a standard normal distribution as the sample size, \(n\), increases. Therefore, for large \(n\),

\[ \frac{X - n\pi}{\sqrt{n \pi(1 - \pi)}} \;\;\underset {\text{approx}}{\sim} \;\; \NormalDistn(0, 1) \]

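Since its approximate distribution does not depend on \(\pi\), this quantity can be used as a pivot. Replacing \(X\) by its observed value, \(x\), an approximate 95% confidence interval consists of the values of \(\pi\) that satisfy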

\[ -1.96 \;\;\lt\;\; \frac{x - n\pi}{\sqrt{n \pi(1 - \pi)}} \;\;\lt\;\; 1.96 \]
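
As a sketch of how this inequality can be solved, write \(z = 1.96\) (a symbol introduced here for brevity). Squaring both sides and collecting terms in \(\pi\) gives a quadratic inequality,

\[ (n + z^2)\,\pi^2 \;-\; (2x + z^2)\,\pi \;+\; \frac{x^2}{n} \;\;\lt\;\; 0 \]

so the end-points of the confidence interval are the two roots of the quadratic,

\[ \pi \;\;=\;\; \frac{2x + z^2 \;\pm\; z\sqrt{z^2 + 4x\left(1 - \frac{x}{n}\right)}}{2(n + z^2)} \]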

Question

A retail clothing outlet has collected the following data from random sampling of invoices for T-shirts over the past month.

               Small   Medium   Large   XL   Total
North Island       2       15      24    9      50
South Island       4       17      23    6      50

Find a 95% confidence interval for the probability that a T-shirt purchased from one of the store's North Island shops is Small.

(Solved in full version)

Solving the pivot inequality with \(x = 2\) and \(n = 50\) gives the 95% confidence interval

\[ 0.011 \;\;\lt\;\; \pi \;\;\lt\;\; 0.135 \]

whereas the conventional Wald-type 95% confidence interval, \(\hat{\pi} \pm 1.96\sqrt{\hat{\pi}(1 - \hat{\pi})/n}\), would be

\[ -0.014 \;\;\lt\;\; \pi \;\;\lt\;\; 0.094 \]

The Wald-type interval includes impossible negative values for \(\pi\), whereas the interval found from the pivot lies entirely between 0 and 1, so the pivot-based interval is preferable here.
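
The two intervals can be checked numerically. The following is a minimal Python sketch, not part of the original solution: the function names `pivot_interval` and `wald_interval` are hypothetical, and the pivot interval is found by squaring the inequality for the pivot and solving the resulting quadratic in \(\pi\).

```python
import math


def pivot_interval(x, n, z=1.96):
    """Approximate CI for pi from the normal pivot (hypothetical helper).

    Squaring |x - n*pi| / sqrt(n*pi*(1 - pi)) < z gives the quadratic
    (n + z^2) pi^2 - (2x + z^2) pi + x^2/n < 0, whose two roots are the
    end-points of the interval.
    """
    root = z * math.sqrt(z**2 + 4 * x * (1 - x / n))
    denom = 2 * (n + z**2)
    return (2 * x + z**2 - root) / denom, (2 * x + z**2 + root) / denom


def wald_interval(x, n, z=1.96):
    """Wald-type CI: estimate plus or minus z standard errors (hypothetical helper)."""
    p_hat = x / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# North Island data from the question: 2 Small T-shirts out of 50 invoices.
pivot_lo, pivot_hi = pivot_interval(2, 50)
wald_lo, wald_hi = wald_interval(2, 50)

print(f"pivot: {pivot_lo:.3f} to {pivot_hi:.3f}")  # pivot: 0.011 to 0.135
print(f"Wald:  {wald_lo:.3f} to {wald_hi:.3f}")    # Wald:  -0.014 to 0.094
```

The default `z=1.96` corresponds to a 95% confidence level; a different level would need the matching standard normal quantile.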