We now give a second example of using a pivot to find a confidence interval, this time one that relies on an approximation.
Pivot for binomial distribution
Consider a single value, \(X\), from a \(\BinomDistn(n, \pi)\) distribution. The maximum likelihood estimator of \(\pi\) is the observed proportion of successes.
\[ \hat{\pi} \;\;=\;\; \frac X n \]We know that the binomial variable \(X\) has mean and variance
\[ E[X] = n\pi \spaced{and} \Var(X) = n\pi(1 - \pi) \]Standardising \(X\) (subtracting its mean and dividing by its standard deviation) gives a random variable
\[ Y \;\;=\;\; \frac{X - E[X]}{\sqrt{\Var(X)}} \;\;=\;\; \frac{X - n\pi}{\sqrt{n \pi(1 - \pi)}} \]that has mean zero and variance one. We also showed before that the shape of the binomial distribution approaches that of a normal distribution as the sample size, \(n\), increases. Therefore, when \(n\) is large,
\[ \frac{X - n\pi}{\sqrt{n \pi(1 - \pi)}} \;\;\underset {\text{approx}}{\sim} \;\; \NormalDistn(0, 1) \]This can be treated as a pivot to find a confidence interval.
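How good the normal approximation is can be checked numerically. The following is a minimal sketch, not part of the original derivation: it uses the exact binomial probabilities to find the chance that the standardised variable lies between -1.96 and 1.96, which would be about 0.95 if the approximation were perfect. The function name `pivot_coverage` and the value \(\pi = 0.3\) are chosen only for illustration.

```python
from math import comb, sqrt

def pivot_coverage(n, pi, z=1.96):
    """Exact P(-z < (X - n*pi) / sqrt(n*pi*(1 - pi)) < z) for X ~ Binomial(n, pi)."""
    sd = sqrt(n * pi * (1 - pi))
    total = 0.0
    for x in range(n + 1):
        y = (x - n * pi) / sd              # standardised value of x
        if -z < y < z:
            # add the exact binomial probability of this x
            total += comb(n, x) * pi**x * (1 - pi)**(n - x)
    return total

# Illustrative values only: pi = 0.3 and a few sample sizes
for n in (10, 50, 200):
    print(n, round(pivot_coverage(n, 0.3), 4))
```

Because \(X\) is discrete, the probability is not exactly 0.95 for any \(n\) and does not approach 0.95 smoothly as \(n\) increases.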
Confidence interval
Since a standard normal variable lies between -1.96 and 1.96 with probability 0.95, an approximate 95% confidence interval for \(\pi\) can be found by solving the inequalities
\[ -1.96 \;\;\lt\;\; \frac{x - n\pi}{\sqrt{n \pi(1 - \pi)}} \;\;\lt\;\; 1.96 \]
Example
A retail clothing outlet has collected the following data from random sampling of invoices for T-shirts over the past month.
Region | Small | Medium | Large | XL | Total |
---|---|---|---|---|---|
North Island | 2 | 15 | 24 | 9 | 50 |
South Island | 4 | 17 | 23 | 6 | 50 |
Find a 95% confidence interval for the probability that a T-shirt purchased from one of the store's North Island shops is Small.
In this example, \(x = 2\) and \(n = 50\), so the confidence interval is found by solving
\[ -1.96 \;\;\lt\;\; \frac{2 - 50\pi}{\sqrt{50 \pi(1 - \pi)}} \;\;\lt\;\; 1.96 \]Squaring gives
\[ \frac{(2 - 50\pi)^2}{50\pi(1 - \pi)} \;\;\lt\;\; 1.96^2 \]
\[ (2 - 50\pi)^2 \;\;\lt\;\; 1.96^2 \times 50 \pi(1 - \pi) \]
\[ (50^2 + 50 \times 1.96^2) \pi^2 - (2 \times 2 \times 50 + 50 \times 1.96^2) \pi + 2^2 \;\;\lt\;\; 0 \]
\[ 2692.08 \pi^2 - 392.08 \pi + 4 \;\;\lt\;\; 0 \]
Solving this quadratic gives the 95% confidence interval
\[ 0.011 \;\;\lt\;\; \pi \;\;\lt\;\; 0.135 \]Note that the conventional Wald-type 95% confidence interval in this example is
\[ \hat{\pi} - 1.96 \sqrt{\frac{\hat{\pi}(1 - \hat{\pi})} n} \;\;\lt\;\; \pi \;\;\lt\;\; \hat{\pi} + 1.96 \sqrt{\frac{\hat{\pi}(1 - \hat{\pi})} n} \]which evaluates to
\[ -0.014 \;\;\lt\;\; \pi \;\;\lt\;\; 0.094 \]The Wald interval includes impossible negative values for \(\pi\), whereas the interval found from the pivot lies entirely between 0 and 1, so the pivot-based interval is preferable in this example.
(When the sample size and proportion of successes are larger, there is less difference between the two types of confidence interval.)
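The two calculations in this example can be reproduced with a short script. This is a sketch under the assumptions of the example above: the function names `pivot_interval` and `wald_interval` are illustrative, and the quadratic coefficients are the general form of those used above, with \(x\) and \(n\) in place of 2 and 50. The interval found from this pivot is often called the Wilson score interval.

```python
from math import sqrt

def pivot_interval(x, n, z=1.96):
    """Solve (x - n*pi)^2 < z^2 * n * pi * (1 - pi) for pi."""
    # Rearranged as a*pi^2 + b*pi + c < 0
    a = n**2 + n * z**2
    b = -(2 * x * n + n * z**2)
    c = x**2
    root = sqrt(b**2 - 4 * a * c)
    # The inequality holds between the two roots of the quadratic
    return (-b - root) / (2 * a), (-b + root) / (2 * a)

def wald_interval(x, n, z=1.96):
    """Conventional interval: estimate +/- z * estimated standard error."""
    p_hat = x / n
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

# North Island, Small T-shirts: x = 2 successes out of n = 50
lo, hi = pivot_interval(2, 50)
print(f"pivot: {lo:.3f} to {hi:.3f}")   # pivot: 0.011 to 0.135

lo, hi = wald_interval(2, 50)
print(f"Wald:  {lo:.3f} to {hi:.3f}")   # Wald:  -0.014 to 0.094
```

Trying larger values, such as `x = 200` and `n = 500`, shows the two intervals becoming much closer to each other.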