Confidence interval for μ when σ² is unknown
The pivot for \(\mu\),
\[ \frac{\overline{X} - \mu}{\diagfrac{S}{\sqrt{n}}} \;\;\sim\;\; \TDistn(n-1 \text{ df}) \]
can be used to obtain a 95% confidence interval for the parameter. Because the t distribution is symmetric about zero, the 2½th and 97½th percentiles of the \(\TDistn(n-1\text{ df})\) can be written as \(-t_{n-1,\;0.975}\) and \(t_{n-1,\;0.975}\), so
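\[ P\left(-t_{n-1,\;0.975} \;\;\lt\;\; \frac{\overline{X} - \mu}{\diagfrac{S}{\sqrt{n}}} \;\;\lt\;\; t_{n-1,\;0.975} \right) \;\;=\;\;0.95 \]
Rearranging the inequalities to isolate \(\mu\), and replacing \(\overline{X}\) and \(S\) by their observed values \(\overline{x}\) and \(s\), leads to the following 95% confidence interval for \(\mu\):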
\[ \overline{x} -t_{n-1,\;0.975} \frac{s}{\sqrt{n}} \;\;\lt\;\; \mu \;\;\lt\;\; \overline{x} +t_{n-1,\;0.975} \frac{s}{\sqrt{n}} \]
Intervals with other confidence levels can be found by using other quantiles of the t distribution. In general, a \(100(1-\alpha)\%\) confidence interval is
\[ \overline{x} -t_{n-1,\;1-\diagfrac{\alpha}{2}} \frac{s}{\sqrt{n}} \;\;\lt\;\; \mu \;\;\lt\;\; \overline{x} +t_{n-1,\;1-\diagfrac{\alpha}{2}} \frac{s}{\sqrt{n}} \]
Example
We previously showed some results from an experiment about the grazing behaviour of dairy cows. This table gives the grass intake rate (in grams dry mass per bite) in the 48 plots of grass used in the experiment.
1.09 | 1.41 | 1.20 | 1.04 | 1.07 | 1.39 | 1.06 | 1.14 |
0.88 | 0.92 | 1.07 | 1.07 | 1.18 | 0.57 | 0.01 | 0.31 |
1.14 | 1.18 | 0.58 | 0.74 | 0.14 | 0.48 | 0.91 | 0.37 |
2.19 | 1.17 | 2.34 | 1.69 | 1.97 | 1.04 | 1.76 | 1.26 |
1.62 | 0.81 | 1.81 | 2.06 | 2.27 | 1.24 | 0.02 | 1.46 |
2.29 | 2.28 | 1.40 | 0.60 | 1.41 | 0.49 | 1.06 | 1.58 |
Assuming that the data come from a \(\NormalDistn(\mu,\;\sigma^2)\) distribution, find a 95% confidence interval for the mean grass intake per bite, \(\mu\).
The sample mean and variance are
\[ \overline{x} = 1.1827 \spaced{and} s^2 = 0.3606 \]
The 97½th percentile of the \(\TDistn(47\text{ df})\) distribution is \(t_{47,\;0.975} = 2.012\), leading to the 95% confidence interval
\[ \overline{x} -2.012 \frac{s}{\sqrt{48}} \;\;\lt\;\; \mu \;\;\lt\;\; \overline{x} +2.012 \frac{s}{\sqrt{48}} \]
\[ 1.008 \;\;\lt\;\; \mu \;\;\lt\;\; 1.357 \]
We are therefore 95% confident that the mean intake of cows in similar plots of grass is between 1.008 and 1.357 grams of dry matter per bite.
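The arithmetic in this example can be checked with a short script. The following is a minimal sketch using only the Python standard library; the function name `t_interval` is hypothetical, and the t quantile 2.012 is typed in from the text above rather than computed.

```python
import math
import statistics

# Grass intake rates (grams of dry matter per bite) from the table above.
intake = [
    1.09, 1.41, 1.20, 1.04, 1.07, 1.39, 1.06, 1.14,
    0.88, 0.92, 1.07, 1.07, 1.18, 0.57, 0.01, 0.31,
    1.14, 1.18, 0.58, 0.74, 0.14, 0.48, 0.91, 0.37,
    2.19, 1.17, 2.34, 1.69, 1.97, 1.04, 1.76, 1.26,
    1.62, 0.81, 1.81, 2.06, 2.27, 1.24, 0.02, 1.46,
    2.29, 2.28, 1.40, 0.60, 1.41, 0.49, 1.06, 1.58,
]


def t_interval(data, t_quantile):
    """Return (lower, upper) limits: xbar -/+ t_quantile * s / sqrt(n).

    t_quantile must be the appropriate quantile of the t distribution
    with len(data) - 1 degrees of freedom; it is not computed here.
    """
    n = len(data)
    xbar = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (divisor n - 1)
    half_width = t_quantile * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width


# 97.5th percentile of the t distribution with 47 df, as quoted in the text.
T_47_975 = 2.012

lower, upper = t_interval(intake, T_47_975)
print(f"mean = {statistics.mean(intake):.4f}, variance = {statistics.variance(intake):.4f}")
print(f"{lower:.3f} < mu < {upper:.3f}")
```

For a different confidence level, only the quantile passed to `t_interval` needs to change, for example the 95th percentile of the same t distribution for a 90% interval.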
Robustness
The distribution of the pivot that was the basis of our confidence interval,
\[ \frac{\overline{X} - \mu}{\diagfrac{S}{\sqrt{n}}} \;\;\sim\;\; \TDistn(n-1 \text{ df}) \]
was found assuming that the data were a random sample from a \(\NormalDistn(\mu\;, \sigma^2)\) distribution, but we are rarely certain about the shape of the underlying distribution in practical problems.
The Central Limit Theorem shows that the sample mean has approximately a \(\NormalDistn(\mu\;, \diagfrac{\sigma^2}{n})\) distribution when the sample size is large, whatever the distribution being sampled from. Provided the distribution being sampled from is not too far from normal, this approximation is good even for small sample sizes. Although we have not given any similar asymptotic results for the sample variance, the following holds.
Provided the shape of the underlying distribution is not far from normal, a confidence interval based on the t-distribution has approximately the correct confidence level.
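This claim can be explored informally by simulation. The sketch below is an illustration rather than a proof: it repeatedly draws samples of size 10, forms the 95% t-based interval, and records how often the interval contains the true mean. The choice of distributions (normal, uniform, exponential), the sample size, the number of repetitions and the function names are all assumptions made for this illustration, and the quantile \(t_{9,\;0.975} = 2.262\) is a standard table value typed in by hand.

```python
import math
import random

# 97.5th percentile of the t distribution with 9 df (standard table value).
# It matches the sample size n = 10 used below.
T_9_975 = 2.262


def covers(sample, true_mean, t_quantile):
    """Return True if the t-based interval from this sample contains true_mean."""
    n = len(sample)
    xbar = sum(sample) / n
    s2 = sum((x - xbar) ** 2 for x in sample) / (n - 1)  # sample variance
    half_width = t_quantile * math.sqrt(s2 / n)
    return xbar - half_width < true_mean < xbar + half_width


def coverage(draw, true_mean, n=10, t_quantile=T_9_975, reps=10_000, seed=1):
    """Estimate the proportion of intervals that contain true_mean.

    draw is a function taking a random.Random instance and returning one value.
    t_quantile must correspond to n - 1 degrees of freedom.
    """
    rng = random.Random(seed)
    hits = sum(
        covers([draw(rng) for _ in range(n)], true_mean, t_quantile)
        for _ in range(reps)
    )
    return hits / reps


# Normal data: the t interval is exact, so coverage should be close to 0.95.
normal_cov = coverage(lambda rng: rng.gauss(0.0, 1.0), true_mean=0.0)
# Uniform(0, 1) data: symmetric and short-tailed, with mean 0.5.
uniform_cov = coverage(lambda rng: rng.random(), true_mean=0.5)
# Exponential(1) data: strongly skewed, with mean 1.
skewed_cov = coverage(lambda rng: rng.expovariate(1.0), true_mean=1.0)

print(f"normal: {normal_cov:.3f}  uniform: {uniform_cov:.3f}  exponential: {skewed_cov:.3f}")
```

In runs of this kind, the symmetric uniform distribution tends to give coverage close to 95%, whereas the strongly skewed exponential distribution tends to fall short with samples this small, which is consistent with the condition that the underlying distribution should not be far from normal.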