Estimating a population mean
The most commonly estimated parameter for numerical data is the population mean, µ. We now examine how to estimate µ when the population standard deviation, σ, is a known value.
In most practical situations, the population standard deviation, σ, is also unknown and must be estimated. However, we leave this type of problem until later in this section.
Soluble sugar in plants
A laboratory procedure for assessing soluble sugar in plants is such that in the range 100 to 200 milligrams (mg) of glucose per gram of dry weight, repeat measurements will follow a normal distribution with mean µ equal to the true glucose level and standard deviation σ = 3.0 mg/g dry weight.
When a plant is tested once, the recorded glucose content is therefore
X ~ normal (μ , σ = 3)
A plant is tested several times, giving a sample mean glucose level, $\bar{X}$.
The sample mean is our best estimate of the population mean, but how accurate is the estimate?
Standard error
We saw earlier that the distribution of $\bar{X}$ is approximately normal, with mean and standard deviation given by the equations
$\mu_{\bar{X}} = \mu$
$\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$
When a sample mean is used to estimate the underlying population mean, µ, there is an error,
error = $\bar{X} - \mu$
The error distribution is also approximately normal,
error ~ normal (0, $\sigma/\sqrt{n}$)
The standard deviation of the error distribution is the standard error of the estimator,
standard error = $\dfrac{\sigma}{\sqrt{n}}$
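The standard error formula can be sketched in a few lines of Python. This is a minimal illustration rather than part of the original text: the function name `standard_error` is hypothetical, σ = 3 is the value from the glucose example, and the sample sizes in the loop are chosen only for illustration.

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Standard error of the sample mean when sigma is known: sigma / sqrt(n)."""
    if n < 1:
        raise ValueError("sample size n must be at least 1")
    if sigma < 0:
        raise ValueError("sigma must be non-negative")
    return sigma / math.sqrt(n)


# sigma = 3 mg/g dry weight comes from the glucose example;
# the sample sizes below are illustrative assumptions.
for n in (1, 4, 16, 64):
    print(f"n = {n:2d}  standard error = {standard_error(3.0, n):.3f}")
```

Each fourfold increase in the sample size halves the standard error, because n enters the formula through its square root.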
95% bounds for the error
Applying the 70-95-100 rule of thumb to the error distribution,
Prob( error is between ± 2 $\sigma/\sqrt{n}$ ) is approximately 0.95
We can refine this using the properties of the normal distribution. More precisely, 95% of values from a normal distribution lie within 1.96 standard deviations of the mean, so
Prob( error is between ± 1.96 $\sigma/\sqrt{n}$ ) = 0.95
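The difference between the multipliers 2 and 1.96 can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper names `central_probability` and `central_multiplier` are hypothetical and introduced only for this illustration.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1)
std_normal = NormalDist(mu=0.0, sigma=1.0)


def central_probability(z: float) -> float:
    """Probability that a normal value lies within z standard deviations of its mean."""
    return std_normal.cdf(z) - std_normal.cdf(-z)


def central_multiplier(level: float) -> float:
    """Number of standard deviations that encloses a central fraction `level`."""
    if not 0.0 < level < 1.0:
        raise ValueError("level must be strictly between 0 and 1")
    # Half of the excluded probability sits in each tail.
    return std_normal.inv_cdf(0.5 + level / 2.0)


print(f"P(within 2 s.d.)    = {central_probability(2.0):.4f}")   # a little above 0.95
print(f"P(within 1.96 s.d.) = {central_probability(1.96):.4f}")  # 0.95 to four decimals
print(f"multiplier for 0.95 = {central_multiplier(0.95):.4f}")   # close to 1.96
```

The value 1.96 is itself rounded, so the probability it gives is 0.95 only to the displayed precision.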
95% confidence interval
Since $\bar{X}$ will be within $1.96\,\sigma/\sqrt{n}$ of µ with probability 0.95, we are 95% confident that µ is in the interval
$\bar{X} \pm 1.96\,\dfrac{\sigma}{\sqrt{n}}$
This kind of interval estimate is called a 95% confidence interval for µ, and we say that the interval has a confidence level of 0.95.
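As a minimal sketch of how such an interval might be computed when σ is known, the function below takes a list of measurements and returns the lower and upper limits. The function name `confidence_interval` and the four readings are hypothetical, made up purely for illustration; they are not data from the text.

```python
import math
from statistics import NormalDist, fmean


def confidence_interval(values, sigma, level=0.95):
    """Confidence interval for the population mean when sigma is known.

    Returns (lower, upper) = sample mean -/+ z * sigma / sqrt(n).
    """
    n = len(values)
    if n == 0:
        raise ValueError("at least one measurement is required")
    if not 0.0 < level < 1.0:
        raise ValueError("level must be strictly between 0 and 1")
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for level = 0.95
    half_width = z * sigma / math.sqrt(n)
    centre = fmean(values)
    return centre - half_width, centre + half_width


# Hypothetical repeat readings (mg/g dry weight), invented for this example only.
readings = [148.2, 151.7, 150.3, 146.9]
lower, upper = confidence_interval(readings, sigma=3.0)
print(f"95% confidence interval: {lower:.2f} to {upper:.2f}")
```

Only the centre of the interval depends on the data; the half-width is fixed by σ, n and the confidence level.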
Soluble sugar in plants
A laboratory analyses one plant 16 times to estimate the amount of glucose in it. The results of a single analysis are known to be
X ~ normal (μ , σ = 3)
Irrespective of the data, we know that the standard error of $\bar{X}$ is therefore
$\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{3}{\sqrt{16}} = \dfrac{3}{4} = 0.75$
From this, we can obtain bounds on the error and therefore a confidence interval.
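The arithmetic can be written out as a short worked step. It uses only the values already given (σ = 3, n = 16) together with the 1.96 multiplier; the sample mean $\bar{X}$ is left symbolic because no measurements are listed in the text.

```latex
\[
1.96 \times \frac{\sigma}{\sqrt{n}}
  = 1.96 \times \frac{3}{\sqrt{16}}
  = 1.96 \times 0.75
  = 1.47
\]
\[
\text{95\% confidence interval for } \mu : \quad \bar{X} \pm 1.47 \ \text{mg/g dry weight}
\]
```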
A different set of 16 analyses of the same plant would give a different sample mean, so the confidence interval would be centred at a different value. Its width remains the same, however, because it depends only on σ and n.
The confidence interval becomes narrower as the sample size, n, increases.
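Both observations can be explored with a small simulation. This is a sketch under stated assumptions, not a reproduction of any original diagram: the true glucose level `TRUE_MU = 150.0` is a hypothetical value chosen inside the 100 to 200 mg/g range, the function names are invented for the example, and the random seed is arbitrary.

```python
import math
import random
from statistics import NormalDist, fmean

SIGMA = 3.0      # known measurement standard deviation (mg/g dry weight)
TRUE_MU = 150.0  # hypothetical true glucose level, assumed for the simulation
Z95 = NormalDist().inv_cdf(0.975)  # about 1.96


def simulate_interval(n, rng):
    """Simulate n repeat analyses and return the 95% confidence interval."""
    sample = [rng.gauss(TRUE_MU, SIGMA) for _ in range(n)]
    half_width = Z95 * SIGMA / math.sqrt(n)
    centre = fmean(sample)
    return centre - half_width, centre + half_width


def coverage(n, repeats, rng):
    """Fraction of simulated intervals that contain the true mean."""
    hits = 0
    for _ in range(repeats):
        lower, upper = simulate_interval(n, rng)
        if lower <= TRUE_MU <= upper:
            hits += 1
    return hits / repeats


rng = random.Random(1)  # arbitrary seed so that runs are repeatable
for n in (4, 16, 64):
    lower, upper = simulate_interval(n, rng)
    print(f"n = {n:2d}  width = {upper - lower:.2f}  "
          f"coverage = {coverage(n, 2000, rng):.3f}")
```

For a fixed n every simulated interval has the same width, the width halves each time n is quadrupled, and the proportion of intervals containing the assumed true mean should be close to 0.95.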