Error in an estimate
When we use a summary statistic from a random sample to estimate a population parameter, the estimate will usually not be exactly the same as the parameter.
The difference between the estimate and the target parameter is called the error in the estimate.
For example, if the sample mean, x̄, is used to estimate a population mean, µ, the error in this estimate is
error = x̄ - µ
If a sample proportion, p, is used to estimate a population proportion, π, the estimate's error is
error = p - π
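In practice the parameter is unknown, so the error cannot be computed directly. The short Python sketch below is only an illustration of the two definitions: it invents a population whose mean and proportion are known (the constants MU, SIGMA, THRESHOLD and PI are hypothetical, not taken from any data set), draws one random sample, and reports the error in the sample mean and in the sample proportion.

```python
import random
import statistics

# Hypothetical population, chosen only for illustration: values are normal
# with mean MU, and PI is the proportion of population values above THRESHOLD.
MU = 50.0
SIGMA = 10.0
THRESHOLD = 50.0
PI = 0.5  # for a normal distribution, half of the values exceed the mean


def sample_errors(n, seed):
    """Draw one random sample of size n and return (x_bar - MU, p - PI)."""
    rng = random.Random(seed)
    sample = [rng.gauss(MU, SIGMA) for _ in range(n)]
    x_bar = statistics.mean(sample)                 # estimate of MU
    p = sum(v > THRESHOLD for v in sample) / n      # estimate of PI
    return x_bar - MU, p - PI


mean_error, proportion_error = sample_errors(n=40, seed=1)
print(f"error in sample mean:       {mean_error:+.3f}")
print(f"error in sample proportion: {proportion_error:+.3f}")
```

Changing the seed gives a different sample and therefore a different error, which is why the error is described in terms of its likely size rather than a single value.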
Annual rainfall in Samaru, Nigeria
In most of Africa, the most important climatic variable is rainfall. Rainfall is usually highly seasonal and failure of crops is normally associated with late arrival of rain or low rainfall. A better understanding of the distribution of rainfall can affect the crops that are grown and when they are planted.
What is the mean annual rainfall in Samaru, Northern Nigeria?
In other words, we want to estimate the population mean of the annual rainfall.
The stacked dot plot below shows the annual rainfall in Samaru, Northern Nigeria between 1928 and 1983 and marks the mean of these 56 values.
We are not interested specifically in the years 1928 to 1983, but want to understand the underlying 'population' distribution of rainfalls in order to predict what is likely to happen in the future. Assuming that there is no climate change (or that climate change is negligible compared to the year-to-year variation in rainfall), we estimate that the population mean rainfall in Samaru is 1068.1 mm.
However, our estimate of 1068.1 mm is based on a sample of only 56 values, so there is likely to be an error in this estimate.
How accurate is the estimate?
In the example above, we used a sample statistic to estimate an unknown population parameter.
How big is the error likely to be?
The remainder of this chapter introduces some methods to describe the accuracy of estimates (and, equivalently, the likely size of the resulting errors).
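As a rough preview of what 'likely size of the error' means, the following Python sketch simulates many samples of 56 values from a hypothetical population and looks at how the errors in the sample mean are spread. Everything about the population here is an assumption made for illustration: the rainfall is treated as normally distributed, the estimate of 1068.1 mm is used as if it were the true mean, and the standard deviation POP_SD is a made-up value, not one computed from the Samaru data.

```python
import random
import statistics

POP_MEAN = 1068.1   # assumption: treat the estimate as if it were the true mean (mm)
POP_SD = 150.0      # assumption: made-up spread, not taken from the Samaru data (mm)
N_YEARS = 56        # sample size used in the example
N_SAMPLES = 2000    # number of simulated samples


def simulate_errors(seed=0):
    """Return the error (sample mean - POP_MEAN) for each simulated sample."""
    rng = random.Random(seed)
    errors = []
    for _ in range(N_SAMPLES):
        sample = [rng.gauss(POP_MEAN, POP_SD) for _ in range(N_YEARS)]
        errors.append(statistics.mean(sample) - POP_MEAN)
    return errors


errors = simulate_errors()
typical_error = statistics.pstdev(errors)   # spread of the simulated errors
share_within_40 = sum(abs(e) <= 40 for e in errors) / len(errors)

print(f"typical size of error: about {typical_error:.1f} mm")
print(f"share of samples with error within 40 mm: {share_within_40:.1%}")
```

Under these assumptions the errors are centred on zero, and their spread gives one way to describe how far a single estimate is likely to be from the parameter.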