Sampling mechanism
The mechanism of sampling from a population explains randomness in data.
However, in practice, there is only a single sample, and we must use it to draw conclusions about the population. The population is the focus of our attention: we are rarely interested in the specific individuals in our sample, and the underlying population represents the general type of 'individual' from which they were drawn.
Parameters and statistics
Instead of trying to fully estimate the population distribution, we usually focus attention on a small number of numerical characteristics — often only one. Such population characteristics are called parameters. The corresponding values from a sample are called sample statistics and provide estimates of the unknown parameters.
The population mean is often of particular interest and the sample mean provides an estimate of it.
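The distinction between the parameter and the statistic can be written out explicitly. The notation here is an assumption rather than something fixed by the text: the population values are written X_1, ..., X_N, the sampled values x_1, ..., x_n, the population mean is μ and the sample mean is x̄.

```latex
% Population mean (a parameter): fixed, but unknown in practice
\mu = \frac{1}{N} \sum_{i=1}^{N} X_i

% Sample mean (a statistic): computed from the n sampled values, varies from sample to sample
\bar{x} = \frac{1}{n} \sum_{j=1}^{n} x_j
```

The two formulas have the same form; the difference is that μ averages over the whole population whereas x̄ averages only over the values that happened to be selected.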
Variability of sample statistics
The variability in random samples also implies sample-to-sample variability in sample statistics.
In order to assess how well a sample statistic estimates an unknown population parameter, it is important to understand its sample-to-sample variability.
The remainder of this section investigates the variability in sample means.
Tread depth of taxi tyres
A taxi company is interested in the tyre tread depth (mm) in the 60 taxis that it owns. These 60 values are the population of interest and their mean and standard deviation are population parameters. The top half of the diagram below shows this population.
To save the cost of measuring the tread depths of all 60 cars, the company decides to randomly select 12 of them (without replacement). Click the button Take sample to select a random sample. The sample mean provides an estimate of the mean tread depth in the whole fleet of taxis; this is how it would be used in practice, where the population values would be unknown.
Observe that the sample mean and standard deviation are similar to those of the population but they are not identical. Select a few more samples and note the variability in the sample statistics.
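The same experiment can be sketched in code. The tread depths below are made-up values generated for illustration, since the actual 60 measurements are not listed here; the function name take_sample and the fixed seed are also assumptions of this sketch.

```python
import random
import statistics

# Hypothetical population: 60 invented tread depths (mm).
# These are NOT the taxi company's real measurements.
rng = random.Random(1)  # fixed seed so that the run is repeatable
population = [round(rng.uniform(1.5, 8.0), 1) for _ in range(60)]


def take_sample(values, size, rng):
    """Select a simple random sample of the given size without replacement."""
    return rng.sample(values, size)


# Population parameters (known here only because we hold all 60 values).
pop_mean = statistics.mean(population)
pop_sd = statistics.pstdev(population)  # divisor N: the whole population
print(f"population: mean = {pop_mean:.2f}, sd = {pop_sd:.2f}")

# Take a few samples of 12 taxis and compare their statistics.
sample_means = []
for i in range(5):
    sample = take_sample(population, 12, rng)
    sample_mean = statistics.mean(sample)
    sample_sd = statistics.stdev(sample)  # divisor n - 1: a sample
    sample_means.append(sample_mean)
    print(f"sample {i + 1}: mean = {sample_mean:.2f}, sd = {sample_sd:.2f}")
```

Each pass through the loop plays the role of one click on Take sample: the printed means and standard deviations should be close to the population values but differ from one sample to the next.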
Any single sample mean provides a reasonable estimate of the population mean, but sample-to-sample variability limits how accurate that estimate can be.