Sampling from a population
Sampling from an underlying population (whether finite or infinite) gives us a mechanism to explain the randomness of data. The underlying population also gives us a focus for generalising from our sample data — the distribution of values in the population is fixed and does not depend on the specific sample data.
Unknown population
The practical problem is that the population underlying most data sets is unknown. Indeed, if we fully knew the characteristics of the population, there would have been little point in collecting the sample data!
Even though our model implies that we could take many different samples from the population, in practice we only have a single sample.
However, this single sample does throw light on the population distribution. In later chapters, we will go into much more detail about how to estimate population characteristics from a sample.
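The idea that the population is fixed while samples vary can be sketched with a short simulation. The finite population below is entirely made up for illustration (1000 values with a mean near 50), and the function name draw_sample_means is hypothetical. The sketch draws several samples from the same population and records each sample mean.

```python
import random
import statistics


def draw_sample_means(population, sample_size, n_samples, seed=0):
    """Draw n_samples random samples (without replacement) from a finite
    population and return the mean of each sample."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]


# Hypothetical finite population: the values are invented for illustration.
pop_rng = random.Random(1)
population = [pop_rng.gauss(50, 10) for _ in range(1000)]

# The population mean is a fixed number; it does not depend on any sample.
population_mean = statistics.mean(population)

# Five different samples of size 20 give five different sample means.
sample_means = draw_sample_means(population, sample_size=20, n_samples=5)

print("population mean:", round(population_mean, 2))
print("sample means:   ", [round(m, 2) for m in sample_means])
```

In a real study only one of these sample means would be observed, and the population mean would not be available for comparison.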
Effectiveness of insecticide
Users of an insecticide are interested in what proportion of the target insects are likely to die at any dose. This proportion will be unknown, but it is possible to collect data that throws light on its value.
The symbol π denotes the population proportion of beetles that would die at a particular weak concentration of the insecticide. In an experiment, fifty beetles were sprayed with this concentration and the diagram below shows the resulting data.
The survival of the fifty beetles can be treated as a sample from an abstract infinite population in which a proportion π would die. The value of π is unknown, but it is of more interest than the proportion in our specific sample.
However, the sample proportion dying, p = 0.72, throws some light on the likely value of π.
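As a rough illustration of the relationship between p and π, the sketch below treats each beetle as dying independently with probability π. The value assumed_pi = 0.7 is a hypothetical choice made only for the simulation, since the true π is unknown, and the function name simulate_sample_proportion is also invented here. The only figures taken from the text are the sample size of fifty and p = 0.72, which together imply 36 deaths.

```python
import random


def simulate_sample_proportion(pi, n, rng):
    """Return the proportion of n beetles that die when each one dies
    independently with probability pi."""
    deaths = sum(rng.random() < pi for _ in range(n))
    return deaths / n


# Observed data from the text: 50 beetles, sample proportion 0.72.
n_beetles = 50
observed_deaths = round(0.72 * n_beetles)  # 36 beetles died
observed_p = observed_deaths / n_beetles

# Hypothetical population proportion, assumed only for this simulation.
assumed_pi = 0.7

# Ten simulated experiments of 50 beetles each from the same population.
rng = random.Random(42)
simulated = [
    simulate_sample_proportion(assumed_pi, n_beetles, rng) for _ in range(10)
]

print("observed p:", observed_p)
print("simulated sample proportions:", simulated)
```

The simulated proportions differ from one experiment to the next but stay in the neighbourhood of the assumed π, which is why a single observed p is informative about π without being equal to it.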