Sample from a finite population

Data sometimes come from a selection of individuals (e.g. people, animals, trees or plots of land) that are sampled to be 'representative' of some population. The sampling mechanism is most easily understood when there is a finite population of individuals from which the sample is selected.
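As an illustration of selecting a sample from a finite population, the following minimal sketch draws a simple random sample without replacement. The population of 500 tree heights, the sample size of 50 and the variable names are assumptions made for the example, not data from any real study.

```python
import random
import statistics

# Fixed seed so the sketch gives the same result each time it is run.
rng = random.Random(42)

# Hypothetical finite population: heights (cm) of 500 trees in a forest plot.
population = [rng.gauss(1200, 150) for _ in range(500)]

# Simple random sample without replacement: every tree has the same
# chance of selection and no tree can be chosen twice.
sample = rng.sample(population, k=50)

# The sample mean is used to describe (estimate) the population mean.
population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"population mean: {population_mean:.1f} cm")
print(f"sample mean:     {sample_mean:.1f} cm")
```

In practice the population mean is unknown. It is computed here only so that the sample estimate can be compared with it.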

When the population is finite, the sample is called a survey. Typical questions that might be asked using the data are:

Sample from an infinite population

In other situations, there is no concrete finite population from which the data arose, but we model the data as a random sample from some infinite underlying population.
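One way to picture an infinite underlying population is as a probability distribution from which any number of values could be generated. The sketch below assumes, purely for illustration, a normal distribution with mean 50 and standard deviation 10. Both the distribution and the function name draw_sample are hypothetical.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed for reproducibility

# Hypothetical model for the infinite population (assumed values).
TRUE_MEAN = 50.0
TRUE_SD = 10.0

def draw_sample(n):
    """Return n independent values generated from the assumed model."""
    return [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(n)]

small_sample = draw_sample(20)
large_sample = draw_sample(20000)

# Both sample means estimate the model's mean. The larger sample
# will usually be closer to it.
print(f"mean of 20 values:    {statistics.mean(small_sample):.2f}")
print(f"mean of 20000 values: {statistics.mean(large_sample):.2f}")
```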

In all sampling situations, the aim is descriptive: the sample data are used to describe characteristics of the underlying population.

Samples from two or more populations

In many studies, samples are taken from two or more populations. Again the underlying population may be finite or infinite.

When the populations are finite, this type of survey is often called a stratified sample. The aim may be to combine information from the separate samples into a single summary describing all populations together, or to describe differences between the populations.
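Both aims can be sketched with a small made-up example. The two regions, their population sizes and the sampled yields below are hypothetical. The combined summary weights each sample mean by its population's share of the total, which is one common way to combine stratified samples and is not necessarily the method used in any particular survey.

```python
import statistics

# Hypothetical data: number of farms in each region (population size)
# and yields (tonnes per hectare) from a sample of farms in that region.
strata = {
    "north": {"population_size": 600, "sample": [2.0, 3.0, 4.0]},
    "south": {"population_size": 400, "sample": [5.0, 6.0, 7.0]},
}

total_size = sum(s["population_size"] for s in strata.values())

# Mean of each separate sample.
stratum_means = {name: statistics.mean(s["sample"]) for name, s in strata.items()}

# Aim 1: a single summary for all populations together, weighting
# each sample mean by that population's share of the total.
combined_mean = sum(
    s["population_size"] / total_size * stratum_means[name]
    for name, s in strata.items()
)

# Aim 2: describe a difference between the populations.
difference = stratum_means["south"] - stratum_means["north"]

# For comparison: pooling all sampled values ignores the population sizes.
pooled_values = [y for s in strata.values() for y in s["sample"]]
unweighted_mean = statistics.mean(pooled_values)

print(f"combined (weighted) mean: {combined_mean:.2f}")
print(f"unweighted pooled mean:   {unweighted_mean:.2f}")
print(f"south minus north:        {difference:.2f}")
```

The weighted and unweighted means differ here because the two regions contribute equal-sized samples but contain different numbers of farms.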

Observational studies

Data obtained by sampling from finite or infinite populations are collectively called observational data.

The aim of observational studies is descriptive: the resulting data will be used to describe the characteristics of the populations from which the data are sampled.

Maize grown in a country

Consider a survey conducted to investigate patterns and trends in the use of different maize varieties in a country. Random samples of farmers are selected from different regions and asked to complete a questionnaire about the varieties of maize grown, estimated yields, farm size, and personal information such as age, income and education. The researchers can also obtain climatic information for each farm from national databases.

The data might be used to answer the following questions:

If the survey is repeated annually, the data can also be used to investigate trends in the use of the different maize varieties.