Variability in individuals

Most statistical data sets take the form of a data matrix in which several measurements are recorded for each member of a collection of 'individuals' (e.g. people, plants, houses or bank accounts). These individuals are not identical, so the measurements vary from one individual to the next.

Unavoidable variability in measurements

Even when the 'individuals' are very similar, the recorded measurements from them often vary due to:

Variability of the environment in which the measurements were made
Different trees in a plantation grow in soil that differs in nutrients and drainage, and they may receive different amounts of light.
Variability in the measurement procedure itself
A valuer's assessment of the likely sale price of a house could be different if the same house were valued the following day or by a different valuer. Even physical measurements such as a person's blood pressure or abdominal circumference are rarely made without some variability.


For the above reasons, the measurements in a data set are likely to vary from individual to individual.

This is the origin of the term 'variables' for the columns of a data matrix.
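
As a rough illustration of the two sources of variability above, the sketch below builds a tiny data matrix for plantation trees: one row per individual and one column per variable. Everything in it is an assumption made for illustration only: the function name `simulate_tree_heights`, the nominal height of 10 m, the normal distributions and their spreads are all invented, and the values are simulated rather than real measurements.

```python
import random


def simulate_tree_heights(n_trees=5, seed=1):
    """Return a small data matrix: one row (dict) per tree, one key per variable."""
    rng = random.Random(seed)
    rows = []
    for tree_id in range(1, n_trees + 1):
        # Environmental variability: soil and light differ from tree to tree
        # (hypothetical normal effect with standard deviation 0.5 m).
        true_height = 10.0 + rng.gauss(0.0, 0.5)
        # Measurement variability: two readings of the same tree rarely agree
        # exactly (hypothetical measurement error with standard deviation 0.1 m).
        first_reading = true_height + rng.gauss(0.0, 0.1)
        second_reading = true_height + rng.gauss(0.0, 0.1)
        rows.append({
            "tree": tree_id,
            "height_m_first": round(first_reading, 2),
            "height_m_second": round(second_reading, 2),
        })
    return rows


data_matrix = simulate_tree_heights()
for row in data_matrix:
    print(row)
```

Reading down a column shows variation between individuals, while comparing the two height columns within a row shows variation that comes from the measurement procedure alone.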


Yield from tomato plants

The diagram below shows the weight of tomatoes (kg) harvested from 20 tomato plants that were grown from the same packet of seeds, in pots kept under growing conditions that were as similar as possible.

Observe that the yield from the tomato plants varies considerably, due to differences in genetics, unavoidable small differences in the growing environment and other chance occurrences.
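
The idea of repeatedly growing batches of 20 plants can be imitated with a short simulation. This is only a sketch under stated assumptions: the function names `grow_plants` and `summarise`, the normal model for yield, and the mean of 4 kg with a standard deviation of 1 kg are all hypothetical choices, not values taken from the tomato data.

```python
import random
import statistics


def grow_plants(n_plants=20, mean_kg=4.0, sd_kg=1.0, seed=None):
    """Simulate the yield (kg) of each plant; mean and spread are assumed values."""
    rng = random.Random(seed)
    # A yield cannot be negative, so any negative draw is set to zero.
    return [round(max(0.0, rng.gauss(mean_kg, sd_kg)), 2) for _ in range(n_plants)]


def summarise(yields):
    """Return a few simple summaries of one batch of yields."""
    return {
        "min": min(yields),
        "max": max(yields),
        "mean": round(statistics.mean(yields), 2),
        "sd": round(statistics.stdev(yields), 2),
    }


first_batch = grow_plants(seed=1)
second_batch = grow_plants(seed=2)  # a different seed plays the role of a new batch
print(sorted(first_batch))
print(summarise(first_batch))
print(summarise(second_batch))
```

Each batch is drawn under exactly the same assumed conditions, yet the individual yields and the batch summaries differ from one batch to the next.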
