Individuals / units
Most data sets consist of one or more values that are recorded from each of a set of individuals (or plants, plots of land, repetitions of an experiment or other 'units'). These individuals will vary in many ways other than the variables that are recorded.
There are two different ways in which data can be collected from these units.
Observational studies
Data are collected in an observational study if we passively record (observe) values from each unit.
Most observational studies are conducted by sampling units from some population.
Heights of fathers and sons
The scatterplot below describes the heights (inches) of a randomly selected group of 60 men at age 18 and their fathers' heights at the same age.
Click 'Take sample' a few times to see the heights of other fathers and sons. Observe that both variables vary from data set to data set: because the data are observational, neither variable is under the researcher's control.
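The repeated sampling described above can be imitated with a short simulation. This is only a rough sketch: the population mean (68 inches), standard deviation (2.7 inches) and father-son correlation (0.5) are assumed values chosen for illustration, not figures from the data set, and `take_sample` is a hypothetical helper name.

```python
import random

def take_sample(n=60, mean=68.0, sd=2.7, corr=0.5, rng=None):
    """Draw n (father, son) height pairs from an ASSUMED bivariate normal population."""
    rng = rng or random.Random()
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0, 1)
        z2 = rng.gauss(0, 1)
        father = mean + sd * z1
        # The son's height shares part of z1, giving the assumed correlation.
        son = mean + sd * (corr * z1 + (1 - corr ** 2) ** 0.5 * z2)
        pairs.append((father, son))
    return pairs

# Two separate samples, as if 'Take sample' had been clicked twice.
sample_a = take_sample(rng=random.Random(1))
sample_b = take_sample(rng=random.Random(2))

fathers_a = [f for f, _ in sample_a]
fathers_b = [f for f, _ in sample_b]
sons_a = [s for _, s in sample_a]
sons_b = [s for _, s in sample_b]

# Neither variable is controlled, so both columns differ between samples.
print(fathers_a != fathers_b, sons_a != sons_b)
```

Both the fathers' and the sons' heights change from one sample to the next, which is the mark of observational data.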
Experiments
In an experiment, the researcher actively changes some characteristics of the units before the data are collected. The values of some variables are therefore under the control of the experimenter, who chooses each individual's values for those variables.
Type of experiment | Possible controlled variables |
---|---|
Agriculture | Fertiliser applied to plot of land; irrigation applied to plot; time of planting seeds |
Psychology | Time allowed to memorise text; type of stimulus in reaction-time test |
Industrial | Temperature of chemical reaction; quality of raw materials for a process |
Soybean experiment
In an experiment, soybean plants were subjected to chronic ozone exposure at four different levels. At each ozone level, soybean yield was measured from five different experimental units (soybean plants). The resulting data are displayed below.
These are experimental data since the agricultural researcher can control the explanatory variable (the ozone exposure of the different plants).
Click 'Take sample' a few times to repeat the experiment and observe that the distribution of ozone concentrations remains the same; only the response (soybean yield) changes in the repetitions of the experiment.
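The contrast with observational data can be sketched in a small simulation. The design (four ozone levels, five plants per level) follows the description above, but the numerical ozone levels, the linear yield model and the noise level are all assumptions made up for illustration, and `run_experiment` is a hypothetical helper name.

```python
import random

# ASSUMED ozone levels (hypothetical values); the experiment only says there are four.
OZONE_LEVELS = [0.02, 0.07, 0.11, 0.15]
PLANTS_PER_LEVEL = 5

def run_experiment(rng=None):
    """Return (ozone, yield) pairs for one repetition of the experiment."""
    rng = rng or random.Random()
    results = []
    for ozone in OZONE_LEVELS:              # fixed by the experimenter
        for _ in range(PLANTS_PER_LEVEL):
            # ASSUMED response model: yield falls with ozone, plus random noise.
            plant_yield = 260.0 - 300.0 * ozone + rng.gauss(0, 10.0)
            results.append((ozone, plant_yield))
    return results

# Two repetitions, as if 'Take sample' had been clicked twice.
run_1 = run_experiment(random.Random(1))
run_2 = run_experiment(random.Random(2))

ozone_1 = [o for o, _ in run_1]
ozone_2 = [o for o, _ in run_2]
yields_1 = [y for _, y in run_1]
yields_2 = [y for _, y in run_2]

# The controlled variable is identical in every repetition; only the response varies.
print(ozone_1 == ozone_2, yields_1 != yields_2)
```

Because the ozone levels are set in advance, they are the same in every repetition, whereas the yields are random and differ each time.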