Observational studies and experiments

The method of data collection has a major influence on whether a relationship can be interpreted as causal.

In observational studies, there is usually the potential for a lurking variable to underlie any observed relationship, making relationships difficult to interpret.

The most important characteristic of experiments is that they often do allow relationships to be interpreted as causal ones.

In a well-designed experiment, there is little chance of lurking variables driving the observed relationships, so an observed relationship can usually be interpreted as causal.

In a badly designed experiment, however, lurking variables can still make relationships difficult to interpret.

The examples below illustrate differences in interpreting observational and experimental data.

Irrigation and wheat growth: an observational study

How does soil moisture affect the yield of wheat? What are the likely benefits of irrigation? In a sample of farms, rain gauges are installed; rainfall (which includes irrigation from sprinklers) and crop yield are subsequently measured.

The scatterplot does not suggest that farms with higher rainfall or irrigation tend to have greater yields of wheat. However, because the data are observational, they do not necessarily show causal relationships.

The researcher should not conclude that irrigation has no effect on wheat yields.

The study was conducted over a large region in which the climate varied, so other characteristics of the sampled farms may also affect wheat yield. In particular, the mean summer temperature is a lurking variable that may affect the relationship — wetter farms also tend to be colder and the lower temperature may counter any benefits from higher rainfall.

Click the checkbox Show Temperature to discover which farms are hottest and coldest. Temperature is indeed a lurking variable here: within each temperature group, there is a positive relationship between wheat yield and rainfall. If information about temperature were not available, the wrong conclusion about the likely effect of irrigation would be reached.
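The way a lurking variable can hide a relationship can be sketched in a few lines of Python. The numbers below are invented for illustration and are not the data from the study: the two temperature groups, the baseline yields and the within-group slope of 0.4 are all assumptions chosen so that wetter farms are also colder and lower-yielding.

```python
def slope(xs, ys):
    """Least-squares slope of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical values only: assumed gain in yield (t/ha) per mm/day of moisture.
ASSUMED_SLOPE = 0.4

# Rainfall in mm/day. In this made-up sample, cold farms are also the wet ones.
hot_rain = [2.0, 2.5, 3.0]
cold_rain = [3.5, 4.0, 4.5]

# Assumed baselines: cold farms start from a lower yield than hot farms.
hot_yield = [3.0 + ASSUMED_SLOPE * r for r in hot_rain]
cold_yield = [2.2 + ASSUMED_SLOPE * r for r in cold_rain]

# Slope within each temperature group, then with temperature ignored.
hot_slope = slope(hot_rain, hot_yield)
cold_slope = slope(cold_rain, cold_yield)
overall_slope = slope(hot_rain + cold_rain, hot_yield + cold_yield)

print(f"hot farms only:  {hot_slope:.2f}")
print(f"cold farms only: {cold_slope:.2f}")
print(f"all farms:       {overall_slope:.2f}")
```

With these invented numbers, the slope within each temperature group is 0.4, but the slope for all farms together is close to zero, because the lower baseline of the cold, wet farms offsets the benefit of their extra rainfall.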

Irrigation and wheat growth: an experiment

In an experimental study, the researcher controls moisture in each farm using supplementary irrigation.

In the experiment whose results are displayed below, farms from wetter areas are not used, since irrigation can only increase moisture, not reduce it. Three fields are used on each of 8 farms, chosen so that the fields within each farm are as similar as possible. Irrigation is used to control moisture, so that the three fields on each farm get the equivalent of 2, 2.5 and 3 mm of rainfall per day over the growing season.

Since differences in temperature, sunshine and other variables that may affect wheat yields are no longer related to moisture, any systematic differences between yields can be causally attributed to moisture.

The jittered dot plot below shows the results of the experiment.

Since there are similar proportions of warm farms in all three groups, a similar picture of the effect of moisture is obtained whether or not we take account of temperature. From the results shown above, we would conclude that increasing moisture by 1 mm per day would increase wheat yields by approximately 0.4 tonnes per hectare.
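The reason a balanced design removes the influence of temperature can be sketched with synthetic data. Everything in the block below is hypothetical: the farm baselines, the split into warm and cool farms, and the effect of 0.4 tonnes per hectare per mm/day (set by assumption to mirror the figure quoted above) are not the experiment's actual results. Only the layout, 8 farms with fields at 2, 2.5 and 3 mm per day, follows the text.

```python
MOISTURE_LEVELS = [2.0, 2.5, 3.0]  # mm/day applied to the three fields on each farm
ASSUMED_EFFECT = 0.4               # t/ha per mm/day; an assumption, not an estimate

# Hypothetical farm baselines (t/ha); the first four farms are labelled warm.
farm_baselines = [3.4, 3.3, 3.5, 3.2, 2.8, 2.9, 2.7, 3.0]
farm_is_warm = [True, True, True, True, False, False, False, False]

# One record per field: (farm index, moisture, warm flag, yield).
records = [
    (farm, m, warm, base + ASSUMED_EFFECT * m)
    for farm, (base, warm) in enumerate(zip(farm_baselines, farm_is_warm))
    for m in MOISTURE_LEVELS
]

def mean(values):
    return sum(values) / len(values)

def effect_per_mm(rows):
    """Difference in mean yield between the 3 mm and 2 mm groups, per mm/day."""
    low = mean([y for _, m, _, y in rows if m == 2.0])
    high = mean([y for _, m, _, y in rows if m == 3.0])
    return (high - low) / (3.0 - 2.0)

# Every farm contributes one field to each moisture group, so the share of
# warm farms is the same in all three groups.
warm_share = {
    m: mean([1.0 if warm else 0.0 for _, mm, warm, _ in records if mm == m])
    for m in MOISTURE_LEVELS
}

effect_all = effect_per_mm(records)
effect_warm = effect_per_mm([r for r in records if r[2]])
effect_cool = effect_per_mm([r for r in records if not r[2]])

print("share of warm farms per moisture group:", warm_share)
print(f"effect, all farms:  {effect_all:.2f} t/ha per mm/day")
print(f"effect, warm farms: {effect_warm:.2f} t/ha per mm/day")
print(f"effect, cool farms: {effect_cool:.2f} t/ha per mm/day")
```

Because each farm's baseline appears once in every moisture group, the baselines cancel when group means are compared, so the same effect is recovered whether or not the farms are split by temperature.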