For most data sets, we are interested in understanding the relationships between the variables. However, interpreting relationships must be done with care.
If the relationship between X and Y is causal, it is possible to predict the effect of changing the value of X.
Causality can only be deduced from how the data were collected — the data values themselves do not contain any information about causality.
In an observational study, values are passively recorded from individuals. Experiments are characterised by the experimenter's control over the values of one or more variables.
Causal relationships can only be deduced from well-designed experiments.
Experiments are conducted to assess the effect of a categorical or numerical explanatory variable on a response. The researcher controls which values of the explanatory variable are used.
In many experiments, the units on which the experiment is conducted are not identical. The varying characteristics of the experimental units may affect the response.
If the allocation of treatments to experimental units is related to the characteristics of the units, the experiment may over- or under-estimate the effect of the treatments.
If the experimental units that are chosen to receive one treatment also differ in other ways, it will be impossible to tell whether it is the treatment or the other distinct characteristics of the units that affect the response.
Random allocation of treatments to experimental units avoids systematic over- or under-estimation of treatment effects.
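As a minimal sketch of random allocation, the units can be shuffled and then dealt out to the treatments in turn. The function name `randomly_allocate`, the plot labels and the treatment names A, B and C are hypothetical and chosen only for illustration.

```python
import random


def randomly_allocate(units, treatments, seed=None):
    """Assign treatments to units at random, with group sizes as equal as possible."""
    rng = random.Random(seed)  # a fixed seed makes the allocation reproducible
    shuffled = list(units)
    rng.shuffle(shuffled)
    # Deal the shuffled units out to the treatments in turn, so that
    # group sizes differ by at most one.
    return {
        unit: treatments[i % len(treatments)]
        for i, unit in enumerate(shuffled)
    }


# Hypothetical example: 12 plots and 3 fertiliser treatments.
plots = [f"plot{i}" for i in range(1, 13)]
allocation = randomly_allocate(plots, ["A", "B", "C"], seed=1)
```

Because the shuffle ignores every characteristic of the units, no characteristic can be systematically associated with one treatment, although chance imbalances remain possible in any single allocation.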
If each treatment is applied to more than one experimental unit (replication), unit-to-unit variability can be assessed. This gives information about whether differences between the treatments are more than chance differences.
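One way to see what repeated use of each treatment provides is to compare the difference between two treatment means with a standard error built from the variability within each treatment group. The sketch below assumes two treatments and a common unit-to-unit variance in both groups; the function name `compare_two_treatments` and the response values are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def compare_two_treatments(a, b):
    """Return the difference in means (b minus a) and its standard error."""
    n_a, n_b = len(a), len(b)
    # Pooled variance: the spread of units that received the same
    # treatment estimates unit-to-unit variability.
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    diff = mean(b) - mean(a)
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return diff, std_err


# Made-up responses from four units per treatment.
control = [10.0, 12.0, 11.0, 13.0]
treated = [14.0, 15.0, 13.0, 16.0]
diff, std_err = compare_two_treatments(control, treated)
```

A difference that is large relative to its standard error is unlikely to be a chance difference. With only one unit per treatment, `statistics.variance` raises `StatisticsError`, which mirrors the point that unit-to-unit variability cannot be assessed without repeated units.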
By grouping similar experimental units into blocks and randomly allocating treatments within blocks, the treatment effects can be estimated more accurately.
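Blocking can be sketched in the same way as simple random allocation, except that the shuffle is carried out separately inside each block. This sketch assumes that every block contains exactly one unit per treatment; the function name `allocate_within_blocks`, the field names and the treatment names are hypothetical.

```python
import random


def allocate_within_blocks(blocks, treatments, seed=None):
    """Randomly allocate each treatment once within every block.

    blocks maps a block name to the list of units in that block.
    """
    rng = random.Random(seed)
    allocation = {}
    for block, units in blocks.items():
        if len(units) != len(treatments):
            raise ValueError(f"block {block!r} needs {len(treatments)} units")
        order = list(treatments)
        rng.shuffle(order)  # a fresh random order for each block
        for unit, treatment in zip(units, order):
            allocation[unit] = (block, treatment)
    return allocation


# Hypothetical example: three fields (blocks), each split into three plots.
fields = {
    "field1": ["f1p1", "f1p2", "f1p3"],
    "field2": ["f2p1", "f2p2", "f2p3"],
    "field3": ["f3p1", "f3p2", "f3p3"],
}
block_allocation = allocate_within_blocks(fields, ["A", "B", "C"], seed=1)
```

Since every treatment appears once in every block, differences between blocks affect all treatments equally and do not add to the variability of the comparisons between treatments.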
Why is the experiment being conducted and how will the results be used?
In an experiment, what experimental units will be used? What response variable will be recorded? Which variables will be controlled?
Other practical issues are involved when conducting experiments on human subjects.