The coefficient of determination for experimental data

In experiments, the values of the explanatory variable are usually chosen by the experimenter. It is important to recognise that the coefficient of determination not only depends on the characteristics of the underlying linear model, but...

R² is also affected by the choice of x-values.

Increasing the range of x-values in the experiment will usually increase the value of R².

Conversely, if all x-values are chosen to be similar, R² is likely to be small, whatever the characteristics of the underlying linear model.

For experimental data, R² should not be interpreted as a summary of how strongly Y depends on X, since its value also depends on the x-values used in the experiment.
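The role of the x-values can be made explicit with a standard identity for a least squares line. The notation is introduced here for illustration and does not come from the text above: b_1 is the fitted slope, S_xx is the sum of squared deviations of the x-values about their mean, and SSE is the residual sum of squares.

```latex
R^2 \;=\; \frac{\mathrm{SSR}}{\mathrm{SST}}
    \;=\; \frac{b_1^2\, S_{xx}}{b_1^2\, S_{xx} + \mathrm{SSE}},
\qquad
S_{xx} \;=\; \sum_{i=1}^{n} (x_i - \bar{x})^2 .
```

SSE reflects only the scatter about the line (under the normal linear model its expected value is (n - 2)σ², where σ is the error standard deviation), so it does not depend on how the x-values are spread. Spreading the x-values out increases S_xx and pushes R² towards 1, whereas clustering them shrinks S_xx and pushes R² towards 0, even though the underlying slope and error variability are unchanged.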

Although R² also depends on the range of x-values for observational data, this is less of a problem since the x-values are not chosen by the data collector.

Illustration

The diagram below shows a normal linear model (represented by the grey band) and allows samples to be selected from this model in an experiment. The slider adjusts the x-values used in the experiment.

When the range of x-values becomes small, observe that R² tends to be low.

However, note that R² is not greatly affected by the sample size: increasing the sample size with the same range of x-values results in a similar value of R².
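The behaviour described in this illustration can also be explored with a small simulation. The following is only a sketch under assumed settings: the intercept, slope, error standard deviation, x-ranges and sample sizes are arbitrary illustrative choices, not values from the diagram, and the helper names `r_squared` and `mean_r_squared` are hypothetical. It uses NumPy and equally spaced x-values.

```python
import numpy as np


def r_squared(x, y):
    """R^2 for a least squares line of y on x (the squared sample correlation)."""
    return np.corrcoef(x, y)[0, 1] ** 2


def mean_r_squared(x_low, x_high, n, beta0=1.0, beta1=2.0, sigma=3.0,
                   n_sims=2000, seed=0):
    """Average R^2 over repeated experiments from one normal linear model.

    The model parameters (beta0, beta1, sigma) are assumed values chosen
    purely for illustration.
    """
    rng = np.random.default_rng(seed)
    # The experimenter chooses n equally spaced x-values on [x_low, x_high].
    x = np.linspace(x_low, x_high, n)
    values = []
    for _ in range(n_sims):
        # Responses from the same underlying model in every simulated experiment.
        y = beta0 + beta1 * x + rng.normal(0.0, sigma, size=n)
        values.append(r_squared(x, y))
    return float(np.mean(values))


# Same model throughout; only the experimental design changes.
wide = mean_r_squared(0.0, 10.0, n=20)          # wide range of x-values
narrow = mean_r_squared(4.0, 6.0, n=20)         # narrow range of x-values
wide_large_n = mean_r_squared(0.0, 10.0, n=80)  # wide range, larger sample

print(f"wide range,   n=20: mean R^2 = {wide:.2f}")
print(f"narrow range, n=20: mean R^2 = {narrow:.2f}")
print(f"wide range,   n=80: mean R^2 = {wide_large_n:.2f}")
```

Under these assumed settings, the narrow design should give a much lower average R² than the wide design, while quadrupling the sample size over the same range should change the average only slightly.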