Explained and unexplained variation
On the previous page, we distinguished variation between the treatment means from variation of the response within each treatment. Both are needed to assess whether the factor really affects the response. We now generalise these concepts so that they can be applied to other types of experiment.
Linear model for a numerical factor
The same concepts of explained and unexplained variation apply to experiments with a numerical factor in which the response is modelled by a linear model of the form:
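$$
y_{ij} = \underbrace{\beta_0 + \beta_1 x_i}_{\text{explained by factor}} + \underbrace{\varepsilon_{ij}}_{\text{unexplained}} \qquad \text{for } i = 1, \dots, g \text{ and } j = 1, \dots, n_i
$$
where $\varepsilon_{ij} \sim \text{normal}(0, \sigma)$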
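To make the two parts of the model concrete, the sketch below generates data from it. The function name `simulate_experiment` and the parameter values (4 levels, 10 units per level, β0 = 5, β1 = 2, σ = 1) are assumptions chosen for illustration only; they do not come from any real experiment.

```python
import random

def simulate_experiment(levels, n_per_level, beta0, beta1, sigma, seed=0):
    """Generate (x, y) lists from y_ij = beta0 + beta1 * x_i + eps_ij.

    Hypothetical helper for illustration; all arguments are assumed values.
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for x in levels:                       # i = 1, ..., g
        for _ in range(n_per_level):       # j = 1, ..., n_i
            eps = rng.gauss(0.0, sigma)    # unexplained part, normal(0, sigma)
            xs.append(x)
            ys.append(beta0 + beta1 * x + eps)  # explained part plus error
    return xs, ys

# Assumed settings: 4 factor levels with 10 experimental units at each level.
xs, ys = simulate_experiment(levels=[1, 2, 3, 4], n_per_level=10,
                             beta0=5.0, beta1=2.0, sigma=1.0)
print(len(xs), len(ys))
```

Setting `sigma=0.0` leaves only the explained part, so every response lies exactly on the line β0 + β1 x.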
We are again interested in whether the factor affects the response. For the above linear model, this corresponds to the hypotheses:
$$
H_0 : \beta_1 = 0 \qquad H_A : \beta_1 \ne 0
$$
Assessing whether the factor affects the response
In practice, when this model is fitted to experimental data by least squares, the least squares slope, b1, is unlikely to be exactly zero, even if the factor does not affect the response.
How far from zero should b1 be to allow us to conclude that the factor does affect the response?
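The point that b1 is rarely exactly zero even when β1 = 0 can be checked by simulation. The following is a minimal sketch, assuming the same illustrative design as before (4 levels, 10 units each, σ = 1); the helper `ls_slope` is a hypothetical name, and it uses the standard least squares formula b1 = Sxy / Sxx.

```python
import random

def ls_slope(xs, ys):
    """Least squares slope b1 = Sxy / Sxx."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

rng = random.Random(1)
xs = [x for x in (1, 2, 3, 4) for _ in range(10)]  # assumed design

# Simulate many experiments in which the factor has NO effect (beta1 = 0).
slopes = []
for _ in range(1000):
    ys = [5.0 + rng.gauss(0.0, 1.0) for _ in xs]
    slopes.append(ls_slope(xs, ys))

# The slopes scatter around zero, but individual values are not zero.
print(min(slopes), max(slopes))
```

The spread of these simulated slopes gives a sense of how large b1 can be through chance alone, which is what any conclusion about the factor has to be judged against.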
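Two quantities are again required to answer this question. The variation of the fitted values around the overall mean response describes variation that is explained by the model, and the least squares residuals describe variation that is unexplained by it.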
Both explained and unexplained variation must be used to assess whether the factor really does affect the response.
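One common way to quantify the two kinds of variation is with sums of squares: the squared deviations of the fitted values from the overall mean (explained) and the squared residuals (unexplained). The sketch below is an illustration under that assumption, not necessarily the summary used later in this text; the function name `variation_summary` and the small data set are made up.

```python
def variation_summary(xs, ys):
    """Return (explained, unexplained, total) sums of squares for a
    least squares straight-line fit. Hypothetical helper for illustration."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx                 # least squares slope
    b0 = y_bar - b1 * x_bar        # least squares intercept
    fitted = [b0 + b1 * x for x in xs]
    explained = sum((f - y_bar) ** 2 for f in fitted)
    unexplained = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    total = sum((y - y_bar) ** 2 for y in ys)
    return explained, unexplained, total

# Made-up data: two experimental units at each of four factor levels.
xs = [1, 1, 2, 2, 3, 3, 4, 4]
ys = [2.1, 2.9, 4.2, 3.8, 6.1, 5.7, 8.3, 7.9]
explained, unexplained, total = variation_summary(xs, ys)
print(explained, unexplained, total)
```

For a least squares fit with an intercept, the explained and unexplained sums of squares add up to the total, so the two quantities split the overall variation in the response between them.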
Explained variation
The scatterplot below shows data that might have arisen from a completely randomised experiment in which 10 experimental units are given each of 4 levels of a factor.
Use the slider to alter the difference between the treatment means. Observe that:
Unexplained variation
The diagram below is similar, but the slider adjusts the spread of values around the least squares line.
Observe that ...
Does the factor really affect the response?
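The evidence for the underlying model parameter, β1, being non-zero depends on both the explained and unexplained variation. It is strongest when the explained variation is large and the unexplained variation is small, that is, when the least squares slope is far from zero and the points lie close to the least squares line.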
On the next page, we will show how to obtain summary statistics describing the explained and unexplained variation. These will be used in a formal hypothesis test of whether the factor affects the response.
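The roles of the slope and the error spread described above can also be explored numerically. The following sketch varies β1 and σ in simulated data and reports sums of squares as one possible measure of explained and unexplained variation. The helper names, the design (4 levels, 10 units each) and all parameter values are assumptions for illustration.

```python
import random

def explained_unexplained(xs, ys):
    """Explained and unexplained sums of squares for a least squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    b0 = y_bar - b1 * x_bar
    explained = sum((b0 + b1 * x - y_bar) ** 2 for x in xs)
    unexplained = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    return explained, unexplained

def simulate(beta1, sigma, seed=0):
    """Assumed design: 10 units at each of the levels 1, 2, 3, 4."""
    rng = random.Random(seed)
    xs = [x for x in (1, 2, 3, 4) for _ in range(10)]
    ys = [5.0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]
    return xs, ys

# Shallow slope, steep slope, then steep slope with a larger error spread.
for beta1, sigma in [(0.5, 1.0), (2.0, 1.0), (2.0, 3.0)]:
    e, u = explained_unexplained(*simulate(beta1, sigma))
    print(f"beta1={beta1}, sigma={sigma}: explained={e:.1f}, unexplained={u:.1f}")
```

Increasing `beta1` raises the explained sum of squares, while increasing `sigma` raises the unexplained one, which mirrors the effect of the two sliders.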