One factor with two levels

We first consider models for an experiment with a factor that has two levels and a covariate.



$$
y_{ij} \;=\; \mu \;+\; \overbrace{\beta_i}^{\text{explained by factor}} \;+\; \overbrace{\gamma\, x_{ij}}^{\text{explained by covariate}} \;+\; \varepsilon_{ij}
$$

Taking the first level of the factor as the baseline (β1 = 0), there are only two parameters of interest, β2 and γ, so their significance can be assessed from their least squares estimates and standard errors.
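Writing the model out separately for each level may make the role of the two parameters clearer. The block below is a sketch that assumes the first level is the baseline (β1 = 0) and that x_{ij} is the covariate value for unit j at level i; the t-ratio notation and the symbol n for the total number of units are introduced here for illustration.

```latex
\[
\begin{aligned}
\text{level 1:}\quad & y_{1j} = \mu + \gamma\, x_{1j} + \varepsilon_{1j} \\
\text{level 2:}\quad & y_{2j} = \mu + \beta_2 + \gamma\, x_{2j} + \varepsilon_{2j} \\[4pt]
\text{tests:}\quad & t_{\beta_2} = \frac{\hat\beta_2}{\operatorname{se}(\hat\beta_2)},
\qquad
t_{\gamma} = \frac{\hat\gamma}{\operatorname{se}(\hat\gamma)},
\qquad \text{each on } n - 3 \text{ degrees of freedom}
\end{aligned}
\]
```

Under these assumptions the two levels correspond to parallel lines with common slope γ, and β2 is the vertical distance between them, that is, the difference between the levels at any fixed value of the covariate.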

Pruning sweet cherry trees

A horticulturist wanted to determine the effect of different pruning systems on the yield of sweet cherries. Because there appeared to be differences in the size of the trees in the orchard that was used, he recorded the cross-sectional area (cm²) of each tree at a height of 1 m above the soil surface before the start of the experiment. Six trees were pruned with each of the systems.

Three years after the pruning treatments had begun, measurement of the yield of cherries (kg) from each tree was started. To avoid a possible problem with biennial bearing, the table below shows the mean yield of sweet cherries from the trees pruned with two of the systems.

                       Pruning system
             System 1                  System 4
Tree   Area (cm²)  Yield (kg)    Area (cm²)  Yield (kg)
  1       140         2.1           110         4.1
  2       200         3.5            95         4.1
  3       144         1.8           171         4.9
  4       127         3.6           199         7.1
  5       173         3.8           215         6.5
  6       225         5.0           191         6.2

The diagram below shows the data and the parameter estimates for the model with terms for the pruning system and covariate.

Since the p-value for the pruning system parameter (difference between System 4 and the baseline System 1) is close to zero, we conclude that:

It is almost certain that the two pruning systems result in different cherry yields.

The p-value associated with the covariate Area is also close to zero. Although there is rarely any interest in testing the covariate, it would be valid to also conclude that there is a significant relationship between the cross-sectional area of the trees at the start of the experiment and the response.
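The estimates and standard errors behind these conclusions can be reproduced from the table. The following is a minimal sketch, not the source's own computation: it uses the standard closed-form least squares results for a two-level factor plus one covariate (pooled within-group slope, covariate-adjusted difference), and all function and variable names are hypothetical.

```python
from math import sqrt

# Data copied from the table: trunk cross-sectional area (cm^2) and mean yield (kg).
AREA = {
    "System 1": [140, 200, 144, 127, 173, 225],
    "System 4": [110, 95, 171, 199, 215, 191],
}
YIELD = {
    "System 1": [2.1, 3.5, 1.8, 3.6, 3.8, 5.0],
    "System 4": [4.1, 4.1, 4.9, 7.1, 6.5, 6.2],
}


def mean(values):
    return sum(values) / len(values)


def within_group_sums(x, y):
    """Sums of squares and cross-products about the group means: (Sxx, Sxy, Syy)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    syy = sum((b - my) ** 2 for b in y)
    return sxx, sxy, syy


def fit_two_level_ancova(x1, y1, x2, y2):
    """Least squares fit of y = mu + beta_i + gamma * x with beta_1 = 0 (baseline)."""
    n1, n2 = len(y1), len(y2)
    # Pool the within-group sums over the two levels of the factor.
    sxx, sxy, syy = (a + b for a, b in zip(within_group_sums(x1, y1),
                                           within_group_sums(x2, y2)))
    gamma = sxy / sxx                            # common slope for the covariate
    dx = mean(x2) - mean(x1)
    beta2 = (mean(y2) - mean(y1)) - gamma * dx   # difference adjusted for the covariate
    df = n1 + n2 - 3                             # three parameters: mu, beta2, gamma
    resid_var = (syy - sxy ** 2 / sxx) / df      # residual mean square
    return {
        "beta2": beta2,
        "gamma": gamma,
        "se_beta2": sqrt(resid_var * (1 / n1 + 1 / n2 + dx ** 2 / sxx)),
        "se_gamma": sqrt(resid_var / sxx),
        "df": df,
    }


fit = fit_two_level_ancova(AREA["System 1"], YIELD["System 1"],
                           AREA["System 4"], YIELD["System 4"])
for name in ("beta2", "gamma"):
    est, se = fit[name], fit["se_" + name]
    print(f"{name}: estimate {est:.4f}, s.e. {se:.4f}, "
          f"t = {est / se:.2f} on {fit['df']} df")
```

With the table's data this gives an adjusted difference between the systems of roughly 2.3 kg with a standard error of about 0.42 kg on 9 residual degrees of freedom. Each p-value would then come from comparing the t ratio with a t distribution on those degrees of freedom.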

Removing the covariate from the model shows how much it contributes. Analysing the data without the covariate would have resulted in a much higher standard error for the estimate of the difference between the pruning systems and a less significant effect for the factor.
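Without the covariate, the model reduces to a comparison of two group means with a pooled variance. The sketch below, with hypothetical names and under that assumption, computes the unadjusted difference and its standard error from the yields in the table.

```python
from math import sqrt

# Mean yields (kg) copied from the table; the covariate is deliberately ignored.
YIELD = {
    "System 1": [2.1, 3.5, 1.8, 3.6, 3.8, 5.0],
    "System 4": [4.1, 4.1, 4.9, 7.1, 6.5, 6.2],
}


def pooled_difference(y1, y2):
    """Difference in group means with its pooled standard error and residual df."""
    n1, n2 = len(y1), len(y2)
    m1, m2 = sum(y1) / n1, sum(y2) / n2
    # Residual sum of squares about each group's own mean.
    rss = sum((v - m1) ** 2 for v in y1) + sum((v - m2) ** 2 for v in y2)
    df = n1 + n2 - 2                 # two parameters: mu and beta2
    se = sqrt(rss / df * (1 / n1 + 1 / n2))
    return m2 - m1, se, df


diff, se, df = pooled_difference(YIELD["System 1"], YIELD["System 4"])
print(f"difference {diff:.4f}, s.e. {se:.4f}, t = {diff / se:.2f} on {df} df")
```

For these data the standard error of the unadjusted difference is about 0.71 kg, so the t ratio is correspondingly smaller even though one more residual degree of freedom is available.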

If the values for a covariate are known, it should always be used in the model.


Models with more factors and factor levels

If the factor of interest has more than two levels, analysis of variance must be used to test whether the factor has an effect. Analysis of variance is also commonly used when there are two or more factors, as in the example below.
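As a reminder of what the analysis of variance test computes, the sketch below gives the usual F ratio for comparing the model without the factor to the model with it. The symbols are assumptions introduced here: RSS_0 and RSS_1 are the residual sums of squares without and with the factor, g is the number of factor levels, n the number of observations and p the number of parameters in the larger model.

```latex
\[
F \;=\; \frac{(\mathrm{RSS}_0 - \mathrm{RSS}_1)\,/\,(g - 1)}{\mathrm{RSS}_1\,/\,(n - p)}
\;\sim\; F_{\,g-1,\; n-p} \quad \text{if the factor has no effect}
\]
```

When the factor has only two levels (g = 2), this F ratio is the square of the t ratio for β2, so the two approaches give the same p-value.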

Inhibitor and bacteria growth

A bacteriologist investigated the effect of various inhibitors on the growth rate of colonies of a species of bacteria. Five different inhibitors were used in the experiment (four established ones and a new contender). Temperature was also known to affect growth, so it was controlled, and the experiment was conducted as a factorial experiment with the five inhibitors and five temperatures (10, 15, 20, 25 and 30°C).

Twenty-five cultures were prepared and the potential growth rate of each was determined. Each culture was treated with one of the inhibitors and left to incubate at one of the temperatures for eight hours, after which its actual growth rate was determined.

The potential growth rate is a covariate and should always be added first to the model in the analysis of variance table. Since the two factors are orthogonal in this factorial design (with one replicate for each combination of inhibitor and temperature), the sums of squares for temperature and inhibitor are the same in whichever order they are subsequently added to the model.

Adding the terms to the model one at a time shows that, although the linear term for temperature is not significant, there is moderately strong evidence that a nonlinear term in temperature is needed, so we should retain both terms in the model. The p-value for differences between the inhibitors is 0.1832, so we should conclude that:

Even after taking account of differences between the potential growth rates of the cultures and differences between the temperatures, there is no evidence of differences between the inhibitors.

A further refinement of the analysis might split the sum of squares for the inhibitors into a sum of squares with 3 degrees of freedom for comparing the four established inhibitors, and a sum of squares with 1 degree of freedom for comparing the new inhibitor to the established ones.
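This split can be written in terms of the inhibitor effects. The labelling below is an assumption made for illustration: β1 to β4 denote the effects of the four established inhibitors and β5 the effect of the new one.

```latex
\[
\underbrace{\mathrm{SS}_{\text{inhibitors}}}_{4\ \text{df}}
\;=\;
\underbrace{\mathrm{SS}_{\text{among established}}}_{3\ \text{df}}
\;+\;
\underbrace{\mathrm{SS}_{\text{new vs established}}}_{1\ \text{df}}
\]
\[
H_0^{(3\ \text{df})}:\ \beta_1 = \beta_2 = \beta_3 = \beta_4,
\qquad
H_0^{(1\ \text{df})}:\ \beta_5 - \tfrac{1}{4}\,(\beta_1 + \beta_2 + \beta_3 + \beta_4) = 0
\]
```

The single-degree-of-freedom comparison concentrates on the question of most practical interest, whether the new inhibitor differs from the average of the established ones, and so can be more sensitive than the overall 4 degree of freedom test.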