Estimating the error variance
In 2^k factorial experiments with k = 4 or more factors, usually only a single replicate of the experiment is conducted. The model with all main effects and interactions therefore fits the data exactly, leaving no residual degrees of freedom. This means that the natural experimental variability (error variance) cannot be estimated. Without an estimate of the error variance, it is impossible to test the significance of the terms in the model.
In the analysis of data from a standard 2^k factorial experiment with a single replicate, hypothesis tests are only possible if we assume that there are no high-order interactions between the factors. Fitting a model without the high-order interactions is equivalent to adding their sums of squares and treating the total as a residual sum of squares.
At least 5 residual degrees of freedom are recommended in order to get reasonable power in hypothesis tests.
The higher-order interactions that are assumed to be zero should be chosen before the analysis. If effects are instead chosen because their estimates turn out to be small, the error standard deviation is likely to be underestimated.
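The pooling described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the responses are simulated (they are not data from any real study), and the names design, terms, ss and pooled are hypothetical. It computes a 1-degree-of-freedom sum of squares for each of the 15 terms in a 2^4 design, then adds those of the 3-factor and 4-factor interactions to form a residual sum of squares.

```python
import itertools
import math
import random

k = 4
# Full 2^4 design in coded units (-1, +1): 16 runs.
design = list(itertools.product([-1, 1], repeat=k))

# Hypothetical responses, simulated for illustration only.
random.seed(0)
y = [10 + 1.5 * r[0] - 0.8 * r[1] * r[2] + random.gauss(0, 0.2) for r in design]
n = len(y)

# All 15 main effects and interactions, as tuples of factor indices.
terms = [c for order in range(1, k + 1)
         for c in itertools.combinations(range(k), order)]

def sum_of_squares(term):
    # Contrast column = product of the +/-1 levels of the factors in the term.
    contrast = [math.prod(run[i] for i in term) for run in design]
    # Sum of squares for a +/-1 contrast over n runs (1 degree of freedom).
    return sum(c * v for c, v in zip(contrast, y)) ** 2 / n

ss = {term: sum_of_squares(term) for term in terms}

# The 15 single-df sums of squares add up to the total about the mean,
# which is why the full model leaves nothing for the residual.
mean_y = sum(y) / n
total_ss = sum((v - mean_y) ** 2 for v in y)

# Pool the terms chosen in advance: all 3- and 4-factor interactions.
pooled = [t for t in terms if len(t) >= 3]
residual_ss = sum(ss[t] for t in pooled)
residual_df = len(pooled)
error_variance = residual_ss / residual_df  # mean residual sum of squares
```

With four factors, pooling the four 3-factor interactions and the single 4-factor interaction gives exactly the 5 residual degrees of freedom recommended above.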
Analysis of variance
Provided there are some residual degrees of freedom, an analysis of variance table can be completed. It shows the sum of squares explained by each model term (1 degree of freedom each) and the corresponding p-value, which indicates whether that sum of squares is significantly higher than would be expected from the background experimental variability.
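As a rough sketch of how each row of such a table is formed, the F ratio for a term is its sum of squares divided by the mean residual sum of squares. The function name f_ratios and the numbers below are hypothetical, chosen only to illustrate the arithmetic.

```python
def f_ratios(term_ss, residual_ss, residual_df):
    """Return {term: F}, where F = SS_term / mean residual SS (1 df per term)."""
    if residual_df <= 0:
        # With a saturated model there is no error estimate to compare against.
        raise ValueError("no residual degrees of freedom: F ratios are undefined")
    mean_residual_ss = residual_ss / residual_df
    return {term: ss / mean_residual_ss for term, ss in term_ss.items()}

# Hypothetical sums of squares, for illustration only.
term_ss = {"A": 2.0, "B": 0.05, "AB": 0.4}
ratios = f_ratios(term_ss, residual_ss=0.5, residual_df=5)
```

The p-value for a term is the upper tail probability of its F ratio in an F distribution with 1 and residual_df degrees of freedom. Assuming SciPy is available, scipy.stats.f.sf(F, 1, residual_df) is one way to obtain it.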
Sugar reduction study
In this example, there are 15 main effects and interactions, each of which has 1 parameter (and hence 1 degree of freedom). Including 1 degree of freedom for the overall mean, the full model has 16 degrees of freedom for the 16 observations so it fits the data perfectly with zero residuals and no residual degrees of freedom.
In order to test the significance of some terms in this model, we must assume that there are no high-order interactions. Removing the 4-factor interaction and all 3-factor interactions from the model combines the sums of squares for these terms to form a residual sum of squares with 5 degrees of freedom.
The mean residual sum of squares, 0.020, estimates the error variance for the experiment. We therefore estimate that the standard deviation of the experimental error is the square root of this, 0.14.
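The counts and the square root in this example can be checked with a short calculation. It assumes four two-level factors, which is inferred from the 16 observations and 15 effects rather than stated outright. The variable names are hypothetical, and 0.020 is the mean residual sum of squares quoted above.

```python
from math import comb, sqrt

k = 4  # assumption: inferred from the 16 observations (2**4)
runs = 2 ** k

# Number of terms of each order: main effects, 2-, 3- and 4-factor interactions.
terms_by_order = {order: comb(k, order) for order in range(1, k + 1)}

# One df for the overall mean plus one per term: the full model uses every run.
full_model_df = 1 + sum(terms_by_order.values())

# Removing the 3-factor and 4-factor interactions frees their df for the residual.
residual_df = terms_by_order[3] + terms_by_order[4]

mean_residual_ss = 0.020  # value quoted in the text
error_sd = sqrt(mean_residual_ss)  # estimated standard deviation of the error
```

Here full_model_df equals runs, confirming that the full model leaves no residual degrees of freedom, and error_sd rounds to the 0.14 reported above.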
From the p-values, we conclude that several 2-factor interactions are significant.
Alternative experimental designs
Replication of a 2^k factorial experiment is the best way to estimate the error variance without assuming any interactions to be negligible, but even a single replicate can be prohibitively expensive.
An alternative way to estimate the error variance involves a few extra runs of the experiment at centre points of the design. Designs with centre points will be described later in this section.