In an experiment with two factors, the treatments are the combinations of levels of the two factors. A factorial experiment uses the same number of replicates for every possible treatment and randomly allocates the treatments to the experimental units.
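For example, a randomised allocation for a 3×2 factorial design with three replicates could be generated along the following lines (a minimal Python sketch; the factor names and levels are purely illustrative):

    import itertools
    import random

    doses = ["low", "medium", "high"]     # levels of the first factor (illustrative)
    varieties = ["A", "B"]                # levels of the second factor (illustrative)
    replicates = 3

    # Every treatment (combination of levels) appears the same number of times.
    treatments = list(itertools.product(doses, varieties)) * replicates

    # Random allocation of the treatments to the experimental units.
    random.shuffle(treatments)
    for unit, (dose, variety) in enumerate(treatments, start=1):
        print(f"unit {unit:2d}: dose={dose:6s} variety={variety}")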
This page shows examples of factorial experiments for two factors.
Sometimes the effect of changing one factor depends on the level of the other factor. A factorial experiment is needed to assess this interaction between factors.
Varying a second factor in an experiment does not reduce the accuracy of the estimated effect of the first factor. It is therefore far more efficient to assess two factors in a single factorial experiment than in separate experiments for each factor.
Sometimes two factors are thought to affect a response but they cannot be adjusted independently in an experiment. In other experiments, it is impossible to set the value of the 'controlled' factor precisely, so the factor level varies randomly.
The response is modelled using a normal distribution whose mean depends on the factor levels and whose standard deviation is the same for all treatments.
If the response does not have a similar spread of values for all treatments, a nonlinear transformation of the response such as a logarithmic transform may help.
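As a rough illustration, the spread within each treatment can be compared before and after a log transformation (a Python sketch with made-up data values):

    import numpy as np
    import pandas as pd

    df = pd.DataFrame({
        "dose":    ["low"] * 4 + ["high"] * 4,
        "variety": ["A", "A", "B", "B"] * 2,
        "y":       [3.1, 2.9, 4.0, 4.4, 21.0, 34.0, 48.0, 30.0],
    })

    # Standard deviation of the response within each treatment.
    print(df.groupby(["dose", "variety"])["y"].std())

    # After a log transformation the spread is often more similar.
    df["log_y"] = np.log(df["y"])
    print(df.groupby(["dose", "variety"])["log_y"].std())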
The mean response can be modelled as the sum of two terms, each depending on one of the two factors.
The no-interaction model for two categorical factors uses a parameter for each level of each factor, except for the factor's 'baseline' level.
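A sketch of how such a model might be fitted with the statsmodels package in Python (the data frame and its column names are illustrative); the default 'treatment' coding omits a parameter for the baseline level of each factor:

    import pandas as pd
    import statsmodels.formula.api as smf

    df = pd.DataFrame({
        "dose":    ["low", "low", "medium", "medium", "high", "high"] * 2,
        "variety": ["A"] * 6 + ["B"] * 6,
        "y":       [5.1, 4.8, 6.2, 6.0, 7.3, 7.1, 6.0, 5.7, 7.2, 6.9, 8.4, 8.0],
    })

    # Mean response = intercept + dose effect + variety effect (no interaction).
    # With 'treatment' coding there is one parameter for each non-baseline level.
    model = smf.ols("y ~ C(dose) + C(variety)", data=df).fit()
    print(model.params)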
When terms are added to a model for the response, the residual sum of squares usually decreases.
The reductions in the residual sum of squares are called explained sums of squares. The explained sums of squares also summarise how the fitted values change when terms are added to the model.
If the experimental design is balanced, all explained sums of squares can be shown in a single table.
The explained sums of squares form the basis of an analysis of variance table that can be used to test the significance of the two factors in the model.
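For a balanced design, the explained sums of squares can be found directly from the marginal means, as in the following Python sketch (the data are purely illustrative):

    import pandas as pd

    df = pd.DataFrame({
        "dose":    ["low", "low", "high", "high"] * 2,
        "variety": ["A"] * 4 + ["B"] * 4,
        "y":       [5.1, 4.8, 7.3, 7.1, 6.0, 5.7, 8.4, 8.0],
    })

    grand_mean = df["y"].mean()
    total_ss = ((df["y"] - grand_mean) ** 2).sum()

    # In a balanced design, each factor's explained sum of squares can be found
    # from its marginal means and does not depend on the order of adding terms.
    explained = {
        factor: sum(len(g) * (g["y"].mean() - grand_mean) ** 2
                    for _, g in df.groupby(factor))
        for factor in ["dose", "variety"]
    }
    residual_ss = total_ss - sum(explained.values())   # no-interaction model
    print(explained, residual_ss)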
If two factors do not interact in their effect on the response, the effects of each can be separately described.
A model in which two factors interact in their effect on the response has a separately adjustable mean for each combination of factor levels. The model can be written using 'main effect' parameters for the two factors and an interaction term.
Comparing the mean interaction sum of squares against the mean residual sum of squares gives a test for whether there is interaction.
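One way to obtain this F-test is from a sequential analysis of variance table, sketched here with statsmodels (the data frame and column names are illustrative):

    import pandas as pd
    import statsmodels.api as sm
    import statsmodels.formula.api as smf

    df = pd.DataFrame({
        "dose":    ["low", "low", "high", "high"] * 2,
        "variety": ["A"] * 4 + ["B"] * 4,
        "y":       [5.1, 4.8, 7.3, 7.1, 6.0, 5.7, 9.4, 9.0],
    })

    # 'C(dose)*C(variety)' expands to both main effects plus their interaction.
    model = smf.ols("y ~ C(dose) * C(variety)", data=df).fit()

    # Sequential (Type I) ANOVA table: the interaction line compares the mean
    # interaction sum of squares with the mean residual sum of squares.
    print(sm.stats.anova_lm(model, typ=1))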
If it is concluded that there is no interaction, the results can be summarised in separate plots of the mean response against X and Z. If there is interaction, the model means for all treatment combinations must be shown in profile plots.
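A profile plot of the treatment means might be drawn along these lines (a matplotlib sketch with illustrative data):

    import matplotlib.pyplot as plt
    import pandas as pd

    df = pd.DataFrame({
        "dose":    ["low", "low", "high", "high"] * 2,
        "variety": ["A"] * 4 + ["B"] * 4,
        "y":       [5.1, 4.8, 7.3, 7.1, 6.0, 5.7, 9.4, 9.0],
    })

    # Treatment means plotted against one factor, with a line for each level
    # of the other factor.
    means = df.groupby(["dose", "variety"])["y"].mean().unstack("variety")
    for variety in means.columns:
        plt.plot(means.index, means[variety], marker="o", label=f"variety {variety}")
    plt.xlabel("dose")
    plt.ylabel("mean response")
    plt.legend()
    plt.show()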
If there is only a single replicate for each treatment in a factorial experiment for two categorical factors, the effects of the factors can be tested if it is assumed that there is no interaction, but the existence of interaction cannot be tested.
The existence and amount of interaction is affected by nonlinear transformations of the response. Sometimes analysing logarithms of the response values can remove interaction, making the results easier to interpret.
Numerical factors can be modelled as though they were categorical but the resulting relationship may not be 'smooth'.
If the controlled factors in an experiment are numerical, the response mean can be modelled using linear terms in the two variables. This model corresponds to a plane in 3 dimensions and can be fitted by least squares.
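A sketch of fitting such a plane by least squares with numpy (the x, z and y values are illustrative):

    import numpy as np

    x = np.array([10, 10, 20, 20, 30, 30], dtype=float)
    z = np.array([1, 2, 1, 2, 1, 2], dtype=float)
    y = np.array([4.9, 6.1, 6.8, 8.2, 9.1, 10.3])

    # Fit  mean = b0 + b1*x + b2*z  by least squares.
    X = np.column_stack([np.ones_like(x), x, z])       # design matrix
    coef, rss, rank, sv = np.linalg.lstsq(X, y, rcond=None)
    b0, b1, b2 = coef
    print(b0, b1, b2)    # rss holds the residual sum of squares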
The reductions in the residual sum of squares from adding a linear term for each factor, and then generalising the linear term to a categorical one, are explained sums of squares. They can be used in an analysis of variance table to test whether the two factors affect the response and to test for curvature in their effects.
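The curvature test can also be obtained by comparing the two nested models directly; a statsmodels sketch (illustrative data, with only the term for x generalised to a categorical one):

    import pandas as pd
    import statsmodels.api as sm
    import statsmodels.formula.api as smf

    df = pd.DataFrame({
        "x": [10, 10, 20, 20, 30, 30] * 2,
        "z": [1, 2] * 6,
        "y": [4.9, 6.1, 6.8, 8.2, 7.3, 8.6, 5.1, 6.0, 7.0, 8.1, 7.2, 8.8],
    })

    linear      = smf.ols("y ~ x + z", data=df).fit()
    categorical = smf.ols("y ~ C(x) + z", data=df).fit()

    # The extra explained sum of squares from generalising the linear term to
    # a categorical one gives an F-test for curvature in the effect of x.
    print(sm.stats.anova_lm(linear, categorical))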
Interaction occurs when the effect of X on the mean response is different for different values of Z. If X and Z are numerical, adding a term with the product of X and Z to the model may explain the interaction.
When X is numerical and Z is categorical, interaction can be modelled with a separately adjustable regression line for each value of Z.
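Both kinds of interaction term can be written as model formulas; a statsmodels sketch with an illustrative data frame (the column names x, z and group are assumptions for the example):

    import pandas as pd
    import statsmodels.formula.api as smf

    df = pd.DataFrame({
        "x":     [10, 20, 30, 10, 20, 30, 10, 20, 30, 10, 20, 30],
        "z":     [1, 1, 1, 2, 2, 2, 1, 1, 1, 2, 2, 2],
        "group": ["A"] * 6 + ["B"] * 6,
        "y":     [4.9, 6.8, 9.1, 6.1, 8.2, 10.3, 5.2, 7.9, 11.0, 6.4, 9.6, 12.5],
    })

    # Both factors numerical: the product term x:z lets the effect of x change
    # with the value of z.
    numeric_interaction = smf.ols("y ~ x + z + x:z", data=df).fit()

    # x numerical, group categorical: x * C(group) fits a separately adjustable
    # regression line (intercept and slope) for each level of group.
    lines_by_group = smf.ols("y ~ x * C(group)", data=df).fit()
    print(lines_by_group.params)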
Replacing a linear term by a quadratic allows some curvature without the full flexibility of a categorical term. Quadratic models will be fully investigated in a later chapter.
A factorial design uses the same number of runs for all possible combinations of factor levels. A simple model adds terms for the separate factors to a normally distributed error.
The model parameters are estimated to minimise the sum of squared residuals.
In many experiments, the effect of altering one factor depends on the values of the others. Interactions can exist between the effects of pairs of factors. A three-factor interaction involves all factors.
Adding main effects and interaction terms to the model reduces the residual sum of squares. The reductions are explained sums of squares and can be used to test the significance of the terms.
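A sketch of the corresponding analysis of variance for three factors with statsmodels (the factor names, levels and responses are illustrative):

    import pandas as pd
    import statsmodels.api as sm
    import statsmodels.formula.api as smf

    # Two replicates of a 2x2x2 factorial design (illustrative).
    cells = [(t, p, c) for t in ("low", "high")
                       for p in ("low", "high")
                       for c in ("A", "B")]
    df = pd.DataFrame(cells * 2, columns=["temp", "pressure", "catalyst"])
    df["y"] = [60, 72, 54, 68, 52, 83, 45, 80,
               62, 70, 55, 66, 50, 85, 47, 78]

    # temp*pressure*catalyst expands to all main effects, all two-factor
    # interactions and the three-factor interaction.
    model = smf.ols("y ~ temp * pressure * catalyst", data=df).fit()

    # Sequential ANOVA table: the explained sum of squares for each term.
    print(sm.stats.anova_lm(model, typ=1))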
The main effects and interactions can be displayed in 2-dimensional plots of the mean response against each explanatory variable.
If there is only one replicate for each treatment, the full model with all interactions fits the data perfectly, leaving no residual degrees of freedom, so the main effects and interactions cannot be tested in that model. Tests are only possible if some high-order interactions are assumed to be negligible (i.e. zero).
If several factors must be assessed, factorial designs with more than 2 levels per factor usually involve too many runs of the experiment.
The factor levels can be coded as -1 and +1 and treated as numerical variables. The main effects for the factors are differences between the response means at the two levels.
Interactions can be modelled with terms that are products of the +/-1 values of the coded main effect variables. The main effects and interactions are orthogonal in a factorial design.
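A Python sketch of the coded columns, effect estimates and orthogonality check for a 2^3 design (the response values are illustrative):

    import itertools
    import numpy as np

    # All 8 runs of a 2^3 design with factors coded as -1/+1.
    design = np.array(list(itertools.product([-1, 1], repeat=3)), dtype=float)
    A, B, C = design.T
    columns = {
        "A": A, "B": B, "C": C,
        "AB": A * B, "AC": A * C, "BC": B * C, "ABC": A * B * C,
    }

    y = np.array([60.0, 72.0, 54.0, 68.0, 52.0, 83.0, 45.0, 80.0])

    # Each main effect (or interaction) is the difference between the mean
    # response at the +1 level and the mean response at the -1 level.
    effects = {name: y[col == 1].mean() - y[col == -1].mean()
               for name, col in columns.items()}
    print(effects)

    # Orthogonality: every pair of coded columns has zero dot product.
    M = np.column_stack(list(columns.values()))
    print(M.T @ M)    # diagonal matrix (8 on the diagonal)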
If there are no replicates, the full model with all interactions has no residual degrees of freedom. Some high-order interactions must be negligible (assumed to be zero) before tests can be performed.
A half-normal probability plot of the main effects and interactions can guide the choice of terms to contribute to the residual sum of squares.
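A half-normal plot of the absolute effects might be drawn as follows (a matplotlib/scipy sketch using illustrative effect estimates):

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy import stats

    # Illustrative estimated main effects and interactions.
    effects = {"A": 23.0, "B": -4.5, "C": 1.5, "AB": 0.8,
               "AC": 13.2, "BC": -1.1, "ABC": 0.4}

    names = sorted(effects, key=lambda k: abs(effects[k]))
    abs_effects = np.array([abs(effects[k]) for k in names])

    # Half-normal quantiles for the ordered absolute effects.
    n = len(abs_effects)
    probs = (np.arange(1, n + 1) - 0.5) / n
    quantiles = stats.norm.ppf(0.5 + probs / 2)

    # Effects that lie well above the straight-line pattern of the small ones
    # are unlikely to be negligible.
    plt.scatter(quantiles, abs_effects)
    for q, e, name in zip(quantiles, abs_effects, names):
        plt.annotate(name, (q, e))
    plt.xlabel("half-normal quantile")
    plt.ylabel("|effect|")
    plt.show()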
Two-dimensional plots of the mean response against each factor can display the main effects and interactions.
If all factors are numerical, an alternative to replicating the complete design or assuming that high-order interactions are negligible is to conduct several runs of the experiment at an average level of all factors. This allows tests of all interactions and also provides a test for linearity.
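One standard way to use such centre runs is sketched below (Python, with illustrative numbers): the replicate centre runs estimate the error variance, and comparing their mean with the mean of the factorial runs gives a test for curvature.

    import numpy as np

    factorial_y = np.array([60.0, 72.0, 54.0, 68.0, 52.0, 83.0, 45.0, 80.0])
    centre_y = np.array([63.5, 65.0, 64.2, 66.1])     # replicate centre runs

    # Pure-error estimate of the response variance from the centre replicates.
    pure_error_var = centre_y.var(ddof=1)

    # If the response surface were linear in the factors, the centre-point
    # mean would be close to the mean of the factorial runs.
    curvature = centre_y.mean() - factorial_y.mean()
    n_f, n_c = len(factorial_y), len(centre_y)
    se = np.sqrt(pure_error_var * (1 / n_f + 1 / n_c))
    print(curvature / se)                              # t-statistic for curvature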