The combinations of factor levels used are called treatments. The sum of squares explained by the treatments (treated as a single factor) is closely related to those of the main effects and interaction between the actual factors.
The model with no interaction between two factors can be expressed with constraints on the parameters of the most general model. Similarly, models without main effects can be expressed in terms of constraints.
Models where the control level of a single factor equals the average for the other treatments, and where all non-control levels are equal, can both be expressed in terms of constraints.
The treatment sum of squares for an experiment with a control level can be split into a sum of squares explained by the difference between the control and other levels, and a sum of squares explained by differences between the non-control levels. This allows both constraints to be tested.
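This split can be computed directly from the level means. The sketch below uses made-up data (a hypothetical control and three other levels, equal replicates) and checks that the two components add up to the full treatment sum of squares:

```python
import math

# Hypothetical data: a control level and three other treatments, 4 replicates each.
groups = {
    "control": [12.1, 11.8, 12.5, 11.9],
    "A":       [13.0, 13.4, 12.8, 13.2],
    "B":       [13.1, 13.5, 13.3, 12.9],
    "C":       [14.0, 14.3, 13.8, 14.1],
}

def mean(ys):
    return sum(ys) / len(ys)

all_obs = [y for ys in groups.values() for y in ys]
grand = mean(all_obs)

# Treatment sum of squares: between all four level means.
ss_treat = sum(len(ys) * (mean(ys) - grand) ** 2 for ys in groups.values())

# Component 1: control versus the pooled non-control levels.
others = [y for name, ys in groups.items() if name != "control" for y in ys]
ss_control = (len(groups["control"]) * (mean(groups["control"]) - grand) ** 2
              + len(others) * (mean(others) - grand) ** 2)

# Component 2: differences between the non-control levels.
m_others = mean(others)
ss_other_levels = sum(len(ys) * (mean(ys) - m_others) ** 2
                      for name, ys in groups.items() if name != "control")

# The two components add up to the full treatment sum of squares.
assert math.isclose(ss_treat, ss_control + ss_other_levels)
```

Each component can then be tested against the residual mean square in the usual way.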
The treatment sum of squares can be split into components for differences between two groups of treatments, and for differences within these groups.
The above methods for splitting the treatment sum of squares can also be used in more complex experimental designs such as randomised blocks.
An alternative approach to analysing the structure of the levels of a factor estimates linear functions of the model parameters called contrasts.
If the levels of the controlled factor are numerical values -- quantities of some numerical variable -- then we usually expect a 'smooth' relationship between the factor and response.
The simplest model for a smooth relationship between a numerical factor and response contains a linear term in the factor value. Adding a further quadratic term can model some curvature in the relationship.
Linear and quadratic models correspond to terms added to a model in which the factor does not affect the response. They can be equivalently defined through constraints on the parameters of the most general factor model.
The differences between the residual sums of squares of different models for a numerical factor are called explained sums of squares, and have degrees of freedom equal to the difference between the numbers of unknown parameters in the models.
The explained and residual sums of squares can be presented in an analysis of variance table. The p-values can be used to test whether quadratic or linear relationships are consistent with the data.
If there are only two levels of the factor, a linear model is equivalent to a model with parameters for each level mean. Similarly, a quadratic model is equivalent to this general factor model if there are three factor levels.
In any model with a numerical factor, its explained sum of squares can be split into linear and nonlinear components. The nonlinear sum of squares can be used to test whether the relationship is linear.
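The linear component is the sum of squares explained by least squares regression on the factor values, and the nonlinear component is the remainder of the treatment sum of squares. A minimal sketch with hypothetical dose-response data (three replicates at each of four doses, with nearly collinear level means):

```python
# Hypothetical numerical factor (dose) with three replicates per level.
data = [(x, y) for x, ys in [
    (1, [5.1, 4.9, 5.3]),
    (2, [7.0, 7.2, 6.8]),
    (3, [9.1, 8.9, 9.0]),
    (4, [10.9, 11.2, 11.0]),
] for y in ys]

n = len(data)
xbar = sum(x for x, _ in data) / n
ybar = sum(y for _, y in data) / n
sxx = sum((x - xbar) ** 2 for x, _ in data)
sxy = sum((x - xbar) * (y - ybar) for x, y in data)

# Sum of squares explained by the linear term (simple regression).
ss_linear = sxy ** 2 / sxx

# Full explained (treatment) sum of squares: between the level means.
ss_treat = 0.0
for lv in sorted({x for x, _ in data}):
    ys = [y for x, y in data if x == lv]
    ss_treat += len(ys) * (sum(ys) / len(ys) - ybar) ** 2

# The nonlinear remainder tests whether a straight line is adequate.
ss_nonlinear = ss_treat - ss_linear
assert ss_nonlinear >= -1e-9  # never negative, up to rounding
```

Because the linear model is a special case of the general factor model, the nonlinear sum of squares is always non-negative; a large value relative to the residual mean square indicates curvature.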
In some applications, we want to find the combination of values of two factors that optimises some response. A model with quadratic terms in the factors and an interaction term is often used.
If several hypothesis tests are conducted with the same significance level, the probability of at least one being significant can be much higher. A modification allows the overall significance level to be fixed.
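The simplest such modification is the Bonferroni adjustment: with m tests and overall significance level alpha, each individual p-value is compared to alpha/m. A small sketch with hypothetical p-values:

```python
# Bonferroni adjustment: to keep the overall (family-wise) significance level
# at alpha across m tests, compare each p-value to alpha / m.
def bonferroni_reject(p_values, alpha=0.05):
    m = len(p_values)
    return [p < alpha / m for p in p_values]

# Hypothetical p-values from four tests; the adjusted threshold is 0.05/4 = 0.0125.
p_values = [0.004, 0.020, 0.049, 0.300]
print(bonferroni_reject(p_values))  # → [True, False, False, False]
```

Note that 0.020 and 0.049 would each be significant at the 5% level on their own, but not after the adjustment.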
If it is concluded that the response mean depends on the factor level, it is natural to ask which levels differ. If the experiment involves several factor levels, t-tests to compare pairs of means should be avoided -- with many such tests, there is an increased probability of a low p-value for at least one pair of means.
A wider least significant difference than that suggested by a single pairwise comparison should be used when there are several group means to compare.
A similar adjustment can be made for comparison of treatments in a randomised block design.
If there is any structure to the factor levels, it is often best to use contrasts to make more meaningful comparisons between the levels.
If some response measurements are missing, the ordinary analysis of variance table can still be used to test whether the level means are equal, provided the probability of a missing value is unrelated to the response that would have been obtained.
If a factor has a control level, it is sometimes replicated more times than the other factor levels.
If all treatments have the same number of replicates, the sums of squares for the two factors do not depend on the order of adding them to the model. The same holds provided the replicates for each level of X are in the same proportion for each level of Z.
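This order-independence can be checked numerically. The sketch below uses a hypothetical balanced two-factor experiment (two levels of each factor, two replicates per treatment) and compares the sum of squares for X added first with the sum of squares for X added after Z; it relies on the fact that, for balanced data, the additive least squares fit for a cell is the X level mean plus the Z level mean minus the grand mean:

```python
import math

# Hypothetical balanced experiment: factors X and Z, two levels each,
# two replicates per treatment (so X and Z are orthogonal).
obs = {  # (x, z) -> replicate responses
    (0, 0): [10.0, 10.4],
    (0, 1): [12.1, 11.9],
    (1, 0): [13.0, 13.2],
    (1, 1): [15.1, 14.7],
}

ys = [y for v in obs.values() for y in v]
grand = sum(ys) / len(ys)

def level_mean(factor, lv):
    sel = [y for (x, z), v in obs.items() for y in v
           if (x if factor == "x" else z) == lv]
    return sum(sel) / len(sel)

# SS for X when it is added first (ignoring Z).
ss_x_first = sum(
    sum(len(v) for (x, z), v in obs.items() if x == lv)
    * (level_mean("x", lv) - grand) ** 2
    for lv in (0, 1)
)

# SS for X when added after Z: drop in residual SS from the Z-only model
# to the additive model.  In a balanced design the additive least squares
# fitted value for cell (x, z) is  xmean + zmean - grand.
rss_z = sum((y - level_mean("z", z)) ** 2
            for (x, z), v in obs.items() for y in v)
rss_xz = sum((y - (level_mean("x", x) + level_mean("z", z) - grand)) ** 2
             for (x, z), v in obs.items() for y in v)
ss_x_after_z = rss_z - rss_xz

# With equal replicates, the order of adding the factors does not matter.
assert math.isclose(ss_x_first, ss_x_after_z)
```

With unbalanced, non-proportional replicates the two sums of squares would differ, which is why both orderings must then be examined.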
If two factors are not orthogonal, there are different analysis of variance tables for the two orders of adding the factors. Both must be examined to fully understand the data.
Provided the replicates of a factor in each block are in proportion, the blocks and factor are orthogonal. However, this is less important since blocks are always added to the model before the factor, so only a single analysis of variance table need be considered.
This section has only considered data with at least one replicate for every treatment. Missing treatments mean that some effects cannot be estimated. In extreme cases, it is impossible to distinguish the effects of different factors -- they are confounded.
The more variability in the experimental units that is explained by the model used, the greater the accuracy of estimating the effects of the factors of interest.
Randomisation ensures that the effects of factors are always estimated without bias. However, variation between the units getting different treatments can result in very variable estimates if the variation in the experimental units is not modelled.
Numerical measurements describing variability in the experimental units can be modelled with linear terms in the model.
There is usually no interest in testing the significance of covariates. Inference about the model terms relating to the factors of interest can be done with confidence intervals and analysis of variance as usual.
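The gain from a covariate shows up as a reduction in the residual sum of squares. The sketch below, with hypothetical data for a two-level factor and one covariate, fits the pooled within-group slope (as in analysis of covariance) and checks that the covariate cannot increase the unexplained variation:

```python
# Hypothetical data: a factor with two levels and a covariate x recorded on
# each experimental unit; (x, y) pairs per level.
groups = {
    "A": [(1.0, 5.2), (2.0, 6.1), (3.0, 7.0), (4.0, 8.2)],
    "B": [(1.5, 6.8), (2.5, 7.9), (3.5, 8.8), (4.5, 9.9)],
}

def rss_factor_only():
    # Residual SS when only the factor (group means) is fitted.
    total = 0.0
    for pts in groups.values():
        m = sum(y for _, y in pts) / len(pts)
        total += sum((y - m) ** 2 for _, y in pts)
    return total

def rss_with_covariate():
    # Pooled within-group slope, as in analysis of covariance:
    # b = sum of within-group Sxy / sum of within-group Sxx.
    sxx = sxy = syy = 0.0
    for pts in groups.values():
        xm = sum(x for x, _ in pts) / len(pts)
        ym = sum(y for _, y in pts) / len(pts)
        sxx += sum((x - xm) ** 2 for x, _ in pts)
        sxy += sum((x - xm) * (y - ym) for x, y in pts)
        syy += sum((y - ym) ** 2 for _, y in pts)
    b = sxy / sxx
    return syy - b * sxy  # residual SS after fitting the pooled slope

# Adding the covariate can only reduce the unexplained variation.
assert rss_with_covariate() <= rss_factor_only() + 1e-12
```

The reduction in residual sum of squares translates into narrower confidence intervals for the factor effects of interest.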
Experimental units can be grouped into blocks and also have covariates. Blocks and covariates should be included in the model before testing the effects of the factors of interest.
Categorical covariates are modelled in exactly the same way as blocks. Models can contain several terms explaining different aspects of the variability in the experimental units.
'Bad' allocation of treatments to experimental units can sometimes be salvaged if covariates are recorded and used to reduce unexplained variation.