Long page descriptions

Chapter 5   More about treatments

5.1   Contrasts and constraints

5.1.1   Treatments and factors

The combinations of factor levels used are called treatments. The sum of squares explained by the treatments (treated as a single factor) is closely related to those of the main effects and interaction between the actual factors.

5.1.2   Defining models with constraints

The model with no interaction between two factors can be expressed with constraints on the parameters of the most general model. Similarly, models without main effects can be expressed in terms of constraints.

5.1.3   Comparing a control to other treatments

Models where the control level of a single factor equals the average for the other treatments, and where all non-control levels are equal, can both be expressed in terms of constraints.

5.1.4   Analysis of variance for constraints

The treatment sum of squares for an experiment with a control level can be split into a sum of squares explained by the difference between the control and other levels, and a sum of squares explained by differences between the non-control levels. This allows both constraints to be tested.
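
The split can be checked numerically. The sketch below uses made-up data (one control and three other levels, three replicates each); the names `data` and `between_ss` are illustrative and do not come from the source. It computes the treatment sum of squares, the part explained by the control differing from the pooled other levels, and the part explained by differences among the non-control levels.

```python
from statistics import mean

# Hypothetical data: one control level and three non-control levels.
data = {
    "control": [10.0, 12.0, 11.0],
    "A": [14.0, 15.0, 16.0],
    "B": [13.0, 15.0, 14.0],
    "C": [17.0, 16.0, 18.0],
}

def between_ss(groups):
    """Sum over groups of n_i * (group mean - overall mean)^2."""
    all_values = [y for g in groups for y in g]
    grand = mean(all_values)
    return sum(len(g) * (mean(g) - grand) ** 2 for g in groups)

others = [values for name, values in data.items() if name != "control"]
pooled_others = [y for g in others for y in g]

# Treatments as a single factor with four levels (3 d.f.).
ss_treatments = between_ss(list(data.values()))
# Control against all other levels pooled together (1 d.f.).
ss_control_vs_others = between_ss([data["control"], pooled_others])
# Differences among the non-control levels only (2 d.f.).
ss_among_others = between_ss(others)

print(ss_treatments, ss_control_vs_others, ss_among_others)
```

With these numbers the two components add up to the treatment sum of squares, and their degrees of freedom (1 and 2) add up to the 3 treatment degrees of freedom.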

5.1.5   Comparing groups of treatments

The treatment sum of squares can be split into components for differences between two groups of treatments, and for differences within these groups.

5.1.6   Comparisons in block designs

The above methods for splitting the treatment sum of squares can also be used in more complex experimental designs such as randomised blocks.

5.1.7   Estimating contrasts

An alternative approach to analysing the structure of the levels of a factor estimates linear functions of the model parameters called contrasts.
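
As an illustration, a contrast can be estimated by applying coefficients that sum to zero to the level means. The sketch below is a minimal example on made-up data, assuming a one-factor model with a common error variance; the contrast chosen (control minus the average of the other levels) and all names are hypothetical.

```python
from math import sqrt
from statistics import mean

# Hypothetical data: one control level and three non-control levels.
data = {
    "control": [10.0, 12.0, 11.0],
    "A": [14.0, 15.0, 16.0],
    "B": [13.0, 15.0, 14.0],
    "C": [17.0, 16.0, 18.0],
}

# Contrast coefficients: control minus the average of the other levels.
coeffs = {"control": 1.0, "A": -1 / 3, "B": -1 / 3, "C": -1 / 3}
assert abs(sum(coeffs.values())) < 1e-12  # a contrast's coefficients sum to zero

# Estimate: the same linear function applied to the sample means.
estimate = sum(c * mean(data[name]) for name, c in coeffs.items())

# Pooled residual variance from the one-factor model.
resid_ss = sum((y - mean(g)) ** 2 for g in data.values() for y in g)
resid_df = sum(len(g) - 1 for g in data.values())
s2 = resid_ss / resid_df

# Standard error of the contrast estimate.
se = sqrt(s2 * sum(c ** 2 / len(data[name]) for name, c in coeffs.items()))

print(f"estimate = {estimate:.3f}, standard error = {se:.3f}")
```

A confidence interval would use the estimate plus or minus a t quantile (on `resid_df` degrees of freedom) times the standard error.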

5.2   Numerical factors

5.2.1   One factor with numerical levels

If the levels of the controlled factor are numerical values -- quantities of some numerical variable -- then we usually expect a 'smooth' relationship between the factor and response.

5.2.2   Linear and quadratic models

The simplest model for a smooth relationship between a numerical factor and response contains a linear term in the factor value. Adding a further quadratic term can model some curvature in the relationship.

5.2.3   Terms and constraints

Linear and quadratic models correspond to terms added to a model in which the factor does not affect the response. They can be equivalently defined through constraints on the smoothness of the parameters of the most general factor model.

5.2.4   Explained sums of squares

The differences between the residual sums of squares of different models for a numerical factor are called explained sums of squares. Their degrees of freedom equal the difference between the numbers of unknown parameters in the models.

5.2.5   Anova table and tests

The explained and residual sums of squares can be presented in an analysis of variance table. The p-values can be used to test whether quadratic or linear relationships are consistent with the data.
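
The entries of such a table can be sketched as follows. The data are made up (levels 1 to 4, two replicates each) and the helper `sequential_ss` is a hypothetical name; it obtains the sum of squares explained by each term, added in order, through Gram-Schmidt orthogonalisation, which is one of several equivalent ways to get differences between residual sums of squares.

```python
from statistics import mean

# Hypothetical data: a factor with numerical levels 1..4, two replicates each.
levels = [1, 1, 2, 2, 3, 3, 4, 4]
y = [2.0, 3.0, 5.0, 6.0, 7.0, 8.0, 8.0, 9.0]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sequential_ss(columns, response):
    """Sum of squares explained by each column when added in the given order."""
    basis, explained = [], []
    for col in columns:
        z = list(col)
        for b in basis:  # remove what earlier terms already explain
            coef = dot(z, b) / dot(b, b)
            z = [zi - coef * bi for zi, bi in zip(z, b)]
        explained.append(dot(z, response) ** 2 / dot(z, z))
        basis.append(z)
    return explained

ones = [1.0] * len(y)
x = [float(v) for v in levels]
x2 = [v * v for v in x]
_, ss_linear, ss_quadratic = sequential_ss([ones, x, x2], y)

# Residual SS of the general factor model: variation around each level mean.
groups = {}
for lev, val in zip(levels, y):
    groups.setdefault(lev, []).append(val)
ss_residual = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
ss_total = sum((v - mean(y)) ** 2 for v in y)
ss_factor = ss_total - ss_residual
ss_beyond_quadratic = ss_factor - ss_linear - ss_quadratic

resid_df = len(y) - len(groups)
ms_residual = ss_residual / resid_df
f_linear = ss_linear / ms_residual        # F ratio on (1, resid_df) d.f.
f_quadratic = ss_quadratic / ms_residual  # F ratio on (1, resid_df) d.f.

print(ss_linear, ss_quadratic, ss_beyond_quadratic, ss_residual)
```

Each F ratio would be compared with an F distribution to obtain the p-value; that step is omitted here to keep the sketch free of external libraries.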

5.2.6   Equivalence of models

If there are only two levels of the factor, a linear model is equivalent to a model with parameters for each level mean. Similarly, a quadratic model is equivalent to this general factor model if there are three factor levels.

5.2.7   Numerical factors in other designs

In any model with a numerical factor, its explained sum of squares can be split into linear and nonlinear components. The nonlinear sum of squares can be used to test whether the relationship is linear.

5.2.8   Response surface models

In some applications, we want to find the combination of values of two factors that optimises some response. A model with quadratic terms in the factors and an interaction term is often used.
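
To illustrate how such a model locates an optimum, the sketch below takes a fitted surface of the form y = b0 + b1*x + b2*z + b11*x^2 + b22*z^2 + b12*x*z and solves for the point where both partial derivatives are zero. The coefficient values are invented for the example and the names are hypothetical; in practice they would come from a least-squares fit.

```python
# Hypothetical fitted coefficients of the quadratic response surface.
b0, b1, b2 = 10.0, 4.0, 6.0
b11, b22, b12 = -1.0, -1.5, 0.5

def surface(x, z):
    """Predicted response at factor values (x, z)."""
    return b0 + b1 * x + b2 * z + b11 * x * x + b22 * z * z + b12 * x * z

# Setting both partial derivatives to zero gives two linear equations:
#   2*b11*x +   b12*z = -b1
#     b12*x + 2*b22*z = -b2
det = 4 * b11 * b22 - b12 ** 2
x_opt = (-b1 * 2 * b22 + b2 * b12) / det
z_opt = (-b2 * 2 * b11 + b1 * b12) / det

# The stationary point is a maximum when the quadratic part is negative definite.
is_maximum = b11 < 0 and det > 0

print(f"stationary point: x = {x_opt:.3f}, z = {z_opt:.3f}, maximum: {is_maximum}")
```

If `det` is zero or negative, the surface has a ridge or a saddle rather than a single optimum, and the stationary point should not be read as the best combination.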

5.3   Multiple comparisons

5.3.1   Problems with multiple tests

If several hypothesis tests are each conducted at the same significance level, the probability that at least one gives a significant result when all the null hypotheses are true can be much higher than that level. A modification allows the overall significance level to be fixed.
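
A small calculation shows the size of the problem. It assumes the tests are independent and uses the Bonferroni adjustment as one possible modification; the source does not say which adjustment is intended, so both are assumptions of this sketch.

```python
alpha = 0.05  # significance level of each individual test
k = 10        # number of tests (hypothetical)

# Chance of at least one significant result when all k null hypotheses are
# true, assuming the tests are independent.
familywise = 1 - (1 - alpha) ** k

# Bonferroni adjustment: conduct each test at level alpha / k instead.
per_test = alpha / k
familywise_adjusted = 1 - (1 - per_test) ** k

print(f"unadjusted: {familywise:.3f}, adjusted: {familywise_adjusted:.3f}")
```

With ten tests at the 5% level, the chance of at least one spurious significant result is about 40%, while the adjusted procedure keeps it just under 5%.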

5.3.2   Which means are different?

If it is concluded that the response mean depends on the factor level, it is natural to ask which levels differ. If the experiment involves several factor levels, t-tests to compare pairs of means should be avoided -- with many such tests, there is an increased probability of a low p-value for at least one pair of means.

5.3.3   Multiple comparisons

When there are several group means to compare, two means should only be declared different if they differ by more than the amount suggested by a single pairwise comparison.

5.3.4   Comparisons in randomised block designs

A similar adjustment can be made for comparison of treatments in a randomised block design.

5.3.5   Warning: Are multiple comparisons necessary?

If there is any structure to the factor levels, it is often best to use contrasts to make more meaningful comparisons between the levels.

5.4   Unequal replicates

5.4.1   Missing values with a single factor

If some response measurements are missing, the ordinary analysis of variance table can still be used to test whether the level means are equal, provided the probability of a missing value is unrelated to the response that would have been obtained.

5.4.2   Unequal replicates by design

If a factor has a control level, it is sometimes replicated more than the other factor levels.

5.4.3   Orthogonal designs for two factors

If all treatments have the same number of replicates, the sums of squares for the two factors do not depend on the order of adding them to the model. The same holds provided the replicates for each level of X are in the same proportion for each level of Z.
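
The proportionality condition can be checked directly from the table of replicate counts: every cell count must equal its row total times its column total divided by the grand total. The function name and the example tables below are hypothetical.

```python
def is_proportional(counts):
    """True if each cell count equals row total * column total / grand total."""
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    grand = sum(row_totals)
    # Cross-multiply so that the comparison stays in exact integers.
    return all(
        counts[i][j] * grand == row_totals[i] * col_totals[j]
        for i in range(len(row_totals))
        for j in range(len(col_totals))
    )

# Rows are levels of X, columns are levels of Z (replicates per treatment).
equal = [[3, 3], [3, 3]]
proportional = [[2, 4], [3, 6]]
not_proportional = [[2, 4], [3, 5]]

for table in (equal, proportional, not_proportional):
    print(table, is_proportional(table))
```

The first two tables describe orthogonal designs; in the third, the sums of squares for X and Z would depend on the order in which the factors are added.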

5.4.4   Anova for non-orthogonal factors

If two factors are not orthogonal, there are different analysis of variance tables for the two orders of adding the factors. Both must be examined to fully understand the data.

5.4.5   Orthogonal factor and blocks

Provided the replicates of a factor in each block are in proportion, the blocks and factor are orthogonal. However, this is less important since blocks are always added to the model before the factor, so only a single anova table need be considered.

5.4.6   Missing treatments and confounding

This section has only considered data with at least one replicate for every treatment. Missing treatments mean that some effects cannot be estimated. In extreme cases, it is impossible to distinguish the effects of different factors -- they are confounded.

5.5   Covariates

5.5.1   Variability in experimental units

The more variability in the experimental units that is explained by the model used, the greater the accuracy of estimating the effects of the factors of interest.

5.5.2   Effect of unmodelled covariate

Randomisation ensures that the effects of factors are always estimated without bias. However, variation between the units getting different treatments can result in very variable estimates if the variation in the experimental units is not modelled.

5.5.3   Model for numerical covariate

Numerical measurements describing variability in the experimental units can be modelled with linear terms in the model.
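
A minimal sketch of such a model for two treatments and one covariate follows. It assumes the response is a treatment mean plus a common slope times the covariate plus error; the data and names are made up. Under that model, the least-squares slope is the pooled within-treatment slope, and the treatment difference is adjusted for the difference in mean covariate values.

```python
from statistics import mean

# Hypothetical data: covariate x and response y for two treatments.
data = {
    "A": {"x": [1.0, 2.0, 3.0], "y": [3.0, 5.5, 7.0]},
    "B": {"x": [3.0, 4.0, 5.0], "y": [9.0, 10.5, 13.0]},
}

def within_sums(x, y):
    """Within-treatment sums of squares and products about the treatment means."""
    xb, yb = mean(x), mean(y)
    sxx = sum((a - xb) ** 2 for a in x)
    sxy = sum((a - xb) * (b - yb) for a, b in zip(x, y))
    return sxx, sxy

# Pool over treatments to estimate the common slope (assumed equal for both).
sxx = sxy = 0.0
for g in data.values():
    gxx, gxy = within_sums(g["x"], g["y"])
    sxx += gxx
    sxy += gxy
slope = sxy / sxx

# Unadjusted difference in response means, and the covariate imbalance.
raw_diff = mean(data["B"]["y"]) - mean(data["A"]["y"])
x_diff = mean(data["B"]["x"]) - mean(data["A"]["x"])

# Treatment effect after allowing for the covariate.
adjusted_diff = raw_diff - slope * x_diff

print(f"slope = {slope:.3f}, raw = {raw_diff:.3f}, adjusted = {adjusted_diff:.3f}")
```

Here the units given treatment B happen to have larger covariate values, so much of the raw difference between the treatment means is attributed to the covariate once it is included in the model.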

5.5.4   Inference with covariates

There is usually no interest in testing the significance of covariates. Inference about the model terms relating to the factors of interest can be done with confidence intervals and analysis of variance as usual.

5.5.5   Blocks and covariates

Experimental units can be grouped into blocks and also have covariates. Blocks and covariates should be included in the model before testing the effects of the factors of interest.

5.5.6   Categorical covariates (cofactors)

Categorical covariates are modelled in exactly the same way as blocks. Models can contain several terms explaining different aspects of the variability in the experimental units.

5.5.7   Recovery from bad design

'Bad' allocation of treatments to experimental units can sometimes be saved if covariates are recorded and used to reduce unexplained variation.