Alternative parameterisations
There are several ways to write a general linear model that all allow an arbitrary mean for each of g groups. Some of these parameterisations have already been described.
Many other parameterisations of the same basic model are possible, and this page describes a few of them.
Different parameterisations can allow the explained sum of squares for the factor (with g - 1 degrees of freedom) to be split into components that can be used in an anova table to test meaningful hypotheses.
Parameterisation for testing whether some group means are equal
We first consider how to define indicator variables for testing whether some group means are equal.
This can be explained most easily in an example.
Comparisons with a control group
Consider experimental data that includes a Control group and three other treatment groups, AAA, BBB and CCC. We might be interested in testing whether AAA, BBB and CCC have equal group means. We might define the indicator variables as follows:
If there were 3 response measurements (replicates) at each factor level, the X matrix would be as shown below:
Each row of the X matrix shows how the indicator variables select parameters to form the response mean for that observation. The next diagram shows the meanings of the parameters graphically. (The response values were simply chosen to illustrate the model.)
Changing the parameter values shows that the model still allows arbitrary means for all groups.
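As a rough sketch of such an X matrix, the Python code below builds one under an assumed coding: an intercept, an indicator for any of AAA, BBB or CCC, an indicator for BBB and an indicator for CCC. This coding is an assumption chosen so that the last two parameters separate BBB and CCC from AAA; the target means are made-up values.

```python
import numpy as np

groups = ["Control", "AAA", "BBB", "CCC"]
replicates = 3

# Assumed coding (hypothetical, for illustration only).
# Columns: intercept, treated (AAA/BBB/CCC), is BBB, is CCC.
coding = {
    "Control": [1, 0, 0, 0],
    "AAA":     [1, 1, 0, 0],
    "BBB":     [1, 1, 1, 0],
    "CCC":     [1, 1, 0, 1],
}

# One row per observation: 3 replicates at each factor level.
X = np.array([coding[g] for g in groups for _ in range(replicates)], dtype=float)

# Full column rank (4) means the four group means are unconstrained.
rank = np.linalg.matrix_rank(X)

# Any set of group means corresponds to exactly one parameter vector.
target_means = np.array([10.0, 14.0, 9.0, 20.0])  # made-up values
C = np.array([coding[g] for g in groups], dtype=float)
beta = np.linalg.solve(C, target_means)

print(X.astype(int))
print("rank:", rank)
print("parameters:", beta)
```

Under this coding the first parameter is the Control mean, the second is the difference between AAA and Control, and the last two are the differences of BBB and CCC from AAA.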
Testing equality of some group means
For the above example, if the last two parameters were zero (i.e. if the last two indicator variables were deleted from the model), the response means for AAA, BBB and CCC would be constrained to be equal. The sequential sum of squares for these two indicator variables therefore leads to a hypothesis test for whether these three response means are equal.
The following numerical example illustrates this.
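The calculation behind this test can be sketched in Python as the drop in residual sum of squares when the last two indicator variables are added. The responses are made up, and the coding of the indicator variables (intercept, treated, is BBB, is CCC) is an assumption rather than the page's own definition.

```python
import numpy as np

# Made-up responses, 3 replicates per group (illustration only).
y = np.array([10.0, 12.0, 11.0,   # Control
              15.0, 14.0, 16.0,   # AAA
              15.0, 17.0, 16.0,   # BBB
              14.0, 16.0, 15.0])  # CCC
group = np.repeat(np.arange(4), 3)  # 0=Control, 1=AAA, 2=BBB, 3=CCC

# Assumed coding: intercept, treated, is BBB, is CCC.
X = np.column_stack([
    np.ones(len(y)),
    (group >= 1).astype(float),
    (group == 2).astype(float),
    (group == 3).astype(float),
])

def rss(design, response):
    """Residual sum of squares from a least-squares fit."""
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    resid = response - design @ coef
    return float(resid @ resid)

rss_reduced = rss(X[:, :2], y)  # AAA, BBB and CCC share one mean
rss_full = rss(X, y)            # arbitrary mean for every group

# Sequential sum of squares for the last two indicators (2 d.f.).
ss_seq = rss_reduced - rss_full
df_seq, df_resid = 2, len(y) - X.shape[1]
f_stat = (ss_seq / df_seq) / (rss_full / df_resid)

print(f"sequential SS = {ss_seq:.3f}")
print(f"F = {f_stat:.3f} on ({df_seq}, {df_resid}) d.f.")
```

A large F ratio would suggest that the AAA, BBB and CCC means are not all equal; the p-value comes from the F distribution with the degrees of freedom shown.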
Pain tolerance and hair colour
Studies conducted at the University of Melbourne indicate that there may be a difference between the pain thresholds of blonds and brunettes. Men and women of various ages were divided into four categories according to hair colour: light blond, dark blond, light brunette, and dark brunette. The purpose of the experiment was to determine whether hair colour is related to the amount of pain produced by common types of mishaps and assorted types of trauma. Each person in the experiment was given a pain threshold score based on his or her performance in a pain sensitivity test (the higher the score, the higher the person's pain tolerance).
In the parameterisation below, the first indicator variable is a contrast between the blonds and brunettes. The other two indicator variables distinguish light from dark blonds and light from dark brunettes.
Analysis of variance table
The analysis of variance table below shows the sequential sums of squares, split into a component for the first indicator variable contrasting blonds and brunettes, then a component for the other two indicator variables comparing the sub-colours (light vs dark blonds and light vs dark brunettes).
From the p-value associated with the sequential sum of squares between sub-colours, we conclude that there is no evidence in the data of differences in pain threshold between light and dark blonds or between light and dark brunettes.
Since this p-value is not significant, we can go on to interpret the p-value for the first indicator variable as giving strong evidence of a difference between blonds and brunettes.
Adding together these sequential sums of squares gives a combined sum of squares for hair colour (3 d.f.) that is identical to the explained sum of squares that would be obtained from other parameterisations of the model.
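The equality of the combined and explained sums of squares can be checked numerically. The Python sketch below uses made-up scores (not the study's data) and an assumed 0/1 coding of the three indicator variables, and compares the result with a parameterisation that has one indicator variable per group.

```python
import numpy as np

# Made-up pain threshold scores (NOT the study's data), 3 per group.
# Groups: 0=light blond, 1=dark blond, 2=light brunette, 3=dark brunette.
y = np.array([60.0, 62.0, 58.0,
              52.0, 50.0, 54.0,
              42.0, 44.0, 40.0,
              38.0, 36.0, 40.0])
group = np.repeat(np.arange(4), 3)

# Assumed coding of the indicator variables.
ones = np.ones(len(y))
brunette = (group >= 2).astype(float)       # blond vs brunette
dark_blond = (group == 1).astype(float)     # light vs dark blond
dark_brunette = (group == 3).astype(float)  # light vs dark brunette

def rss(columns):
    """Residual sum of squares for a model with the given columns."""
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(resid @ resid)

rss_mean = rss([ones])
rss_colour = rss([ones, brunette])
rss_full = rss([ones, brunette, dark_blond, dark_brunette])

ss_blond_vs_brunette = rss_mean - rss_colour  # 1 d.f.
ss_sub_colours = rss_colour - rss_full        # 2 d.f.
combined = ss_blond_vs_brunette + ss_sub_colours

# Alternative parameterisation: one indicator per group, no intercept.
rss_alt = rss([(group == k).astype(float) for k in range(4)])
ss_alt = rss_mean - rss_alt                   # 3 d.f.

print("combined sequential SS:", round(combined, 6))
print("explained SS, other parameterisation:", round(ss_alt, 6))
```

Both parameterisations span the same set of fitted values, which is why the two totals agree.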
Illustration of the sequential sums of squares
The explained sums of squares in an anova table (whether sequential or not) are always the sum of squares of differences between the fitted values from two models. The diagram below illustrates the explained sums of squares for the pain tolerance data.
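The description of explained sums of squares as squared differences between fitted values can be verified directly. The Python sketch below uses made-up scores and an assumed coding of the hair colour indicator variables, and compares each sum of squared differences with the corresponding drop in residual sum of squares.

```python
import numpy as np

# Made-up scores (illustration only), 3 per hair colour group.
# Groups: 0=light blond, 1=dark blond, 2=light brunette, 3=dark brunette.
y = np.array([60.0, 62.0, 58.0,
              52.0, 50.0, 54.0,
              42.0, 44.0, 40.0,
              38.0, 36.0, 40.0])
group = np.repeat(np.arange(4), 3)

ones = np.ones(len(y))
brunette = (group >= 2).astype(float)
dark_blond = (group == 1).astype(float)
dark_brunette = (group == 3).astype(float)

def fitted(columns):
    """Least-squares fitted values for a model with the given columns."""
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return design @ coef

fit_mean = fitted([ones])                 # overall mean only
fit_colour = fitted([ones, brunette])     # blond and brunette means
fit_full = fitted([ones, brunette, dark_blond, dark_brunette])

# Each explained SS sums squared differences between two sets of fits.
ss_colour = float(np.sum((fit_colour - fit_mean) ** 2))
ss_sub = float(np.sum((fit_full - fit_colour) ** 2))

# The same quantities as drops in residual sum of squares.
def rss(fit):
    return float(np.sum((y - fit) ** 2))

drop_colour = rss(fit_mean) - rss(fit_colour)
drop_sub = rss(fit_colour) - rss(fit_full)

print(ss_colour, drop_colour)
print(ss_sub, drop_sub)
```

The two calculations agree because the models are nested, so the difference between their fitted values is orthogonal to the residuals of the larger model.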