Standard errors and t-tests

The 'baseline-category' parameterisation is still a general linear model. Therefore all general results from the first sections of this chapter can be used to analyse it further. In particular, GLM theory provides standard errors for the parameter estimates and t-tests for whether individual parameters are zero.

As usual with GLMs, the hypothesis tests describe the effect of removing a variable from the full model. For the indicator variables, this can be interpreted as follows:

The t-test for whether the coefficient of a group's indicator variable is zero is really a test for whether that group's mean is equal to the mean for the baseline group.
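The equivalence between the indicator coefficient and a comparison with the baseline mean can be sketched numerically. The example below is a minimal illustration, assuming a one-way layout in which each indicator coefficient's least squares estimate is the group mean minus the baseline mean and the error variance is pooled over all groups. The data values, the group names and the function name `indicator_t_statistics` are made up for this sketch and are not taken from the antibiotic data.

```python
from math import sqrt
from statistics import mean

# Made-up toy responses for illustration only (not the antibiotic data).
groups = {
    "baseline": [10.0, 12.0, 11.0, 13.0],
    "A": [14.0, 15.0, 13.0, 16.0],
    "B": [11.0, 10.0, 12.0, 13.0],
}


def indicator_t_statistics(groups, baseline):
    """Return ({group: (coefficient, std_error, t)}, residual_df)."""
    means = {name: mean(ys) for name, ys in groups.items()}
    n_total = sum(len(ys) for ys in groups.values())
    residual_df = n_total - len(groups)

    # Pooled residual variance: squared deviations from each group's own mean.
    residual_ss = sum(
        (y - means[name]) ** 2 for name, ys in groups.items() for y in ys
    )
    s2 = residual_ss / residual_df

    n_base = len(groups[baseline])
    results = {}
    for name, ys in groups.items():
        if name == baseline:
            continue
        # Indicator coefficient = group mean minus baseline mean.
        coef = means[name] - means[baseline]
        se = sqrt(s2 * (1 / len(ys) + 1 / n_base))
        results[name] = (coef, se, coef / se)
    return results, residual_df


results, residual_df = indicator_t_statistics(groups, "baseline")
for name, (coef, se, t) in results.items():
    print(f"{name}: coefficient={coef:.3f}, se={se:.3f}, t={t:.3f} on {residual_df} df")
```

Each t statistic would then be compared with a t distribution on the residual degrees of freedom to obtain the p-value for that indicator variable.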


Antibiotic effectiveness

We now return to the experimental data about binding percentages of antibiotics, but we use the full data set with all five antibiotics. (On the previous page, we only showed data from four of the antibiotics to simplify the presentation.) The diagram below shows the full data set.

The table shows the least squares estimates of the intercept and the coefficients of the indicator variables, using Penicillin as the baseline group. The table also gives p-values for the parameters.

Intercept
The p-value for the 'intercept' term is testing whether the mean for the baseline category, Penicillin, is zero. (We conclude that it is not zero.)
Indicator variables
The p-values for the indicator variables test whether the corresponding group mean is equal to the baseline mean. (We would conclude that Streptomycin and Erythromycin have different means from Penicillin.)

You can use the checkboxes to remove indicator variables from the model. Observe how this merges groups (antibiotics) with the baseline group (Penicillin).

Problem with multiple comparisons

In some data sets, one group is 'special' in some way and is a natural choice for the baseline group. For example, there may be a 'control' treatment in an experiment (a group that gets a 'standard' treatment) and we may be interested in comparing all other treatments with this 'standard' treatment. The p-values for the individual indicator variables then test whether the other treatments have a different mean response from the control group.

In many other situations, all groups have similar status and the choice of a 'baseline' group is arbitrary. The more groups, the greater the number of possible pairwise comparisons that could be tested with t-tests.

Number of groups, g    Number of pairwise comparisons
2                      1
3                      3
4                      6
5                      10
:                      :
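The counts in this table follow the usual formula for the number of ways to choose 2 groups out of g, namely g(g - 1)/2. A minimal sketch that reproduces the rows (the function name `num_pairwise_comparisons` is illustrative only):

```python
from math import comb


def num_pairwise_comparisons(g):
    """Number of distinct pairs that can be formed from g groups."""
    return comb(g, 2)  # equal to g * (g - 1) // 2


for g in range(2, 6):
    print(g, num_pairwise_comparisons(g))
```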

As the number of possible pairwise comparisons of groups increases, so does the chance that at least one pair of groups will appear significantly different, even if all groups really have the same mean.

Antibiotic effectiveness

In this data set, none of the five antibiotics is a standard one, so there are 10 possible pairs of antibiotics that we might want to compare with t-tests.

If t-tests are used to compare pairs of antibiotics but there are really no population differences, the probability that at least one pairwise comparison will have a p-value less than 5% is 25.8%.

In other words, there is a reasonable chance that one pair of antibiotics will appear different even if all antibiotics are really the same.

A single p-value that is lower than 5% would not be unusual and may not give much evidence of differences between the group means.

The standard guidelines of 5% and 1% for interpreting p-values must be applied differently if you are performing many tests simultaneously.
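As a rough point of reference, if the m pairwise tests were independent, the chance of at least one p-value below a threshold alpha when all means are equal would be 1 - (1 - alpha)^m. Independence is an assumption made only for this sketch; it does not hold for pairwise comparisons, because each group's data is shared between several comparisons. The function name is illustrative.

```python
def prob_at_least_one_significant(num_tests, alpha=0.05):
    """Chance that at least one of num_tests independent tests has p < alpha
    when every null hypothesis is true (simplifying independence assumption)."""
    return 1.0 - (1.0 - alpha) ** num_tests


for num_tests in (1, 3, 6, 10):
    print(num_tests, round(prob_at_least_one_significant(num_tests), 3))
```

For 10 comparisons this simplified formula gives about 40%. The 25.8% quoted above is lower, which is consistent with the comparisons sharing data and so not being independent. Either way, the chance is far above the nominal 5%.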

General results about pairwise comparisons

The diagram below can be used to read off the probability of at least one pair of groups seeming different (p-value < 5%) when all groups really have the same mean response.

Observe that:


Testing for equal group means

Since it is hard to interpret the p-values for all pairwise comparisons of group means, it is better to find a single hypothesis test that simultaneously assesses whether all group means are equal.

In the rest of this section, we use analysis of variance to perform a single test for equality of all group means.