Aim for a simple model

The more terms in a model, the greater its flexibility and ability to describe patterns in the experimental data. However, each added term also makes the model harder to interpret.

We therefore want to find the simplest model that is consistent with the data.

Analysis of variance is again used to find the simplest model.

Analysis of variance (anova) table

A column of mean sums of squares (sums of squares divided by their degrees of freedom) is added to the sum of squares table. The mean explained sums of squares are compared to the mean residual sum of squares with F ratios.

A term is significant when its F ratio is large, that is, when its mean explained sum of squares is large relative to the mean residual sum of squares.

The p-value for each term is the probability of getting an F ratio at least as large by chance if the term had no real effect. A small p-value is evidence that the term is needed in the model.
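The calculations behind such a table can be sketched in a few lines of Python. This is only an illustration: the sums of squares are made-up numbers, not the soft drink or fabric data, and the names anova_table and f_tail_probability are hypothetical helpers. The p-value is found by numerically integrating a beta density (a standard identity for the F distribution) so that only the standard library is needed; the sketch assumes at least 2 residual degrees of freedom.

```python
import math

def f_tail_probability(f_ratio, df_term, df_resid, steps=2000):
    """P(F > f_ratio) for an F(df_term, df_resid) distribution.

    Uses P(F > f) = I_x(df_resid/2, df_term/2) with
    x = df_resid / (df_resid + df_term * f), where I_x is the regularised
    incomplete beta function, evaluated here with Simpson's rule.
    Assumes df_resid >= 2; steps must be even.
    """
    if f_ratio <= 0:
        return 1.0
    a, b = df_resid / 2.0, df_term / 2.0
    x = df_resid / (df_resid + df_term * f_ratio)
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

    def density(t):
        if t <= 0.0:
            # Limit of the beta density at 0: finite only when a == 1.
            return math.exp(-log_beta) if a == 1 else 0.0
        return math.exp((a - 1) * math.log(t) + (b - 1) * math.log(1 - t) - log_beta)

    h = x / steps
    total = density(0.0) + density(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    return total * h / 3

def anova_table(explained, resid_ss, resid_df):
    """explained maps each term to (sum of squares, degrees of freedom).

    Returns {term: (mean square, F ratio, p-value)}.
    """
    resid_ms = resid_ss / resid_df
    rows = {}
    for term, (ss, df) in explained.items():
        mean_sq = ss / df
        f_ratio = mean_sq / resid_ms
        rows[term] = (mean_sq, f_ratio, f_tail_probability(f_ratio, df, resid_df))
    return rows

# Hypothetical sums of squares for two factors A and B and their interaction.
table = anova_table({"A": (50.0, 1), "B": (30.0, 2), "A:B": (2.0, 2)},
                    resid_ss=24.0, resid_df=12)
for term, (mean_sq, f_ratio, p_value) in table.items():
    print(f"{term:4s} mean sq = {mean_sq:6.2f}  F = {f_ratio:6.2f}  p = {p_value:.4f}")
```

In these made-up numbers the interaction A:B has an F ratio below 1 and a large p-value, so it would be the first candidate for removal.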

The p-values are used to decide which terms are unimportant and can be dropped from the full model.

Note however that we should only consider hierarchical models, so terms must not be dropped if higher-order interactions involving them are still in the model.
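The hierarchy rule can be expressed as a simple check. The sketch below is an illustration only: it represents each term as a set of factor names (a hypothetical encoding, not part of the original page) and allows a term to be dropped only when no other term still in the model contains all of its factors.

```python
def can_drop(term, model_terms):
    """Return True if dropping `term` leaves a hierarchical model.

    A term is a collection of factor names; it may be dropped only if it is
    not a proper subset of any other term that is still in the model.
    """
    factors = frozenset(term)
    return not any(factors < frozenset(other) for other in model_terms)

# Factors named as in the soft drink example; the encoding is illustrative.
C, P, S = "Carbonation", "Pressure", "Line speed"
full_model = [{C}, {P}, {S}, {C, P}, {C, S}, {P, S}, {C, P, S}]

print(can_drop({C, P, S}, full_model))   # True: nothing of higher order contains it
print(can_drop({C, S}, full_model))      # False: the 3-factor interaction is still present

# After removing the 3-factor term and both Line speed interactions:
reduced = [{C}, {P}, {S}, {C, P}]
print(can_drop({S}, reduced))            # True: no remaining interaction involves Line speed
print(can_drop({P}, reduced))            # False: Pressure appears in the C-P interaction
```

The check says only which terms are allowed to be dropped; whether a term should be dropped still depends on its p-value.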

Soft drink bottling

The diagram below initially shows the best-fitting model with all interactions for the soft drink bottling data.

The 3-factor interaction term has a high p-value and is not significant so it can be dropped from the model. (Click the checkbox to the left of the term to delete it.)

There is similarly no evidence of interactions between Line speed and the other two factors, so drop them from the full model too. This leaves a model with all main effects (all factors affect mean Fill height deviation) and an interaction between Pressure and Carbonation.

This is the simplest model that is consistent with the data — all remaining terms are significant.

Click the y-z rotation button. The purple lines show the mean Fill height deviation for low Line speed — examine them to help understand the nature of the interaction between Pressure and Carbonation. (The green lines for high Line speed have the same pattern.)

Increasing pressure has less effect on the mean response at 10% carbonation than at higher carbonation levels.

Wear of coated fabrics

An experiment was conducted to assess the durability of coated fabric subjected to standard abrasive tests. A factorial experiment was conducted with two different fillers (F1 and F2) in three different proportions (25%, 50% and 75%), either with or without surface treatment of the fabric. Two replicate fabric specimens were tested for each of the 12 treatment combinations in a completely randomised design. The response is the weight loss (mg) of the fabric specimens in the abrasion test.

For this data set, neither the 3-factor interaction nor the 2-factor interaction between Surface treatment and Percentage filler is significant. Use the checkboxes to delete these two terms from the model.

This is the simplest model that is consistent with the data — all remaining terms are significant.

This model has two 2-factor interaction terms so it is a little harder to understand than the best model for the soft drink data. Click the y-x rotation button to help understand the nature of this model.

For each filler type, surface treatment and percent filler have no interaction. (The two green lines are parallel and the two purple lines are parallel.)

To interpret the model, we must therefore explain it separately for the two filler types (also try clicking the y-z rotation button):

For filler 1,

  • Each extra 25% in percent filler increases weight loss by about 30 mg.
  • Surface treatment increases weight loss by about 40 mg.

For filler 2,

  • The percent filler makes little difference to weight loss.
  • Surface treatment increases weight loss by about 90 mg.
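The interpretation above can be written as a small lookup of approximate effects. This is a rough sketch rather than the fitted model: it uses only the rounded effects quoted above, treats "little difference" for filler 2 as exactly zero (an assumption), and the function name change_from_baseline is hypothetical. It shows why the lines for treated and untreated fabric are parallel within each filler type: the surface treatment effect does not depend on percent filler.

```python
# Approximate effects quoted in the text, in mg of weight loss.
# The zero for filler F2 is an assumption standing in for "little difference".
EFFECTS = {
    "F1": {"per_extra_25_percent": 30, "surface_treatment": 40},
    "F2": {"per_extra_25_percent": 0, "surface_treatment": 90},
}

def change_from_baseline(filler, percent, treated):
    """Approximate change in mean weight loss (mg) relative to the same
    filler at 25% with no surface treatment."""
    effect = EFFECTS[filler]
    extra_steps = (percent - 25) // 25          # number of 25% increments above 25%
    change = extra_steps * effect["per_extra_25_percent"]
    if treated:
        change += effect["surface_treatment"]
    return change

for filler in ("F1", "F2"):
    for percent in (25, 50, 75):
        untreated = change_from_baseline(filler, percent, False)
        treated = change_from_baseline(filler, percent, True)
        # The gap treated - untreated is constant within a filler type.
        print(filler, percent, untreated, treated, treated - untreated)
```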