Does the factor affect the response?

In this section, we assume data from a completely randomised experiment that satisfy a normal model of the form



$$
y_{ij} \;=\; \underbrace{\mu_i}_{\text{explained by factor}} \;+\; \underbrace{\varepsilon_{ij}}_{\text{unexplained}}
$$

for i = 1 to g and j = 1 to ni

where εij  ∼  normal (0, σ)
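
To make the model concrete, here is a minimal Python sketch that generates data of this form. The function name `simulate_experiment`, the treatment means, the replicate counts and the seed are all hypothetical choices for illustration, and the second parameter of the normal distribution is assumed to be its standard deviation.

```python
import random

def simulate_experiment(treatment_means, sigma, n_per_treatment, seed=0):
    """Simulate y_ij = mu_i + eps_ij with eps_ij ~ normal(0, sigma).

    Assumption: sigma is the standard deviation of the unexplained variation.
    Returns one list of responses per treatment.
    """
    rng = random.Random(seed)
    data = []
    for mu_i, n_i in zip(treatment_means, n_per_treatment):
        # Each response is its treatment mean plus an independent normal error.
        data.append([mu_i + rng.gauss(0.0, sigma) for _ in range(n_i)])
    return data

# Hypothetical example: g = 3 treatments with unequal replication.
groups = simulate_experiment([10.0, 12.0, 15.0], sigma=2.0, n_per_treatment=[4, 5, 6])
for i, ys in enumerate(groups, start=1):
    print(f"treatment {i}: n = {len(ys)}, sample mean = {sum(ys) / len(ys):.2f}")
```

Each sample mean printed here differs from its treatment mean only through the averaged errors εij.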

If the factor does not affect the response, the treatment means in the model will all be equal. A test for this therefore involves the hypotheses:

H0:   µi  =  µj        for all i and j
HA:   µi  ≠  µj        for at least one pair i, j

However, unexplained (random) variation means that the sample treatment means are unlikely to be exactly equal, even if the factor really has no effect.
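
The point can be checked with a small simulation. The sketch below draws data in which every treatment shares the same true mean and then prints the sample treatment means. The common mean of 50, the standard deviation of 5, the seed and the function name `sample_means_under_h0` are illustrative assumptions. The layout of 4 treatments with 10 units each is simply one convenient choice.

```python
import random
import statistics

def sample_means_under_h0(common_mean, sigma, g, n, seed=1):
    """Sample treatment means when all g treatments share one true mean."""
    rng = random.Random(seed)
    return [
        # Mean of n responses drawn around the same underlying mean.
        statistics.mean(rng.gauss(common_mean, sigma) for _ in range(n))
        for _ in range(g)
    ]

means = sample_means_under_h0(common_mean=50.0, sigma=5.0, g=4, n=10)
print("sample treatment means:", [round(m, 2) for m in means])
print("largest minus smallest:", round(max(means) - min(means), 2))
```

The sample means differ from one another although the factor has no effect in this simulation, so a difference between sample means is not, on its own, evidence against H0.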

How much variation in the observed treatment means is needed for us to conclude that the factor does affect the response?

Both variation between treatment means (explained variation) and variation within treatments (unexplained variation) must be used to answer this question.

In the remaining pages of this section, we will introduce statistics that summarise explained and unexplained variation and will show how they are used in a formal test for whether the factor affects the response.

Variation between treatment means (explained variation)

The jittered dot plots below show artificial data that should be treated as coming from a completely randomised experiment in which each of 4 levels of a factor is applied to 10 experimental units.

Use the slider to alter the difference between the treatment means. Observe that:

Variation within treatments (unexplained variation)

The diagram below is similar, but the slider adjusts the spread of values observed within each factor level, leaving the treatment means unaltered.

Observe that ...

Are the underlying means equal?

The evidence for a difference between the treatment means depends on both the variation between treatments and the variation within treatments. It is strongest when:

  • the between-treatment variation is relatively high, and
  • the within-treatment variation is relatively low.
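
One way to see both effects numerically is to compare a between-treatment summary with a within-treatment summary. The sketch below uses mean squares and their ratio, which is the form used in one-way analysis of variance. This is an assumed choice of summary, not necessarily the statistics introduced on later pages. The helper names `between_within_ratio` and `make_groups` and the toy data are hypothetical.

```python
import statistics

def between_within_ratio(groups):
    """Ratio of between-treatment to within-treatment mean squares."""
    all_values = [y for ys in groups for y in ys]
    grand_mean = statistics.mean(all_values)
    g = len(groups)
    n_total = len(all_values)

    # Between treatments: squared distance of each treatment mean from
    # the overall mean, weighted by the number of units in the treatment.
    between_ss = sum(
        len(ys) * (statistics.mean(ys) - grand_mean) ** 2 for ys in groups
    )
    # Within treatments: squared distance of each value from its own
    # treatment mean.
    within_ss = sum(
        (y - statistics.mean(ys)) ** 2 for ys in groups for y in ys
    )
    return (between_ss / (g - 1)) / (within_ss / (n_total - g))

def make_groups(mean_gap, spread):
    """Toy data: 3 treatments whose means are mean_gap apart, with a
    fixed pattern of residuals scaled by spread."""
    residuals = [-1.0, 0.0, 1.0]
    return [[i * mean_gap + spread * r for r in residuals] for i in range(3)]

print("baseline:      ", between_within_ratio(make_groups(1.0, 1.0)))
print("means 2x apart:", between_within_ratio(make_groups(2.0, 1.0)))
print("spread 2x:     ", between_within_ratio(make_groups(1.0, 2.0)))
```

With these toy numbers the ratio is 3.0 for the baseline, rises to 12.0 when the treatment means are twice as far apart, and falls to 0.75 when the within-treatment spread is doubled.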