Sums of squares

The table below summarises the interpretation of the total, within-groups and between-groups sums of squares.

Sum of squares    Interpretation
Total             Overall variability of Y, taking no account of the groups.
Within-groups     Describes variability around the group means and is therefore variability that cannot be explained by the model.
Between-groups    Describes how far the group means are from the overall mean, i.e. the variability of the group means. It can also be interpreted as the sum of squares explained by the model.
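
For reference, the three sums of squares can be written out as below. The notation is an assumption, since the original formulas are not shown here: y_ij is the j-th observation in group i, n_i is the number of observations in group i, \bar{y}_i is the mean of group i and \bar{y} is the overall mean.

```latex
\begin{align*}
SS_{\text{Total}}   &= \sum_{i}\sum_{j} (y_{ij} - \bar{y})^2 \\
SS_{\text{Within}}  &= \sum_{i}\sum_{j} (y_{ij} - \bar{y}_i)^2 \\
SS_{\text{Between}} &= \sum_{i}\sum_{j} (\bar{y}_i - \bar{y})^2
                     = \sum_{i} n_i\,(\bar{y}_i - \bar{y})^2
\end{align*}
```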

The best prediction for any observation in group i would be the overall mean if groups were not taken into account, whereas it would be the mean of group i with our model.

The between-group sum of squares summarises how much predictions are improved by using the model.
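
A minimal sketch of this idea, using made-up data and hypothetical variable names: the total sum of squares is the sum of squared prediction errors when every observation is predicted by the overall mean, the within-groups sum of squares is the same quantity when each observation is predicted by its own group mean, and the difference between them is the between-groups sum of squares.

```python
from statistics import mean

# Hypothetical data: three groups of made-up measurements.
groups = {
    "A": [4.0, 5.0, 6.0],
    "B": [7.0, 8.0, 9.0],
    "C": [10.0, 11.0, 12.0],
}

all_values = [y for ys in groups.values() for y in ys]
overall_mean = mean(all_values)
group_means = {g: mean(ys) for g, ys in groups.items()}

# Squared prediction errors when every observation is predicted by the overall mean.
ss_total = sum((y - overall_mean) ** 2 for y in all_values)

# Squared prediction errors when each observation is predicted by its group mean.
ss_within = sum(
    (y - group_means[g]) ** 2 for g, ys in groups.items() for y in ys
)

# Squared distance of each group mean from the overall mean, counted once per observation.
ss_between = sum(
    len(ys) * (group_means[g] - overall_mean) ** 2 for g, ys in groups.items()
)

print(ss_total, ss_within, ss_between)  # 60.0 6.0 54.0
```

For these numbers, using the group means reduces the sum of squared prediction errors from 60 to 6, and the reduction of 54 is exactly the between-groups sum of squares.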

Coefficient of determination

Since the total sum of squares is the sum of the between-group (explained) and within-group (residual) sums of squares, a useful summary statistic is the proportion of the total sum of squares that is explained by the model. This proportion is called the coefficient of determination and is denoted by R².
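
Written as a formula, with SS_Total, SS_Between and SS_Within as assumed shorthand for the total, between-groups and within-groups sums of squares, the definition and its equivalent residual form are:

```latex
\[
R^2 \;=\; \frac{SS_{\text{Between}}}{SS_{\text{Total}}}
    \;=\; 1 - \frac{SS_{\text{Within}}}{SS_{\text{Total}}},
\qquad \text{since } SS_{\text{Total}} = SS_{\text{Between}} + SS_{\text{Within}}.
\]
```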

Note the following properties of R².

0  ≤  R²  ≤  1
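
The two ends of this range can be illustrated with a short sketch. The function name `r_squared` and both data sets are made up for illustration: the first data set has equal group means, so nothing is explained by the groups, and the second has no variability within groups, so everything is.

```python
from statistics import mean

def r_squared(groups):
    """Proportion of the total sum of squares explained by group membership."""
    all_values = [y for ys in groups for y in ys]
    overall_mean = mean(all_values)
    ss_total = sum((y - overall_mean) ** 2 for y in all_values)
    ss_between = sum(len(ys) * (mean(ys) - overall_mean) ** 2 for ys in groups)
    # Undefined (division by zero) if every observation is identical.
    return ss_between / ss_total

# Made-up data sets chosen to hit the two extremes.
same_means = [[1.0, 3.0], [0.0, 4.0]]  # both group means equal 2
no_spread = [[1.0, 1.0], [5.0, 5.0]]   # no variability within either group

print(r_squared(same_means))  # 0.0
print(r_squared(no_spread))   # 1.0
```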


Examples

The diagram below shows how R² is calculated and interpreted for a few data sets.

Note that we have not taken into account the randomness of the sums of squares, so we cannot conclude from the R² value on its own whether the underlying group means are different.