Distribution of F ratio

As for the multi-group normal model, the mean explained and mean residual sums of squares are the corresponding sums of squares divided by their degrees of freedom (1 and n - 2 respectively). Their ratio is called the F ratio,

F = MSS_explained / MSS_residual

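If the underlying model's slope is zero (so X and Y are unrelated), the explained and residual sums of squares, each divided by σ², have independent chi-squared distributions, so the F ratio has an F distribution with (1, n - 2) degrees of freedom.

If the underlying model's slope is non-zero, the explained sum of squares tends to be inflated while the mean residual sum of squares still estimates σ², so the F ratio tends to be larger.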
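
The zero-slope case can be written out more formally. The following is a sketch under the usual normal linear model with error standard deviation σ; the symbols SS_explained and SS_residual for the two sums of squares are notation assumed here rather than taken from the text.

```latex
% When \beta_1 = 0, the scaled sums of squares are independent chi-squared variables
\[
\frac{SS_{\text{explained}}}{\sigma^2} \sim \chi^2_{1},
\qquad
\frac{SS_{\text{residual}}}{\sigma^2} \sim \chi^2_{n-2}
\qquad \text{(independent)}
\]
% The unknown \sigma^2 cancels in the ratio of the mean sums of squares
\[
F \;=\; \frac{SS_{\text{explained}} / 1}{SS_{\text{residual}} / (n-2)}
  \;\sim\; F_{1,\,n-2}
\]
```

Because σ² cancels, the distribution of the F ratio under a zero slope does not depend on the unknown error variance.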

The calculations are again organised in an analysis of variance table.

Anova table
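
As a rough illustration of how the entries of such a table can be computed for a simple linear regression, here is a minimal sketch in Python. The function name `anova_table`, the row layout and the five data points are all made up for illustration; the explained sum of squares is computed from the least squares slope.

```python
def anova_table(x, y):
    """Return ANOVA rows (source, SS, df, MSS, F) for a least squares line of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    ss_explained = sxy ** 2 / sxx            # variation explained by the fitted line
    ss_residual = ss_total - ss_explained    # variation left around the fitted line
    mss_explained = ss_explained / 1         # 1 degree of freedom
    mss_residual = ss_residual / (n - 2)     # n - 2 degrees of freedom
    f_ratio = mss_explained / mss_residual
    return [
        ("Explained", ss_explained, 1, mss_explained, f_ratio),
        ("Residual", ss_residual, n - 2, mss_residual, None),
        ("Total", ss_total, n - 1, None, None),
    ]


def fmt(value):
    """Format a table cell, leaving it blank when the entry does not apply."""
    return "" if value is None else f"{value:.3f}"


# Made-up illustrative data (not from the text)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

rows = anova_table(x, y)
print(f"{'Source':<10}{'SS':>10}{'df':>5}{'MSS':>10}{'F':>10}")
for source, ss, df, mss, f in rows:
    print(f"{source:<10}{ss:>10.3f}{df:>5d}{fmt(mss):>10}{fmt(f):>10}")
```

The explained and residual sums of squares add up to the total sum of squares, and their degrees of freedom add up to n - 1.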

Simulation illustrating distribution of F

The diagram allows samples to be selected from a normal linear model with β1 = 0 and σ = 10.

Click Take sample several times to build up the distribution of the F ratio. Observe that it has an extremely skew distribution.

The theoretical F distribution is also shown in grey. Its upper tail is longer when the sample size is small, since there are then fewer residual degrees of freedom; the difference is not clear in the diagram here, but is important.
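
The simulation can be imitated without the diagram. The sketch below uses only Python's standard library and assumes sample sizes of 5 and 20 with x-values 1, 2, ..., n and a zero intercept; none of these are stated in the text, and the helper names `f_ratio` and `simulate_f` are hypothetical. It uses β1 = 0 and σ = 10 as in the diagram.

```python
import random


def f_ratio(x, y):
    """F ratio (explained MSS over residual MSS) for a least squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    ss_explained = sxy ** 2 / sxx
    ss_residual = ss_total - ss_explained
    return ss_explained / (ss_residual / (n - 2))


def simulate_f(n, beta1, sigma, n_samples, rng, beta0=0.0):
    """F ratios from repeated samples of a normal linear model."""
    x = [float(i) for i in range(1, n + 1)]  # assumed fixed x-values
    ratios = []
    for _ in range(n_samples):
        y = [beta0 + beta1 * xi + rng.gauss(0.0, sigma) for xi in x]
        ratios.append(f_ratio(x, y))
    return ratios


rng = random.Random(1)  # seeded so the run is repeatable
summary = {}
for n in (5, 20):
    ratios = sorted(simulate_f(n, beta1=0.0, sigma=10.0, n_samples=5000, rng=rng))
    summary[n] = {
        "mean": sum(ratios) / len(ratios),
        "median": ratios[len(ratios) // 2],
        "p95": ratios[int(0.95 * len(ratios))],  # empirical 95th percentile
    }
    print(n, summary[n])

# ratios now holds the n = 20 sample; 4.41 is approximately the upper 5% point
# of the F distribution with (1, 18) degrees of freedom.
tail_20 = sum(r > 4.41 for r in ratios) / len(ratios)
print("Proportion of F ratios above 4.41 when n = 20:", tail_20)
```

A median well below the mean reflects the skewness of the distribution, and the larger 95th percentile for n = 5 reflects the longer tail at small sample sizes.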

Note that the underlying slope is zero in this simulation, so the explanatory variable does not affect Y. If the slope were non-zero, the F ratio's distribution would have a higher mean.
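
The effect of a non-zero slope can be checked with a small simulation. This is only a sketch: the sample size of 20, the x-values 1, 2, ..., 20, the intercept of 50 and the non-zero slope of 1 are illustrative assumptions, and the helper names `f_ratio` and `mean_f` are hypothetical. Only σ = 10 comes from the text.

```python
import random


def f_ratio(x, y):
    """F ratio (explained MSS over residual MSS) for a least squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    ss_explained = sxy ** 2 / sxx
    ss_residual = ss_total - ss_explained
    return ss_explained / (ss_residual / (n - 2))


def mean_f(beta1, rng, n=20, sigma=10.0, n_samples=2000):
    """Average F ratio over repeated samples with slope beta1."""
    x = [float(i) for i in range(1, n + 1)]  # assumed fixed x-values
    total = 0.0
    for _ in range(n_samples):
        # The intercept of 50 is arbitrary; the F ratio does not depend on it.
        y = [50.0 + beta1 * xi + rng.gauss(0.0, sigma) for xi in x]
        total += f_ratio(x, y)
    return total / n_samples


rng = random.Random(2)  # seeded so the run is repeatable
mean_zero = mean_f(0.0, rng)
mean_nonzero = mean_f(1.0, rng)
print("Mean F ratio with slope 0:", round(mean_zero, 2))
print("Mean F ratio with slope 1:", round(mean_nonzero, 2))
```

With these settings the average F ratio under the non-zero slope should come out clearly larger than under a zero slope, in line with the statement above.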