Assumption of equal variances

The test on the previous page assumed that the normal distributions underlying the two samples had equal variances. We now describe a hypothesis test to assess this assumption.

H0 :   \(\sigma_1^2 = \sigma_2^2\)
HA :   \(\sigma_1^2 \ne \sigma_2^2\)

We showed earlier that the two sample variances, \(S_1^2\) and \(S_2^2\), have distributions proportional to chi-squared distributions,

\[ \frac{n_1-1}{\sigma_1^2} S_1^2 \;\;\sim\;\; \ChiSqrDistn(n_1 - 1 \text{ df}) \]

and similarly for \(S_2^2\).

Hypothesis test

The ratio of the two sample variances can be used as the test statistic for these hypotheses.

Test statistic

If \(\overline{X}_1\) and \(S_1^2\) are the mean and variance of a sample of \(n_1\) values from a \(\NormalDistn(\mu_1, \sigma_1^2)\) distribution and \(\overline{X}_2\) and \(S_2^2\) are the mean and variance of an independent sample of \(n_2\) values from a \(\NormalDistn(\mu_2, \sigma_2^2)\) distribution,

\[ F \;\;=\;\; \frac{S_1^2}{S_2^2} \;\;\sim\;\; \FDistn(n_1 - 1,\; n_2 - 1 \text{ df}) \]

provided \(\sigma_1^2 = \sigma_2^2\).

If \(\sigma_1^2 = \sigma_2^2 = \sigma^2\),

\[ \begin{align} F \;\;=\;\; \frac{S_1^2}{S_2^2} \;\;&=\;\; \frac{ \frac{\large {n_1 - 1}}{\large \sigma^2} S_1^2}{n_1 - 1} \div \frac{ \frac{\large {n_2 - 1}}{\large \sigma^2} S_2^2}{n_2 - 1} \\ &\sim\;\; \frac{\ChiSqrDistn(n_1 - 1 \text{ df})}{n_1 - 1} \div \frac{\ChiSqrDistn(n_2 - 1 \text{ df})}{n_2 - 1} \\[0.4em] &=\;\; \FDistn(n_1 - 1,\; n_2 - 1 \text{ df}) \end{align} \]
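The result above can be checked informally by simulation. The following Python sketch is illustrative only: the sample sizes (11 and 13), the means, the common standard deviation and the number of replications are arbitrary assumptions, not values taken from any data set. It compares ratios of sample variances from normal samples with ratios of independent chi-squared variables that have each been divided by their degrees of freedom.

```python
import random
import statistics


def sample_variance(values):
    """Unbiased sample variance (divisor n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def simulate(n1, n2, sigma, reps, seed=1):
    """Simulate both forms of the ratio; all argument values are illustrative."""
    rng = random.Random(seed)
    variance_ratios, chi_ratios = [], []
    for _ in range(reps):
        # Ratio of sample variances from two independent normal samples with
        # different means but the same standard deviation, sigma.
        s1_sq = sample_variance([rng.gauss(0.0, sigma) for _ in range(n1)])
        s2_sq = sample_variance([rng.gauss(5.0, sigma) for _ in range(n2)])
        variance_ratios.append(s1_sq / s2_sq)

        # Ratio of independent chi-squared variables, each divided by its df.
        # A chi-squared(k) variable is a gamma variable with shape k/2, scale 2.
        c1 = rng.gammavariate((n1 - 1) / 2, 2.0) / (n1 - 1)
        c2 = rng.gammavariate((n2 - 1) / 2, 2.0) / (n2 - 1)
        chi_ratios.append(c1 / c2)
    return variance_ratios, chi_ratios


variance_ratios, chi_ratios = simulate(n1=11, n2=13, sigma=2.0, reps=20000)

print("median of variance ratios:   ", round(statistics.median(variance_ratios), 3))
print("median of chi-squared ratios:", round(statistics.median(chi_ratios), 3))
print("mean of variance ratios:     ", round(statistics.mean(variance_ratios), 3))
```

If the derivation holds, the two medians should agree to within simulation error, and the mean of the variance ratios should be close to \(12 / (12 - 2) = 1.2\), the mean of an F distribution whose second degrees of freedom is 12.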

The p-value for the test can be found from this distribution.
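Since the alternative hypothesis is two-sided, both very small and very large values of the ratio count as evidence against H0. One common convention, assumed here rather than stated on this page, is to double the smaller of the two tail probabilities. If \(f\) is the observed value of the test statistic and \(F \sim \FDistn(n_1 - 1,\; n_2 - 1 \text{ df})\),

\[ \text{p-value} \;\;=\;\; 2 \times \min\bigl\{ P(F \le f),\;\; P(F \ge f) \bigr\} \]

When \(f\) is less than 1, the smaller tail will usually be the lower one.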

Example

When analysing the data set about the effect of the hormone IAA on the growth of dwarf pea stems on the previous page, an assumption was made that the underlying normal distribution's variance was the same for both hormone levels. Test whether the data are consistent with this assumption.

The two sample variances were \(s_X^2 = 0.244\) and \(s_Y^2 = 0.353\) and their ratio is

\[ f \;\;=\;\; \frac{0.244}{0.353} \;\;=\;\; 0.693 \]

This should be compared to the \(\FDistn(10,\; 12 \text{ df})\) distribution in a two-tailed test. The tail probability of this distribution below 0.693 is 0.285, so the p-value for the test is

p-value   =   \(2 \times 0.285 \;\;=\;\; 0.569\)

Since this p-value is high, we conclude that sample variances that are as different as those observed in the data could easily have arisen by chance, even if the underlying normal variances were equal. There is therefore no evidence from the data that the two normal distribution variances differ.
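The tail probability used in this example can be checked numerically. The sketch below is one possible way to do so using only the Python standard library, and it is not necessarily how the value in the text was obtained. It relies on the fact that the F cumulative probability is a regularized incomplete beta function, which reduces to a binomial tail probability when both degrees of freedom are even, as they are here. The function name `f_cdf_even_df` is a hypothetical helper introduced for illustration; in practice a statistics library (for example `scipy.stats.f.cdf`) would handle general degrees of freedom.

```python
from math import comb


def f_cdf_even_df(f, d1, d2):
    """P(F <= f) for an F(d1, d2) distribution; valid only for even d1 and d2.

    If F ~ F(d1, d2) then X = d1*F / (d1*F + d2) ~ Beta(d1/2, d2/2). For
    integer a and b, P(Beta(a, b) <= x) = P(Binomial(a + b - 1, x) >= a).
    """
    if d1 <= 0 or d2 <= 0 or d1 % 2 or d2 % 2:
        raise ValueError("this sketch only handles positive, even degrees of freedom")
    a, b = d1 // 2, d2 // 2
    x = d1 * f / (d1 * f + d2)
    n = a + b - 1
    return sum(comb(n, k) * x**k * (1 - x) ** (n - k) for k in range(a, n + 1))


f_ratio = 0.693  # observed ratio of sample variances, as reported in the text
lower_tail = f_cdf_even_df(f_ratio, 10, 12)
p_value = 2 * min(lower_tail, 1 - lower_tail)  # two-tailed p-value

print(f"lower tail probability: {lower_tail:.4f}")
print(f"two-tailed p-value:     {p_value:.4f}")
```

The printed lower tail probability should round to the value of 0.285 quoted above. Doubling it may differ from 0.569 in the last decimal place because the ratio entered here is itself rounded to three decimals.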