Power of a test

For any decision rule, a Type II error arises when we decide that \(H_0\) is true when it is really false. However, rather than describing the probability of this error directly, it is conventional to describe its complement.

Definition

The power of a decision rule is the probability of correctly deciding that the alternative hypothesis is true when it really is true,

\[ \begin{align} \text{Power} \;&=\; 1 - P(\text{Type II error}) \\[0.3em] &=\; P\left(\text {decide }H_A \text{ is true} \mid H_A\right) \end{align} \]

A test with high power is therefore desirable.

The power of a decision rule is usually not a single value since the alternative hypothesis allows for a range of different parameter values, such as \(\mu \gt 12\). It is really a power function that can be graphed against the possible parameter values.

Failure of printed circuit boards

The example on the previous page about a new method of producing printed circuit boards examined the hypotheses

\[ H_0: \pi = 0.06 \qquad\quad H_A: \pi \lt 0.06 \]

where \(\pi\) is the probability of failure of a board. We now consider the decision rule that rejects the null hypothesis (and concludes that the new production method results in a lower probability of failure) if the number of failures, \(X\), is 7 or fewer in a sample of \(n=200\) boards. We showed previously that this decision rule has significance level \(\alpha = 0.083\).

If the alternative hypothesis holds, so that \(\pi \lt 0.06\), the power at a particular value of \(\pi\) is

\[ \begin{align} \operatorname{Power}(\pi) \;&=\; P\left(\text {decide }H_A \text{ is true} \mid \pi\right) \\[0.4em] &=\; P(X \le 7 \mid \pi) \\ &=\; \sum_{x=0}^7 {200 \choose x} \; \pi^x \; (1 - \pi)^{200-x} \end{align} \]
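As a rough numerical check, the power function above can be evaluated directly from the binomial sum. The sketch below uses only Python's standard library; the function name `power` and its parameters `n` and `critical_value` are illustrative choices, not part of the original example.

```python
from math import comb


def power(pi, n=200, critical_value=7):
    """P(X <= critical_value) when X ~ Binomial(n, pi).

    The defaults (n = 200 boards, reject H0 when 7 or fewer fail)
    come from the circuit board example; the names are illustrative.
    """
    return sum(
        comb(n, x) * pi**x * (1 - pi) ** (n - x)
        for x in range(critical_value + 1)
    )


# Probability of rejecting H0 at a few values of pi under the
# alternative (pi < 0.06), plus pi = 0.06 itself, where the value
# is the significance level of the rule.
for pi in (0.01, 0.02, 0.03, 0.04, 0.05, 0.06):
    print(f"pi = {pi:.2f}  P(reject H0) = {power(pi):.3f}")
```

Evaluated at \(\pi = 0.06\), the function returns the significance level of about 0.083 quoted earlier, and the values rise towards one as \(\pi\) moves further below 0.06.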

The diagram below graphs the power against \(\pi\), showing how the probability of rejecting \(H_0\) depends on the actual value of \(\pi\).


Observe that the power of the test is 0.746 when \(\pi = 0.03\). The engineer would be disappointed with this. It means that even if the new manufacturing method has actually halved the probability of failure from 0.06 to 0.03 (a big improvement), there is still a substantial chance (about 25%) that the null hypothesis, \(\pi = 0.06\), is not rejected and the improvement goes undetected.

Changing the decision criterion

Ideally, the power of a test should be zero if \(H_0\) is true and one if \(H_A\) is true, but this is impossible to obtain in practice.

Changing the decision rule alters the power curve. Unfortunately, changing the rule to increase the power of the test (and so decrease the probability of a Type II error) also increases its significance level, so there is a trade-off between the two types of error.
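This trade-off can be illustrated numerically for the circuit board example. The sketch below, using only Python's standard library, tabulates the significance level (evaluated at \(\pi = 0.06\)) and the power at \(\pi = 0.03\) for several critical values. The helper names `reject_prob` and `trade_off`, the choice of \(\pi = 0.03\) as the alternative, and the range of critical values are assumptions made for illustration.

```python
from math import comb


def reject_prob(pi, critical_value, n=200):
    """P(X <= critical_value) for X ~ Binomial(n, pi)."""
    return sum(
        comb(n, x) * pi**x * (1 - pi) ** (n - x)
        for x in range(critical_value + 1)
    )


def trade_off(critical_values, pi_null=0.06, pi_alt=0.03):
    """Return (critical value, significance level, power at pi_alt) rows."""
    return [
        (c, reject_prob(pi_null, c), reject_prob(pi_alt, c))
        for c in critical_values
    ]


# Each row is a rule of the form "reject H0 if X <= c".
for c, alpha, pw in trade_off(range(4, 11)):
    print(f"reject if X <= {c:2d}: alpha = {alpha:.3f}, power = {pw:.3f}")
```

Raising the critical value increases both columns together: the rule detects a real improvement more often, but also rejects a true null hypothesis more often.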

Failure of printed circuit boards

The slider under the following power function allows the decision rule to be altered.

Observe that