Decisions from tests

Hypothesis tests often result in some action by the researchers that depends on whether we conclude that H0 or HA is true. This decision depends on the data.

Decision    Action
accept H0    some action (often the status quo)   
reject H0    a different action (often a change to a process)   

However, the decision that is made could be wrong. There are two ways in which an error might be made: wrongly rejecting H0 when it is true (called a Type I error), and wrongly accepting H0 when it is false (called a Type II error). These are the two error cells in the table below:

                                    Decision
True state of nature    accept H0        reject H0
H0 is true              correct          Type I error
HA (H0 is false)        Type II error    correct

A good decision rule about whether to accept or reject H0 (and perform the corresponding action) will have small probabilities for both kinds of error.

Saturated fat content of cooking oil

The clinician who tested the saturated fat content of soybean cooking oil was interested in the hypotheses:

H0 :   \(\mu = 15\%\)
HA :   \(\mu \gt 15\%\)

If H0 is rejected, the clinician intends to report the high saturated fat content to the media. The two possible errors that could be made are described below.

                                          Decision
Truth                       accept H0 (do nothing)                reject H0 (contact media)
H0: µ is really 15%         correct                               wrongly accuses manufacturers
HA: µ is really over 15%    fails to detect high saturated fat    correct

Ideally the decision should be made in a way that keeps both probabilities low.

Decision rule and significance level

The significance level for any particular decision rule is the probability of wrongly rejecting the null hypothesis — the probability of a Type I error.

Fixing the significance level of the test at, say, 5% therefore determines the details of the decision rule so that

\[ P(\text{reject }H_0 \mid H_0 \text{ is true}) \;\;=\;\; 0.05 \]

This does not, however, tell you the probability of a Type II error.

Illustration

We illustrate the idea of these two types of error (and their probabilities) using a test for the mean of a normal distribution whose standard deviation is known to be \(\sigma = 4\). The test will be based on a random sample of \(n = 16\) values and will assess the hypotheses

H0 :   μ = 10
HA :   μ > 10

We will use the sample mean as a test statistic since its distribution is known when the null hypothesis holds,

\[ \overline{X} \;\;\sim\;\; \NormalDistn\left(\mu_0, \frac{\sigma}{\sqrt{n}}\right) \;\;=\;\; \NormalDistn(10, 1) \]

If the null hypothesis does not hold, the distribution of the test statistic is \(\NormalDistn(\mu, 1)\) where \(\mu \ne 10\). These two distributions can be used to calculate the probabilities of the two types of error.
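The two sampling distributions can be set up numerically. The following is a minimal sketch using Python's standard `statistics.NormalDist`; the alternative mean `mu_alt = 12` is an assumed value chosen only for illustration and does not come from the text.

```python
from math import sqrt
from statistics import NormalDist

sigma = 4.0   # known population standard deviation (from the text)
n = 16        # sample size (from the text)
mu_0 = 10.0   # mean under H0 (from the text)

# Standard deviation of the sample mean: sigma / sqrt(n).
std_error = sigma / sqrt(n)

# Distribution of the sample mean when H0 holds.
null_dist = NormalDist(mu=mu_0, sigma=std_error)

# Distribution of the sample mean for one hypothetical alternative.
mu_alt = 12.0  # assumed value, for illustration only
alt_dist = NormalDist(mu=mu_alt, sigma=std_error)

print("standard error:", std_error)
print("under H0:", null_dist)
print("under the assumed alternative:", alt_dist)
```

Both distributions share the same spread; only the centre moves when the null hypothesis does not hold.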

Large values of \(\overline{X}\) would usually be associated with the alternative hypothesis, so we will consider decision rules of the form

Data                      Decision
\(\overline{X} \lt k\)    accept H0
\(\overline{X} \ge k\)    reject H0

for some value of \(k\), the critical value for the test. The probabilities of the two types of error depend on \(k\).

Making \(k\) large reduces the probability of a Type I error, but makes a Type II error more likely. It is impossible to make both probabilities small simultaneously with only \(n = 16\) observations.
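The effect of the critical value can be checked numerically. The sketch below assumes the setup of this illustration (standard error 1, null mean 10) and evaluates the Type II error at a hypothetical true mean of 12; the function names, the value 12 and the grid of \(k\) values are illustrative choices, not part of the original example.

```python
from statistics import NormalDist

STD_ERROR = 1.0  # sigma / sqrt(n) = 4 / sqrt(16), from the text
MU_0 = 10.0      # mean under H0, from the text
MU_ALT = 12.0    # hypothetical true mean under HA, for illustration only


def type_i_prob(k):
    """P(reject H0 | H0 true) = P(sample mean >= k) when mu = 10."""
    return 1.0 - NormalDist(MU_0, STD_ERROR).cdf(k)


def type_ii_prob(k, mu):
    """P(accept H0 | true mean mu) = P(sample mean < k)."""
    return NormalDist(mu, STD_ERROR).cdf(k)


# Tabulate both error probabilities over a few candidate critical values.
for k in (10.5, 11.0, 11.5, 12.0, 12.5):
    alpha = type_i_prob(k)
    beta = type_ii_prob(k, MU_ALT)
    print(f"k = {k:4.1f}   P(Type I) = {alpha:.3f}   P(Type II) = {beta:.3f}")
```

As \(k\) increases down the printed table, the first probability falls while the second rises. At \(k = 12\), for instance, the Type I probability is about 0.023 but the Type II probability for a true mean of 12 is 0.5.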


Note also that there is not a single value for the probability of a Type II error; the probability depends on how far above 10 the mean \(\mu\) lies:

The probability of a Type II error is always high if \(\mu\) is close to 10, but is lower if \(\mu\) is far above 10.

This is as should be expected — the further that the real value \(\mu\) is above 10, the more likely we are to detect that it is higher than 10 from the sample mean.
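The dependence on \(\mu\) can be sketched in the same way. The code below holds the critical value fixed at a hypothetical \(k = 11.5\) (an assumed value, not taken from the text) and evaluates the probability of a Type II error for several assumed true means above 10.

```python
from statistics import NormalDist

STD_ERROR = 1.0  # sigma / sqrt(n) = 4 / sqrt(16), from the text
K = 11.5         # hypothetical critical value, chosen only for illustration


def type_ii_prob(mu, k=K):
    """P(sample mean < k) when the true mean is mu (H0 wrongly accepted)."""
    return NormalDist(mu, STD_ERROR).cdf(k)


# Assumed true means under HA, from just above 10 to well above 10.
alt_means = (10.1, 10.5, 11.0, 12.0, 13.0, 14.0)
type_ii_probs = [type_ii_prob(mu) for mu in alt_means]

for mu, beta in zip(alt_means, type_ii_probs):
    print(f"mu = {mu:4.1f}   P(Type II) = {beta:.3f}")
```

For this choice of \(k\), the probability is above 0.9 when \(\mu = 10.1\) and below 0.01 when \(\mu = 14\).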

The decision rule affects the probabilities of Type I and Type II errors and there is always a trade-off between these two probabilities. Selecting a critical value to reduce one error probability will increase the other.
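As a worked example of this trade-off, suppose the significance level in the illustration is fixed at 5%. Writing \(\Phi\) for the standard normal cumulative distribution function (a symbol introduced here, not used elsewhere in this section), the critical value follows from

\[ P(\overline{X} \ge k \mid \mu = 10) \;\;=\;\; 1 - \Phi\left(\frac{k - 10}{1}\right) \;\;=\;\; 0.05 \quad\Longrightarrow\quad k \;\;=\;\; 10 + 1.645 \times 1 \;\;\approx\;\; 11.645 \]

With this \(k\), the probability of a Type II error at a hypothetical true mean of \(\mu = 12\) (an assumed value, for illustration only) would be

\[ P(\overline{X} \lt k \mid \mu = 12) \;\;=\;\; \Phi\left(\frac{11.645 - 12}{1}\right) \;\;=\;\; \Phi(-0.355) \;\;\approx\;\; 0.36 \]

so holding the Type I error probability at 0.05 leaves roughly a one-in-three chance of missing a mean that is two standard errors above 10.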