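We now formally test the null hypothesis that model \(\mathcal{M}_S\) holds against the alternative that the larger model \(\mathcal{M}_B\) is needed.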

The following test statistic is used:

\[ X^2 \;\;=\;\; 2\log(R) \;\;=\;\; 2\left(\ell(\mathcal{M}_B) - \ell(\mathcal{M}_S)\right) \]

When \(H_0\) holds, \(X^2\) has (approximately) a standard distribution, and it tends to be large when \(\mathcal{M}_S\) is not correct.

Distribution of test statistic

If the data do come from \(\mathcal{M}_S\), and \(\mathcal{M}_B\) has \(k\) more parameters than \(\mathcal{M}_S\), then

\[ X^2 \;\;=\;\; 2\left( \ell(\mathcal{M}_B) - \ell(\mathcal{M}_S)\right) \;\;\underset{\text{approx}}{\sim} \;\; \ChiSqrDistn(k \text{ df}) \]
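The approximation can be checked informally by simulation. The sketch below is an illustration only, under assumptions that are not part of the text above: \(\mathcal{M}_S\) is taken to be a Poisson model with a fixed rate of 2, \(\mathcal{M}_B\) a Poisson model with unrestricted rate (so \(k = 1\)), and each sample has 20 observations. The helper names `poisson_sample` and `lrt_statistic` are hypothetical. If the chi-squared approximation is reasonable, about 5% of the simulated statistics should exceed 3.841, the upper 5% point of the chi-squared distribution with 1 df.

```python
import math
import random


def poisson_sample(lam, rng):
    """Draw one Poisson(lam) value by Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def lrt_statistic(data, lam_null):
    """X^2 = 2 * (loglik at the MLE - loglik at lam_null) for Poisson counts.

    The log(x!) terms are the same in both log-likelihoods, so they cancel.
    """
    n, total = len(data), sum(data)
    lam_hat = total / n  # maximum likelihood estimate under the larger model
    # When every count is zero the term total * log(...) is taken as 0.
    fit_term = total * math.log(lam_hat / lam_null) if total > 0 else 0.0
    return 2.0 * (fit_term - n * (lam_hat - lam_null))


rng = random.Random(1)          # fixed seed so the run is repeatable
n_sims, n_obs, lam_null = 2000, 20, 2.0
critical_value = 3.841          # upper 5% point of chi-squared with 1 df

exceed = 0
for _ in range(n_sims):
    sample = [poisson_sample(lam_null, rng) for _ in range(n_obs)]
    if lrt_statistic(sample, lam_null) > critical_value:
        exceed += 1

rejection_rate = exceed / n_sims
print(f"Proportion of X^2 values above {critical_value}: {rejection_rate:.3f}")
```

Because the counts are discrete and the sample is small, the proportion will not be exactly 0.05, which is why the distribution is described as approximate.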

Likelihood ratio test

  1. Find the maximum likelihood estimates of all unknown parameters in \(\mathcal{M}_B\).
  2. Find the maximum likelihood estimates of all unknown parameters in \(\mathcal{M}_S\).
  3. Evaluate the test statistic, \(X^2 = 2\left( \ell(\mathcal{M}_B) - \ell(\mathcal{M}_S)\right)\).
  4. The degrees of freedom for the test, \(k\), are the difference between the numbers of unknown parameters in the two models.
  5. The p-value for the test is the upper tail probability of the \(\ChiSqrDistn(k \text{ df})\) distribution above the test statistic.
  6. Interpret the p-value as for other kinds of hypothesis test: small values give evidence that the null hypothesis, model \(\mathcal{M}_S\), does not hold.
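
Steps 3 to 5 can be sketched in code once the two maximised log-likelihoods are known. The following is a minimal sketch, not a definitive implementation: the function names `chi2_upper_tail` and `likelihood_ratio_test` are hypothetical, and the upper tail probability is computed from the standard closed-form expressions for a chi-squared distribution with integer degrees of freedom so that only the Python standard library is needed.

```python
import math


def chi2_upper_tail(x, k):
    """P(X > x) for a chi-squared variable with integer k >= 1 degrees of freedom."""
    if k < 1 or x < 0:
        raise ValueError("need k >= 1 and x >= 0")
    half = x / 2.0
    if k % 2 == 0:
        # Even k: exp(-x/2) * sum_{j=0}^{k/2 - 1} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, k // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd k: erfc(sqrt(x/2)) plus a finite correction series
    tail = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for r in range(1, (k - 1) // 2 + 1):
        tail += term
        term *= x / (2 * r + 1)
    return tail


def likelihood_ratio_test(loglik_big, loglik_small, k):
    """Return (X^2, p-value) from two maximised log-likelihoods and k extra parameters."""
    x2 = 2.0 * (loglik_big - loglik_small)
    x2 = max(x2, 0.0)  # guard against a tiny negative value caused by rounding
    return x2, chi2_upper_tail(x2, k)


# Illustrative values only: these log-likelihoods are made up, not taken from the text.
x2, p_value = likelihood_ratio_test(loglik_big=-10.0, loglik_small=-12.0, k=1)
print(f"X^2 = {x2:.2f}, p-value = {p_value:.4f}")
```

In practice a statistics library would normally supply the chi-squared tail probability; the hand-written version here only keeps the sketch self-contained.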

Question

The following table describes the number of defective items from a production line in each of 20 days.

1  2  3  4  2  3  2  5  5  2
4  3  5  1  2  4  0  2  2  6

Assuming that the data are a random sample from a \(\PoissonDistn(\lambda)\) distribution, use a likelihood ratio test to assess whether the rate of defects was \(\lambda = 2\) per day.

(Solved in full version)
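
For readers who want to check the arithmetic, the calculation can be sketched numerically. This is a sketch under one reading of the question, not the solution from the full version: \(\mathcal{M}_S\) is taken to be the Poisson model with \(\lambda = 2\) (no unknown parameters) and \(\mathcal{M}_B\) the Poisson model with unknown \(\lambda\), so the test has 1 degree of freedom. The function name `poisson_loglik` is hypothetical.

```python
import math

# The 20 counts of defective items from the question
defects = [1, 2, 3, 4, 2, 3, 2, 5, 5, 2, 4, 3, 5, 1, 2, 4, 0, 2, 2, 6]


def poisson_loglik(lam, data):
    """Log-likelihood of a Poisson(lam) random sample; lgamma(x + 1) equals log(x!)."""
    return sum(x * math.log(lam) - lam - math.lgamma(x + 1) for x in data)


lam_hat = sum(defects) / len(defects)  # MLE of lambda under M_B is the sample mean
lam_null = 2.0                         # value fixed by M_S

x2 = 2.0 * (poisson_loglik(lam_hat, defects) - poisson_loglik(lam_null, defects))

# With 1 df, the chi-squared upper tail probability is erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(x2 / 2.0))

print(f"lambda_hat = {lam_hat:.2f}")   # sample mean, 2.90
print(f"X^2        = {x2:.2f}")        # approximately 7.10
print(f"p-value    = {p_value:.4f}")   # approximately 0.008
```

Under these assumptions the p-value is below 0.01, which would be read as strong evidence against \(\lambda = 2\).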