Long page descriptions

Chapter 11   Hypothesis Tests

11.1   Hypothesis test concepts

11.1.1   Inference

Estimation and hypothesis testing are different branches of inference about the parameters in statistical models.

11.1.2   Null and alternative hypotheses

Hypothesis tests compare two claims about the model parameters, the null and alternative hypotheses.

11.1.3   Test statistic

A test statistic is a function of the data whose value is likely to be more "extreme" in some sense when the alternative hypothesis is true than when the null hypothesis is true. Its distribution should also be fully known when the null hypothesis holds.

11.1.4   P-value and its interpretation

The p-value for a test is the probability of obtaining a test statistic at least as "extreme" as the one that was observed, when the null hypothesis holds.

11.1.5   Two-tailed tests

When values of the test statistic at both ends of its distribution favour the alternative hypothesis, the test is called two-tailed. The p-value is double the smaller tail probability.
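As a rough illustration (not from the text), the following Python sketch doubles the smaller tail probability for a standard normal test statistic; the observed value of 1.87 is made up.

    from scipy import stats

    z = 1.87                                   # observed standardised test statistic (made-up value)
    lower_tail = stats.norm.cdf(z)             # P(Z <= z) under the null hypothesis
    upper_tail = stats.norm.sf(z)              # P(Z >= z) under the null hypothesis
    p_value = 2 * min(lower_tail, upper_tail)  # double the smaller tail probability
    print(p_value)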

11.1.6   Distribution of p-values

When the null hypothesis holds, the p-value has a rectangular distribution between 0 and 1. When the alternative hypothesis is true, the p-value's distribution has greater probability of being near 0.
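A small simulation (an illustration with made-up settings, using scipy's one-sample t test rather than anything specific to the text) shows both behaviours: roughly rectangular p-values when the null hypothesis holds, and p-values concentrated near 0 under the alternative.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)

    def simulated_pvalues(true_mean, n=20, reps=10000):
        # p-values from t tests of H0: mu = 0, applied to samples with the given true mean
        samples = rng.normal(true_mean, 1.0, size=(reps, n))
        return stats.ttest_1samp(samples, 0.0, axis=1).pvalue

    p_null = simulated_pvalues(0.0)   # H0 true: roughly rectangular on (0, 1)
    p_alt = simulated_pvalues(0.5)    # alternative true: concentrated near 0
    print(np.mean(p_null < 0.1), np.mean(p_alt < 0.1))   # about 0.10 versus much more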

11.2   Goodness of fit tests

11.2.1   Counts and chi-squared distribution

A sum of squared standardised Poisson variables has approximately a chi-squared distribution.

11.2.2   Test for Poisson distribution

The chi-squared distribution can be used to test whether a random sample comes from a Poisson distribution with some pre-specified value of the parameter, λ.
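A hedged sketch of such a test, assuming the statistic is the sum of squared standardised counts, Σ(xᵢ − λ₀)²/λ₀, referred to a chi-squared distribution with n degrees of freedom; the counts and null value below are made up.

    import numpy as np
    from scipy import stats

    counts = np.array([3, 5, 2, 4, 6, 1, 4, 3, 5, 2])   # made-up counts
    lam0 = 3.5                                           # pre-specified null value of lambda

    x2 = np.sum((counts - lam0) ** 2 / lam0)             # sum of squared standardised counts
    p_value = stats.chi2.sf(x2, df=len(counts))          # approximate chi-squared, n d.f.
    print(x2, p_value)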

11.2.3   Poisson test with constraints

If the value of the Poisson parameter, λ, is estimated from the data, the chi-squared test can still be conducted, but its degrees of freedom must be reduced by one.
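The same sketch with λ replaced by its estimate (the sample mean) and one degree of freedom removed; again an illustration with made-up counts.

    import numpy as np
    from scipy import stats

    counts = np.array([3, 5, 2, 4, 6, 1, 4, 3, 5, 2])
    lam_hat = counts.mean()                              # lambda estimated from the data

    x2 = np.sum((counts - lam_hat) ** 2 / lam_hat)
    p_value = stats.chi2.sf(x2, df=len(counts) - 1)      # one d.f. lost for the estimated parameter
    print(x2, p_value)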

11.2.4   Test based on frequency table

If the Poisson counts are small, the earlier chi-squared approximation for the test statistic does not hold well. If the data are first summarised in a frequency table, the test can be applied to the cell counts instead.

11.2.5   Test for any discrete distribution

The chi-squared goodness-of-fit test can be applied to frequency tables to assess whether data are random samples from other discrete distributions.
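For example (with made-up frequencies and a hypothesised binomial model, not an example from the text), scipy's chisquare function compares observed and expected cell counts:

    import numpy as np
    from scipy import stats

    # Frequency table of x = 0, 1, 2, 3 successes out of 3 trials (made-up data)
    observed = np.array([18, 52, 60, 20])
    n_total = observed.sum()

    # Expected cell counts if x ~ Binomial(3, 0.5)
    expected = n_total * stats.binom.pmf(np.arange(4), 3, 0.5)

    x2, p_value = stats.chisquare(observed, expected)    # d.f. = number of cells - 1
    print(x2, p_value)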

11.2.6   Test for continuous distribution

The chi-squared goodness-of-fit test can also be used for continuous data if the data are first summarised in a frequency table.

11.3   Tests about normal distributions

11.3.1   Test for mean, known σ

When a normal distribution's variance is a known value, the sample mean (or a standardised version) can be used as a test statistic since it has a normal distribution whose parameters are known when the null hypothesis is true.
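A minimal sketch of this z test, assuming a made-up sample, null mean and known standard deviation:

    import numpy as np
    from scipy import stats

    x = np.array([10.2, 9.8, 10.5, 10.1, 9.9, 10.4, 10.3, 10.0])   # made-up sample
    mu0, sigma = 10.0, 0.3                                          # null mean, known sd

    z = (x.mean() - mu0) / (sigma / np.sqrt(len(x)))
    p_value = 2 * stats.norm.sf(abs(z))        # two-tailed, standard normal reference
    print(z, p_value)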

11.3.2   Test about mean, unknown σ

If σ must be estimated from the data, the corresponding test statistic should be referred to a t distribution to get the p-value for tests about μ.
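Continuing the same made-up example but estimating σ with the sample standard deviation; the hand calculation agrees with scipy's one-sample t test.

    import numpy as np
    from scipy import stats

    x = np.array([10.2, 9.8, 10.5, 10.1, 9.9, 10.4, 10.3, 10.0])
    mu0 = 10.0

    t_stat = (x.mean() - mu0) / (x.std(ddof=1) / np.sqrt(len(x)))
    p_manual = 2 * stats.t.sf(abs(t_stat), df=len(x) - 1)

    res = stats.ttest_1samp(x, mu0)            # same test via scipy
    print(t_stat, p_manual, res.pvalue)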

11.3.3   Test about variance

Tests about σ² are based on the sample variance and a chi-squared distribution is used to get the p-value.
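A sketch of a two-tailed version, assuming the usual statistic (n − 1)s²/σ₀² and made-up data and null variance:

    import numpy as np
    from scipy import stats

    x = np.array([10.2, 9.8, 10.5, 10.1, 9.9, 10.4, 10.3, 10.0])
    sigma0_sq = 0.04                                   # null value of the variance

    n = len(x)
    chi2_stat = (n - 1) * x.var(ddof=1) / sigma0_sq    # chi-squared with n - 1 d.f. under H0
    p_value = 2 * min(stats.chi2.cdf(chi2_stat, n - 1),
                      stats.chi2.sf(chi2_stat, n - 1))
    print(chi2_stat, p_value)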

11.3.4   Equal means in two distributions

If two samples come from normal distributions with equal variances, the test for comparing their means is based on a pooled estimate of the variance. The test statistic has a t distribution.
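With made-up samples, scipy's two-sample t test with equal_var=True uses this pooled-variance approach:

    import numpy as np
    from scipy import stats

    x = np.array([12.1, 11.8, 12.6, 12.0, 12.3, 11.9])          # made-up sample 1
    y = np.array([11.5, 11.9, 11.4, 11.8, 11.6, 11.3, 11.7])    # made-up sample 2

    # equal_var=True pools the two sample variances; the t statistic has
    # len(x) + len(y) - 2 degrees of freedom
    res = stats.ttest_ind(x, y, equal_var=True)
    print(res.statistic, res.pvalue)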

11.3.5   Equal variances in two distributions

The ratio of two sample variances has an F distribution when the distributions from which the samples are drawn have equal variances. A test for equal variances can be based on this.
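A sketch of the two-tailed variance-ratio test for the same made-up samples:

    import numpy as np
    from scipy import stats

    x = np.array([12.1, 11.8, 12.6, 12.0, 12.3, 11.9])
    y = np.array([11.5, 11.9, 11.4, 11.8, 11.6, 11.3, 11.7])

    f_stat = x.var(ddof=1) / y.var(ddof=1)             # ratio of sample variances
    df1, df2 = len(x) - 1, len(y) - 1
    p_value = 2 * min(stats.f.cdf(f_stat, df1, df2),
                      stats.f.sf(f_stat, df1, df2))
    print(f_stat, p_value)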

11.4   Fixed significance level

11.4.1   Significance level

Hypothesis tests are often based on interpreting a p-value. An alternative approach makes a decision about which of the two hypotheses is true. The significance level is the probability of wrongly rejecting the null hypothesis when it is actually true.

11.4.2   Type I and II errors

Two types of error can be made from a test: deciding that H₀ is true when it is false (a Type II error), and deciding that it is false when it is true (a Type I error).

11.4.3   P-values and decisions

A test with significance level α can be based on the test's p-value. The null hypothesis is rejected if the p-value is less than α.

11.4.4   Significance levels for discrete data

For discrete data, it is usually impossible to find a decision rule with significance level exactly 5% (or any other pre-specified value). A conservative test uses a significance level that is just under the required value.
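For instance (a made-up one-tailed binomial example, not from the text), the rejection region can be chosen so that its exact probability under H₀ is as large as possible without exceeding 5%:

    from scipy import stats

    n, p0, alpha = 20, 0.5, 0.05              # test H0: p = 0.5 against H1: p > 0.5

    # Smallest cut-off c with P(X >= c | H0) <= alpha
    for c in range(n + 1):
        tail = stats.binom.sf(c - 1, n, p0)   # P(X >= c) under H0
        if tail <= alpha:
            print(c, tail)                    # rejects when X >= 15; exact level is about 0.021
            break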

11.4.5   Power function

The power of a test is the probability of rejecting H₀ when it is false. Since this usually depends on the actual parameter value, it is a function that can be graphed.
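For a one-sided z test about a normal mean with known σ (an assumed setting, with made-up values), the power function can be evaluated directly:

    import numpy as np
    from scipy import stats

    mu0, sigma, n, alpha = 0.0, 1.0, 25, 0.05
    z_crit = stats.norm.ppf(1 - alpha)         # reject H0: mu = mu0 when z > z_crit

    def power(mu):
        # P(reject H0) when the true mean is mu
        shift = (mu - mu0) * np.sqrt(n) / sigma
        return stats.norm.sf(z_crit - shift)

    for mu in [0.0, 0.2, 0.4, 0.6]:
        print(mu, power(mu))                   # power at mu0 equals the significance level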

11.4.6   Deciding on the sample size

If a hypothesis test's significance level and its power at some value of the parameter are specified, the sample size can be determined to achieve this.
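For the same one-sided z-test setting, a standard formula gives the required sample size (an illustration with made-up values of α, power and the alternative mean):

    import numpy as np
    from scipy import stats

    mu0, mu1, sigma = 0.0, 0.5, 1.0            # null mean, alternative of interest, known sd
    alpha, power = 0.05, 0.90                  # required significance level and power at mu1

    z_alpha = stats.norm.ppf(1 - alpha)
    z_beta = stats.norm.ppf(power)
    n = ((z_alpha + z_beta) * sigma / (mu1 - mu0)) ** 2
    print(int(np.ceil(n)))                     # round up to the next whole observation (35 here)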

11.5   Likelihood ratio test

11.5.1   Big model vs small model

We often want to compare two models for data where one model (the small model) is a special case of the other (the big model).

11.5.2   Likelihood ratio

The ratio of the maximum possible likelihoods under the big and small models gives information about whether the small model fits adequately.

11.5.3   Likelihood ratio test

The likelihood ratio test is based on twice the log of the likelihood ratio. An approximate p-value can be found from a chi-squared distribution.
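A minimal sketch, assuming exponential data and testing a pre-specified rate λ₀ (the small model) against an unrestricted rate (the big model); the data are made up.

    import numpy as np
    from scipy import stats

    x = np.array([0.8, 2.1, 0.4, 1.7, 3.0, 0.9, 1.2, 2.6, 0.5, 1.4])   # made-up data
    lam0 = 0.5                                 # rate fixed by the small model

    def loglik(lam, data):
        # exponential log-likelihood: n*log(lam) - lam*sum(data)
        return len(data) * np.log(lam) - lam * data.sum()

    lam_hat = 1.0 / x.mean()                   # MLE under the big model
    lr_stat = 2 * (loglik(lam_hat, x) - loglik(lam0, x))
    p_value = stats.chi2.sf(lr_stat, df=1)     # one parameter fixed by the small model
    print(lr_stat, p_value)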

11.5.4   Another example

This example uses the likelihood ratio test to assess whether the rate parameters of two exponential distributions are equal.
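A sketch along those lines, with made-up samples: the big model allows each group its own rate, the small model forces a common rate, and twice the log likelihood ratio is referred to a chi-squared distribution with 1 degree of freedom.

    import numpy as np
    from scipy import stats

    x = np.array([0.8, 2.1, 0.4, 1.7, 3.0, 0.9, 1.2, 2.6])   # made-up sample 1
    y = np.array([0.3, 0.9, 0.5, 1.1, 0.2, 0.7, 0.4])         # made-up sample 2

    def loglik(lam, data):
        return len(data) * np.log(lam) - lam * data.sum()

    l_big = loglik(1 / x.mean(), x) + loglik(1 / y.mean(), y)   # separate rates
    lam_common = (len(x) + len(y)) / (x.sum() + y.sum())        # pooled MLE
    l_small = loglik(lam_common, x) + loglik(lam_common, y)     # common rate

    lr_stat = 2 * (l_big - l_small)
    p_value = stats.chi2.sf(lr_stat, df=1)     # big model has one extra parameter
    print(lr_stat, p_value)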

11.6   CI from inverting a test

11.6.1   Test statistics and pivots

The concepts of test statistics and pivots are closely related. Hypothesis tests and confidence intervals based on them are therefore also closely related.

11.6.2   Test from a confidence interval

An example shows how a hypothesis test about a binomial parameter with 5% significance level can be obtained from a 95% confidence interval.

11.6.3   Confidence interval from a test

Another example derives an exact confidence interval for a binomial probability by inverting a hypothesis test whose p-value is found exactly using the binomial distribution.
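One way to sketch this numerically (a grid search with made-up data, using scipy's exact binomial test rather than the text's own calculations): the interval is the set of null values that the 5% test would not reject.

    import numpy as np
    from scipy import stats

    x, n, alpha = 7, 25, 0.05      # observed successes out of n trials (made-up)

    # Keep every p0 whose exact two-tailed binomial test p-value is at least alpha
    grid = np.linspace(0.001, 0.999, 999)
    kept = [p0 for p0 in grid if stats.binomtest(x, n, p0).pvalue >= alpha]
    print(min(kept), max(kept))    # approximate endpoints of the exact 95% interval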

11.6.4   Confidence interval for median (Difficult)

A confidence interval for the median of a continuous distribution can be found by inverting a hypothesis test about it.
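A sketch of one common construction, inverting the sign test: under H₀ the number of values above the hypothesised median is Binomial(n, 0.5), and the resulting interval runs between two order statistics (the data below are made up).

    import numpy as np
    from scipy import stats

    x = np.sort(np.array([4.1, 5.6, 3.8, 7.2, 4.9, 6.3, 5.1, 4.4, 6.8, 5.9]))
    n, alpha = len(x), 0.05

    # Largest k with P(fewer than k values above the median) <= alpha/2 under H0
    k = max(j for j in range(1, n // 2 + 1)
            if stats.binom.cdf(j - 1, n, 0.5) <= alpha / 2)
    coverage = 1 - 2 * stats.binom.cdf(k - 1, n, 0.5)
    print(x[k - 1], x[n - k], coverage)    # interval (x_(k), x_(n-k+1)); coverage >= 95%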

11.6.5   CI from likelihood ratio test

Finally, an example shows how inverting a likelihood ratio test can be used to find a confidence interval for an exponential parameter.