A general framework

The examples in this section may seem to have little in common, but they are all hypothesis tests and fit into a single framework that applies to every hypothesis test.

Data, model and question

Data (and model)
The data are assumed to arise from a random mechanism (model), some of whose characteristics are unknown.
Null hypothesis
This is a statement about the characteristics of the model. We are particularly interested in whether the null hypothesis is true.
Alternative hypothesis
This is usually just the opposite of the null hypothesis.

We assess whether the null hypothesis is true by asking ...

Are the data consistent with the null hypothesis?

Test statistic
This is some function of the data that throws light on whether the null or alternative hypothesis holds.
P-value
This is the probability of obtaining a test statistic value at least as 'extreme' as the one recorded if the null hypothesis holds.
Interpreting the p-value
The following table can be used as a guide:
p-value                  Interpretation
over 0.1                 no evidence that the null hypothesis does not hold
between 0.05 and 0.1     very weak evidence that the null hypothesis does not hold
between 0.01 and 0.05    moderately strong evidence that the null hypothesis does not hold
under 0.01               strong evidence that the null hypothesis does not hold
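The guide above can be sketched as a small lookup function. The function name and the treatment of the exact boundary values (0.1, 0.05, 0.01) are assumptions made for illustration; the table itself does not say which band a boundary value belongs to. Here a boundary value is placed in the stronger-evidence band, which matches the proportion example, where a p-value of 0.05 is read as moderately strong evidence.

```python
def interpret_p_value(p):
    """Return the guideline interpretation for a p-value between 0 and 1.

    Assumption: a value exactly on a boundary falls in the band with
    stronger evidence (for example, 0.05 counts as 'moderately strong').
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p > 0.1:
        return "no evidence that the null hypothesis does not hold"
    if p > 0.05:
        return "very weak evidence that the null hypothesis does not hold"
    if p > 0.01:
        return "moderately strong evidence that the null hypothesis does not hold"
    return "strong evidence that the null hypothesis does not hold"


print(interpret_p_value(0.035))
```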

Soccer league in one season

Data (and model)
End-of-season points for the teams in the league. The model involves probabilities for wins, losses and draws in all matches, but we know little about these probabilities.
Null hypothesis
All teams have the same probability of winning each match.
Alternative hypothesis
Not all teams have the same probability of winning.
Test statistic
The standard deviation of final points. It will be low if the teams have the same abilities (null hypothesis) and higher otherwise (alternative hypothesis).
P-value
Probability that the standard deviation is at least as high as 16.7 (the actual data) if all teams are equally matched. None of the simulated leagues had standard deviations as high as 16.7, so the estimated p-value is zero.
Interpreting the p-value
There is extremely strong evidence that the teams do not have the same probability of winning.
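A simulation of this kind might look like the sketch below. Only the observed standard deviation of 16.7 comes from the text; the league size (20 teams), the double round-robin schedule, the 3/1/0 points system, the draw probability of 0.25 and the number of simulated seasons are all assumptions made for illustration.

```python
import random
import statistics


def simulate_equal_league(n_teams, p_draw, rng):
    """Simulate one season in which every team is equally matched and
    return the standard deviation of the final points.

    Assumptions: each pair of teams meets twice, a win earns 3 points,
    a draw earns 1 point each, and a loss earns nothing.
    """
    points = [0] * n_teams
    p_first_wins = (1.0 - p_draw) / 2.0  # same chance for either side
    for a in range(n_teams):
        for b in range(a + 1, n_teams):
            for _ in range(2):  # two fixtures per pair
                u = rng.random()
                if u < p_draw:
                    points[a] += 1
                    points[b] += 1
                elif u < p_draw + p_first_wins:
                    points[a] += 3
                else:
                    points[b] += 3
    return statistics.stdev(points)


rng = random.Random(1)          # fixed seed so the run is repeatable
observed_sd = 16.7              # value quoted in the text
simulated = [simulate_equal_league(20, 0.25, rng) for _ in range(1000)]

# Proportion of simulated seasons at least as spread out as the real one.
p_value = sum(sd >= observed_sd for sd in simulated) / len(simulated)
print(p_value)
```

A p-value estimated this way can only be resolved to about one over the number of simulated seasons, so an estimate of zero means "smaller than the simulation can detect" rather than exactly zero.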

Proportion

Data (and model)
Number of successes in 100 values. Our model assumes that P(success) is the same for all values and that the values are independent.
Null hypothesis
The probability of success is 0.80.
Alternative hypothesis
The probability of success is less than 0.80.
Test statistic
The number of successes in the 100 values. It will be near 80 if the underlying probability of success is 0.80 (null hypothesis) and lower than 80 if it is less (alternative hypothesis).
P-value
Probability of 72/100 or fewer successes (the actual data) if the underlying population proportion is 0.80. The p-value was 0.05.
Interpreting the p-value
Since 72 or fewer successes would be unlikely if the population proportion were 0.80, we have moderately strong evidence that it is lower than 0.80.
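Under the model above, the number of successes has a binomial distribution, so the lower-tail probability can also be computed directly instead of by simulation. The sketch below does this with the standard library; the function name is hypothetical. The figure quoted in the text may have been rounded or estimated by simulation, so the exact tail probability need not agree with it to the last digit.

```python
from math import comb


def binomial_lower_tail(k, n, p):
    """P(X <= k) when X counts successes in n independent trials,
    each with success probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Values from the text: 72 or fewer successes out of 100 when P(success) = 0.80.
p_value = binomial_lower_tail(72, 100, 0.80)
print(round(p_value, 3))
```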

Process mean

Data (and model)
Sample of 10 values from a process. Our model is that they were sampled from a normal distribution with σ = 10 and unknown µ.
Null hypothesis
The distribution of values has mean µ = 520 gm.
Alternative hypothesis
The mean is different from this, µ ≠ 520 gm.
Test statistic
The mean of our sample of 10 values. It will be close to 520 if the process is working correctly (the null hypothesis) and farther from this if the process mean has drifted from 520 (alternative hypothesis).
P-value
Probability of a sample mean at least as far from 520 as the value in our actual data (529) if the underlying population mean is 520. Since none of the simulated samples had means as far from 520, the estimated p-value is zero.
Interpreting the p-value
Since a mean as far from 520 as our actual mean (529) is very unlikely, there is strong evidence that the mean weight is no longer 520 gm.
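Because the model specifies a normal distribution with known σ, the sample mean is itself normal with standard deviation σ/√n when the null hypothesis holds, so the two-sided tail probability can be worked out without simulation. The sketch below uses the numbers from the text (n = 10, σ = 10, µ = 520, observed mean 529); the function name is hypothetical. The result is small but not exactly zero, which is consistent with no simulated sample being as extreme.

```python
from math import erfc, sqrt


def two_sided_p_value_known_sigma(sample_mean, mu0, sigma, n):
    """Two-sided p-value for a sample mean when the population is normal
    with known standard deviation sigma and hypothesised mean mu0."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))  # standardised distance
    # 2 * (1 - Phi(|z|)) written with the complementary error function.
    return erfc(abs(z) / sqrt(2))


p_value = two_sided_p_value_known_sigma(529, 520, 10, 10)
print(p_value)
```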

Comparison of groups

Data (and model)
Samples of values from two groups. No assumptions are made about the shape of the underlying distributions (model) but the data are assumed to be random samples from them.
Null hypothesis
The population distributions are the same in both groups.
Alternative hypothesis
The groups have different distributions.
Test statistic
The difference between the means of the two groups. It should be near zero if the two populations are the same (null hypothesis) and further from zero if the groups are different (alternative hypothesis).
P-value
Probability that the difference in means is at least as far from zero as 0.902 (the actual data) if both samples come from the same population. Since this never happened in our randomised samples, the estimated p-value is zero.
Interpreting the p-value
There is extremely strong evidence that the null hypothesis does not hold; values are higher in Group A than in Group B.
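A randomisation test of this kind might be sketched as follows: pool the values, reallocate them to the two groups at random many times, and count how often the difference in means is at least as far from zero as the observed one. The two data lists, the function name and the number of randomisations are invented for illustration and are not the data behind the 0.902 quoted above.

```python
import random
import statistics


def randomisation_p_value(group_a, group_b, n_random, rng):
    """Two-sided randomisation test for a difference in group means.

    Returns the observed difference and the proportion of random
    reallocations whose difference is at least as far from zero.
    """
    observed = statistics.mean(group_a) - statistics.mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_random):
        rng.shuffle(pooled)  # random reallocation of values to groups
        diff = statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    return observed, extreme / n_random


# Hypothetical data, for illustration only.
group_a = [5.1, 4.8, 5.6, 5.3, 4.9, 5.5, 5.2, 5.0]
group_b = [4.2, 4.5, 3.9, 4.4, 4.1, 4.6, 4.0, 4.3]

observed, p_value = randomisation_p_value(group_a, group_b, 2000, random.Random(2))
print(observed, p_value)
```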

Correlation coefficient

Data (and model)
Pairs of values from two variables (Y and X). No assumptions are made about a model underlying the data.
Null hypothesis
The correlation coefficient between Y and X in the population is zero.
Alternative hypothesis
The correlation is non-zero.
Test statistic
The sample correlation coefficient between Y and X. It will be close to zero if the variables are uncorrelated in the underlying population (null hypothesis) and further from zero otherwise (alternative hypothesis).
P-value
Probability that the correlation coefficient is at least as far from zero as 0.537 (the actual data) if the variables are uncorrelated in the underlying population. Only 3.5% of the simulated samples had relationships as strong, so this is the p-value.
Interpreting the p-value
A correlation coefficient as far from zero as 0.537 would be unlikely if the null hypothesis were true, so there is moderately strong evidence that the variables are correlated.
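One way to obtain such a p-value without assuming a model is to shuffle the Y values against the X values, which breaks any association, and see how often the shuffled correlation is at least as far from zero as the observed one. The text does not say how its simulated samples were generated, so this shuffling approach is an assumption, as are the function names and the small data set below.

```python
import random
from math import sqrt


def correlation(xs, ys):
    """Sample (Pearson) correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def shuffled_correlation_p_value(xs, ys, n_shuffles, rng):
    """Two-sided p-value estimated by shuffling ys relative to xs."""
    observed = correlation(xs, ys)
    shuffled = list(ys)
    extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # destroys any link between the variables
        if abs(correlation(xs, shuffled)) >= abs(observed):
            extreme += 1
    return observed, extreme / n_shuffles


# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
ys = [3.1, 2.4, 4.0, 2.9, 5.2, 3.3, 4.8, 6.1, 3.9, 5.5, 4.4, 6.0]

observed, p_value = shuffled_correlation_p_value(xs, ys, 2000, random.Random(4))
print(observed, p_value)
```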