As with single-group data, the populations underlying two-group data sets are usually of more interest than the specific sample data.
Two-group data sets are often modelled as separate random samples from two normal populations.
The normal model has four parameters — the means and standard deviations in the two groups.
The parameters of the normal model can be estimated by the sample means and standard deviations in the two groups.
The difference between the population means is of particular interest. The difference between the sample means provides an estimate. It varies from sample to sample and has a distribution.
The mean of a random sample is approximately normal with s.d. σ/√n. The sum of a random sample is also approximately normal, but its s.d. is σ√n.
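A minimal sketch of these two results, using a hypothetical population with σ = 10 and samples of size n = 25 (the simulation check assumes a population mean of 0 so the root-mean-square of the sample means estimates their s.d.):

```python
import math
import random

random.seed(1)
sigma, n = 10.0, 25

# Theoretical s.d. of the mean and of the sum of a random sample of size n
sd_mean = sigma / math.sqrt(n)   # sigma / sqrt(n) = 2.0
sd_sum = sigma * math.sqrt(n)    # sigma * sqrt(n) = 50.0

# Simulation check for the mean (hypothetical population with mu = 0)
means = [sum(random.gauss(0, sigma) for _ in range(n)) / n
         for _ in range(5000)]
sim_sd = math.sqrt(sum(m * m for m in means) / len(means))
# sim_sd should be close to sd_mean
```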
The sum and difference of two independent normal variables are also normally distributed. If the variables have the same standard deviation, σ, the sum and difference both have standard deviation √2 σ ≈ 1.414σ. (Their variance is 2σ².)
This page generalises the results to the sum and difference of variables whose standard deviations may be different.
If two variables are independent and have normal distributions, probabilities relating to their sum and difference can be found using the formulae for the mean and standard deviation of sums and differences.
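A sketch of such a probability calculation, with hypothetical parameters X ~ normal(50, 3) and Y ~ normal(45, 4); the normal cdf is evaluated with `math.erfc` rather than a statistics library:

```python
import math

# Hypothetical independent normal variables X and Y
mu1, sigma1 = 50.0, 3.0
mu2, sigma2 = 45.0, 4.0

# Mean and s.d. of the sum and the difference
mean_sum, sd_sum = mu1 + mu2, math.sqrt(sigma1**2 + sigma2**2)
mean_diff, sd_diff = mu1 - mu2, math.sqrt(sigma1**2 + sigma2**2)

# e.g. P(X - Y > 0), using P(D > c) = 0.5 * erfc((c - mu) / (sigma * sqrt(2)))
prob = 0.5 * math.erfc((0 - mean_diff) / (sd_diff * math.sqrt(2)))
```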
The difference between the means of two samples from normal populations has a normal distribution whose mean and s.d. can be found from the population means and s.d.s. This is the approximate distribution even when the populations are non-normal.
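The mean and s.d. of this distribution can be computed directly; a sketch with hypothetical population parameters for the two groups:

```python
import math

# Hypothetical population parameters and sample sizes for two groups
mu1, sigma1, n1 = 70.0, 8.0, 40
mu2, sigma2, n2 = 65.0, 6.0, 30

# Distribution of (sample mean 1 - sample mean 2)
mean_diff = mu1 - mu2
sd_diff = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
```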
When the difference between the sample means is used to estimate the difference between the underlying population means, there is likely to be an error. The error distribution is approximately normal with mean 0. A formula for its standard deviation is given.
A 95% confidence interval is given for the difference between two population means. Its properties are demonstrated.
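A sketch of such an interval from two hypothetical samples; the multiplier 1.96 is the normal approximation, whereas an exact interval would use a t multiplier:

```python
import math
import statistics

# Hypothetical samples from two groups
x = [23, 25, 28, 30, 26, 24, 27, 29, 31, 22]
y = [20, 22, 19, 24, 21, 23, 18, 25, 20, 22]

diff = statistics.mean(x) - statistics.mean(y)
# Estimated s.d. of the difference between the sample means
se = math.sqrt(statistics.variance(x) / len(x)
               + statistics.variance(y) / len(y))

# Approximate 95% confidence interval for mu1 - mu2
lo, hi = diff - 1.96 * se, diff + 1.96 * se
```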
A hypothesis test is developed for testing whether two group means are the same.
If the alternative hypothesis is for one particular mean to be greater, then the p-value for the test is found from only one tail of the t distribution.
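A sketch of the test statistic for hypothetical two-group data with a one-sided alternative; the p-value itself would come from the upper tail of a t distribution (e.g. via a statistics library), so only the statistic is computed here:

```python
import math
import statistics

# Hypothetical data; H0: mu1 = mu2, HA: mu1 > mu2 (one-tailed)
x = [5.1, 4.8, 5.6, 5.3, 4.9, 5.4]
y = [4.6, 4.9, 4.4, 4.7, 4.5, 4.8]

se = math.sqrt(statistics.variance(x) / len(x)
               + statistics.variance(y) / len(y))
t = (statistics.mean(x) - statistics.mean(y)) / se
# One-tailed p-value = upper-tail area of the t distribution beyond t;
# a large positive t gives a small p-value, supporting HA
```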
Two-group categorical data can be modelled as samples from two categorical populations with different probabilities of 'success'.
The difference between two sample proportions has a distribution that is approximately normal and whose parameters can be estimated using earlier results about the mean and standard deviation of differences.
The standard deviation of the difference between two sample proportions can be estimated. From this, a 95% confidence interval is developed for the difference between two probabilities.
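A sketch with hypothetical success counts in the two groups, again using the normal-approximation multiplier 1.96:

```python
import math

# Hypothetical success counts: 45 of 60 in group 1, 30 of 55 in group 2
x1, n1 = 45, 60
x2, n2 = 30, 55

p1, p2 = x1 / n1, x2 / n2
# Estimated s.d. of the difference between the sample proportions
se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# Approximate 95% confidence interval for pi1 - pi2
lo, hi = (p1 - p2) - 1.96 * se, (p1 - p2) + 1.96 * se
```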
A hypothesis test is developed to assess whether two population probabilities are the same.
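A sketch of this test for hypothetical counts; under the null hypothesis the two probabilities are equal, so the samples are pooled to estimate the common probability, and the two-tailed p-value is found from the standard normal distribution via `math.erfc`:

```python
import math

# Hypothetical success counts in two groups
x1, n1 = 40, 100
x2, n2 = 28, 100
p1, p2 = x1 / n1, x2 / n2

# Pooled estimate of the common probability under H0: pi1 = pi2
p = (x1 + x2) / (n1 + n2)
se0 = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))

z = (p1 - p2) / se0
# Two-tailed p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
pval = math.erfc(abs(z) / math.sqrt(2))
```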
Paired data are a type of bivariate data in which two similar measurements are made on each individual. We are usually interested in testing whether the means of the two measurements are the same.
For paired data, differences between the two measurements hold all information about whether the means of both variables are the same.
Testing for a difference between the means of the measurements is done with an ordinary t-test for whether the mean difference is zero.
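A sketch of this paired t-test for hypothetical before/after measurements; only the t statistic is computed, and its p-value would come from a t distribution with n − 1 degrees of freedom:

```python
import math
import statistics

# Hypothetical paired measurements on the same individuals
before = [12.1, 11.4, 13.0, 12.6, 11.9, 12.4, 13.2, 12.0]
after = [11.6, 11.0, 12.5, 12.4, 11.3, 12.0, 12.6, 11.8]

# Reduce to single-group data: the within-pair differences
d = [b - a for b, a in zip(before, after)]

# Ordinary one-sample t statistic for H0: mean difference = 0
t = statistics.mean(d) / (statistics.stdev(d) / math.sqrt(len(d)))
# Compare with a t distribution on len(d) - 1 = 7 degrees of freedom
```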
To estimate or test the difference between two means, it is sometimes possible to collect data either from two independent samples or from paired units. If the paired units are similar, paired data give more accurate results.
To compare the means of several groups, a model with normal distributions in all groups is used, but all group standard deviations must be assumed to be the same.
The sample standard deviations in the separate groups can be combined to give a pooled estimate of the common standard deviation, σ.
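A sketch of the pooled estimate for hypothetical data in three groups, weighting each group's sample variance by its degrees of freedom:

```python
import math
import statistics

# Hypothetical measurements in three groups (common sigma assumed)
groups = [[4.1, 4.5, 3.9, 4.3],
          [5.0, 5.4, 5.2],
          [3.6, 3.9, 3.3, 3.8, 3.4]]

# Combine: sum of (n_i - 1) * s_i^2, divided by total degrees of freedom
ss = sum((len(g) - 1) * statistics.variance(g) for g in groups)
df = sum(len(g) - 1 for g in groups)
pooled_sd = math.sqrt(ss / df)
```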
Earlier CIs and tests for equality of two group means can be improved when the group standard deviations are known to be the same. However this refinement is not recommended for general use.
Both variability between group means and variability within groups must be used to assess whether the groups differ.
Variability within groups and between groups are described by sums of squares.
The coefficient of determination (R-squared) is the ratio of the between-groups sum of squares to the total sum of squares. It is the proportion of variation that can be explained by differences between the groups.
The F-ratio is a test statistic that is based on the between- and within-groups sums of squares. The associated p-value tests whether all groups have the same mean.
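A sketch of the whole decomposition for a hypothetical three-group data set: the total sum of squares splits into between- and within-groups components, which give both R-squared and the F-ratio. The p-value itself would come from an F distribution with (k − 1, n − k) degrees of freedom:

```python
import statistics

# Hypothetical one-way layout with three groups
groups = [[12, 14, 13, 15], [17, 16, 18, 17], [11, 10, 12, 13]]
values = [v for g in groups for v in g]
grand = statistics.mean(values)

ss_total = sum((v - grand) ** 2 for v in values)
ss_between = sum(len(g) * (statistics.mean(g) - grand) ** 2 for g in groups)
ss_within = ss_total - ss_between

# Proportion of variation explained by group differences
r_squared = ss_between / ss_total

# F-ratio: each sum of squares divided by its degrees of freedom
k, n = len(groups), len(values)
f_ratio = (ss_between / (k - 1)) / (ss_within / (n - k))
# p-value = upper tail of the F distribution with (k-1, n-k) d.f.
```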
The F-test is applied to a few data sets.
In some data sets, the values arise in blocks of 3 or more related measurements. Randomised block and repeated measures data are of this form.
Ignoring the blocking of values loses important information about the difference between treatments. Comparing treatments separately against a baseline treatment using paired differences may be possible.
If there is no baseline treatment against which to compare the other measurements in each block, it is possible to simultaneously test whether all treatment means are equal. Again, ignoring the blocks loses important information.
Data of this form often arise from a randomised block experiment in which the experimental units occur in related blocks and treatments are randomly allocated within each block.
Although blocks and treatments arise in different ways, they are modelled similarly. A 3-dimensional display of the data represents both blocks and treatments in the same way.
The variation between blocks can be removed by adding or subtracting a constant within each block to make all block means equal. This reduces the residual (unexplained) sum of squares.
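A sketch of this adjustment for a hypothetical blocked data set; after shifting, every block mean equals the grand mean, and the sum of squares left unexplained by the treatment (column) means drops:

```python
import statistics

# Hypothetical blocked data: rows = blocks, columns = treatments
data = [[7, 9, 11], [5, 8, 8], [9, 10, 14]]
grand = statistics.mean(v for row in data for v in row)

# Shift every value in a block so that all block means equal the grand mean
adjusted = [[v - statistics.mean(row) + grand for v in row] for row in data]

# Sum of squares unexplained by the treatment (column) means
def resid_ss(d):
    return sum(sum((v - statistics.mean(col)) ** 2 for v in col)
               for col in zip(*d))

before, after = resid_ss(data), resid_ss(adjusted)
# The block adjustment makes 'after' smaller than 'before'
```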
The total sum of squares can be split into sums of squares for blocks and treatments, and a residual sum of squares.
An anova table shows these sums of squares and associated degrees of freedom. The F-ratio for treatments in the table is the basis of a test for equal treatment means. Several examples are given.
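A sketch of this split and the treatment F-ratio for a small hypothetical randomised block data set; the p-value would come from an F distribution with (t − 1, (b − 1)(t − 1)) degrees of freedom:

```python
import statistics

# Hypothetical randomised block data: rows = blocks, columns = treatments
data = [[10, 13, 13],
        [ 9, 11, 13],
        [11, 12, 16]]

b, t = len(data), len(data[0])
grand = statistics.mean(v for row in data for v in row)
block_means = [statistics.mean(row) for row in data]
treat_means = [statistics.mean(col) for col in zip(*data)]

# Split of the total sum of squares
ss_total = sum((v - grand) ** 2 for row in data for v in row)
ss_blocks = t * sum((m - grand) ** 2 for m in block_means)
ss_treats = b * sum((m - grand) ** 2 for m in treat_means)
ss_resid = ss_total - ss_blocks - ss_treats

# F-ratio for treatments, with (t-1) and (b-1)(t-1) degrees of freedom
f_treat = (ss_treats / (t - 1)) / (ss_resid / ((b - 1) * (t - 1)))
```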