Modelling two-group data

Numerical data from two groups are usually modelled as random samples from two underlying populations. Categorical data can be modelled in a similar way.

Typical data sets

We now show a few examples in which the difference between two sample proportions provides an estimate of the difference between the underlying probabilities.
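As a minimal sketch of the estimate described above, the following Python snippet computes the two sample proportions and their difference from success counts. The function name `proportion_difference` and the counts used (18 out of 60 in group 1, 30 out of 75 in group 2) are hypothetical values chosen only for illustration; they do not come from any of the data sets in this section.

```python
def proportion_difference(successes1, n1, successes2, n2):
    """Return the sample proportions p1, p2 and the difference p2 - p1."""
    p1 = successes1 / n1  # sample proportion in group 1
    p2 = successes2 / n2  # sample proportion in group 2
    return p1, p2, p2 - p1


# Hypothetical counts, for illustration only.
p1, p2, diff = proportion_difference(18, 60, 30, 75)
print(f"p1 = {p1:.3f}, p2 = {p2:.3f}, p2 - p1 = {diff:.3f}")
```

The value of p2 - p1 is a point estimate of the difference between the two underlying probabilities; a different sample would give a different value.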

Sample-to-sample variability

A simulation shows that the difference between the sample proportions varies from sample to sample and usually differs from the difference between the population probabilities.
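A simulation of this kind might be sketched as follows. The population probabilities (0.3 and 0.4), the sample sizes (60 and 75), the number of repetitions and the name `simulate_differences` are all assumptions made for illustration, not values taken from this section.

```python
import random


def simulate_differences(pi1, n1, pi2, n2, n_sims=1000, seed=1):
    """Repeatedly draw two independent samples and record p2 - p1 each time.

    pi1, pi2 are the (assumed) population probabilities of success;
    n1, n2 are the sample sizes of the two groups.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    diffs = []
    for _ in range(n_sims):
        # Count successes in each simulated sample.
        x1 = sum(rng.random() < pi1 for _ in range(n1))
        x2 = sum(rng.random() < pi2 for _ in range(n2))
        diffs.append(x2 / n2 - x1 / n1)
    return diffs


# Hypothetical settings, for illustration only.
diffs = simulate_differences(0.3, 60, 0.4, 75)
mean_diff = sum(diffs) / len(diffs)
print(f"smallest p2 - p1: {min(diffs):.3f}")
print(f"largest  p2 - p1: {max(diffs):.3f}")
print(f"average  p2 - p1: {mean_diff:.3f}")
```

Individual simulated differences are scattered around the true difference of 0.1 assumed here, while their average over many repetitions should lie close to it.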

We will examine the distribution of p2 - p1 more carefully on the next page.