Contingency tables from univariate data in several groups
Contingency tables often arise from bivariate categorical data. However, they can also arise from univariate categorical data that are recorded separately in several groups.
'Group membership' can be treated as a second categorical variable.
Effect of false claims in adverts
In a study to assess how false or misleading adverts affect consumers, one group of 100 experimental subjects was exposed to a series of adverts falsely claiming that a new brand of coffee contained 'no bitterness'. These subjects and another control group of 100 people who had not seen the adverts were given a sample of coffee that had been prepared to be intentionally bitter. The contingency table below shows whether the subjects reported the coffee as 'having bitterness'.
| | False advert | No advert | Total |
|---|---|---|---|
| Coffee described as bitter | 68 | 89 | 157 |
| Coffee described as not bitter | 32 | 11 | 43 |
| Total | 100 | 100 | 200 |
In this example, two different groups of people were used in the experiment. The column variable (distinguishing between the group that saw the false adverts and the group that saw none) is not a random variable, as it is controlled by the experimenter. A single categorical measurement (whether or not the subject described the coffee sample as bitter) was made from each person.
Comparing groups
Although the chi-squared test was motivated as a test of independence of two categorical variables, the same test can be used when each row (or column) of a contingency table corresponds to a separate group of individuals.
The χ² test statistic and p-value are calculated in exactly the same way as described earlier for testing independence.
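As an illustration, the calculation can be sketched for the Bitter Coffee table using only Python's standard library. The variable names are chosen for this example. The sketch assumes the usual Pearson statistic with no continuity correction, and it uses the fact that a 2×2 table has 1 degree of freedom, for which the upper tail probability of the chi-squared distribution can be written with the complementary error function.

```python
import math

# Observed counts from the Bitter Coffee table
# (rows: response category, columns: false advert / no advert).
observed = [
    [68, 89],  # coffee described as bitter
    [32, 11],  # coffee described as not bitter
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Count expected in each cell if the response proportions
# were the same in both groups: row total * column total / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson chi-squared statistic: sum of (observed - expected)^2 / expected.
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Assumption: 1 degree of freedom (2x2 table), where
# P(chi-squared > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"chi-squared = {chi2:.2f}, p-value = {p_value:.5f}")
```

The statistic is about 13.06 and the p-value is well below 0.001, so there is very strong evidence that the proportion describing the coffee as bitter differs between the two groups. For tables with more rows or columns, the tail probability needs the chi-squared distribution with the appropriate degrees of freedom, which this shortcut does not cover.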
Examples
In the following examples, we test whether the 'response' proportions are the same in several groups.
Note again that when we conclude that there is some difference between the groups, a visual comparison of the observed counts with those estimated from the margins assuming independence helps to explain the nature of that difference.
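A minimal sketch of such a comparison for the Bitter Coffee data is shown here as a printed listing rather than a picture. The dictionary layout and names are assumptions made for this illustration.

```python
# Observed counts from the Bitter Coffee table.
observed = {
    "bitter": {"False advert": 68, "No advert": 89},
    "not bitter": {"False advert": 32, "No advert": 11},
}
groups = ["False advert", "No advert"]

row_totals = {r: sum(row.values()) for r, row in observed.items()}
col_totals = {g: sum(row[g] for row in observed.values()) for g in groups}
n = sum(row_totals.values())

# Counts estimated from the margins, assuming the same
# response proportions in both groups.
expected = {
    r: {g: row_totals[r] * col_totals[g] / n for g in groups}
    for r in observed
}

# Observed minus expected shows where, and in which direction,
# each group departs from the common proportions.
diff = {
    r: {g: observed[r][g] - expected[r][g] for g in groups}
    for r in observed
}

for r in observed:
    for g in groups:
        print(
            f"{r:>10} | {g:<12} observed {observed[r][g]:>3}  "
            f"expected {expected[r][g]:5.1f}  difference {diff[r][g]:+5.1f}"
        )
```

Here the group that saw the false adverts described the coffee as bitter 10.5 times fewer than expected (68 against 78.5), while the control group did so 10.5 times more often than expected.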
Two groups and two categories
In the special case where there are two groups and the categorical measurement has two categories (that we will call 'success' and 'failure'), the chi-squared test is testing whether the probability of success is the same in both groups. For example, in the Bitter Coffee data set, we are testing whether the probability of reporting that the coffee was bitter is the same for the group who saw the false adverts and the group who did not.
This hypothesis can also be tested with a 2-sample test of equality of two proportions.
Fortunately, although the two tests have been motivated in different ways, it can be proved that:
The two-sided 2-sample test for equality of two proportions (with its standard error based on the pooled proportion) and the chi-squared test (without a continuity correction) both result in the same p-value and conclusion.
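The equivalence can be checked numerically for the Bitter Coffee data. The sketch below is an illustration rather than a proof, and its variable names are chosen for this example. It assumes a two-sided z test whose standard error uses the pooled proportion, and a chi-squared statistic with no continuity correction, computed with the standard shortcut formula for a 2×2 table.

```python
import math

# Numbers describing the coffee as bitter, and group sizes.
bitter_advert, n_advert = 68, 100    # saw the false adverts
bitter_control, n_control = 89, 100  # control group

# 2-sample z test for equality of two proportions (pooled standard error).
p1 = bitter_advert / n_advert
p2 = bitter_control / n_control
pooled = (bitter_advert + bitter_control) / (n_advert + n_control)
se = math.sqrt(pooled * (1 - pooled) * (1 / n_advert + 1 / n_control))
z = (p1 - p2) / se
p_value_z = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area

# Chi-squared statistic for the same 2x2 table, using the shortcut
# n * (ad - bc)^2 / (product of the four marginal totals).
a, b = bitter_advert, bitter_control
c, d = n_advert - bitter_advert, n_control - bitter_control
n = a + b + c + d
chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
p_value_chi2 = math.erfc(math.sqrt(chi2 / 2))  # 1 degree of freedom

print(f"z = {z:.3f}, z squared = {z * z:.3f}, chi-squared = {chi2:.3f}")
print(f"p-values: {p_value_z:.6f} (z test), {p_value_chi2:.6f} (chi-squared)")
```

The square of the z statistic equals the chi-squared statistic (about 13.06 here), which is why the two p-values agree.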