Categorical data from several groups
Useful information can sometimes be obtained by examining a single categorical distribution with a bar or pie chart. However, more interesting questions can usually be asked when the data are obtained from several groups.
Such questions involve comparing a categorical distribution (cancer type, grade, infestation, ...) across different groups (races, student type, pesticide, ...).
Contingency tables
Assuming again that the order in which the values were recorded is unimportant, the categorical data in each group can be expressed as a frequency table. Combining these frequency tables into a single rectangular array gives a contingency table.
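The construction can be sketched in a few lines of Python. This is only a minimal illustration: the group names and values below are invented for the purpose and are not data from any survey, and the layout (categories as rows, groups as columns) is one possible convention.

```python
from collections import Counter

# Hypothetical values, invented for illustration only.
groups = {
    "Group A": ["Old", "New", "New", "Traditional"],
    "Group B": ["New", "New", "Old"],
}
categories = ["Old", "Traditional", "New"]

# One frequency table per group; a Counter reports 0 for absent categories.
freq = {name: Counter(values) for name, values in groups.items()}

# Combine the frequency tables: one row per category, one column per group.
table = [[freq[name][cat] for name in groups] for cat in categories]

# Print the contingency table with group names as column headings.
print("".ljust(12) + "".join(name.rjust(10) for name in groups))
for cat, row in zip(categories, table):
    print(cat.ljust(12) + "".join(str(n).rjust(10) for n in row))
```

Each column of the printed array is simply the frequency table of one group, so the column totals equal the group sizes.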
Rice survey
As part of a survey of rice producers in Sri Lanka, 36 farmers were randomly selected from 4 villages. Each sampled farmer was asked about the variety of rice that he used and the varieties were categorised into 'Old', 'Traditional' or 'New'. The 36 resulting categorical values are grouped by village on the left of the diagram below.
Click on all the values from Sabey to build up the frequencies in the first column of the contingency table. Repeat with the values from the other villages to complete the table.
The data are not always presented as separate lists of values from each group. The groups may equivalently be defined by a categorical variable in the original data matrix. Each 'individual' again contributes a count of 1 to a single cell of the contingency table.
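The same counting can be sketched when the groups are defined by a column of the data matrix. The rows below are hypothetical values made up for illustration (not the survey data), and the names `rows`, `villages` and `varieties` are assumptions of this sketch.

```python
# Hypothetical data matrix: one (village, variety) row per individual.
rows = [
    ("Village A", "Old"),
    ("Village B", "New"),
    ("Village A", "New"),
    ("Village B", "Old"),
    ("Village A", "New"),
]
villages = ["Village A", "Village B"]
varieties = ["Old", "Traditional", "New"]

# Start with an empty table: one cell per (variety, village) combination.
counts = {var: {vil: 0 for vil in villages} for var in varieties}

# Each individual adds 1 to exactly one cell.
for village, variety in rows:
    counts[variety][village] += 1

for var in varieties:
    print(var.ljust(12), [counts[var][vil] for vil in villages])
```

Because every row increments exactly one cell, the cell counts always sum to the number of individuals, whatever order the rows are processed in.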
Rice survey
The diagram below shows the full rice survey data with a categorical variable 'village' defining the groups.
Click on each row in turn to add 1 to the appropriate cells of the contingency table. (The resulting contingency table is identical to the one earlier in this page.)