Categorical data from several groups
Useful information can sometimes be obtained by examining a single categorical distribution with a bar chart or pie chart. However, more interesting questions can usually be asked when the data come from several groups.
Such questions all involve comparing a categorical distribution (cancer type, grade, infestation, ...) across different groups (races, student type, pesticide, ...).
Contingency tables
Assuming again that the order in which the values were recorded is unimportant, the categorical data in each group can be summarised as a frequency table. Combining these frequency tables side by side into a single rectangular array, with one column per group, gives a contingency table.
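The construction described above can be sketched in a few lines of Python. This is only an illustration: the values and the group labels other than BBS are invented for the example and are not the survey data, and the names `groups`, `freq` and `table` are arbitrary.

```python
from collections import Counter

# Hypothetical values for illustration only (NOT the survey data).
# Each list holds the categorical values recorded in one group.
groups = {
    "BBS": ["2nd", "1st", "2nd", "3rd", "2nd"],
    "BA": ["2nd", "2nd", "3rd", "1st"],
    "BSc": ["1st", "2nd", "1st", "2nd", "3rd", "2nd"],
}
categories = ["1st", "2nd", "3rd"]

# Step 1: one frequency table per group (the order of the values is ignored).
freq = {g: Counter(values) for g, values in groups.items()}

# Step 2: place the frequency tables side by side.
# Rows are categories, columns are groups.
table = {c: {g: freq[g][c] for g in groups} for c in categories}

# Print the contingency table as a rectangular array.
print("class".ljust(6) + "".join(g.rjust(5) for g in groups))
for c in categories:
    print(c.ljust(6) + "".join(str(table[c][g]).rjust(5) for g in groups))
```

Each column of the printed array is simply the frequency table of one group, so the column totals equal the group sizes.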
Student degrees
As part of a survey of students graduating at a university, 36 students were randomly selected from four degree programmes. For each graduating student, the class of degree was recorded (1st, 2nd or 3rd class). The 36 resulting categorical values are grouped by the type of degree on the left of the diagram below.
Click on all the values for the students getting BBS degrees to build up the frequencies in the first column of the contingency table. Repeat with the values from the other degrees to complete the table.
The data may not be presented as separate lists of values from each group. The groups may equivalently be defined by a categorical variable in the original data matrix. Each 'individual' again contributes a count of 1 to a single cell of the contingency table.
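When the groups are defined by a variable in the data matrix, the same table can be built by visiting each row once and adding 1 to the matching cell. The sketch below assumes a small made-up data matrix (the rows, the group labels other than BBS, and the column order are hypothetical, not the survey data).

```python
# Hypothetical data matrix for illustration only (NOT the survey data).
# Each row is one student: (degree, class of degree, courses failed, loan in $000).
rows = [
    ("BBS", "2nd", 0, 12),
    ("BA", "1st", 0, 8),
    ("BBS", "3rd", 2, 20),
    ("BSc", "2nd", 1, 15),
    ("BA", "2nd", 0, 10),
    ("BSc", "1st", 0, 5),
    ("BBS", "2nd", 1, 18),
]
categories = ["1st", "2nd", "3rd"]
groups = ["BBS", "BA", "BSc"]

# Start with an empty table: rows are categories, columns are groups.
table = {c: {g: 0 for g in groups} for c in categories}

# Each individual contributes a count of 1 to exactly one cell.
# The other variables (failed courses, loan) play no part in the table.
for degree, degree_class, _fail, _loan in rows:
    table[degree_class][degree] += 1

for c in categories:
    print(c, [table[c][g] for g in groups])
```

Because every row adds exactly 1 to one cell, the cell counts always sum to the number of individuals.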
Student degrees
The diagram below shows the student survey data with a categorical variable 'degree' defining the groups. (The variable Fail gives the number of courses failed by each student before graduating, and the variable Loan gives the accumulated student loan at graduation, in $000.)
Click on each row in turn to add 1 to the appropriate cell of the contingency table. (The resulting contingency table is identical to the one earlier on this page.)