Splitting into groups
If the two variables can be treated as a response and an explanatory variable, it is useful to split the data into 'groups' using the explanatory variable, and compare the distributions of the response within the different groups. These are also called the conditional distributions of the response at each value of the explanatory variable.
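Even if the two variables cannot be classified as a response and an explanatory variable, similar methods can be used. If the variables are called X and Y, we can either split the data into groups using X and compare the distributions of Y within the groups, or split the data into groups using Y and compare the distributions of X within the groups.
These are called the conditional distributions of Y given X and the conditional distributions of X given Y, respectively. Proportions within the groups are used to make comparisons easier.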
In the context of a contingency table, the conditional proportions are found by dividing each frequency in the table by its row (or column) total. This scales each row (or column) of the table to sum to 1.0.
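The division described above can be sketched in a few lines of Python. This is only an illustration: the function name conditional_proportions and the small table of counts are hypothetical and are not the university data. It assumes that every row and column total is greater than zero.

```python
def conditional_proportions(table, by="row"):
    """Scale each row (or column) of a table of counts to sum to 1.0.

    Assumes every row and column total is greater than zero.
    """
    if by == "row":
        # Divide each count by the total of its own row.
        return [[count / sum(row) for count in row] for row in table]
    if by == "column":
        # zip(*table) transposes the table so that columns can be summed.
        col_totals = [sum(col) for col in zip(*table)]
        return [[count / total for count, total in zip(row, col_totals)]
                for row in table]
    raise ValueError("by must be 'row' or 'column'")


# Hypothetical counts, for illustration only (not the university data).
# Rows are the two categories of X; columns are the three categories of Y.
counts = [
    [30, 10, 10],   # row total 50
    [20, 60, 20],   # row total 100
]

given_row = conditional_proportions(counts, by="row")      # Y given X
given_col = conditional_proportions(counts, by="column")   # X given Y

for row in given_row:
    print([round(p, 2) for p in row])
```

With these made-up counts, the first row becomes 0.6, 0.2, 0.2 and the second becomes 0.2, 0.6, 0.2, so the two groups can be compared directly even though their totals (50 and 100) differ.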
Rank and age in a university
The following contingency table again shows the rank and age of all academic staff in a university in the USA.
Select Proportion from the pop-up menu to see the conditional distributions for each Age group. In effect, this scales the frequencies in each row of the contingency table to add to 1.0. Click on the row for Under 30 to see how the conditional proportions are obtained by dividing the joint frequencies by the marginal frequency for Under 30.
Now choose Rank from the pop-up menu on the right to see the conditional distributions for each Rank. Click on columns to see how these conditional proportions are obtained from the joint frequencies.
Graphical displays of conditional distributions
The conditional distributions can be shown graphically on a 3-dimensional bar chart, but a clustered 2-dimensional display is usually easier to interpret. Note, however, that several different types of clustered display can be drawn, each making it easier to compare a different aspect of the distributions.
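As a rough sketch of how the same table gives different clustered displays, the Python below groups the cells into clusters by either variable, optionally rescales each cluster to total 1.0, and draws a crude text bar chart. The helper names (cluster, scale_clusters, text_chart), the labels and the counts are all hypothetical, chosen only for illustration. The code assumes that every cluster has a nonzero total.

```python
def cluster(table, row_labels, col_labels, by="row"):
    """Group the cells of a table into clusters of bars.

    by="row":    one cluster per row, one bar per column.
    by="column": one cluster per column, one bar per row.
    """
    if by == "row":
        return {r: dict(zip(col_labels, row))
                for r, row in zip(row_labels, table)}
    if by == "column":
        return {c: dict(zip(row_labels, col))
                for c, col in zip(col_labels, zip(*table))}
    raise ValueError("by must be 'row' or 'column'")


def scale_clusters(clusters):
    """Rescale the bars in each cluster so that they total 1.0."""
    scaled = {}
    for name, bars in clusters.items():
        total = sum(bars.values())  # assumed to be nonzero
        scaled[name] = {bar: value / total for bar, value in bars.items()}
    return scaled


def text_chart(clusters, width=40):
    """Draw each cluster as a block of '#' bars; the longest bar has `width` marks."""
    biggest = max(v for bars in clusters.values() for v in bars.values())
    lines = []
    for name, bars in clusters.items():
        lines.append(name)
        for bar, value in bars.items():
            lines.append(f"  {bar:<6} {'#' * round(width * value / biggest)}")
    return "\n".join(lines)


# Hypothetical counts, for illustration only (not the university data).
row_labels = ["X1", "X2"]
col_labels = ["Y1", "Y2", "Y3"]
counts = [
    [30, 10, 10],
    [20, 60, 20],
]

by_x = cluster(counts, row_labels, col_labels, by="row")
by_y = cluster(counts, row_labels, col_labels, by="column")

print(text_chart(by_x))                   # raw counts, clustered by X
print(text_chart(scale_clusters(by_y)))   # proportions within each Y category
```

Clustering by X and scaling displays the conditional distributions of Y given X. Clustering by Y and scaling displays those of X given Y.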
Rank and age
The clustered bar chart below initially shows the joint frequencies for all combinations of age and rank.
First select Rank from the pop-up menu under the bar chart to cluster the bars by rank. The total number of instructors is small, so it is difficult to compare the ages of instructors with those of the other ranks. Select Propn within Rank from the pop-up menu at the top to display the conditional distributions of age within rank. This effectively scales each rank's bars to give the same total (1.0).
It is now easy to see that the age distributions of assistant professors and instructors are very similar, but both are different from those of associate and full professors.
Select Frequency and Age from the two menus to show the raw counts, clustered by age. Select Propn within Age to display the conditional distributions of the ranks of staff who are in each age group.
This diagram emphasises the spike in assistant professors for the youngest staff, and the increasing proportion of associate and full professors as staff get older.