Conditional and marginal distributions
Another important distinction is between the marginal distribution of a variable and its conditional distributions given the categories of another variable. The following example illustrates the difference.
Bruising of apples
The contingency table below describes bruising of 96 apples in a packing plant. The apples were classified by the variety of apple (Granny Smith or Fuji) and whether or not they were bruised. (The data are not real.)
Variety | OK | Bruised |
---|---|---|
Granny Smith | 40 | 8 |
Fuji | 24 | 24 |
The diagram below shows the apples, arranged in rows by variety.
Click on any group of apples to read off the marginal proportion of that type of apple and its conditional proportion of bruising. Observe the notation
P(Bruised | Fuji)
for the conditional proportion of bruising given Fuji.
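The proportions that the diagram reports can also be worked out directly from the table. The sketch below is only an illustration and is not part of CAST; the function names `marginal_variety` and `conditional_bruising` are made up for this example. It uses exact fractions so that no rounding is involved.

```python
from fractions import Fraction

# Counts copied from the contingency table above (the data are not real).
counts = {
    "Granny Smith": {"OK": 40, "Bruised": 8},
    "Fuji": {"OK": 24, "Bruised": 24},
}

# Total number of apples in the table.
total = sum(sum(row.values()) for row in counts.values())


def marginal_variety(variety):
    """Marginal proportion of a variety: its row total over the grand total."""
    return Fraction(sum(counts[variety].values()), total)


def conditional_bruising(status, variety):
    """Conditional proportion P(status | variety): a cell over its row total."""
    row = counts[variety]
    return Fraction(row[status], sum(row.values()))


for variety in counts:
    print(
        variety,
        "marginal:", marginal_variety(variety),
        "P(Bruised | variety):", conditional_bruising("Bruised", variety),
    )
```

Each variety makes up half of the apples, but the conditional proportion of bruising is 1/6 for Granny Smith and 1/2 for Fuji.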
Choose Group by Bruising from the pop-up menu to rearrange the apples according to whether or not they are bruised. The rearranged diagram shows the marginal proportions for bruising and the conditional proportions for variety, given bruising. Observe that
Observe also that
Proportional Venn diagrams
The diagrams above are closely related to stacked bar charts, where the widths of the bars are given by the marginal proportions. This type of diagram is called a proportional Venn diagram.
Note that the area of each rectangle is proportional to the joint frequency of that pair of categories. (It is determined by the number of apples in it!)
Although proportional Venn diagrams do not help greatly in understanding this section of CAST, they will be useful for explaining various concepts in later sections.
Click the checkbox Hide Icons in the diagram above. Depending on whether the apples have been grouped by bruising or by variety, the diagram will be similar to stacked bar charts of the other variable.
Change the grouping variable and observe that the four areas remain the same, because they are determined by the four joint frequencies.
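The reason the areas do not change can be checked numerically. In the sketch below, each rectangle's width is the marginal proportion of its group and its height is the conditional proportion within that group, so its area is their product. The function `areas` and its `group_by` argument are hypothetical names chosen for this illustration, not anything defined by CAST.

```python
from fractions import Fraction

# Joint frequencies from the contingency table, keyed by (variety, bruising).
joint = {
    ("Granny Smith", "OK"): 40,
    ("Granny Smith", "Bruised"): 8,
    ("Fuji", "OK"): 24,
    ("Fuji", "Bruised"): 24,
}
total = sum(joint.values())


def areas(group_by):
    """Area of each rectangle as width times height.

    group_by = 0 groups the apples by variety, group_by = 1 by bruising.
    Width is the marginal proportion of the group; height is the
    conditional proportion of the cell within that group.
    """
    result = {}
    for pair, n in joint.items():
        group_total = sum(
            m for other, m in joint.items() if other[group_by] == pair[group_by]
        )
        width = Fraction(group_total, total)   # marginal proportion
        height = Fraction(n, group_total)      # conditional proportion
        result[pair] = width * height
    return result


by_variety = areas(0)
by_bruising = areas(1)

for pair in joint:
    print(pair, by_variety[pair], by_bruising[pair])
```

Under either grouping, each area equals the joint proportion for that cell; for example, bruised Fuji apples give 1/2 × 1/2 when grouped by variety and 1/3 × 3/4 when grouped by bruising, both equal to 24/96.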