Tree diagrams for two attributes

Two categorical variables (or partitions of the sample space) are sometimes represented in a type of diagram called a tree diagram. In this diagram, we think of the measurements as being observed in sequence. The main branches of the tree correspond to the possible values for the first of these measurements. At the end of each of these branches, further branches are drawn for each possible value of the second measurement.

Gender and marital status

An adult can be either male or female. For either gender, there are four options for marital status (a second partition of the sample space):

The main branches are usually labelled with their marginal probabilities. The branches to the right are labelled with conditional probabilities, given the branch to the left that leads to that fork.

Evaluating joint probabilities

The main use of tree diagrams is to evaluate joint probabilities. To obtain any joint probability, multiply the probabilities down the corresponding branches. This corresponds to applying the general multiplication rule, P(A and B) = P(A) × P(B | A).

Gender and marital status

The probability of an adult being both male and divorced is the product of the probabilities going down these branches:

Bruising of apples

The contingency table below describes bruising of 96 apples in a packing plant. The apples were classified by the variety of apple (Granny Smith or Fuji) and whether or not they were bruised.

                OK    Bruised
Granny Smith    40    8
Fuji            24    24

We now consider choosing an apple randomly from the batch of 96 apples; it can be one of two types (Granny Smith or Fuji) and it can be bruised or not bruised (denoted below as B and B'). The tree diagram below describes the probabilities for the apple type and its bruising.

Note that

We do not have to represent the apple type first. If we chose to represent the bruising characteristic first, the tree diagram would be as follows:

The joint probabilities on the right are the same whichever ordering of the variables is used.
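As a check on this claim, the branch probabilities for both orderings can be computed directly from the contingency table above. The following is only a minimal sketch: the function name tree and its argument first are illustrative choices, not part of the original example.

```python
from fractions import Fraction

# Counts copied from the contingency table of 96 apples.
counts = {
    ("Granny Smith", "OK"): 40,
    ("Granny Smith", "Bruised"): 8,
    ("Fuji", "OK"): 24,
    ("Fuji", "Bruised"): 24,
}
total = sum(counts.values())


def tree(first):
    """Return {(variety, condition): (marginal, conditional, joint)}.

    first = 0 puts the variety on the main branches,
    first = 1 puts the bruising characteristic there.
    (Hypothetical helper, written for this illustration only.)
    """
    branches = {}
    for key, n in counts.items():
        # Apples that share this cell's value of the first variable.
        main_total = sum(m for k, m in counts.items() if k[first] == key[first])
        marginal = Fraction(main_total, total)   # main-branch probability
        conditional = Fraction(n, main_total)    # right-hand branch probability
        branches[key] = (marginal, conditional, marginal * conditional)
    return branches


variety_first = tree(0)
bruising_first = tree(1)

for key in counts:
    m1, c1, j1 = variety_first[key]
    m2, c2, j2 = bruising_first[key]
    print(f"{key}: {m1} x {c1} = {j1}   and   {m2} x {c2} = {j2}")
```

For a bruised Granny Smith apple, for instance, the variety-first tree multiplies 1/2 by 1/6 and the bruising-first tree multiplies 1/3 by 1/4; both give 1/12, which is the table count 8 divided by 96.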

Sampling without replacement

In the apple-bruising example, there was little to be gained by using a tree diagram, since the joint probabilities could have been obtained more easily from the original contingency table.

Tree diagrams are more useful for problems in which there is a natural ordering of the two measurements.

An important example of this is when two or more items are selected from a finite population without replacement, that is, when the same item cannot be selected more than once.

Sampling two students from a class

We now consider the selection of two students from a class of two women and four men to give a presentation on their weekend reading on the history of probability. The fairest way to select two students from the class is by "pulling names out of a hat". The following tree diagram illustrates this process.

Note that the probability of the first student being male is P(M) = 4/6, since four of the six students are male. However, the conditional probability of the second student being male, given that the first is male, is reduced to 3/5, since only five students remain, of whom three are male.

This diagram allows us to determine the probability that one student of each gender is selected.

P(one man and one woman) = P(MW) + P(WM)
                         = (4/6 × 2/5) + (2/6 × 4/5)
                         = 8/30 + 8/30
                         = 8/15
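The same calculation can be sketched in code. The example below assumes only the class composition given above (two women and four men). The function name draw_two is an illustrative choice, and the cross-check by listing every ordered pair is an addition that is not part of the tree-diagram method itself.

```python
from fractions import Fraction
from itertools import permutations

# Class composition from the example: 2 women (W) and 4 men (M).
students = ["W"] * 2 + ["M"] * 4


def draw_two(pool):
    """Joint probability of each ordered pair of genders when two
    students are drawn without replacement (hypothetical helper)."""
    joint = {}
    for first in sorted(set(pool)):
        # Main branch: marginal probability of the first selection.
        p_first = Fraction(pool.count(first), len(pool))
        remaining = list(pool)
        remaining.remove(first)  # the selected student cannot be chosen again
        for second in sorted(set(remaining)):
            # Right-hand branch: conditional probability given the first selection.
            p_second = Fraction(remaining.count(second), len(remaining))
            joint[first + second] = p_first * p_second
    return joint


joint = draw_two(students)
p_mixed = joint["MW"] + joint["WM"]

# Cross-check: list all 6 x 5 = 30 equally likely ordered pairs of students.
pairs = list(permutations(range(len(students)), 2))
mixed = sum(1 for i, j in pairs if students[i] != students[j])
check = Fraction(mixed, len(pairs))

for outcome, p in joint.items():
    print(f"P({outcome}) = {p}")
print(f"P(one man and one woman) = {p_mixed} (by enumeration: {check})")
```

The four joint probabilities add to 1, and both the branch products and the direct enumeration give 8/15 for one student of each gender.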