Probabilities in a sub-population

Complete population
The joint probabilities p_{xy} and the marginal probabilities p_x and p_y all describe proportions in the complete population of (x, y) pairs.
Sub-population
In contrast, it is sometimes meaningful to restrict attention to a subset of the (x, y) pairs. For example, we may be interested only in pairs for which the first variable, X, has some particular value. Probabilities that relate to a sub-population are called conditional probabilities.

The concept of a conditional probability is similar to that of the conditional proportions described earlier for bivariate categorical data sets.

Conditional probabilities for Y, given X = x

Consider again hair colour (Y) and eye colour (X) in a population of teenagers. The probability of a teenager being blonde, conditional on blue eyes, is the proportion of blondes within the sub-population with blue eyes. The conditional probability is most easily understood as a ratio of population numbers: the number with both blonde hair and blue eyes, divided by the number with blue eyes.

However, if the population is infinite, such counts are not available, so it is better to express the conditional probability as the ratio of a joint probability to a marginal probability. For finite populations, this definition is equivalent to the ratio of counts.
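
The equivalence for a finite population can be checked numerically. The following Python sketch uses invented counts for a hypothetical population of 200 teenagers (they are not data from this page), and the names counts, n_blue and so on are chosen only for illustration.

```python
from fractions import Fraction

# Hypothetical counts of teenagers, keyed by (eye colour, hair colour).
# These numbers are invented for illustration only.
counts = {
    ("blue", "blonde"): 30,
    ("blue", "other"): 20,
    ("other", "blonde"): 10,
    ("other", "other"): 140,
}

n_total = sum(counts.values())

# Number with blue eyes, and number with both blue eyes and blonde hair
n_blue = sum(n for (eye, _), n in counts.items() if eye == "blue")
n_blue_blonde = counts[("blue", "blonde")]

# Conditional probability as a ratio of population counts
from_counts = Fraction(n_blue_blonde, n_blue)

# The same quantity as joint probability divided by marginal probability
p_joint = Fraction(n_blue_blonde, n_total)
p_blue = Fraction(n_blue, n_total)
from_probs = p_joint / p_blue

print(from_counts, from_probs)  # 3/5 3/5
```

Exact fractions are used here so that the two routes can be compared without rounding error; the population size cancels, which is why the two definitions agree.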

The general definition of the conditional probabilities for Y, given that the value of X is x, is

p_{y|x} = P(Y = y | X = x) = p_{xy} / p_x

Conditional probabilities as a rescaling of joint probabilities

The conditional probabilities for Y, given X = x, can therefore be found by rescaling the corresponding row of the table of joint probabilities (dividing each entry by p_x) so that the row sums to 1.0, as shown in the diagram below.
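
As a rough illustration of this rescaling, the sketch below divides one row of a small joint probability table by its row total. The table values, the labels x1, x2, y1, y2 and the function name conditional_given_x are all hypothetical, chosen only to show the arithmetic.

```python
# Hypothetical joint probabilities p_{xy}: one row per value of X,
# one column per value of Y. Values are invented and sum to 1.
joint = {
    "x1": {"y1": 0.10, "y2": 0.30},
    "x2": {"y1": 0.45, "y2": 0.15},
}

def conditional_given_x(joint, x):
    """Rescale the row for X = x so that it sums to 1."""
    row = joint[x]
    p_x = sum(row.values())  # marginal probability p_x (the row total)
    return {y: p_xy / p_x for y, p_xy in row.items()}

cond = conditional_given_x(joint, "x1")
print(cond)
print(sum(cond.values()))
```

For the row x1 the marginal probability is 0.40, so the conditional probabilities are roughly 0.25 and 0.75, up to floating-point rounding.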

Two sets of conditional probabilities

Note that there is an equivalent formula for the conditional probabilities for X given the value of Y, which corresponds to using the other variable to define the sub-population. When we restrict attention to population values for which Y has the value y, the conditional probabilities for X are

p_{x|y} = P(X = x | Y = y) = p_{xy} / p_y

You should be careful to distinguish between p_{x|y} and p_{y|x}.

The probability of being pregnant, given that a randomly selected person is female, would be fairly small. The probability of being female, given that a person is pregnant, is 1.0.
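
The difference between the two sets of conditional probabilities can be seen by computing both from the same joint probability. The sketch below uses invented joint probabilities for sex and pregnancy status (not real population figures), and the helper name marginal is hypothetical.

```python
# Hypothetical joint probabilities for (sex, pregnancy status).
# The values are invented for illustration and sum to 1.
joint = {
    ("female", "pregnant"): 0.02,
    ("female", "not pregnant"): 0.49,
    ("male", "pregnant"): 0.00,
    ("male", "not pregnant"): 0.49,
}

def marginal(joint, index, value):
    """Sum joint probabilities over pairs whose component `index` equals `value`."""
    return sum(p for pair, p in joint.items() if pair[index] == value)

cell = joint[("female", "pregnant")]

# Condition on the first variable (sex = female)
p_pregnant_given_female = cell / marginal(joint, 0, "female")

# Condition on the second variable (status = pregnant)
p_female_given_pregnant = cell / marginal(joint, 1, "pregnant")

print(round(p_pregnant_given_female, 3), p_female_given_pregnant)
```

With these made-up values the first conditional probability is about 0.039 and the second is 1.0, even though both are computed from the same joint probability of 0.02. Only the marginal probability used as the divisor differs.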


Type and site of melanoma

We use the melanoma example to illustrate conditional probabilities. The diagram below again shows the joint probabilities in a 3-dimensional bar chart.

Click the formula for the conditional probabilities of 'Y' (tumour type) given 'X' (site). The bars for each site are separately scaled up to add to 1.0. Observe that

Click the formula for joint probabilities, then the formula for conditional probabilities of 'X' given 'Y'. This time the joint probabilities are separately scaled for each tumour type. This shows the distribution of locations for each type of tumour. This display is less informative, but notice that