Notation

We now generalise the telepathy example on the previous page. Consider an infinite categorical population that contains a proportion π of some category that we will call 'success'. We call the other values in the population 'failures'.

In the telepathy example, a correct guess might be called a 'success' and a wrong guess would be a 'failure'. The probability of success is π = 0.333.

The labels 'success' and 'failure' provide terminology that can describe a wide range of data sets. For example,

Data set                           'Success'   'Failure'
Sex of a sample of fish            female      male
Quality of export apples           good        bruised
Effect of insecticide on beetle    dead        alive

When a random sample of n values is selected from such a population, we denote the number of successes by x and the proportion of successes by p = x/n.

Distribution of a proportion from a simple random sample

The number of successes, x, has a 'standard' discrete distribution called a binomial distribution, which has two parameters, n and π. In practical applications, n is a known constant, but π may be unknown. The sample proportion, p, has a distribution with the same shape, but its values are those of x divided by n, so they range from 0 to 1 rather than from 0 to n.

With appropriate choice of the parameters n and π, the binomial distribution can describe the distribution of any proportion from a random sample.
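The page does not state the binomial probabilities explicitly. As a minimal sketch, assuming the standard binomial formula P(X = x) = C(n, x) π^x (1 − π)^(n − x), the distributions of x and p can be tabulated together. The function names `binomial_pmf` and `proportion_distribution` are illustrative choices, and the values n = 5 and π = 0.333 are borrowed from the telepathy example.

```python
from math import comb


def binomial_pmf(x, n, pi):
    """P(X = x) for a binomial count with parameters n and pi (standard formula)."""
    return comb(n, x) * pi**x * (1 - pi) ** (n - x)


def proportion_distribution(n, pi):
    """Map each possible sample proportion p = x/n to its probability."""
    return {x / n: binomial_pmf(x, n, pi) for x in range(n + 1)}


# Illustrative values taken from the telepathy example.
n, pi = 5, 0.333
for x in range(n + 1):
    # x and p = x/n share the same probability; only the scale differs.
    print(f"x = {x}   p = {x / n:.1f}   probability = {binomial_pmf(x, n, pi):.4f}")
```

Each row lists one value of x, the matching proportion p, and their common probability, which is what it means for p to have the same shape of distribution on a different scale.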

Shape of the binomial distribution

The diagram below shows some possible shapes of the binomial distribution. The barchart has dual axes and therefore shows the distributions of both x and p.

Drag the sliders to adjust the two parameters of the binomial distribution, and observe how the shape of the distribution changes.

The diagram can be used to obtain binomial probabilities by setting π and n to the appropriate values, then clicking on one of the bars in the barchart.
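Without the interactive diagram, the same exploration can be approximated in code. The following is a rough sketch, not the page's own implementation: it prints a text barchart for a few parameter pairs, labelling each bar with both x and p in place of the dual axes. The helper name `text_barchart` and the parameter values are arbitrary choices made for illustration.

```python
from math import comb


def binomial_pmf(x, n, pi):
    """P(X = x) for a binomial count with parameters n and pi (standard formula)."""
    return comb(n, x) * pi**x * (1 - pi) ** (n - x)


def text_barchart(n, pi, width=50):
    """Return one line per value of x; bar length is proportional to P(X = x)."""
    probs = [binomial_pmf(x, n, pi) for x in range(n + 1)]
    scale = width / max(probs)  # the tallest bar gets `width` characters
    lines = []
    for x, prob in enumerate(probs):
        bar = "#" * round(prob * scale)
        lines.append(f"x={x:3d}  p={x / n:.2f}  {bar} {prob:.3f}")
    return "\n".join(lines)


# Arbitrary parameter pairs standing in for different slider positions.
for n, pi in [(10, 0.1), (10, 0.5), (40, 0.1)]:
    print(f"n = {n}, pi = {pi}")
    print(text_barchart(n, pi))
    print()
```

Changing the pairs in the final loop plays the role of dragging the sliders.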

Telepathy experiment

For example, to find the probability of a subject correctly guessing 4 out of 5 cards in the telepathy example, set π = 0.33 and n = 5, then click on the bar for x = 4. The probability is shown under the barchart.
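The same probability can be checked by direct calculation. This is a minimal sketch assuming the standard binomial formula; it evaluates the probability for the slider setting π = 0.33 and, for comparison, for π = 0.333 as quoted earlier for the telepathy example. The function name `binomial_pmf` is an illustrative choice.

```python
from math import comb


def binomial_pmf(x, n, pi):
    """P(X = x) for a binomial count with parameters n and pi (standard formula)."""
    return comb(n, x) * pi**x * (1 - pi) ** (n - x)


# Probability of exactly 4 correct guesses out of 5.
prob_slider = binomial_pmf(4, 5, 0.33)    # pi as set on the slider
prob_quoted = binomial_pmf(4, 5, 0.333)   # pi as quoted for the telepathy example

print(f"P(X = 4) with pi = 0.33:  {prob_slider:.4f}")
print(f"P(X = 4) with pi = 0.333: {prob_quoted:.4f}")
```

Both settings give a probability of roughly 0.04, so the rounding of π makes little practical difference here.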


The diagram below demonstrates that a binomial distribution does indeed describe sample-to-sample variability. The pink barchart at the bottom of the diagram shows the binomial distribution with parameters n = 20 and π = 0.333, which describes the distribution of the sample proportion of correct guesses out of 20 guesses.

Click Accumulate and take several samples. Observe that, as more samples accumulate, the distribution of the observed values of x comes to match the theoretical binomial distribution. Repeat the exercise with different sample sizes.
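The accumulation exercise can also be imitated with a small simulation. The sketch below is only an approximation of what the diagram does: it assumes each guess is an independent success with probability π = 0.333, draws many samples of n = 20 guesses, and compares the observed relative frequencies of x with the binomial probabilities. The function names, the number of samples, and the random seed are all illustrative choices.

```python
import random
from collections import Counter
from math import comb


def binomial_pmf(x, n, pi):
    """P(X = x) for a binomial count with parameters n and pi (standard formula)."""
    return comb(n, x) * pi**x * (1 - pi) ** (n - x)


def simulate_counts(n, pi, num_samples, seed=0):
    """Draw num_samples samples of n guesses; return the relative frequency of each x."""
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    counts = Counter(
        sum(rng.random() < pi for _ in range(n)) for _ in range(num_samples)
    )
    return [counts[x] / num_samples for x in range(n + 1)]


n, pi, num_samples = 20, 0.333, 20_000
observed = simulate_counts(n, pi, num_samples)
for x, freq in enumerate(observed):
    theory = binomial_pmf(x, n, pi)
    print(f"x={x:2d}  p={x / n:.2f}  observed={freq:.4f}  binomial={theory:.4f}")
```

Changing `n` corresponds to repeating the exercise with a different sample size, and a smaller `num_samples` shows how rough the agreement is when only a few samples have been taken.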