Family of binomial distributions

We have talked about "the" binomial distribution, but there is really a family of binomial distributions corresponding to different values of the two parameters \(n\) and \(\pi\). We now use bar charts to show the shapes of some binomial distributions.

Binomial bar charts

The diagram below shows some possible shapes of the binomial distribution.

Use the slider on the left to increase the number of trials, \(n\), and observe that both the mean and the variance increase.
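For example, since the mean and variance of \(X\) are \(\mu_X = n\pi\) and \(\sigma_X^2 = n\pi(1-\pi)\) (the same formulas that appear in the normal approximation below), a quick calculation with \(\pi = 0.3\) shows both growing with \(n\):

\[ n = 20: \;\; \mu_X = 6, \;\; \sigma_X^2 = 4.2 \qquad\qquad n = 80: \;\; \mu_X = 24, \;\; \sigma_X^2 = 16.8 \]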

Adjust the probability of success, \(\pi\), and observe that the distribution is symmetric when \(\pi = 0.5\), but is skewed with a long tail to the right when \(\pi\) is close to zero and skewed with a long tail to the left when \(\pi\) is close to one.

When the number of trials is large, the probabilities for all individual x-values are small. Click Show zero-one axis to turn it off and rescale the bars of the bar chart so that they fill more of the diagram, making the shape of the distribution clearer.
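If the interactive diagram is unavailable, the same shapes can be explored with a short script. The sketch below is plain Python using math.comb; the parameter values are illustrative choices, not taken from the diagram. It prints each binomial probability with a crude text bar so the shape is visible.

from math import comb

def binom_pmf(x, n, pi):
    # P(X = x) for a binomial distribution with n trials and success probability pi
    return comb(n, x) * pi ** x * (1 - pi) ** (n - x)

# Illustrative parameter combinations
for n, pi in [(10, 0.5), (10, 0.1), (10, 0.9), (50, 0.1)]:
    print(f"n = {n}, pi = {pi}")
    for x in range(n + 1):
        p = binom_pmf(x, n, pi)
        if p >= 0.001:                      # skip negligible probabilities
            print(f"  x = {x:2d}   P(X = x) = {p:.3f}  " + "*" * round(100 * p))
    print()

With \(n = 10\) and \(\pi = 0.1\) the bars pile up near zero with a long tail to the right, while \(\pi = 0.9\) gives the mirror image, matching the description above.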

Bar chart for the proportion of successes

In the following bar chart, the main horizontal axis is for the proportion of successes, \(p\), rather than the count, \(x\). (The diagram actually has dual axes so it also displays the x-values.)

Again increase the sample size, \(n\), and note that the distribution of the sample proportion, \(P\), remains centred on \(\pi\) for all sample sizes, but that its spread decreases as the sample size increases.

Now adjust the probability of success, \(\pi\), and again observe how the shape of the distribution changes. The distribution has its greatest spread when \(\pi = 0.5\) and smaller spread when \(\pi\) is close to zero or one.
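As a worked illustration of both effects, the standard deviation of \(P\) is \(\sigma_P = \sqrt{\pi(1-\pi)/n}\) (from the variance formula in the normal approximation below):

\[ \begin{aligned} \pi = 0.5: &\;\; \sigma_P = 0.100 \text{ when } n = 25, \quad \sigma_P = 0.050 \text{ when } n = 100 \\ \pi = 0.1: &\;\; \sigma_P = 0.060 \text{ when } n = 25, \quad \sigma_P = 0.030 \text{ when } n = 100 \end{aligned} \]

Quadrupling the sample size halves the standard deviation, and for any \(n\) the spread is greatest at \(\pi = 0.5\).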

(This diagram can also be used to read off binomial probabilities by clicking on bars.)

From the above bar charts, you should have observed the following properties of the binomial distribution.

Mode

The binomial distribution has a single peak; its mode (the most likely number of successes) is close to the mean, \(n\pi\).

Effect of \(\pi\) on the shape of the distribution

The distribution is symmetric when \(\pi = 0.5\), skewed with a long tail to the right when \(\pi\) is close to zero, and skewed with a long tail to the left when \(\pi\) is close to one.

Effect of \(n\) on the shape of the distribution

As \(n\) increases, the distribution of \(X\) becomes more spread out, the distribution of the proportion \(P\) becomes less spread out, and both become more symmetric in shape.

Normal approximation

A binomial random variable, \(X\), is the sum of \(n\) independent Bernoulli values, and the proportion of successes, \(P\), is the mean of these \(n\) Bernoulli values. Since each Bernoulli value has mean \(\pi\) and variance \(\pi(1-\pi)\), the means and variances below follow by adding over the \(n\) independent trials and, for \(P\), dividing by \(n\). The Central Limit Theorem can therefore be applied to the distributions of both \(X\) and \(P\):

\[ \begin{aligned} X &\;\; \xrightarrow[n \rightarrow \infty]{} \;\; \NormalDistn\left(\mu_X = n\pi, \;\; \sigma_X^2 = n\pi(1-\pi) \right) \\ P &\;\; \xrightarrow[n \rightarrow \infty]{} \;\; \NormalDistn\left(\mu_P = \pi, \;\; \sigma_P^2 = \frac{\pi(1-\pi)}{n} \right) \end{aligned} \]
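As a quick numerical check of these limits, the sketch below (plain Python; the values \(n = 30\), \(\pi = 0.4\) and the cut-off \(x \le 12\) are arbitrary illustrative choices, and the 0.5 added to the cut-off is the usual continuity correction) compares an exact binomial probability with its normal approximation.

from math import comb, sqrt, erf

def binom_cdf(k, n, pi):
    # Exact P(X <= k) for a binomial distribution with n trials, success probability pi
    return sum(comb(n, x) * pi ** x * (1 - pi) ** (n - x) for x in range(k + 1))

def normal_cdf(z):
    # Standard normal cumulative probability
    return 0.5 * (1 + erf(z / sqrt(2)))

n, pi, k = 30, 0.4, 12              # illustrative values
mu = n * pi                         # mean of X
sigma = sqrt(n * pi * (1 - pi))     # standard deviation of X

exact = binom_cdf(k, n, pi)
approx = normal_cdf((k + 0.5 - mu) / sigma)   # normal approximation with continuity correction

print(f"exact   P(X <= {k}) = {exact:.4f}")
print(f"normal  P(X <= {k}) = {approx:.4f}")

The two values agree closely here, and the agreement improves as \(n\) increases, which is what the limits above assert.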