Bernoulli trials until first success

We will now describe a second family of distributions that is related to a sequence of Bernoulli trials.

Definition

In a sequence of independent Bernoulli trials with \(P(\text{success}) = \pi\) in each trial, the number of trials up to and including the first success has a distribution called a geometric distribution.

\[ X \;\; \sim \; \; \GeomDistn(\pi) \]
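The definition can be illustrated by simulation. The following is a minimal sketch, not part of the original text: the function name `trials_until_first_success`, the value \(\pi = 0.25\), the seed and the number of repetitions are all assumptions chosen for illustration.

```python
import random


def trials_until_first_success(pi, rng):
    """Count Bernoulli trials up to and including the first success.

    pi is the success probability in each trial (hypothetical helper,
    written only to illustrate the definition above).
    """
    if not 0 < pi <= 1:
        raise ValueError("pi must satisfy 0 < pi <= 1")
    count = 1
    # rng.random() is uniform on [0, 1), so a trial fails with probability 1 - pi
    while rng.random() >= pi:
        count += 1
    return count


rng = random.Random(0)  # fixed seed so that the run is repeatable
draws = [trials_until_first_success(0.25, rng) for _ in range(10_000)]

# Every simulated value is at least 1, and roughly a quarter of them equal 1
print("smallest value:", min(draws))
print("proportion equal to 1:", sum(1 for d in draws if d == 1) / len(draws))
```

Each element of `draws` is one simulated value of \(X\). The proportion of values equal to 1 should be close to \(\pi\), since the first trial is a success with probability \(\pi\).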

The probability function of a geometric random variable is relatively simple.

Probability function

If a random variable has a geometric distribution, \(X \sim \GeomDistn(\pi) \), then its probability function is

\[ p(x) = \pi (1-\pi)^{x-1} \quad \quad \text{for } x = 1, 2, \dots \]

If the first success arises on the \(x\)th trial, the first \(x\) outcomes must be \(x - 1\) failures followed by a success:

\[ \overbrace {FF\dots F}^{x - 1 \text{ failures}} S \]

Since the trials are independent, and each failure arises with probability \( (1 - \pi) \) while the final success arises with probability \(\pi\), the probability of this sequence is

\[ p(x) \;\;=\;\; \overbrace {(1-\pi)(1-\pi)\dots (1-\pi)}^{x - 1 \text{ times}} \times \pi \;\;=\;\; \pi(1-\pi)^{x-1} \]
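
As a concrete illustration (the value \(\pi = 0.2\) is an assumption chosen only for this example), the probability that the first success occurs on the third trial is the probability of the sequence \(FFS\),

\[ p(3) \;\;=\;\; (1-0.2)(1-0.2) \times 0.2 \;\;=\;\; 0.2 \times 0.8^{2} \;\;=\;\; 0.128 \]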

The lowest possible value for \(x\) is 1, but there is no upper limit for \(x\). Extremely high values may be unlikely but are still possible, provided \(\pi < 1\).
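
The probability function is straightforward to evaluate numerically. The sketch below is an illustration only: the function name `geometric_pmf` and the value \(\pi = 0.2\) are assumptions, not part of the original text.

```python
def geometric_pmf(x, pi):
    """Return p(x) = pi * (1 - pi)**(x - 1) for x = 1, 2, ...

    Values of x below 1 are impossible, so they are given probability 0.
    """
    if x < 1:
        return 0.0
    return pi * (1 - pi) ** (x - 1)


pi = 0.2  # illustrative success probability
for x in range(1, 6):
    print(x, geometric_pmf(x, pi))

# Large values of x are unlikely but still have positive probability
print(50, geometric_pmf(50, pi))
```

The last line shows that even \(x = 50\) has a positive, though very small, probability when \(\pi = 0.2\).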

Although strictly unnecessary because of how it was derived, we now demonstrate that \(p(x)\) satisfies the two properties that are required of a valid probability function. To do so, we will use a standard mathematical result about the sum of a geometric series.

Sum of geometric series

If \(-1 < a < 1\), then

\[ \sum_{x=0}^\infty {a^x} = \frac 1 {1-a} \]

To see this, note that the series converges when \(-1 < a < 1\), so its sum can be written as

\[ \begin{align} \sum_{x=0}^\infty {a^x} & = 1 + a + a^2 + a^3 + \dots \\ & = 1 + a(1 + a + a^2 + \dots) \\ & = 1 + a \times \sum_{x=0}^\infty {a^x} \end{align} \]

Rearranging this equation,

\[ (1 - a)\sum_{x=0}^\infty {a^x} = 1 \]

so

\[ \sum_{x=0}^\infty {a^x} = \frac 1 {1-a} \]
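
The result can be checked numerically by adding up a finite number of terms. This is a minimal sketch under assumed values: the helper name `partial_sum` and the choice \(a = 0.8\) are illustrative only.

```python
def partial_sum(a, n_terms):
    """Sum of a**x for x = 0, 1, ..., n_terms - 1."""
    return sum(a ** x for x in range(n_terms))


a = 0.8  # illustrative value with -1 < a < 1
limit = 1 / (1 - a)

# The partial sums approach 1 / (1 - a) as more terms are included
for n_terms in (5, 20, 100):
    print(n_terms, partial_sum(a, n_terms), limit)
```

The partial sums get closer to \(1/(1-a)\) as the number of terms grows, which is what the formula asserts for the infinite sum.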

We now show that the geometric probability function that was obtained above does have the required properties of a probability function.

Negative probabilities are impossible

Since \(0 \le \pi \le 1 \), it must also be true that \(0 \le (1-\pi) \le 1 \). Therefore

\[ p(x) = \pi (1-\pi)^{x-1} \ge 0 \quad \quad \text{for all } x \]

The probabilities sum to one

\[ \begin{align} \sum_{x=1}^\infty {p(x)} &= \sum_{x=1}^\infty {\pi (1-\pi)^{x-1}} \\ & = \pi \sum_{x=1}^\infty {(1-\pi)^{x-1}} \\ & = \pi \sum_{x=0}^\infty {(1-\pi)^x} \\ & = \pi \times \frac 1 {1 - (1-\pi)} \\ & = \pi \times \frac 1 \pi = 1 \end{align} \]

using the formula for the sum of a geometric series with \(a = (1 - \pi)\), which applies provided \(\pi > 0\), since then \(0 \le a < 1\).
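
Both properties can also be checked numerically. The sketch below is illustrative: the names `geometric_pmf` and `total_probability`, the values of \(\pi\) and the cut-off of 500 terms are all assumptions, and a finite sum can only approximate the infinite one.

```python
def geometric_pmf(x, pi):
    """Return p(x) = pi * (1 - pi)**(x - 1) for x = 1, 2, ..."""
    return pi * (1 - pi) ** (x - 1)


def total_probability(pi, n_terms):
    """Sum of p(x) over x = 1, ..., n_terms (a finite approximation)."""
    return sum(geometric_pmf(x, pi) for x in range(1, n_terms + 1))


for pi in (0.05, 0.3, 0.9):  # illustrative parameter values
    probs = [geometric_pmf(x, pi) for x in range(1, 501)]
    print(pi, min(probs) >= 0, total_probability(pi, 500))
```

For each value of \(\pi\), no probability is negative and the total over the first 500 values of \(x\) is very close to one. The remaining probability belongs to values of \(x\) above 500.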

Geometric bar charts

The diagram below shows some possible shapes of geometric distributions. Use the slider to see the effect of changing the parameter \(\pi\).

The height of each bar is \( (1-\pi) \) times that of the previous bar, so when \(\pi\) is close to zero, the probabilities decrease slowly from their maximum at \( p(1) \).

When \(\pi\) is small, all individual probabilities are small, so the bars are all low. Click the checkbox Show zero-one axis to turn it off, rescaling the probabilities to show the shape of the distributions better.
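
The shapes described above can be reproduced with a rough text-based bar chart. This is only a sketch: the helper `bar_heights`, the three values of \(\pi\), the number of bars and the scaling of 50 characters per unit of probability are assumptions made for illustration.

```python
def bar_heights(pi, n_bars):
    """Return p(1), ..., p(n_bars) for a geometric distribution."""
    return [pi * (1 - pi) ** (x - 1) for x in range(1, n_bars + 1)]


for pi in (0.1, 0.5, 0.9):  # illustrative parameter values
    heights = bar_heights(pi, 8)
    print(f"pi = {pi}")
    for x, h in enumerate(heights, start=1):
        # One '#' per 0.02 of probability, on a common zero-one scale
        print(f"  x = {x}: {'#' * round(h * 50)} {h:.4f}")
    # Each bar is (1 - pi) times the height of the previous one
    print("  ratio of successive bars:", heights[1] / heights[0])
```

With \(\pi = 0.1\) the bars are all low and shrink slowly, whereas with \(\pi = 0.9\) almost all of the probability sits at \(x = 1\).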