Direct evaluation of cumulative probabilities

As with the binomial distribution, there is no simple formula for the negative binomial distribution's cumulative distribution function, \(F(x)\).

It is possible to simply evaluate negative binomial probabilities and add them,

\[ F(x) = \sum_{u=k}^{\lfloor x \rfloor} {{u-1} \choose {k-1}} \pi^k(1-\pi)^{u-k} \]

where \(\lfloor x \rfloor \) denotes the largest integer less than or equal to \(x\).
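The sum above can be evaluated directly in a few lines of code. The following is a minimal Python sketch, not part of the original text; the function name `negbin_cdf_direct` and its argument names are illustrative assumptions, with `pi` standing for the probability of success \(\pi\).

```python
from math import comb, floor

def negbin_cdf_direct(x, k, pi):
    """F(x) = P(X <= x), where X is the trial on which the k-th success occurs.

    Hypothetical helper: adds the negative binomial probabilities
    for u = k, k+1, ..., floor(x).
    """
    upper = floor(x)
    # The range is empty when floor(x) < k, so F(x) = 0 in that case.
    return sum(
        comb(u - 1, k - 1) * pi**k * (1 - pi)**(u - k)
        for u in range(k, upper + 1)
    )
```

The number of terms grows with \(x\), so this approach becomes tedious by hand when \(x\) is large.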

Cumulative probabilities from binomial distribution

There is, however, a simpler method if \(k\) is much smaller than \(x\). For a whole number \(x\), it is based on the fact that taking more than \(x\) trials to get the \(k\)'th success is equivalent to there being fewer than \(k\) successes in the first \(x\) trials. The probability of this event can be found by adding \(k\) binomial probabilities.

\[ \begin{align} P(X \gt x) & = P(\text{fewer than } k \text{ successes in first } x \text{ trials}) \\ & = \sum_{v=0}^{k-1} {x \choose v} \pi^v (1-\pi)^{x-v} \end{align}\]

The cumulative probability is \(F(x) = 1 - P(X \gt x)\).
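As a rough illustration of this shortcut, the sketch below adds the \(k\) binomial probabilities and subtracts the total from one. It is a minimal Python example rather than anything from the original text; the name `negbin_cdf_via_binomial` is hypothetical, and it assumes \(x\) is a non-negative whole number.

```python
from math import comb

def negbin_cdf_via_binomial(x, k, pi):
    """F(x) = 1 - P(fewer than k successes in the first x trials).

    Hypothetical helper; assumes x is a non-negative whole number.
    """
    # P(X > x): add the k binomial probabilities for v = 0, 1, ..., k-1.
    # comb(x, v) is 0 when v > x, so values of x below k give F(x) = 0.
    prob_more_than_x = sum(
        comb(x, v) * pi**v * (1 - pi)**(x - v)
        for v in range(k)
    )
    return 1 - prob_more_than_x
```

Only \(k\) terms are needed whatever the value of \(x\), which is why this form is preferable when \(k\) is much smaller than \(x\).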

Cumulative probabilities in Excel

These formulae can be avoided if Excel is used to find cumulative probabilities. Again the function "NEGBINOM.DIST()" is used. To find the probability that the \(k\)'th success in a series of Bernoulli trials occurs on or before the \(x\)'th trial, \(F(x)\), the function takes four parameters:

  1. The number of failures, \(x-k\)
  2. The number of successes, \(k\)
  3. The probability of success, \(\pi\)
  4. true (to request a cumulative probability rather than the probability of a single value)

Example

If a fair six-sided die is rolled repeatedly, what is the probability that it will take more than 20 rolls before three sixes are observed?

If \(X\) denotes the roll on which the third six appears,

\[ X \;\; \sim \; \; \NegBinDistn(3, \frac 1 6) \]

The answer is therefore

\[ P(X > 20) = 1 - F(20) \]

This can be found by typing into an Excel spreadsheet cell

= 1 - NEGBINOM.DIST(17, 3, 1/6, true)

To find the probability on a calculator, we would note that this event corresponds to 2 or fewer sixes in the first 20 rolls and this can be evaluated using the distribution of the number of sixes in the first 20 rolls, \(Y \sim \BinomDistn(20, \frac 1 6) \).

\[ \begin{align} P(X > 20) & = P(Y \le 2) = p_Y(0) + p_Y(1) + p_Y(2) \\ & = {{20} \choose 0} \left(\frac 1 6\right)^0 \left(\frac 5 6\right)^{20} + {{20} \choose 1} \left(\frac 1 6\right)^1 \left(\frac 5 6\right)^{19} + {{20} \choose 2} \left(\frac 1 6\right)^2 \left(\frac 5 6\right)^{18} \\ & = \left(\frac 5 6\right)^{20} + 20 \times \left(\frac 1 6\right) \left(\frac 5 6\right)^{19} + 190 \times \left(\frac 1 6\right)^2 \left(\frac 5 6\right)^{18} \\[0.5em] & = 0.3287 \end{align} \]
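The arithmetic in this example can be checked numerically. The short Python sketch below is an illustration only; the variable names are assumptions. It evaluates the three binomial terms separately and compares their total with one minus the directly summed negative binomial probabilities for rolls 3 to 20.

```python
from math import comb

pi = 1 / 6   # probability of a six on one roll
k = 3        # number of sixes required
x = 20       # number of rolls considered

# Binomial route: P(Y <= 2) for Y = number of sixes in the first 20 rolls.
terms = [comb(x, v) * pi**v * (1 - pi)**(x - v) for v in range(k)]
p_binomial = sum(terms)

# Direct route: 1 - F(20), adding negative binomial probabilities for u = 3..20.
f_20 = sum(comb(u - 1, k - 1) * pi**k * (1 - pi)**(u - k) for u in range(k, x + 1))
p_direct = 1 - f_20

print(round(p_binomial, 4))  # 0.3287
print(round(p_direct, 4))    # 0.3287
```

Both routes give the same probability, with three terms in the binomial sum against eighteen in the direct sum.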