Generalising Bernoulli trials
The binomial distribution arose from a collection of \(n\) independent trials, each of which had two possible values that we called success and failure. Since there are only two possibilities, we need only consider the number of successes, \(X\); the number of failures, \(n - X\), is then completely determined by \(X\). The random variable \(X\) has a \(\BinomDistn(n, \pi)\) distribution, where \(\pi\) is the probability of success and \((1-\pi)\) is the probability of failure.
We now extend this to situations in which each of the \(n\) trials may have more than two possibilities.
Definition
If the following conditions hold:
- there is a fixed number, \(n\), of trials,
- each trial has \(g\) possible outcomes, \(O_1, O_2, \dots, O_g\),
- the probability of outcome \(O_i\) is the same value, \(\pi_i\), in every trial, for \(i = 1, 2, \dots, g\), and
- the results of the trials are independent of each other,
then the total numbers of occurrences of the different outcomes, \((X_1, X_2,\dots, X_g)\), have a multinomial distribution with parameters \(n, \pi_1, \dots, \text{ and }\pi_g\),
\[ (X_1, X_2,\dots, X_g) \;\; \sim \;\; \MultinomDistn(n, \pi_1, \dots, \pi_g) \]Note here that
\[ \sum_{i=1}^g X_i \;=\; n \spaced{and} \sum_{i=1}^g {\pi_i} \;=\; 1 \]When \(g = 2\), this is simply a binomial experiment and \(X_1\) and \(X_2\) are the numbers of successes and failures. This is therefore essentially a univariate situation.
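The \(g = 2\) reduction can be checked numerically. The following Python sketch (variable names are mine) confirms that the multinomial probability of \((x, n-x)\) equals the binomial probability of \(x\) successes:

```python
from math import comb, factorial

# With g = 2 outcomes, the multinomial probability of the counts (x, n - x)
# should equal the binomial probability of x successes in n trials.
n, pi = 6, 0.3
for x in range(n + 1):
    multi = (factorial(n) // (factorial(x) * factorial(n - x))
             * pi**x * (1 - pi)**(n - x))
    binom = comb(n, x) * pi**x * (1 - pi)**(n - x)
    assert abs(multi - binom) < 1e-12
```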
When there are \(g = 3\) possible outcomes, the situation is essentially bivariate since we can simply examine the joint distribution of \(X_1\) and \(X_2\); the value of \(X_3 = n - X_1 - X_2\) is completely determined by them.
Joint probability function
If \((X_1, X_2,\dots, X_g)\) have a \(\MultinomDistn(n, \pi_1, \dots, \pi_g)\) distribution, then their joint probability function is
\[ p(x_1, x_2, \dots, x_g) = \frac{n!}{x_1!\;x_2!\; \cdots\;x_g!} \pi_1^{x_1}\pi_2^{x_2}\cdots \pi_g^{x_g} \]provided
\[ x_i=0, 1, \dots, n, \quad\text{for all }i \spaced{and}\quad \sum_{i=1}^g {x_i} = n \]but is zero for other values of the \(\{x_i\}\).
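This probability function can be evaluated directly from factorials. The following Python sketch (the function name is mine) computes it and checks that the probabilities sum to one over all vectors of counts adding to \(n\), for a small example with \(g = 3\):

```python
from math import factorial, prod

def multinomial_pmf(xs, pis):
    """p(x_1, ..., x_g) = n! / (x_1! x_2! ... x_g!) * pi_1^x_1 * ... * pi_g^x_g."""
    n = sum(xs)
    coef = factorial(n)
    for x in xs:
        coef //= factorial(x)
    return coef * prod(pi**x for pi, x in zip(pis, xs))

# The probabilities over all (x1, x2, x3) with x1 + x2 + x3 = n sum to one.
n, pis = 4, (0.2, 0.3, 0.5)
total = sum(multinomial_pmf((x1, x2, n - x1 - x2), pis)
            for x1 in range(n + 1) for x2 in range(n + 1 - x1))
assert abs(total - 1.0) < 1e-12
```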
Since the trials are independent, any specific sequence of \(n\) outcomes containing \(x_1\) values of type \(O_1\), ..., and \(x_g\) values of type \(O_g\) has probability
\[ \pi_1^{x_1}\pi_2^{x_2} \cdots \pi_g^{x_g} \](For example, this is the probability that the first \(x_1\) values are of type \(O_1\), the next \(x_2\) values are of type \(O_2\), and so on.)
We now need to find how many distinct sequences of this type would result in \(x_1\) values of type \(O_1\), ..., and \(x_g\) values of type \(O_g\).
The \(x_1\) values of type \(O_1\) can be placed in the sequence in \({n \choose x_1} = \frac{n!}{x_1!(n-x_1)!}\) different ways.
For each of these, the \(x_2\) values of type \(O_2\) can be placed in the remaining \((n-x_1)\) positions in the sequence in \({n-x_1 \choose x_2} = \frac{(n-x_1)!}{x_2!(n-x_1-x_2)!}\) different ways. The total number of ways of placing the values of type \(O_1\) and \(O_2\) in the sequence is therefore
\[ {n \choose x_1}{n-x_1 \choose x_2} \;\;= \frac{n!}{x_1!x_2!(n-x_1-x_2)!} \]Similar logic shows that the total number of ways of placing the values of type \(O_1\), \(O_2\) and \(O_3\) in the sequence is
\[ {n \choose x_1}{n-x_1 \choose x_2}{n-x_1-x_2 \choose x_3} \;\;= \frac{n!}{x_1!x_2!x_3!(n-x_1-x_2-x_3)!} \]Carrying on in the same way shows that the total number of sequences containing \(x_1\) values of type \(O_1\), ..., and \(x_g\) values of type \(O_g\) is
\[ \frac{n!}{x_1!\;x_2!\; \cdots\;x_g!} \]The total probability of getting \(x_1\) values of type \(O_1\), ..., and \(x_g\) values of type \(O_g\) is the product of the probability of a specific sequence of this type times the number of possible sequences of this type, giving the joint probability function.
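The counting argument can be verified by brute force for a small case. This Python sketch enumerates every possible sequence of \(n\) trial outcomes and counts those with a given composition (the example values are mine):

```python
from itertools import product
from math import factorial

n, counts = 4, (2, 1, 1)  # x_1 = 2 of O_1, x_2 = 1 of O_2, x_3 = 1 of O_3

# Enumerate all g^n sequences of outcomes and count those containing
# exactly the target numbers of each outcome type.
outcomes = range(len(counts))
matching = sum(
    1
    for seq in product(outcomes, repeat=n)
    if all(seq.count(o) == c for o, c in zip(outcomes, counts))
)

# The count should equal the multinomial coefficient n! / (x_1! x_2! x_3!).
coef = factorial(n)
for c in counts:
    coef //= factorial(c)
assert matching == coef == 12
```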
We now give a numerical example.
Opinion poll
Consider a public opinion poll in which people are asked for their opinion about a new piece of legislation. There are three possible responses, and each selected individual has the following probabilities for them:
P(Agree) = 0.3,
P(Neutral) = 0.4,
P(Disagree) = 0.3.
If \(n\) individuals are randomly chosen and their responses are independent, the numbers giving the three responses will have a \(\MultinomDistn(n, 0.3, 0.4, 0.3)\) distribution. Note that this is really a bivariate situation since the number disagreeing can be determined from the numbers who agree or are neutral. (This is similar to the way that we only consider the distribution of the successes in a binomial situation.)
The joint probability function can be written as
\[ p(x_1, x_2, x_3) = \frac{n!}{x_1!\;x_2!\; (n-x_1-x_2)!} {0.3}^{x_1}{0.4}^{x_2}{0.3}^{n-x_1-x_2} \]The diagram below shows these probabilities in a 3-dimensional bar chart for different values of \(n\).
When the sample size is 1, there are only three possible values for the numbers agreeing and neutral, \(X_1\) and \(X_2\). These are the probabilities that the single value is "Agree", "Neutral" or "Disagree", corresponding to \((x_1=1, x_2=0)\), \((x_1=0, x_2=1)\) and \((x_1=0, x_2=0)\).
Note that \(X_1\) and \(X_2\) are not independent. Knowing that \(X_1=1\) person agrees tells us that \(X_2\) must be zero since this is a sample of size one.
As the sample size increases, the joint distribution spreads over a larger set of \((x_1, x_2)\) pairs, but \(X_1\) and \(X_2\) remain dependent since
\[ X_1 + X_2 \;\;\le\;\; n \]
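This dependence can be checked numerically with the poll's probabilities. The Python sketch below (the helper name is mine) builds the joint distribution of \(X_1\) and \(X_2\) for \(n = 5\) and exhibits a pair of values that is impossible jointly even though each value is possible on its own:

```python
from math import factorial

def poll_pmf(x1, x2, n, p=(0.3, 0.4, 0.3)):
    """Joint probability of x1 'Agree' and x2 'Neutral' responses out of n."""
    x3 = n - x1 - x2
    if x3 < 0:
        return 0.0
    coef = factorial(n) // (factorial(x1) * factorial(x2) * factorial(x3))
    return coef * p[0]**x1 * p[1]**x2 * p[2]**x3

n = 5

# Marginal distributions of X1 and X2 from the joint distribution.
p_x1 = [sum(poll_pmf(x1, x2, n) for x2 in range(n + 1)) for x1 in range(n + 1)]
p_x2 = [sum(poll_pmf(x1, x2, n) for x1 in range(n + 1)) for x2 in range(n + 1)]

# If X1 and X2 were independent, p(x1, x2) would equal p(x1) * p(x2).
# But p(3, 3) = 0 (since 3 + 3 > n) while both marginals are positive.
assert poll_pmf(3, 3, n) == 0.0
assert p_x1[3] > 0 and p_x2[3] > 0
```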