The means and variances of the individual variables in a multinomial distribution follow directly from their marginal binomial distributions.

Means and variances

If \((X_1, \dots, X_g)\) have a \(\MultinomDistn(n, \pi_1, \dots, \pi_g)\) distribution,

\[ E[X_i] \;=\; n\pi_i \spaced{and} \Var(X_i) \;=\; n\pi_i(1 - \pi_i) \qquad \text{for }i=1,\dots,g \]
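
As a quick numerical illustration, with values chosen here purely for the example (they do not come from the text above), suppose \(g = 3\), \(n = 10\) and \((\pi_1, \pi_2, \pi_3) = (0.2, 0.3, 0.5)\). The formulae then give

\[ \begin{align} E[X_1] \;&=\; 10 \times 0.2 \;=\; 2 & \Var(X_1) \;&=\; 10 \times 0.2 \times 0.8 \;=\; 1.6 \\ E[X_2] \;&=\; 10 \times 0.3 \;=\; 3 & \Var(X_2) \;&=\; 10 \times 0.3 \times 0.7 \;=\; 2.1 \\ E[X_3] \;&=\; 10 \times 0.5 \;=\; 5 & \Var(X_3) \;&=\; 10 \times 0.5 \times 0.5 \;=\; 2.5 \end{align} \]

The three means sum to \(n = 10\), as they must since \(X_1 + X_2 + X_3 = n\).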

The covariance between any two variables in a multinomial distribution is a little harder to obtain.

Covariances

If \((X_1, \dots, X_g)\) have a \(\MultinomDistn(n, \pi_1, \dots, \pi_g)\) distribution,

\[ \Covar(X_i, X_j) \;=\; -n\pi_i\pi_j \qquad \text{if }i \ne j \]

The random variable \(Y = X_i + X_j\) is the number of values that are in either category \(i\) or category \(j\). Treating membership of either of these two categories as a "success", \(Y\) has a \(\BinomDistn(n, \pi_i + \pi_j)\) distribution, so

\[ \Var(X_i + X_j) \;=\; n(\pi_i + \pi_j)(1 - \pi_i - \pi_j) \]

We will now find this variance in a different way. Using the earlier result about the variance of the sum of two random variables,

\[ \begin{align} \Var(X_i + X_j) \;&=\; \Var(X_i) + \Var(X_j) + 2\Covar(X_i, X_j) \\ &=\; n\pi_i(1 - \pi_i) + n\pi_j(1 - \pi_j) + 2\Covar(X_i, X_j) \end{align} \]

Equating these two formulae for the variance,

\[ n(\pi_i + \pi_j)(1 - \pi_i - \pi_j) \;=\; n\pi_i(1 - \pi_i) + n\pi_j(1 - \pi_j) + 2\Covar(X_i, X_j) \]

Rearranging this gives

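\[ \begin{align} 2\Covar(X_i, X_j) \;&=\; n(\pi_i + \pi_j)(1 - \pi_i - \pi_j) - n\pi_i(1 - \pi_i) - n\pi_j(1 - \pi_j) \\ &=\; n\left[(\pi_i + \pi_j) - (\pi_i + \pi_j)^2 - \pi_i + \pi_i^2 - \pi_j + \pi_j^2\right] \\ &=\; -2n\pi_i\pi_j \end{align} \]

Dividing both sides by two gives the stated covariance.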
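
The two routes to \(\Var(X_i + X_j)\) can be checked numerically. The values below are assumed only for illustration: \(n = 10\), \(\pi_1 = 0.2\) and \(\pi_2 = 0.3\), so that \(\Covar(X_1, X_2) = -10 \times 0.2 \times 0.3 = -0.6\). Treating \(X_1 + X_2\) as binomial,

\[ \Var(X_1 + X_2) \;=\; 10 \times (0.2 + 0.3) \times (1 - 0.2 - 0.3) \;=\; 2.5 \]

and from the variance of a sum,

\[ \begin{align} \Var(X_1 + X_2) \;&=\; \Var(X_1) + \Var(X_2) + 2\Covar(X_1, X_2) \\ &=\; 1.6 + 2.1 + 2 \times (-0.6) \;=\; 2.5 \end{align} \]

The two values agree, as the derivation requires.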

The correlation between any two of the multinomial variables can be easily found from their covariance.

Correlation coefficients

If \((X_1, \dots, X_g)\) have a \(\MultinomDistn(n, \pi_1, \dots, \pi_g)\) distribution,

\[ \Corr(X_i, X_j) \;=\; -\sqrt{\frac{\pi_i\pi_j}{(1 - \pi_i)(1 - \pi_j)}} \]

This follows from the definition of the correlation coefficient, using the variances and covariance given above.

\[ \begin{align} \Corr(X_i, X_j) \;&=\; \frac{\Covar(X_i, X_j)}{\sqrt{\Var(X_i) \Var(X_j)}} \\[0.4em] &=\; \frac{-n\pi_i\pi_j}{\sqrt{n\pi_i(1 - \pi_i) \times n\pi_j(1 - \pi_j)}} \\[0.4em] &=\; -\sqrt{\frac{\pi_i\pi_j}{(1 - \pi_i)(1 - \pi_j)}} \end{align} \]

Note that the correlation between two multinomial variables does not depend on the sample size, \(n\), because the factors of \(n\) in the covariance and in the two standard deviations cancel. Increasing \(n\) therefore does not weaken their correlation.
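
A small worked example makes this concrete. The values are assumed only for illustration: \(\pi_1 = 0.2\) and \(\pi_2 = 0.3\). The formula gives

\[ \Corr(X_1, X_2) \;=\; -\sqrt{\frac{0.2 \times 0.3}{0.8 \times 0.7}} \;=\; -\sqrt{\frac{0.06}{0.56}} \;\approx\; -0.327 \]

Computing the same correlation directly from the covariance and variances for two different sample sizes gives the same value:

\[ \begin{align} n = 10: \quad \Corr(X_1, X_2) \;&=\; \frac{-0.6}{\sqrt{1.6 \times 2.1}} \;\approx\; -0.327 \\[0.4em] n = 1000: \quad \Corr(X_1, X_2) \;&=\; \frac{-60}{\sqrt{160 \times 210}} \;\approx\; -0.327 \end{align} \]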