Counts with Poisson distributions
Suppose that we have \(k\) independent discrete random variables \(\{X_1, X_2,\dots, X_k\}\) that are counts of events. We will now consider whether they might be counts from Poisson processes in which the rate of events for \(X_i\) is \(\lambda_i\),
\[ X_i \;\;\sim\;\; \PoissonDistn(\lambda_i) \]
If this model holds then, from the properties of Poisson distributions,
\[ E[X_i] \;\;=\;\; \Var(X_i) \;\;=\;\; \lambda_i \]
Standardised versions of these counts
\[ Z_i \;\;=\;\; \frac{X_i - \lambda_i}{\sqrt{\lambda_i}} \]
will have mean zero and standard deviation one. If the \(\{\lambda_i\}\) are large,
\[ Z_i \;\;=\;\; \frac{X_i - \lambda_i}{\sqrt{\lambda_i}} \;\; \underset{\text{approx}}{\sim} \;\; \NormalDistn(0,1) \]
Chi-squared statistic
If the Poisson model holds,
\[ \sum_{i=1}^k {Z_i^2} \;\;=\;\; \sum_{i=1}^k {\frac{\left(X_i - \lambda_i\right)^2}{\lambda_i}} \;\; \underset{\text{approx}}{\sim} \;\; \ChiSqrDistn(k \text{ df}) \]
In the context of goodness-of-fit tests, this is often written as
\[ X^2 \;\;=\;\; \sum_{i=1}^k {\frac{\left(O_i - E_i\right)^2}{E_i}} \;\; \underset{\text{approx}}{\sim} \;\; \ChiSqrDistn(k \text{ df}) \]
where \(O_i = X_i\) is the observed count and \(E_i = \lambda_i\) is its expected value under the model. In practice, this approximation is reasonable provided most of the \(\{E_i\}\) (the Poisson means) are reasonably large. The usual guideline is that
If these guidelines are not met, the chi-squared distribution should not be used to find probabilities related to \(X^2\).
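The chi-squared approximation described above can be checked by simulation. The following is a minimal sketch rather than a definitive implementation: the function names, the four Poisson means, the seed and the number of simulations are all illustrative assumptions, not values from the text. It uses only the Python standard library, so the chi-squared tail probability is computed from a closed form that holds only when the degrees of freedom are even.

```python
import math
import random


def chi_squared_statistic(observed, expected):
    """Return the sum of (O_i - E_i)^2 / E_i over all counts."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def poisson_draw(lam, rng):
    """Draw one Poisson(lam) count as the number of arrivals of a
    unit-rate Poisson process in the interval [0, lam]."""
    count = 0
    total = rng.expovariate(1.0)  # time of the first arrival
    while total <= lam:
        count += 1
        total += rng.expovariate(1.0)  # add the next inter-arrival time
    return count


def chi2_upper_tail_even_df(x, df):
    """P(chi-squared with df degrees of freedom > x); valid for even df only."""
    if df % 2 != 0:
        raise ValueError("this closed form needs an even number of degrees of freedom")
    half = x / 2.0
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(df // 2))


# Hypothetical Poisson means (all fairly large) and simulation settings.
rates = [20.0, 35.0, 50.0, 80.0]
k = len(rates)
n_sims = 2000
rng = random.Random(1)  # fixed seed so that reruns give the same counts

stats = []
for _ in range(n_sims):
    counts = [poisson_draw(lam, rng) for lam in rates]
    stats.append(chi_squared_statistic(counts, rates))

mean_stat = sum(stats) / n_sims
critical = 9.488  # approximate upper 5% point of chi-squared with 4 df
tail_prop = sum(s > critical for s in stats) / n_sims

print(f"mean of X^2 over simulations: {mean_stat:.3f} (chi-squared mean is {k})")
print(f"simulated P(X^2 > {critical}): {tail_prop:.3f}")
print(f"chi-squared upper tail:      {chi2_upper_tail_even_df(critical, k):.3f}")
```

If the Poisson model holds and the means are large, the simulated mean of \(X^2\) should be close to \(k\) and the simulated tail proportion should be close to the chi-squared tail probability. Rerunning the sketch with much smaller means in `rates` is one way to see the approximation deteriorate.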