We now describe two examples in which there is no exact formula for the standard error of the estimator.
Example: Geometric random sample
If \(\{x_1, x_2, \dots, x_n\}\) is a random sample from a geometric distribution with parameter \(\pi\), find a large-sample 90% confidence interval for the parameter \(\pi\).
The geometric probability function is
\[ p(x\;|\;\pi) = \pi(1-\pi)^{x-1} \quad\quad \text{for } x=1, 2, \dots \]The log-likelihood is
\[ \ell(\pi) \;\; = \;\; \sum_{i=1}^n {\log p(x_i\;|\;\pi)} \;\; = \;\; n \log(\pi) + (\sum x - n) \log(1 - \pi)\]and its first derivative is
\[ \ell'(\pi) \;\; = \;\; \frac n {\pi} - \frac {\sum x - n} {1 - \pi} \]The maximum likelihood estimator was shown earlier to be
\[ \hat{\pi} \;\; = \;\; \frac n {\sum x} \;\; = \;\; \frac 1 {\overline x}\]Although this estimator is biased, maximum likelihood estimators are asymptotically unbiased, so the bias will be small in large samples.
In a similar way, there is no simple formula for the standard deviation of \(\hat{\pi}\), but we have an asymptotic formula that gives an approximate value for large samples. The second derivative of the log-likelihood is
\[ \ell''(\pi) \;\; = \;\; -\frac n {\pi^2} -\frac {\sum x - n} {(1 - \pi)^2} \]and replacing \(\pi\) by \(\hat{\pi}\), using \(\sum x = n/\hat{\pi}\) to simplify, gives
\[ \ell''(\hat{\pi}) \;\; = \;\; -\frac n {\hat{\pi}^2(1-\hat{\pi})} \]The asymptotic value for the standard error is therefore
\[ \se(\hat {\pi}) \;\;\approx\;\; \sqrt {- \frac 1 {\ell''(\hat {\pi})}} \;\;=\;\; \sqrt {\frac {{\hat{\pi}}^2(1-\hat{\pi})} n} \]Since, in large samples, the distribution of the estimator is also close to normal, an approximate 90% confidence interval for \(\pi\) is
\[ \begin{align} \hat {\pi} \;\pm\; 1.645 \times \se(\hat {\pi}) \;\;&=\;\; \frac 1 {\overline x} \;\pm\; 1.645 \times \sqrt {\frac {{\hat{\pi}}^2(1-\hat{\pi})} n} \\ &=\;\; \frac 1 {\overline x} \;\pm\; 1.645 \times \sqrt{\frac {\overline x - 1} {n {\overline x}^3}} \end{align} \]In the next example, the Newton-Raphson algorithm is used to obtain the maximum likelihood estimate and its standard error.
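Before turning to that example, here is a minimal Python sketch that checks the geometric-sample formulas above; the sample `x` is hypothetical, chosen only for illustration.

```python
import math

# Hypothetical sample, assumed to come from a geometric distribution
x = [2, 1, 4, 1, 3, 1, 2, 5, 1, 2]
n = len(x)
xbar = sum(x) / n

pi_hat = 1 / xbar                             # MLE: pi-hat = 1 / x-bar
se = math.sqrt(pi_hat**2 * (1 - pi_hat) / n)  # asymptotic standard error
z = 1.645                                     # 90% normal quantile
print(f"pi_hat = {pi_hat:.4f}, se = {se:.4f}")
print(f"90% CI: {pi_hat - z*se:.4f} to {pi_hat + z*se:.4f}")
```

The same interval can be computed directly from the second form above, \(1/\overline x \;\pm\; 1.645\sqrt{(\overline x - 1)/(n\,\overline x^3)}\).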
Example: Log-series distribution
The following data set is assumed to arise from a log-series distribution with probability function
\[ p(x) \;=\; \frac {-1} {\log(1-\theta)} \times \frac {\theta^x} x \quad\quad \text{for } x=1, 2, \dots \]
3 | 5 | 1 | 4 | 8 | 10 | 2 | 1 | 1 | 2 |
1 | 8 | 1 | 6 | 13 | 1 | 6 | 2 | 1 | 3 |
1 | 1 | 1 | 2 | 1 | 6 | 1 | 1 | 1 | 1 |
Find a large-sample 95% confidence interval for the parameter \(\theta\).
The log-likelihood for a random sample from the log-series distribution is
\[ \ell(\theta) = \sum_{i=1}^n \log \left(p(x_i)\right) = {\sum x_i} \log(\theta) - n \log \left( -\log(1 - \theta) \right) + K \]where \(K\) is a constant whose value does not depend on \(\theta\). The maximum likelihood estimate is the solution of
\[ \ell'(\theta) = \frac {\sum x_i} {\theta} + \frac n {(1 - \theta)\log(1 - \theta)} = 0 \]but this equation cannot be solved algebraically for \(\theta\). Iterations of the Newton-Raphson algorithm for solving it numerically also use the second derivative of the log-likelihood,
\[ \ell''(\theta) = -\frac {\sum x_i} {\theta^2} + \frac {n \left(1 + \log(1 - \theta) \right)} {(1 - \theta)^2\log^2(1 - \theta)} \]The following table shows iterations from an initial guess, \(\theta_0 = 0.7\); a computational sketch of the iteration is given after the table.
Iteration, i | \(\theta_i\) | \(\ell'(\theta_i)\) | \(\ell''(\theta_i)\) |
---|---|---|---|
0 | 0.7000 | 52.656 | -240.78 |
1 | 0.9187 | -43.613 | -1200.14 |
2 | 0.8823 | -11.484 | -661.52 |
3 | 0.8650 | -1.139 | -538.41 |
4 | 0.8629 | -0.013 | -526.41 |
5 | 0.8628 | -0.000 | -526.28 |
6 | 0.8628 | -0.000 | -526.28 |
7 | 0.8628 | -0.000 | -526.28 |
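The iterations above can be reproduced with a short Python sketch, assuming the 30 data values listed earlier (so \(n = 30\) and \(\sum x_i = 95\)).

```python
import math

# Data from the log-series example above
x = [3, 5, 1, 4, 8, 10, 2, 1, 1, 2,
     1, 8, 1, 6, 13, 1, 6, 2, 1, 3,
     1, 1, 1, 2, 1, 6, 1, 1, 1, 1]
n, sx = len(x), sum(x)          # n = 30, sum(x) = 95

def dll(t):
    # First derivative of the log-likelihood
    return sx / t + n / ((1 - t) * math.log(1 - t))

def d2ll(t):
    # Second derivative of the log-likelihood
    return -sx / t**2 + n * (1 + math.log(1 - t)) / ((1 - t)**2 * math.log(1 - t)**2)

theta = 0.7                     # initial guess, theta_0
for i in range(20):
    print(f"{i}  theta = {theta:.4f}  l' = {dll(theta):8.3f}  l'' = {d2ll(theta):9.2f}")
    step = dll(theta) / d2ll(theta)
    theta -= step               # Newton-Raphson update
    if abs(step) < 1e-7:
        break
```

Each update is \(\theta_{i+1} = \theta_i - \ell'(\theta_i)/\ell''(\theta_i)\); the iteration converges to \(\hat\theta = 0.8628\) with \(\ell''(\hat\theta) = -526.28\), matching the table.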
From the second derivative, we can approximate the standard error of the estimator,
\[ \se(\hat {\theta}) \;\;\approx\;\; \sqrt {- \frac 1 {\ell''(\hat {\theta})}} \;\;=\;\; \sqrt {\frac 1 {526.28}} \;\;=\;\; 0.0436\]This leads to a 95% confidence interval
\[ \begin{align} \hat {\theta} \;\pm\; 1.96 \times \se(\hat {\theta}) \;\;&=\;\; 0.8628 \;\pm\; 1.96 \times 0.0436 \\ &=\;\; 0.7773 \text{ to } 0.9483 \end{align} \]
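As a check, this interval can be reproduced from the converged values in the table above (a minimal sketch; the numbers are taken directly from the table).

```python
import math

theta_hat, d2 = 0.8628, -526.28   # converged values from the table
se = math.sqrt(-1 / d2)           # approximately 0.0436
print(f"95% CI: {theta_hat - 1.96*se:.4f} to {theta_hat + 1.96*se:.4f}")
```

Any small difference from the interval above is due to rounding \(\se(\hat\theta)\) to 0.0436 before multiplying.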