We now describe two examples in which there is no exact formula for the standard error of the estimator.

Example: Geometric random sample

If \(\{x_1, x_2, \dots, x_n\}\) is a random sample from a geometric distribution with parameter \(\pi\), find a large-sample 90% confidence interval for the parameter \(\pi\).

The geometric probability function is

\[ p(x\;|\;\pi) = \pi(1-\pi)^{x-1} \quad\quad \text{for } x=1, 2, \dots \]

The log-likelihood is

\[ \ell(\pi) \;\; = \;\; \sum_{i=1}^n {\log p(x_i\;|\;\pi)} \;\; = \;\; n \log(\pi) + (\sum x - n) \log(1 - \pi)\]

and its first derivative is

\[ \ell'(\pi) \;\; = \;\; \frac n {\pi} - \frac {\sum x - n} {1 - \pi} \]

The maximum likelihood estimator was shown earlier to be

\[ \hat{\pi} \;\; = \;\; \frac n {\sum x} \;\; = \;\; \frac 1 {\overline x}\]

Although this estimator is biased, maximum likelihood estimators are asymptotically unbiased, so the bias will be small in large samples.
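As a quick symbolic check, the following sketch (assuming Python with the sympy library is available) solves \(\ell'(\pi) = 0\), writing \(S\) for \(\sum x\), and recovers the formula above.

```python
import sympy as sp

n, S, p = sp.symbols('n S p', positive=True)

# Log-likelihood of a geometric sample: n*log(pi) + (sum(x) - n)*log(1 - pi)
ell = n * sp.log(p) + (S - n) * sp.log(1 - p)

# Setting the first derivative to zero recovers pi-hat = n / sum(x) = 1 / xbar
print(sp.solve(sp.diff(ell, p), p))   # [n/S]
```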

In a similar way, there is no simple formula for the standard deviation of \(\hat{\pi}\), but we have an asymptotic formula that gives an approximate value for large samples. The second derivative of the log-likelihood is

\[ \ell''(\pi) \;\; = \;\; -\frac n {\pi^2} -\frac {\sum x - n} {(1 - \pi)^2} \]

and replacing \(\pi\) by \(\hat{\pi}\), noting that \(\sum x = n/\hat{\pi}\) and so \(\sum x - n = n(1-\hat{\pi})/\hat{\pi}\), gives

\[ \ell''(\hat{\pi}) \;\; = \;\; -\frac n {\hat{\pi}^2(1-\hat{\pi})} \]

The asymptotic value for the standard error is therefore

\[ \se(\hat {\pi}) \;\;\approx\;\; \sqrt {- \frac 1 {\ell''(\hat {\pi})}} \;\;=\;\; \sqrt {\frac {{\hat{\pi}}^2(1-\hat{\pi})} n} \]
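The algebra behind this simplification can also be checked symbolically; a minimal sketch, assuming Python with sympy, substitutes \(\sum x = n/\hat{\pi}\) into \(\ell''\) and simplifies \(-1/\ell''(\hat{\pi})\).

```python
import sympy as sp

n, p = sp.symbols('n p', positive=True)
S = n / p                             # substitute sum(x) = n / pi-hat at the MLE

d2 = -n/p**2 - (S - n)/(1 - p)**2     # second derivative evaluated at pi-hat
print(sp.simplify(-1/d2))             # equivalent to p**2*(1 - p)/n, the squared se
```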

Since a large sample also means that the estimator's distribution will be close to normal, an approximate 90% confidence interval for \(\pi\) is

\[ \begin{align} \hat {\pi} \;\pm\; 1.645 \times \se(\hat {\pi}) \;\;&=\;\; \frac 1 {\overline x} \;\pm\; 1.645 \times \sqrt {\frac {{\hat{\pi}}^2(1-\hat{\pi})} n} \\ &=\;\; \frac 1 {\overline x} \;\pm\; 1.645 \times \sqrt{\frac {\overline x - 1} {n {\overline x}^3}} \end{align} \]
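To illustrate the calculation, here is a minimal numerical sketch, assuming Python with numpy; the simulated sample with \(\pi = 0.3\), the sample size and the seed are illustrative choices, not part of the example.

```python
import numpy as np

rng = np.random.default_rng(1)              # arbitrary seed, for reproducibility
x = rng.geometric(p=0.3, size=200)          # simulated sample; numpy's geometric has support 1, 2, ...

n, xbar = len(x), x.mean()
pi_hat = 1 / xbar                           # maximum likelihood estimate
se = np.sqrt(pi_hat**2 * (1 - pi_hat) / n)  # asymptotic standard error
print(f"90% CI: {pi_hat - 1.645*se:.4f} to {pi_hat + 1.645*se:.4f}")
```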

In the next example, the Newton-Raphson algorithm is used to obtain the maximum likelihood estimate and its standard error.

Example: Log-series distribution

The following data set is assumed to arise from a log-series distribution with probability function

\[ p(x) \;=\; \frac {-1} {\log(1-\theta)} \times \frac {\theta^x} x \quad\quad \text{for } x=1, 2, \dots \]
3 5 1 4 8 10 2 1 1 2
1 8 1 6 13 1 6 2 1 3
1 1 1 2 1 6 1 1 1 1

For these \(n = 30\) values, \(\sum x_i = 95\). Find a large-sample 95% confidence interval for the parameter \(\theta\).

The log-likelihood for a random sample from the log-series distribution is

\[ \ell(\theta) = \sum_{i=1}^n \log \left(p(x_i)\right) = {\sum x_i} \log(\theta) - n \log \left( -\log(1 - \theta) \right) + K \]

where \(K\) is a constant whose value does not depend on \(\theta\). The maximum likelihood estimate is the solution of

\[ \ell'(\theta) = \frac {\sum x_i} {\theta} + \frac n {(1 - \theta)\log(1 - \theta)} = 0 \]

but this equation cannot be solved algebraically for \(\theta\). Iterations of the Newton-Raphson algorithm for solving it numerically also use the second derivative of the log-likelihood,

\[ \ell''(\theta) = -\frac {\sum x_i} {\theta^2} + \frac {n \left(1 + \log(1 - \theta) \right)} {(1 - \theta)^2\log^2(1 - \theta)} \]

The following table shows iterations from an initial guess, \(\theta_0 = 0.7\).

Iteration, \(i\)    \(\theta_i\)    \(\ell'(\theta_i)\)    \(\ell''(\theta_i)\)
0                   0.7000            52.656                -240.78
1                   0.9187           -43.613               -1200.14
2                   0.8823           -11.484                -661.52
3                   0.8650            -1.139                -538.41
4                   0.8629            -0.013                -526.41
5                   0.8628            -0.000                -526.28
6                   0.8628            -0.000                -526.28
7                   0.8628
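These iterations can be reproduced with a short script; the following is a minimal sketch assuming Python with numpy, coding the two derivatives given above.

```python
import numpy as np

x = np.array([3, 5, 1, 4, 8, 10, 2, 1, 1, 2,
              1, 8, 1, 6, 13, 1, 6, 2, 1, 3,
              1, 1, 1, 2, 1, 6, 1, 1, 1, 1])
n, S = len(x), x.sum()                  # n = 30, sum(x) = 95

def dll(t):                             # first derivative of the log-likelihood
    return S/t + n / ((1 - t) * np.log(1 - t))

def d2ll(t):                            # second derivative of the log-likelihood
    return -S/t**2 + n * (1 + np.log(1 - t)) / ((1 - t)**2 * np.log(1 - t)**2)

theta = 0.7                             # initial guess, theta_0
for i in range(8):
    print(f"{i}  {theta:.4f}  {dll(theta):9.3f}  {d2ll(theta):10.2f}")
    theta -= dll(theta) / d2ll(theta)   # Newton-Raphson update
```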

From the second derivative, we can approximate the standard error of the estimator,

\[ \se(\hat {\theta}) \;\;\approx\;\; \sqrt {- \frac 1 {\ell''(\hat {\theta})}} \;\;=\;\; \sqrt {\frac 1 {526.28}} \;\;=\;\; 0.0436\]

This leads to a 95% confidence interval

\[ \begin{align} \hat {\theta} \;\pm\; 1.96 \times \se(\hat {\theta}) \;\;&=\;\; 0.8628 \;\pm\; 1.96 \times 0.0436 \\ &=\;\; 0.7773 \text{ to } 0.9483 \end{align} \]
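As a numerical check (again a sketch assuming Python with numpy), substituting the totals and the converged estimate into the formulas above reproduces this interval.

```python
import numpy as np

n, S, theta_hat = 30, 95, 0.8628        # totals and converged estimate from above
d2 = -S/theta_hat**2 + n * (1 + np.log(1 - theta_hat)) / \
     ((1 - theta_hat)**2 * np.log(1 - theta_hat)**2)
se = np.sqrt(-1 / d2)                   # asymptotic standard error, about 0.0436
print(f"95% CI: {theta_hat - 1.96*se:.4f} to {theta_hat + 1.96*se:.4f}")
```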