Problems evaluating maximum likelihood estimates
For some families of two-parameter distributions, it is difficult to find maximum likelihood estimates algebraically.
A numerical method must then be used to evaluate the maximum likelihood estimates.
Grid search
More efficient algorithms are sometimes available, but a simple method is to evaluate the log-likelihood over a grid of values of the two parameters. The grid point with the largest log-likelihood gives approximate values of the parameters that maximise it.
The grid of parameter values can then be refined to focus on a narrower range of possible parameter values. The method is more easily explained in an example.
Beta distribution
The following data set contains proportions between zero and one:
0.078 | 0.713 | 0.668 | 0.621 | 0.069 | 0.378 | 0.735 | 0.255 | 0.220 | 0.220 |
0.136 | 0.413 | 0.516 | 0.183 | 0.724 | 0.377 | 0.409 | 0.403 | 0.042 | 0.692 |
0.486 | 0.421 | 0.358 | 0.236 | 0.654 | 0.717 | 0.520 | 0.266 | 0.520 | 0.641 |
A reasonable distribution that could be used to model the data would be a beta distribution with probability density function
\[ f(x) \;\;=\;\; \begin{cases} \dfrac {\Gamma(\alpha +\beta) }{\Gamma(\alpha)\Gamma(\beta)} x^{\alpha - 1} (1 - x)^{\beta - 1}& \text{if }0 \lt x \lt 1 \\ 0 & \text{otherwise} \end{cases} \]
We will estimate \(\alpha\) and \(\beta\) by maximum likelihood. The beta distribution's log-likelihood is
\[ \begin{align} \ell(\alpha, \beta) \;=\; n \log \Gamma(\alpha + \beta) &- n \log \Gamma(\alpha) - n \log \Gamma(\beta) \\ &+ (\alpha - 1) \sum \log(x_i) + (\beta - 1)\sum \log(1 - x_i) \end{align} \]
Using the values in this data set, we therefore want to maximise
\[ \ell(\alpha, \beta) \;=\; 30 \log \Gamma(\alpha + \beta) - 30 \log \Gamma(\alpha) - 30 \log \Gamma(\beta) -31.89 (\alpha - 1) - 18.75 (\beta - 1) \]
with respect to \(\alpha\) and \(\beta\). Here \(n = 30\), \(\sum \log(x_i) = -31.89\) and \(\sum \log(1 - x_i) = -18.75\). Differentiating with respect to \(\alpha\) and \(\beta\) requires the derivative of the log-gamma function (the digamma function), and the resulting equations cannot be solved algebraically.
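To make the last remark concrete, the likelihood equations can be sketched as follows. The symbol \(\psi\) is notation introduced here for the digamma function, \(\psi(z) = \frac{d}{dz} \log \Gamma(z)\); it is not used elsewhere in this section.

\[ \begin{align} \frac{\partial \ell}{\partial \alpha} \;&=\; n\,\psi(\alpha + \beta) - n\,\psi(\alpha) + \sum \log(x_i) \;=\; 0 \\ \frac{\partial \ell}{\partial \beta} \;&=\; n\,\psi(\alpha + \beta) - n\,\psi(\beta) + \sum \log(1 - x_i) \;=\; 0 \end{align} \]

Both equations contain \(\psi(\alpha + \beta)\), so they are coupled and neither parameter can be isolated in closed form.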
The following Excel spreadsheet shows how the log-likelihood can be evaluated for a grid of values of the two parameters. The formula in cell C7 evaluates the log-likelihood for \(\alpha = 1\) and \(\beta = 2\). When written in this way, the cell can be copied into the other cells in the table to evaluate the log-likelihood for all other combinations of parameter values in the grid.
From these log-likelihoods, the largest value in the grid (5.39) occurs at \(\alpha = 1.8\) and \(\beta = 2.6\), so the maximum is near these values.
β \ α | 1 | 1.2 | 1.4 | 1.6 | 1.8 | 2 | 2.2 | 2.4
---|---|---|---|---|---|---|---|---
2 | 2.05 | 4.00 | 4.86 | 4.88 | 4.26 | 3.12 | 1.53 | -0.41
2.2 | 1.16 | 3.55 | 4.82 | 5.23 | 4.97 | 4.16 | 2.90 | 1.26
2.4 | 0.02 | 2.82 | 4.47 | 5.25 | 5.33 | 4.84 | 3.89 | 2.54
2.6 | -1.33 | 1.86 | 3.87 | 4.98 | 5.39 | 5.21 | 4.55 | 3.47
2.8 | -2.86 | 0.69 | 3.04 | 4.48 | 5.19 | 5.30 | 4.92 | 4.11
3 | -4.54 | -0.65 | 2.03 | 3.77 | 4.77 | 5.16 | 5.04 | 4.49
3.2 | -6.35 | -2.14 | 0.84 | 2.88 | 4.16 | 4.81 | 4.95 | 4.64
3.4 | -8.28 | -3.76 | -0.49 | 1.82 | 3.37 | 4.28 | 4.66 | 4.58
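For readers without a spreadsheet to hand, the same coarse grid can be evaluated with a short Python sketch. The names `log_lik`, `alphas`, `betas` and `grid` are illustrative choices, not part of the original text, and the code assumes only the standard-library `math.lgamma` function for \(\log \Gamma\).

```python
import math

# Proportions from the data set above.
data = [
    0.078, 0.713, 0.668, 0.621, 0.069, 0.378, 0.735, 0.255, 0.220, 0.220,
    0.136, 0.413, 0.516, 0.183, 0.724, 0.377, 0.409, 0.403, 0.042, 0.692,
    0.486, 0.421, 0.358, 0.236, 0.654, 0.717, 0.520, 0.266, 0.520, 0.641,
]

n = len(data)
sum_log_x = sum(math.log(x) for x in data)        # sum of log(x_i)
sum_log_1mx = sum(math.log(1 - x) for x in data)  # sum of log(1 - x_i)


def log_lik(alpha, beta):
    """Beta log-likelihood for the data set, as in the formula above."""
    return (
        n * (math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta))
        + (alpha - 1) * sum_log_x
        + (beta - 1) * sum_log_1mx
    )


# Same grid as the table: alpha = 1, 1.2, ..., 2.4 and beta = 2, 2.2, ..., 3.4.
alphas = [round(1.0 + 0.2 * i, 2) for i in range(8)]
betas = [round(2.0 + 0.2 * j, 2) for j in range(8)]

# Evaluate the log-likelihood at every (alpha, beta) combination.
grid = {(a, b): log_lik(a, b) for a in alphas for b in betas}

# The grid point with the largest log-likelihood.
best_alpha, best_beta = max(grid, key=grid.get)
best_value = grid[(best_alpha, best_beta)]

# Print the grid with one row per beta and one column per alpha.
print("beta\\alpha " + " ".join(f"{a:7.1f}" for a in alphas))
for b in betas:
    print(f"{b:10.1f} " + " ".join(f"{grid[(a, b)]:7.2f}" for a in alphas))
print(f"largest value {best_value:.2f} at alpha={best_alpha}, beta={best_beta}")
```

Changing the start values and step sizes in `alphas` and `betas` gives the narrower grids used in the next steps.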
The grid can then be refined to a narrower range of values of the parameters around this point.
β \ α | 1.6 | 1.65 | 1.7 | 1.75 | 1.8 | 1.85 | 1.9 | 1.95
---|---|---|---|---|---|---|---|---
2.4 | 5.245 | 5.325 | 5.363 | 5.363 | 5.326 | 5.254 | 5.148 | 5.010
2.45 | 5.204 | 5.305 | 5.364 | 5.384 | 5.367 | 5.315 | 5.229 | 5.110
2.5 | 5.146 | 5.267 | 5.347 | 5.388 | 5.391 | 5.358 | 5.291 | 5.192
2.55 | 5.072 | 5.214 | 5.314 | 5.374 | 5.397 | 5.383 | 5.336 | 5.256
2.6 | 4.983 | 5.145 | 5.265 | 5.345 | 5.386 | 5.392 | 5.364 | 5.302
2.65 | 4.879 | 5.060 | 5.200 | 5.299 | 5.360 | 5.385 | 5.375 | 5.332
2.7 | 4.760 | 4.961 | 5.120 | 5.239 | 5.318 | 5.362 | 5.370 | 5.345
2.75 | 4.627 | 4.848 | 5.026 | 5.163 | 5.262 | 5.324 | 5.350 | 5.343
A still finer grid is shown below.
β \ α | 1.77 | 1.78 | 1.79 | 1.8 | 1.81 | 1.82 | 1.83 | 1.84
---|---|---|---|---|---|---|---|---
2.52 | 5.39297 | 5.39512 | 5.39582 | 5.39507 | 5.39289 | 5.38929 | 5.38429 | 5.37790
2.53 | 5.39186 | 5.39480 | 5.39627 | 5.39631 | 5.39490 | 5.39208 | 5.38786 | 5.38224
2.54 | 5.39009 | 5.39381 | 5.39606 | 5.39687 | 5.39625 | 5.39420 | 5.39075 | 5.38590
2.55 | 5.38765 | 5.39215 | 5.39519 | 5.39677 | 5.39692 | 5.39565 | 5.39297 | 5.38889
2.56 | 5.38456 | 5.38984 | 5.39365 | 5.39601 | 5.39693 | 5.39643 | 5.39452 | 5.39121
2.57 | 5.38082 | 5.38687 | 5.39146 | 5.39459 | 5.39629 | 5.39656 | 5.39541 | 5.39287
2.58 | 5.37643 | 5.38326 | 5.38862 | 5.39252 | 5.39499 | 5.39602 | 5.39564 | 5.39386
2.59 | 5.37140 | 5.37900 | 5.38513 | 5.38981 | 5.39304 | 5.39484 | 5.39522 | 5.39420
From this, we can say that the maximum likelihood estimates are approximately
\[ \hat{\alpha} = 1.81 \quad\text{and}\quad \hat{\beta} = 2.56 \]
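The repeated narrowing of the grid can also be automated. The following is a minimal sketch under stated assumptions, not the procedure used to produce the tables: the function names `grid_search` and `refine`, the starting centre, the 9-by-9 grid and the rule of halving the grid width at each round are all illustrative choices. It uses the rounded summary statistics quoted earlier (30, -31.89 and -18.75), so its answer may differ from the tables in the last decimal place.

```python
import math

# Rounded summary statistics quoted in the text for this data set.
N = 30
SUM_LOG_X = -31.89      # sum of log(x_i)
SUM_LOG_1MX = -18.75    # sum of log(1 - x_i)


def log_lik(alpha, beta):
    """Beta log-likelihood written in terms of the summary statistics."""
    return (
        N * (math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta))
        + (alpha - 1) * SUM_LOG_X
        + (beta - 1) * SUM_LOG_1MX
    )


def grid_search(centre, half_width, points=9):
    """Return (value, alpha, beta) for the best point on a square grid.

    The grid is centred on `centre` and extends `half_width` either side
    in both parameters. Non-positive parameter values are skipped.
    """
    a0, b0 = centre
    step = 2 * half_width / (points - 1)
    best = None
    for i in range(points):
        for j in range(points):
            a = a0 - half_width + i * step
            b = b0 - half_width + j * step
            if a <= 0 or b <= 0:
                continue
            value = log_lik(a, b)
            if best is None or value > best[0]:
                best = (value, a, b)
    return best


def refine(centre=(1.7, 2.7), half_width=0.8, rounds=6):
    """Repeat the grid search, re-centring on the best point each round."""
    for _ in range(rounds):
        value, a, b = grid_search(centre, half_width)
        centre = (a, b)
        half_width /= 2  # narrow the grid around the current best point
    return a, b, value


alpha_hat, beta_hat, max_log_lik = refine()
print(f"alpha = {alpha_hat:.3f}, beta = {beta_hat:.3f}, log-likelihood = {max_log_lik:.4f}")
```

Halving the width while keeping nine points per side means each new grid spans two steps of the previous grid either side of the best point, which leaves room for the maximum to lie between grid points.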