Likelihood function
We defined the likelihood function of a discrete data set to be the probability of obtaining these data values, treated as a function of the unknown parameter, \(\theta\).
\[ L(\theta) \;=\; P(\text{data} \;| \; \theta) \]
If \(\{x_1, x_2, \dots, x_n\}\) is a random sample from a discrete distribution with probability function \(p(x \mid \theta)\), this is
\[ L(\theta) \;=\; P(X_1 = x_1, X_2 = x_2, \dots, X_n = x_n \;| \; \theta) \;\;=\;\; \prod_{i=1}^n {p(x_i \;| \; \theta)} \]
If \(\{x_1, x_2, \dots, x_n\}\) is a random sample from a continuous distribution with probability density function \(f(x\;|\; \theta)\), we define the likelihood function in a similar way, based on the probability of getting values that are close to the observed data. We showed earlier that
\[ P(X_1 \approx x_1, X_2 \approx x_2, \dots, X_n \approx x_n) \;\; \propto \;\; \prod_{i=1}^n f(x_i) \]
Therefore the product of the probability density functions plays the same role for continuous random variables as the product of probability functions for discrete ones.
Definition
If random variables \(\{X_1, X_2, \dots, X_n\}\) are a random sample from a continuous distribution with probability density function \(f(x \;|\; \theta)\), and their observed values are \(\{x_1, x_2, \dots, x_n\}\), then the function
\[ L(\theta) = \prod_{i=1}^n {f(x_i \;| \; \theta)} \]
is called the likelihood function of \(\theta\).
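As a minimal sketch of this definition, the code below evaluates \(L(\theta)\) at a few values of \(\theta\). It assumes an exponential density, \(f(x \mid \theta) = \theta e^{-\theta x}\), chosen only for illustration. The four data values and the function names `exponential_pdf` and `likelihood` are hypothetical and do not come from the text.

```python
import math

def exponential_pdf(x, theta):
    """Density f(x | theta) = theta * exp(-theta * x) for x >= 0 (assumed model)."""
    return theta * math.exp(-theta * x) if x >= 0 else 0.0

def likelihood(theta, data):
    """L(theta): product of the density evaluated at each observed value."""
    return math.prod(exponential_pdf(x, theta) for x in data)

# Hypothetical observed sample, made up for illustration.
sample = [0.5, 1.2, 0.3, 2.0]

# The data are held fixed; only theta varies.
for theta in (0.5, 1.0, 2.0):
    print(theta, likelihood(theta, sample))
```

Of the three values tried, \(\theta = 1\) gives the largest likelihood for this sample, so it is the value among them under which the observed data are most plausible.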
Maximum likelihood estimate
The maximum likelihood estimate of \(\theta\) is again the value for which the observed data are most likely — the value that maximises \(L(\theta)\).
This is usually (but not always) a turning point of the likelihood function and can be found as the solution of the equation
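\[ L'(\theta) \;\; =\;\; 0 \]
As with discrete distributions, it is usually easier to solve the equivalent equation involving the logarithm of the likelihood function. Because the logarithm is an increasing function, \(L(\theta)\) and \(\log L(\theta)\) are maximised by the same value of \(\theta\).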
\[ \ell'(\theta) \;\; =\;\; \frac d {d \theta} \log\big(L(\theta)\big) \;\; =\;\; 0 \]
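The procedure can be checked numerically. The sketch below again assumes an exponential density, \(f(x \mid \theta) = \theta e^{-\theta x}\), for which \(\ell(\theta) = n \log \theta - \theta \sum x_i\), so solving \(\ell'(\theta) = 0\) gives \(\hat{\theta} = n / \sum x_i\). The code maximises \(\ell(\theta)\) with a simple ternary search and compares the result with this closed form. The data, the search interval and the helper `maximise` are illustrative assumptions, not part of the text.

```python
import math

def log_likelihood(theta, data):
    """l(theta) = sum of log f(x_i | theta) for the assumed exponential density."""
    return sum(math.log(theta) - theta * x for x in data)

def maximise(func, lo, hi, tol=1e-9):
    """Ternary search for the maximum of a unimodal function on [lo, hi]."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if func(m1) < func(m2):
            lo = m1  # the maximum cannot lie to the left of m1
        else:
            hi = m2  # the maximum cannot lie to the right of m2
    return (lo + hi) / 2

# Hypothetical observed sample, made up for illustration.
sample = [0.5, 1.2, 0.3, 2.0]

# Numerical maximiser over an assumed search interval for theta.
theta_numeric = maximise(lambda t: log_likelihood(t, sample), 1e-6, 10.0)

# Closed-form solution of l'(theta) = 0 for the exponential model.
theta_closed = len(sample) / sum(sample)

print(theta_numeric, theta_closed)
```

Ternary search relies on the log-likelihood having a single peak on the interval, which holds for this model. For likelihoods with several local maxima, or whose maximum lies on the boundary of the parameter range, a different approach would be needed.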