Distribution of an estimator
If \(X_1, X_2, \dots, X_n\) is a random sample from a distribution involving a single unknown parameter \(\theta\), an estimator \( \hat{\theta}(X_1, X_2, \dots, X_n) \) is a function of these \(n\) random variables and therefore also has a distribution. It is often simply written as \( \hat{\theta} \).
The properties of an estimator depend on its distribution. For example, an estimator with a continuous distribution is described by its probability density function (pdf).
For \( \hat{\theta} \) to be a good estimator of \(\theta\), its distribution should be concentrated near \(\theta\).
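For example (a standard result quoted here for illustration), if the \(X_i\) come from a normal distribution and the parameter being estimated is \(\mu\), the sample mean satisfies
\[ X_i \sim \text{N}(\mu, \sigma^2) \quad\Rightarrow\quad \overline{X} \sim \text{N}\!\left(\mu, \tfrac{\sigma^2}{n}\right), \]
so its distribution becomes more tightly concentrated around \(\mu\) as the sample size \(n\) increases.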
Bias
A good estimator of \(\theta\) should have a distribution whose "centre" is close to \(\theta\). This can be summarised by the distance of the estimator's mean from \(\theta\).
Definition
The bias of an estimator \(\hat{\theta}\) of a parameter \(\theta\) is defined to be
\[ \text{Bias}(\hat{\theta}) \;=\; E[\hat{\theta}] - \theta \]
If its bias is zero, \(\hat{\theta}\) is called an unbiased estimator of \(\theta\).
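For example (a standard example, not given in this section), the variance estimator that divides by \(n\) rather than \(n-1\),
\[ \hat{\sigma}^2 \;=\; \frac{1}{n} \sum_{i=1}^n (X_i - \overline{X})^2, \qquad E[\hat{\sigma}^2] \;=\; \frac{n-1}{n}\,\sigma^2, \]
has bias \( -\sigma^2/n \); it underestimates \(\sigma^2\) on average, although the bias shrinks to zero as \(n\) increases.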
Many popular estimators are unbiased.
Sample mean
If \(X_1, X_2, \dots, X_n\) is a random sample from a distribution with mean \(\mu\), the sample mean, \(\overline{X}\), is an unbiased estimator of the distribution mean, \(\mu\).
(Proved in full version)
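In outline (the full version gives the details), the result follows from the linearity of expected values:
\[ E[\overline{X}] \;=\; E\!\left[ \frac{1}{n} \sum_{i=1}^n X_i \right] \;=\; \frac{1}{n} \sum_{i=1}^n E[X_i] \;=\; \frac{n\mu}{n} \;=\; \mu. \]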
The sample variance is also an unbiased estimator of a distribution's variance, \(\sigma^2\), but this is harder to prove.
Sample variance
If \(X_1, X_2, \dots, X_n\) is a random sample from a distribution with variance \(\sigma^2\), the sample variance,
\[ S^2 \;=\; \frac{1}{n-1} \sum_{i=1}^n (X_i - \overline{X})^2, \]
is an unbiased estimator of \(\sigma^2\).
(Proved in full version)
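In outline (the full version gives the details), the proof rests on the identity \( \sum (X_i - \overline{X})^2 = \sum (X_i - \mu)^2 - n(\overline{X} - \mu)^2 \). Taking expected values, and using \( E[(X_i - \mu)^2] = \sigma^2 \) and \( E[(\overline{X} - \mu)^2] = \text{Var}(\overline{X}) = \sigma^2/n \),
\[ E\!\left[ \sum_{i=1}^n (X_i - \overline{X})^2 \right] \;=\; n\sigma^2 - n \cdot \frac{\sigma^2}{n} \;=\; (n-1)\,\sigma^2, \]
so \( E[S^2] = (n-1)\sigma^2 / (n-1) = \sigma^2 \).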
Although the sample variance is unbiased, the sample standard deviation is a biased estimator.
Sample standard deviation
The sample standard deviation, \(S\), is a biased estimator of a distribution's standard deviation, \(\sigma\).
(Proved in full version)
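In outline (the full version gives the details), the bias follows from the unbiasedness of \( S^2 \). Since \( \text{Var}(S) = E[S^2] - (E[S])^2 \),
\[ (E[S])^2 \;=\; E[S^2] - \text{Var}(S) \;=\; \sigma^2 - \text{Var}(S) \;<\; \sigma^2 \]
whenever \( S \) has non-zero variance, so \( E[S] < \sigma \) and \( S \) underestimates \(\sigma\) on average.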