Correlation
The correlation coefficient between two random variables is closely associated with their covariance.
Definition
The correlation coefficient between two random variables, \(X\) and \(Y\), is
\[ \Corr(X,Y) \;=\; \frac{\Covar(X,Y)}{\sqrt{\Var(X)\Var(Y)}} \]
The correlation coefficient between two variables is often denoted by the Greek letter \(\rho\).
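As a rough illustration of the definition, the following sketch computes the correlation coefficient for a small discrete joint distribution. The probabilities in `joint` and the helper names `expectation` and `correlation` are made up for this example and are not part of the text above.

```python
from math import sqrt

# Hypothetical joint distribution: maps (x, y) to P(X = x, Y = y).
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

def expectation(joint, g):
    """E[g(X, Y)] under a discrete joint distribution."""
    return sum(p * g(x, y) for (x, y), p in joint.items())

def correlation(joint):
    """Corr(X, Y) = Covar(X, Y) / sqrt(Var(X) Var(Y))."""
    mean_x = expectation(joint, lambda x, y: x)
    mean_y = expectation(joint, lambda x, y: y)
    covar = expectation(joint, lambda x, y: (x - mean_x) * (y - mean_y))
    var_x = expectation(joint, lambda x, y: (x - mean_x) ** 2)
    var_y = expectation(joint, lambda x, y: (y - mean_y) ** 2)
    return covar / sqrt(var_x * var_y)

rho = correlation(joint)
print(round(rho, 6))
```

For this distribution the covariance is 0.15 and both variances are 0.25, so the correlation works out to 0.6.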
The correlation coefficient summarises the strength of the relationship between the variables in a way that is not affected by linear scaling of either variable (apart from a possible change of sign), as shown by the following result.
Correlation of linear functions of X and Y
For any random variables, \(X\) and \(Y\), and constants \(a\), \(b\), \(c\) and \(d\),
\[ \Corr(a + bX, c+dY) \;=\; \begin{cases} \Corr(X, Y) & \quad\text{if }bd > 0 \\[0.3em] -\Corr(X, Y) & \quad\text{if }bd < 0 \end{cases} \]
The proof follows from writing
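\[ \begin{align} \Covar(a + bX, c+dY) \;&=\; bd \Covar(X, Y) \\ \Var(a + bX) \;&=\; b^2 \Var(X) \\ \Var(c + dY) \;&=\; d^2 \Var(Y) \end{align} \]
and substituting these into the definition of the correlation coefficient,
\[ \Corr(a + bX, c+dY) \;=\; \frac{bd \Covar(X, Y)}{\sqrt{b^2 d^2 \Var(X)\Var(Y)}} \;=\; \frac{bd}{|bd|} \Corr(X, Y) \]
where \(bd / |bd|\) is \(+1\) if \(bd > 0\) and \(-1\) if \(bd < 0\).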
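The result can also be checked numerically. The sketch below is only an illustration: the joint distribution, the constants and the helper names `correlation` and `transform` are assumptions made for this example, and \(b\) and \(d\) are taken to be non-zero so that the transformed variables still have positive variance.

```python
from math import sqrt

# Hypothetical joint distribution: maps (x, y) to P(X = x, Y = y).
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

def correlation(joint):
    """Corr(X, Y) computed directly from the definition."""
    mean_x = sum(p * x for (x, y), p in joint.items())
    mean_y = sum(p * y for (x, y), p in joint.items())
    covar = sum(p * (x - mean_x) * (y - mean_y) for (x, y), p in joint.items())
    var_x = sum(p * (x - mean_x) ** 2 for (x, y), p in joint.items())
    var_y = sum(p * (y - mean_y) ** 2 for (x, y), p in joint.items())
    return covar / sqrt(var_x * var_y)

def transform(joint, a, b, c, d):
    """Joint distribution of (a + bX, c + dY), assuming b and d are non-zero."""
    return {(a + b * x, c + d * y): p for (x, y), p in joint.items()}

rho = correlation(joint)
rho_same_sign = correlation(transform(joint, 3, 2, -1, 5))       # bd > 0
rho_opposite_sign = correlation(transform(joint, 3, -2, -1, 5))  # bd < 0

print(rho, rho_same_sign, rho_opposite_sign)
```

Up to floating-point rounding, the second value equals the first and the third is its negative, which matches the two cases of the result.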