Using linear models with transformed variables

If the relationship between Y and X is nonlinear, a linear model fitted to the raw data will give poor predictions and should be avoided. However, transforming one or both of the variables often linearises the relationship, so that least squares can be used to fit a linear model to the transformed variables.

For many data sets, a logarithmic transformation works, but a more general power transformation is sometimes needed to linearise the relationship.
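As a minimal sketch of the idea, the Python below log-transforms a response and fits a straight line to the transformed values by least squares. The function name `least_squares` and the `doses` and `counts` values are made up for illustration (they are not the bacteria data), and the values are noise-free so that the fitted line is easy to check.

```python
import math

def least_squares(xs, ys):
    """Return (intercept, slope) of the least squares line ys = b0 + b1 * xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Made-up, noise-free values for illustration only (not the bacteria data):
# the response decays exponentially, so it is nonlinear in x.
doses = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
counts = [400.0 * math.exp(-0.3 * d) for d in doses]

# Transform the response, then fit a line to the transformed values.
log_counts = [math.log(c) for c in counts]
intercept, slope = least_squares(doses, log_counts)
```

Because the made-up response is exactly exponential, the log transformation makes it exactly linear, and the fitted slope and intercept recover the constants used to generate it. Real data would scatter around the line.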

Survival of marine bacteria

The scatterplot on the left below again shows the numbers of a marine bacterium surviving exposure to different doses of X-rays.

Initially, consider only the scatterplot on the left. Drag the red line on the vertical axis upwards to apply a power transformation to the number of survivors. When the power becomes close to zero (which corresponds to a log transformation of the response), the relationship becomes nearly linear. (You may use the up- and down-arrow keys on your keyboard to fine-tune the transformation.)
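The link between a power of zero and the log transformation can be made explicit. One common way to write a power transformation is the scaled form below. This form is an assumption here, since the exact scaling used by the diagram is not stated. The symbol λ denotes the power and y a positive response value.

```latex
\[
y^{(\lambda)} =
\begin{cases}
\dfrac{y^{\lambda} - 1}{\lambda}, & \lambda \neq 0, \\[1ex]
\log y, & \lambda = 0,
\end{cases}
\qquad\text{since}\qquad
\lim_{\lambda \to 0} \frac{y^{\lambda} - 1}{\lambda} = \log y .
\]
```

Subtracting 1 and dividing by λ only shifts and rescales the transformed values, so it does not affect whether the scatterplot looks linear. It simply makes the transformation change smoothly into the log as the power approaches zero.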

The least squares line (calculated using the transformed response) is also shown in grey on the left. The scatterplot on the right shows the untransformed data. Observe that...

The straight line fitted to the transformed data on the left is equivalent to a curve on the plot of the raw data.
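To see why a line on the transformed scale becomes a curve on the raw scale, the sketch below back-transforms a line fitted to the log of the response. The coefficients `b0` and `b1` are hypothetical values chosen for illustration, not estimates from the bacteria data, and `back_transform_line` is a name introduced only for this example.

```python
import math

def back_transform_line(intercept, slope):
    """Turn a line fitted to log(y) into a function that predicts y itself."""
    def predict(x):
        # log(y) = intercept + slope * x  implies  y = exp(intercept + slope * x)
        return math.exp(intercept + slope * x)
    return predict

# Hypothetical coefficients for a line fitted to log(survivors) against dose.
b0, b1 = math.log(400.0), -0.3
curve = back_transform_line(b0, b1)

# Each unit step in x multiplies the prediction by the constant factor exp(b1),
# so the predictions trace an exponential curve rather than a straight line.
ratios = [curve(x + 1) / curve(x) for x in range(5)]
```

On the log scale each unit increase in x adds a constant `b1`. On the raw scale it multiplies the prediction by `exp(b1)`, which is what bends the line into a curve.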