Adding a quadratic term

An alternative solution to the problem of curvature is to extend the simple linear model with the addition of a quadratic term,

$$y = b_0 + b_1 x + b_2 x^2$$

Fitted values and residuals are defined (and interpreted) in a similar way to those for a linear model,

$$\hat{y}_i = b_0 + b_1 x_i + b_2 x_i^2$$
$$e_i = y_i - \hat{y}_i$$
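
As a small illustration of these definitions, the sketch below computes fitted values and residuals for a quadratic with given coefficients. The function names and the four data points are made up for the example; they are not taken from the data sets used in this section.

```python
def fitted_values(x, b0, b1, b2):
    """Quadratic fitted value b0 + b1*x + b2*x**2 for each x."""
    return [b0 + b1 * xi + b2 * xi ** 2 for xi in x]


def residuals(x, y, b0, b1, b2):
    """Residual = observed response minus fitted value."""
    return [yi - fi for yi, fi in zip(y, fitted_values(x, b0, b1, b2))]


# Made-up illustrative data (an assumption, not from the text)
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 2.5, 7.0, 13.0]

# Arbitrary trial coefficients, like a curve positioned by hand
resid = residuals(x, y, b0=1.0, b1=0.5, b2=1.0)
print(resid)
```

A positive residual means the point lies above the curve, and a negative residual means it lies below.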

As in a linear model, the quadratic model's residuals are the vertical distances between the crosses in a scatterplot and the curve. We again use least squares to estimate the unknown parameters: choose the values of the three parameters that minimise the residual sum of squares,

$$\sum_i e_i^2 = \sum_i \left( y_i - b_0 - b_1 x_i - b_2 x_i^2 \right)^2$$
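
One way to carry out this minimisation is to solve the three normal equations obtained by setting the derivatives of the residual sum of squares with respect to b0, b1 and b2 to zero. The sketch below does this with Cramer's rule, which is adequate for a 3-parameter illustration but is not how statistical software usually solves the problem. The function names and the data are assumptions made for the example; the responses were generated exactly from y = 1 + 2x + 3x^2, so the fit should recover those coefficients.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_quadratic(x, y):
    """Least squares estimates (b0, b1, b2) from the normal equations."""
    s = [sum(xi ** k for xi in x) for k in range(5)]               # sums of x^k
    t = [sum(xi ** k * yi for xi, yi in zip(x, y)) for k in range(3)]  # sums of x^k * y
    a = [[s[i + j] for j in range(3)] for i in range(3)]           # normal-equation matrix
    d = det3(a)
    coefs = []
    for col in range(3):
        # Cramer's rule: replace one column by the right-hand side
        replaced = [row[:] for row in a]
        for r in range(3):
            replaced[r][col] = t[r]
        coefs.append(det3(replaced) / d)
    return tuple(coefs)


def residual_sum_of_squares(x, y, b0, b1, b2):
    """The criterion that least squares minimises."""
    return sum((yi - (b0 + b1 * xi + b2 * xi ** 2)) ** 2 for xi, yi in zip(x, y))


# Made-up data lying exactly on y = 1 + 2x + 3x^2
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [9.0, 2.0, 1.0, 6.0, 17.0]

b0, b1, b2 = fit_quadratic(x, y)
rss_fit = residual_sum_of_squares(x, y, b0, b1, b2)
rss_other = residual_sum_of_squares(x, y, b0 + 0.5, b1, b2)  # any other curve does no better
print(b0, b1, b2, rss_fit, rss_other)
```

Moving any coefficient away from its least squares value, as in the last line, cannot reduce the residual sum of squares.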

The scatterplot below shows a data set with a nonlinear relationship.

Drag the three red arrows to adjust the position of the quadratic curve. Observe that ...

Click the checkbox Show residuals. The residuals are displayed with blue vertical lines on the scatterplot. Adjust the coefficients using the red arrows to make the residuals as small as possible, then click the button Least squares to minimise the residual sum of squares.

If a relationship is nonlinear, residuals from a quadratic model are likely to be smaller than those from a linear model. This suggests that prediction errors are also likely to be smaller if the quadratic model is used to predict future response values.
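
The comparison can be sketched numerically by fitting both models to the same curved data and comparing their residual sums of squares. The helper `fit_polynomial` below is a hypothetical function written for this illustration (normal equations solved by Gaussian elimination), and the six data points are made up. Because a straight line is the special case b2 = 0 of a quadratic, the least squares quadratic can never have a larger residual sum of squares than the least squares line on the same data.

```python
def fit_polynomial(x, y, degree):
    """Least squares coefficients [b0, b1, ...] for a polynomial of the given degree."""
    n = degree + 1
    # Augmented normal-equation matrix: sums of x^(i+j), with sums of x^i * y on the right
    m = [[sum(xi ** (i + j) for xi in x) for j in range(n)]
         + [sum(xi ** i * yi for xi, yi in zip(x, y))]
         for i in range(n)]
    # Gaussian elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution
    coefs = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * coefs[c] for c in range(r + 1, n))
        coefs[r] = (m[r][n] - tail) / m[r][r]
    return coefs


def rss(x, y, coefs):
    """Residual sum of squares for the polynomial with the given coefficients."""
    return sum((yi - sum(b * xi ** k for k, b in enumerate(coefs))) ** 2
               for xi, yi in zip(x, y))


# Made-up data with visible curvature (an assumption, not from the text)
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.2, 1.8, 5.3, 9.7, 17.2, 25.9]

rss_linear = rss(x, y, fit_polynomial(x, y, degree=1))
rss_quadratic = rss(x, y, fit_polynomial(x, y, degree=2))
print(rss_linear, rss_quadratic)
```

A smaller residual sum of squares on the fitted data does not by itself guarantee better predictions, so the residual plot should still be checked for any remaining pattern.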

The next example illustrates the potential improvement from using a quadratic model.

Forbes' data

We tried earlier to use a linear model to explain the relationship between barometric pressure and the boiling point of water. The scatterplot of residuals on the right below shows the problems with this linear model.

Firstly, click the checkbox Delete outlier to remove the outlier that we identified earlier. Note again the strong impression of curvature in the residual plot on the right.

Next choose the option Quadratic fit from the pull-down menu. The least squares quadratic curve is displayed on the left, and the residual plot on the right displays the residuals from the quadratic model. Note that ...