Simple linear model

Statistical analysis by researchers often examines the relationship between a single 'response' variable, Y, and other explanatory variables.

In earlier chapters of CAST, we have used a simple linear model to describe the effect of a single numerical explanatory variable, X, on a response variable, Y.

Inference about the parameters in this model describes how Y is related to X.
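For reference, the simple linear model mentioned above is commonly written in the form below. The symbols $\beta_0$, $\beta_1$, $\varepsilon_i$ and $\sigma$ are assumed notation, since the equation itself does not appear in this text; earlier chapters may use different symbols.

```latex
y_i = \beta_0 + \beta_1 x_i + \varepsilon_i,
\qquad \varepsilon_i \sim \mathrm{normal}(0, \sigma)
```

Here $\beta_0$ is the intercept, $\beta_1$ is the slope (the expected change in Y per unit increase in X), and $\varepsilon_i$ is a random error for the i-th individual.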

Two or more explanatory variables

In many situations, more than one explanatory variable could potentially be related to the response.

In this chapter, we extend the simple linear model to explain how two explanatory variables, X and Z, affect the response.
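One natural form for such an extension is sketched below. The notation is assumed for illustration (the $\beta$ symbols and the error term $\varepsilon_i$ are not taken from this text) and may differ from what later pages use.

```latex
y_i = \beta_0 + \beta_1 x_i + \beta_2 z_i + \varepsilon_i
```

In a model of this form, each slope describes the expected change in Y per unit increase in one explanatory variable when the other is held fixed.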


Body fat

Percentage body fat is an important measure of a person's health, but it is difficult to measure. Accurate measurement involves weighing the individual while submerged in water, so it is rarely done in fitness checks.

It is, however, possible to estimate percentage body fat from other body measurements that are easier to obtain. To determine how effectively body fat could be estimated from simple body measurements, scientists accurately determined the body fat of a group of 252 men. Several other easily obtained measurements were also recorded for each subject.

Measurement                         Person 1   Person 2   Person 3   ...
Body fat (percent)                      12.6        6.9       24.6   ...
Weight (lbs)                          154.25     173.25     154.00   ...
Age (yrs)                                 23         22         22   ...
Height (inches)                        67.75      72.25      66.25   ...
Neck circumference (cm)                 36.2       38.5       34.0   ...
Chest circumference (cm)                93.1       93.6       95.8   ...
Abdomen circumference (cm)              85.2       83.0       87.9   ...
Hip circumference (cm)                  94.5       98.7       99.2   ...
Thigh circumference (cm)                59.0       58.7       59.6   ...
Knee circumference (cm)                 37.3       37.3       38.9   ...
Ankle circumference (cm)                21.9       23.4       24.0   ...
Extended biceps circumference (cm)      32.0       30.5       28.8   ...
Forearm circumference (cm)              27.4       28.9       25.2   ...
Wrist circumference (cm)                17.1       18.2       16.6   ...

The diagram below shows scatterplots of body fat against the other variables.

Use the popup menu to examine the relationship between body fat and each of the variables in turn. The correlation coefficients are also shown in the diagram; they provide a useful summary of the strength of each linear relationship.
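As a rough illustration of how such a correlation coefficient is computed, the sketch below applies the usual Pearson formula to the three subjects printed in the table. The function name `correlation` is a hypothetical helper, and three observations are far too few for a meaningful result; the full data set of 252 men is not reproduced here.

```python
from math import sqrt

# Only the three subjects printed in the table are used here; the full
# data set has 252 men, so these values are illustrative only.
body_fat = [12.6, 6.9, 24.6]     # percent
abdomen = [85.2, 83.0, 87.9]     # cm
height = [67.75, 72.25, 66.25]   # inches


def correlation(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


r_abdomen = correlation(abdomen, body_fat)
r_height = correlation(height, body_fat)
print(f"abdomen vs body fat: r = {r_abdomen:.3f}")
print(f"height  vs body fat: r = {r_height:.3f}")
```

With only three points the coefficients will be close to +1 or -1 almost by construction, so the comparison between variables only becomes informative on the full data set.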

Which variables are most strongly related to body fat?

Click Least Squares Line to show the least squares line that might be used to predict body fat from any single explanatory variable.
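The least squares line for a single explanatory variable can be sketched as follows, again using only the three subjects printed in the table. The names `least_squares_line` and `predict` are hypothetical helpers introduced for illustration, and abdomen circumference is chosen arbitrarily as the explanatory variable.

```python
# Only the three subjects printed in the table; illustrative, not the full data.
body_fat = [12.6, 6.9, 24.6]   # percent (response)
abdomen = [85.2, 83.0, 87.9]   # cm (explanatory variable)


def least_squares_line(x, y):
    """Return (intercept, slope) minimising the sum of squared vertical residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    # The fitted line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return intercept, slope


b0, b1 = least_squares_line(abdomen, body_fat)


def predict(x_new):
    """Predicted body fat (percent) for a given abdomen circumference (cm)."""
    return b0 + b1 * x_new


print(f"fitted line: body fat = {b0:.2f} + {b1:.2f} * abdomen")
```

A call such as `predict(86.0)` then returns the predicted percentage body fat for an abdomen circumference of 86.0 cm, though a line fitted to three points should not be trusted for real prediction.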

Could we improve the prediction by using two or more of the measurements?