Three parameters of the normal linear model

A normal linear model,

μy  =  β0  +  β1x

σy  =  σ

involves three parameters: β0, β1 and σ. Together they give the model considerable flexibility: β0 and β1 fix the position of the line, and σ sets the spread of values about it.

Drag the three red arrows to adjust the parameters of the normal linear model.

Click Take sample a few times to verify that approximately 95% of values are within the grey band.

(Note that the values of X are not fixed in this example; they vary from sample to sample. The normal linear model does not attempt to describe variability in X, though a standard univariate distribution such as a normal distribution might fit the distribution of X in this example.)
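The 95% claim can also be checked by simulation. The sketch below is only an illustration: the parameter values, the normal distribution used for X, and the function names are all assumptions, not part of the original diagram. It draws a large sample from a normal linear model and counts the proportion of values within 2σ of the line.

```python
import random

def simulate_sample(beta0, beta1, sigma, n, rng):
    """Draw n (x, y) pairs from a normal linear model.

    X is drawn from a normal distribution purely for illustration;
    the model itself says nothing about how X varies.
    """
    sample = []
    for _ in range(n):
        x = rng.gauss(10.0, 2.0)        # assumed distribution for X
        mean_y = beta0 + beta1 * x      # mean response at this x
        y = rng.gauss(mean_y, sigma)    # normal scatter about the line
        sample.append((x, y))
    return sample

def fraction_in_band(sample, beta0, beta1, sigma):
    """Proportion of points lying within 2 * sigma of the regression line."""
    inside = sum(1 for x, y in sample
                 if abs(y - (beta0 + beta1 * x)) <= 2 * sigma)
    return inside / len(sample)

rng = random.Random(1)                  # fixed seed so the run is repeatable
beta0, beta1, sigma = 5.0, 1.5, 3.0     # hypothetical parameter values
sample = simulate_sample(beta0, beta1, sigma, 10_000, rng)
coverage = fraction_in_band(sample, beta0, beta1, sigma)
print(f"Proportion within the band: {coverage:.3f}")
```

With a sample this large the proportion should be close to 0.95, whatever values are chosen for the three parameters.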

Interpreting the model's slope and intercept

The most important parameters of a linear model are its slope, β1, and intercept, β0. These can be interpreted in a similar way to the slope and intercept of the least squares lines that were fitted to data in an earlier chapter.

Slope
The slope parameter, β1, is the more important of the two. It specifies the increase in the mean response per unit increase in X.
Intercept
The intercept, β0, is the mean response when X = 0. (In some contexts, this may not have an easily interpreted meaning.)
Context: Y = Yield of wheat per acre; X = Fertiliser (kg per m²)
Interpretation of β1: Increase in mean yield per acre for each additional kg/m² of fertiliser
Interpretation of β0: Mean yield per acre if no fertiliser is used

Context: Y = Exam mark; X = Hours of study by student before exam
Interpretation of β1: Increase in expected mark for each additional hour of study
Interpretation of β0: Expected mark if there is no study

Context: Y = Hospital stay (days); X = Age of patient
Interpretation of β1: Average extra days in hospital per extra year of age
Interpretation of β0: Average days in hospital at age 0. Not particularly meaningful here.
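
Both interpretations follow directly from the model equation. In the short derivation below, the notation μy(x) for the mean response at X = x is an assumption introduced only for this illustration.

```latex
\begin{align*}
\mu_y(x) &= \beta_0 + \beta_1 x \\
\mu_y(x + 1) - \mu_y(x) &= \bigl(\beta_0 + \beta_1 (x + 1)\bigr) - \bigl(\beta_0 + \beta_1 x\bigr) = \beta_1 \\
\mu_y(0) &= \beta_0 + \beta_1 \cdot 0 = \beta_0
\end{align*}
```

The second line shows that a unit increase in X changes the mean response by β1 regardless of the starting value of x.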

Estimating the parameters by eye

The regression line (i.e. the straight line showing how the mean response depends on x) and the band extending 2σ above and below it are a good way to understand the normal linear model. Indeed, they can be used as an informal way to estimate the parameters of the model 'by eye'. (We will give better methods in the next section.)
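
Once a line and band have been positioned by eye, the three parameter estimates can be read off with simple arithmetic. The following is a minimal sketch, assuming that two points on the line and the vertical half-width of the band have been read from the diagram; the function name and the example readings are hypothetical.

```python
def estimate_by_eye(point_a, point_b, band_half_width):
    """Convert readings from a hand-positioned line and band into parameters.

    point_a, point_b: two (x, y) points read off the drawn line.
    band_half_width: vertical distance from the line to the band edge,
                     which the model takes to be 2 * sigma.
    """
    (x1, y1), (x2, y2) = point_a, point_b
    if x1 == x2:
        raise ValueError("the two points must have different x values")
    beta1 = (y2 - y1) / (x2 - x1)   # slope: rise over run
    beta0 = y1 - beta1 * x1         # intercept: height of the line at x = 0
    sigma = band_half_width / 2     # band edges sit 2 * sigma from the line
    return beta0, beta1, sigma

# Hypothetical readings: the line passes through (2, 11) and (6, 19),
# and the band edge sits 4 units above the line.
beta0, beta1, sigma = estimate_by_eye((2, 11), (6, 19), 4)
print(beta0, beta1, sigma)  # 7.0 2.0 2.0
```

Choosing two points that are far apart on the line makes the slope reading less sensitive to small errors in either point.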

Artificial data

A normal linear model might be used to describe how the response depends on the explanatory variable.

In the next section, we will explain how to objectively estimate the parameters to match a data set. Click Best values to see these 'best' parameter values.

If there are fewer values or if the relationship is weaker, it is harder to position the band by eye.
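
The same difficulty affects objective estimates. This page does not say how the 'best' values are calculated, so the sketch below assumes the least squares method from the earlier chapter. The true parameter values, the uniform distribution for X, and the function names are likewise assumptions. It simulates many artificial data sets and measures how much the fitted slope varies from one to the next.

```python
import random
import statistics

def least_squares(xs, ys):
    """Least squares intercept and slope for paired data."""
    x_bar = statistics.fmean(xs)
    y_bar = statistics.fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

def slope_spread(n, sigma, rng, repeats=500):
    """Standard deviation of the fitted slope over repeated samples."""
    beta0, beta1 = 5.0, 1.5                     # hypothetical true values
    slopes = []
    for _ in range(repeats):
        xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
        ys = [rng.gauss(beta0 + beta1 * x, sigma) for x in xs]
        slopes.append(least_squares(xs, ys)[1])
    return statistics.stdev(slopes)

rng = random.Random(2)                          # fixed seed for repeatability
spread_easy = slope_spread(n=100, sigma=1.0, rng=rng)   # many values, little scatter
spread_few = slope_spread(n=10, sigma=1.0, rng=rng)     # fewer values
spread_weak = slope_spread(n=100, sigma=5.0, rng=rng)   # weaker relationship
print(spread_easy, spread_few, spread_weak)
```

Both the smaller sample and the larger σ should give a noticeably wider spread of fitted slopes than the first setting. Note that this measures the variability of least squares estimates, not the accuracy of judgement by eye.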

Cotton yield and irrigation

The data set below gives the yield of Pima cotton (pounds per acre) and irrigation levels (water applied in feet per acre) in the Salt River Valley for different plots of land. Each plot was on Maricopa sandy loam soil.

Again, drag the three arrows to obtain estimates of the three model parameters. This is much harder to do by eye than it was for the previous data set.