Normal model for explained and unexplained variation
In normal models, explained variation is modelled by a function for the mean response that depends on the values of the controlled variables and on the known structure of the experimental units (e.g. blocks):
µi = f(xi, zi, ...)
This function also involves unknown parameters that must be estimated. The unexplained variation is modelled with a normal distribution. This model can be expressed in the form
yi = µi (explained) + εi (unexplained)
where the term εi is called the error term in the model. The error term is the random part of the model and represents the unexplained variation. We usually assume that it has mean zero and standard deviation σ:
εi ∼ normal (0, σ)
This model assumes that the standard deviation of the error is the same for all experimental units — this assumption should be checked from graphical displays of the data.
Although the term εi is conventionally called the model's 'error', there is no implication of any kind of mistake.
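The structure of this model can be illustrated with a short simulation. The following is a minimal sketch in Python, not part of the original material: the mean responses (10 and 15), the value σ = 2 and the function name simulate_responses are all assumptions chosen only for illustration. Each response is built as a mean plus a normally distributed error with mean zero and a common standard deviation.

```python
import random
import statistics


def simulate_responses(means, sigma, seed=0):
    """Return responses y_i = mu_i + eps_i and the errors eps_i ~ normal(0, sigma)."""
    rng = random.Random(seed)
    errors = [rng.gauss(0.0, sigma) for _ in means]
    responses = [mu + eps for mu, eps in zip(means, errors)]
    return responses, errors


# Hypothetical mean responses for 2000 experimental units (illustrative values only).
means = [10.0] * 1000 + [15.0] * 1000
sigma = 2.0  # assumed common standard deviation of the error term

y, eps = simulate_responses(means, sigma)

# The errors should average close to zero with a standard deviation close to sigma.
print("mean of errors:", round(statistics.mean(eps), 3))
print("sd of errors:  ", round(statistics.stdev(eps), 3))
```

Subtracting the mean from each simulated response recovers the error exactly, which is the sense in which the error is simply the part of the response that the mean does not explain.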
Antibiotic effectiveness
An experiment was conducted to assess how well an antibiotic, polymyxin B, killed the bacterium, Brucella bronchiseptica. Petri dishes containing the bacterium (grown in agar) were used and different doses of the antibiotic were added. The diameter of the area cleared around the addition point was recorded. Each of 6 different doses of antibiotic was used 10 times.
In this experiment, the explanatory factor is the dose of antibiotic, which is numerical. The diagram below shows the data from the experiment with the crosses jittered a little (randomly moved) to separate them in the scatterplot.
Click in the centre of the diagram and drag towards the top left to display histograms of the response measurement at each value of x.
Select Model from the pop-up menu to display a model that could potentially underlie the data.
Observe that the standard deviation of the response is the same for all values of x in this model — all normal distributions have the same spread.
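A model of this kind can be sketched in Python. Everything numerical here is an assumption made for illustration: the dose levels 1 to 6, the linear form of the mean function with its intercept and slope, and σ = 1 are hypothetical, and the values produced are simulated, not the experimental measurements. Only the design (6 doses, each used 10 times) and the jittering come from the description above.

```python
import random
import statistics


def mean_response(dose, intercept=12.0, slope=1.5):
    # Hypothetical linear mean function f(x); the true form is not given in the text.
    return intercept + slope * dose


def simulate_experiment(doses, reps, sigma, seed=1):
    """Simulate reps responses per dose, all with the same error sd sigma."""
    rng = random.Random(seed)
    return {
        d: [mean_response(d) + rng.gauss(0.0, sigma) for _ in range(reps)]
        for d in doses
    }


def jitter(x, width, rng):
    # Move x randomly by at most width so that overlapping crosses separate.
    return x + rng.uniform(-width, width)


doses = [1, 2, 3, 4, 5, 6]  # hypothetical dose levels
data = simulate_experiment(doses, reps=10, sigma=1.0)

jitter_rng = random.Random(2)
points = [(jitter(d, 0.15, jitter_rng), y) for d, ys in data.items() for y in ys]

# Sample standard deviation at each dose, as a rough check of equal spread.
group_sds = {d: statistics.stdev(ys) for d, ys in data.items()}
for d in doses:
    print(d, round(statistics.mean(data[d]), 2), round(group_sds[d], 2))
```

With only 10 values per dose the sample standard deviations will differ somewhat even though the model uses a single σ, so moderate differences between groups are not by themselves evidence against the assumption.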
Clam lengths
In a study to determine the effects of thermal pollution on Corbicula fluminea (Asiatic clams), samples of clams were obtained from three different geographical locations. The diagram below shows the lengths of these clams.
Since the explanatory variable (location) is categorical, there is no requirement for a 'smooth' relationship between the mean clam length and location. We might model the effect of this variable by allowing arbitrary means for the three locations.
Again select Model from the pop-up menu to display one such model. Note that this model does not imply that the average lengths differ much between the locations.
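Allowing an arbitrary mean for each location can be sketched as follows. The location labels A, B and C and all of the lengths are made-up values for illustration, not the clam data from the study. The sketch estimates each mean by the sample mean at that location and estimates the common σ by pooling the residuals, which is one usual choice and not necessarily the method used to produce the diagram.

```python
import math
import statistics

# Hypothetical lengths at three hypothetical locations (not the study data).
lengths = {
    "A": [7.2, 6.8, 7.5, 7.0, 6.5],
    "B": [6.9, 7.4, 7.1, 6.6, 7.0],
    "C": [7.3, 6.7, 7.6, 7.1, 6.8],
}

# One separate mean per location: no smooth relationship is imposed.
means = {loc: statistics.mean(ys) for loc, ys in lengths.items()}

# Residuals are the unexplained part of each observation.
residuals = {loc: [y - means[loc] for y in ys] for loc, ys in lengths.items()}

# Pooled estimate of the common standard deviation:
# residual sum of squares divided by (observations - number of means).
n_obs = sum(len(ys) for ys in lengths.values())
resid_ss = sum(r * r for rs in residuals.values() for r in rs)
pooled_sd = math.sqrt(resid_ss / (n_obs - len(lengths)))

for loc in lengths:
    print(loc, round(means[loc], 2))
print("pooled sd:", round(pooled_sd, 3))
```

In these illustrative values the three estimated means are close together, which matches the point that fitting separate means does not by itself imply that the locations differ much.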