Smoothness in a categorical model

The previous section described a model without interaction for experiments with two factors, X and Z. For the k'th replicate observation when factor X is at level i and factor Z is at level j, the response was modelled by the following equation:


\[
y_{ijk} = \mu + \underbrace{\beta_i}_{\text{explained by X}} + \underbrace{\gamma_j}_{\text{explained by Z}} + \underbrace{\varepsilon_{ijk}}_{\text{unexplained}}
\]

where β1 = 0 and γ1 = 0.

If X has a levels, this model has (a - 1) unknown parameters, β2, β3, ..., βa, corresponding to the levels of the factor. This flexibility allows the factor levels to have arbitrary effects on the response.
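As a rough illustration of this parameterisation, the sketch below builds the indicator columns for a no-interaction model with the baseline constraints β1 = 0 and γ1 = 0 and fits it by least squares. The function name design_matrix, the choice of 4 levels per factor with 2 replicates, and the parameter values are all hypothetical; the responses are synthetic and noise-free, not data from any experiment.

```python
import numpy as np

def design_matrix(x_idx, z_idx, a, b):
    """Indicator columns for y = mu + beta_i + gamma_j + error.

    Levels are coded 0..a-1 and 0..b-1. Level 0 is the baseline,
    so it gets no column (this encodes beta_1 = 0 and gamma_1 = 0).
    """
    n = len(x_idx)
    X = np.zeros((n, 1 + (a - 1) + (b - 1)))
    X[:, 0] = 1.0                          # column for mu
    for row, (i, j) in enumerate(zip(x_idx, z_idx)):
        if i > 0:
            X[row, i] = 1.0                # columns 1..a-1: beta_2..beta_a
        if j > 0:
            X[row, (a - 1) + j] = 1.0      # columns a..a+b-2: gamma_2..gamma_b
    return X

# Hypothetical balanced layout: 4 x 4 treatment combinations, 2 replicates each.
a, b, reps = 4, 4, 2
x_idx = np.repeat(np.arange(a), b * reps)
z_idx = np.tile(np.repeat(np.arange(b), reps), a)

# Synthetic, noise-free responses generated from made-up parameter values.
mu = 10.0
beta = np.array([0.0, 1.0, -2.0, 3.0])     # beta_1 = 0 by convention
gamma = np.array([0.0, 0.5, 1.5, -1.0])    # gamma_1 = 0 by convention
y = mu + beta[x_idx] + gamma[z_idx]

X = design_matrix(x_idx, z_idx, a, b)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

mu_hat = coef[0]
beta_hat = coef[1:a]       # the (a - 1) free parameters for X
gamma_hat = coef[a:]       # the (b - 1) free parameters for Z
```

Each factor contributes one free parameter per non-baseline level, so nothing in the fit ties the estimate for one level to the estimates for its neighbours.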

If the factor is numerical, the relationship between the factor level and the response is usually expected to be smooth. Refining the model to impose some form of smoothness on the relationship usually provides a better estimate of the effect of the factor.

A categorical model for a numerical factor often does not display a 'smooth' relationship and cannot be used to predict the response at intermediate values of the factor.

The same argument suggests that the set of parameters {γj} for the factor Z may be better replaced by a term that imposes some smoothness on the relationship.
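One simple way to impose smoothness, given here only as an illustration and not necessarily the form developed later, is to replace each set of categorical parameters by a linear term in the numerical value of the factor. The symbols x_i and z_j (the numerical values of level i of X and level j of Z) are notation assumed for this sketch, and β and γ now denote single slopes rather than sets of level effects:

```latex
y_{ijk} = \mu + \beta \, x_i + \gamma \, z_j + \varepsilon_{ijk}
```

Under this assumed form, each factor contributes one unknown parameter instead of one per non-baseline level, and the fitted model can be evaluated at values of x and z that lie between the levels used in the experiment.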

Soybean yield and trace elements

An experiment was conducted to compare the effect of four manganese rates (from MnSO4) and four copper rates (from CuSO4·5H2O) on the yield of soybeans. A large field was subdivided into 32 separate plots. Two plots were randomly assigned to each of the 16 combinations of levels of the two factors. Soybeans were then planted over the entire field in rows 3 feet apart. The yields (in kg per hectare) are shown in the diagram below.

Observe that:

The model does not allow us to predict the yield at intermediate values of the factors. For example, we cannot use it to predict yield if Mn is 100 and Cu is 6.