AUPREDICT procedure

Forms predictions from an unbalanced analysis of variance, performed by AUNBALANCED (R.W. Payne).


PRINT = string tokens What to print (description, predictions, se, sed, sedsummary, ese, lsd, lsdsummary, vcovariance); default pred, sed
MODEL = formula Model to use to calculate the predictions; default * i.e. full model fitted by AUNBALANCED
FACTORIAL = scalar Limit on number of factors or variates in each term specified by MODEL; default 3
COMBINATIONS = string token Factor combinations for which to form predicted means (present, estimable); default esti
ADJUSTMENT = string token Type of adjustment to be made when predicting means (marginal, equal, observed); default marg
WEIGHTS = table Weights classified by some or all of the factors in the model
PREDICTIONS = tables or scalars Saves predictions; default *
SE = tables or scalars Saves standard errors of predictions; default *
SED = symmetric matrices Saves matrices of standard errors of differences between predictions; default *
ESE = table Saves effective standard errors; default *
LSD = symmetric matrix Saves least significant differences between predictions; default *
LSDLEVEL = scalar Significance level (%) for least significant differences; default 5
VCOVARIANCE = symmetric matrices Saves variance-covariance matrices of predictions; default *
SAVE = identifier Save structure (from AUNBALANCED) containing details of the analysis for which predictions are required; if omitted, output is from the most recent use of AUNBALANCED


CLASSIFY = vectors Variates and/or factors to classify table of predictions
LEVELS = variates or scalars To specify values of variates, levels of factors


AUPREDICT can produce predicted means following an analysis of variance of an unbalanced design by AUNBALANCED. The predictions are calculated using the PREDICT directive. The first step (A) of the calculation forms a full table of predictions, classified by every factor in the model. The second step (B) averages the full table over the factors that do not occur in the table of means.

The COMBINATIONS option specifies which cells of the full table are to be formed in Step A. The default setting, estimable, fills in all the cells other than those that involve parameters that cannot be estimated, for example because of aliasing. Alternatively, setting COMBINATIONS=present excludes the cells for factor combinations that do not occur in the data.

The ADJUSTMENT and WEIGHTS options then define how the averaging is done in Step B. The WEIGHTS option allows you to specify your own table of weights to use in the averaging. Alternatively, if WEIGHTS is not set, the weights are formed automatically according to the setting of the ADJUSTMENT option. The default setting, marginal, of ADJUSTMENT forms a table of marginal weights for each factor, containing the proportion of observations with each of its levels; the full table of weights is then formed from the product of the marginal tables. The setting equal weights all the combinations equally. Finally, the setting observed uses the WEIGHTS option of PREDICT to weight each factor combination according to its own individual replication in the data.
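
As a rough sketch of how the Step A and Step B choices might be set, the commands below use the option names and settings listed above. They assume that the AUNBALANCED analysis from the example at the end of this page has just been run, so that the factor Leachate is available; the choice of settings is purely illustrative.

```genstat
" Step A: form only the factor combinations present in the data. "
" Step B: average over the other factors with equal weights. "
AUPREDICT [COMBINATIONS=present; ADJUSTMENT=equal] Leachate

" For comparison, the defaults: estimable cells and marginal weights. "
AUPREDICT Leachate
```

Comparing the two tables shows how sensitive the predicted means are to the averaging policy; large discrepancies would suggest that the weighting deserves closer thought.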

Printed output, which extends the output available from PREDICT, is controlled by settings of the PRINT option:

    description standardization policies used when forming the predictions,
    predictions predictions,
    se predictions and standard errors,
    sed standard errors for differences between the predictions,
    sedsummary summary of the standard errors for differences between the predictions,
    lsd least significant differences between the predictions,
    lsdsummary summary of the least significant differences between the predictions,
    ese approximate effective standard errors – these are formed by procedure SED2ESE with the aim of allowing good approximations to the standard errors for differences to be calculated by the usual formula $\mathrm{sed}_{i,j} = \sqrt{\mathrm{ese}_i^2 + \mathrm{ese}_j^2}$, and
    vcovariance variance and covariances of the predictions.
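
To illustrate how effective standard errors are meant to be used, here is a small worked calculation with made-up values (0.30 and 0.40 are assumptions chosen only to keep the arithmetic simple, not output from any analysis):

```latex
\[
\mathrm{sed}_{1,2} \approx \sqrt{\mathrm{ese}_1^2 + \mathrm{ese}_2^2}
  = \sqrt{0.30^2 + 0.40^2}
  = \sqrt{0.09 + 0.16}
  = \sqrt{0.25}
  = 0.50
\]
```

The result is an approximation to the true standard error of the difference, which is available exactly from the sed setting.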

The default is to print predictions and a summary of the standard errors of differences. The standard errors (and sed’s) are relevant for the predictions when considered as means of those data that have been analysed, with the means formed according to the averaging policy defined by the options of PREDICT. The word prediction is used because these are predictions of what the means would have been if the factor levels had been replicated differently in the data; see Lane & Nelder (1982) for more details. The LSDLEVEL option specifies the significance level (%) to use in the calculation of least significant differences (default 5%).
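
A minimal sketch of requesting non-default output, again assuming that the AUNBALANCED analysis from the example at the end of this page has just been run. The particular combination of PRINT settings and the 1% level are illustrative choices only.

```genstat
" Predictions with standard errors, plus a summary of the least "
" significant differences calculated at the 1% level. "
AUPREDICT [PRINT=predictions,se,lsdsummary; LSDLEVEL=1] Leachate
```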

Another extension in AUPREDICT is that you can produce predictions using a smaller model than the full model that has been fitted by AUNBALANCED. This can be useful if the full model contains many parameters. A substantial amount of time and computer workspace may then be needed to calculate the predictions and standard errors. Very large models may even exceed the capacity of some PCs.

You might choose to omit a term from the full model when forming a particular table of predictions if the term is orthogonal to all the terms involved in the table. For example, you might omit the term blocks when forming an A-by-B table of predictions if each combination of levels of the factors A and B is replicated the same number of times in every block. The justification is that an orthogonal term cannot affect the size of any of the differences between predictions. Different weighting of the levels of the orthogonal term may affect the overall mean of the predictions, but this is usually unimportant. If you omit the term, it is as though you had included it with weightings based on the observed replication of its levels in the data set – and in any well-designed data set these should provide a satisfactory outcome. You might also omit a term if it is nearly orthogonal to the terms involved in the table, and you are happy to ignore its effect on the predictions.

The model is specified by the MODEL option. The FACTORIAL option sets a limit on the number of factors or variates in each term specified by MODEL (default 3).
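
As a sketch of predicting from a smaller model, the command below leaves the blocking factor out of the model used for prediction. It assumes the factors Leachate and Dilution from the example at the end of this page, and that the analysis there included a Block term. Whether omitting Block is acceptable depends on how close to orthogonal it is in your own data.

```genstat
" Form the Leachate-by-Dilution table from the treatment terms only. "
AUPREDICT [MODEL=Leachate*Dilution; FACTORIAL=2] Leachate,Dilution
```

With FACTORIAL=2 the two-factor interaction is retained; a setting of 1 would restrict the model to the main effects.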

The PREDICTIONS, SE, SED, ESE, LSD and VCOVARIANCE options allow the results of the prediction to be saved in appropriate Genstat data structures.
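
A hedged sketch of saving results for later use. The identifiers LDmean, LDse and LDsed are hypothetical names chosen for this illustration, and PRINT=* is assumed to suppress the printed output in the usual Genstat way.

```genstat
" Save the two-way table of predictions, its standard errors and the "
" matrix of standard errors of differences, without printing them. "
AUPREDICT [PRINT=*; PREDICTIONS=LDmean; SE=LDse; SED=LDsed] \
          Leachate,Dilution

" The saved structures can then be printed or used in calculations. "
PRINT     LDmean,LDse
PRINT     LDsed
```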

The SAVE option allows you to specify the save structure from the analysis for which further output is required. If SAVE is not set, output will be produced for the most recent analysis from AUNBALANCED; however, no Genstat regression directive (MODEL, TERMS, FIT, ADD, DROP and so on) may have been used in the interim.
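
The following sketch shows one way the save structure might be passed between the two procedures. It assumes that AUNBALANCED provides an option, here taken to be called SAVE, for storing its save structure; check the AUNBALANCED documentation for the exact name. HatchSave is a hypothetical identifier, and Logit%h and Leachate come from the example at the end of this page.

```genstat
" Keep the details of the analysis in a named save structure. "
AUNBALANCED [PRINT=aovtable; SAVE=HatchSave] Logit%h

" Other regression commands could be used here without losing it. "

" Request predictions for the saved analysis explicitly. "
AUPREDICT   [SAVE=HatchSave] Leachate
```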




The predictions are produced using the PREDICT directive.


Lane, P.W. & Nelder, J.A. (1982). Analysis of covariance and standardization as instances of prediction. Biometrics, 38, 613-621.

See also

Directive: PREDICT.


Commands for: Analysis of variance.


" Data from Genstat 5 Release 1 Reference Manual, page 340. "
FACTOR  [NVALUES=36; LEVELS=3; VALUES=12(1...3)] Block
FACTOR  [NVALUES=36; LABELS=!t(baresoil,emerald,emergo)] Leachate
&       [LABELS=!t('1','1/4','1/16','1/64')] Dilution
VARIATE [NVALUES=36] Nhatch,Nnohatch
READ    Leachate,Dilution,Nhatch,Nnohatch
  1           2         109         318
  3           4          54         350
  3           1           *         415
  2           2         783         212
  3           3         652        1375
  2           4         490         816
  1           3          95        1219
  2           1        1012          66
  1           4         166         943
  3           2        1059         313
  1           1         257        1006
  2           3        1058         234
  2           4         507        1119
  1           2         194         840
  1           3         175        1707
  1           1         326         609
  3           4         142         980
  2           3         286         230
  3           2         546         313
  2           2           *         301
  2           1        2471         112
  3           3          76         489
  1           4         208         503
  3           1           *         325
  1           1         322         913
  1           2         255        2246
  3           2        1774        1446
  2           2         999         193
  2           4         388        1836
  3           4         221        1800
  1           3         220        1902
  2           1        2821         187
  3           1        1486         463
  3           3         717        1473
  1           4         143         941
  2           3         968         550 :
CALCULATE          Logit%h = LOG(Nhatch/Nnohatch)
BLOCKSTRUCTURE     Block
TREATMENTSTRUCTURE Leachate*Dilution
AUNBALANCED        [PRINT=aovtable] Logit%h
AUPREDICT          Leachate
&                  Dilution
&                  Leachate,Dilution
Updated on June 20, 2019
