AUNBALANCED procedure

Performs analysis of variance for unbalanced designs (R.W. Payne).

Options

`PRINT` = string tokens	Controls printed output from the analysis (`aovtable`, `effects`, `means`, `residuals`, `screen`, `%cv`); default `aovt`, `mean`
`FACTORIAL` = scalar	Limit on number of factors in a treatment term; default 3
`PFACTORIAL` = scalar	Limit on number of factors in printed tables of predicted means; default 3
`NOMESSAGE` = string tokens	Which warning messages to suppress (`dispersion`, `leverage`, `residual`, `aliasing`, `marginality`, `vertical`, `df`, `inflation`); default `*` i.e. none
`FPROBABILITY` = string token	Printing of probabilities for variance ratios in the analysis-of-variance table (`yes`, `no`); default `no`
`TPROBABILITY` = string token	Printing of probabilities for t-tests of effects (`yes`, `no`); default `no`
`PLOT` = string tokens	Which residual plots to provide (`fittedvalues`, `normal`, `halfnormal`, `histogram`); default `*` i.e. none
`COMBINATIONS` = string token	Factor combinations for which to form predicted means (`present`, `estimable`); default `esti`
`ADJUSTMENT` = string token	Type of adjustment to be made when predicting means (`marginal`, `equal`, `observed`); default `marg`
`WEIGHTS` = variate	Weights for each unit; default `*` i.e. all units with weight one
`PSE` = string tokens	Types of standard errors to be printed with the predicted means (`differences`, `alldifferences`, `lsd`, `alllsd`, `means`, `ese`); default `diff`
`LSDLEVEL` = scalar	Significance level (%) for least significant differences; default 5
`RMETHOD` = string token	Type of residuals to plot (`simple`, `standardized`); default `simp`

Parameters

`Y` = variates	Data values to be analysed
`RESIDUALS` = variates	Variate to save the residuals from each analysis
`FITTEDVALUES` = variates	Variate to save the fitted values from each analysis
`SAVE` = identifiers	To save details of each analysis to use subsequently with the `AUDISPLAY` procedure

Description

This procedure carries out analysis of variance using the regression directives in Genstat. It is particularly useful for unbalanced designs, which therefore cannot be analysed by the ANOVA directive.

The method of use is similar to that for ANOVA. The treatment terms to be fitted must be specified, before calling the procedure, by the TREATMENTSTRUCTURE directive. Similarly, any covariates must be indicated by the COVARIATE directive. The procedure also takes account of any blocking structure specified by the BLOCKSTRUCTURE directive. However, it cannot produce stratified analyses like those generated by ANOVA, and is able to estimate treatments and covariates only in the “bottom stratum”. So, for example, the full analysis can be produced for a randomized block design, where the treatments are all estimated on the plots within blocks, but it cannot produce the whole-plot analysis in a split plot design.
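As a minimal sketch of this set-up, the statements below define a blocking structure, a treatment structure and a covariate before calling the procedure. The identifiers `Blocks`, `A`, `B`, `Initial` and `Yield` are hypothetical and are assumed to have been declared and given values already.

```genstat
"Hypothetical data: factors Blocks, A and B; variates Initial and Yield"
BLOCKSTRUCTURE     Blocks
TREATMENTSTRUCTURE A*B
COVARIATE          Initial
"Treatments and covariate are estimated within blocks (the bottom stratum)"
AUNBALANCED        Yield
```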

The parameters of the procedure are identical to those of ANOVA. The variates to be analysed are specified by the Y parameter. Residuals and fitted values can be saved using the RESIDUALS and FITTEDVALUES parameters respectively. Finally, the SAVE parameter allows details of the analysis to be saved so that further output can be obtained using the AUDISPLAY procedure, or information can be copied into Genstat data structures using the AUKEEP procedure. (Note that this is a regression save structure, not an ANOVA structure, so it cannot be used with the directives ADISPLAY or AKEEP.)
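For illustration, a call that saves residuals, fitted values and the save structure might look like the sketch below. The names `Yield`, `Res`, `Fit` and `Usave` are hypothetical, and the `PRINT` and `PSE` options shown for `AUDISPLAY` are assumed to mirror those of `AUNBALANCED`; check the `AUDISPLAY` documentation before relying on them.

```genstat
"Fit the model, keeping residuals, fitted values and the regression save structure"
AUNBALANCED [PRINT=aovtable] Y=Yield; RESIDUALS=Res; FITTEDVALUES=Fit;\
            SAVE=Usave
"Later: request further output from the saved analysis"
AUDISPLAY   [PRINT=means; PSE=lsd] SAVE=Usave
```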

Printed output is controlled by the PRINT option, with settings: aovtable to print the analysis-of-variance table, effects to print the effects (as estimated by Genstat regression), means to print tables of predicted means with standard errors, residuals to print residuals and fitted values, screen to print “screening” tests for treatment terms, and %cv to print the coefficient of variation. The default is to print the analysis-of-variance table and tables of means.

The model is fitted sequentially: first any block terms, then any covariates, and then the treatments. Thus, the sum of squares in each line of the analysis-of-variance table is for the term concerned, eliminating the effects of terms in earlier lines and ignoring the effects of terms lower in the table. In particular, the sums of squares for covariates ignore treatments, rather than being calculated after eliminating treatments (as with the ANOVA directive). Alternatively, the screen setting calls the RSCREEN procedure to provide screening tests for the treatment terms: marginal tests assess the effect of adding each term to the simplest possible model (i.e. a model containing any blocks and covariates, and any terms marginal to the term), while conditional tests assess the effect of adding each term to the fullest possible model (i.e. a model containing all terms other than those to which the term is marginal). For example, if we have

BLOCKSTRUCTURE Blocks

and

TREATMENTSTRUCTURE A + B + A.B

the marginal test for A will show the effect of adding A to a model containing only Blocks, while the conditional test will show the effect of adding A to a model containing Blocks and B. (The terms A and B are marginal to A.B.)
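A sketch of how these tests might be requested for the model above is given below, with `Yield` as a hypothetical response variate. Because the sequential sums of squares depend on the order of fitting, the second call refits with `B` ahead of `A` so the two analysis-of-variance tables can be compared.

```genstat
BLOCKSTRUCTURE     Blocks
TREATMENTSTRUCTURE A + B + A.B
"Sequential table plus marginal and conditional screening tests"
AUNBALANCED        [PRINT=aovtable,screen; FPROBABILITY=yes] Yield
"Same terms in the other order: the lines for A and B may change"
TREATMENTSTRUCTURE B + A + A.B
AUNBALANCED        [PRINT=aovtable; FPROBABILITY=yes] Yield
```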

Tables of means are calculated using the PREDICT directive. The first step (A) of the calculation forms the full table of predictions, classified by every factor in the model. The second step (B) averages the full table over the factors that do not occur in the table of means. The COMBINATIONS option specifies which cells of the full table are to be formed in Step A. The default setting, estimable, fills in all the cells other than those that involve parameters that cannot be estimated, for example because of aliasing. Alternatively, setting COMBINATIONS=present excludes the cells for factor combinations that do not occur in the data. The ADJUSTMENT option then defines how the averaging is done in Step B. The default setting, marginal, forms a table of marginal weights for each factor, containing the proportion of observations with each of its levels; the full table of weights is then formed from the product of the marginal tables. The setting equal weights all the combinations equally. Finally, the setting observed uses the WEIGHTS option of PREDICT to weight each factor combination according to its own individual replication in the data.
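The calls below sketch how the settings described above might be combined. `Yield` is a hypothetical variate, and the block and treatment structures are assumed to have been defined already.

```genstat
"Default: all estimable cells, averaged using marginal weights"
AUNBALANCED [PRINT=means] Yield
"Only the factor combinations present in the data, weighted equally"
AUNBALANCED [PRINT=means; COMBINATIONS=present; ADJUSTMENT=equal] Yield
"Each combination weighted by its own replication in the data"
AUNBALANCED [PRINT=means; ADJUSTMENT=observed] Yield
```

In an unbalanced data set the three calls will generally give different predicted means, so the choice of adjustment should be reported alongside the tables.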

The PSE option controls the types of standard errors that are produced to accompany the tables of means, with settings:

`differences`	summary of standard errors for differences between pairs of means;
`alldifferences`	standard errors for differences between all pairs of means;
`lsd`	summary of least significant differences between pairs of means;
`alllsd`	least significant differences between all pairs of means;
`means`	standard errors of the means (relevant for comparing them with zero);
`ese`	approximate effective standard errors; these are formed by procedure `SED2ESE` with the aim of allowing good approximations to the standard errors for differences to be calculated by the usual formula sed_ij = √(ese_i² + ese_j²).

The default is differences. The LSDLEVEL option sets the significance level (as a percentage) for the least significant differences.
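For example, the following sketch requests means with a summary of least significant differences, the full set of least significant differences and effective standard errors, all at the 1% level. `Yield` is a hypothetical variate.

```genstat
"LSDs at the 1% level, plus approximate effective standard errors"
AUNBALANCED [PRINT=means; PSE=lsd,alllsd,ese; LSDLEVEL=1] Yield
```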

The FACTORIAL option sets a limit on the number of factors that a higher-order term, such as an interaction, can contain; any terms with more factors are deleted from the analysis. Similarly, the PFACTORIAL option limits the number of factors in terms for which predicted means are printed. Probabilities can be printed for variance ratios by setting option FPROBABILITY=yes, and probabilities for t-tests of effects by setting option TPROBABILITY=yes. The WEIGHTS option allows a variate of weights to be specified for a weighted analysis of variance. The NOMESSAGE option allows various warning messages (produced by the FIT directive) to be suppressed, and the PLOT option allows various residual plots to be requested: fittedvalues for a plot of residuals against fitted values, normal for a Normal plot, halfnormal for a half Normal plot, and histogram for a histogram of residuals. By default, simple residuals are plotted, but you can set option RMETHOD=standardized to plot standardized residuals instead.
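A call combining several of these options might look like the sketch below. The variates `Yield` and `W` are hypothetical, and the particular messages suppressed are chosen only for illustration.

```genstat
"Main effects and two-factor interactions only, with probabilities,"
"a weighted fit, and plots of standardized residuals"
AUNBALANCED [PRINT=aovtable,effects; FACTORIAL=2; PFACTORIAL=2;\
            FPROBABILITY=yes; TPROBABILITY=yes; WEIGHTS=W;\
            NOMESSAGE=leverage,residual;\
            PLOT=fittedvalues,normal; RMETHOD=standardized] Yield
```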

Options: PRINT, FACTORIAL, PFACTORIAL, NOMESSAGE, FPROBABILITY, TPROBABILITY, PLOT, COMBINATIONS, ADJUSTMENT, PSE, WEIGHTS, LSDLEVEL, RMETHOD.

Parameters: Y, RESIDUALS, FITTEDVALUES, SAVE.

Method

The y-variate is specified using the MODEL directive, along with any variates to save residuals and fitted values. The current settings of the TREATMENTSTRUCTURE and COVARIATE directives are recovered using the GET directive, and used to define the terms in the analysis (using the TERMS directive). The model is then fitted (using FIT), and AUDISPLAY is called to print the output and produce any plots of residuals.
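The sequence of regression directives can be approximated by hand, as in the sketch below. This is not the procedure's source code: the model formula is written out explicitly rather than recovered from the current settings, and the identifiers `Yield`, `Res`, `Fit`, `Blocks`, `A` and `B` are hypothetical.

```genstat
"Define the response and the variates that will hold residuals and fitted values"
MODEL Yield; RESIDUALS=Res; FITTEDVALUES=Fit
"Declare the full set of terms: blocks first, then treatments"
TERMS Blocks + A + B + A.B
"Fit sequentially and print the accumulated analysis of variance"
FIT   [PRINT=accumulated; FPROBABILITY=yes] Blocks + A + B + A.B
```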

Action with `RESTRICT`

If the Y variate is restricted, only the units not excluded by the restriction will be analysed.
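As a sketch, the statements below restrict a hypothetical variate `Yield` to exclude the units in the third block, run the analysis on the remaining units, and then remove the restriction. `Blocks` is assumed to be a factor with levels 1, 2 and 3.

```genstat
"Exclude units in block 3 from the analysis"
RESTRICT    Yield; CONDITION=Blocks.NE.3
AUNBALANCED Yield
"Remove the restriction afterwards"
RESTRICT    Yield
```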

Example

CAPTION 'AUNBALANCED example',\
        'Data from Genstat 5 Release 1 Reference Manual, page 340.';\
        STYLE=meta,plain
FACTOR  [NVALUES=36; LEVELS=3; VALUES=12(1...3)] Block
FACTOR  [NVALUES=36; LABELS=!t(baresoil,emerald,emergo)] Leachate
&       [LABELS=!t('1','1/4','1/16','1/64')] Dilution
VARIATE [NVALUES=36] Nhatch,Nnohatch
READ    Leachate,Dilution,Nhatch,Nnohatch
  1           2         109         318
  3           4          54         350
  3           1           *         415
  2           2         783         212
  3           3         652        1375
  2           4         490         816
  1           3          95        1219
  2           1        1012          66
  1           4         166         943
  3           2        1059         313
  1           1         257        1006
  2           3        1058         234
  2           4         507        1119
  1           2         194         840
  1           3         175        1707
  1           1         326         609
  3           4         142         980
  2           3         286         230
  3           2         546         313
  2           2           *         301
  2           1        2471         112
  3           3          76         489
  1           4         208         503
  3           1           *         325
  1           1         322         913
  1           2         255        2246
  3           2        1774        1446
  2           2         999         193
  2           4         388        1836
  3           4         221        1800
  1           3         220        1902
  2           1        2821         187
  3           1        1486         463
  3           3         717        1473
  1           4         143         941
  2           3         968         550 :
CALCULATE          Logit%h = LOG(Nhatch/Nnohatch)
BLOCKSTRUCTURE     Block
TREATMENTSTRUCTURE Leachate*Dilution
AUNBALANCED        [PSE=differences,alldifferences] Logit%h

Updated on June 20, 2019
