# AUNBALANCED procedure

Performs analysis of variance for unbalanced designs (R.W. Payne).

### Options

| Option | Description |
|---|---|
| `PRINT` = string tokens | Controls printed output from the analysis (`aovtable`, `effects`, `means`, `residuals`, `screen`, `%cv`); default `aovt`, `mean` |
| `FACTORIAL` = scalar | Limit on number of factors in a treatment term; default 3 |
| `PFACTORIAL` = scalar | Limit on number of factors in printed tables of predicted means; default 3 |
| `NOMESSAGE` = string tokens | Which warning messages to suppress (`dispersion`, `leverage`, `residual`, `aliasing`, `marginality`, `vertical`, `df`, `inflation`); default `*` i.e. none |
| `FPROBABILITY` = string token | Printing of probabilities for variance ratios in the analysis-of-variance table (`yes`, `no`); default `no` |
| `TPROBABILITY` = string token | Printing of probabilities for t-tests of effects (`yes`, `no`); default `no` |
| `PLOT` = string tokens | Which residual plots to provide (`fittedvalues`, `normal`, `halfnormal`, `histogram`); default `*` i.e. none |
| `COMBINATIONS` = string token | Factor combinations for which to form predicted means (`present`, `estimable`); default `esti` |
| `ADJUSTMENT` = string token | Type of adjustment to be made when predicting means (`marginal`, `equal`, `observed`); default `marg` |
| `WEIGHTS` = variate | Weights for each unit; default `*` i.e. all units with weight one |
| `PSE` = string tokens | Types of standard errors to be printed with the predicted means (`differences`, `alldifferences`, `lsd`, `alllsd`, `means`, `ese`); default `diff` |
| `LSDLEVEL` = scalar | Significance level (%) for least significant differences; default 5 |
| `RMETHOD` = string token | Type of residuals to plot (`simple`, `standardized`); default `simp` |

### Parameters

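| Parameter | Description |
|---|---|
| `Y` = variates | Data values to be analysed |
| `RESIDUALS` = variates | Variate to save the residuals from each analysis |
| `FITTEDVALUES` = variates | Variate to save the fitted values from each analysis |
| `SAVE` = regression save structures | To save details of each analysis to use subsequently with the `AUDISPLAY` procedure |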

### Description

This procedure carries out analysis of variance using the regression directives in Genstat. It is particularly useful for designs that are unbalanced and which thus cannot be analysed by the `ANOVA` directive.

The method of use is similar to that for `ANOVA`. The treatment terms to be fitted must be specified, before calling the procedure, by the `TREATMENTSTRUCTURE` directive. Similarly, any covariates must be indicated by the `COVARIATE` directive. The procedure also takes account of any blocking structure specified by the `BLOCKSTRUCTURE` directive. However, it cannot produce stratified analyses like those generated by `ANOVA`, and can estimate treatments and covariates only in the “bottom stratum”. So, for example, it can produce the full analysis for a randomized block design, where the treatments are all estimated on the plots within blocks, but it cannot produce the whole-plot analysis in a split-plot design.

The parameters of the procedure are identical to those of `ANOVA`. The variates to be analysed are specified by the `Y` parameter. Residuals and fitted values can be saved using the `RESIDUALS` and `FITTEDVALUES` parameters respectively. Finally, the `SAVE` parameter allows details of the analysis to be saved so that further output can be obtained using the `AUDISPLAY` procedure, or information can be copied into Genstat data structures using the `AUKEEP` procedure. (Note that this is a regression save structure, not an `ANOVA` structure, so it cannot be used with the directives `ADISPLAY` or `AKEEP`.)
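As a minimal sketch of how these parameters fit together, the call below saves the residuals, fitted values and save structure, and then requests further output without refitting. The identifiers `Res`, `Fit` and `AuSave` are illustrative names only, `Logit%h` is the y-variate from the example at the end of this page, and the block and treatment structures are assumed to have been set beforehand. The `PRINT=effects` setting on `AUDISPLAY` is assumed to mirror the setting of the same name in `AUNBALANCED`.

```
" Fit the model, keeping residuals, fitted values and the regression save structure "
AUNBALANCED [PRINT=aovtable] Y=Logit%h; RESIDUALS=Res; FITTEDVALUES=Fit; SAVE=AuSave
" Later, obtain further output from the saved analysis (PRINT setting assumed) "
AUDISPLAY   [PRINT=effects] SAVE=AuSave
```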

Printed output is controlled by the `PRINT` option, with settings: `aovtable` to print the analysis-of-variance table, `effects` to print the effects (as estimated by Genstat regression), `means` to print tables of predicted means with standard errors, `residuals` to print residuals and fitted values, `screen` to print “screening” tests for treatment terms, and `%cv` to print the coefficient of variation. The default is to print the analysis-of-variance table and tables of means.

The model is fitted sequentially: first any block terms, then any covariates, and then the treatments. Thus, the sum of squares in each line of the analysis-of-variance table is for the term concerned, eliminating the effects of terms in earlier lines and ignoring the effects of terms lower in the table. In particular, the sums of squares for covariates ignore treatments, rather than eliminating them (as the `ANOVA` directive does). Alternatively, the `screen` setting calls the `RSCREEN` procedure to provide two kinds of screening test for the treatment terms. Marginal tests assess the effect of adding each term to the simplest possible model (i.e. a model containing any blocks and covariates, and any terms marginal to the term). Conditional tests assess the effect of adding each term to the fullest possible model (i.e. a model containing all terms other than those to which the term is marginal). For example, if we have

`BLOCKSTRUCTURE Blocks`

and

`TREATMENTSTRUCTURE A + B + A.B`

the marginal test for `A` will show the effect of adding `A` to a model containing only `Blocks`, while the conditional test will show the effect of adding `A` to a model containing `Blocks` and `B`. (The terms `A` and `B` are marginal to `A.B`.)
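The screening tests for this model could be requested along the following lines. This is only a sketch: `Blocks`, `A` and `B` are the factors of the example above, and `Yield` is a hypothetical y-variate assumed to have been declared and given values already.

```
BLOCKSTRUCTURE     Blocks
TREATMENTSTRUCTURE A + B + A.B
" Sequential analysis-of-variance table plus marginal and conditional screening tests "
AUNBALANCED        [PRINT=aovtable,screen; FPROBABILITY=yes] Yield
```

Comparing the sequential table with the screening tests shows how far the sum of squares for each term depends on the order of fitting.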

Tables of means are calculated using the `PREDICT` directive. The first step (A) of the calculation forms the full table of predictions, classified by every factor in the model. The second step (B) averages the full table over the factors that do not occur in the table of means. The `COMBINATIONS` option specifies which cells of the full table are to be formed in Step A. The default setting, `estimable`, fills in all the cells other than those that involve parameters that cannot be estimated, for example because of aliasing. Alternatively, setting `COMBINATIONS=present` excludes the cells for factor combinations that do not occur in the data. The `ADJUSTMENT` option then defines how the averaging is done in Step B. The default setting, `marginal`, forms a table of marginal weights for each factor, containing the proportion of observations with each of its levels; the full table of weights is then formed from the product of the marginal tables. The setting `equal` weights all the combinations equally. Finally, the setting `observed` uses the `WEIGHTS` option of `PREDICT` to weight each factor combination according to its own individual replication in the data.
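To illustrate the two options together, the sketch below forms predicted means only over factor combinations present in the data and averages them with equal weights. It assumes the variate `Logit%h` and the block and treatment structures from the example at the end of this page.

```
" Predicted means over observed factor combinations, averaged with equal weights "
AUNBALANCED [PRINT=means; COMBINATIONS=present; ADJUSTMENT=equal] Logit%h
```

As a hypothetical illustration of the default `marginal` setting: if one factor had two levels occurring in proportions 0.25 and 0.75, and a second factor had two levels each occurring in proportion 0.5, the product rule would give the four cells weights 0.125, 0.125, 0.375 and 0.375.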

The `PSE` option controls the types of standard errors that are produced to accompany the tables of means, with settings:

| Setting | Standard errors produced |
|---|---|
| `differences` | summary of standard errors for differences between pairs of means |
| `alldifferences` | standard errors for differences between all pairs of means |
| `lsd` | summary of least significant differences between pairs of means |
| `alllsd` | least significant differences between all pairs of means |
| `means` | standard errors of the means (relevant for comparing them with zero) |
| `ese` | approximate effective standard errors; these are formed by procedure `SED2ESE` with the aim of allowing good approximations to the standard errors for differences to be calculated by the usual formula $\mathrm{sed}_{i,j} = \sqrt{\mathrm{ese}_i^2 + \mathrm{ese}_j^2}$ |

The default is `differences`. The `LSDLEVEL` option sets the significance level (as a percentage) for the least significant differences.
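For example, a call along the following lines would print the tables of means with a summary of least significant differences at the 1% level, together with effective standard errors. It is a sketch that assumes the variate `Logit%h` and the model structures from the example at the end of this page.

```
" Means with summary LSDs at the 1% level and approximate effective standard errors "
AUNBALANCED [PRINT=means; PSE=lsd,ese; LSDLEVEL=1] Logit%h
```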

The `FACTORIAL` option sets a limit on the number of factors that a higher-order term, such as an interaction, can contain; any terms with more factors are deleted from the analysis. Similarly, the `PFACTORIAL` option limits the number of factors in terms for which predicted means are printed. Probabilities can be printed for variance ratios by setting option `FPROBABILITY=yes`, and probabilities for t-tests of effects by setting option `TPROBABILITY=yes`. The `WEIGHTS` option allows a variate of weights to be specified for a weighted analysis of variance. The `NOMESSAGE` option allows various warning messages (produced by the `FIT` directive) to be suppressed, and the `PLOT` option allows various residual plots to be requested: `fittedvalues` for a plot of residuals against fitted values, `normal` for a Normal plot, `halfnormal` for a half Normal plot, and `histogram` for a histogram of residuals. By default, simple residuals are plotted, but you can set option `RMETHOD=standardized` to plot standardized residuals instead.
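Several of these options can be combined in one call. The sketch below, which again assumes the variate `Logit%h` and the structures from the example at the end of this page, limits terms to two factors, prints probabilities, suppresses two classes of warning message, and plots standardized residuals. The particular combination of settings is illustrative, not a recommendation.

```
" Limit terms to two factors, print F and t probabilities, suppress leverage and "
" large-residual warnings, and plot standardized residuals "
AUNBALANCED [PRINT=aovtable,effects; FACTORIAL=2; FPROBABILITY=yes; TPROBABILITY=yes;\
            NOMESSAGE=leverage,residual; PLOT=fittedvalues,normal;\
            RMETHOD=standardized] Logit%h
```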

Options: `PRINT`, `FACTORIAL`, `PFACTORIAL`, `NOMESSAGE`, `FPROBABILITY`, `TPROBABILITY`, `PLOT`, `COMBINATIONS`, `ADJUSTMENT`, `PSE`, `WEIGHTS`, `LSDLEVEL`, `RMETHOD`.

Parameters: `Y`, `RESIDUALS`, `FITTEDVALUES`, `SAVE`.

### Method

The y-variate is specified using the `MODEL` directive, along with any variates to save residuals and fitted values. The current settings of the `TREATMENTSTRUCTURE` and `COVARIATE` directives are recovered using the `SET` directive, and used to define the terms in the analysis (using the `TERMS` directive). The model is then fitted (using `FIT`), and `AUDISPLAY` is called to print the output and produce any plots of residuals.
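The sequence of regression directives can be pictured roughly as follows for the example data at the end of this page. This is only a sketch of the general approach, not the procedure's own code: the model formula is written out by hand rather than recovered with `SET`, and `Res` and `Fit` are illustrative names.

```
" Rough regression equivalent for the example data; not the procedure's own code "
MODEL Logit%h; RESIDUALS=Res; FITTEDVALUES=Fit
TERMS Block + Leachate*Dilution
" Block is fitted first, then the treatment terms, giving a sequential analysis "
FIT   [PRINT=accumulated] Block + Leachate*Dilution
```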

### Action with `RESTRICT`

If the `Y` variate is restricted, only the units not excluded by the restriction will be analysed.
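For instance, the analysis could be limited to a subset of units as sketched below. The condition shown, which excludes the third block of the example data, is purely illustrative.

```
" Analyse only the units in blocks 1 and 2 "
RESTRICT    Logit%h; CONDITION=Block.NE.3
AUNBALANCED Logit%h
" Remove the restriction afterwards "
RESTRICT    Logit%h
```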

### See also

Directives: `ANOVA`, `REML`.

Procedures: `AUDISPLAY`, `AUGRAPH`, `AUPREDICT`, `AUMCOMPARISON`, `AUKEEP`.

Commands for: Analysis of variance.

### Example

```
CAPTION 'AUNBALANCED example',\
'Data from Genstat 5 Release 1 Reference Manual, page 340.';\
STYLE=meta,plain
FACTOR  [NVALUES=36; LEVELS=3; VALUES=12(1...3)] Block
FACTOR  [NVALUES=36; LABELS=!t(baresoil,emerald,emergo)] Leachate
&       [LABELS=!t('1','1/4','1/16','1/64')] Dilution
VARIATE [NVALUES=36] Nhatch,Nnohatch
READ    Leachate,Dilution,Nhatch,Nnohatch
1           2         109         318
3           4          54         350
3           1           *         415
2           2         783         212
3           3         652        1375
2           4         490         816
1           3          95        1219
2           1        1012          66
1           4         166         943
3           2        1059         313
1           1         257        1006
2           3        1058         234
2           4         507        1119
1           2         194         840
1           3         175        1707
1           1         326         609
3           4         142         980
2           3         286         230
3           2         546         313
2           2           *         301
2           1        2471         112
3           3          76         489
1           4         208         503
3           1           *         325
1           1         322         913
1           2         255        2246
3           2        1774        1446
2           2         999         193
2           4         388        1836
3           4         221        1800
1           3         220        1902
2           1        2821         187
3           1        1486         463
3           3         717        1473
1           4         143         941
2           3         968         550 :
CALCULATE          Logit%h = LOG(Nhatch/Nnohatch)
BLOCKSTRUCTURE     Block
TREATMENTSTRUCTURE Leachate*Dilution
AUNBALANCED        [PSE=differences,alldifferences] Logit%h
```
Updated on June 20, 2019