Copies information from an `ANOVA` analysis into Genstat data structures.

### Options

`FACTORIAL` = scalar | Limit on number of factors in a model term; default 3 |
---|---|
`STRATUM` = formula | Model term of the lowest stratum to be searched for effects; default `*` implies the lowest stratum |
`SUPPRESSHIGHER` = string token | Whether to suppress the searching of higher strata if a term is not found in `STRATUM` (`yes`, `no`); default `no` |
`TWOLEVEL` = string token | Representation of effects in 2^n experiments (`responses`, `Yates`, `effects`); default `resp` |
`RESIDUALS` = variate | Saves residuals from the final stratum (as in the `RESIDUALS` parameter of `ANOVA`) |
`FITTEDVALUES` = variate | Saves fitted values (data values or missing value estimates, minus the residuals from the final stratum – as in the `FITTEDVALUES` parameter of `ANOVA`) |
`CBRESIDUALS` = variate | Saves the sum of the residuals from all the strata |
`CBCREGRESSION` = variate | Saves the estimates of the covariate regression coefficients, combining information from all the strata |
`CBCVCOVARIANCE` = symmetric matrix | Saves the variance-covariance matrix of the combined estimates of the covariate regression coefficients |
`TREATMENTSTRUCTURE` = formula structure | Saves the treatment formula used for the analysis |
`BLOCKSTRUCTURE` = formula structure | Saves the block formula used for the analysis |
`AFACTORIAL` = scalar | Saves the setting of the `FACTORIAL` option used in the `ANOVA` command that performed the analysis |
`WEIGHTS` = variate | Saves the weights used in the analysis |
`YVARIATE` = dummy | Dummy to be set to the y-variate of the analysis |
`LSDLEVEL` = scalar | Significance level (%) to use in the calculation of least significant differences; default 5 |
`AOVTABLE` = pointer | Saves the analysis-of-variance table as a pointer with a variate or text for each column (source, d.f., s.s., m.s. etc) |
`EQFACTORS` = factors | Factors whose levels are to be assumed to be equal within the comparisons between means calculated for `SEMEANS` |
`RMETHOD` = string token | Type of residuals to form if the `RESIDUALS` option or parameter is set (`simple`, `standardized`); default `simp` |
`EXIT` = scalar | Saves an exit code indicating the properties of the design |
`SAVE` = identifier | Defines the Save structure (from `ANOVA`) that provides details of the analysis; default `*` gives that from the most recent `ANOVA` |

### Parameters

`TERMS` = formula | Model terms for which information is required |
---|---|
`MEANS` = tables | Table to store means for each term (available for treatment terms only) |
`SEMEANS` = tables | Table of effective standard errors for the means, usable for calculating standard errors for differences between means in the table, at equal levels of the factors specified by the `EQFACTORS` option |
`SEDMEANS` = symmetric matrices | Standard errors for comparisons between every pair of entries in the table of means |
`VCMEANS` = symmetric matrices | Variances and covariances of means |
`EFFECTS` = tables or scalars | Table or scalar (for terms with 1 d.f. when `TWOLEVEL=responses` or `Yates`) to store effects (for treatment terms only) |
`PARTIALEFFECTS` = tables | Table or scalar (for terms with 1 d.f. when `TWOLEVEL=responses` or `Yates`) to store partial effects (for treatment terms only) |
`REPLICATIONS` = tables or scalars | Table to store replications or scalar if they are all equal |
`RESIDUALS` = tables | Table to store residuals (for block terms only) |
`DF` = scalars | Number of degrees of freedom for each term |
`LSDMEANS` = symmetric matrices | Least significant differences of means |
`DFMEANS` = symmetric matrices | Degrees of freedom for comparisons between every pair of entries in the table of means |
`SS` = scalars | Sum of squares for each term |
`EFFICIENCY` = scalars | Efficiency factor for each term |
`VARIANCE` = scalars | Unit variance for the effects of each term |
`RTERM` = formula structures | Residual terms: for a treatment term this saves the lowest stratum where the term is estimated (down to the stratum specified by the `STRATUM` option); for a block term it saves all the strata to which it would be appropriate to compare the term |
`CEFFICIENCY` = scalars | Covariance efficiency factor for each term |
`CREGRESSION` = variates | Estimated regression coefficients for the covariates in the specified stratum |
`CVCOVARIANCE` = symmetric matrix | Variance-covariance matrix of the covariate regression coefficients in the specified stratum |
`CSSP` = symmetric matrices | Covariate sums of squares and products in the specified stratum |
`CONTRASTS` = pointers | Estimates for the fitted contrasts of each treatment term, stored in a pointer to scalars or tables; units of the pointer are labelled by the contrast name (as used in the analysis-of-variance table) |
`XCONTRASTS` = pointers | X-variates used to fit contrasts, as orthogonalized by `ANOVA`, stored in a pointer to tables; units of the pointer are labelled as for `CONTRASTS` |
`SECONTRASTS` = pointers | Standard errors for estimated contrasts, stored in a pointer to scalars or tables; units of the pointer are labelled as for `CONTRASTS` |
`DFCONTRASTS` = pointers | Degrees of freedom for estimated contrasts, stored in a pointer to scalars; units of the pointer are labelled as for `CONTRASTS` |
`CBMEANS` = tables | Table to store estimates of the means, combining information from all the strata (for treatment terms only) |
`SECBMEANS` = tables | Table of standard errors for the combined means, usable for calculating standard errors for differences between means in the table, at equal levels of the factors specified by the `EQFACTORS` option |
`SEDCBMEANS` = symmetric matrices | Standard errors for comparisons between every pair of entries in the table of combined means |
`LSDCBMEANS` = symmetric matrices | Least significant differences of combined means |
`VCCBMEANS` = symmetric matrices | Variances and covariances of combined means |
`DFCBMEANS` = symmetric matrices | Effective degrees of freedom for comparisons between every pair of entries in the table of combined means |
`CBEFFECTS` = tables or scalars | Table or scalar (for terms with 1 d.f. when `TWOLEVEL=responses` or `Yates`) to store estimates of the effects, combining information from all the strata (for treatment terms only) |
`CBVARIANCE` = scalars | Unit variance for the combined estimates of the effects of each term |
`DFCEFFECTS` = scalars | Effective degrees of freedom for the combined estimates of the effects of each term |
`CBCEFFICIENCY` = scalars | Covariance efficiency factor for the combined estimates of each term |
`STRATUMVARIANCE` = scalars | Estimates of the stratum variances (for block terms only) |
`COMPONENT` = scalars | Stratum variance components (for block terms only) |
`STATUS` = scalars | Status code describing how the term is estimated (together with its marginal terms, if the term is a treatment term) |

### Description

`AKEEP` allows you to copy components of the output from an analysis of variance into standard Genstat data structures. You can save the information from the analysis in a save structure, using the `SAVE` option of `ANOVA`, and then specify the same structure in the `SAVE` option of `AKEEP`. Alternatively, Genstat automatically stores the save structure from the last y-variate that has been analysed, and this is used as a default by `AKEEP` if you do not specify a save structure explicitly. Note, however, that the save structure does not store the y-variate nor the block and treatment factors. Almost all of the items saved by `AKEEP` will be unaffected by any changes to their values. However, the y-variate is needed to save the fitted values, and the block factors are needed to save the bottom-stratum residuals (in a table) by the `RESIDUALS` parameter. These two items should therefore be saved before any changes are made to the y-variate or block factors.
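As a minimal sketch of the two routes just described, the statements below first name a save structure explicitly and then rely on the default. The identifiers `YieldSave`, `Fit` and `Res` are hypothetical, `Yield` is borrowed from the example at the end of this page, and the block and treatment structures are assumed to have been defined already.

```genstat
" Keep the save structure from this analysis under an explicit name
  (YieldSave is a hypothetical identifier)."
ANOVA [PRINT=*; SAVE=YieldSave] Yield

" Save fitted values and final-stratum residuals straight away, before
  the y-variate or the block factors are altered."
AKEEP [SAVE=YieldSave; FITTEDVALUES=Fit; RESIDUALS=Res]

" With SAVE omitted, AKEEP uses the most recent ANOVA instead."
AKEEP [FITTEDVALUES=Fit; RESIDUALS=Res]
```

Naming the save structure is mainly useful when several analyses are run before any of their results are extracted.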

Several options are provided to save information about the analysis as a whole. The `RESIDUALS` and `FITTEDVALUES` options allow variates to be specified to store the residuals and fitted values, respectively. The residuals, like those saved by the `RESIDUALS` parameter of `ANOVA`, are taken only from the final stratum. The `RMETHOD` option controls whether these are simple residuals (like those printed by `ANOVA` – the default) or whether they are standardized according to their variances. As an alternative, the `CBRESIDUALS` option saves residuals that incorporate the variability from all the strata. With an orthogonal design, these are simply the sum of the residuals from every stratum. For a non-orthogonal design, they are the data values minus the combined estimates of the treatment effects. Likewise, the `CBCREGRESSION` option allows you to save estimates of covariate regression coefficients that combine information from all the strata, and the `CBCVCOVARIANCE` option can save their variances and covariances. (The estimates and their variances and covariances from each individual stratum can be saved using the `CREGRESSION` and `CVCOVARIANCE` parameters, as described below.) The `AOVTABLE` option saves the analysis-of-variance table, as a pointer with a variate or a text for each column of the table. The pointer elements are labelled with the column labels of the table, and the variates contain missing values where the table has blanks. These can be printed as blanks by setting option `MISSING=' '` in the `PRINT` directive.
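The following sketch shows how these whole-analysis options might be combined after an `ANOVA` has been run. The identifiers `StdRes`, `CbRes` and `Aov` are hypothetical, and `Aov[]` is used on the assumption that an empty suffix list refers to every element of the pointer.

```genstat
" Standardized final-stratum residuals, residuals combined over all
  the strata, and the analysis-of-variance table as a pointer."
AKEEP [RESIDUALS=StdRes; RMETHOD=standardized; CBRESIDUALS=CbRes;\
       AOVTABLE=Aov]

" Print every column of the saved table, showing missing values as
  blanks so that the layout resembles the original table."
PRINT [MISSING=' '] Aov[]
```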

The `TREATMENTSTRUCTURE`, `BLOCKSTRUCTURE` and `WEIGHTS` options can save the treatment and block formulae, and the weights variate (if any) that were used to specify the analysis. The `AFACTORIAL` option can save the value used for the `FACTORIAL` option in the `ANOVA` command that did the analysis, and the `YVARIATE` option can be set to a dummy to point to the variate that was analysed (i.e. the variate defined by the `Y` parameter of `ANOVA`). The `EXIT` option can save an exit code summarizing the properties of the design; see the description of `ANOVA` for details.

The parameters of `AKEEP` save information about particular model terms in the analysis. With the `TERMS` parameter you specify a model formula, which Genstat expands to form the series of model terms about which you wish to save information. As in `ANOVA`, the `FACTORIAL` option sets a limit on the number of factors in each term. Any term containing more than that number of factors is deleted. The subsequent parameters allow you to specify identifiers of data structures to store various components of information for each of the terms that you have specified. If there are components that are not required for some of the terms, you should insert a missing identifier (`*`) at that point of the list. For example

`AKEEP Source + Amount + Source.Amount; MEANS=*,*,Meangain;\`

` SS=Ssource,Samount,Ssbya; VARIANCE=Vsource,*,*`

sets up a table `Meangain` containing the source by amount table of means; it forms scalars `Ssource`, `Samount` and `Ssbya` to hold the sums of squares for `Source`, `Amount` and `Source.Amount` respectively, and scalar `Vsource` to store the unit variance for the effects of `Source`.

The structures to hold the information are defined automatically, so you need not declare them in advance. If you have declared any of the tables already, its classification set will be redefined, if necessary, to match the factors in the table that you wish to store. Thus `Meangain` here would be redefined to be classified by the factors `Source` and `Amount`, if it had previously been declared with some other set of classifying factors. Sizes of variates and symmetric matrices will also be redefined if necessary.

Many of the components are stored in tables, classified by the factors in the model term. Tables of means and effects are relevant only for treatment terms. Standard errors for a table of means can be saved using the `SEMEANS` parameter. For some designs, such as split-plots, different standard errors are needed for the means according to which pair of means is to be compared. The `EQFACTORS` option allows you to specify factors within the tables of means whose levels are assumed to be equal for the two means. Alternatively, the `SEDMEANS` parameter can save a symmetric matrix containing a standard error of difference for each pair of means, the `VCMEANS` parameter can save a symmetric matrix with the variances and covariances for the means, and the `LSDMEANS` parameter can save a symmetric matrix containing least significant differences. The `LSDLEVEL` option specifies the significance level to use for the least significant differences; the default is 5 (%). The `DFMEANS` parameter saves a symmetric matrix with the degrees of freedom for comparing each pair of means. The rows and columns of these matrices are labelled by the factor name and level (or label if available) of the mean concerned.
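A short sketch of saving a table of means together with the pairwise matrices described above. It assumes the treatment factors `N` and `S` from the example at the end of this page; the identifiers on the right-hand sides are hypothetical.

```genstat
" Means for the N.S term, with standard errors of differences, least
  significant differences at the 1% level, and the matching d.f."
AKEEP [LSDLEVEL=1] N.S; MEANS=Mean_ns; SEDMEANS=Sed_ns;\
      LSDMEANS=Lsd_ns; DFMEANS=Df_ns

PRINT Mean_ns
PRINT Sed_ns
PRINT Lsd_ns
```

The matrices have one row and one column for each cell of the table of means, so they grow quickly as the number of factor combinations increases.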

Tables of partial effects (saveable only for treatment terms, using the `PARTIALEFFECTS` parameter) differ from the usual effects, presented by Genstat, only when there is non-orthogonality. The usual effects of a treatment term are estimated after eliminating the terms that precede it in the model, whereas the partial effects are those that would be estimated after eliminating the subsequent treatment terms as well. The `TWOLEVEL` option controls what is stored for terms whose factors all have only two levels. The settings `responses` (the default) or `Yates` generate a scalar response, whereas `TWOLEVEL=effects` produces a table of effects. Replications are stored in tables if the values are unequal. For equal replications you can supply either a scalar or a table, but if the saving structure has not been declared `AKEEP` will define it as a scalar. Tables of residuals are available only for block terms, and the `RMETHOD` option controls whether or not they are standardized.
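The effect of `TWOLEVEL` can be sketched as follows, assuming a hypothetical analysis whose treatment factors `A` and `B` each have two levels. All identifiers here are illustrative and do not come from the example on this page.

```genstat
" Default representation: the 1 d.f. interaction is saved as a scalar
  response."
AKEEP [TWOLEVEL=responses] A.B; EFFECTS=Resp_ab

" Ask for the full table of effects instead, and save the replication
  (a scalar if it is equal for every effect and Rep_ab is undeclared)."
AKEEP [TWOLEVEL=effects] A.B; EFFECTS=Eff_ab; REPLICATIONS=Rep_ab
```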

Sums of squares, numbers of degrees of freedom, efficiency factors and unit variances are saved in scalars. The unit variance of a treatment term is the residual mean square of the stratum where the term is estimated, divided by its efficiency factor and covariance efficiency factor. Thus you can calculate the estimated variance of any of the effects of the term by dividing its unit variance by the replication of the effect.
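The relationship in the paragraph above can be written out as a formula. The symbols are introduced here for illustration only: $s^2$ is the residual mean square of the stratum where the term is estimated, $e$ its efficiency factor, $e_c$ its covariance efficiency factor, $u$ the unit variance and $r$ the replication of the effect. The sketch reads "divided by its efficiency factor and covariance efficiency factor" as division by both factors.

```latex
u = \frac{s^{2}}{e \, e_{c}},
\qquad
\widehat{\operatorname{Var}}(\text{effect}) = \frac{u}{r}
```

For an orthogonal term in an analysis without covariates, both factors equal 1, so the unit variance is simply the residual mean square.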

For a treatment term, the `RTERM` parameter can be used to save a formula containing the model term corresponding to the lowest stratum in which it is estimated (down to and including any stratum defined by the `STRATUM` option). This can then be used as the setting of the `TERMS` parameter of a subsequent `AKEEP` statement to obtain further information about the stratum, for example its number of residual degrees of freedom. For a block term, `RTERM` saves all the strata to which it would be appropriate to compare the term. So, with a block structure of

`Blocks/Plots/Subplots`

the command

`AKEEP Blocks + Blocks.Plots; RTERM=Rb,Rbp`

would define `Rb` as the formula `!f(Blocks.Plots)`, and `Rbp` as the formula `!f(Blocks.Plots.Subplots)`. Alternatively, with a block structure of

`Reps/(Rows*Columns)`

the command

`AKEEP Reps; RTERM=Rr`

would define `Rr` as the formula `!f(Reps.Rows + Reps.Columns)`.
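The two-step use of `RTERM` for a treatment term might look like the sketch below. It assumes the factor `N` from the example at the end of this page, uses hypothetical identifiers for the saved structures, and assumes that `#` is the way to substitute the contents of the saved formula structure into the `TERMS` formula.

```genstat
" Step 1: find the stratum in which N is estimated."
AKEEP N; RTERM=Res_n

" Step 2: use that stratum as the term of interest, saving its
  residual degrees of freedom and residual sum of squares.
  Assumption: # inserts the formula stored in Res_n."
AKEEP #Res_n; DF=Res_df; SS=Res_ss

PRINT Res_df,Res_ss; DECIMALS=0,5
```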

There are three parameters for saving information about the covariates. To save the regression coefficients estimated in a particular stratum, you should specify the model term of the stratum with the `TERMS` parameter and a variate with the `CREGRESSION` parameter. Genstat defines the variate to have a length equal to the number of covariates, and stores the estimated regression coefficients of the covariates in the order in which they were listed in the `COVARIATE` statement. The `CVCOVARIANCE` parameter saves the variances and covariances of the estimated covariate regression coefficients, in a symmetric matrix. The `CSSP` parameter allows you to obtain sums of squares and products between the covariates for the specified model term. These are arranged in a symmetric matrix. The value in row *i* on the diagonal is the sum of squares for the term in the analysis of variance that has as its y-variate the *i*th covariate listed in the `COVARIATE` statement. The value in row *i* and column *j* is the cross-product between the effects estimated for the term in the analysis of variance of covariate *i* and those estimated for the same term in the analysis of covariate *j*.
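A sketch of saving the covariate information for one stratum. The covariates `X1` and `X2` are hypothetical, as are the identifiers used to hold the results; `Yield` and the `Block.Plot` stratum are taken from the example at the end of this page.

```genstat
" X1 and X2 are hypothetical covariates measured on each plot."
COVARIATE X1,X2
ANOVA [PRINT=aov] Yield

" Coefficients, their variance-covariance matrix, and the covariate
  sums of squares and products, all for the Block.Plot stratum.
  Coef holds the coefficients in the order X1, X2."
AKEEP Block.Plot; CREGRESSION=Coef; CVCOVARIANCE=VcCoef; CSSP=Sspx

PRINT Coef
PRINT VcCoef
```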

The `CONTRASTS`, `XCONTRASTS`, `SECONTRASTS` and `DFCONTRASTS` parameters save information about contrasts. For each treatment term there will generally be several contrasts, so the information is stored in pointers with one element for each contrast. The elements are labelled by the name of the contrast as it appears, for example, in the analysis-of-variance table.

The `CBMEANS`, `SECBMEANS`, `SEDCBMEANS`, `VCCBMEANS`, `LSDCBMEANS`, `DFCBMEANS`, `CBEFFECTS`, `CBVARIANCE`, `DFCEFFECTS`, `CBCEFFICIENCY` and `STRATUMVARIANCE` parameters save details of estimates that combine information from all the strata of the design, and the `COMPONENT` parameter saves the stratum variance components.

In designs where there is partial confounding, and treatment terms are estimated in more than one stratum, options `STRATUM` and `SUPPRESSHIGHER` allow you to specify the strata from which the information is to be taken. This is relevant to tables of effects and partial effects, sums of squares, efficiency factors, unit variances, sums of squares and products between covariates, and information about contrasts. By default, Genstat searches all the strata, and takes the information from the lowest of the strata where the term is estimated. If you set the `STRATUM` option, only strata down to the specified stratum are searched. By setting `SUPPRESSHIGHER=yes`, you can restrict the search to only that stratum. You cannot save tables of means if you have excluded any stratum from the search. Likewise, tables of residuals and residual sums of squares cannot be saved for any of the excluded strata. If a term is not estimated in any of the strata that are searched, the corresponding data structures are filled with missing values.
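The search rules can be illustrated with a hypothetical partially confounded design, in which the interaction `A.B` is estimated in more than one stratum of the block structure `Blocks/Plots/Subplots`. None of these identifiers comes from the example on this page.

```genstat
" Default: take the information from the lowest stratum in which A.B
  is estimated."
AKEEP A.B; EFFECTS=Eff_low; EFFICIENCY=Ef_low

" Search only as far down as Blocks.Plots."
AKEEP [STRATUM=Blocks.Plots] A.B; EFFECTS=Eff_bp; EFFICIENCY=Ef_bp

" Look in Blocks.Plots alone; the structures are filled with missing
  values if A.B is not estimated there."
AKEEP [STRATUM=Blocks.Plots; SUPPRESSHIGHER=yes] A.B;\
      EFFECTS=Eff_only; EFFICIENCY=Ef_only
```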

The `STATUS` parameter saves an integer code that describes the type of term, and how it is estimated. If the term is a treatment term, the code also gives information about how its marginal terms are estimated. (For example, the interaction term `A.B` has the main effects `A` and `B` as margins.)

1 | the term is a treatment term; the term itself and all of its margins are orthogonal, and are estimated in the same stratum. |
---|---|
2 | the term is a treatment term; the term itself and all of its margins have the same efficiency factor, and are estimated in the same stratum. |
3 | the term is a treatment term; the term and its margins have different efficiency factors, but are all estimated in the same stratum. |
4 | the term is a treatment term; the term itself and all of its margins are orthogonal, but are estimated in different strata. |
5 | the term is a treatment term; the term itself and all of its margins have the same efficiency factor, but are estimated in different strata. |
6 | the term is a treatment term; the term and its margins have different efficiency factors, and are estimated in different strata. |
0 | the term is a treatment term, and the term itself or one of its margins is aliased. |
-1 | the term is an orthogonal block term. |
-2 | the term is a non-orthogonal block term. |
`*` | the term was not in either the block or treatment model but all of its factors occurred somewhere in the analysis (`AKEEP` gives a fault if the term contains factors that did not occur anywhere in the analysis); all other parameters are then ignored for that term. |
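A brief sketch of retrieving the status codes, using the factors `N`, `S` and `Block` from the example at the end of this page. The scalar identifiers are hypothetical.

```genstat
" One status code per term, in the order given by the TERMS formula."
AKEEP N + S + N.S + Block; STATUS=St_n,St_s,St_ns,St_b

PRINT St_n,St_s,St_ns,St_b; DECIMALS=0
```

The codes can then be checked against the table above, for example to decide whether combined estimates are worth requesting for a term.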

As explained in the description of the `BLOCKSTRUCTURE` directive, Genstat will set up an extra “factor” denoted `*Units*` if the block formula does not specify the final stratum explicitly. `AKEEP` allows you to refer to this “factor”, if necessary, by putting the string `'*Units*'` (or `'*units*'` or `'*UNITS*'`) in the `TERMS` formula. Thus, to save the residual sum of squares in these circumstances, you could put

`AKEEP '*Units*'; SS=ResidSS`

Options: `FACTORIAL`, `STRATUM`, `SUPPRESSHIGHER`, `TWOLEVEL`, `RESIDUALS`, `FITTEDVALUES`, `CBRESIDUALS`, `CBCREGRESSION`, `CBCVCOVARIANCE`, `TREATMENTSTRUCTURE`, `BLOCKSTRUCTURE`, `AFACTORIAL`, `WEIGHTS`, `YVARIATE`, `LSDLEVEL`, `AOVTABLE`, `EQFACTORS`, `RMETHOD`, `EXIT`, `SAVE`.

Parameters: `TERMS`, `MEANS`, `SEMEANS`, `SEDMEANS`, `VCMEANS`, `EFFECTS`, `PARTIALEFFECTS`, `REPLICATIONS`, `RESIDUALS`, `DF`, `LSDMEANS`, `DFMEANS`, `SS`, `EFFICIENCY`, `VARIANCE`, `RTERM`, `CEFFICIENCY`, `CREGRESSION`, `CVCOVARIANCE`, `CSSP`, `CONTRASTS`, `XCONTRASTS`, `SECONTRASTS`, `DFCONTRASTS`, `CBMEANS`, `SECBMEANS`, `SEDCBMEANS`, `VCCBMEANS`, `LSDCBMEANS`, `DFCBMEANS`, `CBEFFECTS`, `CBVARIANCE`, `DFCEFFECTS`, `CBCEFFICIENCY`, `STRATUMVARIANCE`, `COMPONENT`, `STATUS`.

### See also

Directives: `ANOVA`, `BLOCKSTRUCTURE`, `COVARIATE`, `TREATMENTSTRUCTURE`.

Procedures: `AFMEANS`, `AUKEEP`, `A2KEEP`, `ASPREADSHEET`, `A2RDA`.

Commands for: Analysis of variance.

### Example

    " Example ANOV-5: randomized block design with two treatment factors"

    " This is a field experiment to study the effects of nitrogen and sulphur
      on the yield of wheat with a randomized block design."

    FILEREAD [NAME='%gendir%/examples/ANOV-5.DAT'; PRINT=summary]\
             Block,Plot,N,S,Yield; FGROUPS=4(yes),no

    " Define the structure of the treatments in the design: here we have
      the factorial structure of nitrogen (N) crossed with sulphur (S).
      The model formula N * S expands to give the 3 model terms:
        N, S and N.S
      representing the main effects of nitrogen & sulphur, and their
      interaction. Each of these will have a line in the aov table."
    TREATMENTSTRUCTURE N * S

    " The conventional way of analysing these designs, which can be seen in
      many text books, can be achieved in Genstat simply by putting the
      block factor (here called Block) at the start of the treatment formula:
        Block + N * S
      This works for straightforward designs like the randomized-block design
      but it is not satisfactory in more complicated situations like the
      balanced-incomplete-block or split-plot designs (see later examples).
      Moreover, the analysis that is obtained does not reflect the real
      structure of the design - for example that Block is a random term and
      not a fixed term like the treatment main effects and interaction.
      Consequently, the BLOCKSTRUCTURE directive allows you to define the
      underlying structure of the design - and thus the random (or error)
      terms for the analysis. The randomized-block design has an underlying
      structure of units (here the factor Plot) nested within blocks."
    BLOCKSTRUCTURE Block / Plot

    " The formula expands to give two model terms
        Block + Block.Plot
      each of these defines a stratum in the analysis-of-variance table:
      the Block stratum contains the variation between blocks, and the
      Block.Plot stratum contains the variation between the plots within
      each block.
      As the blocks all contain an identical collection of treatments, no
      treatment terms are estimated between blocks - they are all in the
      Block.Plot stratum. Genstat discovers this all automatically for you
      and produces the correct analysis-of-variance table."

    " The PRINT option is set to give just the analysis-of-variance table,
      and FPROBABILITY requests probabilities for the variance ratios."
    ANOVA [PRINT=aov; FPROBABILITY=yes] Yield

    " Use ADISPLAY to print the means (without recalculating the analysis)."
    ADISPLAY [PRINT=means]

    " Plot the means against S, with a different line for each level of N."
    AGRAPH [METHOD=line] S; GROUPS=N

    " Example ANOV-5a: saving output from ANOVA"

    " Using AKEEP to save the sums of squares and degrees of freedom
      for N and S."
    AKEEP N+S; SS=N_ss,S_ss; DF=N_df,S_df
    PRINT N_ss,N_df,S_ss,S_df; DECIMALS=5,0