AKEEP directive

Copies information from an ANOVA analysis into Genstat data structures.

Options

FACTORIAL = scalar Limit on number of factors in a model term; default 3
STRATUM = formula Model term of the lowest stratum to be searched for effects; default * implies the lowest stratum
SUPPRESSHIGHER = string token Whether to suppress the searching of higher strata if a term is not found in STRATUM (yes, no); default no
TWOLEVEL = string token Representation of effects in 2^n experiments (responses, Yates, effects); default resp
RESIDUALS = variate Saves residuals from the final stratum (as in the RESIDUALS parameter of ANOVA)
FITTEDVALUES = variate Saves fitted values (data values or missing value estimates, minus the residuals from the final stratum – as in the FITTEDVALUES parameter of ANOVA)
CBRESIDUALS = variate Saves the sum of the residuals from all the strata
CBCREGRESSION = variate Saves the estimates of the covariate regression coefficients, combining information from all the strata
CBCVCOVARIANCE = symmetric matrix Saves the variance-covariance matrix of the combined estimates of the covariate regression coefficients
TREATMENTSTRUCTURE = formula structure Saves the treatment formula used for the analysis
BLOCKSTRUCTURE = formula structure Saves the block formula used for the analysis
AFACTORIAL = scalar Saves the setting of the FACTORIAL option used in the ANOVA command that performed the analysis
WEIGHTS = variate Saves the weights used in the analysis
YVARIATE = dummy Dummy to be set to the y-variate of the analysis
LSDLEVEL = scalar Significance level (%) to use in the calculation of least significant differences; default 5
AOVTABLE = pointer Saves the analysis-of-variance table as a pointer with a variate or text for each column (source, d.f., s.s., m.s. etc)
EQFACTORS = factors Factors whose levels are to be assumed to be equal within the comparisons between means calculated for SEMEANS
RMETHOD = string token Type of residuals to form if the RESIDUALS option or parameter is set (simple, standardized); default simp
EXIT = scalar Saves an exit code indicating the properties of the design
SAVE = identifier Defines the Save structure (from ANOVA) that provides details of the analysis; default * gives that from the most recent ANOVA

Parameters

TERMS = formula Model terms for which information is required
MEANS = tables Table to store means for each term (available for treatment terms only)
SEMEANS = tables Table of effective standard errors for the means, usable for calculating standard errors for differences between means in the table, at equal levels of the factors specified by the EQFACTORS option
SEDMEANS = symmetric matrices Standard errors for comparisons between every pair of entries in the table of means
VCMEANS = symmetric matrices Variances and covariances of means
EFFECTS = tables or scalars Table or scalar (for terms with 1 d.f. when TWOLEVEL=responses or Yates) to store effects (for treatment terms only)
PARTIALEFFECTS = tables Table or scalar (for terms with 1 d.f. when TWOLEVEL=responses or Yates) to store partial effects (for treatment terms only)
REPLICATIONS = tables or scalars Table to store replications or scalar if they are all equal
RESIDUALS = tables Table to store residuals (for block terms only)
DF = scalars Number of degrees of freedom for each term
LSDMEANS = symmetric matrices Least significant differences of means
DFMEANS = symmetric matrices Degrees of freedom for comparisons between every pair of entries in the table of means
SS = scalars Sum of squares for each term
EFFICIENCY = scalars Efficiency factor for each term
VARIANCE = scalars Unit variance for the effects of each term
RTERM = formula structures Residual terms: for a treatment term this saves the lowest stratum where the term is estimated (down to the stratum specified by the STRATUM option); for a block term it saves all the strata to which it would be appropriate to compare the term
CEFFICIENCY = scalars Covariance efficiency factor for each term
CREGRESSION = variates Estimated regression coefficients for the covariates in the specified stratum
CVCOVARIANCE = symmetric matrix Variance-covariance matrix of the covariate regression coefficients in the specified stratum
CSSP = symmetric matrices Covariate sums of squares and products in the specified stratum
CONTRASTS = pointers Estimates for the fitted contrasts of each treatment term, stored in a pointer to scalars or tables; units of the pointer are labelled by the contrast name (as used in the analysis-of-variance table)
XCONTRASTS = pointers X-variates used to fit contrasts, as orthogonalized by ANOVA, stored in a pointer to tables; units of the pointer are labelled as for CONTRASTS
SECONTRASTS = pointers Standard errors for estimated contrasts, stored in a pointer to scalars or tables; units of the pointer are labelled as for CONTRASTS
DFCONTRASTS = pointers Degrees of freedom for estimated contrasts, stored in a pointer to scalars; units of the pointer are labelled as for CONTRASTS
CBMEANS = tables Table to store estimates of the means, combining information from all the strata (for treatment terms only)
SECBMEANS = tables Table of standard errors for the combined means, usable for calculating standard errors for differences between means in the table, at equal levels of the factors specified by the EQFACTORS option
SEDCBMEANS = symmetric matrices Standard errors for comparisons between every pair of entries in the table of combined means
LSDCBMEANS = symmetric matrices Least significant differences of combined means
VCCBMEANS = symmetric matrices Variances and covariances of combined means
DFCBMEANS = symmetric matrices Effective degrees of freedom for comparisons between every pair of entries in the table of combined means
CBEFFECTS = tables or scalars Table or scalar (for terms with 1 d.f. when TWOLEVEL=responses or Yates) to store estimates of the effects, combining information from all the strata (for treatment terms only)
CBVARIANCE = scalars Unit variance for the combined estimates of the effects of each term
DFCEFFECTS = scalars Effective degrees of freedom for the combined estimates of the effects of each term
CBCEFFICIENCY = scalars Covariance efficiency factor for the combined estimates of each term
STRATUMVARIANCE = scalars Estimates of the stratum variances (for block terms only)
COMPONENT = scalars Stratum variance components (for block terms only)
STATUS = scalars Status code describing how the term is estimated (together with its marginal terms, if the term is a treatment term)

Description

AKEEP allows you to copy components of the output from an analysis of variance into standard Genstat data structures. You can save the information from the analysis in a save structure, using the SAVE parameter of ANOVA, and then specify the same structure in the SAVE option of AKEEP. Alternatively, Genstat automatically stores the save structure from the last y-variate that has been analysed, and this is used as a default by AKEEP if you do not specify a save structure explicitly. Note, however, that the save structure does not store the y-variate nor the block and treatment factors. Almost all of the items saved by AKEEP are unaffected by any later changes to the values of these structures. However, the y-variate is needed to save the fitted values, and the block factors are needed to save the bottom-stratum residuals (in a table) by the RESIDUALS parameter. These two items should therefore be saved before any changes are made to the y-variate or block factors.
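As a minimal sketch of the first route, the statements below name a save structure when the analysis is run and pass it back to AKEEP. The identifiers Ysave, Fit and Resid are hypothetical. Yield and Block.Plot are borrowed from the example at the end of this page. The sketch assumes that ANOVA accepts SAVE as a parameter after the y-variate.

```genstat
" Sketch only: Ysave, Fit and Resid are hypothetical identifiers.
  Assumes SAVE is supplied to ANOVA as a parameter after the y-variate."
ANOVA [PRINT=aov] Yield; SAVE=Ysave

" Save the fitted values and the table of Block.Plot residuals straight
  away, before Yield or the block factors are modified."
AKEEP [SAVE=Ysave; FITTEDVALUES=Fit] Block.Plot; RESIDUALS=Resid
```

Omitting SAVE=Ysave from the AKEEP statement should give the same result here, because the analysis of Yield is the most recent one.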

Several options are provided to save information about the analysis as a whole. The RESIDUALS and FITTEDVALUES options allow variates to be specified to store the residuals and fitted values, respectively. The residuals, like those saved by the RESIDUALS parameter of ANOVA, are taken only from the final stratum. The RMETHOD option controls whether these are simple residuals (like those printed by ANOVA – the default) or whether they are standardized according to their variances. As an alternative, the CBRESIDUALS option saves residuals that incorporate the variability from all the strata. With an orthogonal design, these are simply the sum of the residuals from every stratum. For a non-orthogonal design, they are the data values minus the combined estimates of the treatment effects. Likewise, the CBCREGRESSION option allows you to save estimates of covariate regression coefficients that combine information from all the strata, and the CBCVCOVARIANCE option can save their variances and covariances. (The estimates and their variances and covariances from each individual stratum can be saved using the CREGRESSION and CVCOVARIANCE parameters, as described below.) The AOVTABLE option saves the analysis-of-variance table, as a pointer with a variate or a text for each column of the table. The pointer elements are labelled with the column labels of the table, and the variates contain missing values where the table has blanks. These can be printed as blanks by setting option MISSING=' ' in the PRINT directive.
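The following sketch shows how these options might be combined after an ANOVA statement. The identifiers Res, Fit, Cbres and Aovtab are hypothetical. The sketch assumes that AKEEP can be given with options only, and that every element of the pointer can be printed using a null suffix list.

```genstat
" Sketch only: Res, Fit, Cbres and Aovtab are hypothetical identifiers."
AKEEP [RESIDUALS=Res; FITTEDVALUES=Fit; RMETHOD=standardized;\
  CBRESIDUALS=Cbres; AOVTABLE=Aovtab]
PRINT Res,Fit,Cbres

" Print the saved aov table, showing missing values as blanks."
PRINT [MISSING=' '] Aovtab[]
```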

The TREATMENTSTRUCTURE, BLOCKSTRUCTURE and WEIGHTS options can save the treatment and block formulae, and the weights variate (if any) that were used to specify the analysis. The AFACTORIAL option can save the value used for the FACTORIAL option in the ANOVA command that did the analysis, and the YVARIATE option can be set to a dummy to point to the variate that was analysed (i.e. the variate defined by the Y parameter of ANOVA). The EXIT option can save an exit code summarizing the properties of the design; see the description of ANOVA for details.
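A short sketch of recovering the specification of an analysis, for example inside a procedure that does not know how the model was defined. All of the identifiers (Tform, Bform, Flimit, Yvar, Excode) are hypothetical.

```genstat
" Sketch only: all identifiers on the right-hand sides are hypothetical."
AKEEP [TREATMENTSTRUCTURE=Tform; BLOCKSTRUCTURE=Bform; AFACTORIAL=Flimit;\
  YVARIATE=Yvar; EXIT=Excode]

" The factor limit and exit code are scalars, so they can be printed directly."
PRINT Flimit,Excode; DECIMALS=0
```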

The parameters of AKEEP save information about particular model terms in the analysis. With the TERMS parameter you specify a model formula, which Genstat expands to form the series of model terms about which you wish to save information. As in ANOVA, the FACTORIAL option sets a limit on the number of factors in each term. Any term containing more than that limit is deleted. The subsequent parameters allow you to specify identifiers of data structures to store various components of information for each of the terms that you have specified. If there are components that are not required for some of the terms, you should insert a missing identifier (*) at that point of the list. For example

AKEEP Source + Amount + Source.Amount; MEANS=*,*,Meangain;\
  SS=Ssource,Samount,Ssbya; VARIANCE=Vsource,*,*

sets up a table Meangain containing the source by amount table of means; it forms scalars Ssource, Samount and Ssbya to hold the sums of squares for Source, Amount and Source.Amount respectively, and scalar Vsource to store the unit variance for the effects of Source.

The structures to hold the information are defined automatically, so you need not declare them in advance. If you have declared any of the tables already, its classification set will be redefined, if necessary, to match the factors in the table that you wish to store. Thus Meangain here would be redefined to be classified by the factors Source and Amount, if it had previously been declared with some other set of classifying factors. Sizes of variates and symmetric matrices will also be redefined if necessary.

Many of the components are stored in tables, classified by the factors in the model term. Tables of means and effects are relevant only for treatment terms. Standard errors for a table of means can be saved using the SEMEANS parameter. For some designs, such as split-plots, different standard errors are needed for the means according to which pair of means is to be compared. The EQFACTORS option allows you to specify factors within the tables of means whose levels are assumed to be equal for the two means. Alternatively, the SEDMEANS parameter can save a symmetric matrix containing a standard error of difference for each pair of means, the VCMEANS parameter can save a symmetric matrix with the variances and covariances of the means, and the LSDMEANS parameter can save a symmetric matrix containing least significant differences. The LSDLEVEL option specifies the significance level to use for the least significant differences; the default is 5 (%). The DFMEANS parameter saves a symmetric matrix with the degrees of freedom for comparing each pair of means. The rows and columns of these matrices are labelled by the factor name and level (or label if available) of the mean concerned.
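For illustration, the sketch below saves the N by S means from the example at the end of this page, together with their standard errors, standard errors of difference, least significant differences and degrees of freedom. The identifiers Mns, Sens, Sedns, Lsdns and Dfns are hypothetical, and the 1% level is an arbitrary choice.

```genstat
" Sketch only: Mns, Sens, Sedns, Lsdns and Dfns are hypothetical identifiers."
AKEEP [LSDLEVEL=1] N.S; MEANS=Mns; SEMEANS=Sens; SEDMEANS=Sedns;\
  LSDMEANS=Lsdns; DFMEANS=Dfns

" Means and their effective standard errors share the same classification."
PRINT Mns,Sens

" One row and column per mean, labelled by factor name and level."
PRINT Sedns
PRINT Lsdns
```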

Tables of partial effects (saveable only for treatment terms, using the PARTIALEFFECTS parameter) differ from the usual effects, presented by Genstat, only when there is non-orthogonality. The usual effects of a treatment term are estimated after eliminating the terms that precede it in the model, whereas the partial effects are those that would be estimated after eliminating the subsequent treatment terms as well. The TWOLEVEL option controls what is stored for terms whose factors all have only two levels. The settings responses (the default) or Yates generate a scalar response, whereas TWOLEVEL=effects produces a table of effects. Replications are stored in tables if the values are unequal. For equal replications you can supply either a scalar or a table, but if the saving structure has not been declared AKEEP will define it as a scalar. Tables of residuals are available only for block terms, and the RMETHOD option controls whether or not they are standardized.
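The effect of TWOLEVEL can be sketched as follows, assuming a hypothetical analysis in which the treatment factors A and B both have two levels. The identifiers Rab, Eab and Repab are also hypothetical.

```genstat
" Sketch only: A and B are hypothetical two-level treatment factors."

" Default setting: the A.B interaction is saved as a single scalar response."
AKEEP [TWOLEVEL=responses] A.B; EFFECTS=Rab

" TWOLEVEL=effects: the same term is saved as a table classified by A and B."
AKEEP [TWOLEVEL=effects] A.B; EFFECTS=Eab; REPLICATIONS=Repab
PRINT Rab
PRINT Eab
```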

Sums of squares, numbers of degrees of freedom, efficiency factors and unit variances are saved in scalars. The unit variance of a treatment term is the residual mean square of the stratum where the term is estimated, divided by its efficiency factor and covariance efficiency factor. Thus you can calculate the estimated variance of any of the effects of the term by dividing its unit variance by the replication of the effect.
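The calculation described above might be sketched as follows for the main effect of N in the example at the end of this page. The identifiers Neff, Nvar, Nrep and Nse are hypothetical. The sketch assumes equal replication, so that AKEEP defines Nrep as a scalar.

```genstat
" Sketch only: Neff, Nvar, Nrep and Nse are hypothetical identifiers.
  Assumes equal replication, so that Nrep is saved as a scalar."
AKEEP N; EFFECTS=Neff; VARIANCE=Nvar; REPLICATIONS=Nrep

" Estimated variance of an effect = unit variance / replication;
  its square root is the standard error of the effect."
CALCULATE Nse = SQRT(Nvar / Nrep)
PRINT Neff
PRINT Nse
```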

For a treatment term, the RTERM parameter can be used to save a formula containing the model term corresponding to the lowest stratum in which it is estimated (down to and including any stratum defined by the STRATUM option). This can then be used as the setting of the TERMS parameter of a subsequent AKEEP statement to obtain further information about the stratum, for example its number of residual degrees of freedom. For a block term, RTERM saves all the strata to which it would be appropriate to compare the term. So, with a block structure of

Blocks/Plots/Subplots

the command

AKEEP Blocks + Blocks.Plots; RTERM=Rb,Rbp

would define Rb as the formula !f(Blocks.Plots), and Rbp as the formula !f(Blocks.Plots.Subplots). Alternatively, with a block structure of

Reps/(Rows*Columns)

the command

AKEEP Reps; RTERM=Rr

would define Rr as the formula !f(Reps.Rows + Reps.Columns).

There are three parameters for saving information about the covariates. To save the regression coefficients estimated in a particular stratum, you should specify the model term of the stratum with the TERMS parameter and a variate with the CREGRESSION parameter. Genstat defines the variate to have a length equal to the number of covariates, and stores the estimated regression coefficients of the covariates in the order in which they were listed in the COVARIATE statement. The CVCOVARIANCE parameter saves the variances and covariances of the estimated covariate regression coefficients, in a symmetric matrix. The CSSP parameter allows you to obtain sums of squares and products between the covariates for the specified model term. These are arranged in a symmetric matrix. The value in row i on the diagonal is the sum of squares for the term in the analysis of variance that has as its y-variate the ith covariate listed in the COVARIATE statement. The value in row i and column j is the cross-product between the effects estimated for the term in the analysis of variance of covariate i and those estimated for the same term in the analysis of covariate j.
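A sketch of saving the covariate information from the Block.Plot stratum of the design used in the example at the end of this page. The covariates X1 and X2 and the identifiers Coef, Vcoef and Sspx are hypothetical.

```genstat
" Sketch only: X1 and X2 are hypothetical covariates, and
  Coef, Vcoef and Sspx are hypothetical identifiers."
COVARIATE X1,X2
ANOVA [PRINT=aov] Yield

" Coef has one value per covariate, in the order X1, X2."
AKEEP Block.Plot; CREGRESSION=Coef; CVCOVARIANCE=Vcoef; CSSP=Sspx
PRINT Coef
PRINT Vcoef
PRINT Sspx
```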

The CONTRASTS, XCONTRASTS, SECONTRASTS and DFCONTRASTS parameters save information about contrasts. For each treatment term there will generally be several contrasts, so the information is stored in pointers with one element for each contrast. The elements are labelled by the name of the contrast as it appears, for example, in the analysis-of-variance table.

The CBMEANS, SECBMEANS, SEDCBMEANS, VCCBMEANS, LSDCBMEANS, DFCBMEANS, CBEFFECTS, CBVARIANCE, DFCEFFECTS, CBCEFFICIENCY and STRATUMVARIANCE parameters save details of estimates that combine information from all the strata of the design, and the COMPONENT parameter saves the stratum variance components.

In designs where there is partial confounding, and treatment terms are estimated in more than one stratum, options STRATUM and SUPPRESSHIGHER allow you to specify the strata from which the information is to be taken. This is relevant to tables of effects and partial effects, sums of squares, efficiency factors, unit variances, sums of squares and products between covariates, and information about contrasts. By default, Genstat searches all the strata, and takes the information from the lowest of the strata where the term is estimated. If you set the STRATUM option, only strata down to the specified stratum are searched. By setting SUPPRESSHIGHER=yes, you can restrict the search to only that stratum. You cannot save tables of means if you have excluded any stratum from the search. Likewise, tables of residuals and residual sums of squares cannot be saved for any of the excluded strata. If a term is not estimated in any of the strata that are searched, the corresponding data structures are filled with missing values.
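As a sketch, suppose a hypothetical design with block structure Blocks/Plots in which the interaction A.B is partially confounded with blocks, so that it is estimated in both the Blocks and the Blocks.Plots strata. The identifiers used to hold the results are also hypothetical.

```genstat
" Sketch only: A, B, Blocks and Plots are hypothetical factors."

" Take the information from the Blocks stratum alone."
AKEEP [STRATUM=Blocks; SUPPRESSHIGHER=yes] A.B; EFFECTS=Eabb;\
  SS=Ssabb; EFFICIENCY=Efabb

" Default search: the lowest stratum in which A.B is estimated."
AKEEP A.B; EFFECTS=Eabp; SS=Ssabp; EFFICIENCY=Efabp
PRINT Ssabb,Efabb,Ssabp,Efabp
```

If A.B were not estimated in the Blocks stratum, the first statement would fill Eabb, Ssabb and Efabb with missing values.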

The STATUS parameter saves an integer code that describes the type of term, and how it is estimated. If the term is a treatment term, the code also gives information about how its marginal terms are estimated. (For example, the interaction term A.B has the main effects A and B as margins.)

    1 the term is a treatment term; the term itself and all of its margins are orthogonal, and are estimated in the same stratum.
    2 the term is a treatment term; the term itself and all of its margins have the same efficiency factor, and are estimated in the same stratum.
    3 the term is a treatment term; the term and its margins have different efficiency factors, but are all estimated in the same stratum.
    4 the term is a treatment term; the term itself and all of its margins are orthogonal, but are estimated in different strata.
    5 the term is a treatment term; the term itself and all of its margins have the same efficiency factor, but are estimated in different strata.
    6 the term is a treatment term; the term and its margins have different efficiency factors and are all estimated in different strata.
    0 the term is a treatment term; the term itself or one of its margins is aliased.
    -1 the term is an orthogonal block term.
    -2 the term is a non-orthogonal block term.
    * the term was not in either the block or treatment model but all of its factors occurred somewhere in the analysis (AKEEP gives a fault if the term contains factors that did not occur anywhere in the analysis); all other parameters are then ignored for that term.
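For example, the status codes for the treatment terms of the example at the end of this page could be saved as sketched below. The identifiers Stn, Sts and Stns are hypothetical.

```genstat
" Sketch only: Stn, Sts and Stns are hypothetical identifiers."
AKEEP N + S + N.S; STATUS=Stn,Sts,Stns
PRINT Stn,Sts,Stns; DECIMALS=0
```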

As explained in the description of the BLOCKSTRUCTURE directive, Genstat will set up an extra “factor” denoted *Units* if the block formula does not specify the final stratum explicitly. AKEEP allows you to refer to this “factor”, if necessary, by putting the string '*Units*' (or '*units*' or '*UNITS*') in the TERMS formula. Thus, to save the residual sum of squares in these circumstances, you could put

AKEEP '*Units*'; SS=ResidSS
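Extending this statement slightly, the residual mean square could be formed by saving the degrees of freedom as well. This is a sketch: the identifiers ResidDF and ResidMS are hypothetical, and it assumes a block formula that leaves the final stratum implicit.

```genstat
" Sketch only: ResidDF and ResidMS are hypothetical identifiers.
  Assumes the block formula does not specify the final stratum."
AKEEP '*Units*'; SS=ResidSS; DF=ResidDF
CALCULATE ResidMS = ResidSS / ResidDF
PRINT ResidSS,ResidDF,ResidMS
```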

Options: FACTORIAL, STRATUM, SUPPRESSHIGHER, TWOLEVEL, RESIDUALS, FITTEDVALUES, CBRESIDUALS, CBCREGRESSION, CBCVCOVARIANCE, TREATMENTSTRUCTURE, BLOCKSTRUCTURE, AFACTORIAL, WEIGHTS, YVARIATE, LSDLEVEL, AOVTABLE, EQFACTORS, RMETHOD, EXIT, SAVE.

Parameters: TERMS, MEANS, SEMEANS, SEDMEANS, VCMEANS, EFFECTS, PARTIALEFFECTS, REPLICATIONS, RESIDUALS, DF, LSDMEANS, DFMEANS, SS, EFFICIENCY, VARIANCE, RTERM, CEFFICIENCY, CREGRESSION, CVCOVARIANCE, CSSP, CONTRASTS, XCONTRASTS, SECONTRASTS, DFCONTRASTS, CBMEANS, SECBMEANS, SEDCBMEANS, VCCBMEANS, LSDCBMEANS, DFCBMEANS, CBEFFECTS, CBVARIANCE, DFCEFFECTS, CBCEFFICIENCY, STRATUMVARIANCE, COMPONENT, STATUS.

See also

Directives: ANOVA, BLOCKSTRUCTURE, COVARIATE, TREATMENTSTRUCTURE.

Procedures: AFMEANS, AUKEEP, A2KEEP, ASPREADSHEET, A2RDA.

Commands for: Analysis of variance.

Example

" Example ANOV-5: randomized block design with two treatment factors"

" This is a field experiment to study the effects of nitrogen and
  sulphur on the yield of wheat with a randomized block design."

FILEREAD [NAME='%gendir%/examples/ANOV-5.DAT'; PRINT=summary]\ 
  Block,Plot,N,S,Yield; FGROUPS=4(yes),no

" Define the structure of the treatments in the design: here we have 
  the factorial structure of nitrogen (N) crossed with sulphur (S).
  The model formula N * S expands to give the 3 model terms:
  N,  S  and  N.S  representing the main effects of nitrogen & sulphur, 
  and their interaction. Each of these will have a line in the aov table."
TREATMENTSTRUCTURE N * S

" The conventional way of analysing these designs, which can be seen in 
  many text books, can be achieved in Genstat simply by putting the block 
  factor (here called Block) at the start of the treatment formula: 
    Block + N * S 
  This works for straightforward designs like the randomized-block design 
  but it is not satisfactory in more complicated situations like the 
  balanced-incomplete-block or split-plot designs (see later examples).
  Moreover, the analysis that is obtained does not reflect the real
  structure of the design - for example that Block is a random term and
  not a fixed term like the treatment main effects and interaction.
  Consequently, the BLOCKSTRUCTURE directive allows you to define the
  underlying structure of the design - and thus the random (or error) 
  terms for the analysis. The randomized-block design has an underlying 
  structure of units (here the factor Plot) nested within blocks."
BLOCKSTRUCTURE Block / Plot
"  The formula expands to give two model terms  Block + Block.Plot
  each of these defines a stratum in the analysis-of-variance table: the
  Block stratum contains the variation between blocks, and the Block.Plot
  stratum contains the variation between the plots within each block.
  As the blocks all contain an identical collection of treatments, no
  treatment terms are estimated between blocks - they are all in the 
  Block.Plot stratum. Genstat discovers this all automatically for you
  and produces the correct analysis-of-variance table."

" The PRINT option is set to give just the analysis-of-variance table, 
  and FPROBABILITY requests probabilities for the variance ratios."
ANOVA [PRINT=aov; FPROBABILITY=yes] Yield

" Use ADISPLAY to print the means (without recalculating the analysis)."
ADISPLAY [PRINT=means]

" Plot the means against S, with a different line for each level of N."
AGRAPH [METHOD=line] S; GROUPS=N

" Example ANOV-5a: saving output from ANOVA"

" Using AKEEP to save the sums of squares 
  and degrees of freedom for N and S."
AKEEP N+S; SS=N_ss,S_ss; DF=N_df,S_df
PRINT N_ss,N_df,S_ss,S_df; DECIMALS=5,0
Updated on March 27, 2024