Copies information from an ANOVA analysis into Genstat data structures.
Options
FACTORIAL = scalar | Limit on number of factors in a model term; default 3 |
---|---|
STRATUM = formula | Model term of the lowest stratum to be searched for effects; default * implies the lowest stratum |
SUPPRESSHIGHER = string token | Whether to suppress the searching of higher strata if a term is not found in STRATUM (yes, no); default no |
TWOLEVEL = string token | Representation of effects in 2^n experiments (responses, Yates, effects); default resp |
RESIDUALS = variate | Saves residuals from the final stratum (as in the RESIDUALS parameter of ANOVA) |
FITTEDVALUES = variate | Saves fitted values (data values or missing value estimates, minus the residuals from the final stratum, as in the FITTEDVALUES parameter of ANOVA) |
CBRESIDUALS = variate | Saves the sum of the residuals from all the strata |
CBCREGRESSION = variate | Saves the estimates of the covariate regression coefficients, combining information from all the strata |
CBCVCOVARIANCE = symmetric matrix | Saves the variance-covariance matrix of the combined estimates of the covariate regression coefficients |
TREATMENTSTRUCTURE = formula structure | Saves the treatment formula used for the analysis |
BLOCKSTRUCTURE = formula structure | Saves the block formula used for the analysis |
AFACTORIAL = scalar | Saves the setting of the FACTORIAL option used in the ANOVA command that performed the analysis |
WEIGHTS = variate | Saves the weights used in the analysis |
YVARIATE = dummy | Dummy to be set to the y-variate of the analysis |
LSDLEVEL = scalar | Significance level (%) to use in the calculation of least significant differences; default 5 |
AOVTABLE = pointer | Saves the analysis-of-variance table as a pointer with a variate or text for each column (source, d.f., s.s., m.s. etc) |
EQFACTORS = factors | Factors whose levels are to be assumed to be equal within the comparisons between means calculated for SEMEANS |
RMETHOD = string token | Type of residuals to form if the RESIDUALS option or parameter is set (simple, standardized); default simp |
EXIT = scalar | Saves an exit code indicating the properties of the design |
SAVE = identifier | Defines the Save structure (from ANOVA) that provides details of the analysis; default * gives that from the most recent ANOVA |
Parameters
TERMS = formula | Model terms for which information is required |
---|---|
MEANS = tables | Table to store means for each term (available for treatment terms only) |
SEMEANS = tables | Table of effective standard errors for the means, usable for calculating standard errors for differences between means in the table, at equal levels of the factors specified by the EQFACTORS option |
SEDMEANS = symmetric matrices | Standard errors for comparisons between every pair of entries in the table of means |
VCMEANS = symmetric matrices | Variances and covariances of means |
EFFECTS = tables or scalars | Table or scalar (for terms with 1 d.f. when TWOLEVEL=responses or Yates) to store effects (for treatment terms only) |
PARTIALEFFECTS = tables | Table or scalar (for terms with 1 d.f. when TWOLEVEL=responses or Yates) to store partial effects (for treatment terms only) |
REPLICATIONS = tables or scalars | Table to store replications or scalar if they are all equal |
RESIDUALS = tables | Table to store residuals (for block terms only) |
DF = scalars | Number of degrees of freedom for each term |
LSDMEANS = symmetric matrices | Least significant differences of means |
DFMEANS = symmetric matrices | Degrees of freedom for comparisons between every pair of entries in the table of means |
SS = scalars | Sum of squares for each term |
EFFICIENCY = scalars | Efficiency factor for each term |
VARIANCE = scalars | Unit variance for the effects of each term |
RTERM = formula structures | Residual terms: for a treatment term this saves the lowest stratum where the term is estimated (down to the stratum specified by the STRATUM option); for a block term it saves all the strata to which it would be appropriate to compare the term |
CEFFICIENCY = scalars | Covariance efficiency factor for each term |
CREGRESSION = variates | Estimated regression coefficients for the covariates in the specified stratum |
CVCOVARIANCE = symmetric matrix | Variance-covariance matrix of the covariate regression coefficients in the specified stratum |
CSSP = symmetric matrices | Covariate sums of squares and products in the specified stratum |
CONTRASTS = pointers | Estimates for the fitted contrasts of each treatment term, stored in a pointer to scalars or tables; units of the pointer are labelled by the contrast name (as used in the analysis-of-variance table) |
XCONTRASTS = pointers | X-variates used to fit contrasts, as orthogonalized by ANOVA, stored in a pointer to tables; units of the pointer are labelled as for CONTRASTS |
SECONTRASTS = pointers | Standard errors for estimated contrasts, stored in a pointer to scalars or tables; units of the pointer are labelled as for CONTRASTS |
DFCONTRASTS = pointers | Degrees of freedom for estimated contrasts, stored in a pointer to scalars; units of the pointer are labelled as for CONTRASTS |
CBMEANS = tables | Table to store estimates of the means, combining information from all the strata (for treatment terms only) |
SECBMEANS = tables | Table of standard errors for the combined means, usable for calculating standard errors for differences between means in the table, at equal levels of the factors specified by the EQFACTORS option |
SEDCBMEANS = symmetric matrices | Standard errors for comparisons between every pair of entries in the table of combined means |
LSDCBMEANS = symmetric matrices | Least significant differences of combined means |
VCCBMEANS = symmetric matrices | Variances and covariances of combined means |
DFCBMEANS = symmetric matrices | Effective degrees of freedom for comparisons between every pair of entries in the table of combined means |
CBEFFECTS = tables or scalars | Table or scalar (for terms with 1 d.f. when TWOLEVEL=responses or Yates) to store estimates of the effects, combining information from all the strata (for treatment terms only) |
CBVARIANCE = scalars | Unit variance for the combined estimates of the effects of each term |
DFCEFFECTS = scalars | Effective degrees of freedom for the combined estimates of the effects of each term |
CBCEFFICIENCY = scalars | Covariance efficiency factor for the combined estimates of each term |
STRATUMVARIANCE = scalars | Estimates of the stratum variances (for block terms only) |
COMPONENT = scalars | Stratum variance components (for block terms only) |
STATUS = scalars | Status code describing how the term is estimated (together with its marginal terms, if the term is a treatment term) |
Description
AKEEP allows you to copy components of the output from an analysis of variance into standard Genstat data structures. You can save the information from the analysis in a save structure, using the SAVE option of ANOVA, and then specify the same structure in the SAVE option of AKEEP. Alternatively, Genstat automatically stores the save structure from the last y-variate that has been analysed, and this is used as a default by AKEEP if you do not specify a save structure explicitly. Note, however, that the save structure stores neither the y-variate nor the block and treatment factors. Almost all of the items saved by AKEEP will be unaffected by any subsequent changes to the values of the y-variate or these factors. However, the y-variate is needed to save the fitted values, and the block factors are needed to save the bottom-stratum residuals (in a table) by the RESIDUALS parameter. These two items should therefore be saved before any changes are made to the y-variate or block factors.
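As a minimal sketch of the two routes described above, the following assumes a y-variate Yield and a treatment factor N (as in the example at the end of this page); Asave, Fit and Nmeans are hypothetical identifiers introduced only for illustration.

```genstat
" Sketch only: Asave, Fit and Nmeans are hypothetical identifiers. "
" Route 1: name the save structure explicitly in both commands. "
ANOVA [PRINT=aov; SAVE=Asave] Yield
AKEEP [SAVE=Asave; FITTEDVALUES=Fit] N; MEANS=Nmeans
" Route 2: omit SAVE, so that AKEEP uses the most recent analysis. "
AKEEP N; MEANS=Nmeans
```

Naming the save structure is the safer choice when several analyses are run before the results are extracted, because the default always refers to the last y-variate analysed.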
Several options are provided to save information about the analysis as a whole. The RESIDUALS and FITTEDVALUES options allow variates to be specified to store the residuals and fitted values, respectively. The residuals, like those saved by the RESIDUALS parameter of ANOVA, are taken only from the final stratum. The RMETHOD option controls whether these are simple residuals (like those printed by ANOVA; this is the default) or whether they are standardized according to their variances. As an alternative, the CBRESIDUALS option saves residuals that incorporate the variability from all the strata. With an orthogonal design, these are simply the sum of the residuals from every stratum. For a non-orthogonal design, they are the data values minus the combined estimates of the treatment effects. Likewise, the CBCREGRESSION option allows you to save estimates of covariate regression coefficients that combine information from all the strata, and the CBCVCOVARIANCE option can save their variances and covariances. (The estimates and their variances and covariances from each individual stratum can be saved using the CREGRESSION and CVCOVARIANCE parameters, as described below.) The AOVTABLE option saves the analysis-of-variance table, as a pointer with a variate or a text for each column of the table. The pointer elements are labelled with the column labels of the table, and the variates contain missing values where the table has blanks. These can be printed as blanks by setting option MISSING=' ' in the PRINT directive.
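The options above might be used as follows. This is a sketch that assumes an ANOVA has just been run; Res, Fit, Sres and Aov are hypothetical identifiers, and Aov[] is used to refer to every column held in the pointer.

```genstat
" Sketch only: Res, Fit, Sres and Aov are hypothetical identifiers. "
" Simple residuals and fitted values from the final stratum. "
AKEEP [RESIDUALS=Res; FITTEDVALUES=Fit; AOVTABLE=Aov]
" Standardized residuals from the same analysis. "
AKEEP [RESIDUALS=Sres; RMETHOD=standardized]
" Print the saved aov table, showing missing values as blanks. "
PRINT [MISSING=' '] Aov[]
```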
The TREATMENTSTRUCTURE, BLOCKSTRUCTURE and WEIGHTS options can save the treatment and block formulae, and the weights variate (if any) that were used to specify the analysis. The AFACTORIAL option can save the value used for the FACTORIAL option in the ANOVA command that did the analysis, and the YVARIATE option can be set to a dummy to point to the variate that was analysed (i.e. the variate defined by the Y parameter of ANOVA). The EXIT option can save an exit code summarizing the properties of the design; see the description of ANOVA for details.
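A short sketch of recovering the specification of an analysis is given below. All of the identifiers (Tform, Bform, Flimit, Yv and Code) are hypothetical, and the dummy is declared beforehand as a precaution rather than because the source states that this is required.

```genstat
" Sketch only: every identifier here is hypothetical. "
DUMMY Yv
AKEEP [TREATMENTSTRUCTURE=Tform; BLOCKSTRUCTURE=Bform;\
  AFACTORIAL=Flimit; YVARIATE=Yv; EXIT=Code]
" Flimit and Code are scalars, so they can be printed directly. "
PRINT Flimit,Code; DECIMALS=0
```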
The parameters of AKEEP save information about particular model terms in the analysis. With the TERMS parameter you specify a model formula, which Genstat expands to form the series of model terms about which you wish to save information. As in ANOVA, the FACTORIAL option sets a limit on the number of factors in each term. Any term containing more than that limit is deleted. The subsequent parameters allow you to specify identifiers of data structures to store various components of information for each of the terms that you have specified. If there are components that are not required for some of the terms, you should insert a missing identifier (*) at that point of the list. For example
AKEEP Source + Amount + Source.Amount; MEANS=*,*,Meangain;\
SS=Ssource,Samount,Ssbya; VARIANCE=Vsource,*,*
sets up a table Meangain containing the source by amount table of means; it forms scalars Ssource, Samount and Ssbya to hold the sums of squares for Source, Amount and Source.Amount respectively, and scalar Vsource to store the unit variance for the effects of Source.
The structures to hold the information are defined automatically, so you need not declare them in advance. If you have declared any of the tables already, its classification set will be redefined, if necessary, to match the factors in the table that you wish to store. Thus Meangain here would be redefined to be classified by the factors Source and Amount, if it had previously been declared with some other set of classifying factors. Sizes of variates and symmetric matrices will also be redefined if necessary.
Many of the components are stored in tables, classified by the factors in the model term. Tables of means and effects are relevant only for treatment terms. Standard errors for a table of means can be saved using the SEMEANS parameter. For some designs, such as split-plots, different standard errors are needed for the means according to which pair of means is to be compared. The EQFACTORS option allows you to specify factors within the tables of means whose levels are assumed to be equal for the two means. Alternatively, the SEDMEANS parameter can save a symmetric matrix containing a standard error of difference for each pair of means, the VCMEANS parameter can save a symmetric matrix with the variances and covariances for the means, and the LSDMEANS parameter can save a symmetric matrix containing least significant differences. The LSDLEVEL option specifies the significance level to use (default 5%). The DFMEANS parameter saves a symmetric matrix with the degrees of freedom for comparing each pair of means. The rows and columns of these matrices are labelled by the factor name and level (or label if available) of the mean concerned.
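The parameters for means and their comparisons might be combined as in the sketch below, which assumes treatment factors N and S as in the example at the end of this page. Mns, Sedns, Lsdns and Dfns are hypothetical identifiers, and the 1% level is chosen purely for illustration.

```genstat
" Sketch only: Mns, Sedns, Lsdns and Dfns are hypothetical identifiers. "
AKEEP [LSDLEVEL=1] N.S; MEANS=Mns; SEDMEANS=Sedns;\
  LSDMEANS=Lsdns; DFMEANS=Dfns
" The table of means, then the matrices of pairwise s.e.d.s and l.s.d.s. "
PRINT Mns
PRINT Sedns
PRINT Lsdns
```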
Tables of partial effects (saveable only for treatment terms, using the PARTIALEFFECTS parameter) differ from the usual effects presented by Genstat only when there is non-orthogonality. The usual effects of a treatment term are estimated after eliminating the terms that precede it in the model, whereas the partial effects are those that would be estimated after eliminating the subsequent treatment terms as well. The TWOLEVEL option controls what is stored for terms whose factors all have only two levels. The settings responses (the default) or Yates generate a scalar response, whereas TWOLEVEL=effects produces a table of effects. Replications are stored in tables if the values are unequal. For equal replications you can supply either a scalar or a table, but if the saving structure has not been declared AKEEP will define it as a scalar. Tables of residuals are available only for block terms, and the RMETHOD option controls whether or not they are standardized.
Sums of squares, numbers of degrees of freedom, efficiency factors and unit variances are saved in scalars. The unit variance of a treatment term is the residual mean square of the stratum where the term is estimated, divided by its efficiency factor and covariance efficiency factor. Thus you can calculate the estimated variance of any of the effects of the term by dividing its unit variance by the replication of the effect.
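The relationship described above can be written compactly. The symbols are introduced here only for illustration and are not Genstat notation: $s^2$ is the residual mean square of the stratum where the term is estimated, $e$ its efficiency factor, $c$ its covariance efficiency factor, $v$ the unit variance and $r$ the replication of the effect concerned.

```latex
\[
  v = \frac{s^{2}}{e \, c},
  \qquad
  \widehat{\mathrm{var}}(\text{effect}) = \frac{v}{r}
\]
```

For an orthogonal term with no covariates, both $e$ and $c$ are 1, so the unit variance reduces to the residual mean square of the stratum.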
For a treatment term, the RTERM parameter can be used to save a formula containing the model term corresponding to the lowest stratum in which it is estimated (down to and including any stratum defined by the STRATUM option). This can then be used as the setting of the TERMS parameter of a subsequent AKEEP statement to obtain further information about the stratum, for example its number of residual degrees of freedom. For a block term, RTERM saves all the strata to which it would be appropriate to compare the term. So, with a block structure of
Blocks/Plots/Subplots
the command
AKEEP Blocks + Blocks.Plots; RTERM=Rb,Rbp
would define Rb as the formula !f(Blocks.Plots), and Rbp as the formula !f(Blocks.Plots.Subplots). Alternatively, with a block structure of
Reps/(Rows*Columns)
the command
AKEEP Reps; RTERM=Rr
would define Rr as the formula !f(Reps.Rows + Reps.Columns).
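The two-step use of RTERM mentioned above, in which the saved formula supplies the TERMS setting of a second AKEEP, might look like the sketch below. It assumes a treatment factor N; Rn, Resdf, Resss and Resms are hypothetical identifiers, and the use of # to substitute the contents of the saved formula is an assumption about how the formula is passed on.

```genstat
" Sketch only: Rn, Resdf, Resss and Resms are hypothetical identifiers;
  the # substitution of the saved formula is an assumption. "
" Step 1: find the stratum in which N is estimated. "
AKEEP N; RTERM=Rn
" Step 2: save the residual d.f. and s.s. of that stratum. "
AKEEP #Rn; DF=Resdf; SS=Resss
CALCULATE Resms = Resss / Resdf
PRINT Resdf,Resms
```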
There are three parameters for saving information about the covariates. To save the regression coefficients estimated in a particular stratum, you should specify the model term of the stratum with the TERMS parameter and a variate with the CREGRESSION parameter. Genstat defines the variate to have a length equal to the number of covariates, and stores the estimated regression coefficients of the covariates in the order in which they were listed in the COVARIATE statement. The CVCOVARIANCE parameter saves the variances and covariances of the estimated covariate regression coefficients, in a symmetric matrix. The CSSP parameter allows you to obtain sums of squares and products between the covariates for the specified model term. These are arranged in a symmetric matrix. The value in row i on the diagonal is the sum of squares for the term in the analysis of variance that has as its y-variate the ith covariate listed in the COVARIATE statement. The value in row i and column j is the cross-product between the effects estimated for the term in the analysis of variance of covariate i and those estimated for the same term in the analysis of covariate j.
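A sketch of saving the covariate information for one stratum follows. It assumes a block structure of Block/Plot and a y-variate Yield, as in the example at the end of this page; the covariates X1 and X2 and the identifiers Creg, Cvc and Cssp are hypothetical.

```genstat
" Sketch only: X1, X2, Creg, Cvc and Cssp are hypothetical identifiers. "
COVARIATE X1,X2
ANOVA [PRINT=aov] Yield
" Coefficients, their variance-covariance matrix, and the covariate
  sums of squares and products, all for the Block.Plot stratum. "
AKEEP Block.Plot; CREGRESSION=Creg; CVCOVARIANCE=Cvc; CSSP=Cssp
PRINT Creg
```

Creg should then hold two values, for X1 and X2 in that order, since the coefficients follow the order of the COVARIATE statement.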
The CONTRASTS, XCONTRASTS, SECONTRASTS and DFCONTRASTS parameters save information about contrasts. For each treatment term there will generally be several contrasts, so the information is stored in pointers with one element for each contrast. The elements are labelled by the name of the contrast as it appears, for example, in the analysis-of-variance table.
The CBMEANS, SECBMEANS, SEDCBMEANS, VCCBMEANS, LSDCBMEANS, DFCBMEANS, CBEFFECTS, CBVARIANCE, DFCEFFECTS, CBCEFFICIENCY and STRATUMVARIANCE parameters save details of estimates that combine information from all the strata of the design, and the COMPONENT parameter saves the stratum variance components.
In designs where there is partial confounding, and treatment terms are estimated in more than one stratum, options STRATUM and SUPPRESSHIGHER allow you to specify the strata from which the information is to be taken. This is relevant to tables of effects and partial effects, sums of squares, efficiency factors, unit variances, sums of squares and products between covariates, and information about contrasts. By default, Genstat searches all the strata, and takes the information from the lowest of the strata where the term is estimated. If you set the STRATUM option, only strata down to the specified stratum are searched. By setting SUPPRESSHIGHER=yes, you can restrict the search to only that stratum. You cannot save tables of means if you have excluded any stratum from the search. Likewise, tables of residuals and residual sums of squares cannot be saved for any of the excluded strata. If a term is not estimated in any of the strata that are searched, the corresponding data structures are filled with missing values.
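To illustrate how the two options interact, the sketch below assumes a hypothetical design with block structure Blocks/Plots in which the interaction A.B is partially confounded with blocks. The factors A and B and all of the saved identifiers are assumptions made for the example.

```genstat
" Sketch only: A, B, Blocks, Plots and the saved identifiers are
  hypothetical. "
" Default: information from the lowest stratum where A.B is estimated. "
AKEEP A.B; EFFECTS=Elow; SS=Sslow; EFFICIENCY=Eflow
" Information from the Blocks stratum only. "
AKEEP [STRATUM=Blocks; SUPPRESSHIGHER=yes] A.B;\
  EFFECTS=Eblk; SS=Ssblk; EFFICIENCY=Efblk
```

MEANS is deliberately left out of the second statement, because tables of means cannot be saved once any stratum has been excluded from the search.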
The STATUS parameter saves an integer code that describes the type of term, and how it is estimated. If the term is a treatment term, the code also gives information about how its marginal terms are estimated. (For example, the interaction term A.B has the main effects A and B as margins.)
1 | the term is a treatment term; the term itself and all of its margins are orthogonal, and are estimated in the same stratum. |
---|---|
2 | the term is a treatment term; the term itself and all of its margins have the same efficiency factor, and are estimated in the same stratum. |
3 | the term is a treatment term; the term and its margins have different efficiency factors, but are all estimated in the same stratum. |
4 | the term is a treatment term; the term itself and all of its margins are orthogonal, but are estimated in different strata. |
5 | the term is a treatment term; the term itself and all of its margins have the same efficiency factor, but are estimated in different strata. |
6 | the term is a treatment term; the term and its margins have different efficiency factors and are all estimated in different strata. |
0 | the term is a treatment term; the term itself or one of its margins is aliased. |
-1 | the term is an orthogonal block term. |
-2 | the term is a non-orthogonal block term. |
* | the term was not in either the block or treatment model but all of its factors occurred somewhere in the analysis (AKEEP gives a fault if the term contains factors that did not occur anywhere in the analysis); all other parameters are then ignored for that term. |
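The codes in the table above might be inspected as in this sketch, which assumes treatment factors N and S as in the example at the end of this page; Stn, Sts and Stns are hypothetical identifiers.

```genstat
" Sketch only: Stn, Sts and Stns are hypothetical identifiers. "
AKEEP N + S + N.S; STATUS=Stn,Sts,Stns
" The codes are integers, so print them without decimal places. "
PRINT Stn,Sts,Stns; DECIMALS=0
```

Checking the code before requesting other components is one way to find out whether a term and its margins are orthogonal and estimated in a single stratum.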
As explained in the description of the BLOCKSTRUCTURE directive, Genstat will set up an extra “factor” denoted *Units* if the block formula does not specify the final stratum explicitly. AKEEP allows you to refer to this “factor”, if necessary, by putting the string '*Units*' (or '*units*' or '*UNITS*') in the TERMS formula. Thus, to save the residual sum of squares in these circumstances, you could put
AKEEP '*Units*'; SS=ResidSS
Options: FACTORIAL, STRATUM, SUPPRESSHIGHER, TWOLEVEL, RESIDUALS, FITTEDVALUES, CBRESIDUALS, CBCREGRESSION, CBCVCOVARIANCE, TREATMENTSTRUCTURE, BLOCKSTRUCTURE, AFACTORIAL, WEIGHTS, YVARIATE, LSDLEVEL, AOVTABLE, EQFACTORS, RMETHOD, EXIT, SAVE.
Parameters: TERMS, MEANS, SEMEANS, SEDMEANS, VCMEANS, EFFECTS, PARTIALEFFECTS, REPLICATIONS, RESIDUALS, DF, LSDMEANS, DFMEANS, SS, EFFICIENCY, VARIANCE, RTERM, CEFFICIENCY, CREGRESSION, CVCOVARIANCE, CSSP, CONTRASTS, XCONTRASTS, SECONTRASTS, DFCONTRASTS, CBMEANS, SECBMEANS, SEDCBMEANS, VCCBMEANS, LSDCBMEANS, DFCBMEANS, CBEFFECTS, CBVARIANCE, DFCEFFECTS, CBCEFFICIENCY, STRATUMVARIANCE, COMPONENT, STATUS.
See also
Directives: ANOVA, BLOCKSTRUCTURE, COVARIATE, TREATMENTSTRUCTURE.
Procedures: AFMEANS, AUKEEP, A2KEEP, ASPREADSHEET, A2RDA.
Commands for: Analysis of variance.
Example
" Example ANOV-5: randomized block design with two treatment factors"
" This is a field experiment to study the effects of nitrogen and sulphur
  on the yield of wheat with a randomized block design."
FILEREAD [NAME='%gendir%/examples/ANOV-5.DAT'; PRINT=summary]\
         Block,Plot,N,S,Yield; FGROUPS=4(yes),no
" Define the structure of the treatments in the design: here we have
  the factorial structure of nitrogen (N) crossed with sulphur (S).
  The model formula
      N * S
  expands to give the 3 model terms:
      N, S and N.S
  representing the main effects of nitrogen & sulphur, and their
  interaction. Each of these will have a line in the aov table."
TREATMENTSTRUCTURE N * S
" The conventional way of analysing these designs, which can be seen
  in many text books, can be achieved in Genstat simply by putting the
  block factor (here called Block) at the start of the treatment formula:
      Block + N * S
  This works for straightforward designs like the randomized-block design
  but it is not satisfactory in more complicated situations like the
  balanced-incomplete-block or split-plot designs (see later examples).
  Moreover, the analysis that is obtained does not reflect the real
  structure of the design - for example that Block is a random term and
  not a fixed term like the treatment main effects and interaction.
  Consequently, the BLOCKSTRUCTURE directive allows you to define the
  underlying structure of the design - and thus the random (or error)
  terms for the analysis. The randomized-block design has an underlying
  structure of units (here the factor Plot) nested within blocks."
BLOCKSTRUCTURE Block / Plot
" The formula expands to give two model terms
      Block + Block.Plot
  each of these defines a stratum in the analysis-of-variance table:
  the Block stratum contains the variation between blocks, and the
  Block.Plot stratum contains the variation between the plots within
  each block.
  As the blocks all contain an identical collection of treatments, no
  treatment terms are estimated between blocks - they are all in the
  Block.Plot stratum. Genstat discovers this all automatically for you
  and produces the correct analysis-of-variance table."
" The PRINT option is set to give just the analysis-of-variance table,
  and FPROBABILITY requests probabilities for the variance ratios."
ANOVA [PRINT=aov; FPROBABILITY=yes] Yield
" Use ADISPLAY to print the means (without recalculating the analysis)."
ADISPLAY [PRINT=means]
" Plot the means against S, with a different line for each level of N."
AGRAPH [METHOD=line] S; GROUPS=N
" Example ANOV-5a: saving output from ANOVA"
" Using AKEEP to save the sums of squares and degrees of freedom
  for N and S."
AKEEP N+S; SS=N_ss,S_ss; DF=N_df,S_df
PRINT N_ss,N_df,S_ss,S_df; DECIMALS=5,0