Analyses a row-and-column design by REML, with automatic selection of the best random and spatial covariance model (R.W. Payne).
Options
PRINT = string tokens | Controls what summary output is produced about the models (deviance, aic, bic, sic, dffixed, dfrandom, change, exit, best, description); default best, desc |
---|---|
PBEST = string tokens | Controls the output from the REML analysis with the best model (model, components, effects, means, stratumvariances, monitoring, vcovariance, deviance, Waldtests, missingvalues, covariancemodels, aic, sic, bic); default * i.e. none |
PTRY = string tokens | Controls the output to present from the REML analysis used to try each model (model, components, effects, means, stratumvariances, monitoring, vcovariance, deviance, Waldtests, missingvalues, covariancemodels, aic, sic, bic); default * i.e. none |
FIXED = formula | Fixed model terms; default * i.e. none |
RANDOM = formula | Additional random model terms; default * i.e. none |
CONSTANT = string token | How to treat the constant term (estimate, omit); default esti |
FACTORIAL = scalar | Limit on the number of factors or covariates in each fixed term; default 3 |
REPLICATES = factor | Replicate factor, if relevant |
ROWS = factor | Row factor; default * i.e. must be specified |
COLUMNS = factor | Column factor; default * i.e. must be specified |
ROWCOORDINATES = variate or factor | Row coordinates for fitting trends and spatial models if the design is irregular; if unset, these are defined from the levels of the ROWS factor |
COLCOORDINATES = variate or factor | Column coordinates for fitting trends and spatial models if the design is irregular; if unset, these are defined from the levels of the COLUMNS factor |
PLOTFACTOR = factor | Factor numbering the plots in the design; if unset, a local factor is defined automatically |
PTERMS = formula | Terms (fixed or random) for which effects or means are to be printed; default * implies all the fixed terms |
PSE = string token | Standard errors to be printed with tables of effects and means (differences, estimates, alldifferences, allestimates, none); default diff |
MVINCLUDE = string tokens | Whether to include units with missing values in the explanatory factors and variates and/or the y-variates (explanatory, yvariate); default * i.e. omit units with missing values in either explanatory factors or variates or y-variates |
VCONSTRAINTS = string token | Whether to constrain variance components to be positive (none, positive); default none |
RSTRATEGY = string token | Strategy for selecting the random model (all, allfeasible, set, setfeasible, fastoptimal, optimal, automatic, comprehensive, full, given); default allf |
METHOD = string token | Criterion to choose the best random model (aic, sic, bic); default sic |
TRYSPATIAL = string token | Whether to try spatial models (always, ifregular); default * i.e. no spatial models |
TRYTRENDS = string token | Whether to see whether row and column trends are needed in the fixed model (yes, no); default no |
SPATIALFACTOR = factor | Factor to use to define the term for a 2-dimensional power-distance model; if unset, a local factor is defined automatically |
Parameters
Y = variates | Response variates |
---|---|
BESTMODEL = pointers | Saves a model-definition structure for the best model for each y-variate |
EXIT = scalars | Exit status of the best model for each y-variate |
SAVE = REML save structures | Save structure from the analysis of the best model for each y-variate |
Description
VAROWCOLUMNDESIGN allows you to try various random and covariance models for a REML analysis of data from a row-and-column design, and select the best one according to either their Akaike or Schwarz (Bayesian) information coefficients.
A row-and-column design is a design where the plots are set out in a rectangular grid. Often this is a regular grid, where the rows and columns are equally spaced and there are no gaps, but irregular arrangements can be handled too. Some designs are resolvable. The field can then be divided into sections in which each treatment is replicated once. These replicates can be useful while the experiment is taking place. For example, if several operators are needed to make observations of the plots, it is usual to get each one to observe the plots of a complete replicate. Then any operator differences will be included in the between-replicate variation, and will not add to the variability of the treatment estimates. Of course it can be useful to include a replicate factor even if the “replicates” are not exact, e.g. if some of the treatments do not occur at every level of the replicate factor. The replicate factor, if available, is specified by the REPLICATES option.
The row and column factors are specified by the ROWS and COLUMNS options respectively. If the replicates are adjacent to each other in the field and you want to fit spatial covariance models across the whole field, rather than within each replicate, you should define the levels of the row and column factors to run across the experiment. Otherwise they should be defined within replicates (i.e. using the same numbers within each replicate). The spatial models will then be fitted within replicates.
You can use the ROWCOORDINATES and COLCOORDINATES options to specify variates or factors giving the actual positions of the plots in the field. These are needed if you want to fit row or column trends (i.e. covariates) in the fixed model, or to fit a power-distance covariance model when the plots are on an irregular grid. If the levels of the ROWS and COLUMNS factors are defined across the whole experiment rather than within replicates, their values are used as defaults if ROWCOORDINATES and COLCOORDINATES are not set. They are also used as defaults if ROWCOORDINATES or COLCOORDINATES is set to a variate or factor with no values; that variate or factor is then defined to contain those values.
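As a rough sketch of how the coordinate options might be supplied for an irregular layout, the call below passes plot positions and asks for spatial models to be tried. The variates rowpos and colpos are hypothetical (they are not part of any data set mentioned here); the other identifiers are borrowed from the Slate Hall Farm example.

```genstat
" Sketch only: rowpos and colpos are hypothetical variates holding the "
" positions of the plots in the field. "
VAROWCOLUMNDESIGN [FIXED=variety; ROWS=fieldrow; COLUMNS=fieldcolumn;\
                  ROWCOORDINATES=rowpos; COLCOORDINATES=colpos;\
                  TRYSPATIAL=always] Y=yield
```

With TRYSPATIAL=always, the coordinates would be used for the power-distance model if the grid turns out to be irregular.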
The PLOTFACTOR option allows you to specify a factor to index the plots (which is needed to include a random term for measurement error). If this is not set, a local factor called plots is set up automatically.
The FIXED option specifies the fixed terms to be fitted in the analysis. The default fixed model consists of just the constant term, which then becomes the grand mean. The constant term can be omitted by setting option CONSTANT=omit, provided a fixed model has been specified. The FACTORIAL option sets a limit on the number of factors and variates allowed in each fixed term (default 3); any term containing more than that number is deleted from the model. The RANDOM option allows you to specify any extra random terms to include (in addition to replicates, rows and columns). The VCONSTRAINTS option allows you to constrain the variance components to be positive; by default they are not constrained.
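The following sketch shows one way these options might be combined. The factors nitrogen, sowdate and operator are hypothetical, introduced only for illustration; with FACTORIAL=2 the three-factor interaction in the fixed formula would be deleted.

```genstat
" Sketch only: nitrogen, sowdate and operator are hypothetical factors. "
VAROWCOLUMNDESIGN [FIXED=variety*nitrogen*sowdate; FACTORIAL=2;\
                  RANDOM=operator; VCONSTRAINTS=positive;\
                  ROWS=fieldrow; COLUMNS=fieldcolumn] Y=yield
```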
The RSTRATEGY option selects the strategy to use to determine the random model, with the following settings.
all | fits the full random model, i.e. replicates, rows within replicates and columns within replicates if REPLICATES is set, or rows and columns otherwise. This is appropriate if the row and column factors played a key role in the design and its randomization. For example, some factors may have been applied to complete rows or complete columns, as in a strip-block design. |
---|---|
allfeasible | tries to fit the full random model. If this is not possible, it tries models removing first one random term, then two and so on, until successful. |
set | simply uses the random model (if any) defined by the RANDOM option. This is useful when you know the random model and want to investigate the effect of adding spatial covariance models. |
setfeasible | tries to fit the random model defined by the RANDOM option. If this is not possible, it tries models removing first one random term, then two and so on, until successful. |
fastoptimal | follows an automatic strategy that aims to find the best random model without having to fit all of them. So, for example, it does not try models that include a column main effect as well as a spatial covariance model along rows. |
optimal | tries all feasible random models. This may take a while, and so may be best left for the occasions when you are unsure what to do, or want to check the result from an automatic search. |
full | synonym of all. |
given | synonym of set. |
automatic | synonym of fastoptimal. |
comprehensive | synonym of optimal. |
VAROWCOLUMNDESIGN regards a model as successful if the REML directive returns an exit status of zero (i.e. successful fitting) and there are no bound or aliased variance parameters. The default is RSTRATEGY=allfeasible.
The TRYSPATIAL option indicates whether to try fitting spatial models, with settings:
always | always tries to fit them, |
---|---|
ifregular | fits them only if the plots are on a regular grid. |
With the default, TRYSPATIAL=*, no spatial models are fitted. For a regular grid, VAROWCOLUMNDESIGN tries models with order 1 auto-regressive structures on the rows and/or the columns of the design, provided there are more than four rows or columns, respectively. For an irregular grid, if there are more than four rows and more than four columns, it tries an anisotropic power-distance model using city-block distance. Otherwise, if there is only one dimension with more than four coordinates, it tries an isotropic power-distance model.
The SPATIALFACTOR option allows you to specify a factor to use to define the term required for a two-dimensional power-distance model. If this is not set, a local factor called RowColumn2d is used.
You can set option TRYTRENDS=yes to see whether row and column trends (i.e. covariates) are needed in the fixed model. By default this is not done.
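For instance, a call along the following lines would restrict spatial models to regular grids and also check for trends. It is only a sketch; the identifiers are those of the Slate Hall Farm example, on the assumption that fieldrow and fieldcolumn number the rows and columns across the whole field.

```genstat
" Sketch only: spatial models are tried only if the grid is regular, "
" and row and column trends are assessed in the fixed model. "
VAROWCOLUMNDESIGN [FIXED=variety; REPLICATES=replicates;\
                  ROWS=fieldrow; COLUMNS=fieldcolumn;\
                  TRYSPATIAL=ifregular; TRYTRENDS=yes] Y=yield
```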
The MVINCLUDE option controls whether units with missing values in the explanatory factors and variates and/or the y-variate are included in the analysis, as in the REML directive.
The METHOD option specifies how to choose the best random model:
aic | uses their Akaike information coefficients, |
---|---|
sic or bic | uses their Schwarz (Bayesian) information coefficients (default). |
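As an illustration of combining the search strategy with the selection criterion, the sketch below asks for an exhaustive search judged by the Akaike coefficient, and prints both coefficients for comparison. The identifiers are again those of the Slate Hall Farm example.

```genstat
" Sketch only: try all feasible random models and choose by AIC. "
VAROWCOLUMNDESIGN [PRINT=best,description,aic,sic;\
                  FIXED=variety; REPLICATES=replicates;\
                  ROWS=fieldrow; COLUMNS=fieldcolumn;\
                  RSTRATEGY=optimal; METHOD=aic] Y=yield
```

Because RSTRATEGY=optimal may take a while, it may be sensible to run the default strategy first and use this form as a check.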
The PRINT option specifies the summary output to be produced about the models. The settings are mainly the same as those of the VRACCUMULATE procedure (which is used to store and then print details of the analyses). There is an extra setting, description, to provide a description of the model and strategy. There is also a setting, best, to print the description of the best random model. By default, PRINT=best,description.
The PBEST option specifies the output to be produced from the REML analysis with the best model. Similarly, the PTRY option indicates what output should be produced for each candidate random model when it is tried. Their settings are mainly the same as those of the PRINT option of the REML directive. There are also extra settings aic and sic (with a synonym bic) to print the Akaike and Schwarz (Bayesian) information coefficients, respectively. The default for both these options is to produce no output.
The PTERMS option operates as in REML, to specify the terms whose means and effects are printed by PBEST and PTRY; the default is all the fixed terms. Likewise, the PSE option controls the type of standard error that is displayed with the means and effects; the default is to give a summary of the standard errors of differences.
The Y parameter specifies the response variate. A model-definition structure for the best model can be saved, in a pointer, by the BESTMODEL parameter; the VMODEL procedure can use this to define the model (using the VCOMPONENTS and VSTRUCTURE directives) so that you can reanalyse it yourself using the REML directive. Alternatively, you can save the REML save structure from the analysis with the best model using the SAVE parameter. The EXIT parameter allows you to save a code from REML, giving the “exit status” of the fit (zero if successful).
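A minimal sketch of saving the results follows. The names bestmodel, exitcode and savebest are arbitrary identifiers chosen for illustration; the PRINT directive is used only to inspect the saved exit status.

```genstat
" Sketch only: save the best model, its exit status and the REML save "
" structure, then display the exit status (zero if successful). "
VAROWCOLUMNDESIGN [FIXED=variety; REPLICATES=replicates;\
                  ROWS=fieldrow; COLUMNS=fieldcolumn]\
                  Y=yield; BESTMODEL=bestmodel; EXIT=exitcode;\
                  SAVE=savebest
PRINT exitcode
```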
Options: PRINT, PBEST, PTRY, FIXED, RANDOM, CONSTANT, FACTORIAL, REPLICATES, ROWS, COLUMNS, ROWCOORDINATES, COLCOORDINATES, PLOTFACTOR, PTERMS, PSE, MVINCLUDE, VCONSTRAINTS, RSTRATEGY, METHOD, TRYSPATIAL, TRYTRENDS, SPATIALFACTOR.
Parameters: Y, BESTMODEL, EXIT, SAVE.
Method
Model-definition structures for various candidate models, involving rows, columns, measurement error, replicates (if specified) and spatial models (if requested), are defined using the VFMODEL procedure. (Run the example to see those that are considered for a resolvable, regular row-and-column design.) The VARANDOM procedure is used to fit them, with the VRACCUMULATE procedure storing the necessary details for the best one to be selected.
See also
Directives: REML, VCOMPONENTS, VSTRUCTURE.
Procedures: VABLOCKDESIGN, VAOPTIONS, VARANDOM, VARECOVER, VASERIES, VALINEBYTESTER, VFMODEL.
Commands for: REML analysis of linear mixed models.
Example
CAPTION 'VAROWCOLUMNDESIGN example',\
        'Slate Hall Farm data (Guide to REML in Genstat, Section 1.8).';\
        STYLE=meta,plain
SPLOAD  '%gendir%/data/slatehall.gsh'
" find best row-column model (with variety as a fixed term) "
VAROWCOLUMNDESIGN [PRINT=best,description,deviance,aic,sic,dfrandom;\
        PTRY=*; FIXED=variety; REPLICATES=replicates;\
        ROWS=fieldrow; COLUMNS=fieldcolumn;\
        TRYSPATIAL=always; TRYTRENDS=yes;\
        RSTRATEGY=fastoptimal]\
        Y=yield; BESTMODEL=bestmodel; SAVE=savebest
VDISPLAY [PRINT=model,components,wald] savebest
" plot variogram "
VKEEP   [RESIDUALS=residuals; SAVE=savebest]
F2DRESIDUALVARIOGRAM [ROW=fieldrow; COLUMN=fieldcolumn] residuals
" plot residuals against rows and columns "
VPLOT   [INDEX=fieldrow] index
VPLOT   [INDEX=fieldcolumn] index