Fits all subsets of the fixed terms in a REML analysis (R.W. Payne).
PRINT = string tokens
|Controls printed output (
FORCED = formula
|Terms to include in every model
FACTORIAL = scalar
|Limit for expansion of FORCED terms; default 3
SELECTION = string tokens
|One or two criteria to be printed with the models (r2, adjusted, cp, ep, aic, bic, rss, rms); default aic, bic
NBESTMODELS = scalar
|Number of models to print; default * i.e. all
BESTMODEL = pointer
|Saves the best model according to the selected criteria
RESULTS = pointer
|Pointer to save variates containing the criteria for the sets, and F and Wald statistics for the terms that they contain
MARGINALTERMS = string token
|How to treat terms that are marginal to other terms (forced, free); default forced
SAVE = REML save structure
|Specifies the analysis whose fixed terms are to be tested; by default this will be the most recent
VALLSUBSETS fits all subsets of the fixed terms in a REML analysis. It does this by a generalized regression analysis, with a weight matrix based on the variances estimated from the REML analysis (i.e. with the full fixed model). The subsets are thus assessed using identical estimates of the variance components, allowing statistics such as the Akaike information criterion to be used to assess which subset may be best.
By default, VALLSUBSETS uses the most recent REML analysis. However, you can take an earlier analysis, by using the SAVE option of VALLSUBSETS to specify its save structure (saved using the SAVE parameter of the earlier REML analysis).
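As a minimal sketch of this, assuming the data and model from the example at the end of this page, the analysis can be saved and passed to VALLSUBSETS later. The identifier RatSave is a hypothetical name chosen for illustration.

```genstat
VCOMPONENTS [FIXED=Littersize+Dose*Sex] RANDOM=Dam/Pup
"Keep the save structure of this analysis (RatSave is an illustrative name)"
REML Weight; SAVE=RatSave
"... other REML analyses may be run here ..."
"Test the fixed terms of the saved analysis, not the most recent one"
VALLSUBSETS [SAVE=RatSave]
```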
The subsets are formed from all the fixed terms, but you can use the FORCED option to specify terms that should always be included. Terms that are marginal to another fixed term are usually also treated as forced. However, you can set option MARGINALTERMS to free to retain them in the “free” terms that are used to form the subsets. Note that VALLSUBSETS considers only models that obey the principle of marginality. This states that a model that includes an interaction term must also include all its marginal terms. For example, a model that includes the interaction A.B must also include the main effects A and B.
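For illustration, assuming a REML analysis with the fixed model Littersize+Dose*Sex (as in the example at the end of this page), the following sketch keeps Littersize in every subset. With the default setting of MARGINALTERMS, the main effects Dose and Sex, being marginal to Dose.Sex, would usually be treated as forced as well; the second call frees them.

```genstat
"Littersize is included in every model"
VALLSUBSETS [FORCED=Littersize]
"As above, but Dose and Sex may also be dropped (subject to marginality)"
VALLSUBSETS [FORCED=Littersize; MARGINALTERMS=free]
```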
The SELECTION option selects one or two criteria to be printed with the sets, with the settings:
r2 % sum of squares accounted for (taking the total sum of squares as the residual from the forced model),
adjusted % variance accounted for (compared to the residual mean square from the forced model),
cp Mallows Cp,
ep mean squared error of prediction,
aic Akaike information criterion,
bic Schwarz (Bayesian) information criterion,
rss residual sum of squares, and
rms residual mean square.
For more details, see the RSEARCH procedure (which is used to do the analyses).
VALLSUBSETS reports which subset is best, according to each of the selected criteria. The default selects the Akaike and Schwarz (Bayesian) information criteria.
In addition to the selected criteria, the output shows the number of degrees of freedom fitted in the subset, and probabilities assessing the effect of dropping each of its terms from the subset. The probabilities are obtained from F statistics if the denominator degrees of freedom are available from the original REML analysis. Otherwise they are based on Wald statistics. Terms that are marginal to another term in the subset cannot be dropped. This is indicated by printing marg instead of a probability. Also, terms that are aliased are indicated by printing aaa. By default, all the subsets are printed, but you can set the NBESTMODELS option to a scalar, n say, to print only the n best subsets according to the first criterion specified by the SELECTION option.
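A short sketch of how these two options might be combined, assuming a REML analysis has already been run. The choice of criteria and the value 5 are arbitrary illustrations.

```genstat
"Print Mallows Cp and AIC for each subset, but show only the"
"5 best subsets according to Cp (the first criterion listed)"
VALLSUBSETS [SELECTION=cp,aic; NBESTMODELS=5]
```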
The results are printed by default. However, you can set option PRINT=* if you want only to save them, using the RESULTS option. This saves a pointer containing variates storing all the available criteria and the numbers of degrees of freedom, then the Wald statistics for the terms, followed by their probabilities, and then the F statistics and their probabilities.
You can also use the BESTMODEL option to save the best model according to each of the selected criteria. It saves them in a pointer containing either one or two model formulae (according to the number of selected criteria). The formulae are stored in the order in which the criteria were specified by the SELECTION option, and are labelled in the pointer by the names of the criteria.
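The saving options could be used together along the following lines, assuming a REML analysis has already been run. The identifiers Res and Best are hypothetical names chosen for illustration.

```genstat
"Suppress the printed output and save the results instead"
"(Res and Best are illustrative identifiers)"
VALLSUBSETS [PRINT=*; SELECTION=aic,bic; RESULTS=Res; BESTMODEL=Best]
```

Following the description above, Best should then be a pointer to two formulae, the first for aic and the second for bic.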
VALLSUBSETS defines a weighted regression, with weight matrix given by the inverse of the unit-by-unit variance-covariance matrix (obtained using the UVCOVARIANCE option of VKEEP). It then calls the RSEARCH procedure to fit the subsets.
Any restriction applied to vectors used in the REML analysis will apply also to the results from VALLSUBSETS.
CAPTION 'VALLSUBSETS example','Guide Part 2, Example 5.3.6a'; STYLE=meta,plain
FACTOR [NVALUES=322; LEVELS=27] Dam
& [LEVELS=18] Pup
FACTOR [LEVELS=2; LABELS=!T('M','F')] Sex
FACTOR [LEVELS=3; LABELS=!T('C','Low','High')] Dose
VARIATE [NVALUES=322] Littersize,Weight
OPEN '%GENDIR%/Examples/GuidePart2/Rats.dat'; CHANNEL=2
READ [CHANNEL=2] Dose,Sex,Littersize,Dam,Pup,Weight; \
  FREPRESENTATION=2(labels),4(levels)
CLOSE 2
VCOMPONENTS [FIXED=Littersize+Dose*Sex] RANDOM=Dam/Pup
REML Weight
VALLSUBSETS [MARGINALTERMS=free]