
RWALD procedure

Calculates Wald and F tests for dropping terms from a regression (R.W. Payne).

Options

PRINT = string token Controls printed output (waldtests); default waldtests
FACTORIAL = scalar Limit on number of factors in the model terms generated from the TERMS parameter; default 3
Y = variate Y-variate from whose analysis to calculate the statistics; default is the last y-variate in SAVE
RDF = scalar Saves the residual d.f. used to calculate F probabilities when the dispersion is not fixed
SAVE = regression save structure Specifies the save structure (from MODEL) containing the analysis for which to calculate the tests; default is the save structure from the most recent regression

Parameters

TERMS = formula Model terms for which tests are required
WALDSTATISTIC = scalar or pointer to scalars Saves Wald statistics
DF = scalar or pointer to scalars Saves d.f. of Wald statistics
PROBABILITY = scalar or pointer to scalars Saves the probabilities for the Wald statistics if the dispersion is fixed, or for the corresponding F statistics if it is estimated

Description

RWALD provides Wald tests to help you decide whether any terms can be dropped from a regression model. The model must have been fitted already by the regression commands (MODEL, FIT etc.) in the usual way. The tests are usually produced for the most recent regression analysis, but you can set the SAVE and Y options to request tests from an earlier analysis.
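As a rough sketch of requesting tests from an earlier analysis: the identifiers y1, y2, x1, x2 and fit1 below are hypothetical, and the sketch assumes that the earlier analysis was stored by setting the SAVE option of MODEL.

```genstat
" Hypothetical data: variates y1, y2, x1 and x2 are assumed to exist. "
MODEL [SAVE=fit1] y1
FIT   x1 + x2
" A later analysis now becomes the most recent regression. "
MODEL y2
FIT   x1
" Request Wald tests from the earlier analysis stored in fit1. "
RWALD [SAVE=fit1]
```

Without the SAVE option, the final RWALD would instead report on the analysis of y2.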

By default, RWALD produces tests for all the terms that can be dropped from the model: that is, for every term that is not marginal to another term in the model. For example, in the formula

A + B + C + D + A.B + A.D + B.D

the terms C, A.B, A.D and B.D can be dropped as there are no other terms in the model that contain all their factors (i.e. none to which they are marginal). However, A cannot be dropped until A.B and A.D have been dropped. You can use the TERMS parameter to request Wald tests for a specific set of terms. A missing value is then given for any term that cannot be dropped. The FACTORIAL option sets a limit on the number of factors or variates in each term that is formed from the TERMS formula (default 3).
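The marginality rule above can be illustrated with a minimal sketch. The response variate y and the factors A, B, C and D are hypothetical names chosen to match the formula above.

```genstat
" Hypothetical data: response variate y and factors A, B, C and D. "
MODEL y
FIT   A + B + C + D + A.B + A.D + B.D
" C and A.B can be dropped. A is marginal to A.B and A.D, "
" so a missing value would be expected for A. "
RWALD TERMS=C + A.B + A
```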

If option PRINT=waldtests (the default), RWALD prints a table with columns containing the Wald statistic, its number of degrees of freedom and a probability value. With an ordinary linear regression, RWALD will also print an F statistic, and use this to obtain the probability. Provided there is no aliasing between the parameters of the terms, these F statistics and probabilities will be identical to those that would be printed in the Change lines of the Summary of Analysis if the terms were dropped from the model explicitly by using the DROP or TRY directives. The advantage of RWALD is that the model does not have to be refitted (excluding each term) to calculate the information. It thus provides a much more efficient method of assessing the model.

F statistics are also given with any generalized linear model in which the dispersion is not fixed (e.g. models involving the gamma distribution). However, in generalized linear models with a fixed dispersion (e.g. binomial or Poisson), the probabilities are obtained by treating the Wald statistics as chi-square statistics. The deviances and deviance ratios used by TRY and DROP are calculated from the likelihoods of the generalized linear models, whereas the Wald and F statistics are essentially based on weighted sums of squares. So probabilities calculated by RWALD will no longer be identical to those given by TRY and DROP. However, both sets of probabilities are based on the asymptotic properties of their statistics, and so they should give similar conclusions.
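A minimal sketch of the fixed-dispersion case follows. The variate counts, the factor treat and the variate dose are hypothetical, and the sketch assumes the usual DISTRIBUTION and LINK options of MODEL.

```genstat
" Hypothetical data: variate counts, factor treat and variate dose. "
MODEL [DISTRIBUTION=poisson; LINK=logarithm] counts
FIT   treat + dose
" The dispersion is fixed for the Poisson distribution, so the "
" probabilities are obtained from chi-square rather than F. "
RWALD
```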

The WALDSTATISTIC parameter can save the statistics, and the DF parameter can save their numbers of degrees of freedom. If you are making a Wald test for a single term, you can supply a scalar for each of these parameters. However, if you have several terms, you must supply a pointer, which will then be set up to contain as many scalars as there are terms. Similarly, the PROBABILITY parameter saves the probabilities for the Wald statistics if the dispersion is fixed, or for the corresponding F statistics if it is estimated. The number of residual degrees of freedom for the F statistics can be saved, in a scalar, by the RDF option. This contains a missing value if the dispersion is fixed.
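The following sketch shows one way the results might be saved, assuming the cloud-seeding model from the Example section has just been fitted. The identifiers wE, dfE, pE, wald, ndf, prob and resdf are hypothetical, and setting PRINT=* to suppress the printed table is assumed to follow the usual Genstat convention.

```genstat
" Single term: scalars are sufficient. "
RWALD [PRINT=*] TERMS=E; WALDSTATISTIC=wE; DF=dfE; PROBABILITY=pE
PRINT wE,dfE,pE
" Several terms: each pointer is set up with one scalar per term. "
RWALD [PRINT=*; RDF=resdf] TERMS=D + C + Lp; WALDSTATISTIC=wald;\
      DF=ndf; PROBABILITY=prob
PRINT wald[],ndf[],prob[],resdf
```

Because this model is an ordinary linear regression, the dispersion is estimated, so resdf should hold the residual d.f. rather than a missing value.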

Options: PRINT, FACTORIAL, Y, RDF, SAVE.

Parameters: TERMS, WALDSTATISTIC, DF, PROBABILITY.

Method

RWALD uses FCLASSIFICATION to form the list of terms that can be dropped. It then calculates the statistics using estimates and variances saved using RKESTIMATES.
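For orientation, the textbook form of a Wald test for dropping a term is sketched below. This is an assumption about the general calculation, not a statement of the procedure's internal code, and it ignores aliasing. The symbols are introduced here for illustration: \hat{\beta}_t is the vector of estimated parameters for the term, V_t is their estimated variance-covariance matrix, d is the number of parameters in the term, and r is the residual d.f.

```latex
W = \hat{\beta}_t^{\top} V_t^{-1} \hat{\beta}_t ,
\qquad
F = \frac{W}{d} ,
\qquad
p =
\begin{cases}
  \Pr\left( \chi^2_{d} \ge W \right)  & \text{dispersion fixed} \\
  \Pr\left( F_{d,\,r} \ge F \right)   & \text{dispersion estimated}
\end{cases}
```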

See also

Procedures: ASCREEN, RSCREEN.

Commands for: Regression analysis.

Example

CAPTION 'RWALD example',\
'Cloud seeding example; see Guide to Genstat Part 2, Section 3.3.';\
        STYLE=meta,plain
" Variables are: A  Action (NS not seeded, S seeded)
                 D  Days after first day of experiment
                 S  Suitability for seeding (from model)
                 C  Percent cloud cover
                 P  Previous rainfall (in 10**7 cubic m)
                 E  Type of cloud (1 or 2)
                 Y  Subsequent rainfall (in 10**7 cubic m)"
FACTOR  [LABELS=!t(S,NS)] A
FACTOR  [LEVELS=2] E
READ    A,D,S,C,P,E,Y; FREPRESENTATION=labels,4(*),levels,*
NS  0 1.75 13.4 0.274 2 12.85     S  1 2.70 37.9 1.267 1  5.52
 S  3 4.10  3.9 0.198 2  6.29    NS  4 2.35  5.3 0.526 1  6.11
 S  6 4.25  7.1 0.250 1  2.45    NS  9 1.60  6.9 0.018 2  3.61
NS 18 1.30  4.6 0.307 1  0.47    NS 25 3.35  4.9 0.194 1  4.56
NS 27 2.85 12.1 0.751 1  6.35     S 28 2.20  5.2 0.084 1  5.06
 S 29 4.40  4.1 0.236 1  2.76     S 32 3.10  2.8 0.214 1  4.05
NS 33 3.95  6.8 0.796 1  5.74     S 35 2.90  3.0 0.124 1  4.84
 S 38 2.05  7.0 0.144 1 11.86    NS 39 4.00 11.3 0.398 1  4.45
NS 53 3.35  4.2 0.237 2  3.66     S 55 3.70  3.3 0.960 1  4.22
NS 56 3.80  2.2 0.230 1  1.16     S 59 3.40  6.5 0.142 2  5.45
 S 65 3.15  3.1 0.073 1  2.02    NS 68 3.15  2.6 0.136 1  0.82
 S 82 4.01  8.3 0.123 1  1.09    NS 83 4.65  7.4 0.168 1  0.28 :
CALCULATE Lp,Ly = LOG10(P,Y)
MODEL     Ly
TERMS     A*(D+S+C+Lp+E)
FIT       [PRINT=model,estimates] A + S + D + C + Lp + E + S.A
RWALD
TRY       [PRINT=model,summary; NOMESSAGE=residual,leverage; FPROB=yes]\
          D + C + Lp + E + S.A
Updated on March 5, 2019
