Stores results from a linear, generalized linear, generalized additive or nonlinear model.
||Whether to put estimates in the order defined by the maximal model for linear or generalized linear models (
||Dispersion parameter to be used as estimate for variability in s.e.s; default as set in the
||Type of residuals to form if parameter
||Basis of estimate of dispersion, if not fixed by
||Probability level for confidence limits; default 0.95|
||Pointer to settings of options of the current
||Pointer to settings of parameters of the current
||Saves all the statistics that could be displayed for the first
||Method to use to calculate confidence intervals for nonlinear models (
||Whether to ignore failure to fit a generalized linear model (
||Saves the maximal model (as defined by
||Saves the currently-fitted model (including any contrast functions)|
||Saves a scalar containing the value one if the constant is included in the fitted model, or zero otherwise|
||Saves a scalar to indicate the type of model that has been fitted|
||Specifies save structure of model; default
||Response variates for which results are to be saved; default is the list of response variates in the most recent
||Residuals for each
||Fitted values for each
||Leverages of the units for each
||Estimates of parameters for each
||Standard errors of the estimates|
||Inverse matrix from a linear or generalized linear model, inverse of second derivative matrix from a nonlinear model|
||Variance-covariance matrix of the estimates|
||Residual ss or deviance|
||Residual degrees of freedom|
||Fitted terms (excluding constant)|
||Iterative weights from a generalized linear model|
||Linear predictor from a generalized linear model|
||Adjusted response of a generalized linear model|
||Exit status from a generalized linear or nonlinear model|
||Derivatives of fitted values with respect to parameters in a nonlinear model|
||Grid of function or deviance values from a nonlinear model|
||Design matrix whose columns are explanatory variates and dummy variates|
||Pearson chi-square statistic from a generalized linear model|
||Saves the identifiers of the variates that have been smoothed in the current model|
||Saves a pointer to variates holding the nonlinear components of the variates that have been smoothed|
||Number of units used in regression, excluding missing data and zero weights and taking account of restrictions|
||Saves standard errors of the fitted values|
||Saves standard errors of the linear predictor|
||Saves the variance inflation factors of the parameter estimates|
||Saves upper confidence limits for the parameter estimates|
||Saves lower confidence limits for the parameter estimates|
||Saves the residual mean deviance (or mean square)|
||Saves the total deviance (or sum of squares)|
||Saves the total degrees of freedom (corrected for the mean or uncorrected as displayed by the fitting directives)|
||Saves the total mean deviance (or mean square)|
||Saves the summary analysis-of-variance (or deviance) table as a pointer with a variate or text for each column (source, d.f. etc)|
||Saves the accumulated analysis-of-variance (or deviance) table as a pointer with a variate or text for each column (source, d.f. etc)|
||Saves all the statistics that could be displayed for the
RKEEP allows you to copy information from a regression analysis (performed, for example, by a
FITNONLINEAR statement) into Genstat data structures. You do not need to declare the structures in advance; Genstat will declare them automatically to be of the correct type and length.
The Y parameter specifies the response variates for which the results are to be saved. Unusually for the first parameter of a directive, this has a default: if you leave it out, Genstat assumes that results are to be saved for all the response variates, as given in the previous
SELINEARPREDICTOR parameters allow you to save the standardized residuals, the fitted values, the leverages, the standard errors of the fitted values and the standard errors of the linear predictor. For example,
RESIDUALS=R puts the residuals in a variate
RMETHOD option controls the type of residuals that are formed. You cannot save these values if you have set
RMETHOD=* in the
MODEL statement. The standard errors of fitted values are defined by:
s.e. = √(leverage × variance function × dispersion / weight)
where the variance function is calculated from the fitted value according to the setting of the
DISTRIBUTION option of the current
MODEL statement, and the dispersion is the fixed or estimated value of dispersion, as controlled by the
DMETHOD options of the
The ESTIMATES and SE parameters save the parameter estimates and their standard errors;
RKEEP puts them in variates, using the same order as in the display produced by the
TERMS to define a maximal model, you can set option
EXPAND=yes to reorder the estimates to their order in the maximal model (including missing values for the parameters not currently in the model). The variates saving these values are set up with labels; thus, you can refer to individual values in expressions using the labels as displayed when the estimates are printed. For example, to get the estimate of the constant into a scalar, you could put:
CALCULATE Const = Esti$['Constant']
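Because the saved estimates and standard errors are ordinary variates of the same length, they can also be combined directly in expressions. The lines below are a minimal sketch, assuming that a model with a single response variate has already been fitted; Tstat is a hypothetical identifier chosen for the illustration.

```genstat
" Sketch only: assumes a model has already been fitted, for example by FIT."
RKEEP ESTIMATES=Esti; SE=Se
" Ratio of each estimate to its standard error (Tstat is a hypothetical name)."
CALCULATE Tstat = Esti / Se
PRINT Esti,Se,Tstat
```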
LOWER parameters allow you to save upper and lower confidence limits for the parameter estimates. The probability for the confidence interval is specified by the
PROBABILITY option, with default 0.95. The
CIMETHOD option controls the method used with nonlinear models. The default setting,
quadratic, uses the same method as for other types of regression, basing the limits on a quadratic surface fitted to the likelihood surface around the optimum. These limits may be poor approximations if the likelihood surface is markedly asymmetric. The alternative setting,
exact, calculates the limits directly from the likelihood surface.
The VCOVARIANCE parameter saves the variance-covariance matrix of the estimates for each response variate: these are formed by multiplying the inverse matrix by the relevant variance estimate based on the estimated dispersion, or on the dispersion that you have supplied.
The DEVIANCE parameter allows you to save the residual sum of squares, or the deviance for distributions other than Normal. The
DF parameter saves the residual degrees of freedom, and the
MEANDEVIANCE parameter saves the residual mean deviance. The
TDEVIANCE parameter saves the total deviance, the
TDF parameter saves the total degrees of freedom (corrected for the mean or uncorrected as displayed by the fitting directives), and the
TMEANDEVIANCE parameter saves the total mean deviance.
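The saved scalars can be used in further calculations. The sketch below assumes a single response variate and an ordinary (Normal) regression; Check and PctVar are hypothetical identifiers. It recomputes the residual mean square from the residual sum of squares and its degrees of freedom, and then forms the usual adjusted R-squared expressed as a percentage.

```genstat
" Sketch only: assumes a model with one response variate has been fitted."
RKEEP DEVIANCE=Rss; DF=Rdf; MEANDEVIANCE=Rms; TMEANDEVIANCE=Tms
" The residual mean deviance is the residual deviance divided by its d.f."
CALCULATE Check = Rss / Rdf
" Adjusted R-squared as a percentage (PctVar is a hypothetical name)."
CALCULATE PctVar = 100 * (1 - Rms / Tms)
PRINT Rms,Check,PctVar
```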
The LINEARPREDICTOR parameter allows you to save the linear predictor of a generalized linear model; the values of the linear predictor are the same as the fitted values if the link function is the identity function.
The ITERATIVEWEIGHTS parameter saves a variate containing the iterative weights used in the last cycle of the iteration for fitting a generalized linear model. The iterative weights do not contain any contribution from the weights that can be specified, whether or not the model is iterative, by the
WEIGHTS option of the
MODEL directive, and they are 1.0 for ordinary linear regression.
The YADJUSTED parameter saves the adjusted response variate used in the last cycle of the iteration for fitting a generalized linear model; with the identity link function this is the same as the response variate.
The Pearson chi-square statistic can be saved using the
PEARSONCHI parameter of
RKEEP. It is calculated as the sum of the squared Pearson residuals. This can be used as an alternative to the deviance for testing goodness of fit; see McCullagh & Nelder (1989).
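One common use of the saved statistic is to form a Pearson-based estimate of the dispersion by dividing it by the residual degrees of freedom. The lines below are a sketch under the assumption that a generalized linear model with a single response variate has just been fitted; Chi, Rdf and Disp are hypothetical identifiers.

```genstat
" Sketch only: assumes a generalized linear model has been fitted."
RKEEP PEARSONCHI=Chi; DF=Rdf
" Pearson chi-square divided by the residual d.f. (Disp is a hypothetical name)."
CALCULATE Disp = Chi / Rdf
PRINT Chi,Rdf,Disp
```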
The EXIT parameter of
RKEEP provides a code that indicates the success or type of failure of an iterative fit. Codes 0-7 are relevant to standard curves and general nonlinear models, and codes 0 and 8-13 are for generalized linear models:
0 Successful fitting
1 Limit on number of cycles has been reached without convergence
2 Parameter out of bounds
3 Likelihood appears constant
4 Failure to progress towards solution
5 Some standard errors are not available because the information matrix is nearly singular
6 Calculated likelihood may be incorrect because of missing fitted values
7 Curve is close to a limiting form
8 Data incompatible with model
9 Predicted mean or linear predictor out of range
10 Invalid calculation for calculated link or distribution
11 All units have been excluded from the analysis
12 Iterative process has diverged
13 Failure due to lack of space or data access
14 Function returned a missing value
With a generalized linear model, unless you set option
EXIT code is the only information that you can save if the fit has been unsuccessful. Alternatively, with a nonlinear model or when
RKEEP will save any information that may be available. (You may thus, for example, be able to discover more about the cause of the failure.)
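The exit code can be tested in a program before any further results are requested. The following is a minimal sketch, assuming an iterative fit has just been attempted; Ex is a hypothetical identifier, and the action taken inside the IF block is only an illustration.

```genstat
" Sketch only: save the exit code of the most recent iterative fit."
RKEEP EXIT=Ex
" A non-zero code indicates one of the conditions listed above."
IF Ex .NE. 0
  PRINT Ex
ENDIF
```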
The derivatives of the fitted values with respect to each parameter in a standard curve or general nonlinear model can be stored in variates using the
GRADIENTS parameter. You can use these quantities to assess the relative influence of each observation on a parameter; you can also construct a measure of leverage by summing the gradients for all the parameters.
The GRID parameter can be used to store a grid of values of the deviance (or any general function) following
DESIGNMATRIX parameter allows you to save the matrix X. The columns correspond to the parameters of the model, ordered as for the
ESTIMATES parameter. For simple linear regression with a constant this has only two columns, the first containing ones and the second containing the values of the explanatory variate. Columns corresponding to aliased parameters are omitted, but you can use the corresponding option of
TERMS to construct the full design matrix.
The PEARSONCHI parameter provides the Pearson chi-square statistic for dispersion, which is the same as the residual sum of squares for the Normal distribution, but is different to the deviance for other distributions. The
STERMS and SCOMPONENTS parameters are relevant to generalized additive models. The
STERMS parameter can be used to store a pointer to those variates whose effects in the model are smoothed. The
SCOMPONENTS parameter stores a pointer to variates, one for each smoothed variate in the same order as in
STERMS, containing the fitted nonlinear component of each smoothed variate – this does not include the linear component or the constant term.
The NOBSERVATIONS parameter allows you to save the number of units used in the analysis, omitting units with missing values or excluded by restrictions. This will be the same as the total number of degrees of freedom plus one, except in a regression with no constant term and no explanatory factors, when it will equal the total number of degrees of freedom.
The SUMMARY parameter can be used to save the summary analysis-of-variance (or deviance) table for each response variate. The summary table is saved as a pointer with a variate or text for each of its columns (source, d.f. etc). Similarly, the
ACCUMULATED parameter can save the accumulated analysis-of-variance (or deviance) tables.
The STATISTICS parameter saves all the statistics that could be displayed for each response variate by the
'summary' setting of the
ADD etc. Alternatively, the
STATISTICS option can be used to save the statistics for the first response variate specified by the
DISPERSION option allows you to define the value to be used for the dispersion parameter when calculating the standard errors. The
DMETHOD option indicates how this should be calculated if
DISPERSION is not set. By default the deviance is used but you can set
DMETHOD=Pearson to request the Pearson chi-square statistic to be used instead.
Options OMODEL and PMODEL allow you to save pointers containing information about the current model. The labels of the pointers can be specified in either lower or upper case, or any mixture.
OMODEL can be set to a pointer to store information about each of the options set in the previous
MODEL statement. For example, the statement
RKEEP [OMODEL=Om]
will allow you to refer to the current variate of weights (if one was set in the
WEIGHTS option of
MODEL) as
Om['weights']. Whether or not a variate was set, the statement
MODEL [WEIGHTS=Om['weights']] Newobs
will allow a new analysis with the same weighting as the old.
Om has 16 values, with suffixes corresponding to the options of
MODEL in the defined order. Similarly, the statement
RKEEP [PMODEL=Pm]
will set up a pointer storing the (eight) current parameter settings of the previous
MODEL statement. However, if there was more than one response variate, the first value of the pointer will be the identifier of the first response variate only: the others are not stored. Similarly, only the fitted-values and residuals variates for the first response will be pointed at. For example, the identifier
Pm['y'] can be used to refer to the current response variate after the
RKEEP statement above.
The MAXIMALMODEL option saves the maximal model (as defined by
FITMODEL option saves the model that has currently been fitted, including any contrast functions (i.e.
FITCONSTANT option saves a scalar containing the value one if the constant is included in the fitted model, or zero otherwise. The
FITTYPE option saves a scalar to indicate the type of model that has been fitted: 1 for an ordinary regression or generalized linear model (
FIT), 2 for a generalized nonlinear model (
FIT with the
CALCULATION option set), 3 for a standard curve (
FITCURVE) and 4 for a nonlinear model (
McCullagh, P. & Nelder, J.A. (1989). Generalized Linear Models (second edition). Chapman and Hall, London.
Commands for: Regression analysis.
" Example FIT-3: Comparing linear regressions between groups

  Experiments on cauliflowers in 1957 and 1958 provided data on the mean
  number of florets in the plant and the temperature during the growing
  season (expressed as accumulated temperature above 0 deg C)."

" The counts and temperatures are in a file called 'FIT-3.DAT'"
FILEREAD [NAME='%gendir%/examples/FIT-3.DAT'] MnCount,AccTemp

" The first 7 values are from 1957 and the rest from 1958;
  set up a factor to distinguish the two years."
FACTOR [LEVELS=!(1957,1958); VALUES=7(1957,1958)] Year

" Fit a linear regression model of the mean count of florets on
  accumulated temperature - first ignoring the division into two years."
MODEL MnCount
TERMS AccTemp*Year
FIT AccTemp

" Fit parallel regressions for the two years."
ADD Year

" Fit separate regressions for the two years."
ADD AccTemp.Year

" Display the accumulated summary: an analysis of parallelism."
RDISPLAY [PRINT=accumulated]

" Show the parallel models."
DROP [PRINT=*] AccTemp.Year
RGRAPH [GRAPHICS=high]

" Extract the parameter estimates and s.e.s and display the common
  slope and its s.e."
RKEEP ESTIMATES=Esti; SE=Se
CALC Slope,SlopeSE = (Esti,Se)$[2]
PRINT Slope,SlopeSE