LRIDGE procedure

Does logistic ridge regression (A.I. Glaser).

Options

`PRINT` = string tokens	What output to print (`correlation`, `crossvalidation`, `ridge`, `scaledridge`, `standarderrors`); default `corr`
`PLOT` = string tokens	What graphs to plot (`correlation`, `ridgetrace`, `buildup`); default `*` i.e. none
`LINK` = string token	Link function (`logit`, `probit`, `complementaryloglog`); default `logi`
`DISPERSION` = scalar	Value of the dispersion parameter; default 1
`TERMS` = formula	Explanatory model
`FACTORIAL` = scalar	Limit on number of factors/covariates in a model term; default 3
`LAMBDA` = variate or scalar	Values for the ridge parameter lambda
`CROSSVALIDATION` = string token	Whether to use cross-validation to find an optimal value of lambda (`yes`, `no`); default `no`
`NCROSSVALIDATIONGROUPS` = scalar	Number of groups for cross-validation; default 10
`CVMETHOD` = string token	Which method to use for cross-validation (`deviance`, `squarederror`, `countingerror`); default `devi`
`SEED` = scalar	Seed for random numbers to use in cross-validation; default 0

Parameters

`Y` = variates	Response variate
`NBINOMIAL` = scalars or variates	Number of binomial trials for each unit; default 1
`YVALIDATION` = variates	Response variate for validation
`XVALIDATION` = pointers	Explanatory variables for validation
`XDATA` = pointers	Pointer containing the original explanatory variables in the same order as in `XVALIDATION`; default takes the variables in the order in which they occur in `TERMS`
`NVALIDATION` = variates or scalars	Number of binomial trials for the units of each `YVALIDATION` variate; default 1
`BESTLAMBDA` = scalars	Saves the optimal lambda value from cross-validation
`CVSTATISTICS` = matrices	Saves the cross-validation statistics
`RESIDUALS` = variates	Saves residuals when `LAMBDA` is a scalar
`FITTEDVALUES` = variates	Saves fitted values when `LAMBDA` is a scalar
`ESTIMATES` = variates	Saves parameter estimates when `LAMBDA` is a scalar
`SE` = variates	Saves standard errors of the parameter estimates when `LAMBDA` is a scalar
`DEVIANCE` = scalars	Saves the residual deviance when `LAMBDA` is a scalar
`LINEARPREDICTOR` = variates	Saves the linear predictor when `LAMBDA` is a scalar

Description

Procedure LRIDGE fits a logistic ridge regression model based on penalized likelihood inference, as explained in the Method section. The response variate is specified by the Y parameter. The NBINOMIAL parameter defines the number of binomial trials for each unit, with a default of one. If NBINOMIAL is greater than one, LRIDGE forms a modified copy of the data set in which each of the original observations is expanded into its underlying individuals (i.e. to have binary responses, either one or zero).
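
The expansion into binary responses can be pictured with a minimal Python sketch. This is not LRIDGE's own code: the function name `expand_binomial`, and the choice to list the ones before the zeros within each unit, are assumptions made purely for illustration.

```python
def expand_binomial(successes, trials):
    """Expand grouped binomial data into one binary response per individual.

    Returns (binary, unit): binary[i] is 0 or 1, and unit[i] is the index of
    the original observation that individual i came from.
    """
    binary, unit = [], []
    for index, (y, n) in enumerate(zip(successes, trials)):
        if not 0 <= y <= n:
            raise ValueError("successes must lie between 0 and trials")
        # Assumed ordering: y ones followed by n - y zeros.
        binary.extend([1] * y + [0] * (n - y))
        unit.extend([index] * n)
    return binary, unit


binary, unit = expand_binomial([2, 0, 1], [3, 2, 1])
print(binary)  # [1, 1, 0, 0, 0, 1]
print(unit)    # [0, 0, 0, 1, 1, 2]
```

Keeping the `unit` index alongside the binary responses makes it possible to repeat the explanatory values of each original observation for every individual that it generates.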

The model to fit is defined by the TERMS option. The FACTORIAL option sets a limit on the number of variates and/or factors in the model terms generated from the TERMS model formula, as in the FIT directive. The LINK option defines the link function. This can be either logit (the default), probit or complementary-log-log. The DISPERSION option specifies the dispersion parameter in the usual way: by default the parameter is fixed at one, or you can set DISPERSION=* to use a dispersion parameter estimated from the residual deviance.
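
As an illustration of the three link functions, the following Python sketch maps a value of the linear predictor to a probability using the standard definitions of the inverse links. The function name `inverse_link` is hypothetical and is not part of Genstat; the setting names simply mirror those of the LINK option.

```python
import math


def inverse_link(eta, link="logit"):
    """Map a linear predictor value eta to a probability."""
    if link == "logit":
        return 1.0 / (1.0 + math.exp(-eta))
    if link == "probit":
        # Cumulative distribution function of the standard Normal.
        return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))
    if link == "complementaryloglog":
        return 1.0 - math.exp(-math.exp(eta))
    raise ValueError("unknown link: " + link)


for link in ("logit", "probit", "complementaryloglog"):
    print(link, round(inverse_link(0.0, link), 4))
```

The logit and probit links are symmetric about a probability of 0.5, whereas the complementary-log-log link is not: a linear predictor of zero corresponds to a probability of 1 - exp(-1).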

Printed output is controlled by the PRINT option, with settings:

`correlation`	prints the correlations between the explanatory variables in the `TERMS` formula,
`crossvalidation`	prints the cross-validation results, with optimal lambda value,
`ridge`	prints the ridge coefficients on the original scale,
`scaledridge`	prints the ridge coefficients for the standardized data, and
`standarderrors`	includes standard errors with coefficients printed by the `ridge` or `scaledridge` settings.

Graphical output is controlled by the PLOT option:

`ridgetrace`	plots the coefficient estimates against lambda, showing how they decrease as lambda increases,
`buildup`	plots coefficient values against the coefficients divided by their maximum values, showing the relative decrease as lambda increases, and
`correlation`	uses the `DCORRELATION` procedure to produce a graphical representation of the correlation matrix for elements in `TERMS`.

The LAMBDA option allows you to define the values to try for the ridge parameter lambda (see Method). By default LRIDGE takes a range of values between 0 and 1. If you have set LAMBDA to a single value, you can save results from the analysis using the RESIDUALS, FITTEDVALUES, ESTIMATES, SE, DEVIANCE and LINEARPREDICTOR parameters. Note that the residuals are simple residuals, rather than standardized residuals.

If you set option CROSSVALIDATION=yes, LRIDGE uses cross-validation to find an optimal value of lambda. The YVALIDATION, XVALIDATION and NVALIDATION parameters allow you to supply an independent data set for validation. The YVALIDATION parameter specifies the response variate, the NVALIDATION parameter specifies the corresponding numbers of binomial trials (default 1), and the XVALIDATION parameter supplies a pointer containing values for the explanatory variables. LRIDGE needs to match the validation explanatory variables with the original variables in TERMS. You can define the correspondence explicitly by setting the XDATA parameter to a pointer containing the original variables in the same order as the corresponding variables in the XVALIDATION pointer. If XDATA is not set, LRIDGE forms the original list using the CLASSIFICATION parameter of the FCLASSIFICATION directive. The order of the variables should be easy to predict for straightforward TERMS models, but it is safest to specify XDATA explicitly for complicated models.

If you do not have an independent data set, LRIDGE can do the validation by selecting subsets of the original data set. The NCROSSVALIDATIONGROUPS option defines the number of subsets (default 10). The data set (modified to contain binary responses, as explained above, if NBINOMIAL is greater than one) is divided into that number of roughly equal-sized subsets. The model is fitted to the data set with each of these parts removed, in turn, and the prediction error is calculated for the omitted subset based on that fit. The method for calculating the prediction error is specified by the CVMETHOD option:

`deviance`	uses the deviance function (defined as twice the difference between the maximum log-likelihood and that achieved under the validation data),
`squarederror`	takes the sum of the squared differences between the validation data and the expected values, and
`countingerror`	counts the number of “wrong” predictions in the validation data, i.e. if the value of the validation data was 1 and the expected probability was less than 0.5, the prediction would be considered to be wrong.
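
The three prediction-error measures can be sketched in Python for binary validation data `y` and expected probabilities `p`. This is an illustrative sketch rather than the code used by LRIDGE. The function names are hypothetical, the probabilities are assumed to lie strictly between 0 and 1, and the treatment of a probability of exactly 0.5 (not counted as wrong here) is an assumption.

```python
import math


def deviance_error(y, p):
    """Deviance for binary data: the maximum log-likelihood is zero, so the
    deviance is minus twice the log-likelihood of the validation data."""
    return -2.0 * sum(math.log(pi if yi == 1 else 1.0 - pi)
                      for yi, pi in zip(y, p))


def squared_error(y, p):
    """Sum of squared differences between the data and expected values."""
    return sum((yi - pi) ** 2 for yi, pi in zip(y, p))


def counting_error(y, p):
    """Number of units predicted on the wrong side of 0.5.
    Assumption: a probability of exactly 0.5 is not counted as wrong."""
    return sum(1 for yi, pi in zip(y, p)
               if (yi == 1 and pi < 0.5) or (yi == 0 and pi > 0.5))


y = [1, 0, 1, 0]
p = [0.9, 0.2, 0.4, 0.7]
print(round(deviance_error(y, p), 3))
print(round(squared_error(y, p), 3))
print(counting_error(y, p))  # 2
```

The counting error treats every wrong prediction alike, whereas the deviance and squared error also penalize predictions according to how far the expected probability lies from the observed value.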

The calculation of the prediction error is repeated for every value of the LAMBDA option. The value that minimizes the mean prediction error is taken as the optimal lambda, and can be saved by the BESTLAMBDA parameter. (You could then use LRIDGE again, with LAMBDA set to that value, and use the parameters RESIDUALS, FITTEDVALUES etc. to save information from the optimal analysis.)
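
A minimal Python sketch of the selection step follows, assuming that the prediction errors have already been calculated for each lambda value and each omitted subset. The helper names `assign_groups` and `best_lambda`, the way the subsets are randomized, and the error values are all made up for illustration; they do not describe LRIDGE's internal implementation.

```python
import random


def assign_groups(n_units, n_groups=10, seed=0):
    """Randomly allocate units to n_groups roughly equal-sized groups."""
    rng = random.Random(seed)
    groups = [i % n_groups for i in range(n_units)]  # sizes differ by at most 1
    rng.shuffle(groups)
    return groups


def best_lambda(lambdas, errors):
    """errors[k][g] is the prediction error for lambdas[k] with group g omitted.
    Returns the lambda with the smallest mean error, and all the means."""
    means = [sum(row) / len(row) for row in errors]
    k = min(range(len(lambdas)), key=means.__getitem__)
    return lambdas[k], means


lambdas = [0.0, 0.1, 1.0]
errors = [[5.0, 7.0], [4.0, 5.0], [6.0, 6.5]]  # made-up values, two groups
best, means = best_lambda(lambdas, errors)
print(best)   # 0.1
print(means)  # [6.0, 4.5, 6.25]
```

If several lambda values give the same mean error, this sketch returns the first of them; the documentation does not say how LRIDGE resolves such ties.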

Options: PRINT, PLOT, LINK, DISPERSION, TERMS, FACTORIAL, LAMBDA, CROSSVALIDATION, NCROSSVALIDATIONGROUPS, CVMETHOD, SEED.

Parameters: Y, NBINOMIAL, YVALIDATION, XVALIDATION, XDATA, NVALIDATION, BESTLAMBDA, CVSTATISTICS, RESIDUALS, FITTEDVALUES, ESTIMATES, SE, DEVIANCE, LINEARPREDICTOR.

Method

Logistic ridge regression is carried out as described by le Cessie & van Houwelingen (1992). The usual log-likelihood for logistic regression is extended to include a penalty on the norm of the parameter estimates β, namely λ × √{∑β²}. When the ridge parameter, lambda, is equal to zero, the parameter estimates are the usual maximum-likelihood estimates, whereas as lambda tends to infinity all of the parameters tend towards zero. The penalty term is applied by setting the RIDGE option of the TERMS directive. The columns of the design matrix in TERMS are standardized. However, estimated coefficients are available for both the standardized and unstandardized data.
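
The relationship between coefficients on the standardized and original scales can be sketched in Python. The sketch assumes that each column is standardized by subtracting its mean and dividing by its sample standard deviation (divisor n - 1); the exact scaling used by LRIDGE is not stated here, so treat this as an assumption. The function name `to_original_scale` and the numerical values are hypothetical.

```python
import statistics


def to_original_scale(intercept_std, coefs_std, columns):
    """Convert an intercept and coefficients fitted to standardized columns
    into the equivalent values for the unstandardized columns."""
    means = [statistics.mean(col) for col in columns]
    sds = [statistics.stdev(col) for col in columns]  # assumed divisor n - 1
    coefs = [b / s for b, s in zip(coefs_std, sds)]
    intercept = intercept_std - sum(b * m for b, m in zip(coefs, means))
    return intercept, coefs


columns = [[1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 15.0, 35.0]]
intercept_std, coefs_std = -0.3, [0.8, -0.5]  # made-up fitted values
intercept, coefs = to_original_scale(intercept_std, coefs_std, columns)

# Both scales give the same linear predictor for the first unit.
means = [statistics.mean(col) for col in columns]
sds = [statistics.stdev(col) for col in columns]
eta_std = intercept_std + sum(
    b * (col[0] - m) / s for b, col, m, s in zip(coefs_std, columns, means, sds))
eta_orig = intercept + sum(b * col[0] for b, col in zip(coefs, columns))
print(abs(eta_std - eta_orig) < 1e-12)  # True
```

The two sets of coefficients describe the same fitted model. The standardized coefficients are the more useful for comparing the explanatory variables, as they do not depend on the units of measurement.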

Action with `RESTRICT`

There must be no restrictions.

Reference

le Cessie, S. & van Houwelingen, J.C. (1992). Ridge estimators in logistic regression. Applied Statistics, 41, 191-202.

Example

CAPTION 'LRIDGE example'; STYLE=meta
" Data showing presence/absence of frogs in the Snowy Mountain area
  of New South Wales, Australia. See Maindonald & Braun (2007),
  Data Analysis and Graphics Using R, 2nd Edition."
SPLOAD  '%GENDIR%/Examples/LRID-1.gsh'
POINTER [VALUES=No_of_breeding_sites,altitude,average_rain,mean_max_temp,\
        mean_min_temp,log_No_of_pools,log_distance] xvars
" Try a range of LAMBDA values, and select best by cross-validation."
VARIATE [VALUES=0, 0.001, 0.002...0.01, 0.02, 0.03...0.1, 0.2, 0.3...1,\
        2...5] lambda
LRIDGE  [PRINT=correlation,SCAL,ST,ridge; PLOT=ridgetrace,buildup,correlation;\
        LAMBDA=lambda; CROSSVALIDATION=yes; SEED=237819; TERMS=xvars[]]\
        Y=Present; BEST=optlambda
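" Print the optimal lambda, then refit the model with that value to save
  the estimates, standard errors and fitted probabilities."
PRINT   optlambda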
LRIDGE  [PRINT=*; LAMBDA=optlambda; TERMS=xvars[]]\
        Y=Present; ESTIMATES=estimates; SE=se; FITTED=prob
PRINT   estimates,se
PRINT   Present,prob

Updated on June 19, 2019
