Does logistic ridge regression (A.I. Glaser).
Options
PRINT = string tokens | What output to print (correlation, crossvalidation, ridge, scaledridge, standarderrors); default corr |
---|---|
PLOT = string tokens | What graphs to plot (correlation, ridgetrace, buildup); default * i.e. none |
LINK = string token | Link function (logit, probit, complementaryloglog); default logi |
DISPERSION = scalar | Value of the dispersion parameter; default 1 |
TERMS = formula | Explanatory model |
FACTORIAL = scalar | Limit on number of factors/covariates in a model term; default 3 |
LAMBDA = variate or scalar | Values for the ridge parameter lambda |
CROSSVALIDATION = string token | Whether to use cross-validation to find an optimal value of lambda (yes, no); default no |
NCROSSVALIDATIONGROUPS = scalar | Number of groups for cross-validation; default 10 |
CVMETHOD = string token | Which method to use for cross-validation (deviance, squarederror, countingerror); default devi |
SEED = scalar | Seed for random numbers to use in cross-validation; default 0 |
Parameters
Y = variates | Response variate |
---|---|
NBINOMIAL = scalars or variates | Number of binomial trials for each unit; default 1 |
YVALIDATION = variates | Response variate for validation |
XVALIDATION = pointers | Explanatory variables for validation |
XDATA = pointers | Pointer containing the original explanatory variables in the same order as in XVALIDATION; default takes the variables in the order in which they occur in TERMS |
NVALIDATION = variates or scalars | Number of binomial trials for the units of each YVALIDATION variate; default 1 |
BESTLAMBDA = scalars | Saves the optimal lambda value from cross-validation |
CVSTATISTICS = matrices | Saves the cross-validation statistics |
RESIDUALS = variates | Saves residuals when LAMBDA is a scalar |
FITTEDVALUES = variates | Saves fitted values when LAMBDA is a scalar |
ESTIMATES = variates | Saves parameter estimates when LAMBDA is a scalar |
SE = variates | Saves standard errors of the parameter estimates when LAMBDA is a scalar |
DEVIANCE = scalars | Saves the residual deviance when LAMBDA is a scalar |
LINEARPREDICTOR = variates | Saves the linear predictor when LAMBDA is a scalar |
Description
Procedure LRIDGE fits a logistic ridge regression model based on penalized likelihood inference, as explained in the Method section. The response variate is specified by the Y parameter. The NBINOMIAL parameter defines the number of binomial trials for each unit, with a default of one. If NBINOMIAL is greater than one, LRIDGE forms a modified copy of the data set in which each of the original observations is expanded into its underlying individuals (i.e. to have binary responses of either one or zero).
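The expansion of grouped binomial data into binary responses can be illustrated with a short Python sketch. This is not the procedure's own code: the function name `expand_binomial`, the returned index list and the ordering of ones before zeros within a unit are all assumptions made for the illustration.

```python
def expand_binomial(successes, trials):
    """Expand grouped binomial counts into one 0/1 response per individual.

    Returns (binary, unit_index); unit_index records which original unit
    each individual came from, so covariate values can be repeated to match.
    The order of ones and zeros within a unit is an arbitrary choice here.
    """
    binary, unit_index = [], []
    for unit, (y, n) in enumerate(zip(successes, trials)):
        if not 0 <= y <= n:
            raise ValueError(f"unit {unit}: {y} successes outside 0..{n}")
        binary.extend([1] * y + [0] * (n - y))
        unit_index.extend([unit] * n)
    return binary, unit_index


# Three units with 2/3, 0/2 and 3/3 successes become eight binary responses.
binary, unit_index = expand_binomial([2, 0, 3], [3, 2, 3])
print(binary)      # [1, 1, 0, 0, 0, 1, 1, 1]
print(unit_index)  # [0, 0, 0, 1, 1, 2, 2, 2]
```

The expanded data set has one row per binomial trial, which is why the cross-validation groups described later are formed from individuals rather than from the original units.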
The model to fit is defined by the TERMS option. The FACTORIAL option sets a limit on the number of variates and/or factors in the model terms generated from the TERMS model formula, as in the FIT directive. The LINK option defines the link function. This can be either logit (the default), probit or complementary-log-log. The DISPERSION option specifies the dispersion parameter in the usual way, i.e. the default is to fix the parameter at one, or you can set DISPERSION=* to use a dispersion parameter estimated from the residual deviance.
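For readers unfamiliar with the three link functions, the following Python sketch shows the standard inverse of each, i.e. how a linear predictor value is mapped to a probability. The function names and the dictionary keyed by the LINK setting names are illustrative assumptions, not part of LRIDGE.

```python
import math


def inv_logit(eta):
    """Inverse logit, written to avoid overflow for large negative eta."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)


def inv_probit(eta):
    """Inverse probit: the standard Normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))


def inv_cloglog(eta):
    """Inverse complementary log-log."""
    return 1.0 - math.exp(-math.exp(eta))


# Keys mirror the LINK settings; the mapping itself is only for illustration.
INVERSE_LINKS = {
    "logit": inv_logit,
    "probit": inv_probit,
    "complementaryloglog": inv_cloglog,
}

for name, inverse in INVERSE_LINKS.items():
    print(name, round(inverse(0.0), 4))
# logit 0.5
# probit 0.5
# complementaryloglog 0.6321
```

The logit and probit links are symmetric about a probability of 0.5, whereas the complementary log-log link is not, as the value at a linear predictor of zero shows.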
Printed output is controlled by the PRINT option, with settings:
correlation | prints the correlations between the explanatory variables in the TERMS formula, |
---|---|
crossvalidation | prints the cross-validation results, with optimal lambda value, |
ridge | prints the ridge coefficients on the original scale, |
scaledridge | prints the ridge coefficients for the standardized data, and |
standarderrors | includes standard errors with coefficients printed by the ridge or scaledridge settings. |
Graphical output is controlled by the PLOT option:
ridgetrace | plots the coefficient estimates against lambda, showing how they shrink as lambda increases, |
---|---|
buildup | plots coefficient values against the coefficients divided by their maximum values, showing the relative decrease as lambda increases, and |
correlation | uses the DCORRELATION procedure to produce a graphical representation of the correlation matrix for elements in TERMS. |
The LAMBDA option allows you to define the values to try for the ridge parameter lambda (see Method). By default LRIDGE takes a range of values between 0 and 1. If you have set LAMBDA to a single value, you can save results from the analysis using the RESIDUALS, FITTEDVALUES, ESTIMATES, SE, DEVIANCE and LINEARPREDICTOR parameters. Note that the residuals are simple residuals, rather than standardized residuals.
LRIDGE can use cross-validation to find an optimal value of lambda. The YVALIDATION, XVALIDATION and NVALIDATION parameters allow you to supply an independent data set for validation. The YVALIDATION parameter specifies the response variate, the NVALIDATION parameter specifies the corresponding numbers of binomial trials (default 1), and the XVALIDATION parameter supplies a pointer containing values for the explanatory variables. LRIDGE needs to match the validation explanatory variables with the original variables in TERMS. You can define the correspondence explicitly by setting the XDATA parameter to a pointer containing the original variables in the same order as the corresponding variables in the XVALIDATION pointer. If XDATA is not set, LRIDGE forms the original list using the CLASSIFICATION of the FCLASSIFICATION directive. The order of variables should be easy to predict for straightforward TERMS models, but it is safest to specify XDATA explicitly for complicated models.
If you do not have an independent data set, LRIDGE can do the validation by selecting subsets of the original data set. The NCROSSVALIDATIONGROUPS option defines the number of subsets (default 10). The data set (modified to contain binary responses, as explained above, if NBINOMIAL is greater than one) is divided into that number of roughly equal-sized subsets. The model is fitted to the data set with each of these subsets removed in turn, and the prediction error is calculated for the omitted subset based on that fit. The method for calculating the prediction error is specified by the CVMETHOD option:
deviance | uses the deviance function (defined as twice the difference between the maximum log-likelihood and that achieved under the validation data), |
---|---|
squarederror | takes the sum of the squared differences between the validation data and the expected values, and |
countingerror | counts the number of “wrong” predictions in the validation data, i.e. if the value of the validation data was 1 and the expected probability was less than 0.5, the prediction would be considered to be wrong. |
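The three prediction-error measures can be sketched in Python for binary validation data as follows. This is a minimal illustration rather than the procedure's implementation: the function names are hypothetical, the clipping constant `eps` is added only to keep the logarithms finite, and treating a probability of exactly 0.5 as a prediction of 1 is an assumption, since the text does not say how ties are handled. For binary data the maximum log-likelihood is zero, so the deviance reduces to minus twice the log-likelihood.

```python
import math


def deviance_error(y, p, eps=1e-12):
    """Deviance of binary responses y under predicted probabilities p."""
    loglik = 0.0
    for yi, pi in zip(y, p):
        pi = min(max(pi, eps), 1.0 - eps)  # guard against log(0)
        loglik += yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
    return -2.0 * loglik


def squared_error(y, p):
    """Sum of squared differences between responses and probabilities."""
    return sum((yi - pi) ** 2 for yi, pi in zip(y, p))


def counting_error(y, p):
    """Number of responses on the wrong side of 0.5 (assumed: p >= 0.5 predicts 1)."""
    return sum(1 for yi, pi in zip(y, p) if (pi >= 0.5) != (yi == 1))


y_valid = [1, 0, 1, 0]
p_valid = [0.9, 0.2, 0.4, 0.6]
print(round(deviance_error(y_valid, p_valid), 4))  # 4.3222
print(round(squared_error(y_valid, p_valid), 4))   # 0.77
print(counting_error(y_valid, p_valid))            # 2
```

The counting error ignores how far a probability is from 0.5, so it is the coarsest of the three measures; the deviance penalizes confident wrong predictions most heavily.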
The calculation of the prediction error is repeated for every value of the LAMBDA option. The value that minimizes the mean prediction error is taken as the optimal lambda, and can be saved by the BESTLAMBDA parameter. (You could then use LRIDGE again, with LAMBDA set to that value, and use the parameters RESIDUALS, FITTEDVALUES etc. to save information from the optimal analysis.)
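The grouping and selection steps can be sketched in Python as follows. The text does not specify how LRIDGE allocates units to groups or how its SEED option is interpreted, so the random allocation in `assign_groups` is an assumption, and both function names are hypothetical. The prediction errors are supplied as a ready-made table here (one row per lambda, one entry per omitted group); in practice they would come from refitting the model with each group removed.

```python
import random


def assign_groups(n_units, n_groups, seed):
    """Randomly allocate units to n_groups groups of roughly equal size."""
    labels = [i % n_groups for i in range(n_units)]  # sizes differ by at most 1
    random.Random(seed).shuffle(labels)
    return labels


def select_lambda(lambdas, errors_by_lambda):
    """Return the lambda with the smallest mean prediction error, and the means."""
    means = [sum(errors) / len(errors) for errors in errors_by_lambda]
    best = min(range(len(lambdas)), key=means.__getitem__)
    return lambdas[best], means


groups = assign_groups(n_units=23, n_groups=5, seed=237819)
sizes = sorted(groups.count(g) for g in range(5))
print(sizes)  # [4, 4, 5, 5, 5]

# Made-up errors for three lambda values over two omitted groups.
lambdas = [0.0, 0.1, 1.0]
errors = [[10.0, 12.0], [8.0, 9.0], [11.0, 13.0]]
best_lambda, mean_errors = select_lambda(lambdas, errors)
print(best_lambda)   # 0.1
print(mean_errors)   # [11.0, 8.5, 12.0]
```

If several lambda values tie for the smallest mean error, this sketch returns the first of them; the text does not say how LRIDGE resolves ties.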
Options: PRINT, PLOT, LINK, DISPERSION, TERMS, FACTORIAL, LAMBDA, CROSSVALIDATION, NCROSSVALIDATIONGROUPS, CVMETHOD, SEED.
Parameters: Y, NBINOMIAL, YVALIDATION, XVALIDATION, XDATA, NVALIDATION, BESTLAMBDA, CVSTATISTICS, RESIDUALS, FITTEDVALUES, ESTIMATES, SE, DEVIANCE, LINEARPREDICTOR.
Method
Logistic ridge regression is carried out as described by le Cessie & van Houwelingen (1992). The usual log-likelihood for logistic regression is extended to include a penalty on the sum of squares of the parameter estimates β, namely λ‖β‖², where ‖β‖ = √{∑β²}. When the ridge parameter, lambda, is equal to zero, the parameter estimates will be the usual maximum-likelihood estimates, whereas as lambda tends to infinity all of the parameters tend towards zero. The penalty term is applied by setting the RIDGE option of the TERMS directive. The columns of the design matrix in TERMS are standardized. However, estimated coefficients are available for both the standardized and unstandardized data.
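The effect of the penalty can be illustrated with a small Python sketch that fits a logistic model with an intercept and a single standardized covariate by Newton-Raphson. This is not how LRIDGE itself is implemented (it works through the RIDGE option of TERMS); the function names, the made-up data, the choice to leave the intercept unpenalized and the use of the sample standard deviation for standardization are all assumptions of the sketch.

```python
import math


def standardize(x):
    """Centre and scale x; returns the scaled values, the mean and the s.d."""
    n = len(x)
    mean = sum(x) / n
    sd = math.sqrt(sum((xi - mean) ** 2 for xi in x) / (n - 1))
    return [(xi - mean) / sd for xi in x], mean, sd


def fit_ridge_logistic(z, y, lam, max_iter=100, tol=1e-10):
    """Maximize loglik(b0, b1) - lam * b1**2 by Newton-Raphson.

    Only the slope b1 is penalized (an assumption of this sketch).
    """
    b0 = b1 = 0.0
    for _ in range(max_iter):
        p = [1.0 / (1.0 + math.exp(-(b0 + b1 * zi))) for zi in z]
        w = [pi * (1.0 - pi) for pi in p]
        # Gradient of the penalized log-likelihood.
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum(zi * (yi - pi) for zi, yi, pi in zip(z, y, p)) - 2.0 * lam * b1
        # Negative Hessian; the penalty adds 2 * lam to the slope diagonal.
        h00 = sum(w)
        h01 = sum(wi * zi for wi, zi in zip(w, z))
        h11 = sum(wi * zi * zi for wi, zi in zip(w, z)) + 2.0 * lam
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


# Made-up data with overlapping classes, so the unpenalized fit is finite.
x = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, -0.75, 0.75, 0.25]
y = [0, 0, 1, 0, 1, 1, 1, 0, 0, 1]
z, mean, sd = standardize(x)

fits = {lam: fit_ridge_logistic(z, y, lam) for lam in (0.0, 1.0, 10.0)}
for lam, (b0, b1) in fits.items():
    slope = b1 / sd                    # coefficient on the original scale
    intercept = b0 - b1 * mean / sd
    print(f"lambda={lam:4.1f}  scaled slope={b1:.4f}  "
          f"original slope={slope:.4f}  original intercept={intercept:.4f}")
```

The scaled slope shrinks towards zero as lambda increases, and dividing by the standard deviation converts it back to the scale of the original covariate, mirroring the distinction between the scaledridge and ridge settings of PRINT.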
Action with RESTRICT
There must be no restrictions.
Reference
le Cessie, S. & van Houwelingen, J.C. (1992). Ridge estimators in logistic regression. Applied Statistics, 41, 191-202.
See also
Commands for: Regression analysis.
Example
CAPTION 'LRIDGE example'; STYLE=meta
" Data showing presence/absence of frogs in the Snowy Mountain area
  of New South Wales, Australia. See Maindonald & Braun (2007),
  Data Analysis and Graphics Using R, 2nd Edition."
SPLOAD  '%GENDIR%/Examples/LRID-1.gsh'
POINTER [VALUES=No_of_breeding_sites,altitude,average_rain,mean_max_temp,\
        mean_min_temp,log_No_of_pools,log_distance] xvars
" Try a range of LAMBDA values, and select best by cross-validation."
VARIATE [VALUES=0, 0.001, 0.002...0.01, 0.02, 0.03...0.1, 0.2, 0.3...1,\
        2...5] lambda
LRIDGE  [PRINT=correlation,SCAL,ST,ridge; PLOT=ridgetrace,buildup,correlation;\
        LAMBDA=lambda; CROSSVALIDATION=yes; SEED=237819; TERMS=xvars[]]\
        Y=Present; BEST=optlambda
PRINT   optlambda
LRIDGE  [PRINT=*; LAMBDA=optlambda; TERMS=xvars[]]\
        Y=Present; ESTIMATES=estimates; SE=se; FITTED=prob
PRINT   estimates,se
PRINT   Present,prob