RLASSO procedure

Performs lasso using iteratively reweighted least-squares (D.A. Murray & P.H.C. Eilers).

Options

`PRINT` = string tokens	What output to print (`estimates`, `best`, `crossvalidation`, `progress`, `correlation`, `fitted`, `monitoring`); default `best`
`PLOT` = string tokens	What graphs to plot (`correlation`, `coefficients`); default `*` i.e. none
`TERMS` = formula	Explanatory model
`FACTORIAL` = scalar	Limit on number of factors/covariates in a model term; default 3
`LAMBDA` = variate or scalar	Values for the parameter lambda; must be set
`VALIDATIONMETHOD` = string token	Which cross-validation method to use (`crossvalidation`, `gcv`); default `gcv`
`NCROSSVALIDATIONGROUPS` = scalar	Number of groups for k-fold cross-validation; default 10
`NBOOT` = scalar	Number of times to bootstrap data to estimate standard errors and confidence limits for fitted values; default 100
`SEED` = scalar	Seed for random numbers to use in cross-validation and then in bootstrapping; default 0
`CIPROBABILITY` = scalar	Probability level for confidence interval for fitted values; default 0.95
`MAXCYCLE` = scalar	Maximum number of iterations for the iterative process; default 200
`TOLERANCE` = variate	Contains two values to define the convergence criterion for iterative least-squares and the adjustment to avoid division by zero in the penalty term; default `!(0.0001,1e-08)`

Parameters

`Y` = variates	Response variate
`BESTLAMBDA` = scalars	Saves the optimal lambda value from cross-validation
`CVSTATISTICS` = matrices	Saves the cross-validation statistics
`RESIDUALS` = variates	Saves residuals for the optimal `LAMBDA`
`FITTEDVALUES` = variates	Saves fitted values for the optimal `LAMBDA`
`ESTIMATES` = variates	Saves parameter estimates for the optimal `LAMBDA`
`SE` = variates	Saves standard errors of the parameter estimates for the optimal `LAMBDA`
`SEFITTED` = variates	Saves standard errors of the fitted values, from bootstrapping, for the optimal `LAMBDA`
`LOWER` = variates	Saves lower confidence limits for the fitted values, from bootstrapping, for the optimal `LAMBDA`
`UPPER` = variates	Saves upper confidence limits for the fitted values, from bootstrapping, for the optimal `LAMBDA`

Description

The RLASSO procedure performs L1-penalized regression (lasso) using iteratively reweighted least squares. The lasso method minimizes the residual sum of squares subject to the constraint that the sum of the absolute values of the model coefficients is bounded by a constant; equivalently, the sum of absolute values is penalized with a tuning parameter λ.

The response variate is specified by the Y parameter. The model to be fitted is defined by the TERMS option. The FACTORIAL option sets a limit on the number of variates and/or factors in the model terms generated from the TERMS model formula (as in the FIT directive).

Printed output is controlled by the PRINT option, with settings:

`estimates`	to print, for each value of λ, the lasso coefficients and their standard errors on the standardized and original scales,
`best`	to print the lasso estimates for the optimal λ,
`crossvalidation`	to print the cross-validation results, with the optimal λ value,
`progress`	to show the progress of the k-fold cross-validation,
`correlation`	to print the correlations between the explanatory variables in the `TERMS` formula,
`fitted`	to print the fitted values for the optimal λ with their standard errors and confidence limits, and
`monitoring`	to print monitoring information during bootstrapping.

By default, PRINT=best.

Graphical output is controlled by the PLOT option:

`coefficients`	plots the standardized coefficient estimates against the shrinkage factor, and
`correlation`	uses the `DCORRELATION` procedure to produce a graphical representation of the correlation matrix for elements in `TERMS`.

By default, nothing is plotted.

The LAMBDA option must be set to a variate defining the values to try for the tuning parameter λ. The MAXCYCLE option specifies the maximum number of iterations (default 200). The TOLERANCE option specifies the convergence criterion for the iterative procedure (default 0.0001), and the adjustment to use to avoid division by zero in the penalty term (default 10^-8).

The VALIDATIONMETHOD option controls how RLASSO estimates the tuning parameter λ:

`crossvalidation`	uses k-fold cross-validation where the prediction error is calculated using the mean squared error,
`gcv`	uses the generalized cross-validation, as specified by Tibshirani (1996).

By default, VALIDATIONMETHOD=gcv.
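
For reference, the generalized cross-validation statistic in Tibshirani (1996) can be written as below. The symbols N (number of observations), rss(t), p(t) and W = diag(|β_j|) (with W⁻ a generalized inverse) follow that paper's notation and are not RLASSO option names; whether RLASSO evaluates exactly this expression is an assumption.

```latex
p(t) = \operatorname{tr}\left\{ X \left( X^{T} X + \lambda W^{-} \right)^{-1} X^{T} \right\},
\qquad
\mathrm{GCV}(t) = \frac{1}{N} \, \frac{\mathrm{rss}(t)}{\left\{ 1 - p(t)/N \right\}^{2}}
```

Here p(t) plays the role of the effective number of parameters of the constrained fit, so GCV inflates the mean residual sum of squares more strongly for less heavily shrunk models.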

For k-fold cross-validation the NCROSSVALIDATIONGROUPS option defines the number of subsets to use (default 10). The data are divided into roughly equal-sized subsets and the model is fitted with each subset removed in turn. The mean squared error is calculated for the omitted subset based on the model from fitting the remaining subsets. The value that minimizes the mean prediction error is taken as the optimal λ, and used to get the lasso estimates. The optimal value of λ can be saved by the BESTLAMBDA parameter, and the prediction error values can be saved by the CVSTATISTICS parameter.
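
The k-fold scheme described above can be sketched in Python as follows. This is an illustration rather than the RLASSO code: the helper names `soft_fit`, `kfold_groups` and `cv_curve` are hypothetical, the model is reduced to a single covariate without intercept so that the lasso fit has a closed form, and the random allocation of units to groups is an assumption.

```python
import random


def soft_fit(xs, ys, lam):
    """One-covariate lasso slope (no intercept), minimizing
    0.5 * sum((y - b*x)**2) + lam * abs(b) by soft-thresholding."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    shrunk = max(abs(sxy) - lam, 0.0)
    return (shrunk if sxy >= 0 else -shrunk) / sxx


def kfold_groups(n, k, seed=0):
    """Randomly split unit indices 0..n-1 into k roughly equal groups."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[g::k] for g in range(k)]


def cv_curve(xs, ys, lambdas, k=10, seed=0):
    """Mean over groups of the held-out mean squared error, per lambda."""
    groups = kfold_groups(len(xs), k, seed)
    curve = []
    for lam in lambdas:
        errs = []
        for held in groups:
            held_set = set(held)
            train = [i for i in range(len(xs)) if i not in held_set]
            beta = soft_fit([xs[i] for i in train], [ys[i] for i in train], lam)
            # Prediction error on the omitted subset only.
            errs.append(sum((ys[i] - beta * xs[i]) ** 2 for i in held) / len(held))
        curve.append(sum(errs) / k)
    return curve


# Made-up data, and a lambda grid from 10**1.8 down to 10**-2 in steps of 0.1.
rng = random.Random(1)
xs = [i / 10 for i in range(1, 31)]
ys = [2.0 * x + rng.gauss(0.0, 0.5) for x in xs]
lambdas = [10 ** (e / 10) for e in range(18, -21, -1)]

curve = cv_curve(xs, ys, lambdas, k=10)
best_lambda = lambdas[curve.index(min(curve))]
```

In this sketch `curve` corresponds loosely to what CVSTATISTICS saves and `best_lambda` to BESTLAMBDA.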

RLASSO can use bootstrapping to provide standard errors and lower and upper confidence limits for the fitted values. The NBOOT option specifies the number of bootstrap samples that are taken, and the CIPROBABILITY option sets the probability level of the confidence interval.
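
One way such bootstrap summaries could be computed is sketched below in Python. The source does not say which bootstrap scheme RLASSO uses, so resampling whole units with replacement and taking percentile limits are both assumptions, and `bootstrap_fitted` and `slope_fit` are hypothetical names. A least-squares slope through the origin stands in for the lasso fit to keep the sketch short.

```python
import random
import statistics


def slope_fit(xs, ys):
    """Stand-in fitter: least-squares slope through the origin.
    Returns a function that predicts y for a given x."""
    b = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    return lambda x: b * x


def bootstrap_fitted(xs, ys, fit, nboot=100, ciprob=0.95, seed=0):
    """Return (standard error, lower limit, upper limit) for the fitted
    value of every original unit, from nboot case-resampled refits."""
    rng = random.Random(seed)
    n = len(xs)
    draws = [[] for _ in range(n)]
    for _ in range(nboot):
        pick = [rng.randrange(n) for _ in range(n)]  # resample units
        predict = fit([xs[i] for i in pick], [ys[i] for i in pick])
        for j, x in enumerate(xs):
            draws[j].append(predict(x))
    alpha = (1.0 - ciprob) / 2.0
    out = []
    for d in draws:
        d.sort()
        lower = d[int(alpha * (nboot - 1))]  # simple percentile limits
        upper = d[int(round((1.0 - alpha) * (nboot - 1)))]
        out.append((statistics.stdev(d), lower, upper))
    return out


# Made-up data for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0]
results = bootstrap_fitted(xs, ys, slope_fit, nboot=100, ciprob=0.95, seed=0)
```

Fixing `seed` makes the resampling reproducible, which is the role the SEED option plays for RLASSO.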

You can save results from the optimal fit using the RESIDUALS, FITTEDVALUES, ESTIMATES, SE, SEFITTED, LOWER and UPPER parameters. Note that the residuals are simple residuals rather than standardized residuals.

Options: PRINT, PLOT, TERMS, FACTORIAL, LAMBDA, VALIDATIONMETHOD, NCROSSVALIDATIONGROUPS, NBOOT, SEED, CIPROBABILITY, MAXCYCLE, TOLERANCE.

Parameters: Y, BESTLAMBDA, CVSTATISTICS, RESIDUALS, FITTEDVALUES, ESTIMATES, SE, SEFITTED, LOWER, UPPER.

Method

Lasso is carried out by using iteratively reweighted least-squares. RLASSO approximates the absolute sum of the coefficients ∑|β| by ∑(β²/|β|), and the penalty term λ∑(β²/|β|) is imposed on the sum of squares of the parameter estimates β. The penalty term is applied to the diagonal elements of the sums-of-squares-and-products matrix by setting the RIDGE option of the TERMS directive. For a given value of λ, the algorithm iterates to find the lasso estimates. The shrinkage factor s is estimated by

s = t / ∑|β⁽⁰⁾|

where ∑|β⁽⁰⁾| is the absolute sum of the full least squares estimates, and t is the absolute sum of the lasso estimates subject to

t ≤ ∑|β⁽⁰⁾|.
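
The iteration and the shrinkage factor can be illustrated with a small Python sketch. It is not the RLASSO implementation: `solve`, `irls_lasso` and `shrinkage_factor` are hypothetical helpers, the inputs are the sums-of-squares-and-products matrix X'X and the vector X'y, and the scaling of λ and the stopping rule (largest change in any coefficient) are assumptions. The defaults mirror MAXCYCLE and the two TOLERANCE values.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def irls_lasso(xtx, xty, lam, maxcycle=200, tol=1e-4, eps=1e-8):
    """Repeatedly solve (X'X + diag(lam / (|beta| + eps))) beta = X'y,
    starting from the full least-squares estimates."""
    p = len(xty)
    beta = solve(xtx, xty)
    for _ in range(maxcycle):
        # Ridge-type penalty added to the diagonal; eps avoids division by zero.
        ridge = [lam / (abs(b) + eps) for b in beta]
        a = [[xtx[i][j] + (ridge[i] if i == j else 0.0) for j in range(p)]
             for i in range(p)]
        new = solve(a, xty)
        change = max(abs(u - v) for u, v in zip(new, beta))
        beta = new
        if change < tol:
            break
    return beta


def shrinkage_factor(beta_lasso, beta_ls):
    """s = t / sum|beta0|, with t the absolute sum of the lasso estimates."""
    return sum(abs(b) for b in beta_lasso) / sum(abs(b) for b in beta_ls)


# Orthonormal toy design, so the least-squares estimates equal X'y.
xtx = [[1.0, 0.0], [0.0, 1.0]]
xty = [3.0, 0.5]
beta_ls = solve(xtx, xty)
beta_lasso = irls_lasso(xtx, xty, lam=1.0)
s = shrinkage_factor(beta_lasso, beta_ls)
```

With this toy design the iteration shrinks the first coefficient to roughly 2 and drives the second close to zero, which is the soft-thresholding behaviour expected of the lasso for orthonormal columns.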

The columns of the design matrix in TERMS are standardized. However, estimated coefficients are available for both the standardized and unstandardized data.
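
The relationship between the two scales can be sketched as follows, in Python. The source does not state how RLASSO scales the columns, so centring on the mean and dividing by the sample standard deviation is an assumption, and `standardize` and `to_original_scale` are hypothetical names.

```python
import statistics


def standardize(column):
    """Centre a column on its mean and scale it to unit standard deviation."""
    mean = statistics.fmean(column)
    sd = statistics.stdev(column)
    return [(v - mean) / sd for v in column], mean, sd


def to_original_scale(beta_std, means, sds, intercept_std):
    """Convert coefficients fitted to standardized columns back to the
    scale of the original columns."""
    beta = [b / s for b, s in zip(beta_std, sds)]
    intercept = intercept_std - sum(b * m for b, m in zip(beta, means))
    return beta, intercept


# Made-up column and coefficients for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
z, mean, sd = standardize(x)
beta, intercept = to_original_scale([2.0], [mean], [sd], 10.0)
```

Both parameterizations give the same fitted values; only the units of the coefficients differ.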

Action with `RESTRICT`

There must be no restrictions.

References

Hastie, T., Tibshirani, R. & Friedman, J. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd Edition. Springer, New York.

Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B, 58, 267-288.

Example

CAPTION   'RLASSO example'; STYLE=meta
" Prostate cancer data examining the correlation between the level of
  prostate-specific antigen and some clinical measures. See Tibshirani (1996),
  Regression Shrinkage and Selection via the Lasso, JRSS B, 58, 267-288."
SPLOAD    '%GENDIR%/Examples/RLAS-1.gsh'
SUBSET    [train.eq.2] lcavol,lweight,age,lbph,svi,lcp,gleason,pgg45,lpsa
CALCULATE lambdas = 10**(!(1.8,1.7...-2))
RLASSO    [PRINT=correlation,estimates,cross,best;\
          PLOT=coefficients,correlation; LAMBDA=lambdas;\
          TERMS=lcavol,lweight,age,lbph,svi,lcp,gleason,pgg45]\
          Y=lpsa; BEST=optlambda; ESTIMATES=estimates; SE=se
PRINT     optlambda
PRINT     estimates,se

Updated on October 28, 2020
