Performs lasso using iteratively reweighted least-squares (D.A. Murray & P.H.C. Eilers).

### Options

`PRINT` = string tokens | What output to print (`correlation`, `crossvalidation`, `progress`, `estimates`, `best`); default `best`
---|---
`PLOT` = string tokens | What graphs to plot (`correlation`, `coefficients`); default `*` i.e. none
`TERMS` = formula | Explanatory model
`FACTORIAL` = scalar | Limit on number of factors/covariates in a model term; default 3
`LAMBDA` = variate or scalar | Values for the parameter lambda; must be set
`VALIDATIONMETHOD` = string token | Which cross-validation method to use (`crossvalidation`, `gcv`); default `gcv`
`NCROSSVALIDATIONGROUPS` = scalar | Number of groups for k-fold cross-validation; default 10
`SEED` = scalar | Seed for random numbers to use in cross-validation; default 0
`MAXCYCLE` = scalar | Maximum number of iterations for the iterative process; default 200
`TOLERANCE` = variate | Contains two values to define the convergence criterion for iterative least-squares and the adjustment to avoid division by zero in the penalty term; default `!(0.0001,1e-08)`

### Parameters

`Y` = variates | Response variate
---|---
`BESTLAMBDA` = scalars | Saves the optimal lambda value from cross-validation
`CVSTATISTICS` = matrices | Saves the cross-validation statistics
`RESIDUALS` = variates | Saves residuals for the optimal `LAMBDA`
`FITTEDVALUES` = variates | Saves fitted values for the optimal `LAMBDA`
`ESTIMATES` = variates | Saves parameter estimates for the optimal `LAMBDA`
`SE` = variates | Saves standard errors of the parameter estimates for the optimal `LAMBDA`

### Description

The `RLASSO` procedure performs L1-penalized regression (*lasso*) using iteratively reweighted least squares. The lasso method minimizes the residual sum of squares subject to the constraint that the sum of the absolute values of the model coefficients does not exceed a constant. In the equivalent penalized form used by the procedure, the strength of this constraint is controlled by the tuning parameter λ.
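In symbols, the constrained problem described above and the penalized form that corresponds to it can be sketched as follows. The notation (*N* observations, predictors x_{ij}, intercept β_0, bound *t*) is introduced here for illustration and is not taken from the procedure itself.

```latex
% Constrained form: residual sum of squares with a bound t on the L1 norm
\hat\beta = \arg\min_{\beta}\ \sum_{i=1}^{N}\Bigl(y_i - \beta_0 - \sum_{j} x_{ij}\beta_j\Bigr)^{2}
\quad\text{subject to}\quad \sum_{j}|\beta_j| \le t

% Penalized form: the same solutions, indexed by lambda >= 0
\hat\beta = \arg\min_{\beta}\ \sum_{i=1}^{N}\Bigl(y_i - \beta_0 - \sum_{j} x_{ij}\beta_j\Bigr)^{2}
+ \lambda \sum_{j}|\beta_j|
```

Each bound *t* corresponds to some λ ≥ 0 that gives the same solution, so a single tuning parameter is enough to index the fits.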

The response variate is specified by the `Y` parameter. The model to be fitted is defined by the `TERMS` option. The `FACTORIAL` option sets a limit on the number of variates and/or factors in the model terms generated from the `TERMS` model formula (as in the `FIT` directive).

Printed output is controlled by the `PRINT` option, with settings:

`correlation` | prints the correlations between the explanatory variables in the `TERMS` formula,
---|---
`crossvalidation` | prints the cross-validation results, with the optimal λ value,
`progress` | shows the progress of the k-fold cross-validation,
`best` | prints the lasso estimates for the optimal λ, and
`estimates` | prints, for each value of λ, the lasso coefficients and their standard errors on the standardized and original scales.

By default, `PRINT=best`.

Graphical output is controlled by the `PLOT` option:

`coefficients` | plots the standardized coefficient estimates against the shrinkage factor, and
---|---
`correlation` | uses the `DCORRELATION` procedure to produce a graphical representation of the correlation matrix for elements in `TERMS`.

By default, nothing is plotted.

The `LAMBDA` option must be set to a variate defining the values to try for the tuning parameter λ. The `MAXCYCLE` option specifies the maximum number of iterations (default 200). The `TOLERANCE` option specifies the convergence criterion for the iterative procedure (default 0.0001), and the adjustment to use to avoid division by zero in the penalty term (default 10^{-8}).
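Candidate values for λ are often taken on a grid that is evenly spaced on the log scale. The sketch below is written in Python rather than Genstat, purely for illustration, and reproduces the grid used in the Example section (powers of 10 with exponents from 1.8 down to -2 in steps of 0.1). The variable names are assumptions of this sketch.

```python
# Exponents 1.8, 1.7, ..., -2.0 in steps of 0.1 (39 values).
# Rounding to one decimal place avoids accumulated floating-point drift.
exponents = [round(1.8 - 0.1 * i, 1) for i in range(39)]

# Candidate tuning parameters, largest (strongest shrinkage) first.
lambdas = [10.0 ** e for e in exponents]
```

Ordering the grid from large to small λ moves from heavily shrunken fits towards the unpenalized least-squares fit.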

The `VALIDATIONMETHOD` option controls how `RLASSO` estimates the tuning parameter λ:

`crossvalidation` | uses k-fold cross-validation where the prediction error is calculated using the mean squared error,
---|---
`gcv` | uses the generalized cross-validation, as specified by Tibshirani (1996).

By default, `VALIDATIONMETHOD=gcv`.
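For reference, the generalized cross-validation statistic given by Tibshirani (1996) has the form below. This is a transcription of the published formula, indexed here by λ and using notation assumed for this sketch (*N* observations, design matrix *X*, lasso estimates β̃, and W^{-} a generalized inverse of *W*). How `RLASSO` evaluates it internally is not stated in this documentation.

```latex
\mathrm{GCV}(\lambda) = \frac{1}{N}\,
  \frac{\mathrm{RSS}(\lambda)}{\bigl\{1 - p(\lambda)/N\bigr\}^{2}},
\qquad
p(\lambda) = \operatorname{tr}\Bigl\{ X \bigl( X^{\top}X + \lambda W^{-} \bigr)^{-1} X^{\top} \Bigr\},
\qquad
W = \operatorname{diag}\bigl(|\tilde\beta_j|\bigr)
```

Here p(λ) acts as an approximate number of effective parameters, so the statistic penalizes the residual sum of squares more heavily for less constrained fits.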

For k-fold cross-validation the `NCROSSVALIDATIONGROUPS` option defines the number of subsets to use (default 10). The data are divided into roughly equal-sized subsets and the model is fitted with each subset removed in turn. The mean squared error is calculated for the omitted subset based on the model from fitting the remaining subsets. The value of λ that minimizes the mean prediction error is taken as the optimal λ, and is used to obtain the lasso estimates. The optimal value of λ can be saved by the `BESTLAMBDA` parameter, and the prediction error values can be saved by the `CVSTATISTICS` parameter.
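The k-fold procedure described above can be sketched as follows. This is an illustration in Python rather than Genstat, not the code used by `RLASSO`. To keep it short it assumes a single predictor with no intercept, for which the lasso estimate has a closed form; the function names, the penalty scaling, and the way observations are allocated to groups are all assumptions of the sketch.

```python
import random


def lasso_slope(x, y, lam):
    """One-predictor lasso without intercept.

    Minimizes sum((y - b*x)**2) + lam*|b|, which has the closed-form
    soft-thresholded solution below (scaling chosen for this sketch).
    """
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    shrunk = max(abs(sxy) - lam / 2.0, 0.0)
    return (shrunk if sxy >= 0 else -shrunk) / sxx


def kfold_cv(x, y, lambdas, n_groups=10, seed=0):
    """Return (best_lambda, mean squared prediction error for each lambda)."""
    rng = random.Random(seed)          # seed only makes the split reproducible
    idx = list(range(len(y)))
    rng.shuffle(idx)
    # Roughly equal-sized groups: every n_groups-th shuffled index.
    folds = [idx[g::n_groups] for g in range(n_groups)]
    cv = []
    for lam in lambdas:
        fold_mse = []
        for held in folds:
            held_set = set(held)
            train = [i for i in idx if i not in held_set]
            # Fit with the group removed, then predict the omitted group.
            b = lasso_slope([x[i] for i in train], [y[i] for i in train], lam)
            fold_mse.append(sum((y[i] - b * x[i]) ** 2 for i in held) / len(held))
        cv.append(sum(fold_mse) / len(fold_mse))
    best = lambdas[min(range(len(lambdas)), key=cv.__getitem__)]
    return best, cv
```

In this sketch the returned list plays the role of the values saved by `CVSTATISTICS`, and the returned λ the role of `BESTLAMBDA`.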

You can save results from the optimal fit using the `RESIDUALS`, `FITTEDVALUES`, `ESTIMATES` and `SE` parameters. Note that the residuals are the simple residuals, rather than standardized residuals.

Options: `PRINT`, `PLOT`, `TERMS`, `FACTORIAL`, `LAMBDA`, `VALIDATIONMETHOD`, `NCROSSVALIDATIONGROUPS`, `SEED`, `MAXCYCLE`, `TOLERANCE`.

Parameters: `Y`, `BESTLAMBDA`, `CVSTATISTICS`, `RESIDUALS`, `FITTEDVALUES`, `ESTIMATES`, `SE`.

### Method

Lasso is carried out by using iteratively reweighted least-squares. `RLASSO` approximates the absolute sum of the coefficients ∑|β| by ∑(β^{2}/|β|), where the |β| in the denominator is taken from the previous iteration, so that the penalty term λ∑(β^{2}/|β|) becomes a weighted penalty on the squares of the parameter estimates β. The penalty term is applied to the diagonal elements of the sums-of-squares-and-products matrix by setting the `RIDGE` option of the `TERMS` directive. For a given value of λ, the algorithm iterates to find the lasso estimates. The shrinkage factor *s* is estimated by

*s* = *t* / ∑|β^{(0)}|

where ∑|β^{(0)}| is the absolute sum of the full least squares estimates, and *t* is the absolute sum of the lasso estimates subject to

*t* ≤ ∑|β^{(0)}|.
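The iteration and the shrinkage factor can be sketched as follows. This is a minimal illustration in plain Python, not the `RLASSO` source. It assumes the columns of `X` are already standardized, uses the convention that λ/(|β| + adjustment) is added to each diagonal element of X'X (which corresponds to minimizing half the residual sum of squares plus λ∑|β|), and treats convergence as the largest change in any coefficient falling below the tolerance. All function names are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def irls_lasso(X, y, lam, max_cycle=200, tol=1e-4, eps=1e-8):
    """Approximate lasso estimates by iteratively reweighted least squares."""
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    beta = solve(xtx, xty)                      # start from full least squares
    for _ in range(max_cycle):
        a = [row[:] for row in xtx]
        for j in range(p):
            # Ridge-type penalty on the diagonal; eps avoids division by zero.
            a[j][j] += lam / (abs(beta[j]) + eps)
        new = solve(a, xty)
        change = max(abs(u - v) for u, v in zip(new, beta))
        beta = new
        if change < tol:
            break
    return beta


def shrinkage_factor(beta_lasso, beta_ols):
    """s = t / sum|beta0|, with t the absolute sum of the lasso estimates."""
    return sum(abs(b) for b in beta_lasso) / sum(abs(b) for b in beta_ols)


# Toy orthonormal design: least-squares estimates are [4, 2], and with
# lam = 1 the iteration settles close to [3, 1], giving s close to 4/6.
X = [[0.5, 0.5], [0.5, -0.5], [0.5, 0.5], [0.5, -0.5]]
y = [3.0, 1.0, 3.0, 1.0]
beta = irls_lasso(X, y, lam=1.0)
s = shrinkage_factor(beta, [4.0, 2.0])
```

Because the reweighting never sets a coefficient exactly to zero, coefficients that the lasso would remove end up very small rather than exactly zero in this sketch.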

The columns of the design matrix in `TERMS` are standardized. However, estimated coefficients are available for both the standardized and unstandardized data.
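The relationship between the two scales can be sketched as below, in Python for illustration only. The documentation does not say exactly how the columns are standardized, so this sketch assumes centring to mean zero and scaling to unit sample standard deviation, with the response centred but not scaled. The function names are hypothetical.

```python
import statistics


def standardize_columns(X):
    """Centre each column to mean 0 and scale to sample standard deviation 1."""
    cols = list(zip(*X))
    means = [statistics.mean(c) for c in cols]
    sds = [statistics.stdev(c) for c in cols]
    Z = [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in X]
    return Z, means, sds


def to_original_scale(beta_std, means, sds, y_mean):
    """Convert coefficients fitted on standardized columns to the original scale."""
    beta = [b / s for b, s in zip(beta_std, sds)]
    # With a centred response, the intercept absorbs the column means.
    intercept = y_mean - sum(b * m for b, m in zip(beta, means))
    return intercept, beta
```

Standardizing first puts every coefficient on a comparable scale, so the same λ penalizes each term equally regardless of its original units.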

### Action with `RESTRICT`

There must be no restrictions.

### References

Hastie, T., Tibshirani, R. & Friedman, J. (2009). *The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd Edition*. Springer, New York.

Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. *Journal of the Royal Statistical Society, Series B*, 58, 267-288.

### See also

Procedure: `LRIDGE`.

Commands for: Regression analysis.

### Example

    CAPTION 'RLASSO example'; STYLE=meta
    " Prostate cancer data examining the correlation between the level of
      prostate-specific antigen and some clinical measures.
      See Tibshirani (1996), Regression and Selection by Lasso, JRSS B, 58, 267-288."
    SPLOAD '%GENDIR%/Examples/RLAS-1.gsh'
    SUBSET [train.eq.2] lcavol,lweight,age,lbph,svi,lcp,gleason,pgg45,lpsa
    CALCULATE lambdas = 10**(!(1.8,1.7...-2))
    RLASSO [PRINT=correlation,estimates,cross,best;\
           PLOT=coefficients,correlation; LAMBDA=lambdas;\
           TERMS=lcavol,lweight,age,lbph,svi,lcp,gleason,pgg45]\
           Y=lpsa; BEST=optlambda; ESTIMATES=estimates; SE=se
    PRINT optlambda
    PRINT estimates,se