RIDGE procedure

Produces ridge regression and principal component regression analyses (A.J. Rook & M.S. Dhanoa).

Options

`PRINT` = string token	What to print (`correlation`, `pcp`, `ridge`); default `corr`
`PLOT` = string token	Graphical output required (`ridgetrace`); default *

Parameters

`Y` = variates	Response variate in regression model
`X` = pointers	Containing explanatory variates in regression model

Description

Procedure RIDGE produces analyses for identifying and overcoming collinearity among the explanatory variates in a multiple regression analysis. It calculates the correlation matrix, the variance inflation factors (the diagonal elements of the inverse of the correlation matrix) and the ratio of the squared error in the least squares regression coefficients to the squared error expected in orthogonal data.

Principal component regressions excluding 1, 2 or 3 minor principal axes are calculated and transformed back to the original variables on either the original or the standardized scale. The “Positive correlation spread association” (PCSA) of Vinod (1976), an overall measure of the suitability of the data for principal component regression and ridge regression, is also calculated.

Ridge regressions (Hoerl & Kennard 1970) are calculated, and the ridge coefficients are printed together with two indices of stability proposed by Vinod (1976): the index of stability of relative magnitudes (ISRM) and the numerical largeness of more significant regression coefficients (NLMS). In orthogonal data these are 0 and 1 respectively. High-resolution graphs of the ridge trace can be plotted against Hoerl & Kennard’s k scale and Vinod’s m scale.

The parameters of the procedure are used to input the data: the Y parameter supplies the y-variate, and the X parameter specifies a pointer containing the x-variates. None of these variates may be restricted or contain missing values.

Printed output is controlled by the PRINT option: correlation prints the correlation matrix, the variance inflation factors and the ratio of squared error to that in orthogonal data; pcp prints the principal component analysis and the principal component regressions; and ridge prints the ridge coefficients and stability parameters.

Graphical output is controlled by the PLOT option: ridgetrace produces ridge traces.

Options: PRINT, PLOT.

Parameters: Y, X.

Method

The correlation matrix is produced using the CORRELATE directive. This is then used to calculate the variance inflation factors (VIF) and the ratio of the squared error to that in orthogonal data (RL).
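
The calculation described above can be illustrated outside Genstat. The following is a minimal Python sketch, not the procedure's own code: it uses NumPy on the explanatory variates from the Example section, and it assumes that the ratio RL can be taken as the mean of the variance inflation factors (the form given by Chatterjee & Price). The function name `variance_inflation_factors` is hypothetical.

```python
import numpy as np

# Explanatory variates from the Example section (columns: Doprod, Stock, Consum).
X = np.array([
    [149.3, 4.2, 108.1], [161.2, 4.1, 114.8], [171.5, 3.1, 123.2],
    [175.5, 3.1, 126.9], [180.8, 1.1, 132.1], [190.7, 2.2, 137.7],
    [202.1, 2.1, 146.0], [212.4, 5.6, 154.1], [226.1, 5.0, 162.3],
    [231.9, 5.1, 164.3], [239.0, 0.7, 167.6],
])


def variance_inflation_factors(x):
    """Return the correlation matrix of the columns of x and their VIFs."""
    corr = np.corrcoef(x, rowvar=False)
    # VIFs are the diagonal elements of the inverse correlation matrix.
    vif = np.diag(np.linalg.inv(corr))
    return corr, vif


corr, vif = variance_inflation_factors(X)
# Assumption: ratio to the orthogonal case taken as the mean VIF.
rl = vif.mean()

print(np.round(corr, 3))
print("VIF:", np.round(vif, 2), "RL:", round(float(rl), 2))
```

Large factors for Doprod and Consum, next to a factor close to 1 for Stock, indicate that the collinearity lies between the first and third variates.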

Principal component analysis is carried out using the PCP directive. The standardized response variable is regressed on the principal component scores using the MODEL, FIT and TERMS directives. The coefficients are then transformed back to the original variables, on either the standardized or the original scale, with up to three principal components excluded. The correlations of the standardized response variable with each of the principal components are also printed.
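
As a rough illustration of these steps, the Python sketch below regresses the standardized response on the principal component scores of the correlation matrix, drops the components with the smallest eigenvalues, and transforms the coefficients back. It is a sketch under stated assumptions rather than a transcription of the procedure: the helper names `standardize`, `pcr_coefficients` and `to_original_scale` are hypothetical, and standardization by the sample standard deviation (divisor n - 1) is assumed.

```python
import numpy as np

# Data from the Example section (columns: Import, Doprod, Stock, Consum).
data = np.array([
    [15.9, 149.3, 4.2, 108.1], [16.4, 161.2, 4.1, 114.8],
    [19.0, 171.5, 3.1, 123.2], [19.1, 175.5, 3.1, 126.9],
    [18.8, 180.8, 1.1, 132.1], [20.4, 190.7, 2.2, 137.7],
    [22.7, 202.1, 2.1, 146.0], [26.5, 212.4, 5.6, 154.1],
    [28.1, 226.1, 5.0, 162.3], [27.6, 231.9, 5.1, 164.3],
    [26.3, 239.0, 0.7, 167.6],
])
y, X = data[:, 0], data[:, 1:]


def standardize(a):
    """Centre and scale by the sample standard deviation (assumed divisor n - 1)."""
    return (a - a.mean(axis=0)) / a.std(axis=0, ddof=1)


def pcr_coefficients(X, y, n_excluded):
    """Standardized-scale coefficients after dropping the n_excluded minor axes."""
    p = X.shape[1]
    if not 0 <= n_excluded < p:
        raise ValueError("n_excluded must leave at least one component")
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(X, rowvar=False))
    order = np.argsort(eigvals)[::-1]            # largest eigenvalue first
    keep = eigvecs[:, order][:, : p - n_excluded]
    scores = standardize(X) @ keep               # principal component scores
    alpha, *_ = np.linalg.lstsq(scores, standardize(y), rcond=None)
    return keep @ alpha                          # back to the x-variates


def to_original_scale(beta_std, X, y):
    """Convert standardized coefficients to an intercept and original-scale slopes."""
    slopes = beta_std * y.std(ddof=1) / X.std(axis=0, ddof=1)
    intercept = y.mean() - slopes @ X.mean(axis=0)
    return intercept, slopes


for n_excluded in range(3):
    beta = pcr_coefficients(X, y, n_excluded)
    print(n_excluded, np.round(beta, 4), to_original_scale(beta, X, y))
```

With no components excluded the result is the ordinary least squares fit. Each excluded component can only shrink the length of the standardized coefficient vector, because the scores are mutually orthogonal.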

Ridge regression is carried out as described by Hoerl & Kennard (1970). Ridge coefficients on both the standardized and original scales are printed for values of the biasing parameter k between 0 and 1, together with the standard errors of the coefficients on the standardized scale. The residual sum of squares (RSS), R-squared and the total variance of the ridge coefficients (TVARB) are printed for each value of k. In addition, Vinod’s (1976) m scale is printed together with the index of stability of relative magnitudes (ISRM) and the numerical largeness of the more significant regression coefficients (NLMS).
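
The ridge coefficients on the standardized scale can be sketched in Python as the solution of (R + kI)b = r, where R is the correlation matrix of the x-variates and r holds their correlations with the response. This is an illustrative sketch, not the procedure's code. The function name `ridge_path` and the grid of k values are hypothetical, and the m scale is computed under the assumption that it is defined as m = p - sum(lambda_i / (lambda_i + k)) over the eigenvalues lambda_i of R.

```python
import numpy as np

# Data from the Example section (columns: Import, Doprod, Stock, Consum).
data = np.array([
    [15.9, 149.3, 4.2, 108.1], [16.4, 161.2, 4.1, 114.8],
    [19.0, 171.5, 3.1, 123.2], [19.1, 175.5, 3.1, 126.9],
    [18.8, 180.8, 1.1, 132.1], [20.4, 190.7, 2.2, 137.7],
    [22.7, 202.1, 2.1, 146.0], [26.5, 212.4, 5.6, 154.1],
    [28.1, 226.1, 5.0, 162.3], [27.6, 231.9, 5.1, 164.3],
    [26.3, 239.0, 0.7, 167.6],
])
y, X = data[:, 0], data[:, 1:]


def ridge_path(X, y, ks):
    """Standardized ridge coefficients for each k, with the assumed m scale."""
    p = X.shape[1]
    full = np.corrcoef(np.column_stack([X, y]), rowvar=False)
    R, r_xy = full[:p, :p], full[:p, p]
    eigvals = np.linalg.eigvalsh(R)
    # One coefficient vector per value of the biasing parameter k.
    betas = np.array([np.linalg.solve(R + k * np.eye(p), r_xy) for k in ks])
    # Assumed definition of Vinod's m: 0 at k = 0, approaching p as k grows.
    m = np.array([p - np.sum(eigvals / (eigvals + k)) for k in ks])
    return betas, m


ks = np.linspace(0.0, 1.0, 11)   # hypothetical grid; k = 0 is least squares
betas, m = ridge_path(X, y, ks)
for k, mk, b in zip(ks, m, betas):
    print(f"k={k:.1f}  m={mk:.3f}  coefficients={np.round(b, 4)}")
```

Plotting each column of `betas` against `ks` or against `m` gives the two forms of ridge trace described for the PLOT option.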

When requested by the PLOT option, high-resolution graphs of the ridge traces are produced. These are graphs of the ridge coefficients against k or against m, and they are plotted on the graphics device set up before the procedure is called.

Action with `RESTRICT`

The input variates must not be restricted.

References

Chatterjee, S. & Price, B. (1991). Regression Analysis by Example (second edition). New York, Wiley.

Hoerl, A.E. & Kennard, R.W. (1970). Ridge regression: biased estimation for nonorthogonal problems. Technometrics, 12, 55-67.

Vinod, H.D. (1976). Application of new ridge regression methods to a study of Bell system scale economies. Journal of the American Statistical Association, 71, 835-841.

Example

CAPTION 'RIDGE example',\
        !t('Data on French economy from Chatterjee & Price',\
        '(1991, pages 182, 185, 213, 218 and 220).');\
        STYLE=meta,plain
VARIATE [NVALUES=11] Import,Doprod,Stock,Consum
READ    Import,Doprod,Stock,Consum
15.9 149.3 4.2 108.1
16.4 161.2 4.1 114.8
19.0 171.5 3.1 123.2
19.1 175.5 3.1 126.9
18.8 180.8 1.1 132.1
20.4 190.7 2.2 137.7
22.7 202.1 2.1 146.0
26.5 212.4 5.6 154.1
28.1 226.1 5.0 162.3
27.6 231.9 5.1 164.3
26.3 239.0 0.7 167.6 :
RIDGE [PRINT=corr,pcp,ridge] Import; !p(Doprod,Stock,Consum)

Updated on March 5, 2019
