Produces ridge regression and principal component regression analyses (A.J. Rook & M.S. Dhanoa).
Options
PRINT = string token | What to print (correlation, pcp, ridge); default corr |
---|---|
PLOT = string token | Graphical output required (ridgetrace); default * |
Parameters
Y = variates | Response variate in regression model |
---|---|
X = pointers | Containing explanatory variates in regression model |
Description
Procedure RIDGE produces analyses for identifying and overcoming collinearity among the independent variates in a multiple regression analysis. The correlation matrix, variance inflation factors (the diagonal elements of the inverse of the correlation matrix) and the ratio of the squared error in the least squares regression coefficients to the expected squared error in orthogonal data are calculated. Principal component regressions excluding 1, 2 or 3 minor principal axes are calculated and transformed back to the original variables on either the original or standardized scale. The “Positive correlation spread association” (PCSA) (Vinod 1976) is also calculated; this is an overall measure of the suitability of the data for the application of principal component regression and ridge regression. Ridge regressions (Hoerl & Kennard 1970) are calculated, and the ridge coefficients are printed together with two indices of stability proposed by Vinod (1976): the index of stability of relative magnitudes (ISRM) and the numerical largeness of more significant regression coefficients (NLMS). In orthogonal data these take the values 0 and 1, respectively. High-resolution graphs of the ridge trace can be plotted against Hoerl & Kennard’s k scale and Vinod’s m scale.
The parameters of the procedure are used to input the data: the Y parameter supplies the y-variate, and the X parameter specifies a pointer containing the x-variates. None of these variates may be restricted or contain missing values.
Printed output is controlled by the PRINT option: correlation prints the correlation matrix, variance inflation factors and ratio of squared error to that in orthogonal data; pcp prints principal component analysis and principal component regression; and ridge prints ridge coefficients and stability parameters.
Graphical output is controlled by the PLOT option: ridgetrace produces ridge traces.
Options: PRINT, PLOT.
Parameters: Y, X.
Method
The correlation matrix is produced using the CORRELATE directive. This is then used to calculate the variance inflation factors (VIF) and the ratio of the squared error to that in orthogonal data (RL).
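As an illustration of the variance inflation factors described above (the diagonal elements of the inverse of the correlation matrix), the following is a minimal Python sketch. It is not the procedure's own code: the function names `invert` and `variance_inflation_factors` are hypothetical, and the correlation matrix is assumed to be supplied as a list of rows.

```python
def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Choose the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def variance_inflation_factors(corr):
    """Return the diagonal of the inverse of the correlation matrix (hypothetical helper)."""
    inverse = invert(corr)
    return [inverse[i][i] for i in range(len(corr))]
```

For uncorrelated explanatory variates every factor equals 1; values grow without bound as the variates approach exact collinearity, at which point the inversion fails.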
Principal component analysis is carried out using the PCP directive. The standardized response variable is regressed on the principal component scores using the MODEL, FIT and TERMS directives. The coefficients are then transformed back to the original variables, on either the standardized or original scale, with up to three principal components excluded. The correlations of the standardized variable with each of the principal components are also printed.
Ridge regression is carried out as described by Hoerl & Kennard (1970). Ridge coefficients on both the standardized and original scales are printed for values of the biasing parameter k between 0 and 1, together with the standard errors of the coefficients on the standardized scale. The residual sum of squares (RSS), R-squared and total variance of the ridge coefficients (TVARB) are printed for each value of k. In addition, Vinod’s (1976) m scale is printed together with the index of stability of relative magnitudes (ISRM) and the numerical largeness of the more significant regression coefficients (NLMS).
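The ridge calculation on the standardized scale can be sketched in Python as follows. This is an illustrative sketch rather than the procedure's implementation. It assumes the correlation form of the Hoerl & Kennard estimator, b(k) = (R + kI)^-1 r, where R is the correlation matrix of the explanatory variates and r holds their correlations with the response. It also assumes that Vinod's m is p minus the sum of lambda_i/(lambda_i + k) over the eigenvalues lambda_i of R, which equals k times the trace of (R + kI)^-1. The names `invert` and `ridge_on_standardized_scale` are hypothetical.

```python
def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ridge_on_standardized_scale(corr_xx, corr_xy, k):
    """Return (ridge coefficients, m) for biasing parameter k (hypothetical helper).

    corr_xx: correlation matrix of the explanatory variates (list of rows).
    corr_xy: correlations of each explanatory variate with the response.
    """
    p = len(corr_xx)
    # Add k to the diagonal of the correlation matrix: R + kI.
    shifted = [[corr_xx[i][j] + (k if i == j else 0.0) for j in range(p)]
               for i in range(p)]
    inverse = invert(shifted)
    # Standardized ridge coefficients: (R + kI)^-1 r.
    coefficients = [sum(inverse[i][j] * corr_xy[j] for j in range(p))
                    for i in range(p)]
    # Assumed form of Vinod's m: k * trace((R + kI)^-1); m = 0 when k = 0.
    m = k * sum(inverse[i][i] for i in range(p))
    return coefficients, m
```

Evaluating the function over a grid of k values between 0 and 1 gives the points of a ridge trace; with k = 0 the coefficients reduce to the ordinary least squares coefficients on the standardized scale.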
High-resolution graphs of the ridge traces are produced. These are graphs of the ridge coefficients against k or against m and are plotted to the device set up prior to calling the procedure.
Action with RESTRICT
The input variates must not be restricted.
References
Chatterjee, S. & Price, B. (1991). Regression Analysis by Example (second edition). New York, Wiley.
Hoerl, A.E. & Kennard, R.W. (1970). Ridge regression: biased estimation for nonorthogonal problems. Technometrics, 12, 55-67.
Vinod, H.D. (1976). Application of new ridge regression methods to a study of Bell system scale economies. Journal of the American Statistical Association, 71, 835-841.
See also
Directive: PCP.
Procedure: LRIDGE.
Commands for: Regression analysis, Multivariate and cluster analysis.
Example
CAPTION 'RIDGE example',\
        !t('Data on French economy from Chatterjee & Price',\
        '(1991, pages 182, 185, 213, 218 and 220).');\
        STYLE=meta,plain
VARIATE [NVALUES=11] Import,Doprod,Stock,Consum
READ    Import,Doprod,Stock,Consum
15.9 149.3 4.2 108.1
16.4 161.2 4.1 114.8
19.0 171.5 3.1 123.2
19.1 175.5 3.1 126.9
18.8 180.8 1.1 132.1
20.4 190.7 2.2 137.7
22.7 202.1 2.1 146.0
26.5 212.4 5.6 154.1
28.1 226.1 5.0 162.3
27.6 231.9 5.1 164.3
26.3 239.0 0.7 167.6 :
RIDGE   [PRINT=corr,pcp,ridge] Import; !p(Doprod,Stock,Consum)