|KNOTS|Defines the interior knot values; no default i.e. this option must be set|
|ORDER|Defines the order of the piecewise polynomial; default 3|
|TYPE|Controls which spline basis is calculated (
|LOWER|Left-hand limit L of the interval [L, U); default
|UPPER|Right-hand limit U of the interval [L, U); default
|NOMESSAGE|Which warning messages to suppress (
|X|Values for which the basis spline functions are calculated|
|BASIS|Pointer to save variates containing the values of the basis spline functions|
|DBASIS|Pointer to save variates containing the values of the first order derivatives of the basis spline functions|
Piecewise polynomials or splines can be used for nonparametric function estimation. Splines offer a flexible way to investigate the shape of a relationship or can be used for interpolation and smoothing. There are several types of splines. Smoothing splines, implemented in Genstat by means of the regression function SSPLINE, minimize a penalized residual sum of squares in which lack of smoothness of the estimated function is penalized. Smoothing splines can be less appropriate when local effects are strong or when the estimated function should be monotone, e.g. when estimating growth curves.
An alternative to smoothing splines is to use regression splines, which offer more control over the characteristics of the estimated function. With regression splines the user first specifies an interval [L, U) on which the estimated function is non-trivial. This interval is then explicitly divided into segments by the user, and a polynomial, of order say k, is fitted in each segment. The segments are separated by a sequence of so-called knots. It is customary to force the piecewise polynomials to join smoothly at these knots. The piecewise polynomials and all their derivatives are always continuous from the right at the knots. Moreover, when there are no replicated knot values, all derivatives up to the (k-1)th are continuous at the knot values. The order of differentiability is lower when there are replicated knot values. The full knot sequence includes the endpoints L and U, which are replicated a number of times that depends on the order of the piecewise polynomial. Ramsay (1988) provides a concise introduction to regression splines, while de Boor (1978) gives a full account.
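The construction of the full knot sequence can be sketched in Python (not Genstat). The sketch assumes the convention of Ramsay (1988), in which each endpoint appears k+1 times for a piecewise polynomial of degree k; the function name `full_knot_sequence` is hypothetical, and the internal representation used by the procedure may differ.

```python
def full_knot_sequence(interior, lower, upper, degree):
    """Return the full knot sequence for a spline of the given degree.

    Assumption: each endpoint is replicated degree + 1 times, as in
    Ramsay (1988). This is an illustration, not the procedure's source.
    """
    interior = sorted(interior)
    if any(not (lower < t < upper) for t in interior):
        raise ValueError("interior knots must lie strictly between lower and upper")
    return [lower] * (degree + 1) + interior + [upper] * (degree + 1)


degree = 3
knots = full_knot_sequence([4, 8], lower=0, upper=10, degree=degree)
# Under this convention the number of basis functions equals the number
# of interior knots plus degree + 1.
n_basis = len(knots) - (degree + 1)
print(knots, n_basis)
```

With two interior knots and degree 3 this gives six basis functions, which matches the six regression terms (constant, three polynomial terms and two truncated cubics) of a cubic truncated-polynomial fit with two knots.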
The SPLINE procedure can be used to calculate a set of so-called basis functions which have all the required properties of continuity and differentiability. These basis functions can then be used to fit the regression spline. A simple basis is given by truncated polynomials, but this has the disadvantage of generating considerable rounding errors. A numerically superior basis is provided by M-splines. Their main features are that any basis function is positive in a series of consecutive segments, is zero elsewhere and is normalized by having unit area. An alternative normalization is provided by B-splines, which have the property that the sum over all basis functions is 1 for values in the interval [L, U). Basis functions of M-splines and B-splines are linearly related and are 0 outside [L, U). The resulting piecewise polynomial is discontinuous at the endpoints L and U.
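The two normalizations can be stated compactly. The notation is an assumption borrowed from Ramsay (1988): t_i denotes the full knot sequence and k his order, i.e. the polynomial degree plus one, which differs from the ORDER convention of this procedure.

```latex
M_i(x) \ge 0, \qquad \int_L^U M_i(x)\,dx = 1
\qquad \text{(M-splines: unit area)}

\sum_i B_i(x) = 1 \quad \text{for } L \le x < U
\qquad \text{(B-splines: sum to one)}

B_i(x) = \frac{t_{i+k} - t_i}{k}\, M_i(x)
\qquad \text{(the linear relation between the two bases)}
```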
Monotonicity of the estimated function can be imposed by employing a basis consisting of monotone functions. Ramsay (1988) uses integrated M-splines which, when combined with nonnegative regression coefficients, yield a monotone spline. These integrated M-splines are called I-splines. The basis functions for I-splines are not linearly related and they are 0 for values smaller than L and 1 for values greater than or equal to U. The resulting piecewise polynomial is continuous but not differentiable at the endpoints.
The choice of the polynomial order and of the knot values is crucial for successful use of regression splines. Wegman & Wright (1983) summarize practical recommendations for M-splines, while Ramsay (1988) does so for I-splines. In general the knots should be chosen in regions where the relationship changes most markedly. A useful preliminary knot placement is to position a single interior knot at the median, two interior knots at the terciles, three at the quartiles, and so on. The order of the piecewise polynomials is usually taken to be 2 or 3.
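The preliminary knot placement described above can be sketched in Python with the standard library. The helper name `preliminary_knots` is hypothetical, and the interpolation rule of `statistics.quantiles` (here `method="inclusive"`) is an assumption that need not agree exactly with the quantile definition used by Genstat.

```python
from statistics import quantiles


def preliminary_knots(x, n_interior):
    """Place n_interior knots at equally spaced quantiles of x.

    One knot gives the median, two give the terciles, three the quartiles.
    """
    return quantiles(x, n=n_interior + 1, method="inclusive")


x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(preliminary_knots(x, 1))  # median
print(preliminary_knots(x, 3))  # quartiles
```

Such knots are only a starting point; they can be moved towards regions where the relationship changes most markedly.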
The values for which the basis functions are calculated must be specified by the X parameter. The values of the basis functions are saved with the BASIS parameter, while the first-order derivatives of the basis functions can be saved by setting the DBASIS parameter. The BASIS and DBASIS pointers are redefined in the procedure. If a value in the X parameter coincides with an interior knot and the basis function or its first-order derivative has a discontinuity at that value, remember that the functions are continuous and differentiable from the right.
The interior knot sequence must be set with the KNOTS option, and the ORDER option can be used to specify the order of the piecewise polynomials. The TYPE option determines which spline basis is calculated. The interval [L, U) for which the basis functions are non-trivial can be specified by the LOWER and UPPER options. If these are unset the following values are used:
CALCULATE LOWER = MINIMUM(X)
CALCULATE max = MAXIMUM(X)
CALCULATE UPPER = max + ((max.EQ.0) + ABS(max))/500000
In this case the UPPER value is such that max is just in the interval [L, U). The NOMESSAGE option can be used to suppress warning messages which are printed when the KNOTS variate has replicated values and when the interval [L, U) does not overlap the range of X.
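The default limits given by the CALCULATE statements above can be mirrored in Python to see why the largest X value falls just inside [L, U). This is a transcription for illustration only; the function name `default_limits` is hypothetical.

```python
def default_limits(x):
    """Mirror the default LOWER and UPPER calculation shown above."""
    lower = min(x)
    xmax = max(x)
    # The term (xmax == 0) adds 1 when the maximum is exactly zero, so the
    # increment is never zero and upper is always strictly above xmax.
    upper = xmax + (int(xmax == 0) + abs(xmax)) / 500000
    return lower, upper


print(default_limits([0.0, 5.0, 10.0]))
print(default_limits([-3.0, 0.0]))
```

Because the interval is closed on the left and open on the right, using the maximum itself as U would exclude the largest observation; the small relative increment avoids this.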
Basis functions for M-splines are calculated by a recurrence relation from Ramsay (1988). These basis functions are multiplied to give B-splines or summed to provide I-splines. Note that unlike Ramsay (1988), the order of the spline is here defined as the order of the piecewise polynomial.
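A minimal Python sketch of such a recurrence is given below. It follows the M-spline recurrence in Ramsay's notation, where order k means degree k-1, so the degree is converted first. The function names, the recursive layout and the assumed replication of each endpoint k times are illustrative choices and not the procedure's actual code.

```python
def m_spline(i, k, t, x):
    """Value at x of the i-th (0-based) M-spline of Ramsay order k.

    t is the full knot sequence; Ramsay order k means degree k - 1.
    """
    width = t[i + k] - t[i]
    if width == 0:
        return 0.0  # empty support caused by replicated knots
    if k == 1:
        # Piecewise constant and continuous from the right.
        return 1.0 / width if t[i] <= x < t[i + 1] else 0.0
    left = (x - t[i]) * m_spline(i, k - 1, t, x)
    right = (t[i + k] - x) * m_spline(i + 1, k - 1, t, x)
    return k * (left + right) / ((k - 1) * width)


def spline_basis(x, interior, lower, upper, degree, kind="m"):
    """Return all M-spline (kind="m") or B-spline (kind="b") values at x."""
    k = degree + 1  # convert the degree to Ramsay's order
    # Assumed full knot sequence: each endpoint replicated k times.
    t = [lower] * k + sorted(interior) + [upper] * k
    n_basis = len(t) - k
    m = [m_spline(i, k, t, x) for i in range(n_basis)]
    if kind == "m":
        return m
    # B-splines are M-splines multiplied by (t[i+k] - t[i]) / k.
    return [(t[i + k] - t[i]) * m[i] / k for i in range(n_basis)]


b_values = spline_basis(5.0, [4, 8], 0, 10, degree=3, kind="b")
# Six basis functions; the B-spline values sum to 1 up to rounding.
print(len(b_values), sum(b_values))
```

The half-open test `t[i] <= x < t[i + 1]` in the lowest-order case is what makes every basis function continuous from the right and zero at x = U.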
The variates contained in the BASIS and DBASIS pointers are restricted in the same way as the X parameter. Values in the units excluded by the restriction are set to missing. Restrictions on the KNOTS variate are ignored.
de Boor, C. (1978). A Practical Guide to Splines. Springer-Verlag. New York.
Ramsay, J.O. (1988). Monotone regression splines in action (with discussion). Statistical Science, 3, 425-441.
Wegman, E.J. & Wright, I.W. (1983). Splines in statistics. Journal of the American Statistical Association, 78, 351-365.
CAPTION   'SPLINE example',\
          '1) Plots basis spline functions on the interval [0,10].';\
          STYLE=meta,plain
VARIATE   Knots; VALUE=!(4,8)
VARIATE   [VALUES=0, 0.01 ... 10] X
SPLINE    [KNOTS=Knots; ORDER=3; TYPE=m] X; Basis
CALCULATE N = NVALUES(Basis)
VARIATE   [NVALUES=N; VALUES=#N...1] Window, Lower, Upper
CALCULATE Lower, Upper, Window = (Lower,Upper,Window - 1,0,-9)/N,N,1
CALCULATE Maximum[1...#N] = MAXIMUM(Basis)
FRAME     WINDOW=#Window; YUPPER=#Upper; YLOWER=#Lower; XLOWER=0;\
          XUPPER=1; YMLOWER=0.04; YMUPPER=0.01; XMLOWER=0.12; XMUPPER=0
PEN       NUMBER=11,12; COLOUR='red','black'; METHOD=line; LINESTYLE=1;\
          SYMBOLS=0
SCALAR    ii,ww; 0,9
TEXT      screen; VALUE='clear'
FOR [NTIMES=N]
  CALCULATE ii,ww = ii,ww + 1
  PRINT     [CHANNEL=title; IPRINT=*; SQUASH=yes; SERIAL=yes]\
            ii; FIELD=1; DECIMALS=0; SKIP=0
  AXES      WINDOW=ww; YTITLE=title; YUPPER=Maximum[ii]; YLOWER=0;\
            XUPPER=10; XLOWER=0; XMARKS=Knots; YMARKS=!(0,#Maximum[ii]);\
            STYLE=grid; PENAXES=12; PENGRID=12
  DGRAPH    [WINDOW=ww; KEYWINDOW=0; SCREEN=#screen] Basis[ii]; X; PEN=11
  TEXT      screen; VALUE='keep'
ENDFOR
CAPTION   !t('2) The smoothing spline needs a lot of degrees of freedom to',\
          'fit strong local effects. As a consequence the right-hand',\
          'asymptote has a strong wave.')
VARIATE   [NVALUES=22] Day, Length
READ      Day, Length
18   5   20   5   22   7   24   8   26  14   28  25   30  38   32  66
34  92   36 110   38 140   40 155   42 155   44 164   46 160   48 162
50 170   54 160   58 155   62 170   66 165   70 180 :
MODEL     Length
FIT       SSPLINE(Day; 6)
RKEEP     FITTED=Fit
PEN       NUMBER=1,2; SYMBOLS=1,0; LINESTYLE=1; METHOD=point,open
DGRAPH    [TITLE='Smoothing Spline'] Length, Fit; Day
PRINT     !t('3) An alternative is offered by fitting monotone splines.');\
          JUSTIFICATION=left
QUANTILE  [PROPORTION=!(0.25,0.5,0.75)] DATA=Day; QUANTILE=Knots
SPLINE    [KNOTS=Knots; ORDER=3; TYPE=i] Day; Basis
RNONNEGATIVE Basis
RKEEP     FITTED=Fit
DGRAPH    [TITLE='Monotone spline'] Length, Fit; Day
PRINT     !t('4) Omitting the last basis function gives a different asymptote.');\
          JUSTIFICATION=left
CALCULATE Nmin = NVALUES(Basis) - 1
RNONNEGATIVE Basis[1...Nmin]
RKEEP     FITTED=Fit
DGRAPH    [TITLE='Monotone spline without last basis function']\
          Length, Fit; Day
CAPTION   !t('5) Example 5.2 from Montgomery & Peck (1982),',\
          'Introduction to linear regression analysis, Wiley, New York.',\
          'Truncated polynomials can be replaced by M-splines.',\
          'Note that B-splines and I-splines produce the same analysis.')
VARIATE   [VALUES=0, 0.5...20] Time
VARIATE   [NVALUES=41] VoltDrop
READ      VoltDrop
 8.33  8.23  7.17  7.14  7.31  7.60  7.94  8.30  8.76  8.71  9.71
10.26 10.91 11.67 11.76 12.81 13.30 13.88 14.59 14.05 14.48 14.92
14.37 14.63 15.18 14.51 14.34 13.81 13.79 13.05 13.04 12.60 12.05
11.15 11.15 10.14 10.08  9.78  9.80  9.95  9.51 :
CALCULATE TimeSq, TimeCub = Time ** (2,3)
CALCULATE Time6_5, Time13 = ((Time - 6.5,13) * (Time .GT. 6.5,13)) ** 3
MODEL     VoltDrop
FIT       Time, TimeSq, TimeCub, Time6_5, Time13
SPLINE    [ORDER=3; KNOTS=!(6.5, 13); TYPE=m] Time; Mbasis
MODEL     VoltDrop
FIT       [CONSTANT=omit] Mbasis