Fits and plots quantile regressions for loess or spline models (D.B. Baird).

### Options

| Option | Description |
|---|---|
| `PRINT` = string tokens | What to print (`model`, `summary`, `fittedvalues`); default `mode`, `summ` |
| `PLOT` = string tokens | What to plot (`rhistogram`, `fittedvalues`); default `fitt` |
| `METHOD` = string token | Smoothing method (`loess`, `spline`); default `spli` |
| `DF` = scalar | Spline degrees of freedom (3-40); default 4 |
| `KNOTS` = variate | Knot points for smoothing splines; default `*` uses equally spaced percentiles of the `X` variate |
| `KERNEL` = string token | What kernel to use for loess (`normal`, `epanechnikov`, `quadratic`, `triweight`, `tukeybiweight`, `quartic`, `linear`, `uniform`); default `norm` |
| `LMETHOD` = string token | Span method for loess (`constant`, `adaptive`); default `adap` |
| `BANDWIDTH` = scalar | Bandwidth for smoothing between 0 and 1; default 0.4 |
| `ORDER` = scalar | Order of local polynomial; default 1 |
| `NGRIDPOINTS` = scalar | Number of points on smooth curve; default 100 |
| `NBOOT` = scalar | Number of times to bootstrap data to estimate confidence limits; default 0, i.e. no bootstrapping |
| `SEED` = scalar | Seed for bootstrap randomization; default 0 |
| `CIPROBABILITY` = scalar | Probability level for confidence interval; default 0.95 |
| `TITLE` = text | Title for plots; default `*` generates titles from the structure names |
| `ARRANGEMENT` = string token | Whether to plot fitted regressions by the `GROUPS` parameter in a trellis plot (`single`, `trellis`); default `sing` |

### Parameters

| Parameter | Description |
|---|---|
| `Y` = variates | Response variate |
| `X` = variates | Explanatory variate |
| `PRQUANTILES` = scalars or variates | Proportions at which to calculate quantiles; default 0.5 |
| `GROUPS` = factors | Groups for which independent curves are fitted |
| `GRID` = variates | Grid of equidistant points at which the smooth is calculated |
| `OUTGROUPS` = factors | Groups for the fitted smoothed values saved by the `SMOOTH` parameter |
| `SMOOTH` = variates or pointers | Fitted smooth estimated at the `NGRIDPOINTS` points given in `GRID` |
| `SLOPE` = variates or pointers | Fitted slope from model for the same points as `SMOOTH` |
| `RESIDUALS` = variates or pointers | Residuals from regression for each quantile |
| `FITTEDVALUES` = variates or pointers | Fitted values from regression for each quantile |
| `LOWSMOOTH` = variates or pointers | Lower confidence limit of smooth for each quantile |
| `UPPSMOOTH` = variates or pointers | Upper confidence limit of smooth for each quantile |
| `SESMOOTH` = variates or pointers | Standard error of coefficients for each quantile |

### Description

`RQSMOOTH` calculates and plots a smooth quantile regression for a given dependent variate y and an explanatory variate x, specified by the `Y` and `X` parameters, respectively. You can also specify groups, by supplying a factor using the `GROUPS` parameter; the model is then fitted independently within each group. The type of the smooth model, either loess or spline, is specified by the `METHOD` option. The quantiles (between 0 and 1) for which the model is to be fitted are specified by the `PRQUANTILES` parameter, as a scalar if there is only one, or a variate if there are several. The default value for `PRQUANTILES` is 0.5, i.e. the median.

For a spline model, the number of degrees of freedom can be specified using the `DF` option. This must be greater than or equal to 3 and less than or equal to 40. The knot points for the spline basis curves can be set using the `KNOTS` option. This must have `DF` points and no missing values. If `KNOTS` is not provided, the default knot points are `DF` equally spaced percentiles of the `X` variate.
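The `DF` range check and the default knot rule can be sketched in Python as follows. This is only an illustration, not the procedure's own code: `percentile` and `default_knots` are hypothetical helper names, and the sketch assumes that the equally spaced percentiles run from 0% to 100% inclusive, which the text above does not state.

```python
def percentile(sorted_x, p):
    """Linearly interpolated percentile (p in 0..100) of pre-sorted data."""
    pos = (len(sorted_x) - 1) * p / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_x) - 1)
    return sorted_x[lo] + (pos - lo) * (sorted_x[hi] - sorted_x[lo])


def default_knots(x, df=4):
    """Return df knots at equally spaced percentiles of x.

    Assumption: the percentiles run from 0% to 100% inclusive; the
    placement of the outermost knots is not documented in the text.
    """
    if not 3 <= df <= 40:
        raise ValueError("DF must lie between 3 and 40")
    xs = sorted(v for v in x if v is not None)  # ignore missing values in x
    return [percentile(xs, 100.0 * k / (df - 1)) for k in range(df)]


knots = default_knots([1, 2, 3, 4, 5, 6, 7, 8, 9], df=5)
print(knots)
```

A variate supplied through `KNOTS` would instead need exactly `DF` values with none missing, so a caller could compare `len(knots)` with `df` before fitting.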

For a loess model the bandwidth is set by the `BANDWIDTH` option, and must lie between 0 and 1; the default is 0.4. With large bandwidths the function will be smoother but less responsive, so bias can be higher where the curve is changing rapidly. With smaller bandwidths the curve will be more responsive, but the confidence limits around it will be wider. So the choice of bandwidth controls the trade-off between variance and bias.

The loess model uses a moving window centred around the point to be predicted. The width of this window is controlled by the bandwidth and the `LMETHOD` option. Setting `LMETHOD=constant` gives a constant window width of `BANDWIDTH * RANGE(X)`. Alternatively, setting `LMETHOD=adaptive` uses a varying window width, defined so that it always contains the proportion of the total points given by `BANDWIDTH`. The window will thus be narrower where the points are denser. A local polynomial is fitted to the points in the window. The order is defined by the `ORDER` option as either 1 (linear) or 2 (quadratic). The points in the polynomial regression are weighted by their distance from the point that is to be predicted. The weighting function W(*d*) is selected using the `KERNEL` option, with settings:

| Setting | Weighting function |
|---|---|
| `uniform` | W(*d*) = 1 |
| `linear` | W(*d*) = 1 − `ABS`(*d*) |
| `quadratic` | W(*d*) = 1 − *d*² |
| `quartic` | W(*d*) = (1 − *d*²)² |
| `triweight` | W(*d*) = (1 − *d*²)³ |
| `normal` | W(*d*) = `PRNORMAL`(*d*) |
| `epanechnikov` | synonym of `quadratic` |
| `tukeybiweight` | synonym of `quartic` |

where *d* is the distance within the window from the predicted point, scaled to take the values -1 and +1 at the lower and upper window edges.
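To make the window and kernel rules concrete, here is a minimal Python sketch of how the weights for one prediction point might be computed. The names `kernel_weight`, `half_width` and `local_weights` are hypothetical. The sketch assumes that `PRNORMAL(d)` is the standard Normal density, that the adaptive window takes the nearest `BANDWIDTH` share of points rounded up, and that points outside the window receive zero weight. None of these details is confirmed by the text, so treat it as an illustration rather than the procedure's implementation.

```python
import math


def kernel_weight(d, kernel="normal"):
    """Weight W(d) for a scaled distance d, following the table above."""
    if kernel == "uniform":
        return 1.0
    if kernel == "linear":
        return 1.0 - abs(d)
    if kernel in ("quadratic", "epanechnikov"):
        return 1.0 - d ** 2
    if kernel in ("quartic", "tukeybiweight"):
        return (1.0 - d ** 2) ** 2
    if kernel == "triweight":
        return (1.0 - d ** 2) ** 3
    if kernel == "normal":
        # Assumption: PRNORMAL(d) is the standard Normal density.
        return math.exp(-0.5 * d ** 2) / math.sqrt(2.0 * math.pi)
    raise ValueError("unknown kernel: " + kernel)


def half_width(x, x0, bandwidth=0.4, lmethod="adaptive"):
    """Half-width of the window centred on x0."""
    if lmethod == "constant":
        # Total width is BANDWIDTH * RANGE(X), so half lies on each side.
        return 0.5 * bandwidth * (max(x) - min(x))
    # Adaptive: widen the window until it holds the requested share of points.
    # Assumption: the number of points is rounded up.
    k = max(1, math.ceil(bandwidth * len(x)))
    return sorted(abs(v - x0) for v in x)[k - 1]


def local_weights(x, x0, bandwidth=0.4, lmethod="adaptive", kernel="normal"):
    """Kernel weight for each x value; points outside the window get 0."""
    h = half_width(x, x0, bandwidth, lmethod)
    if h <= 0:
        raise ValueError("window has zero width")
    weights = []
    for v in x:
        d = (v - x0) / h  # scaled so the window edges sit at -1 and +1
        weights.append(kernel_weight(d, kernel) if abs(d) <= 1.0 else 0.0)
    return weights


x = [0.0, 1.0, 2.0, 3.0, 4.0, 10.0]
print(local_weights(x, 2.0, bandwidth=0.8, lmethod="adaptive", kernel="quartic"))
print(local_weights(x, 2.0, bandwidth=0.5, lmethod="constant", kernel="linear"))
```

In this toy data the isolated point at 10 falls outside both windows. With the adaptive setting the half-width is the distance to the fifth-nearest point, so the window stretches or shrinks with the local density of `x`.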

Output is controlled by the `PRINT` option with settings:

| Setting | Output |
|---|---|
| `model` | details of the model that is being fitted |
| `summary` | a summary of the fit |
| `fittedvalues` | the residuals and fitted values from the model |

The `PLOT` option controls what plots are displayed, with settings:

| Setting | Plot |
|---|---|
| `rhistogram` | histograms of residuals |
| `fittedvalues` | observed and fitted values plotted against the explanatory variate specified by the `XPLOT` option (if `XPLOT` is not set, the first explanatory variate is used) |

The `ARRANGEMENT` option controls whether the models for each group are displayed in a trellis plot or in a single plot with all groups together.

Bootstrapping can be used to estimate standard errors and confidence limits for the fitted values. The `NBOOT` option specifies the number of bootstrap samples that are taken; the default is zero, which indicates that no bootstrapping is to be done. The `CIPROBABILITY` option sets the probability level of the confidence limits (default 0.95). The `SEED` option defines the seed for the random numbers that are used to select the bootstrap samples. The default of zero continues the existing sequence of random numbers if any have already been used in the current Genstat job. If none have been used, Genstat picks a seed at random.
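The roles of `NBOOT`, `SEED` and `CIPROBABILITY` can be illustrated with a small Python sketch. It makes two simplifying assumptions that are not taken from the text: the statistic is a plain sample quantile rather than a fitted smooth, and the limits are formed by the percentile method. The function names are hypothetical, and the sketch always takes an explicit seed instead of continuing an existing random sequence.

```python
import random


def quantile(values, p):
    """Linearly interpolated sample quantile for 0 <= p <= 1."""
    s = sorted(values)
    pos = (len(s) - 1) * p
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])


def bootstrap_limits(y, p=0.5, nboot=200, seed=1, ciprobability=0.95):
    """Return (lower, upper, standard error) for the p-th quantile of y.

    Assumption: percentile-method limits from resampling y with replacement;
    the statistic is a simple stand-in for the fitted smooth.
    """
    if nboot < 2:
        raise ValueError("need at least two bootstrap samples")
    rng = random.Random(seed)  # a fixed seed makes the samples repeatable
    n = len(y)
    estimates = []
    for _ in range(nboot):
        sample = [y[rng.randrange(n)] for _ in range(n)]
        estimates.append(quantile(sample, p))
    tail = (1.0 - ciprobability) / 2.0  # probability left in each tail
    mean = sum(estimates) / nboot
    se = (sum((e - mean) ** 2 for e in estimates) / (nboot - 1)) ** 0.5
    return quantile(estimates, tail), quantile(estimates, 1.0 - tail), se


y = [float(v) for v in range(1, 41)]
lower, upper, se = bootstrap_limits(y, p=0.5, nboot=200, seed=42)
print(lower, upper, se)
```

Calling the function twice with the same seed returns identical limits, which is the practical reason for setting a non-zero seed when results need to be reproduced.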

The results from the model fit can be saved in various parameters. They will be saved in a variate if only one quantile has been defined, or in a pointer to a set of variates (one for each quantile) if there were several. The fitted curve(s) can be saved by the `SMOOTH` parameter, and the slope of the fitted curve by the `SLOPE` parameter. The `NGRIDPOINTS` option controls how many points are estimated on each curve. The `GRID` parameter can save the positions of the points, which will be spaced equally between the minimum and maximum value of `X`. The `UPPSMOOTH`, `LOWSMOOTH` and `SESMOOTH` parameters save variates containing the upper and lower bootstrap confidence limits and the standard errors of the estimated curve, respectively. If a `GROUPS` factor has been specified, the estimated values for the curves have `NLEVELS(GROUPS) * NGRIDPOINTS` points, with the values for group 1 being given first, followed by those for group 2, and so on. The `OUTGROUPS` parameter can save a factor to identify the groups within the variates.
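The stacked layout of the saved curves can be pictured with a short Python sketch. It shows how a vector of `NLEVELS(GROUPS) * NGRIDPOINTS` values, ordered group by group, maps back to one curve per group, and what the matching group labels look like. The function names are hypothetical and the data are made up for illustration.

```python
def split_by_group(stacked, ngroups, ngridpoints=100):
    """Split a stacked vector into one list of grid values per group."""
    if len(stacked) != ngroups * ngridpoints:
        raise ValueError("expected ngroups * ngridpoints values")
    return [stacked[g * ngridpoints:(g + 1) * ngridpoints]
            for g in range(ngroups)]


def outgroup_labels(ngroups, ngridpoints=100):
    """Group number (1-based) for each position in the stacked vector."""
    return [g + 1 for g in range(ngroups) for _ in range(ngridpoints)]


# Toy example: 2 groups, 3 grid points each, group 1 stored first.
stacked = [10.0, 11.0, 12.0, 20.0, 21.0, 22.0]
print(split_by_group(stacked, ngroups=2, ngridpoints=3))
print(outgroup_labels(ngroups=2, ngridpoints=3))
```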

Options: `PRINT`, `PLOT`, `METHOD`, `KERNEL`, `LMETHOD`, `BANDWIDTH`, `ORDER`, `DF`, `KNOTS`, `NGRIDPOINTS`, `NBOOT`, `SEED`, `CIPROBABILITY`, `TITLE`, `ARRANGEMENT`.

Parameters: `Y`, `X`, `PRQUANTILES`, `GROUPS`, `GRID`, `OUTGROUPS`, `SMOOTH`, `SLOPE`, `RESIDUALS`, `FITTEDVALUES`, `LOWSMOOTH`, `UPPSMOOTH`, `SESMOOTH`.

### Method

The `FRQUANTILES` directive is used to fit the quantile regression for a design matrix generated for the spline basis or a locally weighted regression about the points in the smooth. For further details of the underlying methodology, see Koenker & D’Orey (1987) or Koenker (2005).
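As background to the references, quantile regression chooses its fit by minimizing the asymmetric "check" loss described by Koenker (2005). The Python sketch below illustrates that criterion for the simplest case of a single constant, using a brute-force search over the observed values. It is not the algorithm used by `FRQUANTILES`, and the function names are hypothetical.

```python
def check_loss(residual, p):
    """Check function: p * r for r >= 0 and (p - 1) * r for r < 0."""
    return residual * (p - (1.0 if residual < 0 else 0.0))


def best_constant(y, p):
    """Observed value that minimizes the total check loss for proportion p."""
    candidates = sorted(set(y))
    return min(candidates,
               key=lambda c: sum(check_loss(v - c, p) for v in y))


y = [1, 2, 3, 4, 100]
print(best_constant(y, 0.5))  # the median criterion is not pulled by the outlier
print(best_constant(y, 0.9))
```

With `p = 0.5` positive and negative residuals are penalized equally, which gives the median. Other proportions tilt the penalty so that the minimizer moves to the corresponding quantile.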

### Action with `RESTRICT`

Restrictions on the `Y` and `X` variates and the `GROUPS` factor are combined, and only those units which are unrestricted in all structures are used in the regression.
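The combination rule amounts to a logical AND across the structures. A minimal Python sketch, in which each restriction is represented as a list of booleans (`True` meaning the unit is unrestricted) and the function name is hypothetical:

```python
def combine_restrictions(*masks):
    """Keep unit i only if every structure leaves it unrestricted."""
    return [all(flags) for flags in zip(*masks)]


y_mask = [True, True, False, True]
x_mask = [True, False, True, True]
groups_mask = [True, True, True, True]
print(combine_restrictions(y_mask, x_mask, groups_mask))
```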

### References

Koenker, R. (2005). *Quantile Regression*. Cambridge University Press, New York.

Koenker, R.W. & D’Orey, V. (1987). Algorithm AS 229: Computing regression quantiles. *Applied Statistics*, 36, 383-393.

### See also

Directive: `FRQUANTILES`.

Procedures: `RQLINEAR`, `RQNONLINEAR`.

Commands for: Regression analysis.

### Example

    CAPTION 'RQSMOOTH example'; STYLE=meta
    SPLOAD '%GENDIR%/Examples/MelbourneTemp.gsh'
    RQSMOOTH [PRINT=model,summary; PLOT=fitted; METHOD=Spline;\
             DF=6; NGRID=100; NBOOT=0] Y=MaxTemp; X=PrevMax;\
             PRQUANTILES=!(0.05,0.1,0.25,0.5,0.75,0.9,0.95)