RLOESSGROUPS procedure

Fits locally weighted regression models (loess) to data with groups (D.B. Baird).

Options

`PRINT` = string tokens	What to print (`model`, `deviance`, `summary`, `estimates`, `correlations`, `fittedvalues`, `accumulated`, `monitoring`, `confidence`, `groups`, `submodels`); default `mode`, `summ`, `esti`
`PLOT` = string tokens	What to plot (`fittedvalues`, `residuals`); default `*` – no plots
`FINALMODEL` = string token	Which model to fit as the final model (`common`, `parallel`, `separateslopes`, `full`); default `full`
`CONSTANT` = string token	How to treat the constant (`estimate`, `omit`); default `esti`
`DENOMINATOR` = string token	Whether to base ratios in accumulated summary on rms from model with smallest residual ss or smallest residual ms (`ss`, `ms`); default `ss`
`NOMESSAGE` = string tokens	Which warning messages to suppress (`dispersion`, `leverage`, `residual`, `aliasing`, `marginality`, `vertical`, `df`, `inflation`); default `*`
`FPROBABILITY` = string token	Printing of probabilities for variance and deviance ratios (`yes`, `no`); default `no`
`TPROBABILITY` = string token	Printing of probabilities for t-statistics (`yes`, `no`); default `no`
`PROBABILITY` = scalar	Probability level for confidence intervals for parameter estimates; default 0.95
`MAXCYCLE` = scalar	Maximum number of iterations for the back-fitting algorithm; default 100
`DEVIANCE` = scalar	Saves the residual deviance
`DF` = scalar	Saves the residual d.f.

Parameters

`X` = variate	Explanatory x-variate to be fitted
`GROUPS` = factor	Groups to be fitted
`SMOOTH` = scalar	Smoothing value to be used in the loess term; default 4
`SMTYPE` = string token	Type of value provided in `SMOOTH` (`df`, `smoothing`); default `df`
`ORDER` = scalar	Order of regression used in loess term (1 or 2); default 1
`RESIDUALS` = variates	Simple residuals from the fitted loess model
`FITTEDVALUES` = variates	Fitted values from the fitted loess model
`ACCUMULATED` = pointer	Saves the accumulated analysis-of-variance (or deviance) table as a pointer with a variate or text for each column (source, d.f. etc.)
`SAVE` = pointer	Save structure for the fitted model

Description

RLOESSGROUPS is provided to allow the full interaction between a loess smooth on an explanatory variate X and a factor GROUPS to be fitted. It is not possible to include LOESS(X)*GROUPS in the TERMS directive, so the procedure loops around the groups to fit individual models for each group, and then combines the results.

The use of RLOESSGROUPS is similar to FIT. It must be preceded by a MODEL statement, and can be followed by RDLOESSGROUPS and RKLOESSGROUPS to display and save the results; these operate similarly to RDISPLAY and RKEEP respectively. It also has options PRINT, CONSTANT, DENOMINATOR, NOMESSAGE, FPROBABILITY, TPROBABILITY and PROBABILITY that operate like those of FIT. However, the PRINT option has two extra settings: submodels to print the three submodels (explained below), and groups to print the individual fits for each group. The output from each submodel or group will use the other settings of PRINT.

The form of the loess curve can be specified by the SMOOTH, SMTYPE and ORDER parameters, which supply the arguments to the LOESS function. If SMTYPE=df, SMOOTH gives the number of degrees of freedom used in the function (which should be 2 or greater), while if SMTYPE=smoothing, SMOOTH gives the smoothing parameter (which should be between 0 and 1). The ORDER parameter is 1 for a linear loess model and 2 for a quadratic one.
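
As a minimal sketch of the second form, the call below supplies the smoothing parameter directly and asks for a quadratic loess. It reuses the structures Sugar, SoilP and Year from the Example section; the value 0.5 is purely illustrative and is not a recommendation from this documentation.

```
"Quadratic loess, with SMOOTH interpreted as a smoothing parameter"
MODEL   Sugar
RLOESSGROUPS SoilP; GROUPS=Year; SMOOTH=0.5; SMTYPE=smoothing; ORDER=2
```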

RLOESSGROUPS fits a sequence of models, starting with a common line (ignoring the groups). The next, parallel, model fits a common slope and loess curve, but different intercepts for the groups. The third model (separate slopes) has a common loess curve but different slopes and intercepts. Finally, the fourth (full) model has different loess curves, slopes and intercepts. Groups with fewer than four observations should be restricted out when fitting the full model, as these cannot be fitted by a loess model. To fit the full model, RLOESSGROUPS uses SUBSET to break up the data into separate groups. It fits these individually using FIT, and then combines the results. The results of these individual fits are printed only if the groups setting is included in the PRINT option.

The FINALMODEL option specifies how far to take the sequence of models, with settings common, parallel, separateslopes and the default, full, corresponding to the models just described. Results from the models earlier than the requested final model are printed only if the submodels setting is included in the PRINT option. Further output displayed by RDLOESSGROUPS and information saved by RKLOESSGROUPS will only be from the final model.
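
For instance, the following sketch stops the sequence at the parallel model and also prints the earlier (common) model. It assumes the structures Sugar, SoilP and Year defined in the Example section, and the choice of PRINT settings is only one possibility.

```
"Stop at the parallel model; also print the models fitted before it"
MODEL   Sugar
RLOESSGROUPS [PRINT=summary,accumulated,submodels; FINALMODEL=parallel] \
   SoilP; GROUPS=Year; SMOOTH=4
```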

The DEVIANCE option saves the residual deviance, and the DF option saves the residual number of degrees of freedom. The RESIDUALS and FITTEDVALUES parameters save the residuals and fitted values, respectively. The ACCUMULATED parameter saves the accumulated analysis-of-variance (or deviance) table as a pointer. The suffixes of ACCUMULATED for the last 4 columns in the pointer depend on whether it is an analysis of variance ('s.s.', 'm.s.', 'v.r.', 'F pr.') or an analysis of deviance table ('deviance', 'mean dev.', 'dev. r.', 'approx F pr.').
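
A short sketch of saving these results is shown below. The identifiers ResDev, ResDf, Res, Fit and AccTab are hypothetical names chosen for illustration, while Sugar, SoilP and Year are taken from the Example section.

```
"Save the residual deviance and d.f., residuals, fitted values and accumulated table"
MODEL   Sugar
RLOESSGROUPS [PRINT=accumulated; DEVIANCE=ResDev; DF=ResDf] \
   SoilP; GROUPS=Year; SMOOTH=4; RESIDUALS=Res; FITTEDVALUES=Fit; \
   ACCUMULATED=AccTab
PRINT   ResDev,ResDf
```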

The SAVE parameter can save a pointer, with information about the analysis, for use by the procedures RDLOESSGROUPS and RKLOESSGROUPS.

Options: PRINT, PLOT, FINALMODEL, CONSTANT, DENOMINATOR, NOMESSAGE, FPROBABILITY, TPROBABILITY, PROBABILITY, MAXCYCLE, DEVIANCE, DF.
Parameters: X, GROUPS, SMOOTH, SMTYPE, ORDER, RESIDUALS, FITTEDVALUES, ACCUMULATED, SAVE.

Method

RLOESSGROUPS uses SUBSET to break the data into separate groups, fits these individually using FIT, and then combines the results. The three submodels are fitted first using the usual TERMS and FIT directives to obtain the accumulated analysis-of-variance (or deviance) table. The model with separate slopes may be dropped if it has negative sums of squares.

Action with `RESTRICT`

As in FIT, the y-variate (specified in an earlier MODEL directive) can be restricted to analyse a subset of the data.
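
As an illustrative sketch, a restriction can be set on the y-variate before the fit, for example to leave out one group. The condition below (excluding year 4) is an arbitrary assumption made for the example, and the structures Sugar, SoilP and Year are those of the Example section.

```
"Restrict the y-variate so that year 4 is excluded from the analysis"
MODEL   Sugar
RESTRICT Sugar; CONDITION=Year.NE.4
RLOESSGROUPS SoilP; GROUPS=Year; SMOOTH=4
"Cancel the restriction afterwards"
RESTRICT Sugar
```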

Example

CAPTION 'RLOESSGROUPS example',\
        'Yield of sugar beet vs soil phosphate in 4 years'; STYLE=major,plain
FACTOR  [LEVELS=4; VALUES=16(1...4)] Year
OPEN    '%EXAMPLES%/GuidePart2/beet.dat'; CHANNEL=2
READ    [PRINT=*; CHANNEL=2] Beetwt,%sugar,SoilP
CLOSE   2
CALCULATE Sugar = Beetwt * %sugar / 100
MODEL   Sugar
RLOESSGROUPS [PRINT=model,summary,estimates,accumulated; \
   PLOT=residuals,fitted; FPROB=yes; TPROB=yes] \
   SoilP; GROUP=Year; SMOOTH=2

Updated on February 3, 2023
