Makes predictions using a regression tree (R.W. Payne).
Options
PRINT = string tokens | Controls printed output (prediction, transcript); if PRINT is unset in an interactive run, BRPREDICT will ask what you want to print; in a batch run the default is pred |
---|---|
TREE = tree | Specifies the tree |
PREDICTIONS = variate | Saves the predictions for the observations |
TERMINALNODES = pointer | Saves the numbers of the terminal nodes from which each prediction was obtained |
MVINCLUDE = string token | Whether to provide predictions for units with missing or unavailable values of the x-variables (explanatory); default expl |
Parameters
X = variates or factors | Explanatory variables |
---|---|
VALUES = scalars, variates or texts | Values to use for the explanatory variables; if these are unset for any variable, its existing values are used |
Description
BRPREDICT makes predictions using a regression tree, as constructed by the BREGRESSION procedure. The tree can be saved from BREGRESSION (using the TREE option of BREGRESSION), and specified for BRPREDICT using its own TREE option. Alternatively, BRPREDICT will ask you for the identifier of the tree if you do not specify TREE when running interactively.
The x-values for the predictions can be specified in the variates or factors listed by the X parameter. These must have identical names (and levels) to those used originally to construct the tree. You can use the VALUES parameter to supply new values, if those stored in any of the variates or factors are unsuitable.
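As an illustration of the X and VALUES parameters, the sketch below requests a prediction at a single new combination of x-values. It assumes the tree and the variates employ, opdays, product and temp from the Example section; the scalars newemploy, newopdays, newproduct and newtemp, and the values given to them, are hypothetical and are not taken from the data.

```
"hypothetical new x-values (not from the original data)"
SCALAR newemploy,newopdays,newproduct,newtemp; VALUE=190,22,14.5,60
"X names the variables used to build the tree; VALUES supplies the new values"
BRPREDICT [PRINT=prediction; TREE=tree] X=employ,opdays,product,temp;\
          VALUES=newemploy,newopdays,newproduct,newtemp
```

Leaving VALUES unset for any of the variables would make BRPREDICT use the values already stored in that variate.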
If you do not set X when running interactively, BRPREDICT will ask you to supply the relevant x-values in turn, as required by the tree. Otherwise, if an x-variable in the tree is not specified in the X parameter list, its values are assumed to be unavailable (i.e. missing).
By default, when the x-variable required at a node in the tree is unavailable or contains a missing value, BRPREDICT will follow all the branches from that node, and average the predictions that they generate. You can set option MVINCLUDE=* if you would prefer the prediction to be missing.
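The two treatments of unavailable x-values can be compared with a sketch like the one below. It assumes a batch run with the tree and variates from the Example section, and omits temp from the X list so that it is treated as unavailable; whether temp is actually needed at any node depends on the tree, so the two calls may give identical output.

```
"temp is not listed, so its values are assumed to be unavailable;"
"by default, predictions are averaged over the branches below any node that needs temp"
BRPREDICT [PRINT=prediction; TREE=tree] employ,opdays,product
"as above, but the prediction is missing whenever temp is required"
BRPREDICT [PRINT=prediction; TREE=tree; MVINCLUDE=*] employ,opdays,product
```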
The PRINT option controls printed output, with settings:
prediction | prints the predictions obtained using the tree; |
---|---|
transcript | prints the x-values supplied in response to questions in an interactive run. |
If you do not set PRINT in an interactive run, BRPREDICT will ask what you would like to print. In batch, the default is to print the predictions.
You can save the predictions, in a variate, using the PREDICTIONS option. The TERMINALNODES option allows you to save a pointer, with an element for each prediction, containing the numbers of the terminal nodes reached in the tree to provide the predictions. This will be a scalar if the prediction was derived from a single node, or a variate if it involved more than one (because several branches have been taken, as the result of a missing x-value).
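A minimal sketch of saving both results, assuming the tree and variates from the Example section. The identifier nodes is hypothetical, and the sketch also assumes that the elements of the saved pointer can be referenced by the default suffixes 1, 2 and so on.

```
"save the predictions and the terminal nodes that produced them"
BRPREDICT [PRINT=*; TREE=tree; PREDICTIONS=prediction; TERMINALNODES=nodes]\
          employ,opdays,product,temp
"terminal node number(s) behind the first prediction:"
"a scalar for a single node, a variate if several branches were followed"
PRINT nodes[1]
```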
Options: PRINT, TREE, PREDICTIONS, TERMINALNODES, MVINCLUDE.
Parameters: X, VALUES.
Method
BRPREDICT uses BIDENTIFY to find the terminal nodes of the tree that correspond to the values of the explanatory variables.
Action with RESTRICT
Restrictions are ignored.
See also
Procedures: BREGRESSION, BRKEEP, BRDISPLAY.
Commands for: Regression analysis, Multivariate and cluster analysis.
Example
CAPTION 'BRPREDICT example',!t('Water usage data (Draper & Smith 1981,',\
        'Applied Regression Analysis, Wiley, New York).'); STYLE=meta,plain
READ    temp,product,opdays,employ,water
58.8  7.107 21 129 3.067
65.2  6.373 22 141 2.828
70.9  6.796 22 153 2.891
77.4  9.208 20 166 2.994
79.3 14.792 25 193 3.082
81.0 14.564 23 189 3.898
71.9 11.964 20 175 3.502
63.9 13.526 23 186 3.060
54.5 12.656 20 190 3.211
39.5 14.119 20 187 3.286
44.5 16.691 22 195 3.542
43.6 14.571 19 206 3.125
56.0 13.619 22 198 3.022
64.7 14.575 22 192 2.922
73.0 14.556 21 191 3.950
78.9 18.573 21 200 4.488
79.4 15.618 22 200 3.295 :
"form the regression tree"
BREGRESSION [PRINT=*; Y=water; TREE=tree] employ,opdays,product,temp
"prune the tree"
BPRUNE  [PRINT=table] tree; NEWTREES=pruned
"use tree 6 - renumber nodes"
BCUT    [RENUMBER=yes] pruned[6]; NEWTREE=tree
"display the tree"
BRDISPLAY [PRINT=labelled] tree
"predict water usage and compare with the original data values"
BRPREDICT [PRINT=*; TREE=tree; PREDICTION=prediction]\
        employ,opdays,product,temp
PRINT   water,prediction; FIELD=8,12