Makes predictions using a regression tree (R.W. Payne).
|PRINT|Controls printed output (
|TREE|Specifies the tree|
|PREDICTIONS|Saves the prediction for the observations|
|TERMINALNODES|Saves the numbers of the terminal nodes from which each prediction was obtained|
|MVINCLUDE|Whether to provide predictions for units with missing or unavailable values of the x-variables (
|VALUES|Values to use for the explanatory variables; if these are unset for any variable, its existing values are used|
BRPREDICT makes predictions using a regression tree, as constructed by the BREGRESSION procedure. The tree can be saved from BREGRESSION (using the TREE option of BREGRESSION), and specified for BRPREDICT using its own TREE option. Alternatively, BRPREDICT will ask you for the identifier of the tree if you do not specify TREE when running interactively.
The x-values for the predictions can be specified in the variates or factors listed by the X parameter. These must have identical names (and levels) to those used originally to construct the tree. You can use the VALUES parameter to supply new values, if those stored in any of the variates or factors are unsuitable.
If you do not set X when running interactively, BRPREDICT will ask you to supply the relevant x-values in turn, as required by the tree. Otherwise, if an x-variable in the tree is not specified in the X parameter list, its values are assumed to be unavailable (i.e. missing).
By default, when the x-variable required at a node in the tree is unavailable or contains a missing value, BRPREDICT will follow all the branches from that node, and average the predictions that they generate. You can set option MVINCLUDE=* if you would prefer the prediction to be missing.
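The missing-value rule described above can be illustrated with a small Python model. This is only a sketch of the idea, not Genstat code and not the procedure's implementation: the `Leaf` and `Split` classes, the `predict` function, its `mvinclude` flag and the toy tree values are all hypothetical, and the use of an unweighted mean over the two branches is an assumption (the text says only that the branch predictions are averaged).

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    number: int        # terminal node number
    prediction: float  # value predicted at this terminal node


@dataclass
class Split:
    variable: str      # x-variable tested at this node
    threshold: float   # go left if value < threshold, otherwise right
    left: "Node"
    right: "Node"


Node = Union[Leaf, Split]


def predict(node, x, mvinclude=True):
    """Return (prediction, terminal node numbers) for one unit.

    x maps variable names to values; a variable that is absent or None
    is treated as unavailable.
    """
    if isinstance(node, Leaf):
        return node.prediction, [node.number]
    value = x.get(node.variable)
    if value is None:
        if not mvinclude:
            # Analogue of MVINCLUDE=*: the prediction is left missing.
            return None, []
        left_pred, left_nodes = predict(node.left, x, mvinclude)
        right_pred, right_nodes = predict(node.right, x, mvinclude)
        # Assumption: unweighted mean of the predictions from both branches.
        return (left_pred + right_pred) / 2, left_nodes + right_nodes
    branch = node.left if value < node.threshold else node.right
    return predict(branch, x, mvinclude)


# Hypothetical toy tree; the split points and predictions are made up.
tree = Split("employ", 180.0,
             Leaf(1, 3.0),
             Split("temp", 70.0, Leaf(2, 3.25), Leaf(3, 3.75)))

print(predict(tree, {"employ": 190, "temp": 75}))        # one terminal node
print(predict(tree, {"employ": 190}))                    # temp unavailable
print(predict(tree, {"employ": 190}, mvinclude=False))   # missing prediction
```

In this model a unit with all of its x-values present reaches exactly one terminal node, whereas a unit with an unavailable x-value can reach several, and its prediction is then the average over those branches.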
||prints the predictions obtained using the tree;|
||prints the x-values supplied in response to questions in an interactive run.|
If you do not set PRINT when running interactively, BRPREDICT will ask what you would like to print. In batch, the default is to print the predictions.
You can save the predictions, in a variate, using the PREDICTIONS option. The TERMINALNODES option allows you to save a pointer, with an element for each prediction, containing the numbers of the terminal nodes reached in the tree to provide the predictions. Each element will be a scalar if the prediction was derived from a single node, or a variate if it involved more than one (because several branches were taken as the result of a missing x-value).
BRPREDICT uses BIDENTIFY to find the terminal nodes of the tree that correspond to the values of the explanatory variables.
Restrictions are ignored.
CAPTION 'BRPREDICT example',!t('Water usage data (Draper & Smith 1981,',\
        'Applied Regression Analysis, Wiley, New York).'); STYLE=meta,plain
READ temp,product,opdays,employ,water
58.8  7.107 21 129 3.067
65.2  6.373 22 141 2.828
70.9  6.796 22 153 2.891
77.4  9.208 20 166 2.994
79.3 14.792 25 193 3.082
81.0 14.564 23 189 3.898
71.9 11.964 20 175 3.502
63.9 13.526 23 186 3.060
54.5 12.656 20 190 3.211
39.5 14.119 20 187 3.286
44.5 16.691 22 195 3.542
43.6 14.571 19 206 3.125
56.0 13.619 22 198 3.022
64.7 14.575 22 192 2.922
73.0 14.556 21 191 3.950
78.9 18.573 21 200 4.488
79.4 15.618 22 200 3.295 :
"form the regression tree"
BREGRESSION [PRINT=*; Y=water; TREE=tree] employ,opdays,product,temp
"prune the tree"
BPRUNE [PRINT=table] tree; NEWTREES=pruned
"use tree 6 - renumber nodes"
BCUT [RENUMBER=yes] pruned; NEWTREE=tree
"display the tree"
BRDISPLAY [PRINT=labelled] tree
"predict water usage and compare with the original data values"
BRPREDICT [PRINT=*; TREE=tree; PREDICTION=prediction]\
          employ,opdays,product,temp
PRINT water,prediction; FIELD=8,12