Select menu: Stats | Regression Analysis | Regression Trees
Use this to form a regression tree, which is a mechanism for predicting a response variable from a set of independent variables. The construction process repeatedly splits the observations into two subsets, according to whether or not their value for one of the independent variables is less than a particular value. The aim is to form subsets within which the values of the response variate are similar.
- After you have imported your data, from the menu select
Stats | Regression Analysis | Regression Trees, or
Stats | Multivariate Analysis | Trees | Regression Tree.
- Fill in the fields as required then click Run.
You can set additional options before running by clicking Options.
The predicted value of the response variable at each node of the tree is the mean of its values for the subset of observations at that node. The accuracy of the node is the sum of the squared distances of the values of the response variate from their mean for the observations at the node, divided by the total number of observations. Each potential split at the node is assessed by its effect on the accuracy, that is, the difference between the accuracy of the node and the sum of the accuracies of the two potential successor nodes. The node becomes a terminal node if none of the splits provides any improvement in accuracy, or if the mean square of the observations at the node is less than a specified limit.
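The split assessment described above can be sketched in a few lines of Python. This is only an illustration of the criterion, not Genstat's implementation: the function names `node_accuracy` and `best_split` are hypothetical, the sketch handles a single numeric x-variable, and placing candidate thresholds midway between adjacent distinct x-values is an assumption made here for simplicity.

```python
from statistics import fmean


def node_accuracy(y, n_total):
    """Sum of squared deviations from the node mean, divided by the
    total number of observations (hypothetical helper)."""
    if not y:
        return 0.0
    mean = fmean(y)
    return sum((value - mean) ** 2 for value in y) / n_total


def best_split(x, y, n_total):
    """Return (threshold, improvement) for the best split 'x < threshold',
    or None if no split improves the accuracy (a terminal node)."""
    parent = node_accuracy(y, n_total)
    best = None
    distinct = sorted(set(x))
    # Assumption: candidate thresholds lie midway between adjacent values.
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [yi for xi, yi in zip(x, y) if xi < threshold]
        right = [yi for xi, yi in zip(x, y) if xi >= threshold]
        children = node_accuracy(left, n_total) + node_accuracy(right, n_total)
        improvement = parent - children
        if improvement > 0 and (best is None or improvement > best[1]):
            best = (threshold, improvement)
    return best


# Made-up data with two clearly separated groups.
x = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
y = [5.0, 6.0, 7.0, 20.0, 21.0, 22.0]
split = best_split(x, y, n_total=len(y))
print(split)
```

For these made-up data the chosen threshold separates the three low responses from the three high ones. A node whose responses are all equal yields no improving split, so `best_split` returns `None` and the node would be terminal.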
This lists data structures appropriate to the current input field. The contents will change as you move from one field to the next. Double-click a name to copy it to the current input field; alternatively, you can type the name directly into the input field.
Specify a response variate for the regression.
Specifies the independent (x) variables available for constructing the tree. The variables can be factors or variates. You can transfer multiple selections from Available data by holding down the Ctrl key on your keyboard while selecting items, then click to move them all across in one action.
Save tree in:
Specifies an identifier name to save the resulting tree in. The tree will be saved within a Genstat Tree data structure.
The button opens the Options dialog which allows you to control the algorithm used to produce the tree.
The construction of a regression tree generally results in overfitting; that is, it continues to extend the branches of the tree beyond the point that can be justified statistically. One solution is to prune the tree to remove the uninformative sub-branches. Clicking the Prune button opens the Tree Prune menu, where you can prune the tree. The menu opens in a modal manner, so you must select a pruned tree or cancel the menu before continuing.
The button opens the Further Output dialog which allows you to display information on the tree from the analysis or the new pruned tree.
The button opens the Predict dialog which allows you to store or display the predictions for the current data set or for a new set of observations.