Identifies specimens using a random classification forest (R.W. Payne).
Options
| PRINT = string tokens | Controls printed output (identification); default * i.e. none |
|---|---|
| IDENTIFICATION = factor, variate or scalar | Saves the identification of specimens |
| VOTES = matrix | Saves number of terminal nodes reached by each group for the specimens |
| SAVE = pointers | Save structure from BCFOREST providing information about the random forest |
Parameters
| X = variates or factors | Explanatory variables |
|---|---|
| VALUES = scalars, variates or texts | Values to use for the explanatory variables; if these are unset for any variable, its existing values are used |
Description
BCFIDENTIFY identifies specimens using a random classification forest, constructed by the BCFOREST procedure. The SAVE option can be set to a pointer, saved using the SAVE option of BCFOREST, containing the necessary information about the forest. Alternatively, if you do not set SAVE, the identification will be made using the forest most recently constructed by BCFOREST.
The characteristics of the specimens can be specified in the variates or factors listed by the X parameter. These must have identical names (and levels) to those used originally to construct the forest. You can use the VALUES parameter to supply new values, if those stored in any of the variates or factors are unsuitable.
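As a minimal sketch of the VALUES parameter, the commands below build a small forest from three of the automobile variates used in the Example and then identify a single new specimen. The reduced variable list, the BCFOREST settings and the three measurements are illustrative assumptions rather than recommended settings, and it is assumed that the scalars can be supplied as an unnamed list of numbers.

```genstat
" Sketch only: a small forest on three variates from the automobile data "
SPLOAD FILE='%gendir%/examples/Automobile.gsh'
BCFOREST [GROUPS=symboling; NTREES=8; SEED=197883] wheel_base,length,width
" Identify one new specimen; the three values below are hypothetical "
BCFIDENTIFY [PRINT=identification] X=wheel_base,length,width;\
            VALUES=99.8,176.6,66.2
```

Because SAVE is not set, the identification here uses the forest most recently constructed by BCFOREST.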
The PRINT option controls printed output, with settings:

| identification | to print the identifications obtained using the forest. |
|---|---|
By default nothing is printed.
The IDENTIFICATION option can be set to the identifier of a data structure to save the identifications. If this has not already been declared, it will be defined as a factor with the same levels and labels as the original groups factor. Alternatively, you can supply a variate, or you can supply a scalar if there is only one specimen. A missing value is given if there is no clear result (i.e. more than one group has the maximum vote) for the specimen concerned. The VOTES option can save a specimens-by-groups matrix with the votes given by the forest for each of the groups with each specimen.
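The options described above might be combined as in the following sketch, which assumes that the automobile data from the Example have already been loaded. The identifiers forest, predicted and votes are hypothetical names chosen for illustration, and the reduced list of explanatory variates is an assumption made to keep the example short.

```genstat
" Save the forest in a pointer when it is constructed "
BCFOREST [GROUPS=symboling; NTREES=8; SEED=197883; SAVE=forest]\
         wheel_base,length,width
" Identify the original units with the saved forest and keep the results "
BCFIDENTIFY [IDENTIFICATION=predicted; VOTES=votes; SAVE=forest]\
            wheel_base,length,width
" predicted is missing for any specimen where the maximum vote is tied "
PRINT predicted
PRINT votes
```

Passing the pointer explicitly through SAVE avoids relying on whichever forest was constructed most recently, which can matter when several forests are built in one session.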
Options: PRINT, IDENTIFICATION, VOTES, SAVE.
Parameters: X, VALUES.
Action with RESTRICT
Restrictions are ignored.
See also
Procedures: BCFOREST, BCFDISPLAY.
Commands for: Multivariate and cluster analysis.
Example
CAPTION 'BCFOREST example',\
        !t('Random classification forest for automobile data',\
        'from UCI Machine Learning Repository',\
        'http://archive.ics.uci.edu/ml/datasets/Automobile');\
        STYLE=meta,plain
SPLOAD FILE='%gendir%/examples/Automobile.gsh'
BCFOREST [GROUPS=symboling; NTREES=8; NXTRY=10; NUNITSTRY=75; SEED=197883]\
         normalized_losses,make,fuel_type,aspiration,number_doors,\
         body_style,drive_wheels,engine_location,wheel_base,\
         length,width,height,curb_weight,engine_type,number_cylinders,\
         engine_size,fuel_system,bore,stroke,compression_ratio,\
         horsepower,peak_rpm,city_mpg,highway_mpg,price
BCFIDENTIFY [PRINT=identification]\
         normalized_losses,make,fuel_type,aspiration,number_doors,\
         body_style,drive_wheels,engine_location,wheel_base,\
         length,width,height,curb_weight,engine_type,number_cylinders,\
         engine_size,fuel_system,bore,stroke,compression_ratio,\
         horsepower,peak_rpm,city_mpg,highway_mpg,price