BCFIDENTIFY procedure

Identifies specimens using a random classification forest (R.W. Payne).

Options

PRINT = string tokens Controls printed output (identification); default * i.e. none
IDENTIFICATION = factor, variate or scalar Saves the identification of specimens
VOTES = matrix Saves number of terminal nodes reached by each group for the specimens
SAVE = pointers Save structure from BCFOREST providing information about the random forest

Parameters

X = variates or factors Explanatory variables
VALUES = scalars, variates or texts Values to use for the explanatory variables; if these are unset for any variable, its existing values are used

Description

BCFIDENTIFY identifies specimens using a random classification forest constructed by the BCFOREST procedure. The SAVE option can be set to a pointer, saved using the SAVE option of BCFOREST, containing the necessary information about the forest. Alternatively, if you do not set SAVE, the identification will be made using the forest most recently constructed by BCFOREST.
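
The two ways of selecting the forest might look like the following sketch. It assumes that the automobile data from the example at the end of this page have already been loaded; the identifier forest is hypothetical, and the shortened list of explanatory variables is chosen only for illustration.

```
" Construct a forest and save its information in a pointer (forest is a hypothetical identifier) "
BCFOREST    [GROUPS=symboling; NTREES=8; SEED=197883; SAVE=forest]\
            wheel_base,length,width,height,curb_weight
" Identify the specimens using the saved forest "
BCFIDENTIFY [PRINT=identification; SAVE=forest]\
            wheel_base,length,width,height,curb_weight
" With SAVE unset, the forest most recently constructed by BCFOREST is used "
BCFIDENTIFY [PRINT=identification]\
            wheel_base,length,width,height,curb_weight
```

Saving the pointer is mainly useful when several forests have been constructed in the same session, as the default only ever refers to the most recent one.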

The characteristics of the specimens can be specified in the variates or factors listed by the X parameter. These must have identical names (and levels) to those used originally to construct the forest. You can use the VALUES parameter to supply new values, if those stored in any of the variates or factors are unsuitable.
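
As a hedged sketch of the VALUES parameter, suppose a forest has already been constructed by BCFOREST from the three variates wheel_base, length and width only. The numbers below are illustrative values for a single new specimen, not results from this page.

```
" Identify one new specimen: VALUES supplies a scalar for each X variate, "
" so the values currently stored in the variates are not used "
BCFIDENTIFY [PRINT=identification] X=wheel_base,length,width;\
            VALUES=99.8,176.6,66.2
```

The X structures still need to carry the original names, since that is how the supplied values are matched to the variables in the forest.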

The PRINT option controls printed output, with settings:

    identification to print the identifications obtained using the forest.

By default nothing is printed.

The IDENTIFICATION option can be set to the identifier of a data structure to save the identifications. If this has not already been declared, it will be defined as a factor with the same levels and labels as the original groups factor. Alternatively, you can supply a variate, or you can supply a scalar if there is only one specimen. A missing value is given if there is no clear result (i.e. more than one group has the maximum vote) for the specimen concerned. The VOTES option can save a specimens-by-groups matrix with the votes given by the forest for each of the groups with each specimen.
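
Saving the results rather than printing them might be sketched as follows, assuming a forest constructed from the variates wheel_base, length and width; the identifiers ident and votes are hypothetical.

```
" Save the identifications and the specimens-by-groups matrix of votes "
BCFIDENTIFY [IDENTIFICATION=ident; VOTES=votes]\
            wheel_base,length,width
" ident has not been declared, so it is defined as a factor; "
" it is missing for any specimen where the maximum vote is tied "
PRINT       ident
PRINT       votes
```

Inspecting votes alongside ident shows how close the decision was for each specimen, which a missing identification on its own does not reveal.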

Options: PRINT, IDENTIFICATION, VOTES, SAVE.
Parameters: X, VALUES.

Action with RESTRICT

Restrictions are ignored.

See also

Procedures: BCFOREST, BCFDISPLAY.
Commands for: Multivariate and cluster analysis.

Example

CAPTION    'BCFOREST example',\
           !t('Random classification forest for automobile data',\
           'from UCI Machine Learning Repository',\
           'http://archive.ics.uci.edu/ml/datasets/Automobile');\
           STYLE=meta,plain
SPLOAD     FILE='%gendir%/examples/Automobile.gsh'
BCFOREST   [GROUPS=symboling; NTREES=8; NXTRY=10; NUNITSTRY=75; SEED=197883]\
           normalized_losses,make,fuel_type,aspiration,number_doors,\
           body_style,drive_wheels,engine_location,wheel_base,\
           length,width,height,curb_weight,engine_type,number_cylinders,\
           engine_size,fuel_system,bore,stroke,compression_ratio,\
           horsepower,peak_rpm,city_mpg,highway_mpg,price
BCFIDENTIFY [PRINT=identification]\
           normalized_losses,make,fuel_type,aspiration,number_doors,\
           body_style,drive_wheels,engine_location,wheel_base,\
           length,width,height,curb_weight,engine_type,number_cylinders,\
           engine_size,fuel_system,bore,stroke,compression_ratio,\
           horsepower,peak_rpm,city_mpg,highway_mpg,price

Updated on February 6, 2023