TREE directive

Declares one or more tree data structures and initializes each one to have a single node known as its root.

 

No options

 

Parameter

IDENTIFIER = identifiers Identifiers of the trees

 

Description

BTREE declares and initializes Genstat tree structures. These can be used to represent hierarchical structures like classification trees, identification keys and regression trees. These types of tree can be constructed by special-purpose procedures BCLASSIFICATION, BKEY and BREGRESSION, respectively, and displayed by procedures BGRAPH and BPRINT. Most users will use only these special-purpose procedures, and will not need to operate on trees directly, nor to be aware of how they are formed, stored or manipulated. The procedures, however, are based on a suite of directives, functions and procedures summarized below, which provide the tool kit not only for the officially-supported tree facilities but also for user enhancements and extensions.

The tree structure is like a real tree, which starts from a root and then splits into branches, except that it is usually viewed as growing downwards instead of upwards. The branch-points in the tree are known as nodes, with the initial node being called the root (as in a real tree). There is also a node at the end of each branch, known as its terminal node. In Genstat a tree is similar to a pointer, with an element for each node. These elements are the identifiers of data structures which can be used to store information about the nodes. Usually the data structures will be pointers, so that several pieces of information can be stored for each node, but the precise contents depend on the type of tree (see, for example, procedures BCLASSIFICATION, BKEY and BREGRESSION).

Each node thus has a number, corresponding to the index of its element in the tree. The root is always numbered one, and this is the only node that the tree contains when it is declared by TREE. Further nodes can be added by the BGROW or BJOIN directives, which form branches from a terminal node or join another tree to a terminal node, respectively. The converse process of cutting a tree at a defined node and discarding the nodes and information below it is provided by the BCUT directive.

The numbers of the subsequent nodes can be obtained from the functions that are provided to navigate around a tree:

    BNEXT provides the numbers of the nodes below a node;
    BPREVIOUS provides the number of the node immediately above a node;
    BTERMINAL finds the next terminal node after a node;
    BSCAN finds the number of the node immediately after a node in a standard branch-by-branch order that visits each node once.

Other useful functions include:

    BNBRANCHES provides the number of branches below a node;
    BDEPTH calculates the depth of a node (taking the root as being at depth 1);
    BPATH provides a variate containing the numbers of the nodes on the branch to a node;
    BBRANCHES provides a variate containing the numbers of the branches taken on the path to a node;
    BBELOW provides a variate containing numbers of all the nodes or all the terminal nodes below a node;
    BNNODES provides the number of nodes in a tree;
    BMAXNODE provides the maximum node number in a tree.

There are also several utility procedures, which are used by the special-purpose tree procedures.

    BCONSTRUCT constructs a tree (using subsidiary procedure BSELECT, which is customized according to the type of tree).
    BGRAPH plots a tree.
    BPRINT displays a tree.
    BPRUNE prunes a tree using minimal cost complexity (assuming that “accuracy” values have been stored at each node of the tree, which can be done using customized procedure BVALUES).

New tree-based analyses can thus be added by writing a main procedure (like BCLASSIFICATION, BREGRESSION etc), and defining appropriate versions of BSELECT.

Options: none.

Parameter: IDENTIFIER.

 

See also

Directives: BASSESS, BCUT, BGROW, BJOIN, POINTER.

Procedures: BCONSTRUCT, BCLASSIFICATION, BGRAPH, BKEY, BPRINT, BPRUNE.

Functions: BBELOW, BBRANCHES, BDEPTH, BMAXNODE, BNBRANCHES, BNEXT, BNNODES, BPATH, BPREVIOUS, BSCAN, BTERMINAL.

Commands for: Data structures.

Example

" Examples 1:4.12.3, 1:4.12.4 & 1:4.12.5 "
" Declare the original tree."
TREE      T
" Define texts to use as labels for the nodes."
TEXT      Lab[1...26]; VALUES=\
          'a','b','c','d','e','f','g','h','i','j','k','l','m',\
          'n','o','p','q','r','s','t','u','v','w','x','y','z'
" Define information at root to be a pointer
  with a single element called 'label'."
POINTER   [NVALUES=!t(label)] T[1]
" Set that element to the first value of Lab, i.e. 'a'."
TEXT      T[1]['label']; VALUE=Lab[1]
" Display the tree - first with labels of nodes, then with numbers."
BPRINT    [PRINT=labelleddiagram,numbereddiagram] T
" Extend the tree by forming 3 branches from node 1 (root)."
BGROW     T; NODE=1; NBRANCH=3; NEWNODES=Gnew
" Define the information for the new nodes."
POINTER   [NVALUES=!t(label)] T[#Gnew]
TEXT      T[#Gnew]['label']; VALUE=Lab[#Gnew]
" Display the extended tree."
BPRINT    [PRINT=labelleddiagram,numbereddiagram] T
" Find the node number of the first terminal node "
CALCULATE N1 = BTERMINAL(T; 0)
" and then the second terminal node."
CALCULATE N2 = BTERMINAL(T; N1)
PRINT     N1,N2; DECIMALS=0
" Extend the tree by adding 2 branches at the second
  and then the first terminal node."
BGROW     T; NODE=N2; NBRANCH=2; NEWNODES=Gnew2
" Define the information for the new nodes."
POINTER   [NVALUES=!t(label)] T[#Gnew2]
TEXT      T[#Gnew2]['label']; VALUE=Lab[#Gnew2]
BGROW     T; NODE=N1; NBRANCH=2; NEWNODES=Gnew1
POINTER   [NVALUES=!t(label)] T[#Gnew1]
TEXT      T[#Gnew1]['label']; VALUE=Lab[#Gnew1]
" Display the extended tree."
BPRINT    [PRINT=labelleddiagram,numbereddiagram] T
"4.12.4 Remove the branches below N2, saving these as tree T2;
  also save and print the node mapping variates."
BCUT      T; NODE=N2; CUTTREE=T2; OLDNODES=Oldn;\
          NEWNODES=Newn; CUTNODES=Cutn
PRINT     [ORIENT=across] Oldn,Newn,Cutn; FIELD=3; DECIMALS=0
" Display the modified tree, and the cut-tree."
BPRINT    [PRINT=labelleddiagram,numbereddiagram] T,T2
" Redefine the root of the cut tree so that it no
  longer shares the same information pointer as the
  node where the cut was made in the original tree." 
POINTER   [NVALUES=!t(label)] T2root
TEXT      T2root['label']; VALUE='t2root'
ASSIGN    T2root; T2; 1
BPRINT    [PRINT=labelleddiagram] T2
" Use BCUT to form T3 as a duplicate of T
  but with renumbered nodes."
BCUT      [RENUMBER=yes] T; NEWTREE=T3
BPRINT    [PRINT=labelleddiagram,numbereddiagram] T3
"4.12.5 Join tree T2 onto node 4 of T3; save and print the
  numbers of the joined nodes in the revised tree."
BJOIN     T3; NODE=4; JOINTREE=T2; NEWNODES=Jnew
BPRINT    [PRINT=labelleddiagram,numbereddiagram] T3
PRINT     Jnew; DECIMALS=0
" Tree functions: all nodes below node 2,"
CALCULATE Below  = BBELOW(T3; 0; 0)
PRINT     Below; DECIMALS=0
" all terminal nodes below node 2,"
CALCULATE Below0 = BBELOW(T3; 0; 0)
PRINT     Below0; DECIMALS=0
" first three terminal nodes,"
CALCULATE N1 = BTERMINAL(T3; 0)
&         N2 = BTERMINAL(T3; N1)
&         N3 = BTERMINAL(T3; N2)
PRINT     N1,N2,N3; DECIMALS=0
" nodes and branches on path to N3,"
CALCULATE Pn3 = BPATH(T3; N3)
&         Ln3 = BBRANCHES(T3; N3)
PRINT     Pn3,Ln3; DECIMALS=0
" depth and number of branches at node 2,"
CALCULATE Nn2 = BNBRANCHES(T3; 2)
&         Dn2 = BDEPTH(T3; 2)
PRINT     Nn2,Dn2; DECIMALS=0
" next nodes on branches 1-3 from node 1,
  and branch 1 from node 2."
PRINT     BNEXT(T3; 1; 1); DECIMALS=0
PRINT     BNEXT(T3; 1; 2); DECIMALS=0
PRINT     BNEXT(T3; 1; 3); DECIMALS=0
PRINT     BNEXT(T3; 2; 1); DECIMALS=0
" Scan the tree, taking the nodes in standard order."
SCALAR    Scan[0]; value=0
CALCULATE Scan[1] = BSCAN(T3; Scan[0])
&         Scan[2] = BSCAN(T3; Scan[1])
&         Scan[3] = BSCAN(T3; Scan[2])
&         Scan[4] = BSCAN(T3; Scan[3])
&         Scan[5] = BSCAN(T3; Scan[4])
&         Scan[6] = BSCAN(T3; Scan[5])
&         Scan[7] = BSCAN(T3; Scan[6])
&         Scan[8] = BSCAN(T3; Scan[7])
&         Scan[9] = BSCAN(T3; Scan[8])
PRINT     Scan[1...9]; FIELD=8; DECIMALS=0

Updated on January 17, 2018

Was this article helpful?