SOMADJUST procedure

Performs adjustments to the weights of a self-organizing map (R.W. Payne).

Options

SOM = pointer Self-organizing map
DATA = matrix or pointer Data values for training the map
DMETHOD = string token Method for calculating the distances of data points from the nodes (euclidean, cityblock); default eucl
WMETHOD = string token Method for calculating the contribution of a data point to each node when revising the weights (gaussian, neighbour); default gaus

Parameters

ALPHA = scalars Alpha value for each iteration
SIGMA = scalars Sigma value for each iteration when WMETHOD=gaussian
THRESHOLD = scalars Threshold for each iteration when WMETHOD=neighbour
ERRORS = matrices Saves the reconstruction errors at the nodes of the map after each iteration
TOTALERROR = scalars Saves the total reconstruction error after each iteration
FITNODES = factors Saves the nodes allocated to the data points after each iteration

Description

A self-organizing map is a two-dimensional grid of nodes, used to classify vectors of observations on p variables. Each node is characterized by a vector of p weights (one for each variable). Genstat has a special SOM data structure to represent a map. This is declared using the SOM procedure, which also defines the row and column positions of the nodes on the grid. In addition, SOM stores the names of the weight variables, and information about how distances are to be measured on the grid and how the weights should be adjusted during their estimation.

The training dataset used to estimate the weights is specified by the DATA option, either as a matrix with n rows and p columns (where n is the number of observations in the training set) or as a pointer containing p variates each with n units. SOMADJUST gives a warning if the column names of a DATA matrix or the names of the variates in a DATA pointer differ from the names stored for the weight variables in the SOM structure.

The weights are estimated by a sequence of iterations. In each iteration, the training observations are taken in turn, and each observation i is assessed to find its closest node. The method used to measure distance will have been specified by the DMETHOD option of SOM, and stored with the SOM structure when it was declared. However, SOMADJUST also has a DMETHOD option in case you want to override the stored setting. The default setting for the DMETHOD option of SOM is euclidean. If X_i is a variate containing the values of the variables for observation i and W_j is the variate of weights at node j, the distance is then given by

d_ij = SQRT(SUM((X_i - W_j)**2))

The alternative setting, cityblock, calculates the distance as

d_ij = SUM(ABS(X_i - W_j))
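
The two distance settings can be illustrated outside Genstat. The following is a minimal Python sketch, not the code used by SOMADJUST; the function names and the two-node map are hypothetical and chosen only to show that euclidean and cityblock can select different closest nodes for the same observation.

```python
import math

def euclidean(x, w):
    """d = SQRT(SUM((x - w)**2)), as in the euclidean setting."""
    return math.sqrt(sum((xi - wi) ** 2 for xi, wi in zip(x, w)))

def cityblock(x, w):
    """d = SUM(ABS(x - w)), as in the cityblock setting."""
    return sum(abs(xi - wi) for xi, wi in zip(x, w))

def closest_node(x, weights, distance=euclidean):
    """Return the index of the node whose weight vector is nearest to x."""
    return min(range(len(weights)), key=lambda j: distance(x, weights[j]))

# Hypothetical map with two nodes and p = 2 weight variables.
weights = [[3.0, 3.0], [0.0, 5.0]]
x = [0.0, 0.0]

print(closest_node(x, weights))             # closest node under euclidean
print(closest_node(x, weights, cityblock))  # closest node under cityblock
```

Here the first node is closer under euclidean (about 4.24 against 5), while the second is closer under cityblock (5 against 6), so the choice of DMETHOD can change which node an observation is allocated to.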

Once the closest node, k, has been found, the weights at that node and other nodes are adjusted. The method to use will have been specified when the SOM structure was declared, by the WMETHOD option of SOM. However, SOMADJUST again has its own WMETHOD option that you can use to override the stored setting. The default setting for the WMETHOD option of SOM is gaussian. This adjusts the weights W_j at every node j to become

W_j + alpha * EXP( -0.5 * (d_jk / sigma)**2) * (X_i - W_j)

where d_jk is the distance between nodes j and k. With the alternative setting, neighbour, the weights at node j are adjusted to become

W_j + alpha * (X_i - W_j)

but only if d_jk is less than a threshold r.
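
As an illustration of the two adjustment rules, here is a minimal Python sketch. It is not SOMADJUST's implementation: the function names are hypothetical, and the grid distance d_jk is assumed to be the Euclidean distance between the (row, column) positions of the nodes, whereas in Genstat the grid distance method is stored in the SOM structure.

```python
import math

def grid_distance(pos_j, pos_k):
    # Assumption: Euclidean distance between (row, column) grid positions.
    return math.hypot(pos_j[0] - pos_k[0], pos_j[1] - pos_k[1])

def adjust_gaussian(weights, positions, x, k, alpha, sigma):
    """Every node j becomes W_j + alpha*EXP(-0.5*(d_jk/sigma)**2)*(X_i - W_j)."""
    new_weights = []
    for w_j, pos_j in zip(weights, positions):
        d_jk = grid_distance(pos_j, positions[k])
        h = alpha * math.exp(-0.5 * (d_jk / sigma) ** 2)
        new_weights.append([w + h * (xi - w) for w, xi in zip(w_j, x)])
    return new_weights

def adjust_neighbour(weights, positions, x, k, alpha, r):
    """Node j becomes W_j + alpha*(X_i - W_j), but only if d_jk < r."""
    new_weights = []
    for w_j, pos_j in zip(weights, positions):
        if grid_distance(pos_j, positions[k]) < r:
            new_weights.append([w + alpha * (xi - w) for w, xi in zip(w_j, x)])
        else:
            new_weights.append(list(w_j))  # outside the threshold: unchanged
    return new_weights

# Hypothetical 1 x 3 map with a single weight variable; node 0 is the closest.
positions = [(1, 1), (1, 2), (1, 3)]
weights = [[0.0], [0.0], [0.0]]
x = [1.0]

g = adjust_gaussian(weights, positions, x, k=0, alpha=0.5, sigma=1.0)
n = adjust_neighbour(weights, positions, x, k=0, alpha=0.5, r=1.5)
print(g)
print(n)
```

With gaussian, every node moves towards the observation, by an amount that decays smoothly with grid distance from the closest node. With neighbour, the nodes within the threshold all move by the same fraction alpha and the remaining nodes stay where they are.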

The values of alpha, sigma and r for the iterations are listed by the ALPHA, SIGMA and THRESHOLD parameters of SOMADJUST. Each of these supplies a list of scalars (one for each iteration). The ERRORS parameter can save a list of matrices containing reconstruction error at the nodes of the map after each iteration. The TOTALERROR parameter can save a list of scalars with the total reconstruction error after each iteration. Finally, the FITNODES parameter can save a list of factors indicating how the observations are allocated to the nodes by each iteration.
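
The way the per-iteration lists drive the estimation can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure's code: it uses the euclidean and gaussian settings, assumes Euclidean distance between grid positions, and assumes that the total reconstruction error is the sum, over observations, of the distance from each observation to its closest node (this page does not give the definition). The names progression and train are hypothetical. The schedules reproduce a Genstat progression such as 1,0.99...0.01, in which the first two values set the step.

```python
import math

def progression(first, second, last):
    """Expand a progression first,second...last into an explicit list."""
    step = second - first
    n = int(round((last - first) / step)) + 1
    return [round(first + i * step, 10) for i in range(n)]

def train(data, weights, positions, alphas, sigmas):
    """One pass over the data per (alpha, sigma) pair; weights change in place.
    Returns the (assumed) total reconstruction error after each iteration."""
    def dist(a, b):
        return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

    def nearest(x):
        return min(range(len(weights)), key=lambda j: dist(x, weights[j]))

    total_errors = []
    for alpha, sigma in zip(alphas, sigmas):
        for x in data:                      # observations taken in turn
            k = nearest(x)                  # closest node for this observation
            for j, w_j in enumerate(weights):
                d_jk = dist(positions[j], positions[k])  # assumed grid distance
                h = alpha * math.exp(-0.5 * (d_jk / sigma) ** 2)
                weights[j] = [w + h * (xi - w) for w, xi in zip(w_j, x)]
        # Assumed definition: summed distance to the closest node.
        total_errors.append(sum(dist(x, weights[nearest(x)]) for x in data))
    return total_errors

alphas = progression(1, 0.99, 0.01)   # 100 alpha values, one per iteration
sigmas = progression(5, 4.95, 0.05)   # 100 sigma values, one per iteration

# Hypothetical data (n = 4, p = 2) and a 1 x 2 map.
data = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
positions = [(1, 1), (1, 2)]
weights = [[0.5, 0.4], [0.5, 0.6]]

errors = train(data, weights, positions, alphas, sigmas)
print(len(errors), errors[-1])
```

Supplying the schedules as explicit lists, as SOMADJUST does, leaves the number of iterations and the rate at which alpha and sigma shrink entirely under the user's control.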

SOMADJUST thus allows you to define your own sequence of adjustment iterations leading to the estimation of the weights. An alternative is to use procedure SOMESTIMATE, which initializes the weights and runs through an automatic sequence of iterations (each performed using SOMADJUST).

Options: SOM, DATA, DMETHOD, WMETHOD.

Parameters: ALPHA, SIGMA, THRESHOLD, ERRORS, TOTALERROR, FITNODES.

Action with RESTRICT

SOMADJUST takes account of any restrictions defined on the DATA variates.

See also

Procedures: SOM, SOMDESCRIBE, SOMESTIMATE, SOMIDENTIFY, SOMPREDICT.

Commands for: Data mining.

Example

CAPTION 'SOMADJUST example',!t('Fisher''s Iris Data'); STYLE=meta,plain
SOM     Som; VARIABLENAMES=!t(Sepal_L,Sepal_W,Petal_L,Petal_W)
MATRIX  [ROWS=150; COLUMNS=!t(Sepal_L,Sepal_W,Petal_L,Petal_W)] Measures
READ    Measures
 5.1  3.5  1.4  0.2
 4.9  3.0  1.4  0.2
 4.7  3.2  1.3  0.2
 4.6  3.1  1.5  0.2
 5.0  3.6  1.4  0.2
 5.4  3.9  1.7  0.4
 4.6  3.4  1.4  0.3
 5.0  3.4  1.5  0.2
 4.4  2.9  1.4  0.2
 4.9  3.1  1.5  0.1
 5.4  3.7  1.5  0.2
 4.8  3.4  1.6  0.2
 4.8  3.0  1.4  0.1
 4.3  3.0  1.1  0.1
 5.8  4.0  1.2  0.2
 5.7  4.4  1.5  0.4
 5.4  3.9  1.3  0.4
 5.1  3.5  1.4  0.3
 5.7  3.8  1.7  0.3
 5.1  3.8  1.5  0.3
 5.4  3.4  1.7  0.2
 5.1  3.7  1.5  0.4
 4.6  3.6  1.0  0.2
 5.1  3.3  1.7  0.5
 4.8  3.4  1.9  0.2
 5.0  3.0  1.6  0.2
 5.0  3.4  1.6  0.4
 5.2  3.5  1.5  0.2
 5.2  3.4  1.4  0.2
 4.7  3.2  1.6  0.2
 4.8  3.1  1.6  0.2
 5.4  3.4  1.5  0.4
 5.2  4.1  1.5  0.1
 5.5  4.2  1.4  0.2
 4.9  3.1  1.5  0.2
 5.0  3.2  1.2  0.2
 5.5  3.5  1.3  0.2
 4.9  3.6  1.4  0.1
 4.4  3.0  1.3  0.2
 5.1  3.4  1.5  0.2
 5.0  3.5  1.3  0.3
 4.5  2.3  1.3  0.3
 4.4  3.2  1.3  0.2
 5.0  3.5  1.6  0.6
 5.1  3.8  1.9  0.4
 4.8  3.0  1.4  0.3
 5.1  3.8  1.6  0.2
 4.6  3.2  1.4  0.2
 5.3  3.7  1.5  0.2
 5.0  3.3  1.4  0.2
 7.0  3.2  4.7  1.4
 6.4  3.2  4.5  1.5
 6.9  3.1  4.9  1.5
 5.5  2.3  4.0  1.3
 6.5  2.8  4.6  1.5
 5.7  2.8  4.5  1.3
 6.3  3.3  4.7  1.6
 4.9  2.4  3.3  1.0
 6.6  2.9  4.6  1.3
 5.2  2.7  3.9  1.4
 5.0  2.0  3.5  1.0
 5.9  3.0  4.2  1.5
 6.0  2.2  4.0  1.0
 6.1  2.9  4.7  1.4
 5.6  2.9  3.6  1.3
 6.7  3.1  4.4  1.4
 5.6  3.0  4.5  1.5
 5.8  2.7  4.1  1.0
 6.2  2.2  4.5  1.5
 5.6  2.5  3.9  1.1
 5.9  3.2  4.8  1.8
 6.1  2.8  4.0  1.3
 6.3  2.5  4.9  1.5
 6.1  2.8  4.7  1.2
 6.4  2.9  4.3  1.3
 6.6  3.0  4.4  1.4
 6.8  2.8  4.8  1.4
 6.7  3.0  5.0  1.7
 6.0  2.9  4.5  1.5
 5.7  2.6  3.5  1.0
 5.5  2.4  3.8  1.1
 5.5  2.4  3.7  1.0
 5.8  2.7  3.9  1.2
 6.0  2.7  5.1  1.6
 5.4  3.0  4.5  1.5
 6.0  3.4  4.5  1.6
 6.7  3.1  4.7  1.5
 6.3  2.3  4.4  1.3
 5.6  3.0  4.1  1.3
 5.5  2.5  4.0  1.3
 5.5  2.6  4.4  1.2
 6.1  3.0  4.6  1.4
 5.8  2.6  4.0  1.2
 5.0  2.3  3.3  1.0
 5.6  2.7  4.2  1.3
 5.7  3.0  4.2  1.2
 5.7  2.9  4.2  1.3
 6.2  2.9  4.3  1.3
 5.1  2.5  3.0  1.1
 5.7  2.8  4.1  1.3
 6.3  3.3  6.0  2.5
 5.8  2.7  5.1  1.9
 7.1  3.0  5.9  2.1
 6.3  2.9  5.6  1.8
 6.5  3.0  5.8  2.2
 7.6  3.0  6.6  2.1
 4.9  2.5  4.5  1.7
 7.3  2.9  6.3  1.8
 6.7  2.5  5.8  1.8
 7.2  3.6  6.1  2.5
 6.5  3.2  5.1  2.0
 6.4  2.7  5.3  1.9
 6.8  3.0  5.5  2.1
 5.7  2.5  5.0  2.0
 5.8  2.8  5.1  2.4
 6.4  3.2  5.3  2.3
 6.5  3.0  5.5  1.8
 7.7  3.8  6.7  2.2
 7.7  2.6  6.9  2.3
 6.0  2.2  5.0  1.5
 6.9  3.2  5.7  2.3
 5.6  2.8  4.9  2.0
 7.7  2.8  6.7  2.0
 6.3  2.7  4.9  1.8
 6.7  3.3  5.7  2.1
 7.2  3.2  6.0  1.8
 6.2  2.8  4.8  1.8
 6.1  3.0  4.9  1.8
 6.4  2.8  5.6  2.1
 7.2  3.0  5.8  1.6
 7.4  2.8  6.1  1.9
 7.9  3.8  6.4  2.0
 6.4  2.8  5.6  2.2
 6.3  2.8  5.1  1.5
 6.1  2.6  5.6  1.4
 7.7  3.0  6.1  2.3
 6.3  3.4  5.6  2.4
 6.4  3.1  5.5  1.8
 6.0  3.0  4.8  1.8
 6.9  3.1  5.4  2.1
 6.7  3.1  5.6  2.4
 6.9  3.1  5.1  2.3
 5.8  2.7  5.1  1.9
 6.8  3.2  5.9  2.3
 6.7  3.3  5.7  2.5
 6.7  3.0  5.2  2.3
 6.3  2.5  5.0  1.9
 6.5  3.0  5.2  2.0
 6.2  3.4  5.4  2.3
 5.9  3.0  5.1  1.8  :
FACTOR       [NVALUES=150; LABELS=!t(Setosa,Versicolor,Virginica);\ 
             VALUES=50(1,2,3)] Species
CALCULATE    [SEED=187123] Random = GRUNIFORM(NVALUES(Som['weights']);\ 
             MINIMUM(Measures); MAXIMUM(Measures))
EQUATE       Random; Som['weights']
SOMADJUST    [SOM=Som; DATA=Measures] 1,0.99...0.01; SIGMA=5,4.95...0.05;\
             TOTALERROR=Errors[1...100]
PRINT        Som['weights']
VARIATE      [VALUES=1...100] Iteration,Totalerror
EQUATE       Errors; Totalerror
PEN          11; METHOD=line; SYMBOL=0
DGRAPH       Totalerror; Iteration; PEN=11
Updated on March 5, 2019
