NNFIT directive

Fits a multi-layer perceptron neural network.

Options

`PRINT` = string tokens	Controls fitted output (`description`, `estimates`, `fittedvalues`, `summary`); default `desc`, `esti`, `summ`
`NHIDDEN` = scalar	Number of functions in the hidden layer; no default, must be set
`HIDDENMETHOD` = string token	Type of activation function in the hidden layer (`logistic`, `hyperbolictangent`); default `logi`
`OUTPUTMETHOD` = string token	Type of activation function in the output layer (`linear`, `logistic`, `hyperbolictangent`); default `line`
`GAIN` = scalar	Multiplicative constant to use in the functions; default 1
`NTRIES` = scalar	Number of times to search for a good initial starting point for the optimization; default 5
`NSTARTITERATIONS` = scalar	Number of iterations to use to find a good starting point for the optimization; default 30
`VALIDATIONOPTIONS` = variate	Variate containing three integers to control validation for early stopping; default `!(10,4,16)`
`SEED` = scalar	Seed for random numbers to generate initial values for the free parameters; default 0
`MAXCYCLE` = scalar	Maximum number of iterations of the conjugate-gradient algorithm; default 50

Parameters

`Y` = variates	Response variates
`X` = pointers	Input variates
`YVALIDATION` = variates	Validation data for the dependent variates
`XVALIDATION` = pointers	Validation data for the independent variates
`FITTEDVALUES` = variates	Fitted values generated for each y-variate by the neural network
`OBJECTIVE` = scalars	Value of the sum of squares objective function at the end of the optimization
`NCOMPLETED` = scalars	Number of completed iterations of the conjugate-gradient algorithm
`EXIT` = scalars	Saves the exit code
`SAVE` = pointers	Saves details of the network and the estimated parameters

Description

A neural network is a method for describing a nonlinear relationship between a response variate, supplied here by the Y parameter, and a set of input variates, supplied here in a pointer by the X parameter. The type of neural network fitted by NNFIT is a fully-connected feed-forward multi-layer perceptron with a single hidden layer. This network starts with a row of nodes, one for each input variable (i.e. x-variate), each of which is connected to every node in the hidden layer. The nodes in the hidden layer are then all connected to the output node in the final, output layer. The number of nodes in the hidden layer is specified by the NHIDDEN option.

The output value y is given by

y = ψ( Σ_{k = 1…m} w_k φ( Σ_{j = 1…d} w_jk x_j – θ) – η)

where d	is the number of input nodes (i.e. x-variates),
m	is the number of hidden nodes (`NHIDDEN`),
x_j	is the value of the jth x-variate,
w_jk	are weight parameters in the connections between the nodes in the input and hidden layers,
w_k	are weight parameters in the connections between the nodes in the hidden and output layer,
θ	is the threshold value subtracted at the hidden layer,
η	is the threshold value subtracted at the single node in the output layer,
φ(.)	is the activation function applied at the hidden layer,
ψ(.)	is the activation function applied at the output layer.

The activation functions for the hidden and output layers are specified by the HIDDENMETHOD and OUTPUTMETHOD options, respectively, with settings:

`linear`	φ(z) = z (`OUTPUTMETHOD` only),
`logistic`	φ(z) = 1 / (1 + exp(-γz)),
`hyperbolictangent`	φ(z) = tanh(γz),

where the parameter γ is specified by the GAIN option; the default setting is logistic for HIDDENMETHOD, and linear for OUTPUTMETHOD.
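The formula above can be checked by hand for a single data record. The following is a minimal sketch, assuming d = 2 inputs, m = 2 hidden nodes, the default logistic hidden activation and the default linear output activation. All identifiers and numerical values are invented for illustration; they are not produced by NNFIT.

```
" Illustrative values only: two inputs, two hidden nodes."
SCALAR   x1,x2; VALUE=0.5,-1.0
" wjk is the weight from input j to hidden node k."
SCALAR   w11,w21,w12,w22; VALUE=0.2,-0.4,0.7,0.1
" wk is the weight from hidden node k to the output node."
SCALAR   w1,w2; VALUE=1.5,-0.5
SCALAR   theta,eta,gain; VALUE=0.1,0.2,1
" Hidden layer: logistic activation with gain."
CALC     h1 = 1/(1 + EXP(-gain*(w11*x1 + w21*x2 - theta)))
CALC     h2 = 1/(1 + EXP(-gain*(w12*x1 + w22*x2 - theta)))
" Output layer: linear activation."
CALC     yout = w1*h1 + w2*h2 - eta
PRINT    yout
```

Replacing the two hidden-layer calculations with `TANH(gain*(...))` would correspond to the `hyperbolictangent` setting.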

Values for the free parameters in the multi-layer perceptron model are optimized by using a preconditioned, limited-memory quasi-Newton conjugate-gradient method. This minimizes the objective (sum of squares) function, which is equal to 0.5 times the average sum of squared deviations of the estimated y-values from the observed y-values.
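As a rough illustration of the objective function, the sketch below computes 0.5 times the mean squared deviation for a few made-up observed and fitted values. It assumes that the averaging is over the data records; the exact scaling used internally is not stated here, so treat the result as indicative only. The identifiers `yobs`, `yfit` and `obj` are hypothetical.

```
" Made-up observed and fitted values for three data records."
VARIATE  [VALUES=1.0,2.0,3.0] yobs
VARIATE  [VALUES=1.1,1.8,3.3] yfit
" Assumed form: half the mean of the squared deviations."
CALC     obj = 0.5 * MEAN((yfit - yobs)**2)
PRINT    obj
```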

Printed output is controlled by the PRINT option, with settings:

`description`	a description of the network (number of input variables, nodes etc.),
`estimates`	estimates of the free parameters,
`fittedvalues`	fitted values,
`summary`	summary (numbers of iterations, objective function etc.).

The NTRIES option defines the number of times to search for a good initial starting point for the optimization (default 5). The NSTARTITERATIONS option defines the number of iterations to use to find a good starting point for the optimization (default 30).

The SEED option supplies a seed for the random numbers to generate initial values for the free parameters. The default of zero continues the existing sequence of random numbers if any have already been used in the current Genstat job. If none have yet been used, Genstat picks a seed at random.

The MAXCYCLE option sets a limit on the number of iterations of the conjugate-gradient algorithm to use for the estimation (default 50).
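The options controlling the search can be set together in one call. This is a minimal sketch, assuming that the variate `yval` and the pointer `Measures` have been formed as in the example at the end of this page; the option values are arbitrary choices for illustration, not recommendations.

```
" More starting points, longer starting searches and a higher iteration
  limit than the defaults; a non-zero SEED makes the run repeatable."
NNFIT    [PRINT=summary; NHIDDEN=5; NTRIES=10; NSTARTITERATIONS=50;\
         MAXCYCLE=200; SEED=12] Y=yval; X=Measures
```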

To improve the accuracy of the neural-network approximations to new data records, it is usually desirable to stop the optimization before the value of the objective function reaches a global minimum on the training set. This method, known as early stopping, can be performed by using a validation set of data records, specified by the YVALIDATION and XVALIDATION parameters. The optimization is then halted when the sum of squares error function achieves a minimum over the validation set of data records, which has not been used to estimate the values of the free parameters in the model. The VALIDATIONOPTIONS option specifies a variate containing three integers to control validation for early stopping. The first integer defines the number of iterations of the optimizing function to complete before beginning validation; default 10. The second integer defines the number of iterations between consecutive validations; default 4. The third integer defines the number of iterations to continue validating beyond the current minimum of the objective function before stopping; default 16. This is to try to avoid the possibility of getting stuck at a local minimum. The variates in the XVALIDATION pointer must be in the same order as the corresponding variates in the X pointer.
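A call that uses early stopping might look like the sketch below. It assumes that the training structures `ytrain` (variate) and `Xtrain` (pointer), and the validation structures `yvalid` (variate) and `Xvalid` (pointer), have already been formed; these names are hypothetical. The three VALIDATIONOPTIONS values are the defaults, written out explicitly.

```
" Start validating after 10 iterations, validate every 4 iterations, and
  continue for 16 iterations beyond the current validation minimum.
  The variates in Xvalid must be in the same order as those in Xtrain."
NNFIT    [PRINT=summary; NHIDDEN=5; VALIDATIONOPTIONS=!(10,4,16); SEED=12]\
         Y=ytrain; X=Xtrain; YVALIDATION=yvalid; XVALIDATION=Xvalid
```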

The results of the fit, together with details about the design of the neural network, can be saved using the SAVE parameter. The saved structure can then be used in the NNDISPLAY directive to display further output, or in the NNPREDICT directive to form predictions.
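The output parameters can be used to keep the results for later use. The following is a minimal sketch, assuming `yval` and `Measures` as in the example at the end of this page; the identifiers `fit`, `obj`, `ncycle`, `code` and `nnsave` are hypothetical names chosen for illustration.

```
" Suppress printed output and save the results in data structures."
NNFIT    [PRINT=*; NHIDDEN=5; SEED=12] Y=yval; X=Measures;\
         FITTEDVALUES=fit; OBJECTIVE=obj; NCOMPLETED=ncycle;\
         EXIT=code; SAVE=nnsave
" Inspect the final objective value, iteration count and exit code."
PRINT    obj,ncycle,code
```

The pointer `nnsave` is the structure that would then be supplied to NNDISPLAY or NNPREDICT.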

Options: PRINT, NHIDDEN, HIDDENMETHOD, OUTPUTMETHOD, GAIN, NTRIES, NSTARTITERATIONS, VALIDATIONOPTIONS, SEED, MAXCYCLE.

Parameters: Y, X, YVALIDATION, XVALIDATION, FITTEDVALUES, OBJECTIVE, NCOMPLETED, EXIT, SAVE.

Method

NNFIT uses the function nagdmc_mlp from the Numerical Algorithms Group’s library of Data Mining Components (DMCs), which estimates the free parameters using a conjugate gradient method.

Action with `RESTRICT`

You can restrict the set of units used for the estimation by applying a restriction to the y-variate or any of the x-variates. If several of these are restricted, they must all be restricted to the same set of units. Similarly, you can restrict the set of units used for the validation by applying a restriction to the YVALIDATION variate or any of the XVALIDATION variates.
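For instance, the fit could be limited to a subset of the units as sketched below. This assumes the Iris structures from the example at the end of this page; the condition itself is an arbitrary illustration.

```
" Use only the units with petal length greater than 2 for the estimation."
RESTRICT yval; CONDITION=Petal_Length.GT.2
NNFIT    [PRINT=summary; NHIDDEN=5; SEED=12] Y=yval; X=Measures
" Remove the restriction afterwards."
RESTRICT yval
```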

Example

" Example NNFI-1: Fitting a multi-layer perceptron neural network."

" This example fits a multi-layer perceptron neural network with five hidden
  nodes in a single hidden layer, a hyperbolic tangent activation function
  in that layer and a linear activation function in the output layer."

" The data are in a file called iris.GSH and contain the data from Fisher's
  Iris data set."

SPLOAD   [PRINT=*] '%GENDIR%/Data/iris.GSH' 
POINTER  [VALUES=Sepal_Length,Sepal_Width,Petal_Length,Petal_Width] Measures
CALC     yval = NEWLEVELS(Species)
NNFIT    [PRINT=description,estimates,summary; NHIDDEN=5;\
          HIDDENMETHOD=hyperbolictangent; OUTPUTMETHOD=linear; SEED=12]\ 
           Y=yval; X=Measures

Updated on March 7, 2019
