Calculations and manipulation

The directive CALCULATE allows arithmetic calculations on the values of any numeric data structure; logical tests can also be done on numerical and textual values. Functions and operators are available for a very wide range of calculations on matrices and tables. Another general directive is EQUATE, which allows values to be copied from one set of data structures to another; the structures must store values of the same mode (for example, numbers or text), but need not be of the same type.

Structure values can be deleted to save space within Genstat; attributes can also be deleted so that the structure can be redefined, for example as another type. Contents of data structures can be compared, to see if they contain the same distinct items, or whether the distinct values in one structure are a subset of those in another. You can also find all the locations where a number, identifier or string occurs within a data structure.

`CALCULATE`	performs arithmetic and logical calculations
`DELETE`	allows values and attributes of data structures to be deleted
`EQUATE`	copies values between sets of data structures
`SETRELATE`	compares the sets of values in two data structures
`GETLOCATIONS`	finds locations of an identifier within a pointer, or a string within a factor or text, or a number within any numerical data structure
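
As a rough illustration of what comparing sets of values and finding locations involve, here is a minimal Python sketch. It is not Genstat syntax: the function names `relate_sets` and `get_locations` are hypothetical, and the 1-based position numbering is an assumption made for the example.

```python
def relate_sets(a, b):
    """Compare the distinct values held in two sequences."""
    sa, sb = set(a), set(b)
    return {
        "equal": sa == sb,            # same distinct items in both
        "a_subset_of_b": sa <= sb,    # every distinct value of a occurs in b
        "b_subset_of_a": sb <= sa,    # every distinct value of b occurs in a
    }


def get_locations(values, target):
    """Return the 1-based positions at which target occurs in values."""
    return [i for i, v in enumerate(values, start=1) if v == target]


print(relate_sets([1, 2, 2, 3], [3, 2, 1, 1]))
print(get_locations(["a", "b", "a"], "a"))
```

Only distinct values matter for the comparison, so repeated values in either sequence do not affect the result.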

There are several general directives for manipulating vectors (variates, factors or texts). Units of vectors can be sorted into systematic order or into random order. Boolean arithmetic can be performed on their contents, or you can form all the ways of partitioning them into subsets. A “restriction” can be associated with a vector, so that subsequent statements operate on only a subset of its units. A default length and labelling can be defined for vectors formed later in the job. Facilities for specific types of vector allow interpolation of values for variates, monotonic regression, calculation of regression quantiles, generation of factor values, and concatenation, editing and searching of text.

`SORT`	sorts units of vectors into alphabetic or numerical order of an index vector, or forms a factor from a variate or text
`SETCALCULATE`	performs Boolean set calculations on the contents of vectors and pointers
`SETALLOCATIONS`	runs through all ways of allocating a set of objects to subsets
`RESTRICT`	defines a “restriction” on the units of a vector
`UNITS`	defines default length or labelling for vectors defined subsequently in the job
`INTERPOLATE`	calculates variates of interpolated values
`FRQUANTILES`	forms regression quantiles
`MONOTONIC`	fits an increasing monotonic regression
`GROUPS`	forms a factor (or grouping variable) from a variate or text, together with the set of distinct values that occur
`CONCATENATE`	concatenates together lines of text vectors
`EDIT`	line editor for units of text vectors
`TXBREAK`	breaks a text structure into individual words
`TXCONSTRUCT`	forms a text structure by appending or concatenating values of scalars, variates, texts, factors or pointers; allows the case of letters to be changed or values to be truncated and reversed
`TXFIND`	finds a subtext within a text structure
`TXINTEGERCODES`	converts textual characters to and from their corresponding integer codes
`TXPOSITION`	locates strings within the lines of a text structure
`TXREPLACE`	replaces a subtext within a text structure
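
The idea of a "restriction" can be sketched in Python as a mask stored alongside the values, so that later operations see only the selected units while the full set of values is kept. This is a conceptual sketch only, not Genstat syntax; the class `RestrictedVector` and its methods are hypothetical names introduced for the example.

```python
class RestrictedVector:
    """A vector of values with an optional restriction (a boolean mask)."""

    def __init__(self, values):
        self.values = list(values)
        self.mask = [True] * len(self.values)   # initially unrestricted

    def restrict(self, condition):
        """Keep only the units for which condition(value) is true."""
        self.mask = [bool(condition(v)) for v in self.values]

    def clear(self):
        """Remove the restriction; all units become active again."""
        self.mask = [True] * len(self.values)

    def active(self):
        """Return the values of the units currently included."""
        return [v for v, keep in zip(self.values, self.mask) if keep]

    def mean(self):
        """Mean over the active units only."""
        units = self.active()
        if not units:
            raise ValueError("no units satisfy the restriction")
        return sum(units) / len(units)


x = RestrictedVector([1, 5, 8, 12])
print(x.mean())                  # uses all four units
x.restrict(lambda v: v > 4)
print(x.active(), x.mean())      # uses only the units greater than 4
x.clear()
print(x.active())                # the excluded unit was never deleted
```

Storing a mask rather than deleting units means the restriction can be removed or replaced later without losing any data.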

Another general directive allows you to run many algorithms from the Numerical Algorithms Group Library, for example to build mathematical models.

`NAG`	calls an algorithm from the NAG Library

Other facilities for vectors are provided by the procedures in the Genstat Procedure Library, including:

`APPEND`	appends a list of vectors of compatible types
`FACAMEND`	permutes the levels and labels of a factor
`FACCOMBINATIONS`	forms a factor to indicate observations with identical values of a set of variates, texts or factors
`FACDIVIDE`	represents a factor by factorial combinations of a set of factors
`FACEXCLUDEUNUSED`	redefines the levels and labels of a factor to exclude those that are unused
`FACMERGE`	merges levels of factors
`FACPRODUCT`	forms a factor with a level for every combination of other factors
`FACSORT`	sorts the levels of a factor according to an index vector
`FACLEVSTANDARDIZE`	redefines a list of factors so that they have the same levels or labels
`FACUNIQUE`	redefines a factor so that its levels and labels are unique
`FBETWEENGROUPVECTORS`	forms variates and classifying factors containing within-group summaries to use e.g. in a between-group analysis
`FDISTINCTFACTORS`	checks sets of factors to remove any that define duplicate classifications
`FMFACTORS`	forms a pointer of factors representing a multiple-response
`FFREERESPONSEFACTOR`	forms multiple-response factors from free-response data
`FREGULAR`	expands vectors onto a regular two-dimensional grid
`FRESTRICTEDSET`	forms vectors with the restricted subset of a list of vectors
`FROWCANONICALMATRIX`	puts a matrix into row canonical, or reduced row echelon, form
`FSTRING`	forms a single string from a list of strings in a text
`FTEXT`	forms a text structure from a variate
`FUNIQUEVALUES`	redefines a variate or text so that its values are unique
`FWITHINTERMS`	forms factors to define terms representing the effects of one factor within another factor
`FVSTRING`	forms a string listing the identifiers of a set of data structures
`GRANDOM`	generates pseudo-random numbers from probability distributions
`GRMNOMIAL`	generates multinomial pseudo-random numbers
`GRMULTINORMAL`	generates multivariate normal pseudo-random numbers
`JOIN`	joins or merges two sets of vectors together, based on classifying keys
`MVFILL`	replaces missing values in a vector with the previous non-missing value
`ORTHPOLYNOMIAL`	calculates orthogonal polynomials
`QUANTILE`	calculates quantiles of the values in a variate
`RANK`	produces ranks, from the values in a variate, allowing for ties
`RESHAPE`	reshapes a data set with classifying factors for rows and columns, into a reorganized data set with new identifying factors
`SAMPLE`	samples from a set of units, possibly stratified by factors
`SVSAMPLE`	constructs stratified random samples
`STACK`	combines several data sets by “stacking” the corresponding vectors
`STANDARDIZE`	standardizes columns of a data matrix to have mean 0 and variance 1
`SUBSET`	forms vectors containing subsets of the values in other vectors
`TXPAD`	pads strings of a text structure with extra characters so that their lengths are equal
`TXPROGRESSION`	forms a text containing a progression of strings
`TXSPLIT`	splits a text into individual texts, at positions on each line marked by separator character(s)
`UNSTACK`	splits vectors into individual vectors according to levels of a factor
`VEQUATE`	equates values across a set of data structures
`VINTERPOLATE`	performs linear and inverse linear interpolation between variates
`VREPLACE`	replaces values of vectors and pointers
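
Two of the operations listed above, filling missing values from the previous non-missing value and ranking with ties, can be sketched in Python as follows. This is not Genstat syntax, and the function names are hypothetical. Two assumptions are made for the example: missing values are represented by `None` and leading missing values are left missing, and tied values share the mean of the ranks they occupy.

```python
def fill_missing(values):
    """Replace each None with the previous non-missing value.

    Leading None values stay as None (assumption: nothing to carry forward).
    """
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def rank_with_ties(values):
    """Rank values from 1 (smallest); tied values get the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend the run while the next sorted value equals the current one.
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1   # mean of ranks start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


print(fill_missing([None, 2, None, None, 5, None]))
print(rank_with_ties([10, 20, 10, 30]))
```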

There are several procedures for calculating or fitting splines, and for manipulating series of observations of a theoretical curve.

`SPLINE`	calculates a set of basis functions for M-, B- or I-splines
`LSPLINE`	calculates design matrices to fit a natural polynomial or trigonometric L-spline as a linear mixed model
`NCSPLINE`	calculates natural cubic spline basis functions (for use e.g. in `REML`)
`PENSPLINE`	calculates design matrices to fit a penalized spline as a linear mixed model
`PSPLINE`	calculates design matrices to fit a P-spline as a linear mixed model
`RADIALSPLINE`	calculates design matrices to fit a radial-spline surface as a linear mixed model
`TENSORSPLINE`	calculates design matrices to fit a tensor-spline surface as a linear mixed model
`ALIGNCURVE`	forms an optimal warping to align an observed series of observations with a standard series
`BASELINE`	estimates a baseline for a series of numbers whose minimum value is drifting
`PEAKFINDER`	finds the locations of peaks in an observed series

Directives are available for eigenvalue, QR and singular-value decompositions of matrices, and to form the values of SSPM structures.

`FLRV`	calculates latent roots and vectors (that is, eigenvalues and eigenvectors)
`QRD`	calculates QR decompositions of matrices
`SVD`	calculates singular-value decompositions of matrices
`FSSPM`	calculates values for SSPM structures (sums of squares and products, means, etc.)

Procedures in the Library for operating on matrices include:

`FCORRELATION`	forms the correlation matrix for a list of variates
`PARTIALCORRELATIONS`	calculates partial correlations for a list of variates
`FHADAMARDMATRIX`	forms Hadamard matrices
`FPROJECTIONMATRIX`	forms a projection matrix for a set of model terms
`FRTPRODUCTDESIGNMATRIX`	forms summation, or relationship, matrices for model terms
`FVCOVARIANCE`	forms the variance-covariance matrix for a list of variates
`GINVERSE`	calculates the generalized inverse of a matrix
`LINDEPENDENCE`	finds the linear relations associated with matrix singularities
`MPOWER`	forms integer powers of a square matrix
`POSSEMIDEFINITE`	calculates a positive semi-definite approximation of a non-positive semi-definite symmetric matrix
`VMATRIX`	copies values and row/column labels from a matrix to variates and texts
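
As an example of one of these matrix operations, an integer power of a square matrix can be formed by repeated squaring. The Python sketch below is not Genstat syntax and makes no claim about how the Library procedure is implemented; the function names are hypothetical, and only non-negative powers are handled here.

```python
def mat_mul(a, b):
    """Multiply two matrices stored as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def mat_power(m, power):
    """Raise a square matrix to a non-negative integer power."""
    n = len(m)
    if any(len(row) != n for row in m):
        raise ValueError("matrix must be square")
    if power < 0:
        raise ValueError("this sketch handles non-negative powers only")
    # Start from the identity matrix, which is the zeroth power.
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    base = [row[:] for row in m]
    while power > 0:
        if power & 1:                 # this binary digit of the power is set
            result = mat_mul(result, base)
        base = mat_mul(base, base)    # square for the next binary digit
        power >>= 1
    return result


print(mat_power([[1, 1], [1, 0]], 5))
```

Repeated squaring needs a number of multiplications that grows with the number of binary digits in the power, rather than with the power itself.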

Tables can be formed containing summaries of values in variates: totals, minimum and maximum values, quantiles, numbers of missing and non-missing values, means and variances. Manipulations of multi-way structures include the ability to add various types of marginal summaries to tables, and to combine “slices” of tables, of matrices or of variates.

`TABULATE`	forms tables of summaries of the values of a variate
`MARGIN`	calculates or deletes margins of tables
`COMBINE`	combines or omits “slices” of tables, matrices or variates
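
To make the ideas of tabulation and margins concrete, the Python sketch below forms a two-way table of totals classified by two factors, and then adds marginal totals. It is not Genstat syntax: the function names, the dictionary representation of a table and the label `"Margin"` are all assumptions made for the example.

```python
from collections import defaultdict


def tabulate_totals(values, rows, cols):
    """Two-way table of totals of values, classified by two factors."""
    table = defaultdict(float)
    for v, r, c in zip(values, rows, cols):
        if v is not None:             # missing values are skipped
            table[(r, c)] += v
    return dict(table)


def add_total_margins(table, label="Margin"):
    """Return a copy of the table with row, column and overall totals added."""
    out = dict(table)
    for (r, c), v in table.items():
        out[(r, label)] = out.get((r, label), 0.0) + v          # row total
        out[(label, c)] = out.get((label, c), 0.0) + v          # column total
        out[(label, label)] = out.get((label, label), 0.0) + v  # grand total
    return out


yields = [1.0, 2.0, 3.0, 4.0]
block = ["a", "a", "b", "b"]
treatment = ["x", "y", "x", "y"]

table = tabulate_totals(yields, block, treatment)
print(add_total_margins(table))
```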

Procedures in the Library for operating on tables include:

`BACKTRANSFORM`	calculates back-transformed means with approximate standard errors and confidence intervals
`MEDIANTETRAD`	gives robust identification of multiple outliers in 2-way tables
`MTABULATE`	tabulates data classified by multiple-response factors
`PERCENT`	expresses the body of a table as percentages of one of its margins
`SVBOOT`	bootstraps data from random surveys
`SVCALIBRATE`	performs generalized calibration of survey data
`SVGLM`	fits generalized linear models to survey data
`SVREWEIGHT`	modifies survey weights, adjusting them to ensure that their overall sum remains unchanged
`SVSAMPLE`	constructs stratified random samples
`SVSTRATIFIED`	analyses stratified random surveys by expansion or ratio raising
`SVTABULATE`	tabulates data from random surveys, including multistage surveys and surveys with unequal probabilities of selection
`SVWEIGHT`	forms survey weights
`TABINSERT`	inserts the contents of a sub-table into a table
`TABMODE`	forms summary tables of modes of values
`TABSORT`	sorts tables so their margins are in ascending or descending order
`TCOMBINE`	combines several tables into a single table
`T%CONTROL`	expresses tables as percentages of control cells
`VTABLE`	forms a variate and set of classifying factors from a table

Directives are available for adding and removing branches of trees, and to assist in the construction and use of trees.

`BASSESS`	assesses potential splits for regression and classification trees
`BCUT`	cuts a tree at a defined node, discarding nodes and information below it
`BIDENTIFY`	identifies specimens using a tree
`BJOIN`	extends a tree by joining another tree to a terminal node
`BGROW`	adds new branches to a node of a tree

There are also procedures for displaying and pruning trees. These provide basic utilities for tree-based analysis, and are used by the existing procedures for classification trees, identification keys and regression trees (BCLASSIFICATION, BKEY and BREGRESSION).

`BCONSTRUCT`	constructs a tree
`BGRAPH`	plots a tree
`BPRINT`	displays a tree
`BPRUNE`	prunes a tree using minimal cost complexity

Formulae and expressions can be interpreted, revised or constructed automatically from the contents of pointers.

`FARGUMENTS`	forms lists of arguments involved in an expression
`FCLASSIFICATION`	forms classification sets for the terms in a formula or breaks a formula up into separate formulae (one for each term)
`REFORMULATE`	modifies a formula or an expression to operate on a different set of data structures
`SET2FORMULA`	forms a model formula using structures supplied in a pointer

Values can be assigned to dummies and pointers.

`ASSIGN`	sets values of dummies and pointers

Aspects of the “environment” of the current job can be modified, such as whether or not Genstat starts output from a statistical analysis at the top of a new page, or whether it should pause during interactive output. New defaults can be set for options and parameters. Details of the environmental settings can be copied into Genstat data structures. Attributes of data structures can also be accessed.

`SET`	sets details of the “environment” of a Genstat job
`SETOPTION`	sets or modifies defaults of options of Genstat directives or procedures
`SETPARAMETER`	sets or modifies defaults of parameters of Genstat directives or procedures
`GET`	gets details of the “environment” of a Genstat job
`GETATTRIBUTE`	accesses attributes of data structures
`GETNAME`	forms the name of a structure according to its `IPRINT` attribute

There are also various specialist mathematical facilities.

`BPCONVERT`	converts bit patterns between integers, pointers of set bits and textual descriptions
`FPARETOSET`	forms the Pareto optimal set of non-dominated groups
`GALOIS`	forms addition and multiplication tables for a Galois finite field
`NCONVERT`	converts integers between base 10 and other bases
`PERMUTE`	forms all possible permutations of the integers 1…n
`PRIMEPOWER`	decomposes a positive integer into its constituent prime powers
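
Two of these facilities, decomposition into prime powers and conversion between number bases, can be sketched in Python as follows. This is not Genstat syntax; the function names are hypothetical, and the restriction to bases 2 to 36 with digits `0-9A-Z` is an assumption made for the example.

```python
def prime_powers(n):
    """Decompose a positive integer into (prime, exponent) pairs by trial division."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    factors = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            exponent = 0
            while n % p == 0:
                n //= p
                exponent += 1
            factors.append((p, exponent))
        p += 1
    if n > 1:                  # whatever remains is itself prime
        factors.append((n, 1))
    return factors


DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def to_base(n, base):
    """Convert a non-negative base-10 integer to a string in the given base."""
    if n < 0 or not 2 <= base <= 36:
        raise ValueError("need n >= 0 and 2 <= base <= 36")
    if n == 0:
        return "0"
    out = []
    while n:
        n, remainder = divmod(n, base)
        out.append(DIGITS[remainder])
    return "".join(reversed(out))


def from_base(text, base):
    """Convert a string in the given base back to a base-10 integer."""
    return int(text, base)


print(prime_powers(360))
print(to_base(255, 16), from_base("FF", 16))
```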

And there are games.

`BINGO`	can be used to set up and then play a game of bingo
`FRUITMACHINE`	runs a fruit machine using pop-up menus and Genstat graphics
`LIFE`	plays John Conway’s Game of Life
`NOUGHTSANDCROSSES`	plays a game of noughts and crosses

Updated on February 7, 2023
