GENPROCRUSTES procedure

Performs a generalized Procrustes analysis (G.M. Arnold & R.W. Payne).

Options

PRINT = string tokens Printed output required (analysis, centroid, column, individual, monitoring); default anal, cent
SCALING = string token Type of scaling to use (none, isotropic, separate); default none
METHOD = string token Method to be used (Gower, TenBerge); default Gowe
NROOTS = scalar Number of roots (i.e. dimensions) to print for the output configurations, consensus and rotation matrices, and number of dimensions to save with the XOUTPUT, CONSENSUS and ROTATIONS parameters if their matrices have not already been defined; default is to print and save all the dimensions
PLOT = string tokens Controls which graphs to display (consensus, individuals, projections); default * i.e. none
NDROOTS = scalar Number of dimensions to display in the consensus and individuals plots; default 3
TOLERANCE = scalar The algorithm is assumed to have converged when (last residual sum of squares) – (current residual sum of squares) < TOLERANCE × (number of configurations); default 0.00001
MAXCYCLE = scalar Limit on number of iterations; default 50

Parameters

XINPUT = pointers Each pointer points to a set of matrices holding the original input configurations
XOUTPUT = pointers Each pointer points to a set of matrices to store a set of final (output) configurations
CONSENSUS = matrices Stores the final consensus configuration from each analysis
ROTATIONS = pointers Each pointer points to a set of matrices to store the rotations required to transform each set of XINPUT configurations to their final (scaled) XOUTPUT configurations
RESIDUALS = pointers Each pointer points to a set of matrices to store the distances of a set of scaled XINPUT configurations from its consensus
RSS = scalars Stores the residual sum of squares from each analysis
ROOTS = diagonal matrices Stores the latent roots from referring the centroid configuration to its principal axis form (consensus) for each analysis
WSS = scalars Stores the initial within-configuration sum of squares from each analysis
SCALINGFACTOR = variates Stores the isotropic scaling factors for configurations from each analysis
PROJECTIONS = pointers Each pointer points to a set of matrices to store a set of projection matrices

Description

An N × V matrix represents a configuration of N points in V dimensions. Given a set of M such matrices (XINPUT), a generalized Procrustes analysis iteratively matches them to a common centroid configuration by the operations of translation to a common origin, rotation/reflection of axes and possibly also scale changes. This matching seeks to minimise the sum of the squared distances between the centroid and each individual configuration summed over all points (the Procrustes statistic for each configuration and the centroid, summed over all configurations). The final centroid is referred to principal axes to give a unique consensus configuration. Two methods of scaling are available (controlled by the SCALING option). Isotropic scaling, which scales all the dimensions of each configuration by an equal amount, takes place during the Procrustes analysis. The alternative is to scale each configuration prior to the analysis so that the trace of each matrix is one (see Arnold 1992). If this latter method is used, the subsequent residuals represent pure lack-of-fit and the scaling factors given in the results represent differences in relative size/spread of the original (centred) configurations, whereas for overall isotropic scaling the scaling factor contains components of both size and lack-of-fit.

Procedure GENPROCRUSTES carries out a generalized Procrustes analysis and has parameters for saving various results for future use (XOUTPUT, CONSENSUS, ROTATIONS, RESIDUALS, RSS, ROOTS, WSS, SCALINGFACTOR, PROJECTIONS). There are options for different methods to use for the matching (SCALING, METHOD), control of convergence (TOLERANCE, MAXCYCLE) and printing and plotting of results (PRINT, PLOT, NROOTS and NDROOTS).
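As an illustration of the saving parameters, the following sketch stores several results and prints the saved scalars and scaling factors. It assumes that X is a pointer to the input matrices (for example X[1...3], as set up in the Example section); the identifiers Xout, Cons, Rss, Wss and Rho are hypothetical names chosen for this sketch, and it is assumed that the procedure defines them if they have not already been declared.

```genstat
"Sketch only: X is assumed to point to the input configurations."
GENPROCRUSTES [PRINT=analysis; SCALING=isotropic] XINPUT=X; XOUTPUT=Xout;\
              CONSENSUS=Cons; RSS=Rss; WSS=Wss; SCALINGFACTOR=Rho
"Display the saved sums of squares and the isotropic scaling factors."
PRINT Rss,Wss
PRINT Rho
```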

Note that the special case of M=2 corresponds to the classical pairwise Procrustes matching (ROTATE directive) except that by fitting each configuration to a common centroid the requirement to regard one of the initial configurations as fixed is obviated.

Options: PRINT, SCALING, METHOD, NROOTS, PLOT, NDROOTS, TOLERANCE, MAXCYCLE.

Parameters: XINPUT, XOUTPUT, CONSENSUS, ROTATIONS, RESIDUALS, RSS, ROOTS, WSS, SCALINGFACTOR, PROJECTIONS.

Method

The default method used for generalized Procrustes analysis in GENPROCRUSTES is that described by Gower (1975). Each input configuration (XINPUT – referred to henceforth as Xi, i=1…M) is initially column-centred, with the individual column means for each configuration optionally printed (by including the column setting with the PRINT option). If separate scaling is requested (option SCALING=separate), the matrices are also scaled to have trace one (see Arnold 1992). A constraint is required on the overall sum of squares to prevent the trivial solution of matching by all configurations collapsing to the origin. In this procedure the constraint used is

∑ ( trace ( Xi′ Xi ) ) = M.

An initial estimate of the centroid is found from these centred and scaled configurations; firstly X2 is rotated to X1, with the rotated X2 saved as the new X2 and the centroid computed as the mean of X1 and the new X2; X3 is rotated to this centroid which is then recalculated as the mean of the three current configurations; and so on until all configurations Xi (i=1…M) have been included. The centroid thus found is taken as the initial centroid estimate Y, with the rotated values as the new Xi. The initial residual sum of squares Sr is calculated as

Sr = M × ( 1 – trace ( Y′ Y )).
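The form of this expression can be checked with a short derivation. The sketch below assumes that the current Xi satisfy the overall constraint (rotation leaves each trace unchanged) and that Y is their mean, so that the sum of the Xi equals M × Y; a prime denotes a transpose.

```latex
\begin{aligned}
S_r &= \sum_{i=1}^{M} \operatorname{trace}\{(X_i - Y)'(X_i - Y)\} \\
    &= \sum_{i=1}^{M} \operatorname{trace}(X_i' X_i)
       - 2 \sum_{i=1}^{M} \operatorname{trace}(X_i' Y)
       + M \operatorname{trace}(Y' Y) \\
    &= M - 2M \operatorname{trace}(Y' Y) + M \operatorname{trace}(Y' Y) \\
    &= M \{ 1 - \operatorname{trace}(Y' Y) \}
\end{aligned}
```

The second line expands the squared distances, and the third uses the constraint for the first term and the definition of Y as the mean for the second.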

Each of the current configurations Xi is then rotated to Y and the rotated position saved as the new Xi. The updated estimate of the centroid Yn is calculated as the mean of the new Xi (i=1…M) and the new residual sum of squares calculated as

Srn = M × ( 1 – trace ( Yn′ Yn )).

If isotropic scaling has been requested (option SCALING=isotropic) new estimates roi′ of the individual scaling factors roi (originally set to 1) are now found by

roi′/roi = √( trace( Xi′ Yn )/( trace( Xi′ Xi ) × trace( Yn′ Yn )))

and each Xi is updated by a factor of roi′/roi. The centroid is then recalculated as the mean of the new Xi and the new residual sum of squares calculated in a similar manner to before. If the decrease in the residual sum of squares Sr between successive iterations is less than TOLERANCE × M (option TOLERANCE; default 0.00001), the algorithm is taken to have converged. If not, the process is repeated until the tolerance is reached or the maximum number of iterations set by the option MAXCYCLE (default 50) is exceeded, in which case a message of non-convergence is printed and the procedure terminates. Monitoring information about convergence can be printed by including the monitoring setting with the PRINT option.
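A call that tightens the convergence criterion and follows the iterations might look like the sketch below. The values chosen for TOLERANCE and MAXCYCLE are purely illustrative, and X is assumed to be a pointer to the input matrices as in the Example section.

```genstat
"Sketch only: stricter tolerance, more cycles allowed, progress printed."
GENPROCRUSTES [PRINT=analysis,monitoring; SCALING=isotropic;\
              TOLERANCE=0.000001; MAXCYCLE=100] XINPUT=X
```

With M = 3 configurations, this setting would require the residual sum of squares to decrease by less than 0.000003 between successive iterations before convergence is declared.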

After convergence a unique consensus configuration is found by referring the final centroid to principal axes; the corresponding latent roots may be saved using the ROOTS parameter. Final results for the consensus and individual configurations (referred to the same principal axes) may be printed using the centroid and individual settings of the PRINT option, and/or saved using the parameters XOUTPUT, CONSENSUS and ROTATIONS. By default, results are presented and saved for the maximum available dimensionality but the option NROOTS allows a reduced number of dimensions to be set. Analysis of variation for the M configurations (including the individual scaling factors) and for the N points, along with the initial within and between configurations sums of squares (WSS and BSS), the final residual sum of squares (RSS) and number of steps in the iteration process may be printed using the analysis setting of the PRINT option. The initial within-configuration sum of squares, final residual sum of squares and individual isotropic scaling factors may also be saved using, respectively, the WSS, RSS and SCALINGFACTOR parameters. (Note that the final results are still scaled by the original factor from the initial overall constraint; to return to the original scale all sums of squares need adjustment by a factor of WSS/M and configurations by the square root of that factor.)
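The return to the original scale described in the note can be sketched as follows. This assumes M = 3 input configurations held in a pointer X (as in the Example section); Cons, Rss, Wss, RssOrig and ConsOrig are hypothetical identifiers introduced for the illustration.

```genstat
"Sketch only: save the consensus and the sums of squares."
GENPROCRUSTES [PRINT=analysis; SCALING=isotropic] XINPUT=X;\
              CONSENSUS=Cons; RSS=Rss; WSS=Wss
"Sums of squares are adjusted by WSS/M; here M = 3 is assumed."
CALCULATE RssOrig = Rss * Wss / 3
"Configurations are adjusted by the square root of the same factor."
CALCULATE ConsOrig = Cons * SQRT(Wss / 3)
PRINT RssOrig
PRINT ConsOrig
```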

Independently of the choice of dimensionality for printing and saving, the NDROOTS option controls the dimensionality of the graphical output requested using the PLOT option (default 3). The consensus setting plots the consensus solution in the chosen dimensionality, and the individuals setting gives the individual final configurations as well as the consensus. The projections setting displays the projections (calculated from the individual rotation matrices scaled by the singular values from the consensus solution in principal axis form) as vectors labelled by configuration number and colour-coded for order of column. This projection plot can be particularly helpful in comparing the use of terms/attributes (columns of the configurations) by individual assessors in sensory analysis, both in conventional and free-choice profiling; see Arnold & Collins (1993) for further details.
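For instance, the consensus and projection plots could be requested in two dimensions with a call such as the sketch below, where X is again assumed to be a pointer to the input matrices as in the Example section.

```genstat
"Sketch only: two-dimensional consensus and projection plots."
GENPROCRUSTES [PRINT=analysis; PLOT=consensus,projections; NDROOTS=2]\
              XINPUT=X
```

Because NDROOTS is independent of NROOTS, the printed and saved results here still use all the available dimensions.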

Modifications to the method described above are given in TenBerge (1977), and may be invoked by the TenBerge setting of the METHOD option. This may give considerable savings in the time to reach convergence (Arnold 1988).
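One way to compare the two methods on the same data is to run the procedure twice with monitoring switched on and compare the number of steps reported. The sketch below assumes X is a pointer to the input matrices as in the Example section; no particular saving in iterations is claimed for any given data set.

```genstat
"Sketch only: same data analysed by each method, with monitoring."
GENPROCRUSTES [PRINT=analysis,monitoring; METHOD=Gower] XINPUT=X
GENPROCRUSTES [PRINT=analysis,monitoring; METHOD=TenBerge] XINPUT=X
```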

References

Arnold, G.M. (1988). Comparisons of algorithms for generalized Procrustes analyses. Genstat Newsletter, 22, 7-11.

Arnold, G.M. (1992). Scaling factors in generalized Procrustes analysis. Computational Statistics, Volume 1, Proceedings of the 10th Symposium on Computational Statistics, COMPSTAT, Neuchatel, Switzerland, August 1992, 61-66.

Arnold, G.M. & Collins, A.J. (1993). Interpretation of transformed axes in multivariate analysis. Applied Statistics, 42, 381-400.

Gower, J.C. (1975). Generalized Procrustes analysis. Psychometrika, 40, 33-51.

TenBerge, J.M.F. (1977). Orthogonal Procrustes rotation for two or more matrices. Psychometrika, 42, 267-276.

See also

Directives: ROTATE, FACROTATE.

Procedures: PCOPROCRUSTES, SAGRAPES.

Commands for: Multivariate and cluster analysis.

Example

CAPTION  'GENPROCRUSTES example',!t('Data from',\
         'Gower (1975), Psychometrika, 40, pages 33-51.',\ 
         'Note, however, that in Table 3 the scaling factors printed',\ 
         'were SQRT(ro[i]) instead of ro[i], and in Table 4 the',\ 
         'Between and Within Judges sums of squares were transposed.');\ 
         STYLE=meta,plain
MATRIX   [ROWS=9; COLUMNS=7] X[1...3]
READ     [SERIAL=yes] X[]
47 44 49 38 35 40 40
72 45 41 77 72 73 35
61 49 40 58 58 62 30
66 56 45 55 53 46 30
37 72 50 27 30 33 25
76 76 53 81 79 75 45
64 59 51 72 61 66 40
21 70 43 27 22 26 20
71 70 34 72 72 71 35 :
31 39 33 29 48 38 42
30 60 36 22 36 34 39
27 55 30 18 28 22 42
48 52 53 27 21 30 31
20 55 28 22 33 27 35
21 42 31 46 76 33 42
30 52 53 35 44 30 44
 5 57 53 12 13  6 31
55 63 53 77 79 57 49 :
43 46 44 22 53 44 29
53 79 75 79 73 52 27
22 85 83 19 27 17 22
28 89 78 13 29 20 24
75 86 85 34 75 55 38
53 79 82 72 78 74 38
15 85 85 46 75 52 35
 5 95 95  3 20  2 24
27 78 85 89 92 81 41 :
GENPROCRUSTES [PRINT=analysis,centroid,column,individual,monitoring;\ 
              SCALING=isotropic] XINPUT=X
Updated on March 7, 2019