Performs a generalized Procrustes analysis (G.M. Arnold & R.W. Payne).
|Printed output required (
||Type of scaling to use (
||Method to be used (
||Number of roots (i.e. dimensions) to print for the output configurations, consensus and rotation matrices, and number of dimensions to save with the
||Controls which graphs to display (
||Number of dimensions to display in the consensus and individuals plots; default 3|
||The algorithm is assumed to have converged when (last residual sum of squares) – (current residual sum of squares) <
||Limit on number of iterations; default 50|
||Each pointer points to a set of matrices holding the original input configurations|
||Each pointer points to a set of matrices to store a set of final (output) configurations|
||Stores the final consensus configuration from each analysis|
||Each pointer points to a set of matrices to store the rotations required to transform each set of
||Each pointer points to a set of matrices to store the distances of a set of scaled
||Stores the residual sum of squares from each analysis|
||Stores the latent roots from referring the centroid configuration to its principal axis form (consensus) for each analysis|
||Stores the initial within-configuration sum of squares from each analysis|
||Stores the isotropic scaling factors for configurations from each analysis|
||Each pointer points to a set of matrices to store a set of projection matrices|
An N × V matrix represents a configuration of N points in V dimensions. Given a set of M such matrices (XINPUT), a generalized Procrustes analysis iteratively matches them to a common centroid configuration by translation to a common origin, rotation/reflection of axes and, optionally, scale changes. This matching seeks to minimise the sum of the squared distances between the centroid and each individual configuration, summed over all points (that is, the Procrustes statistic for each configuration and the centroid, summed over all configurations). The final centroid is referred to principal axes to give a unique consensus configuration. Two methods of scaling are available (controlled by the SCALING option). Isotropic scaling, which scales all the dimensions of each configuration by an equal amount, takes place during the Procrustes analysis. The alternative is to scale each configuration prior to the analysis so that the trace of each matrix is one (see Arnold 1992). If this latter method is used, the subsequent residuals represent pure lack-of-fit and the scaling factors given in the results represent differences in relative size/spread of the original (centred) configurations, whereas for overall isotropic scaling the scaling factor contains components of both size and lack-of-fit.
GENPROCRUSTES carries out a generalized Procrustes analysis and has parameters for saving various results for future use (for example ROTATIONS, ROOTS, SCALINGFAC and PROJECTIONS). There are options for the methods to use for the matching (SCALING and METHOD), control of convergence (TOLERANCE and MAXCYCLE), and printing and plotting of results (PRINT, NROOTS, PLOT and NDROOTS).
Note that the special case of M=2 corresponds to the classical pairwise Procrustes matching (ROTATE directive), except that fitting each configuration to a common centroid removes the need to regard one of the initial configurations as fixed.
The default method used for generalized Procrustes analysis in GENPROCRUSTES is that described by Gower (1975). Each input configuration (XINPUT, referred to henceforth as Xi, i=1…M) is initially column-centred, with the individual column means for each configuration optionally printed (by including the column setting with the PRINT option). If separate scaling has been requested (option SCALING=separate), the matrices are also scaled to have trace one (see Arnold 1992). A constraint is required on the overall sum of squares to prevent the trivial solution in which all configurations collapse to the origin. In this procedure the constraint used is
∑ ( trace ( Xi′ Xi ) ) = M.
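The centring and scaling step can be illustrated with a minimal Python sketch (not Genstat code). It assumes that, without separate scaling, all configurations share one common factor chosen to satisfy the constraint, and that SCALING=separate gives each centred configuration a sum of squares of one; the function names and the toy data are hypothetical.

```python
from math import sqrt

def centre_columns(x):
    """Subtract each column mean from a matrix given as a list of rows."""
    n = len(x)
    means = [sum(col) / n for col in zip(*x)]
    return [[v - m for v, m in zip(row, means)] for row in x]

def sum_of_squares(x):
    """trace(X'X), i.e. the sum of all squared entries of X."""
    return sum(v * v for row in x for v in row)

def prepare(configs, separate=False):
    """Centre each configuration, then scale so that sum(trace(Xi'Xi)) = M."""
    m = len(configs)
    centred = [centre_columns(x) for x in configs]
    if separate:
        # Assumed reading of SCALING=separate: each configuration gets trace one.
        factors = [1.0 / sqrt(sum_of_squares(x)) for x in centred]
    else:
        # Assumed common factor: one multiplier for all configurations.
        wss = sum(sum_of_squares(x) for x in centred)
        factors = [sqrt(m / wss)] * m
    return [[[f * v for v in row] for row in x]
            for f, x in zip(factors, centred)]

# Hypothetical toy data: two configurations of three points in two dimensions.
configs = [
    [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]],
    [[1.0, 1.0], [5.0, 1.0], [1.0, 5.0]],
]
for separate in (False, True):
    scaled = prepare(configs, separate)
    total = sum(sum_of_squares(x) for x in scaled)
    print(separate, round(total, 10))
```

In both cases the printed total equals M (here 2) up to rounding; the two choices differ only in how that total is shared between the configurations.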
An initial estimate of the centroid is found from these centred and scaled configurations; firstly X2 is rotated to X1, with the rotated X2 saved as the new X2 and the centroid computed as the mean of X1 and the new X2; X3 is rotated to this centroid which is then recalculated as the mean of the three current configurations; and so on until all configurations Xi (i=1…M) have been included. The centroid thus found is taken as the initial centroid estimate Y, with the rotated values as the new Xi. The initial residual sum of squares Sr is calculated as
Sr = M × ( 1 – trace ( Y′ Y )).
Each of the current configurations Xi is then rotated to Y and the rotated position saved as the new Xi. The updated estimate of the centroid Yn is calculated as the mean of the new Xi (i=1…M) and the new residual sum of squares calculated as
Srn = M × ( 1 – trace ( Yn′ Yn )).
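A single rotation cycle can be sketched in Python with NumPy (again, not Genstat code). The rotation uses the standard orthogonal Procrustes solution from a singular value decomposition; the sequential construction of the initial centroid is skipped and a plain mean is used instead, and the data and function names are hypothetical.

```python
import numpy as np

def rotate_to(x, y):
    """Rotate/reflect x onto y: x @ H with H orthogonal minimising ||x H - y||^2."""
    u, _, vt = np.linalg.svd(x.T @ y)
    return x @ (u @ vt)

def centroid_and_rss(configs):
    """Mean configuration Y and M * (1 - trace(Y'Y)).

    The formula for the residual sum of squares relies on the constraint
    sum(trace(Xi'Xi)) = M holding for the configurations passed in.
    """
    m = len(configs)
    y = sum(configs) / m
    return y, m * (1.0 - np.trace(y.T @ y))

# Hypothetical data: three noisy, rotated copies of one 6-point configuration.
rng = np.random.default_rng(0)
base = rng.normal(size=(6, 2))
configs = []
for angle in (0.0, 0.7, 2.0):
    c, s = np.cos(angle), np.sin(angle)
    x = base @ np.array([[c, -s], [s, c]]) + 0.05 * rng.normal(size=base.shape)
    configs.append(x - x.mean(axis=0))          # column-centre
scale = np.sqrt(len(configs) / sum(np.sum(x * x) for x in configs))
configs = [scale * x for x in configs]          # impose the constraint

y, rss = centroid_and_rss(configs)
direct = sum(np.sum((x - y) ** 2) for x in configs)   # squared distances to Y

rotated = [rotate_to(x, y) for x in configs]    # one cycle of rotations to Y
y_new, rss_new = centroid_and_rss(rotated)
print(np.isclose(direct, rss), rss_new <= rss)
```

The comparison with `direct` shows why the trace formula can be used: under the constraint, the summed squared distances to the centroid reduce to M × (1 – trace(Y′Y)). Rotations leave trace(Xi′Xi) unchanged, so the same formula applies after each cycle.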
If isotropic scaling has been requested (option SCALING=isotropic), new estimates roi′ of the individual scaling factors roi (originally set to 1) are now found by
roi′/roi= √( trace( Xi′Yn )/( trace( Xi′Xi ) × trace( Yn′Yn )))
and each Xi is updated by a factor of roi′/roi. The centroid is then recalculated as the mean of the new Xi and the new residual sum of squares calculated in a similar manner to before. If the change in the residual sum of squares Sr is less than a preset tolerance (controlled by the TOLERANCE option) the algorithm is taken to have converged. If not, the process is repeated until the tolerance is reached, up to a maximum number of iterations set by the MAXCYCLE option (default 50), after which a message of non-convergence is printed and the procedure terminates. Monitoring information about convergence can be printed by including the monitoring setting with the PRINT option.
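The iteration as a whole, including the optional isotropic scaling update, might look like the following Python/NumPy sketch. It is an illustration under stated assumptions, not the procedure's source: the starting centroid is a plain mean rather than the sequential construction described above, the tolerance value of 1e-5 is a placeholder (the procedure's default is not given here), and non-convergence is reported through a returned flag instead of a message.

```python
import numpy as np

def rotate_to(x, y):
    """Orthogonal Procrustes rotation/reflection of x onto y."""
    u, _, vt = np.linalg.svd(x.T @ y)
    return x @ (u @ vt)

def gpa(configs, isotropic=False, tolerance=1e-5, maxcycle=50):
    """Sketch of the iteration; tolerance is a hypothetical placeholder value."""
    m = len(configs)
    xs = [x - x.mean(axis=0) for x in configs]            # column-centre
    scale = np.sqrt(m / sum(np.sum(x * x) for x in xs))
    xs = [scale * x for x in xs]                          # sum trace(Xi'Xi) = M
    ro = np.ones(m)                                       # scaling factors
    y = sum(xs) / m                                       # simplified start
    rss = m * (1.0 - np.trace(y.T @ y))
    converged = False
    for cycle in range(1, maxcycle + 1):
        xs = [rotate_to(x, y) for x in xs]                # rotate each Xi to Y
        y = sum(xs) / m
        if isotropic:
            ty = np.trace(y.T @ y)
            ratios = np.array([
                np.sqrt(np.trace(x.T @ y) / (np.trace(x.T @ x) * ty))
                for x in xs
            ])
            xs = [r * x for r, x in zip(ratios, xs)]      # update by roi'/roi
            ro = ro * ratios
            y = sum(xs) / m
        new_rss = m * (1.0 - np.trace(y.T @ y))
        converged = (rss - new_rss) < tolerance
        rss = new_rss
        if converged:
            break
    return xs, y, rss, ro, cycle, converged

# Hypothetical data: one shape at three rotations and three sizes, plus noise.
rng = np.random.default_rng(0)
base = rng.normal(size=(6, 2))
configs = []
for angle, size in ((0.0, 1.0), (0.7, 2.0), (2.0, 0.5)):
    c, s = np.cos(angle), np.sin(angle)
    rot = np.array([[c, -s], [s, c]])
    configs.append(size * (base @ rot) + 0.05 * rng.normal(size=base.shape))

xs, y, rss, ro, cycles, converged = gpa(configs, isotropic=True)
print(cycles, converged, rss, ro)
```

The scaling update preserves the overall constraint: after the update, trace(Xi′Xi) becomes trace(Xi′Yn)/trace(Yn′Yn), and these sum to M because the Xi sum to M × Yn.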
After convergence a unique consensus configuration is found by referring the final centroid to principal axes; the corresponding latent roots may be saved using the ROOTS parameter. Final results for the consensus and individual configurations (referred to the same principal axes) may be printed using the centroid and individual settings of the PRINT option, and saved, together with the rotation matrices (ROTATIONS), using the corresponding parameters. By default, results are presented and saved for the maximum available dimensionality, but the NROOTS option allows a reduced number of dimensions to be set. Analysis of variation for the M configurations (including the individual scaling factors) and for the N points, along with the initial within- and between-configuration sums of squares (WSS and BSS), the final residual sum of squares (RSS) and the number of steps in the iteration process, may be printed using the analysis setting of the PRINT option; the corresponding values may be saved using parameters such as SCALINGFAC. (Note that the final results are still scaled by the original factor from the initial overall constraint; to return to the original scale, all sums of squares need adjustment by a factor of WSS/M and configurations by the square root of that factor.)
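Referring the centroid to principal axes can be sketched in Python/NumPy as follows. The sketch assumes the latent roots are the eigenvalues of Y′Y (the squared singular values of the centred centroid Y) and that the individual configurations are rotated by the same axes; the function name and data are hypothetical.

```python
import numpy as np

def to_principal_axes(centroid, configs, nroots=None):
    """Rotate the centroid and all configurations to the centroid's principal axes."""
    # SVD of the centroid: Y = U S V'; the columns of V are the principal axes.
    _, s, vt = np.linalg.svd(centroid, full_matrices=False)
    axes = vt.T[:, :nroots]              # nroots=None keeps every dimension
    consensus = centroid @ axes
    rotated = [x @ axes for x in configs]
    return consensus, rotated, (s ** 2)[:nroots]

# Hypothetical centred configurations standing in for the final Xi.
rng = np.random.default_rng(1)
configs = [rng.normal(size=(5, 3)) for _ in range(3)]
configs = [x - x.mean(axis=0) for x in configs]
centroid = sum(configs) / len(configs)

consensus, rotated, roots = to_principal_axes(centroid, configs)
cross = consensus.T @ consensus          # diagonal, with the roots on the diagonal
print(np.round(roots, 4))
```

Because the same orthogonal matrix is applied to every configuration, distances between the consensus and the individual configurations are unchanged, and the roots sum to trace(Y′Y). Passing a smaller `nroots` keeps only the leading dimensions, in the spirit of the NROOTS option.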
Independently of the choice of dimensionality for printing and saving, the NDROOTS option controls the dimensionality of the graphical output requested using the PLOT option (default 3). The consensus setting plots the consensus solution in the chosen dimensionality, and the individual setting gives the individual final configurations as well as the consensus. The projection setting displays the projections (calculated from the individual rotation matrices scaled by the singular values from the consensus solution in principal axis form) as vectors labelled by configuration number and colour-coded for order of column. This projection plot can be particularly helpful in comparing the use of terms/attributes (columns of the configurations) by individual assessors in sensory analysis, both in conventional and free-choice profiling; see Arnold & Collins (1993) for further details.
Modifications to the method described above are given in TenBerge (1977), and may be invoked by the TenBerge setting of the METHOD option. This may give considerable savings in the time to reach convergence (Arnold 1988).
Arnold, G.M. (1988). Comparisons of algorithms for generalized Procrustes analyses. Genstat Newsletter, 22, 7-11.
Arnold, G.M. (1992). Scaling factors in generalized Procrustes analysis. Computational Statistics, Volume 1, Proceedings of the 10th Symposium on Computational Statistics, COMPSTAT, Neuchatel, Switzerland, August 1992, 61-66.
Arnold, G.M. & Collins, A.J. (1993). Interpretation of transformed axes in multivariate analysis. Applied Statistics, 42, 381-400.
Gower, J.C. (1975). Generalized Procrustes analysis. Psychometrika, 40, 33-51.
TenBerge, J.M.F. (1977). Orthogonal Procrustes rotation for two or more matrices. Psychometrika, 42, 267-276.
Commands for: Multivariate and cluster analysis.
CAPTION 'GENPROCRUSTES example',!t('Data from',\
  'Gower (1975), Psychometrika, 40, pages 33-51.',\
  'Note, however, that in Table 3 the scaling factors printed',\
  'were SQRT(ro[i]) instead of ro[i], and in Table 4 the',\
  'Between and Within Judges sums of squares were transposed.');\
  STYLE=meta,plain
MATRIX [ROWS=9; COLUMNS=7] X[1...3]
READ [SERIAL=yes] X
47 44 49 38 35 40 40
72 45 41 77 72 73 35
61 49 40 58 58 62 30
66 56 45 55 53 46 30
37 72 50 27 30 33 25
76 76 53 81 79 75 45
64 59 51 72 61 66 40
21 70 43 27 22 26 20
71 70 34 72 72 71 35 :
31 39 33 29 48 38 42
30 60 36 22 36 34 39
27 55 30 18 28 22 42
48 52 53 27 21 30 31
20 55 28 22 33 27 35
21 42 31 46 76 33 42
30 52 53 35 44 30 44
 5 57 53 12 13  6 31
55 63 53 77 79 57 49 :
43 46 44 22 53 44 29
53 79 75 79 73 52 27
22 85 83 19 27 17 22
28 89 78 13 29 20 24
75 86 85 34 75 55 38
53 79 82 72 78 74 38
15 85 85 46 75 52 35
 5 95 95  3 20  2 24
27 78 85 89 92 81 41 :
GENPROCRUSTES [PRINT=analysis,centroid,column,individual,monitoring;\
  SCALING=isotropic] XINPUT=X