Performs a generalized Procrustes analysis (G.M. Arnold & R.W. Payne).
Options
PRINT = string tokens | Printed output required (analysis, centroid, column, individual, monitoring); default anal, cent |
---|---|
SCALING = string token | Type of scaling to use (none, isotropic, separate); default none |
METHOD = string token | Method to be used (Gower, TenBerge); default Gowe |
NROOTS = scalar | Number of roots (i.e. dimensions) to print for the output configurations, consensus and rotation matrices, and number of dimensions to save with the XOUTPUT, CONSENSUS and ROTATIONS parameters if their matrices have not already been defined; default is to print and save all the dimensions |
PLOT = string tokens | Controls which graphs to display (consensus, individuals, projections); default * i.e. none |
NDROOTS = scalar | Number of dimensions to display in the consensus and individuals plots; default 3 |
TOLERANCE = scalar | The algorithm is assumed to have converged when (last residual sum of squares) – (current residual sum of squares) < TOLERANCE × (number of configurations); default 0.00001 |
MAXCYCLE = scalar | Limit on number of iterations; default 50 |
Parameters
XINPUT = pointers | Each pointer points to a set of matrices holding the original input configurations |
---|---|
XOUTPUT = pointers | Each pointer points to a set of matrices to store a set of final (output) configurations |
CONSENSUS = matrices | Stores the final consensus configuration from each analysis |
ROTATIONS = pointers | Each pointer points to a set of matrices to store the rotations required to transform each set of XINPUT configurations to their final (scaled) XOUTPUT configurations |
RESIDUALS = pointers | Each pointer points to a set of matrices to store the distances of a set of scaled XINPUT configurations from its consensus |
RSS = scalars | Stores the residual sum of squares from each analysis |
ROOTS = diagonal matrices | Stores the latent roots from referring the centroid configuration to its principal axis form (consensus) for each analysis |
WSS = scalars | Stores the initial within-configuration sum of squares from each analysis |
SCALINGFACTOR = variates | Stores the isotropic scaling factors for configurations from each analysis |
PROJECTIONS = pointers | Each pointer points to a set of matrices to store a set of projection matrices |
Description
An N × V matrix represents a configuration of N points in V dimensions. Given a set of M such matrices (XINPUT), a generalized Procrustes analysis iteratively matches them to a common centroid configuration by the operations of translation to a common origin, rotation/reflection of axes and possibly also scale changes. This matching seeks to minimise the sum of the squared distances between the centroid and each individual configuration summed over all points (the Procrustes statistic for each configuration and the centroid, summed over all configurations). The final centroid is referred to principal axes to give a unique consensus configuration. Two methods of scaling are available (controlled by the SCALING option). Isotropic scaling, which scales all the dimensions of each configuration by an equal amount, takes place during the Procrustes analysis. The alternative is to scale each configuration prior to the analysis so that the trace of each matrix is one (see Arnold 1992). If this latter method is used, the subsequent residuals represent pure lack-of-fit and the scaling factors given in the results represent differences in relative size/spread of the original (centred) configurations, whereas for overall isotropic scaling the scaling factor contains components of both size and lack-of-fit.
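The quantity being minimised can be written down directly. The following is a minimal Python/NumPy sketch, not part of GENPROCRUSTES itself; the function names `centre_columns` and `procrustes_criterion` and the random test data are hypothetical and chosen only for illustration.

```python
import numpy as np

def centre_columns(x):
    """Translate a configuration so that every column has mean zero."""
    x = np.asarray(x, dtype=float)
    return x - x.mean(axis=0)

def procrustes_criterion(configs):
    """Squared distances from each configuration to the centroid,
    summed over all points and all configurations."""
    centroid = np.mean(configs, axis=0)
    return float(sum(np.sum((x - centroid) ** 2) for x in configs))

# Hypothetical data: M=3 noisy copies of N=4 points in V=2 dimensions.
rng = np.random.default_rng(0)
base = rng.normal(size=(4, 2))
configs = [centre_columns(base + 0.1 * rng.normal(size=(4, 2))) for _ in range(3)]
print(procrustes_criterion(configs))
```

Centring removes any difference in origin, so two configurations that differ only by a translation give a criterion of zero once both are centred.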
Procedure GENPROCRUSTES carries out a generalized Procrustes analysis and has parameters for saving various results for future use (XOUTPUT, CONSENSUS, ROTATIONS, RESIDUALS, RSS, ROOTS, WSS, SCALINGFACTOR, PROJECTIONS). There are options for different methods to use for the matching (SCALING, METHOD), control of convergence (TOLERANCE, MAXCYCLE) and printing and plotting of results (PRINT, PLOT, NROOTS and NDROOTS).
Note that the special case of M=2 corresponds to the classical pairwise Procrustes matching (ROTATE directive) except that by fitting each configuration to a common centroid the requirement to regard one of the initial configurations as fixed is obviated.
Options: PRINT, SCALING, METHOD, NROOTS, PLOT, NDROOTS, TOLERANCE, MAXCYCLE.
Parameters: XINPUT, XOUTPUT, CONSENSUS, ROTATIONS, RESIDUALS, RSS, ROOTS, WSS, SCALINGFACTOR, PROJECTIONS.
Method
The default method used for generalized Procrustes analysis in GENPROCRUSTES is that described by Gower (1975). Each input configuration (XINPUT – referred to henceforth as Xi, i=1…M) is initially column-centred, with the individual column means for each configuration optionally printed (by including the column setting with the PRINT option). If separate scaling is requested (option SCALING=separate), the matrices are also scaled to have trace one (see Arnold 1992). A constraint is required on the overall sum of squares to prevent the trivial solution of matching by all configurations collapsing to the origin. In this procedure the constraint used is
∑ ( trace ( Xi′ Xi ) ) = M.
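The centring and scaling step can be sketched in Python/NumPy as follows. This is an illustration under stated assumptions rather than the procedure's own code: the function name `prepare` and its `separate` argument are hypothetical, trace(Xi′Xi) is computed as the sum of squared elements of Xi, and no configuration is assumed to be identically zero after centring.

```python
import numpy as np

def prepare(configs, separate=False):
    """Column-centre each configuration, then rescale so that the sum of
    trace(Xi'Xi) over all M configurations equals M."""
    centred = [np.asarray(x, dtype=float) - np.mean(x, axis=0) for x in configs]
    if separate:
        # Separate scaling: every configuration gets trace(Xi'Xi) = 1,
        # which satisfies the overall constraint automatically.
        return [x / np.sqrt(np.sum(x ** 2)) for x in centred]
    # Otherwise apply one common factor, leaving relative sizes unchanged.
    total = sum(np.sum(x ** 2) for x in centred)
    factor = np.sqrt(len(centred) / total)
    return [x * factor for x in centred]
```

With the common factor, differences in size between configurations survive into the analysis; with separate scaling they are removed beforehand.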
An initial estimate of the centroid is found from these centred and scaled configurations; firstly X2 is rotated to X1, with the rotated X2 saved as the new X2 and the centroid computed as the mean of X1 and the new X2; X3 is rotated to this centroid which is then recalculated as the mean of the three current configurations; and so on until all configurations Xi (i=1…M) have been included. The centroid thus found is taken as the initial centroid estimate Y, with the rotated values as the new Xi. The initial residual sum of squares Sr is calculated as
Sr = M × ( 1 – trace ( Y′ Y )).
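A minimal Python/NumPy sketch of this initial estimate is given below. It assumes that each rotation is the least-squares orthogonal Procrustes fit obtained from a singular value decomposition (a standard choice, not confirmed by the text above as the procedure's internal method), and the names `rotate_to`, `initial_centroid` and `residual_ss` are hypothetical.

```python
import numpy as np

def rotate_to(x, target):
    """Post-multiply x by the orthogonal matrix (rotation or reflection)
    that brings it closest to target in the least-squares sense."""
    u, _, vt = np.linalg.svd(x.T @ target)
    return x @ (u @ vt)

def initial_centroid(configs):
    """Add configurations one at a time, rotating each to the running
    centroid and then recomputing the centroid as the mean so far."""
    rotated = [np.asarray(configs[0], dtype=float)]
    centroid = rotated[0]
    for x in configs[1:]:
        rotated.append(rotate_to(np.asarray(x, dtype=float), centroid))
        centroid = np.mean(rotated, axis=0)
    return rotated, centroid

def residual_ss(configs, centroid):
    """Sr = M * (1 - trace(Y'Y)); valid only when the sum of
    trace(Xi'Xi) equals M and Y is the mean of the Xi."""
    return len(configs) * (1.0 - np.sum(centroid ** 2))
```

The shortcut for Sr follows from expanding the sum of squared distances: it equals the sum of trace(Xi′Xi), which is M under the constraint, minus M × trace(Y′Y).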
Each of the current configurations Xi is then rotated to Y and the rotated position saved as the new Xi. The updated estimate of the centroid Yn is calculated as the mean of the new Xi (i=1…M) and the new residual sum of squares calculated as
Srn = M × ( 1 – trace ( Yn′ Yn )).
If isotropic scaling has been requested (option SCALING=isotropic) new estimates roi′ of the individual scaling factors roi (originally set to 1) are now found by
roi′/roi= √( trace( Xi′Yn )/( trace( Xi′Xi ) × trace( Yn′Yn )))
and each Xi is updated by a factor of roi′/roi. The centroid is then recalculated as the mean of the new Xi and the new residual sum of squares calculated in a similar manner to before. If the change in residual Sr is less than a preset tolerance (controlled by option TOLERANCE) the algorithm is taken to have converged. If not, the process is repeated until the tolerance is reached, up to a maximum number of iterations as set by the option MAXCYCLE (default 50) after which a message of non-convergence is printed and the procedure terminated. Monitoring information about convergence can be printed by including the monitoring setting with the PRINT option.
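The iteration described above can be sketched in Python/NumPy as follows. This is an illustrative reading of the text, not the GENPROCRUSTES source: the names `rotate_to` and `gpa_iterate` are hypothetical, rotations are assumed to be SVD-based orthogonal Procrustes fits, the configurations passed in are assumed to be already centred and scaled to satisfy the overall constraint, and for brevity the starting centroid is simply their mean rather than the sequential initial estimate.

```python
import numpy as np

def rotate_to(x, target):
    """Orthogonal Procrustes fit: rotate/reflect x to best match target."""
    u, _, vt = np.linalg.svd(x.T @ target)
    return x @ (u @ vt)

def gpa_iterate(configs, isotropic=False, tolerance=1e-5, maxcycle=50):
    """Rotate, optionally rescale, and re-average until the drop in
    residual sum of squares is below tolerance * M or maxcycle is hit."""
    m = len(configs)
    xs = [np.asarray(x, dtype=float) for x in configs]
    rho = np.ones(m)                       # scaling factors, start at 1
    y = np.mean(xs, axis=0)
    last = m * (1.0 - np.sum(y ** 2))      # Sr under the constraint
    for cycle in range(1, maxcycle + 1):
        xs = [rotate_to(x, y) for x in xs]
        y = np.mean(xs, axis=0)
        if isotropic:
            # Ratio of new to old scaling factor for each configuration;
            # trace(A'B) is the sum of elementwise products of A and B.
            ratios = np.array([
                np.sqrt(np.sum(x * y) / (np.sum(x ** 2) * np.sum(y ** 2)))
                for x in xs
            ])
            xs = [r * x for r, x in zip(ratios, xs)]
            rho = rho * ratios
            y = np.mean(xs, axis=0)
        current = m * (1.0 - np.sum(y ** 2))
        if last - current < tolerance * m:
            return xs, y, current, rho, cycle, True
        last = current
    return xs, y, last, rho, maxcycle, False   # not converged
```

The final element of the returned tuple flags convergence, mirroring the non-convergence message described above. The scaling update keeps the sum of trace(Xi′Xi) equal to M, which is why the same expression for the residual sum of squares can be reused after rescaling.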
After convergence a unique consensus configuration is found by referring the final centroid to principal axes; the corresponding latent roots may be saved using the ROOTS parameter. Final results for the consensus and individual configurations (referred to the same principal axes) may be printed using the centroid and individual settings of the PRINT option, and/or saved using the parameters XOUTPUT, CONSENSUS and ROTATIONS. By default, results are presented and saved for the maximum available dimensionality but the option NROOTS allows a reduced number of dimensions to be set. Analysis of variation for the M configurations (including the individual scaling factors) and for the N points, along with the initial within and between configurations sums of squares (WSS and BSS), the final residual sum of squares (RSS) and number of steps in the iteration process may be printed using the analysis setting of the PRINT option. The initial within-configuration sum of squares, final residual sum of squares and individual isotropic scaling factors may also be saved using, respectively, the WSS, RSS and SCALINGFACTOR parameters. (Note that the final results are still scaled by the original factor from the initial overall constraint; to return to the original scale all sums of squares need adjustment by a factor of WSS/M and configurations by the square root of that factor).
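Referring the centroid to principal axes, and undoing the overall constraint, might look like the following Python/NumPy sketch. It assumes the latent roots are the eigenvalues of Y′Y (the squared singular values of the centroid Y), which the text above does not state explicitly; the names `to_principal_axes` and `to_original_scale` are hypothetical.

```python
import numpy as np

def to_principal_axes(centroid, configs):
    """Rotate the centroid, and every configuration with it, onto the
    principal axes of the centroid."""
    _, s, vt = np.linalg.svd(centroid, full_matrices=False)
    axes = vt.T                 # columns are the principal directions
    roots = s ** 2              # assumed latent roots, largest first
    return centroid @ axes, [x @ axes for x in configs], roots

def to_original_scale(value, wss, m, is_sum_of_squares):
    """Undo the overall constraint: sums of squares are multiplied by
    WSS/M, coordinates by the square root of that factor."""
    factor = wss / m
    return value * factor if is_sum_of_squares else value * np.sqrt(factor)
```

Because the same rotation is applied to the centroid and to each configuration, distances between them are unchanged. The sign of each principal axis is arbitrary, so results may differ from other software by reflections of individual axes.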
Independently of the choice of dimensionality for printing and saving, the NDROOTS option controls the dimensionality of the graphical output requested using the PLOT option (default 3). The consensus setting plots the consensus solution in the chosen dimensionality, and the individuals setting gives the individual final configurations as well as the consensus. The projections setting displays the projections (calculated from the individual rotation matrices scaled by the singular values from the consensus solution in principal axis form) as vectors labelled by configuration number and colour-coded for order of column. This projection plot can be particularly helpful in comparing the use of terms/attributes (columns of the configurations) by individual assessors in sensory analysis, both in conventional and free-choice profiling; see Arnold & Collins (1993) for further details.
Modifications to the method described above are given in TenBerge (1977), and may be invoked by the TenBerge setting of the METHOD option. This may give considerable savings in the time to reach convergence (Arnold 1988).
References
Arnold, G.M. (1988). Comparisons of algorithms for generalized Procrustes analyses. Genstat Newsletter, 22, 7-11.
Arnold, G.M. (1992). Scaling factors in generalized Procrustes analysis. Computational Statistics, Volume 1, Proceedings of the 10th Symposium on Computational Statistics, COMPSTAT, Neuchatel, Switzerland, August 1992, 61-66.
Arnold, G.M. & Collins, A.J. (1993). Interpretation of transformed axes in multivariate analysis. Applied Statistics, 42, 381-400.
Gower, J.C. (1975). Generalized Procrustes analysis. Psychometrika, 40, 33-51.
TenBerge, J.M.F. (1977). Orthogonal Procrustes rotation for two or more matrices. Psychometrika, 42, 267-276.
See also
Directives: ROTATE, FACROTATE.
Procedures: PCOPROCRUSTES, SAGRAPES.
Commands for: Multivariate and cluster analysis.
Example
CAPTION 'GENPROCRUSTES example',!t('Data from',\
        'Gower (1975), Psychometrika, 40, pages 33-51.',\
        'Note, however, that in Table 3 the scaling factors printed',\
        'were SQRT(ro[i]) instead of ro[i], and in Table 4 the',\
        'Between and Within Judges sums of squares were transposed.');\
        STYLE=meta,plain
MATRIX  [ROWS=9; COLUMNS=7] X[1...3]
READ    [SERIAL=yes] X[]
47 44 49 38 35 40 40
72 45 41 77 72 73 35
61 49 40 58 58 62 30
66 56 45 55 53 46 30
37 72 50 27 30 33 25
76 76 53 81 79 75 45
64 59 51 72 61 66 40
21 70 43 27 22 26 20
71 70 34 72 72 71 35 :
31 39 33 29 48 38 42
30 60 36 22 36 34 39
27 55 30 18 28 22 42
48 52 53 27 21 30 31
20 55 28 22 33 27 35
21 42 31 46 76 33 42
30 52 53 35 44 30 44
 5 57 53 12 13  6 31
55 63 53 77 79 57 49 :
43 46 44 22 53 44 29
53 79 75 79 73 52 27
22 85 83 19 27 17 22
28 89 78 13 29 20 24
75 86 85 34 75 55 38
53 79 82 72 78 74 38
15 85 85 46 75 52 35
 5 95 95  3 20  2 24
27 78 85 89 92 81 41 :
GENPROCRUSTES [PRINT=analysis,centroid,column,individual,monitoring;\
        SCALING=isotropic] XINPUT=X