Plots results from correspondence analysis or multiple correspondence analysis (A.I. Glaser).
Options
| DIMENSIONS = scalars | Two numbers specifying which axes of the ordinations to plot; default 1,2 |
|---|---|
| PLOT = string tokens | Which scores to plot (rowscores, rowactive, rowpassive, colscores, colactive, colpassive); default rows, cols for correspondence analysis and cols for multiple correspondence analysis |
| ROWSCALING = string token | Scaling to use for row coordinates (principal, standard, mass, sqrtmass); default prin |
| COLSCALING = string token | Scaling to use for column coordinates (principal, standard, mass, sqrtmass); default prin |
| COLOURMETHOD = string tokens | Whether colour of symbol should show level of inertia of rows or columns (rowinertia, colinertia); default * |
| SIZEMETHOD = string tokens | Whether size of symbol should show row or column masses (rowmass, colmass); default * |
| FACCOLOURS = text, variate or scalar | Specifies a colour or colours for the factors in a multiple correspondence analysis; if this is unset, a different colour is selected automatically for every factor |
| WINDOW = scalar | Which graphical window to use; default 1 |
| KEYWINDOW = scalar | Graphical window for the key |
| SAVE = pointer | Supplies results from an analysis by CORANALYSIS or MCORANALYSIS; default uses the most recent analysis |
Parameters
| TITLE = texts | Titles for the plot |
|---|---|
| LMROWVARIABLES = string tokens | How to label the row scores (identifiers, labels, none, numbers); default labe if LROWVARIABLES is set, otherwise iden |
| LMCOLVARIABLES = string tokens | How to label the column scores (identifiers, labels, none, numbers); default labe if LCOLVARIABLES is set, otherwise iden |
| LROWVARIABLES = texts | Labels for row variables |
| LCOLVARIABLES = texts | Labels for column variables |
Description
CABIPLOT provides a graphical representation of results from CORANALYSIS or MCORANALYSIS. By default CABIPLOT plots both sets of scores (rowscores, colscores) for correspondence analysis or just the column scores for multiple correspondence analysis, but you can set option PLOT to select which ones are required. For correspondence analysis, you can also select settings that will plot only active or passive scores (see CORANALYSIS for further explanation).
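As an illustration of the PLOT option, the following minimal sketch reuses the smoking data from the Example section and plots one set of scores at a time. It assumes only the option settings listed above.

```genstat
" Sketch only: smoking data as used in the Example section of this page "
TEXT    Staff; VALUES=!T(Sen_Mngr,Jun_Mngr,Sen_Empl,Jun_Empl,Secretary)
&       Smoke; VALUES=!T(None,Light,Medium,Heavy)
MATRIX  [ROWS=Staff; COLUMNS=Smoke; VALUES=\
        4,2,3,2, 4,3,7,4, 25,10,12,4, 18,24,33,13, 10,6,7,2] Smoking
CORANALYSIS Smoking
" Plot the row scores only "
CABIPLOT [PLOT=rowscores]
" Plot the column scores only "
CABIPLOT [PLOT=colscores]
```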
The row scores are plotted as blue circles, while the column scores are plotted as red squares; active scores have filled symbols, but passive scores are not filled. With multiple correspondence analysis, the FACCOLOURS option can be used to define the colour to use for each factor, using either RGB values (in a variate or scalar) or the standard Genstat colour names (in a text); see PEN for more details. If insufficient colours are specified, CABIPLOT will recycle the list. So you can set FACCOLOURS to a scalar or to a text with a single string if you want to use the same colour for all the factors. If FACCOLOURS is not set, CABIPLOT will select a different colour for each factor automatically.
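A possible use of FACCOLOURS is sketched below, based on the multiple correspondence analysis from the Example section. The colour names chosen here (red, blue, green, black) are assumed to be among the standard Genstat colour names described under PEN.

```genstat
" Sketch only: data file and factors as in the Example section "
SPLOAD  FILE='%gendir%/examples/MCOR-1.GSH'
POINTER [VALUES=Q1,Q2,Q3,Q4] women
MCORANALYSIS [COLMETHOD=indicator] women
" One colour per factor (four factors, four colours) "
CABIPLOT [PLOT=colscores; FACCOLOURS=!t(red,blue,green,black)]
" A text with a single string is recycled, giving the same colour
  for every factor "
CABIPLOT [PLOT=colscores; FACCOLOURS='blue']
```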
The ROWSCALING and COLSCALING options define the scaling to use for the row and column coordinates respectively, with settings:
| principal | plots principal coordinates (default), |
|---|---|
| standard | plots standard coordinates, |
| mass | plots standard coordinates multiplied by the row (or column) mass, |
| sqrtmass | plots standard coordinates multiplied by the square root of the row (or column) mass. |
These are based on the row and column scores obtained from CORANALYSIS or MCORANALYSIS. Principal coordinates are scaled so that they have inertia equal to the square of the singular values, whereas the weighted sum-of-squares of the standard coordinates is equal to one. At least one of ROWSCALING or COLSCALING must be set to principal, which is the default for both options. These default settings produce a plot that is not a biplot, but that is used very often to illustrate relationships between and amongst variables. The reasoning behind multiplying the standard coordinates by the corresponding mass or its square root is to “pull” the rarer categories to be closer to the origin; see Chapter 13 of Greenacre (2007).
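The scaling options might be combined as in the sketch below, which again uses the smoking data from the Example section. Each call keeps one of the two options at principal, as the rule above requires.

```genstat
" Sketch only: smoking data as used in the Example section of this page "
TEXT    Staff; VALUES=!T(Sen_Mngr,Jun_Mngr,Sen_Empl,Jun_Empl,Secretary)
&       Smoke; VALUES=!T(None,Light,Medium,Heavy)
MATRIX  [ROWS=Staff; COLUMNS=Smoke; VALUES=\
        4,2,3,2, 4,3,7,4, 25,10,12,4, 18,24,33,13, 10,6,7,2] Smoking
CORANALYSIS Smoking
" Rows in principal coordinates, columns in standard coordinates "
CABIPLOT [ROWSCALING=principal; COLSCALING=standard]
" Columns in principal coordinates, rows in standard coordinates
  multiplied by the square root of the row mass "
CABIPLOT [ROWSCALING=sqrtmass; COLSCALING=principal]
```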
The COLOURMETHOD option has settings rowinertia and colinertia that plot the row or column scores, respectively, at different levels of shading; the coordinates with higher inertias are plotted with darker colours than those with low inertias. The shading is proportional to the square root of the inertia relative to the row or column with the highest inertia. Symbols representing passive points will appear completely transparent on the plot, as they are treated as having zero inertia.
The SIZEMETHOD option similarly has settings rowmass and colmass that plot the row and column coordinates, respectively, in sizes that depend on the row and column mass. The sizes of the symbols are proportional to the square root of the mass relative to the square root of the highest row or column mass, plus a constant to ensure all symbols are visible.
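The two options can be set together, as in this minimal sketch with the smoking data from the Example section. Both options take string tokens, so the row and column settings are given as a list.

```genstat
" Sketch only: smoking data as used in the Example section of this page "
TEXT    Staff; VALUES=!T(Sen_Mngr,Jun_Mngr,Sen_Empl,Jun_Empl,Secretary)
&       Smoke; VALUES=!T(None,Light,Medium,Heavy)
MATRIX  [ROWS=Staff; COLUMNS=Smoke; VALUES=\
        4,2,3,2, 4,3,7,4, 25,10,12,4, 18,24,33,13, 10,6,7,2] Smoking
CORANALYSIS Smoking
" Shade symbols by inertia and size them by mass, for rows and columns "
CABIPLOT [COLOURMETHOD=rowinertia,colinertia; SIZEMETHOD=rowmass,colmass]
```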
By default the first two dimensions are plotted, but you can specify other dimensions to be plotted using the DIMENSIONS option.
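For example, the first and third dimensions could be requested as sketched below. This assumes that the analysis of the 5 by 4 smoking table from the Example section has retained at least three dimensions, which this page does not confirm.

```genstat
" Sketch only: smoking data as used in the Example section of this page "
TEXT    Staff; VALUES=!T(Sen_Mngr,Jun_Mngr,Sen_Empl,Jun_Empl,Secretary)
&       Smoke; VALUES=!T(None,Light,Medium,Heavy)
MATRIX  [ROWS=Staff; COLUMNS=Smoke; VALUES=\
        4,2,3,2, 4,3,7,4, 25,10,12,4, 18,24,33,13, 10,6,7,2] Smoking
CORANALYSIS Smoking
" Plot dimension 1 against dimension 3 (assumes three dimensions exist) "
CABIPLOT [DIMENSIONS=1,3]
```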
The data used in MCORANALYSIS
may have many repeated values (particularly in survey data). To avoid replotting the same points in a large data set (i.e. with more than 500 units), only one point is plotted and the label refers to the first point in the data set. If the COLOURMETHOD
or SIZEMETHOD
options are set, these will use the mass and/or inertia of the labelled point.
The labels for the row and column scores can be set using the LMROWVARIABLES and LMCOLVARIABLES parameters, by selecting one of the following settings:
| identifiers | uses the identifiers of the row or column scores, |
|---|---|
| labels | expects labels to be supplied (in a text) using the LROWVARIABLES or LCOLVARIABLES parameter, |
| none | gives no labels, and |
| numbers | uses the row or column numbers of the original matrix. |
The default for both parameters is identifiers, unless LROWVARIABLES or LCOLVARIABLES is set, when the corresponding default becomes labels. Note that the texts supplied by LROWVARIABLES or LCOLVARIABLES must have the same number of values as the number of rows or columns in the original data matrix, even if active or passive points are being omitted from the plot. Similarly, if the setting numbers is chosen, these will refer to the corresponding row or column of the original matrix, ignoring any active or passive rows or columns, or subsetting of rows or columns in CORANALYSIS.
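The labelling parameters might be used as in the following sketch with the smoking data from the Example section. The text RowLab and the plot title are hypothetical names introduced for illustration; RowLab has five values, one for each row of the original matrix.

```genstat
" Sketch only: smoking data as used in the Example section of this page "
TEXT    Staff; VALUES=!T(Sen_Mngr,Jun_Mngr,Sen_Empl,Jun_Empl,Secretary)
&       Smoke; VALUES=!T(None,Light,Medium,Heavy)
MATRIX  [ROWS=Staff; COLUMNS=Smoke; VALUES=\
        4,2,3,2, 4,3,7,4, 25,10,12,4, 18,24,33,13, 10,6,7,2] Smoking
CORANALYSIS Smoking
" RowLab is a hypothetical text holding one label per row of Smoking "
TEXT    [VALUES='Senior managers','Junior managers','Senior employees',\
        'Junior employees','Secretaries'] RowLab
" Label rows with the supplied text and columns with their identifiers "
CABIPLOT TITLE='Smoking habits by staff group';\
        LMROWVARIABLES=labels; LROWVARIABLES=RowLab;\
        LMCOLVARIABLES=identifiers
```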
By default CABIPLOT uses the results from the most recent analysis by CORANALYSIS or MCORANALYSIS. However, you can display results from an earlier analysis by saving the information about the analysis with the SAVE parameter of CORANALYSIS or MCORANALYSIS, and then using this as the setting of the SAVE option of CABIPLOT.
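A minimal sketch of this workflow is shown below. It uses the smoking data from the Example section; SmokeSave is a hypothetical identifier for the saved pointer.

```genstat
" Sketch only: smoking data as used in the Example section of this page "
TEXT    Staff; VALUES=!T(Sen_Mngr,Jun_Mngr,Sen_Empl,Jun_Empl,Secretary)
&       Smoke; VALUES=!T(None,Light,Medium,Heavy)
MATRIX  [ROWS=Staff; COLUMNS=Smoke; VALUES=\
        4,2,3,2, 4,3,7,4, 25,10,12,4, 18,24,33,13, 10,6,7,2] Smoking
" Save information about this analysis in SmokeSave (hypothetical name) "
CORANALYSIS Smoking; SAVE=SmokeSave
" Any later analyses would no longer affect which results are plotted "
CABIPLOT [SAVE=SmokeSave]
```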
Options: DIMENSIONS, PLOT, ROWSCALING, COLSCALING, COLOURMETHOD, SIZEMETHOD, FACCOLOURS, WINDOW, KEYWINDOW, SAVE.
Parameters: TITLE, LMROWVARIABLES, LMCOLVARIABLES, LROWVARIABLES, LCOLVARIABLES.
Method
The plots are explained in Chapters 13 and 18 of Greenacre (2007).
Reference
Greenacre, M. (2007). Correspondence Analysis in Practice, second edition. Chapman & Hall, London.
See also
Procedures: CORANALYSIS, MCORANALYSIS.
Commands for: Multivariate and cluster analysis, Graphics.
Example
CAPTION 'CABIPLOT examples',!t('1) Correspondence analysis:',\
        'biplot for smoking data from Table 3.1 of Greenacre (1984)');\
        STYLE=meta,minor
TEXT    Staff; VALUES=!T(Sen_Mngr,Jun_Mngr,Sen_Empl,Jun_Empl,Secretary)
&       Smoke; VALUES=!T(None,Light,Medium,Heavy)
MATRIX  [ROWS=Staff; COLUMNS=Smoke; VALUES=\
        4,2,3,2, 4,3,7,4, 25,10,12,4, 18,24,33,13, 10,6,7,2] Smoking
CORANALYSIS Smoking
CABIPLOT
CAPTION !t('2) Multiple correspondence analysis:',\
        'Exhibit 18.2 (p. 139) from Greenacre (2007)'); STYLE=minor
" The data come from an International Social Survey Programme (ISSP)
  survey of Family and Changing Gender Roles in 1994 in 24 countries.
  The spreadsheet MCOR-1.gsh contains the opinions of German residents
  about working women for 4 questions, each with 4 possible responses."
SPLOAD  FILE='%gendir%/examples/MCOR-1.GSH'
POINTER [VALUES=Q1,Q2,Q3,Q4] women
MCORANALYSIS [COLMETHOD=indicator] women
CABIPLOT [PLOT=colscores]