DBIPLOT procedure

Plots a biplot from an analysis by PCP, CVA or PCO (A.I. Glaser).

Options

PLOT = string tokens Additional features for the plot (convexhull, means); default * i.e. none
METHOD = string token Type of axes to plot (predictive, interpolative); default pred
HORIZONTAL = identifier Which axis to make horizontal; default * i.e. none
PREDICTIONS = matrix Saves predicted values
GROUPS = factor Factor defining groupings of individuals for a PCP biplot; default * i.e. none
LMINDIVIDUALS = string tokens How to label the individuals (labels, none, numbers, unitlabels); default labe if LINDIVIDUALS is set, otherwise unit
LMVARIABLES = string tokens How to label the variables (identifiers, labels, none, numbers); default labe if LVARIABLES is set, otherwise iden
LINDIVIDUALS = texts Labels for individuals (i.e. scores)
LVARIABLES = texts Labels for variables (i.e. biplot axes)
MULTIPLIER = scalar Value to multiply vector loadings; default * i.e. determined automatically
TITLE = text Title for the plot; if this is unset, an appropriate title is formed automatically
WINDOW = scalar Which graphical window to use; default 1
KEYWINDOW = scalar Which graphical window to use for the key when there are groupings of individuals (0 for none); default 2
SCREEN = string token Whether to clear the screen before plotting or to continue plotting on the old screen (clear, keep); default clea
SIZEMULTIPLIER = scalar Multiplier used in the calculation of the size in which to draw symbols and labels; default 1
SAVE = pointer Supplies results from an ordination analysis by PCP, CVA or PCO; default uses the most recent analysis

Parameters

VARIABLE = identifiers Axis variables
DISPLAY = string tokens Whether to show, hide or omit each axis (show, hide, omit); default show
COLOUR = texts or scalars Colour to use to plot each axis

Description

DBIPLOT plots biplots displaying the results from a principal components, canonical variates or principal coordinates analysis, performed by the PCP, CVA or PCO directives. By default DBIPLOT uses the results from the most recent PCP, CVA or PCO, but you can display results from an earlier analysis by saving the information with the SAVE parameter of PCP, CVA or PCO, and then supplying it to DBIPLOT using its own SAVE option.
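
As a minimal sketch of this workflow, an earlier analysis might be saved and plotted later as shown here. It assumes the Dmat pointer of variates from the Example section; the identifier PcpSave is hypothetical, and the SAVE parameter of PCP is used as described above.

```genstat
" Save the PCP results in PcpSave (a hypothetical identifier)."
PCP      Dmat; SAVE=PcpSave
" Other analyses may be run here; DBIPLOT would otherwise use the
  most recent of them."
DBIPLOT  [SAVE=PcpSave]
```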

Following the approach of Gower & Hand (1996), the biplot can be viewed as a multivariate analogue of the scatterplot. The information is plotted on the plane defined by the first two principal axes of the analysis (i.e. the first two principal components for a PCP, or the first two canonical variates for a CVA). The default title of the biplot contains the percentage of variance explained by the first and second dimensions combined, whilst the titles of the x- and y-axes show the amount of variation explained by the first and second dimensions individually (you can specify your own title using the TITLE option). The scores from the analysis are plotted, to show the positions of the individual observations. More importantly, the plot contains an oblique “axis” for each variable (its biplot axis) that allows you to see how each individual’s projection into this plane relates to its value for the variable concerned. The type of axis to be displayed will depend on how you want to use the plot. The possibilities, selected by the setting of the METHOD option, are as follows:

    predictive plots predictive axes (default),
    interpolative plots interpolative axes.

Predictive axes show the values of the variables that are predicted by the projection into two dimensions that is defined for each point by the analysis; essentially this is done by taking an orthogonal projection of the point onto each biplot axis. Interpolative axes show the values of the variables that would lead to a point being placed at the position of the selected point on the graph. So here the point is being predicted by the variables, rather than the variables by the point. This is done by taking the sum of a set of vectors, one in the direction of each variable, with lengths equal to the values of the variables for that point.

The axes are defined from the loadings from the analysis. With a PCP analysis (or a PCO analysis based on a data matrix), the directions of the axes are given by loadings calculated in the analysis (but the positions of the scale points on the axes differ between the two types of axis). For a CVA analysis, the loadings define the interpolative axes for the biplots, and their inverses define the predictive axes. However, no loadings are available for PCO analyses based on dissimilarity matrices, and so no axes can be plotted. For further explanation, and details of the underlying mathematics, see Gower & Hand (1996).

Arrows are plotted on the axes to represent their loadings (or inverse loadings); the loadings show the approximate contribution of each variable in the first two dimensions. If the loadings are all close to the origin, they are multiplied by a scalar to make them easier to read. By default, the multiplier is calculated automatically, but you can supply a specific value by using the MULTIPLIER option. To save the automatic value, you can set MULTIPLIER to a scalar containing a missing value.
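
The two uses of MULTIPLIER described above might look like this in practice. This is a sketch only: the value 2 is arbitrary, and Mult is a hypothetical identifier.

```genstat
" Supply a specific multiplier for the loading arrows (2 is arbitrary)."
DBIPLOT  [MULTIPLIER=2]
" Save the automatically determined multiplier: Mult is declared with a
  missing value, so DBIPLOT calculates the multiplier and stores it."
SCALAR   [VALUE=*] Mult
DBIPLOT  [MULTIPLIER=Mult]
PRINT    Mult
```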

In general, each axis will be at an angle to the traditional x-axis. However, you can arrange for one of the biplot axes to be in the direction of the x-axis, by setting the HORIZONTAL option to the identifier of its variate. It should be noted that this operation is purely cosmetic and, if HORIZONTAL is not set, then the direction of the x-axis will represent the direction of maximum variance.

By default all the axes are plotted, each in a colour chosen automatically by DBIPLOT. However, there are parameters to allow you to modify this for any axis. The VARIABLE parameter specifies the axis to change (using its identifier). The DISPLAY parameter indicates whether the axis is to be shown, hidden or omitted altogether. (The Graphics Viewer of Genstat for Windows allows you to toggle displayed items to become hidden, or hidden items to become displayed.) The COLOUR parameter defines the colour to be used, by supplying either a single-valued text with the name of the colour or a scalar containing the RGB value for the colour (see the PEN directive for details).
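
For illustration, the parameters could be set for two of the axes from the PCP in the Example section as follows. The choice of variates and colour names is an assumption made for this sketch.

```genstat
" Show the Height axis in blue; plot the Width axis in red but hidden,
  so that it can be toggled on in the Graphics Viewer."
DBIPLOT  VARIABLE=Height,Width; DISPLAY=show,hide; COLOUR='blue','red'
```

Axes that are not listed in VARIABLE keep their default display and colour.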

The scores from PCP analyses are plotted to identify the position of each individual as a red circle, unless you use the GROUPS option to define groupings of the individuals (the groups are then plotted in different colours). With a CVA analysis, groupings are automatically defined from the groups in the analysis itself.

Hotpoints are defined at the point for each individual to allow you to view the values corresponding to that individual on the axes. In the Graphics Viewer in Genstat for Windows, you can click on the hotpoint symbol and then click on any score to see how that point is represented on each of the axes. In addition, whatever axes are defined, you can use the PREDICTIONS option to save a matrix with the predicted values of the individuals for all the variables.
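
A short sketch of saving the predicted values, where Pred is a hypothetical identifier for the matrix:

```genstat
" Save the predicted values of every individual on every variable."
DBIPLOT  [PREDICTIONS=Pred]
PRINT    Pred
```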

The PLOT option allows you to illustrate other aspects of the scores.

    convexhull draws a convex hull around the points (or the points in each group if groupings have been defined).
    means plots the group means for a CVA, or the group means for a PCP (if the GROUPS option is set), or the overall mean for a PCO biplot. (In other situations the centroid is the origin, which is where all the oblique axes cross, so it would clutter up an already congested plot.)
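
For a PCP biplot, both settings depend on the GROUPS option if they are to work by group. The sketch below assumes the 12 individuals of the PCP in the Example section, and an invented factor Grp that splits them into three groups of four.

```genstat
" Hypothetical grouping: units 1-4, 5-8 and 9-12."
FACTOR   [LEVELS=3; VALUES=4(1,2,3)] Grp
" Draw a convex hull around each group and plot the group means."
DBIPLOT  [GROUPS=Grp; PLOT=convexhull,means]
```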

The types of label for the scores and loadings can be set using the LMINDIVIDUALS and LMVARIABLES options respectively, by selecting one of the following settings:

    identifiers uses the identifiers of the variables (LMVARIABLES only),
    labels expects labels to be supplied (in a text) using the LINDIVIDUALS or LVARIABLES option,
    none gives no labels,
    numbers uses the row or column numbers of the scores and variables, and
    unitlabels uses the unit labels of the data variates or row labels of the data matrix, if present, otherwise the unit numbers (LMINDIVIDUALS only).

If LINDIVIDUALS is set, the default for LMINDIVIDUALS is defined to be labels; otherwise the default is unitlabels. Similarly, the default for LMVARIABLES is labels if LVARIABLES is set; otherwise it is defined to be identifiers.
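
As a sketch, labels for the 12 individuals of the PCP in the Example section could be supplied like this. The text IndLab and its values are invented for illustration.

```genstat
" One label per individual; setting LINDIVIDUALS makes LMINDIVIDUALS
  default to labels."
TEXT     [VALUES='a','b','c','d','e','f','g','h','i','j','k','l'] IndLab
" Label the biplot axes by their column numbers instead of identifiers."
DBIPLOT  [LINDIVIDUALS=IndLab; LMVARIABLES=numbers]
```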

The WINDOW and KEYWINDOW options specify the windows to use for the plot and its key, respectively, in the usual way. The SCREEN option controls whether the graphical display is cleared before the biplot is plotted.

The SIZEMULTIPLIER option allows you to modify the sizes of the symbols and labels in the plot. The default of 1 works well under most circumstances, but you might want to specify a smaller value to prevent overlapping when there are large numbers of points or axes to be displayed.

Options: PLOT, METHOD, HORIZONTAL, PREDICTIONS, GROUPS, LMINDIVIDUALS, LMVARIABLES, LINDIVIDUALS, LVARIABLES, MULTIPLIER, TITLE, WINDOW, KEYWINDOW, SCREEN, SIZEMULTIPLIER, SAVE.

Parameters: VARIABLE, DISPLAY, COLOUR.

Method

The plots in DBIPLOT are explained in Gower & Hand (1996); see Chapter 2 for principal components, and Chapter 5 for canonical variates.

Reference

Gower, J.C. & Hand, D.J. (1996). Biplots. Chapman & Hall, London.

See also

Directives: CVA, PCO, PCP.

Procedures: BIPLOT, CABIPLOT, CRBIPLOT, CRTRIPLOT, CVAPLOT, GGEBIPLOT.

Commands for: Multivariate and cluster analysis, Graphics.

Example

CAPTION  'DBIPLOT example 1: PCP'; STYLE=meta
POINTER  [VALUES=Height,Length,Width,Weight] Dmat
VARIATE  [NVALUES=12] Dmat[]
READ     Dmat[]
4.1 5.2 1.2 3.1 4.2 1.5 3.2 5.6 2.3 0.2 0.1 0.2
6.2 4.1 4.1 4.1 2.3 6.2 6.3 5.1 0.2 0.9 4.9 7.3
10.1 5.6 3.2 9.4 1.2 9.8 1.0 1.0 6.1 9.7 1.0 3.7
6.1 9.6 9.7 5.5 2.3 5.0 9.4 8.1 4.5 4.9 0.3 1.8 :
PCP      [PRINT=loadings,roots] Dmat
" Default: predictive axes."
DBIPLOT
" You can use the HORIZONTAL option to rotate the axes so that one of them is
  horizontal. This is a purely cosmetic exercise but can aid visualisation."
DBIPLOT  [HORIZONTAL=Width]
" interpolative axes "
DBIPLOT  [METHOD=interpolative]

CAPTION  'DBIPLOT example 2: CVA'; STYLE=meta
SPLOAD   '%gendir%/examples/CVA-1.gsh'; ISAVE=Data
FACTOR   [LEVELS=4; VALUES=3,1,2,2,2,1,1,4,2,3,3,4,2,2,2,2,2,4,\
         1,3,4,4,2,2,2,1,1,3] Groupno
SSPM     [TERMS=Data[]; GROUPS=Groupno] W
FSSPM    W
CVA      [PRINT=roots,loadings,means,tests,distances] W
" Default, now with a key due to factors."
DBIPLOT   
" The PLOT option allows you to plot extra features, like a convex hull
  and group means."
DBIPLOT  [WIN=4;KEY=0;PLOT=convex,mean]
Updated on June 20, 2019