Produces a mosaic plot to display a table of counts (D.B. Baird).
LINECOLOUR = text or scalar
|Colour to use for the outlines of the boxes; default 'black'
EMPTYCOLOUR = text or scalar
|Colour to use for the outlines of the empty boxes; default 'purple'
THICKNESS = scalar
|Line thickness for the outlines of the boxes; default 1
LABELSIZE = scalar
|Label size for the axis labels; default 1
GAP = scalar
|Relative size of the gaps between boxes; default 1
MINSIZE = scalar
|Minimum row/column dimension for a box; default 0.002
DATA = tables or pointers
|Data to be plotted
ROWFACTORS = pointers
|Factors to be displayed down the window; if COLFACTORS is not specified, the default is to display the factors in the second half of the classification set of the table, otherwise it is the classifying factors not included in COLFACTORS
COLFACTORS = pointers
|Factors to be displayed across the window; if ROWFACTORS is not specified, the default is to display the factors in the first half of the classification set of the table, otherwise it is the classifying factors not included in ROWFACTORS
TITLE = texts
|Title for the plot; default * i.e. none
COLOURS = variate or text
|The colours to shade the boxes; by default the colours are taken from the pens 2 onwards, with a final colour of white
LABELWIDTH = scalars or variates
|Maximum length of the labels to display for each factor; default * uses the full text of the factor labels
WINDOW = scalar
|Window number for the graph; default 3
SCREEN = string token
|Whether to clear the screen before plotting or to continue plotting on the existing screen (clear, keep); default clear
DMOSAIC produces a mosaic plot (Friendly 1994) of a table of counts. The DATA parameter supplies the data to plot, as either a table or a pointer to a set of factors that are then tabulated to create a table of counts. The display takes the form of a set of boxes arranged in rows and columns. The size of each box reflects the proportion of the observations that fall into the corresponding cell of the table. The boxes are coloured by the levels of the final factor, to represent the changing proportions of this factor within the other factors.
The ROWFACTORS and COLFACTORS parameters can supply pointers containing the factors to be displayed down and across the window, respectively. If both are defined, then together they must contain all the factors in the table. If just one is defined, the other is formed from the remaining factors in the table, in the order given in the CLASSIFICATION of the table or the DATA pointer. If neither is specified, the first half of the factors (rounded down to the integer below in the case of an odd number) are assigned to columns and the remainder to rows, in the order defined by the CLASSIFICATION of the table or by the DATA pointer. Changing the factors displayed across and down the screen, or their ordering, can give a very different view of the data. The choice of the last factor displayed across the screen is especially important, because it is used to colour the boxes, as explained below.
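As a minimal sketch of setting the layout explicitly, the commands below use the Titanic data from the Example. The pointer names prows and pcols are illustrative only. Survived is placed last across the window, so that it is the factor used to colour the boxes.

```genstat
"Form the table of counts, as in the Titanic example"
SPLOAD   [PRINT=*] '%DATA%/Titanic.gsh'
TABULATE [CLASS=Class,Age,Sex,Survived; COUNTS=tcounts]
"Illustrative pointers: factors down and across the window"
POINTER  [VALUES=Class,Age] prows
POINTER  [VALUES=Sex,Survived] pcols
DMOSAIC  tcounts; ROWFACTORS=prows; COLFACTORS=pcols
```

Because the two pointers together contain all four classifying factors, either one could be omitted here and the other would be formed from the remaining factors.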
The width of each box is determined from the relative proportion of the observations that fall into the corresponding column-factor combinations. Within each column, the heights of the boxes are proportional to the number of observations in the corresponding cells of the table.
There are gaps between the rows and columns. These are largest between the levels of the first row or column factor, smaller between the levels of the second row or column factor, continuing to shrink until the smallest gaps are between the levels of the final row or column factor. The sizes of the gaps can all be made larger or smaller by the GAP option. Setting a value of zero gives no gaps between boxes while, for example, two doubles the sizes of the gaps.
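For instance, assuming the Titanic table from the Example, the two settings mentioned above could be compared as follows (the titles are illustrative):

```genstat
"Form the table of counts, as in the Titanic example"
SPLOAD   [PRINT=*] '%DATA%/Titanic.gsh'
TABULATE [CLASS=Class,Age,Sex,Survived; COUNTS=tcounts]
"No gaps between the boxes"
DMOSAIC  [GAP=0] tcounts; TITLE='Titanic counts, GAP=0'
"Gaps twice their default size"
DMOSAIC  [GAP=2] tcounts; TITLE='Titanic counts, GAP=2'
```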
The outline of each box is drawn with a pen whose colour and thickness are specified by the LINECOLOUR and THICKNESS options, with defaults of 'black' and 1 respectively. The MINSIZE option puts a lower limit on the dimensions of the boxes in both the row and column directions. The default of 0.002 prevents boxes with few or no observations from being lost to the eye. Empty boxes (i.e. those with no observations) have their outlines drawn in the colour specified by the EMPTYCOLOUR option (default 'purple') and have no fill. The other boxes are coloured according to the levels of the final column factor. The colours for its levels can be specified by the COLOURS parameter, as either a text with colour names (e.g. 'blue') or a variate containing RGB colours. By default, the colours for all but the last level are taken from the colours assigned to pens 2 onwards, with white assigned to the last level.
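A sketch of these settings used together, again assuming the Titanic table from the Example. The identifiers pcols and cols and the particular colours are illustrative choices. COLFACTORS is set so that the final column factor is Survived, whose two levels are matched by the two colour names.

```genstat
"Form the table of counts, as in the Titanic example"
SPLOAD   [PRINT=*] '%DATA%/Titanic.gsh'
TABULATE [CLASS=Class,Age,Sex,Survived; COUNTS=tcounts]
"Survived is the final column factor, so it determines the fill colours"
POINTER  [VALUES=Sex,Survived] pcols
"One colour name for each level of Survived (illustrative choices)"
TEXT     [VALUES='yellow','green'] cols
DMOSAIC  [LINECOLOUR='blue'; THICKNESS=2; EMPTYCOLOUR='red'; MINSIZE=0.005] \
         tcounts; COLFACTORS=pcols; COLOURS=cols
```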
The labels of the last column factor are displayed in the lower x-margin, and the rest in the upper x-margin. The labels of the last row factor are displayed in the lower y-margin, and the rest in the upper y-margin. If there are many factors or long labels, it can be difficult to fit all the labels on the axes. If so, you can use the LABELSIZE option to change the size of the labels, and the LABELWIDTH parameter to truncate the labels at a maximum width. LABELWIDTH can be set to a variate defining the maximum width for each factor (with factors in the order defined by the COLFACTORS and then the ROWFACTORS parameters), or a scalar to apply the same maximum width to all the factors.
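As an illustration, assuming the Titanic table from the Example, the labels might be shrunk and truncated as below. The variate name width and its values are hypothetical. With the default layout described earlier, the values would apply to Class, Age, Sex and Survived in that order.

```genstat
"Form the table of counts, as in the Titanic example"
SPLOAD   [PRINT=*] '%DATA%/Titanic.gsh'
TABULATE [CLASS=Class,Age,Sex,Survived; COUNTS=tcounts]
"Smaller labels, all truncated to the same maximum width"
DMOSAIC  [LABELSIZE=0.8] tcounts; LABELWIDTH=4
"A separate maximum width for each factor: column factors, then row factors"
VARIATE  [VALUES=6,5,1,3] width
DMOSAIC  [LABELSIZE=0.8] tcounts; LABELWIDTH=width
```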
The WINDOW parameter defines the window to be used for the plot (default 3), and the SCREEN parameter controls whether or not the screen is cleared before plotting (default clear). To display multiple plots on the same screen, you should set SCREEN=keep for the second and subsequent plots. You can use the FRAME directive or the FFRAME procedure to specify the numbers and locations of the windows if the Genstat defaults are unsuitable. The TITLE parameter can be used to specify a title for the plot; by default there is none.
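A sketch of two mosaic plots placed side by side on one screen, using the two data sets from the Example. The window numbers, the window bounds given to FRAME and the titles are illustrative assumptions, not recommended settings.

```genstat
"Form the two tables of counts, as in the Example"
SPLOAD   [PRINT=*] '%DATA%/Titanic.gsh'
TABULATE [CLASS=Class,Age,Sex,Survived; COUNTS=tcounts]
SPLOAD   [PRINT=*] '%DATA%/Detergent.gsh'
TABULATE [CLASS=AUser,Temperature,WaterSoftness,Preference; COUNTS=prefer]
"Illustrative layout: windows 1 and 2 occupy the left and right halves"
FRAME    WINDOW=1,2; YLOWER=0; YUPPER=1; XLOWER=0,0.5; XUPPER=0.5,1
"The first plot clears the screen; the second is added to it"
DMOSAIC  tcounts; TITLE='Titanic'; WINDOW=1; SCREEN=clear
DMOSAIC  prefer; TITLE='Detergent'; WINDOW=2; SCREEN=keep
```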
Method
The proportions of observations falling into the row classes are calculated and these form the x dimensions of the boxes. Within each row combination, the proportions of observations falling into the column classes are calculated and these form the box heights. The gaps are added to the box positions and minimum dimensions enforced. Any empty boxes are drawn with the empty pen. The labels are applied to the four sides of the plot, and label positions are adjusted to avoid overlap by a subsidiary procedure.
Action with RESTRICT
DMOSAIC will obey restrictions on the factors in a
Friendly, M. (1994). Mosaic displays for multi-way contingency tables, Journal of the American Statistical Association, 89, 190–200.
CAPTION  'DMOSAIC Example',\
         'Survival of souls on the Titanic by class, age and sex';\
         STYLE=major,minor
SPLOAD   [PRINT=*] '%DATA%/Titanic.gsh'; ISAVE=pData
"Class    - passenger class: Crew, First, Second, Third
 Age      - Adult or Child
 Sex      - Male or Female
 Survival - No or Yes"
TABULATE [CLASS=Class,Age,Sex,Survived; COUNTS=tcounts; PRINT=counts]
DMOSAIC  tcounts; TITLE='Survival of Titanic sinking by class, age and sex'
CAPTION  'Results from a detergent preference test'; STYLE=minor
SPLOAD   [PRINT=*] '%DATA%/Detergent.gsh'
"AUser         - No or Yes - whether they had used product A previously
 Temperature   - Low or High - temperature used during the test
 WaterSoftness - Soft, Medium and Hard
 Preference    - A or B indicates preference for product A or B"
TABULATE [CLASS=AUser,Temperature,WaterSoftness,Preference; \
         COUNTS=prefer; PRINT=counts]
DMOSAIC  prefer; TITLE='Consumer preference for detergent A vs B'