ALLDIFFERENCES procedure

Shows all pairwise differences of values in a variate or table (A.R.G. McLachlan).

Options

PRINT = string token What to print (differences); default diff
CLPRINT = string token How to print column labels (labels, integers); default labe
SORT = string token How to sort the DATA values (ascending, descending); default * i.e. not sorted
MVREMOVE = string token Whether to remove missing values (yes, no); default no
RCMETHOD = string token Which differences to calculate i.e. column-row, row-column, or absolute values (column, row, absolute); default colu
DIAGONAL = string token Whether to put the data values into the diagonal of the symmetric matrices of results (values); default * i.e. diagonal left as missing values

Parameters

DATA = variates or tables Data values whose pairwise differences are required
DIFFERENCES = symmetric matrices or pointers Saves the pairwise differences in a symmetric matrix if GROUPS is unset, otherwise in a pointer to several symmetric matrices
GROUPS = factors or pointers Defines groupings of the data values
LABELS = texts Labels for the rows (and columns) of the symmetric matrices of differences
NEWLABELS = texts or pointers Saves the row labels of the symmetric matrices of differences in a text if GROUPS is unset, otherwise in a pointer to several texts

Description

ALLDIFFERENCES prints out a symmetric matrix of all pairwise differences between values in a variate or table. That is, every value is subtracted once from every other value and the results of these subtractions are arranged in a symmetric matrix.

The DATA parameter supplies the data values in either a variate or a table. If a DATA table has margins, these are ignored and the marginal values are not used in the difference calculations. If DATA is set to a variate, this must have at least two unrestricted values for differences to be calculated.

The data can be subdivided into groups by using the GROUPS parameter. This can be set to a single factor or to a pointer containing several factors. When it is a pointer, groups are formed for each combination of the factor levels. Each factor must either be of the same length as the DATA variate, or be one of the factors classifying a DATA table. If GROUPS is specified, then at least one group must have two or more unrestricted values in it.
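As a rough illustration of how groups are formed from combinations of factor levels, the following Python sketch splits a list of values by one or more parallel lists of group codes. It is not Genstat code and not the procedure's implementation; the function name `split_by_groups` and the grouping codes are hypothetical, and the order of the resulting groups here simply follows first appearance in the data.

```python
def split_by_groups(values, *factors):
    """Split values into groups, one per combination of factor codes.

    Each factor is a list of the same length as values (an assumption
    mirroring the requirement for a DATA variate).
    """
    for factor in factors:
        if len(factor) != len(values):
            raise ValueError("each factor must have the same length as values")
    groups = {}
    for i, value in enumerate(values):
        key = tuple(factor[i] for factor in factors)  # one key per combination
        groups.setdefault(key, []).append(value)
    return groups


# Illustrative data: six values with a hypothetical two-level grouping.
x = [1, 3, 4, 5.5, 9, 12]
f = ["A", "B", "A", "B", "A", "B"]
for key, members in split_by_groups(x, f).items():
    print(key, members)
```

Each group would then be treated like a separate set of data values, provided it contains at least two of them.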

Labels for the rows (and columns) of the symmetric matrix of differences can be provided, using the LABELS parameter, by supplying a text with a value for each DATA value. The number of unrestricted labels must be the same as the number of unrestricted data values. If LABELS are not supplied for a DATA variate with n values, the integers from 1 to n are used for labels. If LABELS are not supplied for a DATA table, labels are created from the table factors using labels if factor labels are present, or levels if a factor does not have labels. The labels that are actually used for the rows of the symmetric matrices of differences can be obtained from the NEWLABELS parameter, which will either be a text if GROUPS is not set, or a pointer to texts if GROUPS is specified.

The pairwise differences can be saved using the DIFFERENCES parameter. If there are no groups, they are saved in a symmetric matrix. Alternatively, if there are groups, they are saved in a pointer with a symmetric matrix for each group. The suffixes of the pointer are the ordinal levels of a single GROUPS factor. For multiple GROUPS factors they are the integers 1…n, where n is the number of factor combinations. The saved symmetric matrices each have an extra text defined that gives details of the contents. This text can be seen by setting option IPRINT=extra when printing the matrices using the PRINT directive.

The differences are printed by default, but you can set option PRINT=* to suppress this if you just want to store the differences for further calculation or later printing. The format of the printed column labels can be controlled using the CLPRINT option. The default, CLPRINT=labels, prints both row labels and column labels i.e. it is equivalent to using the PRINT directive with options RLPRINT=labels and CLPRINT=labels. The alternative setting CLPRINT=integers is useful when printing results that have long labels. The columns are then labelled with integers instead of text labels, and the rows are labelled with both text and integers (where the column integers match those of the rows). This is equivalent to using PRINT with options RLPRINT=labels,integers and CLPRINT=integers. At the same time, ALLDIFFERENCES also changes the field width so that it just accommodates the widest value. Usually, this means that the columns are printed closer together, so that the output will be much more compact. If further control is needed over the printing of the results, it is suggested that you save the differences, and then use PRINT with your own preferred settings.

The DATA values can be sorted into either ascending or descending order by specifying the SORT option. (Note though, that any labels supplied by the LABELS parameter must be in the original unsorted order – these will be sorted automatically by ALLDIFFERENCES together with the data values.) By default, the DATA values are not sorted.
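The point that labels are supplied in the original order and then travel with their data values can be sketched in Python as below. This is only an illustration of the idea, not Genstat code; the function name `sort_with_labels` is hypothetical, and the data and labels are those of the example at the end of this page.

```python
def sort_with_labels(values, labels, sort=None):
    """Sort values and carry each label along with its value.

    sort is None (leave unsorted), "ascending" or "descending".
    """
    if sort is None:
        return list(values), list(labels)
    if sort not in ("ascending", "descending"):
        raise ValueError("sort must be None, 'ascending' or 'descending'")
    pairs = sorted(zip(values, labels),
                   key=lambda pair: pair[0],
                   reverse=(sort == "descending"))
    return [v for v, _ in pairs], [label for _, label in pairs]


y = [1, 3, 4, 5.5, 9]
txt = ["First", "Second", "Third", "Fourth", "Fifth"]
print(sort_with_labels(y, txt, sort="descending"))
```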

By default, when missing values are present in the DATA, these will create missing values in the symmetric matrix of differences. If groups have been specified, then any group whose differences are all missing will be omitted from the printed output, although its symmetric matrix (of missing values) will still be saved by the DIFFERENCES parameter. Alternatively, you can remove the missing values by setting option MVREMOVE=yes. Groups with only missing differences are then neither printed nor saved.
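The two treatments of missing values can be contrasted with a small Python sketch, in which `None` stands in for a missing value. It is an illustration only, not Genstat code; the helper names are hypothetical, and only the off-diagonal differences (column value minus row value) below the diagonal are shown.

```python
def pairwise(values):
    """Differences below the diagonal; any difference involving None is None."""
    return [
        [None if values[c] is None or values[r] is None
         else values[c] - values[r]
         for c in range(r)]
        for r in range(len(values))
    ]


def remove_missing(values):
    """Drop missing values before differencing (the MVREMOVE=yes idea)."""
    return [v for v in values if v is not None]


def all_missing(differences):
    """True if every difference is missing, as for a group that is not printed."""
    return all(d is None for row in differences for d in row)


data = [1, None, 4]
print(pairwise(data))                  # missing values propagate
print(pairwise(remove_missing(data)))  # missing values removed first
```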

The order of the subtraction in the symmetric matrix of results is controlled by the RCMETHOD option. The default, column, calculates the difference as

difference = column value – row value

but this can be reversed to give

difference = row value – column value

by setting RCMETHOD=row. Essentially, the choice of RCMETHOD determines the sign of the differences. If instead you wish all of the differences to be positive values, you can use RCMETHOD=absolute. This is equivalent to calculating the differences by either method, and then taking their absolute values.

By default, the diagonal of the symmetric matrix of differences will contain missing values. Alternatively, you can replace these by the row values (which are also the column values) by setting option DIAGONAL=values.
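The effect of the subtraction order and of the diagonal setting can be sketched in Python as follows. This is a hedged illustration of the arithmetic described above, not Genstat code: the function name `lower_triangle_differences` and its arguments are hypothetical, `None` stands in for a missing value, and the result is laid out as the lower triangle (including the diagonal) on the assumption that this is how a symmetric matrix is displayed.

```python
def lower_triangle_differences(values, rcmethod="column", diagonal_values=False):
    """Return rows of the lower triangle of pairwise differences.

    Row r holds the differences against columns 0..r-1, followed by the
    diagonal entry (None unless diagonal_values is True).
    """
    if rcmethod not in ("column", "row", "absolute"):
        raise ValueError("rcmethod must be 'column', 'row' or 'absolute'")
    triangle = []
    for r, row_value in enumerate(values):
        row = []
        for c in range(r):
            diff = values[c] - row_value       # column value - row value
            if rcmethod == "row":
                diff = -diff                   # row value - column value
            elif rcmethod == "absolute":
                diff = abs(diff)
            row.append(diff)
        row.append(row_value if diagonal_values else None)
        triangle.append(row)
    return triangle


y = [1, 3, 4, 5.5, 9]
for row in lower_triangle_differences(y, rcmethod="row", diagonal_values=True):
    print(row)
```

With the default method, the entry in the second row and first column for these data is 1 - 3 = -2; with the row method it is 3 - 1 = 2.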

Options: PRINT, CLPRINT, SORT, MVREMOVE, RCMETHOD, DIAGONAL.

Parameters: DATA, DIFFERENCES, GROUPS, LABELS, NEWLABELS.

Method

Each value in DATA is subtracted from every other value and the result stored in a symmetric matrix. If restrictions are applied, or MVREMOVE=yes, then procedure SUBSET is first used to remove any restricted values or missing values.

Action with RESTRICT

Restrictions are honoured but are relevant only when the data values are in a variate. In this case, any restrictions on the DATA variate, the GROUPS factor and the LABELS text are all combined and honoured. Thus, you can exclude some data values not just by restricting the DATA variate, but also by restricting the GROUPS factor or the LABELS text, or both. Only unrestricted values are used in the difference calculations. Since restrictions are not possible on a table, when the DATA are a table, any restrictions on the LABELS text and the GROUPS factor are ignored.

See also

Procedures: PAIRTEST, RPAIR.

Commands for: Calculations and manipulation.

Example

CAPTION        'ALLDIFFERENCES example'; STYLE=meta
VARIATE        [VALUES=1,3,4,5.5,9] Y
ALLDIFFERENCES Y
TEXT           [VALUES=First,Second,Third,Fourth,Fifth] Txt
ALLDIFFERENCES [SORT=descending; RCMETHOD=row; DIAGONAL=values] Y; LABELS=Txt
FACTOR         [LEVELS=2; LABELS=!T(A,B); VALUES=3(1,2)] F
VARIATE        [VALUES=1,3,4,5.5,9,12] X
ALLDIFFERENCES X; GROUPS=F
Updated on March 11, 2019
