Shows all pairwise differences of values in a variate or table (A.R.G. McLachlan).
Options
| PRINT = string token | What to print (differences); default diff |
|---|---|
| CLPRINT = string token | How to print column labels (labels, integers); default labe |
| SORT = string token | How to sort the DATA values (ascending, descending); default * i.e. not sorted |
| MVREMOVE = string token | Whether to remove missing values (yes, no); default no |
| RCMETHOD = string token | Which differences to calculate i.e. column-row, row-column, or absolute values (column, row, absolute); default colu |
| DIAGONAL = string token | Whether to put the data values into the diagonal of the symmetric matrices of results (values); default * i.e. diagonal left as missing values |
Parameters
| DATA = variates or tables | Data values whose pairwise differences are required |
|---|---|
| DIFFERENCES = symmetric matrices or pointers | Saves the pairwise differences in a symmetric matrix if GROUPS is unset, otherwise in a pointer to several symmetric matrices |
| GROUPS = factors or pointers | Defines groupings of the data values |
| LABELS = texts | Labels for the rows (and columns) of the symmetric matrices of differences |
| NEWLABELS = texts or pointers | Saves the row labels of the symmetric matrices of differences in a text if GROUPS is unset, otherwise in a pointer to several texts |
Description
ALLDIFFERENCES prints out a symmetric matrix of all pairwise differences between values in a variate or table. That is, every value is subtracted once from every other value and the results of these subtractions are arranged in a symmetric matrix.
The DATA parameter supplies the data values in either a variate or a table. If a DATA table has margins, these are ignored and the marginal values are not used in the difference calculations. If DATA is set to a variate, it must have at least two unrestricted values for differences to be calculated.
The data can be subdivided into groups by using the GROUPS parameter. This can be set to a single factor or to a pointer containing several factors. When it is a pointer, groups are formed for each combination of the factor levels. Each factor must either be of the same length as the DATA variate, or be one of the factors classifying a DATA table. If GROUPS is specified, then at least one group must have two or more unrestricted values in it.
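As a minimal sketch of grouping by several factors, the statements below pass an unnamed pointer of two factors to GROUPS. The identifiers Block, Treat and X and their values are hypothetical, chosen so that every factor combination contains two data values.

```
"Hypothetical data: two grouping factors with the same length as X"
VARIATE [VALUES=1,3,4,5.5,9,12,13,15] X
FACTOR [LEVELS=2; VALUES=1,1,1,1,2,2,2,2] Block
FACTOR [LEVELS=2; VALUES=1,1,2,2,1,1,2,2] Treat
"One group is formed for each Block x Treat combination"
ALLDIFFERENCES X; GROUPS=!P(Block,Treat)
```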
Labels for the rows (and columns) of the symmetric matrix of differences can be provided, using the LABELS parameter, by supplying a text with a value for each DATA value. The number of unrestricted labels must be the same as the number of unrestricted data values. If LABELS are not supplied for a DATA variate with n values, the integers from 1 to n are used for labels. If LABELS are not supplied for a DATA table, labels are created from the table factors, using the factor labels if they are present, or the levels if a factor does not have labels. The labels that are actually used for the rows of the symmetric matrices of differences can be obtained from the NEWLABELS parameter, which will be a text if GROUPS is not set, or a pointer to texts if GROUPS is specified.
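The following sketch supplies labels and saves the labels that were actually used. The identifier Used is hypothetical, and the data and labels are those of the example at the end of this page.

```
VARIATE [VALUES=1,3,4,5.5,9] Y
TEXT [VALUES=First,Second,Third,Fourth,Fifth] Txt
"Txt is supplied in the original, unsorted order of Y"
ALLDIFFERENCES [SORT=descending] Y; LABELS=Txt; NEWLABELS=Used
"GROUPS is not set, so Used is a single text"
PRINT Used
```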
The pairwise differences can be saved using the DIFFERENCES parameter. If there are no groups, they are saved in a symmetric matrix. Alternatively, if there are groups, they are saved in a pointer with a symmetric matrix for each group. The suffixes of the pointer are the ordinal levels of a single GROUPS factor. For multiple GROUPS factors they are the integers 1…n, where n is the number of factor combinations. The saved symmetric matrices each have an extra text defined that gives details of the contents. This text can be seen by setting option IPRINT=extra when printing the matrices using the PRINT directive.
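A short sketch of saving the differences, with and without groups, might look as follows. The identifiers Diffs and GDiffs are hypothetical; X and F are taken from the example at the end of this page.

```
VARIATE [VALUES=1,3,4,5.5,9,12] X
FACTOR [LEVELS=2; LABELS=!T(A,B); VALUES=3(1,2)] F
"No groups: Diffs is a single symmetric matrix"
ALLDIFFERENCES [PRINT=*] X; DIFFERENCES=Diffs
PRINT [IPRINT=extra] Diffs
"Single GROUPS factor: GDiffs is a pointer with suffixes 1 and 2"
ALLDIFFERENCES [PRINT=*] X; GROUPS=F; DIFFERENCES=GDiffs
PRINT [IPRINT=extra] GDiffs[1]
PRINT [IPRINT=extra] GDiffs[2]
```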
The differences are printed by default, but you can set option PRINT=* to suppress this if you just want to store the differences for further calculation or later printing. The format of the printed column labels can be controlled using the CLPRINT option. The default, CLPRINT=labels, prints both row labels and column labels, i.e. it is equivalent to using the PRINT directive with options RLPRINT=labels and CLPRINT=labels. The alternative setting CLPRINT=integers is useful when printing results that have long labels. The columns are then labelled with integers instead of text labels, and the rows are labelled with both text and integers (where the column integers match those of the rows). This is equivalent to using PRINT with options RLPRINT=labels,integers and CLPRINT=integers. At the same time, ALLDIFFERENCES also changes the field width so that it just accommodates the widest value. Usually, this means that the columns are printed closer together, so that the output will be much more compact. If further control is needed over the printing of the results, it is suggested that you save the differences, and then use PRINT with your own preferred settings.
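To illustrate the compact layout, the sketch below uses some long, hypothetical labels (the text Long and the matrix D are invented for this illustration). The final PRINT statement uses the label settings that the description above gives as equivalent, without the field-width adjustment that ALLDIFFERENCES makes.

```
VARIATE [VALUES=1,3,4,5.5,9] Y
TEXT [VALUES='Control plot','Low nitrogen','Medium nitrogen',\
     'High nitrogen','Very high nitrogen'] Long
"Columns labelled by integers; rows by text and integers"
ALLDIFFERENCES [CLPRINT=integers] Y; LABELS=Long
"Save the differences and print them with your own settings"
ALLDIFFERENCES [PRINT=*] Y; LABELS=Long; DIFFERENCES=D
PRINT [RLPRINT=labels,integers; CLPRINT=integers] D
```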
The DATA values can be sorted into either ascending or descending order by specifying the SORT option. (Note, though, that any labels supplied by the LABELS parameter must be in the original unsorted order; these will be sorted automatically by ALLDIFFERENCES together with the data values.) By default, the DATA values are not sorted.
By default, when missing values are present in the DATA, these will create missing values in the symmetric matrix of differences. If groups have been specified, then any group whose differences are all missing will be omitted from the printed output, although its symmetric matrix (of missing values) will still be saved by the DIFFERENCES parameter. Alternatively, you can remove the missing values by setting option MVREMOVE=yes. Groups with only missing differences are then neither printed nor saved.
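The two treatments of missing values can be compared with a sketch such as the following, where Ymiss is a hypothetical variate whose second value is missing.

```
VARIATE [VALUES=1,*,4,5.5,9] Ymiss
"Default: differences involving the missing value are themselves missing"
ALLDIFFERENCES Ymiss
"The missing value is removed before the differences are formed"
ALLDIFFERENCES [MVREMOVE=yes] Ymiss
```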
The order of the subtraction in the symmetric matrix of results is controlled by the RCMETHOD option. The default, column, calculates the difference as
difference = column value – row value
but this can be reversed to give
difference = row value – column value
by setting RCMETHOD=row. Essentially, the choice of RCMETHOD determines the sign of the differences. If instead you wish all of the differences to be positive values, you can use RCMETHOD=absolute. This is equivalent to calculating the differences by either method, and then taking their absolute values.
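As a small worked sketch, take a hypothetical variate Z with values 1, 3 and 4, left unsorted. The comments apply the two formulas above to the entry in row 2 (value 3) and column 1 (value 1).

```
VARIATE [VALUES=1,3,4] Z
"column: entry = column value - row value = 1 - 3 = -2"
ALLDIFFERENCES Z
"row: entry = row value - column value = 3 - 1 = 2"
ALLDIFFERENCES [RCMETHOD=row] Z
"absolute: entry = 2 under either ordering"
ALLDIFFERENCES [RCMETHOD=absolute] Z
```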
By default, the diagonal of the symmetric matrix of differences will contain missing values. Alternatively, you can replace these by the row values (which are also the column values) by setting option DIAGONAL=values.
Options: PRINT, CLPRINT, SORT, MVREMOVE, RCMETHOD, DIAGONAL.
Parameters: DATA, DIFFERENCES, GROUPS, LABELS, NEWLABELS.
Method
Each value in DATA is subtracted from every other value and the result is stored in a symmetric matrix. If restrictions are applied, or MVREMOVE=yes, then procedure SUBSET is first used to remove any restricted values or missing values.
Action with RESTRICT
Restrictions are honoured but are relevant only when the data values are in a variate. In this case, any restrictions on the DATA variate, the GROUPS factor and the LABELS text are all combined and honoured. Thus, you can exclude some data values not just by restricting the DATA variate, but also by restricting the GROUPS factor or the LABELS text, or both. Only unrestricted values are used in the difference calculations. Since restrictions are not possible on a table, when the DATA are a table, any restrictions on the LABELS text and the GROUPS factor are ignored.
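For instance, the following sketch restricts the DATA variate so that only values greater than 2 are used. The condition is purely illustrative.

```
VARIATE [VALUES=1,3,4,5.5,9] Y
"Only the four units with Y greater than 2 remain unrestricted"
RESTRICT Y; CONDITION=Y.GT.2
ALLDIFFERENCES Y
"Remove the restriction afterwards"
RESTRICT Y
```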
See also
Commands for: Calculations and manipulation.
Example
CAPTION 'ALLDIFFERENCES example'; STYLE=meta
VARIATE [VALUES=1,3,4,5.5,9] Y
ALLDIFFERENCES Y
TEXT [VALUES=First,Second,Third,Fourth,Fifth] Txt
ALLDIFFERENCES [SORT=descending; RCMETHOD=row; DIAGONAL=values] Y; LABELS=Txt
FACTOR [LEVELS=2; LABELS=!T(A,B); VALUES=3(1,2)] F
VARIATE [VALUES=1,3,4,5.5,9,12] X
ALLDIFFERENCES X; GROUPS=F