Performs a Kolmogorov-Smirnov two-sample test (S.J. Welham, N.M. Maclaren & H.R. Simpson).

### Options

| Option | Description |
|---|---|
| `PRINT` = string tokens | Output required (`test`, `differences`, `ranks`): `test` gives the test statistic, `differences` gives signed differences, and `ranks` produces the ranks for each sample; default `test` |
| `GROUPS` = factor | Defines the groups for a two-sample test if only the `Y1` parameter is specified |

### Parameters

| Parameter | Description |
|---|---|
| `Y1` = variates | Identifier of the variate holding the first sample |
| `Y2` = variates | Identifier of the variate holding the second sample |
| `R1` = variates | Saves the ranks of the first sample |
| `R2` = variates | Saves the ranks of the second sample |
| `STATISTIC` = scalars | Scalar to save the test statistic (the maximum absolute difference between the cumulative distribution functions) |
| `CHISQUARE` = scalars | Scalar to save the chi-square approximation to the test statistic |
| `DIFFERENCES` = variates | Variate to save the signed differences between the cumulative distribution functions |

### Description

The Kolmogorov-Smirnov test assesses the similarity between the underlying distributions of two samples by comparing their cumulative distribution functions; the test statistic is the maximum absolute difference between the cumulative distribution functions. The samples can be specified in two separate variates using the parameters `Y1` and `Y2`. Alternatively, they can be given in a single variate, supplied by `Y1`, with the `GROUPS` option set to a factor that identifies the samples. The `GROUPS` option is ignored when the `Y2` parameter is set.

Output from the procedure is controlled by the `PRINT` option: `test` prints the relevant test statistic, `differences` prints the signed differences, and `ranks` prints a vector of ranks for each of the samples.

The test statistic and its chi-square approximation can be saved using the parameters `STATISTIC` and `CHISQUARE` respectively. The parameter `DIFFERENCES` can be used to save the signed differences between the cumulative distribution functions. The `R1` and `R2` parameters allow the ranks of the samples to be saved.

Options: `PRINT`, `GROUPS`.

Parameters: `Y1`, `Y2`, `R1`, `R2`, `STATISTIC`, `CHISQUARE`, `DIFFERENCES`.

### Method

The Kolmogorov-Smirnov two-sample test is a test of the null hypothesis that the two samples arise from the same distribution, against the alternative that the underlying distributions are different. The test compares the two empirical cumulative distribution functions to detect differences in the shape of the underlying distributions. The cumulative distribution functions $S_1$ and $S_2$ are formed by

$$S_k(X) = \frac{\text{number of scores in sample } k \text{ that are} \le X}{\text{size of sample } k}$$

for $k = 1, 2$ and a suitable set of points $X$. The procedure uses the set of values taken by one or other of the samples, i.e. {*X*: *X* is in `DATA`}. The maximum absolute difference

$$MD = \max_X \left| S_1(X) - S_2(X) \right|$$

is used as the basis for significance tests. The chi-square approximation (2 degrees of freedom) to this statistic is $CH$:

$$CH = 4 \times MD^2 \times \frac{n_1 n_2}{n_1 + n_2}$$

where $n_1$ and $n_2$ are the sizes of the samples (see, for example, Siegel 1956, pages 127-136).
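The calculation described in this section can be illustrated outside Genstat. The following Python sketch is not the procedure's own code: the function names `ecdf` and `ks_two_sample` are hypothetical, the differences are evaluated once per distinct data value (the layout of the `DIFFERENCES` variate may differ), and the data are read from the `VARIATE` statements in the Example section on the assumption that `11(0)` means the value 0 repeated 11 times. The last step uses the fact that the upper-tail probability of a chi-square variable with 2 degrees of freedom is exp(-*CH*/2); whether `KOLMOG2` reports a probability in this form is not stated here.

```python
from bisect import bisect_right
from math import exp


def ecdf(sample):
    """Return S(x) = (number of scores in sample <= x) / (size of sample)."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n


def ks_two_sample(y1, y2):
    """Return (MD, CH, points, signed differences) for two samples."""
    s1, s2 = ecdf(y1), ecdf(y2)
    # Evaluate both functions at every distinct value taken by either sample.
    points = sorted(set(y1) | set(y2))
    diffs = [s1(x) - s2(x) for x in points]
    # MD: maximum absolute difference between the two functions.
    md = max(abs(d) for d in diffs)
    n1, n2 = len(y1), len(y2)
    # CH: chi-square approximation with 2 degrees of freedom.
    ch = 4 * md * md * (n1 * n2 / (n1 + n2))
    return md, ch, points, diffs


# Assumed expansion of the VALUES lists from the Example section.
sample1 = [0] * 11 + [3] * 7 + [6] * 8 + [9] * 3 + [12] * 5 + [15] * 5 + [18] * 5
sample2 = [0] * 1 + [3] * 3 + [6] * 6 + [9] * 12 + [12] * 12 + [15] * 4 + [18] * 16

md, ch, points, diffs = ks_two_sample(sample1, sample2)
# Upper-tail probability of a chi-square variable with 2 degrees of freedom.
p_approx = exp(-ch / 2)
print(f"MD = {md:.3f}, CH = {ch:.2f}, approximate P = {p_approx:.4f}")
```

Under these assumptions the largest difference occurs at the score 6, giving *MD* of about 0.406 and *CH* of about 15.96 for samples of size 44 and 54.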

### Action with `RESTRICT`

The variates `Y1` and `Y2` can be restricted, and the two restrictions need not be the same. `KOLMOG2` uses only those units of each variate that are not excluded by their respective restrictions. Restrictions are also obeyed on `Y1` and `GROUPS`, so `RESTRICT` can be used, for example, to limit the data to only two groups when the `GROUPS` factor has more than two levels.
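The effect of restricting a single variate and its grouping factor to two levels can be pictured with a short Python sketch. This only illustrates the idea and is not how Genstat implements `RESTRICT`; the function `split_by_groups`, its `keep` argument and the data values are hypothetical.

```python
def split_by_groups(values, groups, keep=None):
    """Split one variate into two samples using a grouping factor.

    Units whose group label is not in `keep` are excluded, mimicking a
    restriction that leaves only two levels of the factor.
    """
    if len(values) != len(groups):
        raise ValueError("values and groups must have the same length")
    samples = {}
    for value, label in zip(values, groups):
        if keep is not None and label not in keep:
            continue  # unit excluded by the restriction
        samples.setdefault(label, []).append(value)
    if len(samples) != 2:
        raise ValueError("a two-sample test needs exactly two groups")
    # Order the samples by group label so the result is deterministic.
    (_, first), (_, second) = sorted(samples.items())
    return first, second


# Hypothetical data: a factor with three levels, restricted to "a" and "b".
values = [5.1, 4.8, 6.0, 5.5, 7.2, 6.9]
groups = ["a", "b", "c", "a", "b", "c"]
y1, y2 = split_by_groups(values, groups, keep={"a", "b"})
print(y1, y2)
```

Raising an error when the remaining units do not form exactly two groups is a choice made for this sketch; the behaviour of `KOLMOG2` in that situation is not described here.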

### Reference

Siegel, S. (1956). *Nonparametric Statistics for the Behavioural Sciences*. McGraw-Hill, New York.

### See also

Directive: `DISTRIBUTION`.

Procedures: `DPROBABILITY`, `EDFTEST`.

Commands for: Basic and nonparametric statistics.

### Example

    CAPTION 'KOLMOG2 example',!t(\
            'Data from Siegel (1956), Nonparametric Statistics,',\
            'p. 133. Two groups are scored by the number of a set of 18',\
            'objects that they can identify.'); STYLE=meta,plain
    VARIATE [VALUES=11(0),7(3),8(6),3(9),5(12),5(15),5(18)] N1
    &       [VALUES=0,3(3),6(6),12(9),12(12),4(15),16(18)] N2
    PRINT   [ORIENT=across] N1,N2; DECIMALS=0; FIELD=6
    CAPTION !T('The Kolmogorov-Smirnov test is used to test whether the two',\
            'sets of results come from the same underlying distribution.')
    KOLMOG2 [PRINT=test,differences,ranks] Y1=N1; Y2=N2; R1=RN1; R2=RN2;\
            STATISTIC=M; CHISQUARE=Chi2; DIFFERENCES=Diffs
    PRINT   M,Chi2
    &       [ORIENT=across] RN1,RN2,Diffs; DECIMALS=1; FIELD=6