Performs a random permutation test for a two-dimensional contingency table (L.H. Schmitt, M.C. Hannah & S.J. Welham).

### Options

| `PRINT` = string tokens | Output required (`summary`, `observed`, `expected`); default `summ` |
|---|---|
| `PLOT` = string token | What to plot (`histogram`); default `hist` |
| `METHOD` = string token | Method for calculating chi-square (`pearson`, `maximumlikelihood`); default `pear` |
| `NTIMES` = scalar | Number of permutations to make; default 999 |
| `SEED` = scalar | Seed for the random number generator used to make the permutations; default 0 continues from the previous generation or (if none) initializes the seed automatically |

### Parameters

| `DATA` = tables | Table containing observed data |
|---|---|
| `CHISQUARE` = scalars | Saves the observed chi-square value |
| `CHIPERMUTED` = variates | Saves the chi-square values from the permuted data sets |
| `PROBABILITY` = scalars | Saves the probability value from the test |

### Description

The `CHIPERMTEST` procedure uses a permutation test to calculate the significance probability for a chi-square test of the independence of rows and columns in a two-dimensional contingency table. This provides a nonparametric alternative to the more usual chi-square test of independence (see the `CHISQUARE` procedure). The usual test depends upon the fact that the distribution of its so-called “chi-square” test statistic becomes a chi-square distribution as the numbers of observations become infinite. (Technically, we would say that the distribution is *asymptotically* chi-square.) However, the test is unreliable with smaller numbers, especially when the expected number in any cell of the table is less than five.

The permutation test simulates the random distribution of table values that may occur in tables that have the same overall distribution of numbers over the columns, and over the rows, as in the original table. We can assess the significance of the chi-square statistic that we can calculate from the observed table, by seeing where it lies in the distribution of statistics that we obtain from the permuted data.
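
As a minimal sketch of the comparison described above, the following Python snippet places an observed statistic within a set of permuted statistics. The function name and the sample values are hypothetical, and counting the observed statistic as one member of the reference distribution is an assumption: this text does not state exactly how `CHIPERMTEST` forms its probability.

```python
def permutation_p_value(observed_stat, permuted_stats):
    """Share of statistics at least as large as the observed one.

    Assumption: the observed statistic is counted as one member of the
    reference distribution, hence the +1 in numerator and denominator.
    """
    at_least_as_large = sum(1 for s in permuted_stats if s >= observed_stat)
    return (at_least_as_large + 1) / (len(permuted_stats) + 1)


# Hypothetical chi-square values from nine permuted tables.
permuted = [0.4, 1.2, 2.9, 0.7, 5.1, 1.8, 0.2, 3.3, 0.9]
p_value = permutation_p_value(3.0, permuted)
print(p_value)
```

Under this convention the smallest attainable probability with the default `NTIMES=999` would be 1/1000.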

The `NTIMES` option specifies how many permutations are done (default 999). The `SEED` option supplies the seed that is used in the `RANDOMIZE` directive to generate the permutations. The default of zero continues the existing sequence of random numbers if `RANDOMIZE` has already been used in the current Genstat job. If `RANDOMIZE` has not yet been used, Genstat picks a seed at random.

The `DATA` parameter supplies the observed data values, in a table with two classifying factors. The `CHISQUARE` parameter can save the chi-square statistic calculated from the `DATA` table (in a scalar). The `CHIPERMUTED` parameter can save the chi-square statistics calculated from the permuted data sets (in a variate), and the `PROBABILITY` parameter can save the significance probability from the permutation test (in a scalar).

The `PRINT` option controls the output, with the following settings:

| `summary` | prints a summary, containing the chi-square statistic, the minimum and maximum statistics calculated from the permuted data sets, and the probability (default); |
|---|---|
| `observed` | prints the `DATA` table; and |
| `expected` | prints the expected values for tables with the same overall distribution of numbers over rows and over columns, but no interaction between the row and column factors (i.e. in a table where the rows and columns are independent). |
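
The expected values mentioned for the `expected` setting follow from the row and column totals alone. A minimal Python sketch, using a hypothetical 2 × 2 table and a hypothetical helper name (this is an illustration of the standard calculation, not Genstat code):

```python
def expected_counts(table):
    """Expected cell counts under independence: row total x column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical observed counts (rows by columns).
observed = [[12, 3], [5, 10]]
expected = expected_counts(observed)
for row in expected:
    print(row)
```

The expected table has the same row and column totals as the observed table, which is the sense in which the two share the same overall distribution of numbers.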

By default, `CHIPERMTEST` plots a histogram showing the distribution of statistics obtained from the permuted data sets, with the chi-square statistic from the observed data superimposed as a vertical line. You can suppress this by setting option `PLOT=*`.

The `METHOD` option controls how the chi-square statistic is calculated. The default is to use the usual Pearson approximation (see the *Method* section), but you can set `METHOD=maximumlikelihood` to calculate it by maximum likelihood instead (using the Genstat facilities for generalized linear models).

Options: `PRINT`, `PLOT`, `METHOD`, `NTIMES`, `SEED`.

Parameters: `DATA`, `CHISQUARE`, `CHIPERMUTED`, `PROBABILITY`.

### Method

The Pearson statistic is calculated as

chi-square = sum( (*o*–*e*) × (*o*–*e*) / *e* ),

where *o* = observed, and *e* = expected. The alternative, maximum-likelihood method takes the deviance from fitting a generalized linear model with a log link and a Poisson distribution.
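
Both statistics can be sketched in a few lines of Python. The function names and the sample table are hypothetical. The deviance is written in its closed form for the independence model, 2 × sum( *o* × log(*o*/*e*) ), rather than by fitting a generalized linear model as the procedure does, and cells with *o* = 0 are taken to contribute zero.

```python
import math


def expected_counts(table):
    """Expected cell counts under independence of rows and columns."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def paired_cells(table):
    """Yield (observed, expected) pairs for every cell of the table."""
    for obs_row, exp_row in zip(table, expected_counts(table)):
        yield from zip(obs_row, exp_row)


def pearson_chi_square(table):
    """Pearson statistic: sum of (o - e)^2 / e over all cells."""
    return sum((o - e) ** 2 / e for o, e in paired_cells(table))


def deviance_chi_square(table):
    """Poisson log-linear deviance for independence: 2 * sum of o * log(o / e)."""
    return 2.0 * sum(o * math.log(o / e) for o, e in paired_cells(table) if o > 0)


# Hypothetical observed counts (rows by columns).
observed = [[12, 3], [5, 10]]
pearson = pearson_chi_square(observed)
deviance = deviance_chi_square(observed)
print(round(pearson, 4), round(deviance, 4))
```

The two statistics are usually close for large counts and can differ noticeably in sparse tables, which is one reason a choice of `METHOD` is offered.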

The permutations are constructed using the method of Roff & Bentzen (1989).
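
One common way to build such permutations is to expand the table into individual observations, shuffle the column labels against the row labels, and tabulate again. This keeps every row and column total fixed. The Python sketch below illustrates that idea only: the function name and the sample table are hypothetical, and it should not be read as the exact algorithm used inside `CHIPERMTEST`.

```python
import random


def permute_table(table, rng):
    """Return a random table with the same row and column totals as `table`."""
    n_cols = len(table[0])
    row_labels = []
    col_labels = []
    # Expand each cell count into that many (row, column) observations.
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            row_labels.extend([i] * count)
            col_labels.extend([j] * count)
    # Break any row/column association by shuffling one set of labels.
    rng.shuffle(col_labels)
    # Tabulate the shuffled observations back into a table.
    permuted = [[0] * n_cols for _ in table]
    for i, j in zip(row_labels, col_labels):
        permuted[i][j] += 1
    return permuted


rng = random.Random(12345)  # arbitrary seed, fixed for reproducibility
observed = [[12, 3], [5, 10]]  # hypothetical counts
shuffled = permute_table(observed, rng)
print(shuffled)
```

Repeating this step `NTIMES` times and computing the chi-square statistic for each shuffled table gives the reference distribution against which the observed statistic is judged.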

### Reference

Roff, D.A. & Bentzen, P. (1989). The statistical analysis of mitochondrial DNA polymorphisms: χ² and the problem of small samples. *Mol. Biol. Evol.*, 6, 539-545.

### See also

Procedure: `CHISQUARE`.

Commands for: Basic and nonparametric statistics, Regression analysis.

### Example

    CAPTION 'CHIPERMTEST example','Data from Roff & Bentzen (1989)';\
            STYLE=meta,plain
    FACTOR [LEVELS=14; LABELS=!t(A,B,C,D,E,F,G,H,I,J,K,L,M,N)] River
    FACTOR [LEVELS=2] Gene2
    TABLE [CLASSIFICATION=River,Gene2; VALUES=\
          13,16,8,10,8,5,11,6,9,11,12,10,11,8,\
          17,4,10,1,12,7,6,4,12,5,16,5,7,0] B2
    CHISQUARE B2
    CHIPERMTEST [PRINT=summary,observed,expected; SEED=301453] B2