Does random permutation tests for generalized linear mixed models (R.W. Payne).
Options
PRINT = string tokens | Controls printed output (prwald, criticalwald, ownstatistics, monitoring); default prwa, crit
NTIMES = scalar | Number of permutations to make; default 99
NRETRIES = scalar | Maximum number of extra samples to take when some analyses fail to converge; default NTIMES
BLOCKSTRUCTURE = formula | Model formula defining any blocking to consider during the randomization; default none
EXCLUDE = factors | Factors in the block formula whose levels are not to be randomized
SEED = scalar | Seed for the random number generator used to make the permutations; default 0 continues from the previous generation or (if none) initializes the seed automatically
BINMETHOD = string token | How to permute binomial data (individuals, units); default indi
WMETHOD = string token | Controls which Wald statistics are used (add, drop); default add
OWNMETHOD = string token | Type of test required for own statistics (twosided, greaterthan, lessthan); default twos
CIPROBABILITY = scalar | Probability level for the confidence interval for own statistics; default 0.95
Parameters
GLSAVE = pointers | Save structure of the original analysis from GLMM; default * uses the save structure from the most recent GLMM analysis
WALD = pointers | Saves a pointer with a variate for each of the fixed terms containing the Wald statistics from the permuted data sets
PRWALD = pointers | Saves a pointer with a scalar for each of the fixed terms, containing the test probability obtained from the position of its Wald statistic within those from the permuted data sets
CRITICALWALD = pointers | Saves a pointer with variates for the 5%, 1% and 0.1% significance levels containing the corresponding critical values for the fixed terms, obtained from the quantiles of the Wald statistics from the permuted data sets
NNOTCONVERGED = scalars | Saves the number of permuted data sets whose analyses failed to converge
OWNDATA = pointers | Data required to calculate own statistics
OWNOBSERVEDVALUES = variates | Saves observed values of the own statistics
OWNPROBABILITIES = variates | Saves permutation probabilities for the own statistics
OWNESTIMATES = variates | Saves permutation estimates for the own statistics
OWNSES = variates | Saves permutation standard errors for the own statistics
OWNLOWERCIS = variates | Saves lower values of the permutation confidence intervals for the own statistics
OWNUPPERCIS = variates | Saves upper values of the permutation confidence intervals for the own statistics
OWNSTATISTICS = pointers | Saves the own statistics obtained from the permuted data sets, in a pointer with a variate for each statistic
Description
GLPERMTEST performs random permutation tests for fixed terms in a generalized linear mixed model, analysed by GLMM. A problem with these analyses is that their estimates of the variance components are generally biased downwards, i.e. the estimates tend to be smaller than the true values. The Wald tests also suffer from bias, in that their test probabilities may be too small. You therefore need to be cautious when the probabilities from the tests are close to their critical values, especially when analysing small data sets or data from a binary distribution.
GLPERMTEST uses random permutation tests to provide an alternative way of assessing the fixed terms. It forms random permutations of the response, analyses those data sets, and records their Wald statistics. The distributions of the Wald statistics, under the null hypothesis of no fixed effects, can be estimated by the sets of statistics obtained from the analyses of the permuted data sets. Test probabilities for the original Wald statistics can therefore be estimated by their locations within those sets.
Before using GLPERMTEST, you need to analyse the original data set by GLMM. The GLSAVE parameter supplies the save structure from that analysis. If this is not specified, GLPERMTEST uses the save structure from the most recent GLMM analysis. The save structure provides the settings of all the options and parameters that GLMM used in that analysis. The analyses of the permuted data sets can therefore be done in exactly the same way as the original analysis.
The NTIMES option defines how many random permutations to perform; by default there are 99. The NRETRIES option specifies the maximum number of extra samples to take when some analyses fail to converge; the default is to use the same number as specified by NTIMES. The NNOTCONVERGED parameter can save a scalar containing the number of permuted data sets whose analyses failed to converge. The results may be unreliable if more than a few analyses fail.
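As a rough sketch of these settings, the call below requests more permutations than the default and saves the number of failed analyses. It assumes the save structure glmms created in the example at the end of this page; the identifier nfail and the values 999 and 200 are arbitrary choices for illustration.

```genstat
" 999 permutations, with at most 200 extra samples if some analyses fail to converge "
GLPERMTEST [NTIMES=999; NRETRIES=200] glmms; NNOTCONVERGED=nfail
" check how many permuted data sets failed to converge "
PRINT nfail
```

If nfail is more than a small fraction of NTIMES, the results should be treated with caution, as noted above.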
The SEED option allows you to specify the seed to use for the random-number generator that is used for the randomizations to form the permutations. The default, SEED=0, continues the sequence of random numbers from a previous generation or, if this is the first use of the generator in this run of Genstat, it initializes the seed automatically. If NTIMES exceeds the maximum possible number of permutations for the data, an "exact" test is performed in which every permutation is used once. This is feasible only for small data sets. There are n! (n factorial) permutations of n units: 3!=6, 4!=24, 5!=120, 6!=720, 7!=5040, 8!=40320, and so on.
If the data are from a designed experiment, you may need to use the BLOCKSTRUCTURE option to specify a block model to define how to do the randomization. The EXCLUDE option can then restrict the randomization so that one or more of the factors in the block model is not randomized. See the RANDOMIZE directive for further details.
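For illustration only, a randomized-block layout might be handled as sketched below. The factors Blocks and Plots and the save structure glmmrb are hypothetical: they stand for the blocking factors of your own design and for the save structure from a GLMM analysis of those data.

```genstat
" permute the response among Plots within Blocks, without permuting the Blocks themselves "
GLPERMTEST [NTIMES=99; BLOCKSTRUCTURE=Blocks/Plots; EXCLUDE=Blocks] glmmrb
```

Omitting EXCLUDE would allow the levels of Blocks to be randomized as well.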
The BINMETHOD option controls how the permutations are done for binomial data. The original data set will have contained a set of units, each recording a number of "successes" obtained from an observed number of individuals. The default, and recommended, method is to expand the data set to contain the individuals themselves, and permute these. Alternatively, you can set BINMETHOD=units if you prefer to permute the units as a whole instead.
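A minimal sketch of the alternative setting follows. The save structure glmmbin is hypothetical: it stands for a GLMM analysis of binomial data in which each unit records several individuals (in the example at the end of this page NBINOMIAL=1, so units and individuals coincide there).

```genstat
" permute whole units, keeping each unit's successes and total together "
GLPERMTEST [NTIMES=99; BINMETHOD=units] glmmbin
```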
The WALD parameter can save a pointer with a variate for each of the fixed terms containing the Wald statistics from the analyses of the permuted data sets. Similarly the PRWALD parameter can save a pointer with a scalar for each of the fixed terms, containing the test probability obtained from the position of its Wald statistic within those from the permuted data sets.
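The following sketch shows one way these results might be saved and inspected. It assumes the save structure glmms from the example at the end of this page; wperm and pperm are hypothetical identifiers, and DESCRIBE is used here only as a convenient way to summarize the permutation distribution.

```genstat
GLPERMTEST [NTIMES=99; SEED=161064] glmms; WALD=wperm; PRWALD=pperm
" test probabilities, one scalar per fixed term "
PRINT pperm[]
" summary of the Wald statistics from the permuted data sets, one variate per fixed term "
DESCRIBE wperm[]
```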
You can define your own statistics to be assessed by the test. They are calculated by a procedure _GLPERMownstatistics, which is called by GLPERMTEST following the GLMM analysis of each permuted data set. Its use is shown in the GLPERMTEST example, which can be modified to calculate your own statistics instead. The information required by _GLPERMownstatistics to do the calculations is supplied, in a pointer, by the OWNDATA parameter. The OWNMETHOD option specifies the type of test to be made. The default, twosided, tests whether the statistics differ from zero. The greaterthan setting tests whether they are greater than zero, and the lessthan setting tests whether they are less than zero. Permutation estimates, standard errors and confidence intervals are also calculated. The CIPROBABILITY option specifies the probability for the confidence intervals (default 0.95). The OWNOBSERVEDVALUES parameter can save a variate containing the values of the own statistics from the original data set. The OWNPROBABILITIES parameter can save a variate containing the probabilities from the tests. The OWNESTIMATES parameter can save a variate containing the permutation estimates of the statistics (calculated as the mean of the values obtained from the permuted data sets). The OWNSES parameter can save a variate containing the standard errors of the permutation estimates. The OWNLOWERCIS and OWNUPPERCIS parameters can save variates containing the lower and upper values, respectively, of the confidence intervals. Finally, the OWNSTATISTICS parameter can save the values of the own statistics obtained from the permuted data sets, in a pointer with a variate for each statistic.
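As a sketch of how these parameters fit together, the call below saves the main results for the own statistics. It assumes that _GLPERMownstatistics, Owninfo and glmms have been defined as in the example at the end of this page; obs, pr, lower and upper are hypothetical identifiers, and the 0.9 confidence level is an arbitrary choice.

```genstat
GLPERMTEST [PRINT=ownstatistics; NTIMES=99; SEED=161064; CIPROBABILITY=0.9]\
           glmms; OWNDATA=Owninfo; OWNOBSERVEDVALUES=obs; OWNPROBABILITIES=pr;\
           OWNLOWERCIS=lower; OWNUPPERCIS=upper
" observed values, test probabilities and confidence limits, one row per statistic "
PRINT obs,pr,lower,upper
```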
Output is controlled by the PRINT option, with settings:
prwald          to print probabilities for the fixed terms, estimated from the locations of their Wald statistics within the sets obtained from the permuted data sets;
criticalwald    to print critical values for the Wald statistics, estimated by quantiles within the sets from the permuted data sets;
ownstatistics   to print estimates, standard errors and confidence intervals for the own statistics; and
monitoring      to monitor the progress of the analyses.
The default is to print probabilities and critical values.
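For example, assuming the save structure glmms from the example at the end of this page, the sketch below adds progress monitoring to the default output:

```genstat
" default output (probabilities and critical values) plus progress monitoring "
GLPERMTEST [PRINT=prwald,criticalwald,monitoring; NTIMES=99] glmms
```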
Options: PRINT, NTIMES, NRETRIES, BLOCKSTRUCTURE, EXCLUDE, SEED, BINMETHOD, WMETHOD, OWNMETHOD, CIPROBABILITY.
Parameters: GLSAVE, WALD, PRWALD, CRITICALWALD, NNOTCONVERGED, OWNDATA, OWNOBSERVEDVALUES, OWNPROBABILITIES, OWNESTIMATES, OWNSES, OWNLOWERCIS, OWNUPPERCIS, OWNSTATISTICS.
Method
GLPERMTEST uses RANDOMIZE to perform the permutations, taking account of any block structure of the data. The model is fitted for each data set using GLMM, and GLKEEP is used to save the Wald statistics. The QUANTILES function is used to calculate the critical values.
Action with RESTRICT
GLPERMTEST takes account of any restrictions on any of the y-variates or x-variates or factors in the model.
See also
Procedures: GLDISPLAY, GLKEEP, GLMM, GLPLOT, GLPREDICT, GLRTEST, GLTOBITPOISSON, APERMTEST, RPERMTEST.
Commands for: Regression analysis.
Example
CAPTION     'GLPERMTEST example',\
            !t('Data from McCullagh & Nelder (1989, Table 14.4),',\
            'also see Schall (1991).'); STYLE=meta,plain
FACTOR      [NVALUES=120; LEVELS=20] Female, Male
&           [LEVELS=4; LABELS=!t(RR,RW,WR,WW)] Cross
VARIATE     [NVALUES=120] Mate1
READ        Cross,Male,Female; FREPRESENTATION=labels,2(levels)
RR  1  1  RW 14  1  RR  5  1  RW 11  1  RR  4  1  RW 15  1
RR  5  2  RW 15  2  RR  3  2  RW 13  2  RR  1  2  RW 12  2
RR  2  3  RW 11  3  RR  1  3  RW 14  3  RR  3  3  RW 13  3
RR  4  4  RW 12  4  RR  2  4  RW 15  4  RR  5  4  RW 14  4
RR  3  5  RW 13  5  RR  4  5  RW 12  5  RR  2  5  RW 11  5
RW 19  6  RR  9  6  RW 20  6  RR  7  6  RW 16  6  RR  8  6
RW 18  7  RR  8  7  RW 19  7  RR  9  7  RW 17  7  RR  6  7
RW 16  8  RR  6  8  RW 17  8  RR 10  8  RW 20  8  RR  9  8
RW 20  9  RR  7  9  RW 18  9  RR  6  9  RW 19  9  RR 10  9
RW 17 10  RR 10 10  RW 16 10  RR  8 10  RW 18 10  RR  7 10
WR  9 11  WW 19 11  WR  7 11  WW 20 11  WR 10 11  WW 18 11
WR  7 12  WW 16 12  WR  9 12  WW 17 12  WR  6 12  WW 20 12
WR  8 13  WW 17 13  WR  6 13  WW 19 13  WR  7 13  WW 16 13
WR 10 14  WW 20 14  WR  8 14  WW 18 14  WR  9 14  WW 19 14
WR  6 15  WW 18 15  WR 10 15  WW 16 15  WR  8 15  WW 17 15
WW 15 16  WR  2 16  WW 13 16  WR  4 16  WW 12 16  WR  1 16
WW 14 17  WR  1 17  WW 15 17  WR  2 17  WW 11 17  WR  5 17
WW 11 18  WR  4 18  WW 12 18  WR  5 18  WW 15 18  WR  3 18
WW 13 19  WR  3 19  WW 11 19  WR  1 19  WW 14 19  WR  4 19
WW 12 20  WR  5 20  WW 14 20  WR  3 20  WW 13 20  WR  2 20 :
READ        Mate1
1 1 1 0 1 1  1 1 1 1 1 1
1 0 1 1 1 1  1 1 1 0 1 1
1 1 1 1 1 1  1 1 1 0 1 1
0 0 0 1 0 0  0 1 0 0 1 1
0 0 1 1 1 1  0 0 1 0 1 0
0 1 1 1 0 1  0 0 0 1 0 0
0 0 0 0 0 1  0 1 1 1 0 1
0 1 0 0 0 0  0 0 1 0 0 0
1 1 1 0 1 1  1 0 1 0 1 0
1 1 1 1 1 0  1 0 0 1 1 0 :
GLMM        [PRINT=components,means,backmeans,wald; DISTRIBUTION=binomial;\
            LINK=logit; FIXED=Cross; RANDOM=Female+Male] Mate1; NBINOMIAL=1;\
            GLSAVE=glmms
GLPERMTEST  [NTIMES=99; SEED=161064]
" own statistics: test Cross contrasts "
PROCEDURE   [WORDLENGTH=long] '_GLPERMownstatistics'
  PARAMETER NAME=\
    'DATA',        "(I: pointer) information required to calculate the statistics"\
    'STATISTICS';  "(O: variate) estimated statistics "\
    MODE=p; TYPE='pointer','variate'
  " insert commands to calculate the statistics "
  GLKEEP    DATA[1]; MEANS=means
  VARIATE   vmeans; VALUES=means
  VARIATE   [NVALUES=DATA[2]] STATISTICS
  CALCULATE STATISTICS = vmeans$[DATA[3]] - vmeans$[DATA[4]]
ENDPROCEDURE "_GLPERMownstatistics"
TEXT        [VALUES='RR-RW','WR-WW','RR-WR','RW-WW'] Contrast
POINTER     [VALUES=Cross,Contrast,!(1,3,1,2),!(2,4,3,4)] Owninfo
GLPERMTEST  [PRINT=#,ownstatistics; NTIMES=99; SEED=161064]\
            glmms; OWNDATA=Owninfo