Calculates the sample sizes for the Mann-Whitney test (R.W. Payne).
|What to print (
||Significance level at which the test is to be made; default 0.05|
||The required power (i.e. probability of detection) of the test; default 0.9|
||Whether a one- or two-sided test is to be made (
||Ratio of replication sample2:sample1 (i.e. the size of sample 2 should be
||Sample sizes for which to calculate and print or save the power; default
||Probabilities under null hypothesis|
||Odds ratio for test group vs. control|
||Saves the required sample size|
||Sample sizes for which powers have been calculated|
||Power (i.e. probability of detection) for the various numbers of replicates|
The Mann-Whitney U test is a nonparametric test for differences in location between two samples (see procedure
MANNWHITNEY). This procedure,
SMANNWHITNEY, allows you to calculate the sample sizes required for the test, provided you can supply some information about the probability distributions from which the samples are likely to be generated. For simplicity, the data are assumed to be classified into ordered categories. These may be natural categories (such as “very good”, “good”, “moderate” and “poor”) or they may be formed by splitting a continuous scale into intervals (e.g. “under 18”, “18-25”, “25-40”, “40-60” and “over 60”). You then use the
NULLPROBABILITIES parameter to specify a variate containing the probability value for each category. This indicates the probability distribution which you feel would generate the data of both samples under the null hypothesis. The accuracy of the subsequent calculations will depend on how many categories you take for a continuous variate. However, Whitehead (1993) suggests that there is little to gain in taking more than five.
To assess the power of the test, you next need to indicate how small a difference between the sample distributions the test should be able to detect. The assumption now is that there will be a control sample, with probability distribution as supplied, and a test sample for which the distribution is shifted by multiplying the odds (i.e. p/(1-p)) of the cumulative distribution by a constant amount. (This corresponds to the proportional-odds model of McCullagh 1980.) This constant is supplied by the
ODDSRATIO parameter. An example, with odds ratio 2, is shown below.
|Null hypothesis||Alternative hypothesis|
|probability||cumulative probability||odds||probability||cumulative probability||odds|
The cumulative probabilities are produced as part of the information generated by setting the
power. So you can evaluate possible ratios to check that they generate plausible distributions.
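The shift described above can be sketched outside Genstat. The following Python fragment is an illustration only, not part of SMANNWHITNEY: the function name `shift_by_odds_ratio` is hypothetical, the null probabilities are borrowed from the example at the end of this page, and the odds ratio of 2 is the one mentioned in the text. It multiplies the odds of each cumulative probability by the odds ratio and converts the result back to category probabilities, so you can judge whether a candidate ratio gives a plausible alternative distribution.

```python
def shift_by_odds_ratio(null_probs, odds_ratio):
    """Shift an ordered-category distribution under a proportional-odds model.

    Returns (cumulative, shifted_cumulative, shifted_probs).
    """
    # Cumulative probabilities Q_k of the null (control) distribution
    cumulative = []
    total = 0.0
    for p in null_probs:
        total += p
        cumulative.append(total)
    cumulative[-1] = 1.0  # guard against rounding in the final category

    # Multiply the odds Q/(1-Q) by the odds ratio and convert back:
    # Q' = OR*Q / (1 - Q + OR*Q); this form also handles Q = 1 safely
    shifted_cum = [odds_ratio * q / (1.0 - q + odds_ratio * q) for q in cumulative]

    # Difference the shifted cumulative probabilities to get category probabilities
    shifted_probs = [shifted_cum[0]] + [
        shifted_cum[i] - shifted_cum[i - 1] for i in range(1, len(shifted_cum))
    ]
    return cumulative, shifted_cum, shifted_probs


null_probs = [0.2, 0.5, 0.2, 0.1]  # assumed control distribution
cum, alt_cum, alt_probs = shift_by_odds_ratio(null_probs, 2.0)
print("null p  null cum  alt cum  alt p")
for p0, q0, q1, p1 in zip(null_probs, cum, alt_cum, alt_probs):
    print(f"{p0:6.3f}  {q0:8.3f}  {q1:7.3f}  {p1:5.3f}")
```

With an odds ratio above 1, probability moves towards the earlier categories, so the categories should be ordered so that this is the direction of change you want to detect.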
By default the calculations are done for a one-sided test, but you can set option
TMETHOD=twosided for a two-sided test instead. The significance level for the test is specified by the
PROBABILITY option (default 0.05 i.e. 5%). The required probability for detection of the change (that is, the power of the test) is specified by the
POWER option (default 0.9). It is generally assumed that the sizes of the samples in the two-sample test should be equal. However, you can set the
RATIOREPLICATION option to a scalar,
R say, to indicate that the size of the second sample should be
R times the size of the first sample. The sample size can be saved using the
||to print the required number of replicates in each sample (i.e. the size of each sample);|
||to print a table giving the power (i.e. probability of detection) provided by a range of numbers of replicates.|
By default both are printed.
The replications and corresponding powers can also be saved, in variates, using the
VPOWER parameters. The
REPLICATION option can specify the replication values for which to calculate and print or save the power; if this is not set, the default is to take 11 replication values centred around the required number of replicates.
The method is based on the equations given by Whitehead (1993), except that the Genstat implementation omits the approximation of taking n/(n+1) as equal to one.
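For orientation, the approximate formula published by Whitehead (1993) for equal-sized samples can be sketched as below. This is a Python illustration under stated assumptions, not the Genstat code: the function name and arguments are hypothetical, only equal allocation is covered, and it uses the n/(n+1) = 1 approximation that SMANNWHITNEY avoids, so its answers will differ slightly from those of the procedure. The total sample size is taken as 12(z_alpha + z_beta)^2 / (theta^2 (1 - sum of pbar_i^3)), where theta is the log odds ratio and pbar_i is the mean of the control and test probabilities in category i.

```python
from math import exp, log
from statistics import NormalDist


def whitehead_total_sample_size(control_probs, odds_ratio, alpha=0.05,
                                power=0.9, two_sided=False):
    """Approximate total sample size (both groups, equal allocation)."""
    theta = log(odds_ratio)

    # Test-group distribution under the proportional-odds shift
    test_probs = []
    cum = 0.0
    prev_shifted = 0.0
    for i, p in enumerate(control_probs):
        cum = 1.0 if i == len(control_probs) - 1 else cum + p
        shifted = odds_ratio * cum / (1.0 - cum + odds_ratio * cum)
        test_probs.append(shifted - prev_shifted)
        prev_shifted = shifted

    # Mean probability per category across the two groups
    mean_probs = [(pc + pt) / 2.0 for pc, pt in zip(control_probs, test_probs)]
    spread = 1.0 - sum(p ** 3 for p in mean_probs)

    # Normal quantiles for the significance level and the power
    normal = NormalDist()
    tail = alpha / 2.0 if two_sided else alpha
    z_alpha = normal.inv_cdf(1.0 - tail)
    z_beta = normal.inv_cdf(power)

    return 12.0 * (z_alpha + z_beta) ** 2 / (theta ** 2 * spread)


# Settings taken from the example at the end of this page
total = whitehead_total_sample_size([0.2, 0.5, 0.2, 0.1], exp(0.887),
                                    two_sided=True)
print(f"approximate total sample size: {total:.1f}")
```

For these settings the approximation gives a total of about 187, i.e. roughly 94 per sample; the figure printed by SMANNWHITNEY is expected to differ slightly because of the more exact equation.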
McCullagh, P. (1980). Regression models for ordinal data. Journal of the Royal Statistical Society Series B, 42, 109-142.
Whitehead, J. (1993). Sample size calculations for ordered categorical data. Statistics in Medicine, 12, 2257-2271.
Commands for: Design of experiments.
CAPTION 'SMANNWHITNEY example',\
        !t('Example 2 of Whitehead (1993), but note that results below',\
        'differ slightly due to the use here of a more exact equation.');\
        STYLE=meta,plain
VARIATE [VALUES=0.2,0.5,0.2,0.1] Controlprob
SMANNWHITNEY [TMETHOD=twosided] Controlprob; ODDSRATIO=EXP(0.887)