VSAMPLESIZE procedure

Estimates the replication required to detect a fixed term or contrast in a REML analysis, using parametric bootstrap (R.W. Payne).

Options

PRINT = string tokens Controls printed output (power, replication, monitoring); default powe, repl, moni
TERM = formula Fixed term to be assessed in the analysis
REPLICATES = factor Factor identifying the replication in the design
TRYREPLICATION = variate Replication values to try first; default !(2,4)
MAXREPLICATION = scalar Maximum feasible replication; default * i.e. not defined
FIXED = formula Fixed terms in the analysis; if unset, determined automatically from the most recent VCOMPONENTS
RANDOM = formula Random terms in the analysis; if unset, determined automatically from the most recent VCOMPONENTS
COMPONENTS = variate or scalar Variate of variance components of the random terms; must be set
FACTORIAL = scalar Limit on the number of factors or variates in fixed terms; default 3
PROBABILITY = scalar Significance level at which the term is required to be detected (assuming a one-sided test); default 0.05
POWER = scalar The required power (i.e. probability of detection) of the test; default 0.9
TMETHOD = string token Type of test to be made (fratio, wald, twosided, lessthan, greaterthan, equivalence, noninferiority); default frat
XCONTRASTS = variate X-variate defining a contrast to be detected
CONTRASTTYPE = string token Type of contrast (regression, comparison); default regr
CRITICALVALUE = scalar Supplies a critical value for the test statistic
NBOOT = scalar or variate Number of bootstrap samples to analyse, in a variate with 2 values if there is to be a preliminary search, otherwise in a scalar; default 1000
NRETRIES = scalar or variate Maximum number of extra samples to take when some REML analyses fail to converge, in a variate with 2 values if there is to be a preliminary search, otherwise in a scalar; default NBOOT
SEED = scalar Seed for random number generation; default 0 continues an existing sequence or, if none, selects a seed automatically
METHOD = string token Indicates whether to use the standard Fisher-scoring algorithm or the new AI algorithm with sparse matrix methods (Fisher, AI); default AI
MAXCYCLE = scalar Sets a limit on the number of iterations in the REML analyses; default 30
FMETHOD = string token Controls whether and how to calculate F statistics for fixed terms (automatic, none, algebraic, numerical); default auto
WMETHOD = string token Controls which Wald statistics are saved (add, drop); default add
WORKSPACE = scalar Number of blocks of internal memory to be set up for use by the REML algorithm

Parameters

RESPONSE = scalars or tables Specifies the response to be detected
NREPLICATES = scalars Number of replicates required to detect RESPONSE

Description

When designing an experiment, it is usually possible to vary the replication of the treatments. For example, in a resolvable design, you may be able to include additional (duplicate) replicates. Alternatively, in some situations, it may be possible to improve precision by taking replicate measurements within the basic experimental units: for example, increasing the number of independent samples taken from a field plot. VSAMPLESIZE estimates the replication required to detect a specified response or contrast in a REML analysis.

The FIXED option defines the fixed model for the REML analysis. The RANDOM option specifies the random model. This may also contain the residual term, but that is not essential; if it is not present, the residual is added as the final model term. The COMPONENTS option specifies the variance components for the random terms, including the residual variance (at the end, if the residual had to be added to the RANDOM formula). If either FIXED or RANDOM is not specified, its default is taken from the most recent VCOMPONENTS statement. The FACTORIAL option sets a limit (default 3) on the number of factors or variates in a fixed term; any terms containing more than that number are deleted. VSAMPLESIZE cannot be used for designs whose analysis is to include covariance structures (specified by VSTRUCTURE).

The REPLICATES option must be set to the factor in the random model whose number of levels is to be increased or decreased to change the replication of the treatments. The factors in the fixed and random models must be defined to contain the values for a single replicate of the design. You can set the TRYREPLICATION option to a variate containing the number of replicates to try first. There must be at least two of these. The default is a variate containing the numbers 2 and 4. The MAXREPLICATION option can specify the maximum feasible number of replicates. The NREPLICATES parameter can save the estimate of the number of replicates that is required.

The fixed term to be tested is specified using the TERM option, and the response to be detected is specified by the RESPONSE parameter. This can supply a scalar to specify the maximum difference between the effects of the term, or it can supply a table, to specify the anticipated effects themselves. As an alternative to detecting a difference between its effects, you can ask to detect a contrast. RESPONSE must then supply a scalar, and TERM must be a main effect (that is, it must involve just one factor). The XCONTRASTS option must specify a variate or table containing the coefficients defining the contrast, and the CONTRASTTYPE option indicates whether this is a regression contrast (as specified e.g. by the REG function in ANOVA) or a comparison (as specified e.g. by the COMPARISON function in ANOVA).

The TMETHOD option specifies the type of test that is to be used to assess the term, with the following settings.

    fratio assumes that the term will be tested using its F ratio.
    wald assumes that the term will be tested by a Wald test.
    twosided assumes a two-sided test to assess whether a contrast of the term differs from zero (default).
    lessthan assumes a one-sided test to assess whether a contrast of the term is less than zero.
    greaterthan assumes a one-sided test to assess whether a contrast of the term is greater than zero.
    noninferiority assumes a test to check that a contrast of the term is not significantly less than zero; see Method for more details.
    equivalence assumes a one-sided test to check that a contrast of the term does not differ significantly from zero; see Method for more details.

The settings fratio and wald are not appropriate for contrasts. The default is twosided when there are contrasts, and fratio otherwise. Note: the specified response must be negative when TMETHOD is set to lessthan or noninferiority.

VSAMPLESIZE uses the VPOWER procedure to estimate the power with which the response will be detected for each number of replicates that is tried. VPOWER performs a parametric bootstrap, in which random data variates are generated and analysed by REML to see how often the term’s response is significant.
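The idea of estimating power by parametric bootstrap can be illustrated outside Genstat with a minimal Python sketch. It is not the VPOWER implementation: it assumes a much simpler setting (two treatment means, independent normal errors with a known standard deviation, and a one-sided z-test in place of a REML analysis), and the function name and arguments are hypothetical.

```python
import random
from statistics import NormalDist, mean


def bootstrap_power(response, sigma, n_reps, n_boot=1000, alpha=0.05, seed=1):
    """Estimate the power to detect a difference `response` between two
    treatment means, each observed on `n_reps` units with error s.d. `sigma`.

    Assumption: the variance is treated as known, so a z-test stands in
    for the REML-based test used by the real procedure.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha)   # one-sided critical value
    se = sigma * (2 / n_reps) ** 0.5           # s.e. of a difference of two means
    n_significant = 0
    for _ in range(n_boot):
        # Generate one simulated data set in which the response is present.
        control = [rng.gauss(0.0, sigma) for _ in range(n_reps)]
        treated = [rng.gauss(response, sigma) for _ in range(n_reps)]
        z = (mean(treated) - mean(control)) / se
        if z > z_crit:
            n_significant += 1
    # The power estimate is the proportion of samples with a significant test.
    return n_significant / n_boot


# Power estimates for a few candidate replications (illustrative values only).
for reps in (2, 4, 8, 16):
    print(reps, bootstrap_power(response=2.0, sigma=2.0, n_reps=reps))
```

Increasing `n_boot` reduces the sampling error of the estimate at the cost of run time, which is the same trade-off that the NBOOT option controls.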

The MAXCYCLE option sets a limit on the number of iterations in the REML analyses (default 30). The METHOD option controls whether REML uses the Fisher-scoring algorithm, or the AI algorithm with sparse matrix methods (the default). The WMETHOD option controls whether the Wald and F statistics are obtained from the table where terms are added sequentially (the default), or from the table where suitable terms are dropped from the full fixed model. Note that, if you use the table where terms are dropped, the TERM must not be marginal to any other term in the fixed model: for example, the main effect A cannot be tested if the model contains an interaction, such as A.B. The FMETHOD option controls how to estimate the denominator degrees of freedom for the F tests. (This is relevant if TMETHOD=fratio, or if tests for fixed effects are being printed in the REML analyses of the bootstrap samples.) The WORKSPACE option specifies the number of blocks of internal memory to be set up for use by the REML algorithm.

The NBOOT option specifies the number of bootstrap samples to take. The NRETRIES option specifies the maximum number of extra samples to take when some REML analyses fail to converge. These can be either a scalar, or a variate with one or two values. If two values are supplied, the first is used during an initial search to find a replication value that provides at least enough power. The second is then used for a more precise search. The default for NBOOT is the single value 1000. The default for NRETRIES is to use the same number as specified by NBOOT. The SEED option supplies the seed for the random number generator used to form the samples; default 0 continues from the previous generation or (if none) initializes the seed automatically.
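A two-phase search of this kind can be sketched in Python as follows. The search strategy shown (doubling the replication until the target power is reached, then bisecting) is an assumption made for illustration, as the text above does not specify how VSAMPLESIZE chooses the values it tries. The power estimator is a toy stand-in for VPOWER (one-sided z-test on a difference of two means), and all names are hypothetical.

```python
import random
from statistics import NormalDist


def estimate_power(n_reps, n_boot, rng, response=2.0, sigma=2.0, alpha=0.05):
    """Toy bootstrap power estimate: simulate the estimated difference of two
    means directly and count how often a one-sided z-test is significant."""
    z_crit = NormalDist().inv_cdf(1 - alpha)
    se = sigma * (2 / n_reps) ** 0.5
    hits = sum(rng.gauss(response, se) / se > z_crit for _ in range(n_boot))
    return hits / n_boot


def find_replication(try_reps=(2, 4), max_reps=64, target=0.9,
                     n_boot=(400, 2000), seed=1):
    """Return a replication whose estimated power reaches `target`,
    or None if even `max_reps` falls short in the initial search."""
    rng = random.Random(seed)
    coarse, fine = n_boot          # samples for phase 1 and phase 2
    queue = sorted(try_reps)       # values to try first, in increasing order

    # Phase 1: cheap estimates to bracket the answer between `low`
    # (too little power) and `high` (enough power).
    low = 0
    reps = queue.pop(0)
    while estimate_power(reps, coarse, rng) < target:
        low = reps
        if reps >= max_reps:
            return None            # target not reached within the limit
        # Assumed strategy: use the supplied values, then keep doubling.
        reps = queue.pop(0) if queue else min(2 * reps, max_reps)
    high = reps

    # Phase 2: bisect inside the bracket using the larger number of samples.
    while high - low > 1:
        mid = (low + high) // 2
        if estimate_power(mid, fine, rng) >= target:
            high = mid
        else:
            low = mid
    return high


print(find_replication())
```

Because every power value is itself an estimate, replications close to the target power may be classified differently from one seed to another; using more samples in the second phase narrows that uncertainty.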

The PROBABILITY option specifies the significance level to be used in the test; the default is 0.05, i.e. 5%. The CRITICALVALUE option can supply the critical value to be used in the test. (The VCRITICAL procedure can be used to obtain this, with a similar parametric bootstrap process to that used by VPOWER.) Note: the specified critical value must be negative when TMETHOD is set to lessthan or noninferiority. If CRITICALVALUE is not set, the critical value is obtained in the conventional way, using an F, chi-square or t-distribution, according to the type of test.

The PRINT option controls the printed output, with settings:

    power prints a table giving the estimated power for the numbers of replicates that have been tried in the second phase of the search;
    replication prints the required replication; and
    monitoring prints monitoring information showing the numbers of replicates and corresponding estimated powers obtained during the search.

By default all are printed.

Options: PRINT, TERM, REPLICATES, TRYREPLICATION, MAXREPLICATION, FIXED, RANDOM, COMPONENTS, FACTORIAL, PROBABILITY, POWER, TMETHOD, XCONTRASTS, CONTRASTTYPE, CRITICALVALUE, NBOOT, NRETRIES, SEED, METHOD, MAXCYCLE, FMETHOD, WMETHOD, WORKSPACE.

Parameters: RESPONSE, NREPLICATES.

Method

The power is estimated for each number of replicates in the search, using the VPOWER procedure. This sees how frequently the relevant test would be significant in the analyses of a set of bootstrap samples. The variance-covariance matrix required to generate the samples is formed by the VUVCOVARIANCE procedure.

With an equivalence test, you define a threshold t below which two treatments can be assumed to be equivalent. The contrast c would be the difference between the treatments, and the null hypothesis that the treatments are not equivalent is that either

c ≤ –t

or

c ≥ t

with the alternative hypothesis that they are equivalent, i.e.

–t < c < t

This defines an intersection-union test, in which each component of the null hypothesis must be rejected separately. This implies performing two one-sided t-tests (this is known as a TOST procedure). If the significance level for the full test is to be α, each t-test must have significance level α (see Berger & Hsu 1996).
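The decision rule of a TOST procedure can be written out as a short Python sketch. As a simplifying assumption, it uses a standard normal critical value in place of the t-distribution, so it is only appropriate as an illustration (or when the degrees of freedom are large). The function and argument names are hypothetical.

```python
from statistics import NormalDist


def tost_equivalent(estimate, se, threshold, alpha=0.05):
    """Return True when both one-sided tests reject their null hypothesis,
    i.e. the contrast is declared to lie inside (-threshold, threshold).

    Assumption: normal approximation instead of a t-distribution.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha)   # each test uses level alpha
    # Test 1: null c <= -threshold, alternative c > -threshold.
    reject_lower = (estimate + threshold) / se > z_crit
    # Test 2: null c >= threshold, alternative c < threshold.
    reject_upper = (estimate - threshold) / se < -z_crit
    # Intersection-union test: equivalence needs both rejections.
    return reject_lower and reject_upper


print(tost_equivalent(estimate=0.0, se=1.0, threshold=3.0))
print(tost_equivalent(estimate=2.5, se=0.5, threshold=3.0))
```

In the first call the estimate is well inside the interval relative to its standard error, so both tests reject; in the second the estimate is too close to the upper threshold, so the second test fails to reject and equivalence is not declared.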

With a non-inferiority test, you again define the threshold t for the effect of the new treatment to be inferior to the standard treatment, and a contrast representing the effect of the new treatment minus the effect of the standard treatment. The null hypothesis is

ct

which represents a one-sided “less-than” t-test.

Reference

Berger, R.L. & Hsu, J.C. (1996). Bioequivalence trials, intersection-union tests and equivalence confidence sets. Statistical Science, 11, 283-319.

See also

Directive: REML.

Procedures: ASAMPLESIZE, VPOWER, VCRITICAL, VUVCOVARIANCE.

Commands for: REML analysis of linear mixed models, Design of experiments.

Example

CAPTION  'VSAMPLESIZE example',\
         'Number of sub-samples to take within a 2-replicate lattice design.';\
         STYLE=meta,plain
FACTOR   [NVALUES=50; LEVELS=2] Reps
&        [LEVELS=5] Blocks,Plots
&        [LEVELS=1] Sample
&        [LEVELS=25] Treats
GENERATE Reps,Blocks,Plots,Sample
READ     Treats
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
1 6 11 16 21 2 7 12 17 22 3 8 13 18 23 4 9 14 19 24 5 10 15 20 25 :
VCOMPONENTS [FIXED=Treats] RANDOM=Reps/Blocks/Plots/Sample
" Note: the number of bootstrap samples in phase 2 is set to 2000 to improve
  reliability, and to 400 in phase 1 to speed up the initial search.
  Nevertheless, the example may be time-consuming to run."
VSAMPLESIZE [PRINT=#,monitoring; TERM=Treats; REPLICATES=Sample;\
            MAXREPLICATION=12; COMPONENTS=!(5,15,9,6); NBOOT=!(400,2000);\
            SEED=672817] 20
CAPTION     'Instead take duplicates of the whole design.'; STYLE=plain
FACTOR      [LEVELS=1; VALUES=50(1)] Duplicates
VCOMPONENTS [FIXED=Treats] RANDOM=(Duplicates.Reps)/Blocks/Plots
VSAMPLESIZE [PRINT=#,monitoring; TERM=Treats; REPLICATES=Duplicates;\
            MAXREPLICATION=12; COMPONENTS=!(5,15,15); NBOOT=!(400,2000);\
            SEED=562513] 20
Updated on March 4, 2019
