Calculates the power (probability of detection) for regression models (R.W. Payne).
Options
| PRINT = string token | Prints the power (power); default powe |
|---|---|
| TERMS = formula | Specifies the terms (x-variates, factors or model terms) to be fitted in the analysis when the responses to be detected are specified by the RESPONSE parameter |
| FACTORIAL = scalar | Limit on the number of factors or variates in a model term generated from TERMS; default 3 |
| PROBABILITY = scalar | Significance level at which the response is required to be detected (assuming a one-sided test); default 0.05 |
| TMETHOD = string token | Type of test to be made (onesided, twosided, equivalence, noninferiority, fratio, chisquare); default ones |
| SAVE = rsave | Regression save structure to provide the information about the regression model |
Parameters
| RESPONSE = variates | Variate of fitted values calculated using regression parameters of the size to be detected; default * implies that the information is to be taken from a regression save structure |
|---|---|
| RDF = scalars | Number of residual degrees of freedom; if unset, this is obtained from the analysis of RESPONSE or from the regression save structure |
| RSS = scalars | Anticipated residual sum of squares; if unset, this is obtained from the analysis of RESPONSE or from the regression save structure |
| POWER = scalars or variates | Saves the power |
Description
When planning a regression study, it can be useful to know how likely a response is to be detected. This probability of detection, known as the power of the study with respect to the response of interest, helps to determine whether the study is sufficiently large or accurate to achieve its purpose. RPOWER can consider any of the regression models that Genstat can analyse, and can calculate the power either for the assessment of the whole model (as represented by the regression sum of squares), or for the assessment of individual parameters in the regression model.
To determine the power, you need to define the terms (x-variates, factors or model terms) to be fitted in the regression, and specify the anticipated amount of residual variability. This is most easily done by taking the analysis of a data set similar to the one to be used in the new study. To do this, you should analyse the earlier set of data with the regression directives in the usual way. Provided you do not fit any other regressions in the interim, RPOWER will pick up the information automatically from the save information held within Genstat about the most recent regression analysis. Alternatively, you can save the information explicitly in a regression save structure, by setting the SAVE option of MODEL, and then use this same save structure as the setting of the SAVE option of RPOWER.
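The save-structure route described above might look like the following sketch. The data values and the identifiers Y, rsave and pow are hypothetical, chosen only for illustration, and it is an assumption here that RPOWER can be called with RESPONSE left unset (the parameter description says that its default * takes the information from the save structure).

```genstat
" hypothetical data from an earlier, similar study (illustrative values only) "
VARIATE [VALUES=1,2,5,8,9] X
VARIATE [VALUES=6.1,7.4,16.9,21.8,26.3] Y
" fit the regression, keeping the results in an explicit save structure "
MODEL [SAVE=rsave] Y
FIT X
" take the model and residual variability from that save structure "
RPOWER [PRINT=power; SAVE=rsave] POWER=pow
```

Naming the save structure explicitly means that the call does not depend on this being the most recent regression fitted in the session.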
Using a save structure allows you to specify any regression model, including any nonlinear or generalized linear model. If you merely have an ordinary linear regression model, you can set up the whole process within RPOWER if you prefer. The terms to be fitted in the model can be specified using the TERMS option of RPOWER. The setting can be a list of x-variates or a model formula, as in the setting of the parameter of the FIT directive. The FACTORIAL option, as in FIT, sets a limit on the number of factors or variates in each of the terms generated from a model formula. The constant is included automatically. (So, if you want to omit the constant and fit a regression through the origin, you should specify a save structure instead.) The RESPONSE parameter then supplies a y-variate calculated with regression parameters set to the sizes of responses to be detected. For example, if we have a simple linear regression with x-variate X and wish to be able to detect a regression coefficient of size at least 2.5, we would calculate the response as
response = 2.5 * X
If we also wanted to check that we can detect a constant (or intercept) of size 3, the calculation would become
response = 2.5 * X + 3
RPOWER analyses the RESPONSE variate using the model specified by TERMS in order to obtain the values required to be detected for the various regression parameters.
The anticipated residual sum of squares can be specified by the RSS parameter, and the residual degrees of freedom by the RDF parameter. If these are not set, RPOWER takes the values from the regression save structure (if this is how the model has been specified) or from the analysis of the RESPONSE variate.
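As a sketch of the route that works entirely within RPOWER, the call below sets the model with TERMS and supplies both RSS and RDF explicitly. The x-values, the RSS of 25 and the identifier pow are illustrative; RDF=3 is an assumption matching five observations and two fitted parameters (constant and slope).

```genstat
" suggested x-values for the planned study (illustrative) "
VARIATE [VALUES=1,2,5,8,9] X
" response built from the sizes to be detected: slope 2.5, constant 3 "
CALCULATE response = 2.5 * X + 3
" residual degrees of freedom and sum of squares given explicitly "
RPOWER [PRINT=power; TERMS=X] response; RDF=3; RSS=25; POWER=pow
" with the default one-sided t-tests the powers are saved in a variate "
PRINT pow
```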
The PROBABILITY option specifies the significance level that you intend to use in the analysis to detect a response; the default is 0.05 (i.e. 5%). By default, RPOWER assumes that individual regression parameters are to be assessed by a one-sided t-test, but you can set option TMETHOD=twosided to assess them by a two-sided t-test instead.
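For instance, a stricter significance level combined with a two-sided test could be requested as in this sketch. The 1% level, the x-values and the RSS of 25 are illustrative choices only.

```genstat
VARIATE [VALUES=1,2,5,8,9] X
CALCULATE response = 2.5 * X + 3
" two-sided t-tests at the 1% level instead of the one-sided 5% default "
RPOWER [PRINT=power; TERMS=X; PROBABILITY=0.01; TMETHOD=twosided] \
       response; RSS=25
```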
Other settings of TMETHOD enable you to test individual parameters for equivalence or for non-inferiority. With equivalence (TMETHOD=equivalence), RESPONSE defines a threshold below which the parameter can be assumed to be equivalent to no response. If the future estimate of the parameter is b and the threshold is blim, the null hypothesis for equivalence is that either
b ≤ –blim
or
b ≥ blim
with the alternative hypothesis that they are equivalent, i.e.
–blim < b < blim
With non-inferiority (TMETHOD=noninferiority), the null hypothesis becomes
b ≥ –blim
(which represents a simple one-sided t-test).
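A possible way to set up these tests is sketched below. Here the RESPONSE variate is built from the thresholds (blim) rather than from effect sizes; the threshold values 0.5 and 1, the RSS of 25 and the identifier threshold are hypothetical.

```genstat
VARIATE [VALUES=1,2,5,8,9] X
" thresholds: 0.5 for the slope and 1 for the constant (hypothetical) "
CALCULATE threshold = 0.5 * X + 1
" power of the equivalence test for each parameter "
RPOWER [PRINT=power; TERMS=X; TMETHOD=equivalence] threshold; RSS=25
" power of the non-inferiority test with the same thresholds "
RPOWER [PRINT=power; TERMS=X; TMETHOD=noninferiority] threshold; RSS=25
```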
You can also set TMETHOD=fratio, to assess the power of the F test for the regression in the summary analysis of variance (or deviance); this is an overall test for the whole regression model. Alternatively, if RPOWER is using a save structure from the analysis of a generalized linear model with a non-Normal distribution, you can set TMETHOD=chisquare to assess the power of a chi-square test on the deviance due to the regression model (see Section 3.5 of Part 2 of the Guide to the Genstat Command Language).
The POWER parameter can save the power, in a scalar if TMETHOD is set to fratio or chisquare; otherwise in a variate. The powers are printed by default, but you can set option PRINT=* to stop this.
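The following sketch saves the power of the overall F test without printing it from RPOWER, and then prints the saved value separately. The identifier powf and the numerical settings are illustrative.

```genstat
VARIATE [VALUES=1,2,5,8,9] X
CALCULATE response = 2.5 * X + 3
" suppress the output from RPOWER and save the power of the F test "
RPOWER [PRINT=*; TERMS=X; TMETHOD=fratio] response; RSS=25; POWER=powf
" with TMETHOD=fratio the saved power is a scalar "
PRINT powf
```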
Options: PRINT, TERMS, FACTORIAL, PROBABILITY, TMETHOD, SAVE.
Parameters: RESPONSE, RDF, RSS, POWER.
Method
The standard error of the i’th regression parameter is
SQRT( IMAT$[i] * RSS / RDF )
where IMAT$[i] is the value in the i’th diagonal element of the inverse matrix, obtainable using the INVERSE parameter of RKEEP, and RSS / RDF is the residual mean square. The sum of squares (or the deviance) due to the regression and the corresponding number of degrees of freedom are obtainable by using RKEEP to save the total sum of squares and number of degrees of freedom, and those for the residual. The required powers can then be calculated using Genstat’s probability functions for the F, chi-square and t distributions as appropriate.
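As a worked illustration, the quantities entering the t-test powers can be computed by hand for the design in the Example below (x-values 1, 2, 5, 8, 9 and RSS = 25). This is a sketch under standard least-squares theory: the standard error is the square root of the inverse-matrix diagonal element times the residual mean square, RDF = 3 is assumed (five observations, two parameters), and the noncentral t formulation of the one-sided power is the usual textbook one, which the source does not spell out as the exact expression used by RPOWER.

```latex
\begin{align*}
\bar{x} &= 5, \qquad
S_{xx} = \sum_i (x_i - \bar{x})^2 = 16 + 9 + 0 + 9 + 16 = 50, \qquad
\sum_i x_i^2 = 175 \\
% diagonal elements of the inverse matrix for the constant (0) and slope (1)
c_{00} &= \frac{\sum_i x_i^2}{n\,S_{xx}} = \frac{175}{5 \times 50} = 0.7, \qquad
c_{11} = \frac{1}{S_{xx}} = 0.02 \\
% residual mean square
s^2 &= \frac{\mathrm{RSS}}{\mathrm{RDF}} = \frac{25}{3} \approx 8.333 \\
% standard errors of the constant and the slope
\mathrm{se}(b_0) &= \sqrt{0.7 \times 8.333} \approx 2.415, \qquad
\mathrm{se}(b_1) = \sqrt{0.02 \times 8.333} \approx 0.408 \\
% noncentrality parameters for the sizes to be detected (3 and 2.5)
\delta_0 &= \frac{3}{2.415} \approx 1.242, \qquad
\delta_1 = \frac{2.5}{0.408} \approx 6.124 \\
% one-sided power at significance level alpha with nu = RDF
\text{power}_i &= P\bigl( T_{\nu}(\delta_i) > t_{1-\alpha;\,\nu} \bigr),
\qquad \nu = 3,\ \alpha = 0.05
\end{align*}
```

Here T with subscript ν and argument δ denotes a noncentral t variable with ν degrees of freedom and noncentrality δ. The much larger noncentrality for the slope indicates that, with this design and residual variability, the slope of 2.5 is far easier to detect than the constant of 3.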
See also
Commands for: Regression analysis.
Example
CAPTION 'RPOWER example',\
        !t('Simple linear regression to detect a regression',\
        'coefficient of 2.5 and an intercept of 3.'); STYLE=meta,plain
" define the suggested x-values "
VARIATE [VALUES=1,2,5,8,9] X
" calculate the response from the fitted values
  for the parameter values to be detected "
CALCULATE response = 2.5 * X + 3
RPOWER [PRINT=power; TERM=X] response; RSS=25