WADLEY procedure

Fits models for Wadley’s problem, allowing alternative links and errors (D.M. Smith).

Options

`PRINT` = string tokens	Controls printed output (`deviance`, `estimates`, `correlations`, `monitoring`); default `devi`, `esti`
`DISTRIBUTION` = string token	Distribution of the response variate (`poisson`, `negativebinomial`, `qlnegativebinomial`, `qlscaledpoisson`); default `pois`
`LINK` = string token	Link transformation (`logit`, `probit`, `complementaryloglog`, `cauchit`); default `logi`
`TERMS` = formula	Model to be fitted
`CONTROL` = factor	Factor to distinguish the control, or zero, dose (level 1) from the other treatments (level 2)
`MAXIMAL` = factor	Factor to define the maximal model i.e. with a level for every combination of values of the variates and factors in `TERMS`
`RMETHOD` = string token	Type of residuals to be formed (`deviance`, `Pearson`); default `devi`

Parameters

`Y` = variates	Response variate for each fit
`RESIDUALS` = variates	Variate to save the residuals from each fit
`FITTEDVALUES` = variates	Variate to save the fitted values from each fit

Description

WADLEY uses the generalized linear models methodology of composite link functions to fit a range of models for the situation known as Wadley’s problem. This arises in bioassay where it is possible to count only the number of subjects that have not responded to a particular dose of a drug or stimulus. For example, with eggs of insects fumigated in grain, it is generally possible to count only those that survive and hatch.

By default, the analysis assumes that the numbers of subjects that are treated in each observation follow a Poisson distribution with a common mean parameter; other distributions can be specified using the DISTRIBUTION option or, for user-defined distributions, by providing subsidiary procedure WADDISTRIBUTION (see details of the procedures called by WADLEY).

The analysis estimates the mean of the distribution, and then fits the dose response curve as in an ordinary probit analysis. The LINK option defines the transformation (logit, probit, cauchit, or complementary log-log) required to make the model additive. User-defined transformations can also be specified, by leaving LINK unset and providing subsidiary procedure WADLINK to calculate the necessary fitted values and derivatives, and WADINITIAL to calculate initial values for the linear predictor (see details of the procedures called by WADLEY). The model to be fitted is defined by the TERMS option.
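Read alongside the logit WADLINK listing under Method, the default model can be sketched as below. The symbols μ (mean number of subjects treated), π (proportion not responding), η (linear predictor), x and β are introduced here for illustration only. The form of the fitted value, μπ, is an assumption inferred from the TA and TB definitions rather than a statement taken from the procedure source.

```latex
\begin{aligned}
Y_i &\sim \mathrm{Poisson}(\mu\,\pi_i), \\
\pi_i &=
\begin{cases}
1 & \text{control observation,} \\[4pt]
\dfrac{1}{1+\exp(\eta_i)} & \text{treated observation (logit link),}
\end{cases} \\
\eta_i &= x_i^{\top}\beta .
\end{aligned}
```

Under this reading, the control observations carry the information about μ, while the treated observations determine the dose-response parameters β.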

To assist the estimation of the expected total number of subjects, there must be some control observations – for example with zero doses of fumigant. These must be identified by a factor, specified by the CONTROL option, with level 1 for untreated and level 2 for treated. The comparison between the treated and untreated levels of CONTROL must not be aliased with any of the variates and factors in TERMS. (Thus if, for example, TERMS contained a factor representing different types of drug, this must not have a separate level for the untreated observations.)
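The coding of the control factor can be illustrated outside Genstat. The following Python sketch mirrors the two CALCULATE statements used in the Example (Control = 1+(Dose>0) and LDose = LOG(Dose + (Dose==0))); the function name code_controls is hypothetical and chosen only for this illustration.

```python
import math

def code_controls(doses):
    """Return (control_levels, log_doses) for a list of doses.

    Level 1 marks an untreated (zero-dose) observation, level 2 a treated one.
    A zero dose is replaced by 1 before taking logs, so its log dose is 0
    rather than log(0).
    """
    control = [1 + (d > 0) for d in doses]           # 1 = untreated, 2 = treated
    ldose = [math.log(d + (d == 0)) for d in doses]  # log(1) = 0 for controls
    return control, ldose

control, ldose = code_controls([0, 0, 1, 5, 10, 50])
```

Because the controls are separated by the CONTROL factor, the value of the log dose for those observations serves only as a placeholder.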

Often with this sort of data, the variability exceeds that expected from the distribution assumed for the data. To estimate the amount of overdispersion, the MAXIMAL option must be set to a factor with a different level for every combination of values of the factors and variates in the TERMS model.
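One way to construct such a factor is to number the distinct combinations of the explanatory values in order of first appearance. The Python sketch below is illustrative only: the helper maximal_levels is hypothetical and the data are made up. In the Example, the corresponding factor Full has 13 levels, one for the controls and one for each of the 12 combinations of group and dose.

```python
def maximal_levels(*columns):
    """Assign one level number to each distinct combination of column values.

    Levels are numbered from 1 in order of first appearance.
    """
    seen = {}
    levels = []
    for combo in zip(*columns):
        if combo not in seen:
            seen[combo] = len(seen) + 1
        levels.append(seen[combo])
    return levels

# Made-up illustration: two controls assigned to group 1, then treated units.
group = [1, 1, 1, 1, 2, 2, 3, 3]
dose = [0, 0, 1, 5, 1, 5, 1, 1]
full = maximal_levels(group, dose)
```

Replicate observations with the same combination share a level, and it is this replication that allows the overdispersion to be estimated.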

Options: PRINT, DISTRIBUTION, LINK, TERMS, CONTROL, MAXIMAL, RMETHOD.

Parameters: Y, RESIDUALS, FITTEDVALUES.

Method

WADLEY is essentially a specific application of composite link functions in generalized linear models. The methods used are those of the Genstat procedure GLM (Lane 1989) and the GLIM macros of Smith & Morgan (1989). The procedure is very similar in spirit to these GLIM macros, and Smith & Morgan (1989) should be consulted for further information. There are, however, some extensions. The capability to handle user-defined links and distributions has been added. The range of distributions has also been extended to include two forms of quasi-likelihood: one where the weighting is of negative binomial form (weight=1/(1+hf×fittedvalues)), and one where the weighting is of scaled Poisson form (weight=1/hf), where hf is the heterogeneity factor. If the estimated heterogeneity factor is less than zero in the negative binomial cases, or less than one in the scaled Poisson case, it is set to zero or one respectively.
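The two weighting forms and the truncation of the heterogeneity factor can be sketched numerically. This Python fragment is an illustration of the formulas quoted above, not the procedure's own code; the function names and the string labels for the two forms are assumptions made for the example.

```python
def truncate_hf(hf, form):
    """Apply the lower bound on the heterogeneity factor described above."""
    if form == "negative binomial":
        return max(hf, 0.0)   # values below zero are set to zero
    if form == "scaled Poisson":
        return max(hf, 1.0)   # values below one are set to one
    raise ValueError("unknown weighting form: " + form)

def ql_weights(fitted, hf, form):
    """Quasi-likelihood weights for the fitted values."""
    hf = truncate_hf(hf, form)
    if form == "negative binomial":
        return [1.0 / (1.0 + hf * f) for f in fitted]
    return [1.0 / hf for _ in fitted]

weights = ql_weights([10.0, 20.0], 0.1, "negative binomial")
```

With the negative binomial form the weight decreases as the fitted value increases, whereas the scaled Poisson form gives every observation the same weight.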

WADLEY has two subsidiary procedures, WADCODI and WADFIT, to assist with the analysis; neither of these need be modified by the user:

WADCODI prints the results of the iterative processes;

WADFIT performs the iterative model fits.

There are also three other procedures, which can be rewritten or replaced, to cater for further user-defined distributions and links:

WADDISTRIBUTION calculates the variance function and deviance for a user-defined distribution;

WADINITIAL calculates initial estimates of the linear predictor for a user-defined link;

WADLINK calculates the fitted values and derivatives for a user-defined link.

If the DISTRIBUTION option is unset, the procedure will call WADDISTRIBUTION instead of using one of the standard distributions. For a Poisson error distribution, WADDISTRIBUTION should be defined as follows.

PROCEDURE 'WADDISTRIBUTION'
"Calculation of variance function and deviance"
PARAMETER 'Y',       "Input: variate; response variate"\
          'FITTED',  "Input: variate; fitted values"\
          'VARIANCE',"Output: variate; variance"\
          'LL',      "Output: variate; log likelihood variate"\
          'DEVIANCE';"Output: scalar; total deviance"\
          MODE=p
SCALAR    two; VALUE=2
CALCULATE VARIANCE = FITTED
 &        LL = Y*LOG(Y/FITTED)-Y+FITTED
 &        DEVIANCE = two*SUM(LL)
ENDPROC

For other error distributions only the three CALCULATE statements need to be changed.
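The three Poisson calculations can be checked numerically outside Genstat. The Python sketch below follows the CALCULATE statements in the listing; the function name is hypothetical, and the treatment of a zero count (taking Y*LOG(Y/FITTED) as 0) is an assumption of this sketch rather than documented behaviour of the procedure.

```python
import math

def wad_distribution_poisson(y, fitted):
    """Return (variance, ll, deviance) for Poisson errors."""
    variance = list(fitted)  # Poisson variance function: equal to the mean
    ll = []
    for yi, fi in zip(y, fitted):
        # Assumption: y*log(y/fitted) is taken as 0 when y is 0.
        term = yi * math.log(yi / fi) if yi > 0 else 0.0
        ll.append(term - yi + fi)
    deviance = 2.0 * sum(ll)
    return variance, ll, deviance

variance, ll, deviance = wad_distribution_poisson([10.0, 0.0], [5.0, 2.0])
```

Each element of ll is half the deviance contribution of one observation, so the deviance is zero when every fitted value equals its observed count.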

Similarly, if the LINK option is unset, WADINITIAL and WADLINK will be called. For a logit link, WADINITIAL would be defined as follows.

PROCEDURE 'WADINITIAL'
"Calculation of initial estimates of linear predictor"
PARAMETER 'Y',   "Input: variate; response variate"\
          'LP',  "Output: variate; linear predictor"\
          'IND', "Input: variate; marker variate with value 1
                  for a control observation, 0 otherwise"\
          'MAXY';"Inout: scalar; estimate of asymptote"\
          MODE=p
SCALAR    half,one; VALUE=0.5,1
CALCULATE LP = IND*LOG(MAXY/(Y+half)-one)
ENDPROC

For other links only the CALCULATE statement need be changed so, for example, a probit link would require the statement

CALCULATE LP = IND*NED(one-(Y+one)/MAXY)
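The logit starting values can be understood as the inverse of the logit function applied to the observed proportion (Y+0.5)/MAXY. The Python sketch below illustrates this; the function name is hypothetical, and the logarithm is evaluated only where the marker is non-zero, which is an assumption of the sketch made so that counts at or above MAXY cannot give the logarithm of a non-positive number.

```python
import math

def initial_lp_logit(y, ind, maxy):
    """Initial linear predictor: ind * log(maxy/(y + 0.5) - 1)."""
    lp = []
    for yi, i in zip(y, ind):
        # Where the marker is 0 the result is 0, so the log is skipped.
        lp.append(math.log(maxy / (yi + 0.5) - 1.0) if i else 0.0)
    return lp

maxy = 220.0
y = [150.0, 100.0, 230.0]
lp = initial_lp_logit(y, [1, 1, 0], maxy)

# Back-transform: 1/(1 + exp(lp)) recovers (y + 0.5)/maxy where the marker is 1.
recovered = [1.0 / (1.0 + math.exp(v)) for v in lp[:2]]
```

The addition of 0.5 keeps the argument of the logarithm finite when a count is zero.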

For a logit link WADLINK would be

PROCEDURE 'WADLINK'
"Calculation of fitted values and derivatives
 of the link function given the linear predictor"
PARAMETER 'LP',  "Input: variate; linear predictor"\
          'IND', "Input: variate; marker variate with value 1
                  for a control observation, 0 otherwise"\
          'TA',  "Output: variate; estimate of fitted values"\
          'TB',  "Output: variate; estimate of derivatives"\
          'MAXY';"Input: scalar; estimate of asymptote"\
          MODE=p
SCALAR    half,one; VALUE=0.5,1
CALCULATE TA = (.NOT.IND)+IND/(one+EXP(LP))
 &        TB = MAXY*EXP(LP)*TA*TA
ENDPROC

For other links only the CALCULATE statements need to be changed so, for example, a probit link would require

CALCULATE TA = (.NOT.IND)+IND*(one-NORMAL(LP))
 &        TB = MAXY*EXP(-half*LP*LP)/ROOT2PI

where ROOT2PI is a scalar holding the square root of 2π. The marker variate IND distinguishes the control from the non-control data, so TA should always be of the form

TA = (.NOT.IND)+IND*function

where function is the link function for the non-control part of the data. The variate TB should always be of the form

TB = MAXY*deriv_fn

where deriv_fn is the derivative of the link function with respect to the linear predictor (LP).
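A user-defined pair of TA and TB expressions can be checked by comparing TB with a numerical derivative. The Python sketch below does this for the logit and probit forms; all function names are hypothetical. It takes the probit function as 1 − Φ(LP), in line with the general form TA = (.NOT.IND)+IND*function, and compares magnitudes only, since both functions decrease as LP increases while the TB expressions are positive. How the sign is handled inside the fitting procedure is not assumed here.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def wadlink_logit(lp, ind, maxy):
    # With ind coded 0/1, (.NOT.IND) corresponds to 1 - ind.
    ta = [(1 - i) + i / (1.0 + math.exp(x)) for x, i in zip(lp, ind)]
    tb = [maxy * math.exp(x) * t * t for x, t in zip(lp, ta)]
    return ta, tb

def wadlink_probit(lp, ind, maxy):
    ta = [(1 - i) + i * (1.0 - normal_cdf(x)) for x, i in zip(lp, ind)]
    tb = [maxy * math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi) for x in lp]
    return ta, tb

def numeric_slope(fn, x, h=1e-6):
    """Central-difference approximation to the derivative of fn at x."""
    return (fn(x + h) - fn(x - h)) / (2.0 * h)

maxy = 200.0
lp = [0.3, -1.2]
ind = [1, 1]
ta_logit, tb_logit = wadlink_logit(lp, ind, maxy)
ta_probit, tb_probit = wadlink_probit(lp, ind, maxy)

# Magnitudes of the numerical derivatives of maxy*function at each lp value.
slope_logit = [abs(numeric_slope(lambda v: maxy / (1.0 + math.exp(v)), x)) for x in lp]
slope_probit = [abs(numeric_slope(lambda v: maxy * (1.0 - normal_cdf(v)), x)) for x in lp]
```

Agreement between the TB values and the numerical slopes, to within the accuracy of the finite difference, indicates that the derivative expression matches the function used in TA.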

If LINK or DISTRIBUTION is unset, and no user versions of WADINITIAL, WADLINK and WADDISTRIBUTION are supplied, the versions given here (for logit link and Poisson error distribution) will be used.

A debt is owed to Dr J. Parrott of Pfizer Central Research, Sandwich, UK for his support and encouragement of this work.

Action with `RESTRICT`

If the Y-variate is restricted, only the specified subset of the units will be included in the analysis.

References

Lane, P.W. (1989). Procedure GLM. In: Genstat Procedure Library Release 1.3[2] (ed. R.W.Payne & G.M.Arnold), 80-82.

Smith, D.M. & Morgan, B.J.T. (1989). Extended models for Wadley’s problem. GLIM Newsletter, 18, 21-28.

Example

CAPTION   'WADLEY example',\
          'Data from Smith & Morgan, GLIM Newsletter, 18, 1989.';\ 
          STYLE=meta,plain
VARIATE   [NVALUES=70] Dose,Count
READ      Dose,Count
 0 219  0 228  0 202  0 237  0 228  0 204  0 217  0 190  0 224  0 218
 1 167  1 158  1 158  1 175  1 167  5 105  5 123  5 105  5 105  5 105
10  88 10  88 10  61 10  61 10  88 50  61 50  44 50  35 50  35 50  44
 1 166  1 158  1 181  1 143  1 159  5  97  5 112  5  88  5 120  5 103
10  78 10  80 10  75 10  74 10 102 50  49 50  40 50  57 50  51 50  40
 1 160  1 143  1 148  1 135  1 142  5 101  5  81  5  82  5  94  5  74
10  54 10  42 10  52 10  48 10  63 50  32 50  15 50  16 50  19 50  23 :
FACTOR    [LEVELS=2] Control
 &        [LEVELS=3; VAL=30(1),20(2,3)] Group
"  Note:  the Control observations must be assigned to one of the three
          levels of Group as otherwise the model is overparameterised;
          here the ten control observations have been assigned to level 1."
CALCULATE Control= 1+(Dose>0)
 &        LDose  = LOG(Dose + (Dose==0))
CAPTION   !t('Fitting parallel linear regressions in log dose:',\ 
          'logit link and Poisson error.')
WADLEY    [DISTRIBUTION=poisson; LINK=logit; TERMS=Group+LDose;\ 
          CONTROL=Control] Count
CAPTION   'Allow for heterogeneity: quasi-likelihood, scaled Poisson error.'
FACTOR    [LEVELS=13] Full; VALUES=!(10(1),5(2...13))
WADLEY    [DISTRIBUTION=qlscaledpoisson; TERMS=Group+LDose; CONTROL=Control;\
          MAXIMAL=Full] Count

Updated on June 14, 2019
