HGFIXEDMODEL procedure

Defines the fixed model for a hierarchical or double hierarchical generalized linear model (R.W. Payne, Y. Lee, J.A. Nelder & M. Noh).

Options

`DISTRIBUTION` = string token	Distribution of the data (`binomial`, `poisson`, `normal`, `gamma`); default `norm`
`LINK` = string token	Link for the fixed model (`identity`, `logarithm`, `logit`, `reciprocal`, `probit`, `complementaryloglog`); default `iden`
`DISPERSION` = scalar	Value of dispersion parameter in calculation of s.e.s etc; default `*` for `DIST=norm` or `gamm`, and 1 for `DIST=pois` or `bino`
`DLINK` = string token	Link for the dispersion model (`logarithm`, `reciprocal`); default `loga`
`DTERMS` = formula	Dispersion model; default `*` i.e. none
`CONSTANT` = string token	How to treat the constant (`estimate`, `omit`); default `esti`
`FACTORIAL` = scalar	Limit on number of variates and/or factors in a fixed model term; default 3
`WEIGHTS` = variate	Prior weights; default `*` i.e. 1
`OFFSET` = variate	Offset variate; default `*` i.e. none
`DOFFSET` = variate	Offset variate for dispersion model; default `*` i.e. none
`DDISPERSION` = scalar	Dispersion parameter to use in a dispersion model for the residual dispersion parameter phi; default 1
`IDISPERSION` = scalar	Initial value for the residual dispersion parameter phi; default `*` i.e. formed automatically

Parameter

`TERMS` = formula	Fixed model

Description

HGFIXEDMODEL is one of several procedures with the prefix HG, which provide tools for fitting the hierarchical generalized linear models (HGLMs) defined by Lee & Nelder (1996, 2001, 2006) and described by Lee, Nelder & Pawitan (2006). These models extend generalized linear models (GLMs) to include additional random terms in the linear predictor. They include generalized linear mixed models (GLMMs) as a special case, but do not constrain the additional terms to follow a Normal distribution and to have an identity link (as in the GLMM). For example, if the basic generalized linear model is a log-linear model (Poisson distribution and log link), a more appropriate assumption for the additional random terms might be a gamma distribution and a log link.
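As an illustration of the log-linear case just mentioned, the Poisson-gamma HGLM can be written out as below. This is a sketch in the notation of Lee & Nelder rather than anything taken from the procedure itself: the symbols $\mu'_i$ (the mean conditional on the random effect), $u_i$, $v_i$, $x_i$, $\beta$ and $\lambda$ are assumptions introduced here for illustration.

```latex
\begin{align*}
y_i \mid u_i &\sim \text{Poisson}(\mu'_i), \\
\log \mu'_i &= x_i^{\top}\beta + v_i, \qquad v_i = \log u_i, \\
u_i &\sim \text{gamma}, \qquad \mathrm{E}(u_i) = 1, \quad \operatorname{var}(u_i) = \lambda .
\end{align*}
```

The corresponding GLMM would instead take $v_i = u_i$ (identity link) with $u_i \sim \mathrm{N}(0, \lambda)$.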

The role of HGFIXEDMODEL is to specify the fixed model terms in the HGLM, and to define the distribution of the data (this corresponds to the error distribution of a GLM). The fixed model is given by the TERMS parameter. Most of the options operate similarly to those occurring in the directives FIT and MODEL. The link function for the fixed model is defined by the LINK option, and the FACTORIAL option sets a limit on the number of variates and/or factors for a term to be included in the fixed model (default 3). The CONSTANT option indicates whether or not to include a constant term or intercept (by default this is included), and the OFFSET option allows an offset variate to be included. The DISTRIBUTION option defines the distribution of the data, the WEIGHTS option allows you to specify a variate of prior weights, and the DISPERSION option governs how the dispersion parameter is obtained.
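A minimal sketch of how these options might be combined is given below. It uses only the options listed above; the identifiers treatment, dose and logexposure are hypothetical and are assumed to have been declared and given values beforehand.

```genstat
" Sketch only: treatment (factor), dose (variate) and logexposure (variate)
  are hypothetical data structures, assumed to exist already. "
HGFIXEDMODEL [DISTRIBUTION=poisson; LINK=logarithm; CONSTANT=estimate; \
              FACTORIAL=2; OFFSET=logexposure] treatment*dose
```

Here FACTORIAL=2 still admits the treatment.dose interaction, and the offset enters the linear predictor of the fixed model. The random terms and the response variate are supplied afterwards by HGRANDOMMODEL and HGANALYSE, as in the Example section.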

The HGLM methodology also caters for structured dispersion models, in which fixed terms are included in the generalized linear models that are used to estimate the dispersion parameters. Currently these GLMs must have a gamma distribution. The DTERMS option allows you to specify fixed terms for the GLM that estimates the residual dispersion parameter phi. The DLINK option specifies the link to use with the dispersion model, the DOFFSET option allows you to specify an offset variate, and the DDISPERSION option defines the dispersion parameter for the dispersion GLM (default 1). You can also extend the GLM to become an HGLM (thus making the full model a double hierarchical generalized linear model or DHGLM), by using the HGDRANDOMMODEL procedure to add some random terms.
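As a hedged illustration, a structured dispersion model for phi might be requested as follows. The identifiers temperature and batch are hypothetical, and the option settings simply spell out the documented defaults for DLINK and DDISPERSION.

```genstat
" Sketch only: temperature (variate) and batch (factor) are hypothetical.
  The residual dispersion phi is modelled by a gamma GLM with a log link
  and the single fixed term batch. "
HGFIXEDMODEL [DISTRIBUTION=normal; LINK=identity; DTERMS=batch; \
              DLINK=logarithm; DDISPERSION=1] temperature
```

Adding random terms to this dispersion model with HGDRANDOMMODEL, which would give a DHGLM, is not shown here.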

The IDISPERSION option allows you to define an initial value for the residual dispersion parameter phi. Initial values for the dispersion parameters of the additional random terms of the HGLM can be defined using the IDISPERSION parameter of the HGRANDOMMODEL procedure. If you set both of these, the HGANALYSE procedure will then use them to initialize the weights that are involved in the fitting of the augmented mean model; for details see Chapter 6 of Lee, Nelder & Pawitan (2006). The default weights that are formed automatically if either of these is unset are satisfactory in most circumstances, but you may want to try your own initial values if you encounter convergence problems.
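The sketch below shows one way both initial values might be supplied. The identifiers yield, temperature and batch are hypothetical, the starting values 0.5 and 0.2 are arbitrary, and the way the IDISPERSION parameter is written on HGRANDOMMODEL is an assumption based on the usual Genstat parameter syntax; check the HGRANDOMMODEL documentation for the exact form.

```genstat
" Sketch only: yield, temperature and batch are hypothetical, and the
  initial values are arbitrary illustrations, not recommendations. "
HGFIXEDMODEL  [DISTRIBUTION=normal; LINK=identity; IDISPERSION=0.5] temperature
HGRANDOMMODEL [DISTRIBUTION=normal; LINK=identity] batch; IDISPERSION=0.2
HGANALYSE     yield
```

If only one of the two initial values is given, the automatically formed default weights are used instead.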

Options: DISTRIBUTION, LINK, DISPERSION, DLINK, DTERMS, CONSTANT, FACTORIAL, WEIGHTS, OFFSET, DOFFSET, DDISPERSION, IDISPERSION.

Parameter: TERMS.

Method

The information is stored in a workspace G5PL_HG (accessed using the WORKSPACE directive) for later use by HGANALYSE.

References

Lee, Y., & Nelder, J.A. (1996). Hierarchical generalized linear models (with discussion). Journal of the Royal Statistical Society, Series B, 58, 619-678.

Lee, Y., & Nelder, J.A. (2001). Hierarchical generalized linear models: a synthesis of generalised linear models, random-effect models and structured dispersions. Biometrika, 88, 987-1006.

Lee, Y., & Nelder, J.A. (2006). Double hierarchical generalized linear models (with discussion). Applied Statistics, 55, 139-185.

Lee, Y., Nelder, J.A. & Pawitan, Y. (2006). Generalized Linear Models with Random Effects: Unified Analysis via H-likelihood. Chapman & Hall, London.

Example

CAPTION  'HGFIXEDMODEL example',!t(\
         'Number of faults in rolls of fabric of various lengths',\
         '(data from Bissell (1972) Biometrika, 59, 435-441).'),\
         'Fit negative binomial: var(y) = mu + alpha * mu * mu',\
         '(equivalent to Poisson gamma HGLM with saturated random effect).';\
         STYLE=meta,3(plain)
VARIATE  [NVALUES=32] length,faults
READ     length,faults
551  6  651  4  832 17  375  9  715 14  868  8  271  5  630  7
491  7  372  7  645  6  441  8  895 28  458  4  642 10  492  4
543  8  842  9  905 23  542  9  522  6  122  1  657  9  170  4
738  9  371 14  735 17  749 10  495  7  716  3  952  9  417  2  :
CALCULATE     loglength = log(length)
&             loglength = loglength - mean(loglength)
FACTOR        [LEVELS=32; VALUES=1...32] saturated
HGFIXEDMODEL  [DISTRIBUTION=poisson; LINK=log] loglength
HGRANDOMMODEL [DISTRIBUTION=gamma; LINK=log] saturated
HGANALYSE     faults
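
The equivalence stated in the caption can be checked with a short derivation. The notation is an assumption introduced here: $\mu_i = \exp(x_i^{\top}\beta)$ is the marginal mean, $u_i$ is the gamma random effect for observation $i$ (one per observation, since the random term is saturated) with $\mathrm{E}(u_i) = 1$ and $\operatorname{var}(u_i) = \lambda$, and $y_i \mid u_i$ is Poisson with mean $\mu_i u_i$.

```latex
\begin{align*}
\mathrm{E}(y_i \mid u_i) &= \operatorname{var}(y_i \mid u_i) = \mu_i u_i, \\
\mathrm{E}(y_i) &= \mu_i \, \mathrm{E}(u_i) = \mu_i, \\
\operatorname{var}(y_i) &= \mathrm{E}\{\operatorname{var}(y_i \mid u_i)\}
                          + \operatorname{var}\{\mathrm{E}(y_i \mid u_i)\} \\
                        &= \mu_i + \lambda \mu_i^2 .
\end{align*}
```

So the alpha in the caption plays the role of the dispersion parameter $\lambda$ of the gamma random term.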

Updated on February 7, 2023