
HGANALYSE procedure

Analyses data using a hierarchical or double hierarchical generalized linear model (R.W. Payne, Y. Lee, J.A. Nelder & M. Noh).

Options

PRINT = string tokens Controls printed output (model, fixedestimates, randomestimates, dispersionestimates, likelihoodstatistics, deviance, waldtests, fittedvalues, monitoring, dhgmonitoring); default mode, fixe, disp, devi, like, moni
LMETHOD = string token Whether to use exact likelihood or extended quasi likelihood to obtain the y-variate and weights for the dispersion model (exact, eql); default exac
SEMETHOD = string token Method to use to calculate the se’s for the dispersion estimates (approximate, profilelikelihood); default appr
DMETHOD = string token Method to use for the adjusted profile likelihood when calculating the likelihood statistics (automatic, choleski, lrv); default auto
EMETHOD = string token Extrapolation method to use (aitken, adjustedaitken); default aitk
MLAPLACEORDER = scalar Order of Laplace approximation to use in the estimation of the mean model (0 or 1); default 0
DLAPLACEORDER = scalar Order of Laplace approximation to use in the estimation of the dispersion components (0, 1 or 2); default 0
MAXCYCLE = scalars Maximum number of iterations of the hierarchical generalized linear model fits, and maximum number of iterations in the fitting of the mean and dispersion models; default 99,50
EXIT = scalar Exit status (0 for success, 1 for failure to converge)
TOLERANCE = scalar Criterion for convergence; default 0.0005
ETOLERANCE = scalar Maximum size of ratio of the original to the new estimates allowed in Aitken extrapolation; default 7.5
GROUPTERM = formula Random term to use as groups when fitting the augmented mean model; default * i.e. none

Parameters

Y = variate Response variate (must be one only)
NBINOMIAL = variate or scalar Total numbers for binomial data
RESIDUALS = variate Saves the residuals
FITTEDVALUES = variate Saves the fitted values
SAVE = pointer Saves details of the analysis for use in subsequent HGDISPLAY, HGKEEP, HGPLOT or HGPREDICT statements

Description

HGANALYSE is one of several procedures with the prefix HG, which provide tools for fitting the hierarchical and double hierarchical generalized linear models (HGLMs and DHGLMs) defined by Lee & Nelder (1996, 2001, 2006) and described by Lee, Nelder & Pawitan (2006). These models extend generalized linear models (GLMs) to include additional random terms in the linear predictor. They include generalized linear mixed models (GLMMs) as a special case, but do not constrain the additional terms to follow a Normal distribution and to have an identity link (as in the GLMM). For example, if the basic generalized linear model is a log-linear model (Poisson distribution and log link), a more appropriate assumption for the additional random terms might be a gamma distribution and a log link.

The analysis involves fitting an augmented generalized linear model to describe the mean of the distribution. This has units corresponding to the original data units, together with additional units for the effects of the random terms; see Lee & Nelder (1996). Then there are further GLMs to describe the dispersion for each random term (including the residual dispersion, phi); see Lee & Nelder (2001). In a DHGLM, some of these dispersion GLMs are themselves extended to become HGLMs by the inclusion of random terms; see Lee & Nelder (2006).

Before calling HGANALYSE, the fixed and random terms in the HGLM must be defined by the HGFIXEDMODEL and HGRANDOMMODEL procedures, respectively. The HGDRANDOMMODEL procedure can then add random terms to a dispersion GLM, so that the model becomes a DHGLM.
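The calling order can be sketched as follows, for the Poisson-gamma model mentioned above. This is a minimal illustration rather than a tested program: the identifiers Count, Treatment and Subject are hypothetical, and the setting names poisson and logarithm are assumed by analogy with the gamma and reciprocal settings used in the Example section.

```
" Hypothetical data: response variate Count, fixed factor Treatment,
  random factor Subject. "
HGFIXEDMODEL  [DISTRIBUTION=poisson; LINK=logarithm] Treatment
HGRANDOMMODEL [DISTRIBUTION=gamma; LINK=logarithm] Subject
HGANALYSE     Count
```

HGANALYSE itself takes no model formula; it uses whatever fixed and random terms the two preceding statements have defined.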

The variate to be analysed must be supplied by the Y parameter and, if the y-values are binomial responses, the NBINOMIAL parameter should supply the corresponding total numbers. Residuals and fitted values can be saved using the RESIDUALS and FITTEDVALUES parameters, respectively. Note that only one y-variate can be analysed at once, so any additional variates are ignored (as occurs with the MODEL directive when generalized linear models are defined).
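As a sketch of the parameter syntax, the statements below assume that the fixed and random models for the cake data have already been defined as in the Example section. The identifiers Res and Fit are hypothetical names chosen for illustration, as are Diseased and Total in the binomial case.

```
" Save residuals and fitted values from the analysis of Angle. "
HGANALYSE Angle; RESIDUALS=Res; FITTEDVALUES=Fit
PRINT     Angle,Fit,Res

" Binomial responses (hypothetical variates, and assuming a binomial
  model has been defined): Diseased holds the numbers responding and
  Total the corresponding totals. "
HGANALYSE Diseased; NBINOMIAL=Total
```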

The SAVE parameter allows you to save a pointer containing full details of the analysis. This can then be used to generate further output from HGDISPLAY, HGKEEP, HGPLOT or HGPREDICT. The most recent save structure is kept automatically inside Genstat and is used as the default for the SAVE options of HGDISPLAY, HGKEEP, HGPLOT and HGPREDICT. You therefore need to save the pointer explicitly only if you want to display output from more than one analysis at a time.
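For example, the analysis could be run with printing suppressed and the output requested later from the saved structure. This sketch assumes the cake model from the Example section has been defined; CakeSave is a hypothetical identifier.

```
" Fit the model without printing, keeping the save structure. "
HGANALYSE [PRINT=*] Angle; SAVE=CakeSave

" Later, display selected output from that particular analysis. "
HGDISPLAY [PRINT=fixedestimates,dispersionestimates; SAVE=CakeSave]
```

Naming the structure matters only when several analyses are held at once; otherwise HGDISPLAY picks up the most recent analysis by default.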

The PRINT, SEMETHOD and DMETHOD options control printed output, almost exactly as in the HGDISPLAY procedure (which is called by HGANALYSE to produce the output). The only difference is that PRINT has additional settings: monitoring provides information about the fitting process of an ordinary HGLM, and dhgmonitoring provides information about the fitting of the HGLM for the dispersion model in a DHGLM.

The other options control various aspects of the fitting process. The fitting process involves alternating fits of the augmented GLM for the mean, given the current estimates of the dispersion parameters, and of the models that estimate the dispersion parameters. The convergence of the process is assessed by comparing the dispersion estimates from successive fits. The MAXCYCLE option can specify two scalars. The first sets a limit on the number of alternating fits (default 99), and the second controls the number of iterations in the estimation of the mean model and of the dispersion model (default 50). The TOLERANCE option defines the criterion for convergence in the alternating fits (default 0.0005). The EMETHOD option determines whether Aitken (default) or adjusted Aitken extrapolation is used in the estimation of the dispersion estimates, or you can set EMETHOD=* to use neither. The ETOLERANCE option sets an upper limit on the ratio of the changed value to the original values in the extrapolations; the default value is 7.5. The GROUPTERM option allows you to specify a random term whose factor combinations should be used as a groups factor during the fitting of the augmented mean model (see the GROUPS option of the MODEL directive). This allows models with large numbers of random effects to be fitted much more efficiently. However, algorithmic complications mean that predictions can then be made by HGPREDICT only using a BLUP for a specific random effect of that term; you cannot form predictions at the expected value of the term. The EXIT option can be set to a scalar, which will be set to zero if the fitting has been successful or to one if it has failed to converge.
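A sketch of how these options might be combined, assuming the cake model from the Example section has been defined. The numerical settings are arbitrary illustrations rather than recommendations, and Status is a hypothetical identifier.

```
" Allow more cycles, tighten the convergence criterion, switch to
  adjusted Aitken extrapolation and record the exit status. "
HGANALYSE [MAXCYCLE=200,100; TOLERANCE=0.0001; EMETHOD=adjustedaitken;\
           EXIT=Status] Angle

" Status is 0 for success, 1 for failure to converge. "
PRINT     Status
```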

By default HGANALYSE uses exact likelihood to obtain the y-variate and weights for the dispersion model. This produces estimates with less bias than the earlier method of extended quasi likelihood (EQL). However, the LMETHOD option is provided so that EQL estimates can still be obtained if required. For some models, the DLAPLACEORDER option allows the order of Laplace approximation used in the estimation of the dispersion components to be increased from the standard value (and default) of 0 to either 1 or 2. This is appropriate for generalized linear mixed models with the binomial or Poisson distributions, where use of Laplace order 0 can lead to serious downwards bias. The MLAPLACEORDER option similarly allows you to set the order of Laplace approximation used in the estimation of the mean model to 1 instead of 0.
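The statements below sketch these settings. They assume that a Poisson or binomial model has already been defined with HGFIXEDMODEL and HGRANDOMMODEL, and that Count is a (hypothetical) response variate; whether a higher Laplace order is available depends on the model.

```
" Use extended quasi likelihood instead of the default exact likelihood. "
HGANALYSE [LMETHOD=eql] Count

" First-order Laplace approximation for both the dispersion components
  and the mean model. "
HGANALYSE [DLAPLACEORDER=1; MLAPLACEORDER=1] Count
```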

Options: PRINT, LMETHOD, SEMETHOD, DMETHOD, EMETHOD, MLAPLACEORDER, DLAPLACEORDER, MAXCYCLE, EXIT, TOLERANCE, ETOLERANCE, GROUPTERM.

Parameters: Y, NBINOMIAL, RESIDUALS, FITTEDVALUES, SAVE.

Method

The model is fitted using the method of Lee & Nelder (2006).

Action with RESTRICT

Restrictions are not allowed.

References

Lee, Y., & Nelder, J.A. (1996). Hierarchical generalized linear models (with discussion). Journal of the Royal Statistical Society, Series B, 58, 619-678.

Lee, Y., & Nelder, J.A. (2001). Hierarchical generalized linear models: a synthesis of generalised linear models, random-effect models and structured dispersions. Biometrika, 88, 987-1006.

Lee, Y. & Nelder, J.A. (2006). Double hierarchical generalized linear models (with discussion). Appl. Statist., 55, 139-185.

Lee, Y., Nelder, J.A. & Pawitan, Y. (2006). Generalized Linear Models with Random Effects: Unified Analysis via H-likelihood. Chapman and Hall, Boca Raton.

See also

Procedures: GEE, GLMM, HGDISPLAY, HGDRANDOMMODEL, HGFIXEDMODEL, HGFTEST, HGGRAPH, HGKEEP, HGNONLINEAR, HGPLOT, HGPREDICT, HGRANDOMMODEL, HGRTEST, HGSTATUS, HGTOBITPOISSON, HGWALD.

Commands for: Regression analysis.

Example

CAPTION  'HGANALYSE example',!t(\
         'Breaking angles of cake baked from 3 recipes at 6 temperatures',\
         '(Cochran & Cox, 1957, Experimental Designs, page 300).',\
         'Data values are assumed to follow a GLM with a gamma distribution',\
         'and reciprocal link. The linear predictor contains additional',\
         'random variables, with inverse gamma distributions and reciprocal',\
         'link, for replicates and batches of cake mixture.');\
         STYLE=meta,plain
FACTOR   [NVALUES=270; LEVELS=3] Recipe
&        [LEVELS=15] Replicate
&        [LEVELS=!(175,185...225)] Temperature
GENERATE Recipe,Replicate,Temperature
VARIATE  [NVALUES=270] Angle
READ     Angle
42 46 47 39 53 42 47 29 35 47 57 45 32 32 37 43 45 45
26 32 35 24 39 26 28 30 31 37 41 47 24 22 22 29 35 26
26 23 25 27 33 35 24 33 23 32 31 34 24 27 28 33 34 23
24 33 27 31 30 33 33 39 33 28 33 30 28 31 27 39 35 43
29 28 31 29 37 33 24 40 29 40 40 31 26 28 32 25 37 33
39 46 51 49 55 42 35 46 47 39 52 61 34 30 42 35 42 35
25 26 28 46 37 37 31 30 29 35 40 36 24 29 29 29 24 35
22 25 26 26 29 36 26 23 24 31 27 37 27 26 32 28 32 33
21 24 24 27 37 30 20 27 33 31 28 33 23 28 31 34 31 29
32 35 30 27 35 30 23 25 22 19 21 35 21 21 28 26 27 20
46 44 45 46 48 63 43 43 43 46 47 58 33 24 40 37 41 38
38 41 38 30 36 35 21 25 31 35 33 23 24 33 30 30 37 35
20 21 31 24 30 33 24 23 21 24 21 35 24 18 21 26 28 28
26 28 27 27 35 35 28 25 26 25 38 28 24 30 28 35 33 28
28 29 43 28 33 37 19 22 27 25 25 35 21 28 25 25 31 25 :
FACPRODUCT    !p(Replicate,Recipe); Batch
HGFIXEDMODEL  [DISTRIBUTION=gamma; LINK=reciprocal] Recipe*Temperature
HGRANDOMMODEL [DISTRIBUTION=inversegamma; LINK=reciprocal] Replicate+Batch
HGANALYSE     [P=#,WALD] Angle
Updated on February 7, 2023
