HGTOBITPOISSON procedure

Uses the Tobit method to fit a hierarchical generalized linear model with censored Poisson data (R.W. Payne).

Options

`PRINT` = *string tokens*		Controls printed output (`model`, `fixedestimates`, `randomestimates`, `dispersionestimates`, `likelihoodstatistics`, `deviance`, `waldtests`, `fittedvalues`, `monitoring`, `hgmonitoring`, `dhgmonitoring`, `censored`); default `mode`, `fixe`, `disp`, `devi`, `like`, `cens`
`LMETHOD` = *string token*		Whether to use exact likelihood or extended quasi likelihood to obtain the y-variate and weights for the dispersion model (`exact`, `eql`); default `exac`
`SEMETHOD` = *string token*		Method to use to calculate the se’s for the dispersion estimates (`approximate`, `profilelikelihood`); default `appr`
`DMETHOD` = *string token*		Method to use for the adjusted profile likelihood when calculating the likelihood statistics (`automatic`, `choleski`, `lrv`); default `auto`
`EMETHOD` = *string token*		Extrapolation method to use (`aitken`, `adjustedaitken`); default `aitk`
`MLAPLACEORDER` = *scalar*		Order of Laplace approximation to use in the estimation of the mean model (0 or 1); default 0
`DLAPLACEORDER` = *scalar*		Order of Laplace approximation to use in the estimation of the dispersion components (0, 1 or 2); default 0
`MAXCYCLE` = *scalar*		Maximum number of iterations of the E-M algorithm; default 100
`TOLERANCE` = *scalar*		Convergence criterion for the E-M algorithm; default 0.001
`DIRECTION` = *string token*		Whether the data are left or right censored (`left`, `right`); default `righ`
`HGMAXCYCLE` = *scalars*		Maximum number of iterations of the hierarchical generalized linear model fits, and maximum number of iterations in the fitting of the mean and dispersion models; default 99,50
`HGTOLERANCE` = *scalar*		Criterion for convergence; default 0.0005
`ETOLERANCE` = *scalar*		Maximum size of ratio of the original to the new estimates allowed in Aitken extrapolation; default 7.5
`GROUPTERM` = *formula*		Random term to use as groups when fitting the augmented mean model; default `*` i.e. none

Parameters

`Y` = *variate*		Response variate to be analysed; must be set
`BOUND` = *scalar*		Censoring threshold; must be set
`INITIAL` = *scalar* or *variate*		Scalar or a variate providing starting values for the censored observations in the E-M algorithm; default `BOUND+1` for right-censored data and default `BOUND-1` for left-censored data
`NEWY` = *variate*		Saves a copy of the response variate with the censored observations replaced by their estimates
`EXIT` = *scalar*		Exit status (0 for success, 1 for failure in the E-M algorithm, 2 for failure to fit the hierarchical generalized linear model)
`SAVE` = *pointer*		Saves details of the analysis for use in subsequent `HGDISPLAY`, `HGKEEP`, `HGPLOT` or `HGPREDICT` statements

Description

When an experiment generates a mixture of small and very large counts, it may be convenient to count only the observations less than a specified boundary value, and enter that value for the larger observations. The data then come from a right-censored Poisson distribution. In the similar (but less common) left-censored situation, the emphasis is on the larger observations. It may then not be worth recording the small observations in detail, only that they are no larger than the boundary value. Censored Poisson data can be analysed by the Tobit method (Terza 1985), which is implemented in this procedure.

In the Tobit model, the probabilities for the uncensored observations are standard Poisson probabilities. The probabilities for right-censored observations are cumulative upper Poisson probabilities for values greater than or equal to the boundary value. The probabilities for left-censored observations are cumulative lower Poisson probabilities for values less than or equal to the boundary value.

The Tobit method uses an E-M (expectation-maximization) algorithm to estimate values for the censored observations. It starts with initial estimates for the censored observations, which can be specified by the INITIAL parameter in either a variate or a scalar. For right-censored data the default is the boundary value plus one; for left-censored data it is the boundary value minus one. In each iteration, the method uses the HGANALYSE procedure to fit a Poisson-log hierarchical generalized linear model, saving the resulting fitted values to provide estimated means for the Poisson distributions of the censored observations. The new estimates for the censored observations are then given by the expected values for the upper parts of those Poisson distributions (for right-censored data) or the lower parts (for left-censored data). The process continues until the updates to the estimates are less than or equal to the value specified by the TOLERANCE option (default 0.001), or until the number of iterations reaches the number specified by the MAXCYCLE option (default 100).

The EXIT parameter can supply a scalar that is set to zero for a successful fit, one for failure in the E-M algorithm, two if the hierarchical generalized linear model has failed to fit, or a missing value for an earlier fault.
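
For reference, the expectation step of the E-M algorithm can be written in closed form. This is a sketch in notation that is not part of the procedure: Y is a Poisson count with fitted mean μ, and b is the value supplied by BOUND. It assumes that the "upper part" of the distribution is Y ≥ b and the "lower part" is Y ≤ b, in line with the probabilities described above.

$$
E[\,Y \mid Y \ge b\,] = \mu \, \frac{P(Y \ge b-1)}{P(Y \ge b)},
\qquad
E[\,Y \mid Y \le b\,] = \mu \, \frac{P(Y \le b-1)}{P(Y \le b)}
$$

Both expressions follow from the Poisson identity y P(Y = y) = μ P(Y = y − 1). Whether the procedure evaluates the expectations in exactly this form is not stated here.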

The models for the hierarchical generalized linear model must be specified beforehand by the HGFIXEDMODEL and HGRANDOMMODEL procedures, as usual, except that HGFIXEDMODEL must specify DISTRIBUTION=poisson and LINK=logarithm.

The response variate is specified by the Y parameter, and the NEWY parameter can save a variate in which the censored observations are replaced by their estimates. The BOUND parameter specifies the boundary value for the censoring (which is also the value that has been entered in the Y variate to indicate the censored observations). The DIRECTION option specifies whether the data are left or right censored; by default they are assumed to be right censored.
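
As a sketch of how these settings fit together, the statement below analyses left-censored counts and saves the completed response. The identifiers Smallcount and Filledcount and the boundary value 5 are hypothetical, chosen only for illustration, and the fixed and random models are assumed to have been defined beforehand as described above.

```
"Hypothetical left-censored data: counts of 5 or fewer were all recorded as 5"
HGTOBITPOISSON [PRINT=fixedestimates,censored; DIRECTION=left]\
               Y=Smallcount; BOUND=5; NEWY=Filledcount
"Filledcount holds Smallcount with the censored values replaced by their estimates"
PRINT          Smallcount,Filledcount
```

Because INITIAL is not set, the E-M algorithm would start from the default of BOUND-1 for each censored observation.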

The HGMAXCYCLE and HGTOLERANCE options specify the maximum number of iterations and the tolerance for the fit of the hierarchical generalized linear model by the HGANALYSE procedure, and correspond to the MAXCYCLE and TOLERANCE options of HGANALYSE. The options LMETHOD, SEMETHOD, DMETHOD, EMETHOD, MLAPLACEORDER, DLAPLACEORDER, ETOLERANCE and GROUPTERM all operate exactly like the corresponding options of HGANALYSE. The PRINT option is similar, except that the monitoring setting prints monitoring information for the E-M algorithm, the hgmonitoring setting prints monitoring information for the fit of the hierarchical generalized linear model, and the censored setting prints the estimates of the censored observations.
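
The following sketch shows how the two levels of iteration might be controlled and monitored separately. It reuses Count and the boundary of 400 from the Example section; the values 200 and 0.0001 are arbitrary illustrations, not recommendations.

```
"Monitor the E-M iterations and the inner HGLM fits; tighten the E-M criterion"
HGTOBITPOISSON [PRINT=monitoring,hgmonitoring,censored;\
               MAXCYCLE=200; TOLERANCE=0.0001] Count; BOUND=400
```

Here MAXCYCLE and TOLERANCE apply to the E-M algorithm, while the fits by HGANALYSE keep their HGMAXCYCLE and HGTOLERANCE defaults.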

The SAVE parameter can save a pointer, with information about the hierarchical generalized linear model analysis, for use by procedures like HGDISPLAY and HGKEEP.
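
A minimal sketch of capturing the exit status and the save pointer is shown below. Exitcode and Hgsave are hypothetical identifiers, and Count and the boundary of 400 are taken from the Example section.

```
"Suppress the printed output, but keep the exit status and the analysis details"
HGTOBITPOISSON [PRINT=*] Count; BOUND=400; EXIT=Exitcode; SAVE=Hgsave
"0 = success, 1 = E-M failure, 2 = HGLM fit failure, missing = earlier fault"
PRINT          Exitcode
```

The pointer Hgsave could then be supplied to procedures such as HGDISPLAY or HGKEEP to produce further output from the analysis.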

Options: PRINT, LMETHOD, SEMETHOD, DMETHOD, EMETHOD, MLAPLACEORDER, DLAPLACEORDER, MAXCYCLE, TOLERANCE, DIRECTION, HGMAXCYCLE, HGTOLERANCE, ETOLERANCE, GROUPTERM. Parameters: Y, BOUND, INITIAL, NEWY, EXIT, SAVE.

Method

The hierarchical generalized linear model is fitted by the HGANALYSE procedure. The expected values for the upper parts of the Poisson distributions are calculated by the EUPOISSON procedure, and those for the lower parts of the distributions are calculated by the ELPOISSON procedure.

Reference

Terza, J.V. (1985). A Tobit-type estimator for the censored Poisson regression model. Economics Letters, 18, 361-365.

Example

CAPTION        'HGTOBITPOISSON example',\
               !t('Nematode data from Cochran & Cox (1957) p.46,',\
               'analysed in Section 4.3 of the Statistics Guide,',\
               'with the unfumigated plots removed to simplify the analysis.',\
               'Suppose that counting stopped at 400.',\
               'Units 18-20, 24 & 27 are then censored.');  STYLE=meta,plain
SPLOAD         '%data%/Nematode.gsh'
SUBSET         [Fumigant.IN.'Fumigated'; SETLEVELS=yes]\
               Blocks,Amount,Type,Count,Priorcount
CALCULATE      Logpriorcount = LOG(Priorcount)
HGFIXEDMODEL   [DISTRIBUTION=poisson; LINK=log; DISPERSION=*]\
               Logpriorcount+Amount*Type
HGRANDOMMODEL  [DISTRIBUTION=gamma; LINK=log] Blocks
HGTOBITPOISSON [PRINT=model,fixedestimates,dispersionestimates,\
               likelihoodstatistics,waldtests,censored] Count; BOUND=400

Updated on May 10, 2023
