Uses the Tobit method to fit a hierarchical generalized linear model with censored Poisson data (R.W. Payne).
Options
PRINT = string tokens | Controls printed output (model, fixedestimates, randomestimates, dispersionestimates, likelihoodstatistics, deviance, waldtests, fittedvalues, monitoring, hgmonitoring, dhgmonitoring, censored); default mode, fixe, disp, devi, like, cens
LMETHOD = string token | Whether to use exact likelihood or extended quasi likelihood to obtain the y-variate and weights for the dispersion model (exact, eql); default exac
SEMETHOD = string token | Method to use to calculate the se's for the dispersion estimates (approximate, profilelikelihood); default appr
DMETHOD = string token | Method to use for the adjusted profile likelihood when calculating the likelihood statistics (automatic, choleski, lrv); default auto
EMETHOD = string token | Extrapolation method to use (aitken, adjustedaitken); default aitk
MLAPLACEORDER = scalar | Order of Laplace approximation to use in the estimation of the mean model (0 or 1); default 0
DLAPLACEORDER = scalar | Order of Laplace approximation to use in the estimation of the dispersion components (0, 1 or 2); default 0
MAXCYCLE = scalar | Maximum number of iterations of the E-M algorithm; default 100
TOLERANCE = scalar | Convergence criterion for the E-M algorithm; default 0.001
DIRECTION = string token | Whether the data are left or right censored (left, right); default righ
HGMAXCYCLE = scalars | Maximum number of iterations of the hierarchical generalized linear model fits, and maximum number of iterations in the fitting of the mean and dispersion models; default 99,50
HGTOLERANCE = scalar | Criterion for convergence; default 0.0005
ETOLERANCE = scalar | Maximum size of ratio of the original to the new estimates allowed in Aitken extrapolation; default 7.5
GROUPTERM = formula | Random term to use as groups when fitting the augmented mean model; default * i.e. none
Parameters
Y = variate | Response variate to be analysed; must be set
BOUND = scalar | Censoring threshold; must be set
INITIAL = scalar or variate | Scalar or variate providing starting values for the censored observations in the E-M algorithm; default BOUND+1 for right-censored data and BOUND-1 for left-censored data
NEWY = variate | Saves a copy of the response variate with the censored observations replaced by their estimates
EXIT = scalar | Exit status (0 for success, 1 for failure in the E-M algorithm, 2 for failure to fit the hierarchical generalized linear model)
SAVE = pointer | Saves details of the analysis for use in subsequent HGDISPLAY, HGKEEP, HGPLOT or HGPREDICT statements
Description
When an experiment generates a mixture of small and very large counts, it may be convenient to count only the observations less than a specified boundary value, and enter that value for the larger observations. The data then come from a right-censored Poisson distribution. In the similar (but less common) left-censored situation, the emphasis is on the larger observations. It may then not be worth recording the small observations in detail, only that they are no larger than the boundary value. Censored Poisson data can be analysed by the Tobit method (Terza 1985), which is implemented in this procedure.
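As an illustration of how such data are recorded, the following minimal Python sketch censors a list of counts at a boundary value. The function names and the counts are hypothetical and are not part of Genstat; the sketch only mirrors the description above, in which right-censored observations at or above the boundary are entered as the boundary value, and left-censored observations at or below it are entered as the boundary value.

```python
def censor_right(counts, bound):
    """Record every count greater than or equal to bound as bound itself."""
    return [min(c, bound) for c in counts]


def censor_left(counts, bound):
    """Record every count less than or equal to bound as bound itself."""
    return [max(c, bound) for c in counts]


# Made-up counts, purely for illustration.
counts = [3, 0, 12, 7, 450, 2, 980]

right = censor_right(counts, 10)   # [3, 0, 10, 7, 10, 2, 10]
left = censor_left(counts, 5)      # [5, 5, 12, 7, 450, 5, 980]
```

Note that, under this convention, an observation exactly equal to the boundary value cannot be distinguished from a censored one, which is why the censored probabilities described below include the boundary value itself.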
In the Tobit model, the probabilities for the uncensored observations are standard Poisson probabilities. The probabilities for right-censored observations are cumulative upper Poisson probabilities for values greater than or equal to the boundary value. The probabilities for left-censored observations are cumulative lower Poisson probabilities for values less than or equal to the boundary value. The Tobit method uses an E-M (expectation-maximization) algorithm to estimate values for the censored observations. It starts with initial estimates for the censored observations, which can be specified by the INITIAL parameter in either a variate or a scalar. For right-censored data the default is to use the boundary value plus one. For left-censored data the default is the boundary value minus one. In each iteration, the method uses the HGANALYSE procedure to fit a Poisson-log hierarchical generalized linear model, saving the resulting fitted values to provide estimated means for the Poisson distributions of the censored observations. The new estimates for the censored observations are then given by the expected values for the upper parts of those Poisson distributions (or the lower parts, if the data are left censored). The process continues until the updates to the estimates are less than or equal to the value specified by the TOLERANCE option (default 0.001), or until the number of iterations reaches the number specified by the MAXCYCLE option (default 100). The EXIT parameter can supply a scalar that will be set to zero for a successful fit, one for failure in the E-M algorithm, two if the hierarchical generalized linear model has failed to fit, or a missing value for an earlier fault.
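The iteration described above can be sketched in Python for right-censored data. This is a simplified illustration, not the procedure's implementation: the HGANALYSE fit is replaced by a single common Poisson mean (the average of the working values), which is an assumption made only to keep the sketch self-contained. All function names are hypothetical. The argument names tolerance and maxcycle merely echo the TOLERANCE and MAXCYCLE options, and the returned status echoes the first two EXIT codes.

```python
import math


def poisson_pmf(y, mu):
    """P(Y = y) for Y ~ Poisson(mu), evaluated on the log scale."""
    return math.exp(y * math.log(mu) - mu - math.lgamma(y + 1))


def upper_tail(b, mu):
    """P(Y >= b) for Y ~ Poisson(mu); equals 1 when b <= 0."""
    return 1.0 - sum(poisson_pmf(y, mu) for y in range(b))


def expected_upper(b, mu):
    """E[Y | Y >= b], using the identity mu * P(Y >= b-1) / P(Y >= b)."""
    return mu * upper_tail(b - 1, mu) / upper_tail(b, mu)


def em_right_censored(y, bound, initial=None, tolerance=0.001, maxcycle=100):
    """E-M sketch; returns (working values, status), status 0 = converged."""
    censored = [v >= bound for v in y]
    start = bound + 1 if initial is None else initial  # default start: bound + 1
    work = [float(start) if c else float(v) for v, c in zip(y, censored)]
    for _ in range(maxcycle):
        # Stand-in for the model fit: one common mean for every unit
        # (ASSUMPTION: the real procedure fits an HGLM with HGANALYSE here).
        mu = sum(work) / len(work)
        # Replace each censored value by its conditional expectation.
        new = [expected_upper(bound, mu) if c else w
               for w, c in zip(work, censored)]
        change = max(abs(n - w) for n, w in zip(new, work))
        work = new
        if change <= tolerance:
            return work, 0
    return work, 1


# Made-up counts censored at 5, purely for illustration.
newy, status = em_right_censored([2, 3, 1, 5, 5, 0, 4, 5], bound=5)
```

Here newy plays the role of the variate saved by NEWY: the uncensored counts are unchanged, and each censored entry holds its current estimate. The sketch uses a direct summation for the tail probability, which is adequate for small counts but would need a more careful evaluation when the mean is far below the boundary.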
The models for the hierarchical generalized linear model must be specified beforehand by the HGFIXEDMODEL and HGRANDOMMODEL procedures, as usual, except that HGFIXEDMODEL must specify DISTRIBUTION=poisson and LINK=logarithm.
The response variate is specified by the Y parameter, and the NEWY parameter can save a variate where the censored observations are replaced by their estimates. The BOUND parameter specifies the boundary value for the censoring (which is also the value that has been entered to indicate the censored observations in the Y variate). The DIRECTION option specifies whether the data are left or right censored. The default is that they are right censored.
The HGMAXCYCLE and HGTOLERANCE options specify the maximum number of iterations and tolerance for the fit of the hierarchical generalized linear model by the HGANALYSE procedure, and correspond to the MAXCYCLE and TOLERANCE options of HGANALYSE. The options LMETHOD, SEMETHOD, DMETHOD, EMETHOD, MLAPLACEORDER, DLAPLACEORDER, ETOLERANCE and GROUPTERM all operate exactly like the corresponding options of HGANALYSE. The PRINT option is similar, except that the monitoring setting prints monitoring information for the E-M algorithm, the hgmonitoring setting monitors the fit of the hierarchical generalized linear model, and the additional censored setting prints the estimates of the censored observations.
The SAVE parameter can save a pointer, with information about the hierarchical generalized linear model analysis, for use by procedures like HGDISPLAY and HGKEEP.
Options: PRINT, LMETHOD, SEMETHOD, DMETHOD, EMETHOD, MLAPLACEORDER, DLAPLACEORDER, MAXCYCLE, TOLERANCE, DIRECTION, HGMAXCYCLE, HGTOLERANCE, ETOLERANCE, GROUPTERM. Parameters: Y, BOUND, INITIAL, NEWY, EXIT, SAVE.
Method
The hierarchical generalized linear model is fitted by the HGANALYSE procedure. The expected values for the upper parts of the Poisson distributions are calculated by the EUPOISSON procedure, and those for the lower parts of the distributions are calculated by the ELPOISSON procedure.
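For reference, the expected values of the upper and lower parts of a Poisson distribution have simple closed forms. The expressions below are standard identities for a Poisson variable Y with mean \mu and boundary value b (with b at least 1 in the first identity), following from y P(Y = y) = \mu P(Y = y - 1). They are given as a sketch of the quantities involved; the source does not state how EUPOISSON and ELPOISSON evaluate them internally.

```latex
\begin{align}
  E(Y \mid Y \ge b)
    &= \frac{\sum_{y \ge b} y \, P(Y = y)}{P(Y \ge b)}
     = \mu \, \frac{P(Y \ge b - 1)}{P(Y \ge b)} , \\
  E(Y \mid Y \le b)
    &= \frac{\sum_{y = 0}^{b} y \, P(Y = y)}{P(Y \le b)}
     = \mu \, \frac{P(Y \le b - 1)}{P(Y \le b)} .
\end{align}
```

The first expression is the update for a right-censored observation whose fitted mean is \mu, and the second is the corresponding update for a left-censored observation.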
Reference
Terza, J.V. (1985). A Tobit-type estimator for the censored Poisson regression model. Economics Letters, 18, 361-365.
See also
Procedures: CENSOR, ELPOISSON, EUPOISSON, HGANALYSE, HGDISPLAY, HGDRANDOMMODEL, HGFIXEDMODEL, HGFTEST, HGGRAPH, HGKEEP, HGNONLINEAR, HGPLOT, HGPREDICT, HGRANDOMMODEL, HGRTEST, HGSTATUS, HGWALD, GLTOBITPOISSON, RTOBITPOISSON, TOBIT.
Commands for: Regression analysis.
Example
CAPTION        'HGTOBITPOISSON example',\
               !t('Nematode data from Cochran & Cox (1957) p.46,',\
               'analysed in Section 4.3 of the Statistics Guide,',\
               'with the unfumigated plots removed to simplify the analysis.',\
               'Suppose that counting stopped at 400.',\
               'Units 18-20, 24 & 27 are then censored.'); STYLE=meta,plain
SPLOAD         '%data%/Nematode.gsh'
SUBSET         [Fumigant.IN.'Fumigated'; SETLEVELS=yes]\
               Blocks,Amount,Type,Count,Priorcount
CALCULATE      Logpriorcount = LOG(Priorcount)
HGFIXEDMODEL   [DISTRIBUTION=poisson; LINK=log; DISPERSION=*]\
               Logpriorcount+Amount*Type
HGRANDOMMODEL  [DISTRIBUTION=gamma; LINK=log] Blocks
HGTOBITPOISSON [PRINT=model,fixedestimates,dispersionestimates,\
               likelihoodstatistics,waldtests,censored] Count; BOUND=400