Uses the Tobit method to fit a generalized linear mixed model with censored Poisson data (R.W. Payne).

### Options

| Option | Description |
|---|---|
| `PRINT` = string token | What output to display (`model`, `components`, `effects`, `fittedvalues`, `means`, `backmeans`, `monitoring`, `vcovariance`, `waldtests`, `missingvalues`, `covariancemodels`, `deviance`, `censored`); default `mode`, `comp`, `mean`, `back`, `cens` |
| `DISPERSION` = scalar | Value at which to fix the residual variance; if missing, the variance is estimated; default 1 for binomial, Poisson and negative binomial distributions, a missing value otherwise |
| `RANDOM` = formula | Random model excluding bottom stratum; this must be set |
| `FIXED` = formula | Fixed model; default `*` |
| `CONSTANT` = string token | Whether to estimate or omit constant term in fixed model (`omit`, `estimate`); default `esti` |
| `FACTORIAL` = scalar | Limit on number of factors/covariates in a model term; default 3 |
| `PTERMS` = formula | Formula specifying fixed terms for which means or back-transformed means are to be printed; default `*` prints all the fixed model terms |
| `PSE` = string token | Standard errors to print with tables of means (`se`, `sesummary`, `sed`, `sedsummary`, `vcovariance`, `differences`, `estimates`, `alldifferences`, `allestimates`); default `seds` |
| `MVINCLUDE` = string tokens | Whether to include units with missing values in the explanatory factors and variates and/or the y-variates (`explanatory`, `yvariate`); default `*` i.e. omit units with missing values in either explanatory factors or variates or y-variates |
| `MAXCYCLE` = scalar | Maximum number of iterations of the E-M algorithm; default 100 |
| `TOLERANCE` = scalar | Convergence criterion for the E-M algorithm; default 0.001 |
| `DIRECTION` = string token | Whether the data are left or right censored (`left`, `right`); default `righ` |
| `GLMAXCYCLE` = scalar | Maximum number of iterations of the `GLMM` algorithm; default 20 |
| `GLTOLERANCE` = scalar | Convergence criterion for the iterative procedure; default 0.0001 |
| `FMETHODGLMM` = string token | Specifies fitting method (`all`, `fixed`): `all` indicates the method of Schall (1991); `fixed` indicates the marginal method of Breslow & Clayton (1993); default `all` |
| `CADJUST` = string token | What adjustment to make to covariates for the `REML` analysis (`mean`, `none`); default `mean` |
| `WORKSPACE` = scalar | Number of blocks of internal memory to be set up for use by the `REML` algorithm; default 1 |
| `VCONSTRAINTS` = string token | Whether to constrain variance components to be positive (`none`, `positive`); default `posi` |
| `VMETHOD` = string token | Indicates whether to use the standard Fisher-scoring algorithm or the new AI algorithm with sparse matrix methods (`Fisher`, `AI`); default `AI` |
| `VMAXCYCLE` = scalar | Limit on the number of iterations; default 30 |

### Parameters

| Parameter | Description |
|---|---|
| `Y` = variate | Response variate to be analysed; must be set |
| `BOUND` = scalar | Censoring threshold; must be set |
| `INITIAL` = scalar or variate | Scalar or a variate providing starting values for the censored observations in the E-M algorithm; default `BOUND+1` for right-censored data and `BOUND-1` for left-censored data |
| `NEWY` = variate | Saves a copy of the response variate with the censored observations replaced by their estimates |
| `OFFSET` = variate | Offset variate |
| `EXIT` = scalar | Exit status (0 for success, 1 for failure in the E-M algorithm, 2 for failure to fit the generalized linear mixed model) |
| `SAVE` = `REML` save structure | `REML` save structure from the analysis of the data with censored observations replaced by their estimates |
| `GLSAVE` = pointer | `GLMM` save structure from the analysis of the data with censored observations replaced by their estimates |

### Description

When an experiment generates a mixture of small and very large counts, it may be convenient to count only the observations less than a specified boundary value, and enter that value for the larger observations. The data then come from a right-censored Poisson distribution. In the similar (but less common) left-censored situation, the emphasis is on the larger observations. It may then not be worth recording the small observations in detail, only that they are no larger than the boundary value. Censored Poisson data can be analysed by the Tobit method (Terza 1985), which is implemented in this procedure.
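To make the recording scheme concrete, here is a small Python sketch (not Genstat code). The function name `record_censored` and the counts are hypothetical. It shows how true counts would be recorded under right or left censoring, and which units would then count as censored.

```python
def record_censored(counts, bound, direction="right"):
    """Return (recorded values, 1-based numbers of the censored units)."""
    if direction not in ("right", "left"):
        raise ValueError("direction must be 'right' or 'left'")
    recorded, censored_units = [], []
    for unit, count in enumerate(counts, start=1):
        if direction == "right":
            # Large counts are not recorded in detail: enter the boundary value.
            is_censored = count >= bound
        else:
            # Small counts are not recorded in detail: enter the boundary value.
            is_censored = count <= bound
        recorded.append(bound if is_censored else count)
        if is_censored:
            censored_units.append(unit)
    return recorded, censored_units


true_counts = [12, 455, 87, 400, 3]   # hypothetical data
recorded, units = record_censored(true_counts, bound=400)
print(recorded)   # the values that would be entered in the response variate
print(units)      # the units treated as censored
```

A recorded value equal to the boundary is treated as censored, so a true count exactly at the boundary cannot be told apart from a larger (or, for left censoring, smaller) one.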

In the Tobit model, the probabilities for the uncensored observations are standard Poisson probabilities. The probabilities for right-censored observations are cumulative upper Poisson probabilities for values greater than or equal to the boundary value. Probabilities for left-censored observations are cumulative lower Poisson probabilities for values less than or equal to the boundary value. The Tobit method uses an E-M (expectation-maximization) algorithm to estimate values for the censored observations. It starts with initial estimates for the censored observations, which can be specified by the `INITIAL` parameter in either a variate or a scalar. For right-censored data the default is to use the boundary value plus one. For left-censored data the default is the boundary value minus one. In each iteration, the method first fits a Poisson-log generalized linear mixed model, saving the resulting fitted values to provide estimated means for the Poisson distributions of the censored observations. The new estimates for the censored observations are then given by the expected values for the upper parts of those Poisson distributions (or the lower parts, for left-censored data). The process continues either until the updates to the estimates are less than or equal to the value specified by the `TOLERANCE` option (default 0.001), or until the number of iterations equals the number specified by the `MAXCYCLE` option (default 100). The `EXIT` parameter can be set to a scalar that will be set to zero for a successful fit, one for failure in the E-M algorithm, two if the generalized linear mixed model has failed to fit, or a missing value for an earlier fault.
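The iteration described above can be sketched in Python for right-censored data. This illustrates the idea only and is not the procedure's implementation. An intercept-only Poisson fit, whose fitted mean is the sample mean, stands in for the generalized linear mixed model. The "upper part" is assumed to mean values greater than or equal to the boundary. The names `upper_mean` and `em_right_censored` and the data are hypothetical.

```python
def upper_mean(mu, bound):
    """E[Y | Y >= bound] for Y ~ Poisson(mu), summing the upper tail directly.

    Terms are weighted relative to P(Y = bound), so no factorials are needed.
    """
    weight, k = 1.0, bound
    num, den = float(bound), 1.0
    while True:
        k += 1
        weight *= mu / k              # P(Y = k) / P(Y = k - 1) = mu / k
        num += k * weight
        den += weight
        if k > mu and weight < 1e-16 * den:
            return num / den          # remaining tail terms are negligible


def em_right_censored(y, bound, tolerance=0.001, maxcycle=100):
    """Toy E-M loop; returns (completed data, exit status, cycles used)."""
    censored = [i for i, value in enumerate(y) if value == bound]
    filled = [float(value) for value in y]
    for i in censored:
        filled[i] = bound + 1.0       # default starting value for right censoring
    for cycle in range(1, maxcycle + 1):
        mu = sum(filled) / len(filled)       # fit the model (stand-in for the GLMM)
        new_value = upper_mean(mu, bound)    # expected value of the upper part
        change = max((abs(new_value - filled[i]) for i in censored), default=0.0)
        for i in censored:
            filled[i] = new_value
        if change <= tolerance:
            return filled, 0, cycle   # exit status 0: converged
    return filled, 1, maxcycle        # exit status 1: E-M did not converge


counts = [3, 5, 2, 8, 8]              # hypothetical counts, censored at 8
completed, status, cycles = em_right_censored(counts, bound=8)
print(completed, status, cycles)
```

In a real analysis every censored unit has its own fitted mean from the mixed model, so the estimates generally differ between units. Here they coincide only because the stand-in model has a single mean.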

The fixed and random models are specified by the `FIXED` and `RANDOM` options, respectively. By default the variance components are constrained to be positive, but you can set option `VCONSTRAINTS` to `none` to allow them to become negative. The model can also contain an offset, specified by the `OFFSET` parameter. The `CONSTANT` option indicates whether the constant is to be estimated or omitted, and the `FACTORIAL` option sets a limit on the number of variates and/or factors in the model terms, in the usual way. As in the `REML` directive, the `MVINCLUDE` option specifies whether units with missing values in explanatory factors and variates and/or y-variates are to be included. The `DISPERSION` option specifies the dispersion parameter (default 1). A missing value indicates that the dispersion parameter is to be estimated.
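For orientation, a Poisson generalized linear mixed model with a log link is conventionally written as follows. The notation is introduced here for illustration and is not taken from the procedure itself.

$$
\log \mu_i = o_i + \mathbf{x}_i^{\top}\boldsymbol{\beta} + \mathbf{z}_i^{\top}\mathbf{u},
\qquad
\operatorname{Var}(y_i \mid \mathbf{u}) = \phi\,\mu_i
$$

Here $o_i$ is the offset (`OFFSET`), $\boldsymbol{\beta}$ holds the fixed effects (`FIXED`, with the constant controlled by `CONSTANT`), $\mathbf{u}$ holds the random effects (`RANDOM`), whose variances are the variance components, and $\phi$ is the dispersion parameter (`DISPERSION`).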

The response variate is specified by the `Y` parameter, and the `NEWY` parameter can save a variate where the censored observations are replaced by their estimates. The `BOUND` parameter specifies the boundary value for the censoring (and the value that has been entered to indicate the censored observations in the `Y` variate). The `DIRECTION` option specifies whether the data are left or right censored. The default is that they are right censored.

The `GLMAXCYCLE` and `GLTOLERANCE` options specify the maximum number of iterations (default 20) and tolerance (default 0.0001) for the fit of the generalized linear mixed model by the `GLMM` procedure. The `FMETHODGLMM` option specifies the method used to form the fitted values in the `GLMM` analysis, and thus determines the fitting method to be used. The default setting `all` specifies that both fixed and random terms should be used to form fitted values, which gives the method of Schall (1991); setting `fixed` indicates that only fixed terms are used to form fitted values, which gives the marginal method of Breslow & Clayton (1993).
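The difference between the two settings can be illustrated with a short Python sketch. It assumes a log link and per-unit contributions to the linear predictor that are already known. The function name `fitted_values` and the numbers are hypothetical, and the sketch shows only how the fitted values are formed, not how the model is estimated.

```python
import math


def fitted_values(fixed_part, random_part, method="all"):
    """Poisson-log fitted values from per-unit linear-predictor contributions."""
    if method not in ("all", "fixed"):
        raise ValueError("method must be 'all' or 'fixed'")
    fitted = []
    for fixed, random in zip(fixed_part, random_part):
        # 'all' uses fixed + random terms; 'fixed' ignores the random terms.
        eta = fixed + random if method == "all" else fixed
        fitted.append(math.exp(eta))
    return fitted


fixed_part = [1.0, 1.0, 1.5]      # hypothetical fixed-effect contributions (log scale)
random_part = [0.2, -0.2, 0.0]    # hypothetical random-effect contributions

print(fitted_values(fixed_part, random_part, method="all"))
print(fitted_values(fixed_part, random_part, method="fixed"))
```

With `all` the first two units get different fitted values because of their random effects. With `fixed` they coincide.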

The `VMAXCYCLE` option sets the limit on the number of iterations in the `REML` algorithm (default 30). The `VMETHOD` option specifies the algorithm to use in the `REML` steps of the `GLMM` algorithm: either Fisher or AI (default). By default any covariates are centred for the `REML` fitting by subtracting their means, weighted according to the iterative weights of the generalized linear model. Alternatively you can set option `CADJUST=none` to request that the uncentred covariates are used instead. The `WORKSPACE` option specifies the number of blocks of internal memory to be set up for use by the `REML` algorithm (default 1).
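The default covariate adjustment is weighted-mean centring. A minimal Python sketch of that calculation follows. The function name `centre_covariate` and the values are hypothetical, and in the procedure the weights would be the iterative weights from the current fit.

```python
def centre_covariate(x, weights):
    """Subtract the weighted mean of x from every value."""
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, x)) / total
    return [v - mean for v in x]


x = [1.0, 2.0, 3.0]          # hypothetical covariate values
weights = [1.0, 1.0, 2.0]    # hypothetical iterative weights
print(centre_covariate(x, weights))
```

After centring, the weighted sum of the covariate is zero, which is what distinguishes this from subtracting the unweighted mean.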

The `PRINT` option controls the printed output. The settings are as in the `GLMM` procedure, except that the `monitoring` setting prints monitoring information for the E-M algorithm, and that there is an additional setting `censored` to print the estimates of the censored observations. The options `PTERMS` and `PSE` both operate like those of `GLMM` and `REML`.

The `GLSAVE` parameter can save a pointer, with information about the `GLMM` analysis, for use by procedures like `GLDISPLAY` or `GLKEEP`. Alternatively the `SAVE` parameter can save a `REML` save structure to provide a more direct route to the `REML` part of the analysis. This can be used by the directives `VDISPLAY`, `VKEEP` etc.

Options: `PRINT`, `DISPERSION`, `RANDOM`, `FIXED`, `CONSTANT`, `FACTORIAL`, `PTERMS`, `PSE`, `MVINCLUDE`, `MAXCYCLE`, `TOLERANCE`, `DIRECTION`, `GLMAXCYCLE`, `GLTOLERANCE`, `FMETHODGLMM`, `CADJUST`, `WORKSPACE`, `VCONSTRAINTS`, `VMETHOD`, `VMAXCYCLE`.

Parameters: `Y`, `BOUND`, `INITIAL`, `NEWY`, `OFFSET`, `EXIT`, `GLSAVE`, `SAVE`.

### Method

The generalized linear mixed model is fitted by the `GLMM` procedure. The expected values for the upper parts of the Poisson distributions are calculated by the `EUPOISSON` procedure, and those for the lower parts of the distributions are calculated by the `ELPOISSON` procedure.
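For a Poisson variable $Y$ with mean $\mu$ and boundary value $b$, these expected values have simple closed forms, given here as standard identities. Whether `EUPOISSON` and `ELPOISSON` compute them in this way is not stated in this documentation. The "parts" are assumed to include the boundary value, matching the censoring probabilities in the Description.

$$
\mathrm{E}[Y \mid Y \ge b] = \mu\,\frac{P(Y \ge b-1)}{P(Y \ge b)},
\qquad
\mathrm{E}[Y \mid Y \le b] = \mu\,\frac{P(Y \le b-1)}{P(Y \le b)}
$$

Both follow from $k\,P(Y = k) = \mu\,P(Y = k-1)$ for $k \ge 1$.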

### Action with `RESTRICT`

If the Y-variate is restricted, only the units not excluded by the restriction will be analysed.

### References

Breslow, N.E. & Clayton, D.G. (1993). Approximate inference in generalized linear mixed models. *Journal of the American Statistical Association*, **88**(421), 9-25.

Schall, R. (1991). Estimation in generalized linear models with random effects. *Biometrika*, **78**, 719-727.

Terza, J.V. (1985). A Tobit-type estimator for the censored Poisson regression model. *Economics Letters*, **18**, 361-365.

### See also

Directive: `REML`.

Procedures: `CENSOR`, `ELPOISSON`, `EUPOISSON`, `GLMM`, `GLDISPLAY`, `GLKEEP`, `GLPERMTEST`, `GLPLOT`, `GLPREDICT`, `GLRTEST`, `HGTOBITPOISSON`, `RTOBITPOISSON`, `TOBIT`.

Commands for: Regression analysis.

### Example

    CAPTION 'GLTOBITPOISSON example',\
            !t('Nematode data from Cochran & Cox (1957) p.46,',\
            'analysed in Section 4.3 of the Statistics Guide,',\
            'with the unfumigated plots removed to simplify the analysis.',\
            'Suppose that counting stopped at 400.',\
            'Units 18-20, 24 & 27 are then censored.'); STYLE=meta,plain
    SPLOAD  '%data%/Nematode.gsh'
    SUBSET  [Fumigant.IN.'Fumigated'; SETLEVELS=yes]\
            Blocks,Amount,Type,Count,Priorcount
    CALCULATE Logpriorcount = LOG(Priorcount)
    GLTOBITPOISSON [PRINT=model,components,fittedvalues,means,backmeans,\
            waldtests,censored; RANDOM=Blocks; FIXED=Amount*Type;\
            DISPERSION=*] Count; BOUND=400; OFFSET=Logpriorcount