Uses the Tobit method to fit models to censored Poisson data (R.W. Payne).

### Options

| Option | Description |
|---|---|
| `PRINT` = string tokens | What to print (`model`, `deviance`, `summary`, `estimates`, `correlations`, `fittedvalues`, `accumulated`, `monitoring`, `confidence`, `censored`); default `mode`, `summ`, `esti` |
| `TERMS` = formula | Defines the model to be fitted |
| `CONSTANT` = string token | How to treat the constant (`estimate`, `omit`); default `esti` |
| `FACTORIAL` = scalar | Limit for expansion of model terms; default 3 |
| `POOL` = string token | Whether to pool ss in accumulated summary between all terms fitted in a linear model (`yes`, `no`); default `no` |
| `DENOMINATOR` = string token | Whether to base ratios in accumulated summary on rms from model with smallest residual ss or smallest residual ms (`ss`, `ms`); default `ss` |
| `NOMESSAGE` = string tokens | Which warning messages to suppress (`dispersion`, `leverage`, `residual`, `aliasing`, `marginality`, `vertical`, `df`, `inflation`); default `*` |
| `FPROBABILITY` = string token | Printing of probabilities for variance and deviance ratios (`yes`, `no`); default `no` |
| `TPROBABILITY` = string token | Printing of probabilities for t-statistics (`yes`, `no`); default `no` |
| `SELECTION` = string tokens | Statistics to be displayed in the summary of analysis produced by `PRINT=summary` (`%variance`, `%ss`, `adjustedr2`, `r2`, `dispersion`, `%meandeviance`, `%deviance`, `aic`, `bic`, `sic`); default `disp` |
| `DISPERSION` = scalar | Dispersion parameter; default 1 |
| `PROBABILITY` = scalar | Probability level for confidence intervals for parameter estimates; default 0.95 |
| `WEIGHTS` = variate | Variate of weights for weighted regression; default `*` |
| `GROUPS` = factor | Absorbing factor defining the groups for within-groups regression; default `*` |
| `MAXCYCLE` = scalar | Sets a limit on the number of iterations performed by the E-M algorithm; default 100 |
| `TOLERANCE` = variate | Sets tolerance limits for convergence of the E-M algorithm on the estimates of the censored observations; default 0.001 |
| `DIRECTION` = string token | Whether the data are left or right censored (`left`, `right`); default `righ` |

### Parameters

| Parameter | Description |
|---|---|
| `Y` = variate | Response variate to be analysed; must be set |
| `BOUND` = scalar | Censoring threshold; must be set |
| `INITIAL` = scalar or variate | Scalar or a variate providing starting values for the censored observations in the E-M algorithm; default `BOUND+1` for right-censored data and default `BOUND-1` for left-censored data |
| `NEWY` = variate | Saves a copy of the response variate with the censored observations replaced by their estimates |
| `OFFSET` = variate | Offset variate |
| `EXIT` = scalar | Exit status (0 for success, 1 for failure to converge) |
| `SAVE` = regression save structure | Save structure from the analysis of the data with censored observations replaced by their estimates |

### Description

When an experiment generates a mixture of small and very large counts, it may be convenient to count only the observations less than a specified boundary value, and enter that value for the larger observations. The data then come from a right-censored Poisson distribution. In the similar (but less common) left-censored situation, the emphasis is on the larger observations. It may then not be worth recording the small observations in detail, only that they are no larger than the boundary value. Censored Poisson data can be analysed by the Tobit method (Terza 1985), which is implemented in this procedure.
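In symbols, the recording rule described above can be written as follows. The notation is introduced here for illustration only and is not part of the procedure: $n_i$ is the true count on unit $i$, $y_i$ the recorded value, $b$ the boundary value and $\mu_i$ the Poisson mean.

$$
\text{right-censored:}\quad y_i = \min(n_i, b), \qquad
P(Y_i = y) =
\begin{cases}
e^{-\mu_i}\mu_i^{y}/y! & y < b \\
\sum_{k=b}^{\infty} e^{-\mu_i}\mu_i^{k}/k! & y = b
\end{cases}
$$

$$
\text{left-censored:}\quad y_i = \max(n_i, b), \qquad
P(Y_i = y) =
\begin{cases}
e^{-\mu_i}\mu_i^{y}/y! & y > b \\
\sum_{k=0}^{b} e^{-\mu_i}\mu_i^{k}/k! & y = b
\end{cases}
$$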

In the Tobit model, the probabilities for the uncensored observations are standard Poisson probabilities. The probabilities for right-censored observations are cumulative upper Poisson probabilities for values greater than or equal to the boundary value. Probabilities for left-censored observations are cumulative lower Poisson probabilities for values less than or equal to the boundary value.

The Tobit method uses an E-M (expectation-maximization) algorithm to estimate values for the censored observations. It starts with initial estimates for the censored observations, which can be specified by the `INITIAL` parameter in either a variate or a scalar. For right-censored data the default is to use the boundary value plus one. For left-censored data the default is the boundary value minus one. In each iteration, the method first fits a Poisson-log generalized linear model (a Poisson distribution with a log link), saving the resulting fitted values to provide estimated means for the Poisson distributions of the censored observations. The new estimates for the censored observations are then given by the expected values for the upper parts of those Poisson distributions (or the lower parts, for left-censored data). The process continues either until the updates to the estimates are less than or equal to the value specified by the `TOLERANCE` option (default 0.001), or until the number of iterations equals the number specified by the `MAXCYCLE` option (default 100). The `EXIT` parameter can supply a scalar, which is set to zero for a successful fit, one for failure in the E-M algorithm, or a missing value for an earlier fault.
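The iteration can be sketched in a few lines of Python. This is an illustration only, not the procedure's Genstat implementation, and it makes some simplifying assumptions: the data are right censored, the model contains only a constant (so the Poisson-log fit reduces to the mean of the current responses), and the censored part of the distribution is taken to include the boundary value. The names `upper_mean` and `em_right_censored` are hypothetical.

```python
import math

def upper_mean(mu, bound):
    """E[N | N >= bound] for N ~ Poisson(mu), by direct summation of the tail."""
    top = max(mu, bound)
    hi = int(top + 12.0 * math.sqrt(top + 1.0) + 30)  # terms beyond hi are negligible
    ks = range(bound, hi + 1)
    # Work with log-probabilities and rescale to avoid underflow in the far tail.
    logp = [k * math.log(mu) - mu - math.lgamma(k + 1) for k in ks]
    peak = max(logp)
    w = [math.exp(lp - peak) for lp in logp]
    return sum(k * wk for k, wk in zip(ks, w)) / sum(w)

def em_right_censored(y, bound, tolerance=0.001, maxcycle=100):
    """Constant-only sketch of the E-M iteration; returns (estimates, mean, exit)."""
    censored = [v >= bound for v in y]  # values recorded at the boundary
    # Starting values: boundary plus one for the censored observations.
    est = [float(bound + 1) if c else float(v) for v, c in zip(y, censored)]
    mu = sum(est) / len(est)
    for _ in range(maxcycle):
        # Fit step: with only a constant, the fitted value is the current mean.
        mu = sum(est) / len(est)
        # Update step: replace censored values by the upper-part expectation.
        e_cens = upper_mean(mu, bound)
        new = [e_cens if c else v for v, c in zip(est, censored)]
        change = max((abs(a - b) for a, b, c in zip(new, est, censored) if c),
                     default=0.0)
        est = new
        if change <= tolerance:
            return est, mu, 0  # converged
    return est, mu, 1  # failed to converge within maxcycle iterations

# Hypothetical data: counts recorded in full below 10, entered as 10 otherwise.
y = [3, 5, 2, 10, 10, 4, 10, 6]
estimates, mean, status = em_right_censored(y, bound=10)
print(status, round(mean, 3), [round(e, 3) for e in estimates])
```

With explanatory terms in the model, the fit step would be a full Poisson-log regression and each censored unit would have its own fitted mean, but the alternation between fitting and updating is the same.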

The model to be fitted is specified by the `TERMS` option, and can contain an offset specified by the `OFFSET` parameter. The `CONSTANT` option indicates whether the constant is to be estimated or omitted, and the `FACTORIAL` option sets a limit on the number of variates and/or factors in the model terms, in the usual way.

The response variate is specified by the `Y` parameter, and the `NEWY` parameter can save a variate where the censored observations are replaced by their estimates. The `SAVE` parameter can save a regression save structure for the analysis that can be used to display further output, or save information from the analysis, as usual. The `BOUND` parameter specifies the boundary value for the censoring (and the value that has been entered to indicate the censored observations in the `Y` variate). The `DIRECTION` option specifies whether the data are left or right censored. The default is that they are right censored.

The `PRINT` option controls the printed output. The settings are as in the `FIT` directive, except that the `monitoring` setting prints monitoring information for the E-M algorithm, and that there is an additional setting `censored` to print the estimates of the censored observations. The `WEIGHTS` and `GROUPS` options operate as in the `MODEL` directive. `WEIGHTS` can be used to specify duplicate observations (and the Tobit calculations are then still valid). For example, you could use a weight of two to supply a single unit in the data for two observations with an identical response and identical explanatory variates. The other options (`POOL`, `DENOMINATOR`, `NOMESSAGE`, `FPROBABILITY`, `TPROBABILITY`, `SELECTION`, `DISPERSION` and `PROBABILITY`) all operate like those of `FIT`.
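To illustrate the use of `WEIGHTS` for duplicate observations, the following Python sketch collapses identical units into a single row with a frequency weight and checks that a weighted summary agrees with the summary of the full data. The data and variable names are hypothetical, and the sketch only shows the bookkeeping; it does not call the procedure.

```python
from collections import Counter

# Hypothetical units: (count, treatment). Six observations, three distinct rows.
observations = [(3, "A"), (3, "A"), (7, "B"), (200, "B"), (200, "B"), (200, "B")]

# Collapse identical rows; the tally of each row becomes its weight.
tally = Counter(observations)
units = list(tally)
weights = [tally[u] for u in units]

# The weighted mean of the collapsed data equals the plain mean of the full data.
plain_mean = sum(c for c, _ in observations) / len(observations)
weighted_mean = sum(w * c for (c, _), w in zip(units, weights)) / sum(weights)
print(units, weights, plain_mean, weighted_mean)
```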

Options: `PRINT`, `TERMS`, `CONSTANT`, `FACTORIAL`, `POOL`, `DENOMINATOR`, `NOMESSAGE`, `FPROBABILITY`, `TPROBABILITY`, `SELECTION`, `DISPERSION`, `PROBABILITY`, `MAXCYCLE`, `TOLERANCE`, `DIRECTION`.

Parameters: `Y`, `BOUND`, `INITIAL`, `NEWY`, `OFFSET`, `EXIT`, `SAVE`.

### Method

The expected values for the upper parts of the Poisson distributions are calculated by the `EUPOISSON` procedure, and those for the lower parts of the distributions are calculated by the `ELPOISSON` procedure.
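For reference, these expectations can be written in closed form. The notation is illustrative: $N$ is a Poisson variable with mean $\mu$ and $b \ge 1$ is the boundary value. The boundary is assumed to be included in the censored part, following the Description; the exact conventions of `EUPOISSON` and `ELPOISSON` are not restated here.

$$
E[N \mid N \ge b]
= \frac{\sum_{k=b}^{\infty} k\, e^{-\mu}\mu^{k}/k!}{\sum_{k=b}^{\infty} e^{-\mu}\mu^{k}/k!}
= \mu\,\frac{P(N \ge b-1)}{P(N \ge b)}
$$

$$
E[N \mid N \le b]
= \frac{\sum_{k=0}^{b} k\, e^{-\mu}\mu^{k}/k!}{\sum_{k=0}^{b} e^{-\mu}\mu^{k}/k!}
= \mu\,\frac{P(N \le b-1)}{P(N \le b)}
$$

Both simplifications use $k\,\mu^{k}/k! = \mu\,\mu^{k-1}/(k-1)!$, which shifts the index of the sum in the numerator down by one.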

### Action with `RESTRICT`

As in `FIT`, the y-variate or any of the model variates or factors can be restricted to analyse a subset of the data.

### Reference

Terza, J.V. (1985). A Tobit-type estimator for the censored Poisson regression model. *Economics Letters*, **18**, 361-365.

### See also

Directive: `FIT`.

Procedures: `CENSOR`, `ELPOISSON`, `EUPOISSON`, `GLTOBITPOISSON`, `HGTOBITPOISSON`, `TOBIT`.

Commands for: Regression analysis.

### Example

    CAPTION 'RTOBITPOISSON example',\
      !t('Experiment to investigate whether a novel endophyte provides',\
      'ryegrass with protection against a common pasture insect pest,',\
      'thus resulting in bigger plants with more tillers.',\
      'Treatment factors: Cultivar A or B; Endophyte E+ or E- (present or absent);',\
      'and Insect yes or no (whether or not treated with the insect pest).',\
      'Counts were censored at 200.');\
      STYLE=meta,plain
    SPLOAD '%data%/CensoredCounts.gsh'
    RTOBITPOISSON [PRINT=model,estimates,fittedvalues,accumulated;\
      FPROBABILITY=yes; TERMS=Replicate+Cultivar*Endophyte*Insect;\
      DISPERSION=*; MAXCYCLE=100] Tillers; BOUND=200
    PREDICT [PRINT=predictions,sed] Endophyte,Insect