Performs a Tobit linear mixed model analysis on data with fixed-threshold censoring (M.C. Hannah & V.M. Cave).

### Options

| Option | Description |
|---|---|
| `PRINT` = string token | Controls printed output (`summary`); default `summ` |
| `VPRINT` = string tokens | Controls printed output from the `REML` analysis of the data with censored observations replaced by their estimates (`model`, `components`, `effects`, `means`, `stratumvariances`, `monitoring`, `vcovariance`, `deviance`, `Waldtests`, `missingvalues`); default `mode`, `comp`, `Wald` |
| `PSE` = string token | Standard errors to be printed with tables of effects and means from the `REML` analysis (`differences`, `estimates`, `alldifferences`, `allestimates`, `none`); default `diff` |
| `PLOT` = string token | To display a scatter plot of the data with censored observations replaced by their estimates against the observed data (`scatterplot`); default `*` |
| `MAXCYCLE` = scalar | Sets a limit on the number of iterations performed by the E-M algorithm; default `30` |
| `TOLERANCE` = variate | Sets tolerance limits for convergence of the E-M algorithm on the treatment means and the variance components; default 0.1 and 0.05 for the treatment means and variance components, respectively |
| `RMETHOD` = string token | Which random terms to use when calculating the residuals during the E-step of the E-M algorithm (`final`, `all`); default `final` |
| `DIRECTION` = string token | The direction of the censoring (`left`, `right`); default `left` (i.e., the true values for the censored observations are less than or equal to the `BOUND`) |

### Parameters

| Parameter | Description |
|---|---|
| `Y` = variate | Response variate to be analysed; no default, must be set |
| `BOUND` = scalar | Censoring threshold; no default, must be set |
| `CENSORED` = variate | Indicator variable for censored observations, with values of one where the response values are censored and zero otherwise |
| `INITIAL` = scalar or variate | Scalar or a variate providing starting values for the censored observations in the E-M algorithm |
| `NEWY` = variate | Saves a copy of the response variate with the censored observations replaced by their estimates |
| `YCENSORED` = variate | Saves a logical variate indicating which `Y` values are censored |
| `SAVE` = save structure | `REML` save structure from the analysis of the data with censored observations replaced by their estimates |

### Description

The `TOBIT` procedure performs a linear mixed model analysis on data values that are subject to fixed threshold censoring. Such censoring occurs when a measurement cannot be taken above or below a bound. For example, chemical concentrations may be censored when they fall below a minimum level of quantification. The procedure uses an E-M algorithm to estimate values for the censored observations, and once converged, uses `REML` to analyse the response variate with the censored observations replaced by their estimates.

`TOBIT` must be preceded by a `VCOMPONENTS` command to define the fixed and random models. (Note, however, that `TOBIT` does not accommodate spline terms in `VCOMPONENTS`, nor linear mixed models with complex covariance structures defined by `VSTRUCTURE`.)

The response variate must be supplied using the `Y` parameter, and a scalar defining the fixed censoring threshold must be supplied using the `BOUND` parameter. By default, the data values are assumed to be left-censored (i.e., measurements less than or equal to the value specified by the `BOUND` parameter are censored). However, right-censoring (i.e., when measurements greater than or equal to the `BOUND` are censored) can be specified by setting the `DIRECTION` option to `right`. Censored observations in `Y` may be represented either by missing values or by values at or outside the `BOUND` (i.e., for left-censoring, y-values ≤ `BOUND`, or, for right-censoring, y-values ≥ `BOUND`). If missing values are used, an indicator variate, with values of one corresponding to censored observations and values of zero to the non-censored observations, must be supplied using the `CENSORED` parameter.
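The rule for identifying censored observations can be sketched outside Genstat. The Python function below is an illustration only and is not part of `TOBIT`: the name `censored_indicator` and its arguments are hypothetical, and missing values are represented by `None`. It returns the kind of 0/1 indicator that `CENSORED` supplies.

```python
from typing import List, Optional, Sequence


def censored_indicator(
    y: Sequence[Optional[float]],
    bound: float,
    direction: str = "left",
    censored: Optional[Sequence[int]] = None,
) -> List[int]:
    """Return 1 for censored observations and 0 otherwise (hypothetical helper)."""
    if direction not in ("left", "right"):
        raise ValueError("direction must be 'left' or 'right'")
    # An explicit indicator takes precedence, as when censored values are missing.
    if censored is not None:
        if len(censored) != len(y):
            raise ValueError("censored must have the same length as y")
        return [int(bool(c)) for c in censored]
    # Without an indicator, missing values cannot be classified.
    if any(value is None for value in y):
        raise ValueError("missing values in y require an explicit indicator")
    if direction == "left":
        return [int(value <= bound) for value in y]
    return [int(value >= bound) for value in y]


print(censored_indicator([70, 85.5, 62, 101], bound=70))
# [1, 0, 1, 0]
print(censored_indicator([130, None, 95], bound=130, direction="right",
                         censored=[1, 1, 0]))
# [1, 1, 0]
```

Raising an error when missing values arrive without an indicator mirrors the requirement above: a missing value on its own could be either censored or genuinely unrecorded.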

The `MAXCYCLE`, `TOLERANCE` and `RMETHOD` options, the `INITIAL` parameter and the `VAOPTIONS` procedure can be used to control various aspects of the E-M algorithm performed by `TOBIT`. The `INITIAL` parameter provides starting values for the estimates of the censored observations. If available, these may speed up convergence of the E-M algorithm. The values should be below the value specified by the `BOUND` parameter when `DIRECTION=left`, or above that value when `DIRECTION=right`. `INITIAL` can supply a scalar if a common starting value is to be used for all the censored observations. Alternatively, if different values are required, `INITIAL` should supply a variate of the same length as `Y`. Only the values corresponding to censored observations are used; the others are ignored. If `INITIAL` is not specified, the default is to use the value specified by the `BOUND` parameter.

The `MAXCYCLE` option specifies the maximum number of iterations performed by the E-M algorithm (default 30). By default, the E-M algorithm is deemed to have converged if the percentage change in each estimated treatment mean is less than 0.1%, and the percentage change in each estimated variance component is less than 0.05%. However, you can change these tolerance limits by setting the `TOLERANCE` option to a variate of length two. Its first value specifies the maximum acceptable percentage change for the treatment means, and its second value specifies the maximum acceptable percentage change for the variance components.
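The convergence test can be illustrated with a short Python sketch. It is not the procedure's own code: the function names are hypothetical, and the percentage change is assumed to be 100 × |new − old| / |old|, which the description above does not spell out.

```python
from typing import Sequence, Tuple


def percent_change(old: float, new: float) -> float:
    """Absolute percentage change from old to new (assumed definition)."""
    if old == 0.0:
        # Avoid division by zero; any move away from zero counts as a large change.
        return 0.0 if new == 0.0 else float("inf")
    return 100.0 * abs(new - old) / abs(old)


def has_converged(
    old_means: Sequence[float],
    new_means: Sequence[float],
    old_components: Sequence[float],
    new_components: Sequence[float],
    tolerance: Tuple[float, float] = (0.1, 0.05),
) -> bool:
    """True when every mean and every variance component changed by less
    than its tolerance (in percent) between two successive iterations."""
    mean_tol, component_tol = tolerance
    means_ok = all(
        percent_change(o, n) < mean_tol for o, n in zip(old_means, new_means)
    )
    components_ok = all(
        percent_change(o, n) < component_tol
        for o, n in zip(old_components, new_components)
    )
    return means_ok and components_ok


# Means change by about 0.05%, variance components by about 0.02%.
print(has_converged([100.0, 120.0], [100.05, 120.06], [50.0, 10.0], [50.01, 10.002]))
# True
print(has_converged([100.0, 120.0], [100.05, 120.06], [50.0, 10.0], [50.01, 10.002],
                    tolerance=(0.01, 0.01)))
# False
```

The two-element `tolerance` argument plays the role of the length-two `TOLERANCE` variate: the first value applies to the treatment means and the second to the variance components.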

The `RMETHOD` option specifies which random terms are used when estimating values for the censored observations during the E-step of the E-M algorithm. With `RMETHOD=all`, the censored observations are estimated from the fixed effects only, whereas with `RMETHOD=final`, the censored observations are estimated from the fixed and random effects; the default is `final`. Finally, the `VAOPTIONS` procedure can be used to specify the `MAXCYCLE` and `WORKSPACE` options of the `REML` commands used during the M-step of the E-M algorithm.

Printed output is controlled by the `PRINT`, `VPRINT`, and `PSE` options. The `PRINT` option has one setting, `summary`, which prints information on the number of E-M algorithm iterations performed, the percentage of observations censored and the censoring threshold. This is the default, but you can suppress this output by setting option `PRINT=*`. The `VPRINT` and `PSE` options control the printed output from the `REML` analysis when the censored observations have been replaced by their estimates. The `VPRINT` option has the same settings as the `PRINT` option of the `REML` directive, other than that `covariancemodels` is excluded; the default is `VPRINT=model,comp,Wald`. Similarly, the settings of `PSE` are the same as those of the `PSE` option of the `REML` directive; the default is `PSE=diff`.

You can set option `PLOT=scatterplot` to display a scatter plot of the data, plotting the new y-variate, with censored observations replaced by their estimates, against the observed response variate. When censored observations in `Y` are entered as missing values, they are plotted at the value specified by the `BOUND` parameter; otherwise, they are plotted at the values given in `Y`. Superimposed onto this plot are a 1-1 line and a horizontal reference line at the censoring threshold defined by the `BOUND` parameter. By default, no plot is produced.

The `NEWY` parameter allows you to save a copy of the response variate with the censored observations replaced by their estimates. An indicator variable with values of one corresponding to censored observations in `Y` and values of zero to non-censored observations can be saved using the `YCENSORED` parameter. Note, this will be equivalent to any variate supplied by `CENSORED`. The `SAVE` parameter can be used to save the save structure from the `REML` analysis of the data with censored observations replaced by their estimates, for later use by other `REML` directives and procedures, such as `VDISPLAY` and `VGRAPH`.

Options: `PRINT`, `VPRINT`, `PSE`, `PLOT`, `MAXCYCLE`, `TOLERANCE`, `RMETHOD`, `DIRECTION`.

Parameters: `Y`, `BOUND`, `CENSORED`, `INITIAL`, `NEWY`, `YCENSORED`, `SAVE`.

### Method

The E-M (expectation-maximization) algorithm is an iterative two-step method for optimizing a model. The initial expectation step uses the initial values (either `INITIAL`, if given, or `BOUND`) for the censored observations. In the maximization step, the current estimates of the censored values are used in the y-variate in a standard `REML` analysis to estimate the fitted values and their variances. In subsequent expectation steps, each censored value is estimated as the expected value of the tail of a Normal distribution, with the mean and variance for that observation taken from the previous M-step model. The expected deviate in the lower tail of a Normal distribution (x < `BOUND`) with mean `m` and variance `v` is

`m - SQRT(v)*PRNORMAL(BOUND;m;v)/CLNORMAL(BOUND;m;v)`.
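As a rough illustration of the two steps, the Python sketch below applies the same idea to a much simpler setting than `TOBIT` handles: an intercept-only model with no random effects, for which the REML fit reduces to the sample mean and the sample variance with divisor n − 1. All function names are hypothetical. The tail expectation is written in terms of the standard Normal density and distribution function evaluated at z = (bound − m)/√v; how this standardised form maps onto the Genstat functions in the expression above is an assumption.

```python
import math
from statistics import fmean, variance
from typing import List, Optional, Sequence


def normal_pdf(z: float) -> float:
    """Standard Normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z: float) -> float:
    """Standard Normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def truncated_normal_mean(m: float, v: float, bound: float,
                          direction: str = "left") -> float:
    """Expected value of a Normal(m, v) variable given that it lies below
    (left) or above (right) the bound. No guard is included for the case
    where the tail probability underflows to zero."""
    s = math.sqrt(v)
    z = (bound - m) / s
    if direction == "left":
        return m - s * normal_pdf(z) / normal_cdf(z)
    return m + s * normal_pdf(z) / (1.0 - normal_cdf(z))


def impute_censored(y: Sequence[Optional[float]], censored: Sequence[int],
                    bound: float, direction: str = "left",
                    initial: Optional[float] = None, max_cycle: int = 30,
                    tolerance: float = 1e-8) -> List[float]:
    """Toy E-M style loop for an intercept-only model (illustration only)."""
    # Starting values: the supplied initial value, or else the bound.
    start = bound if initial is None else initial
    filled = [float(start) if c else float(value)
              for value, c in zip(y, censored)]
    for _ in range(max_cycle):
        # "M-step": refit the model to the completed data.
        m = fmean(filled)
        v = variance(filled)
        if v <= 0.0:
            raise ValueError("completed data have zero variance")
        # "E-step": replace each censored value by its tail expectation.
        estimate = truncated_normal_mean(m, v, bound, direction)
        updated = [estimate if c else value
                   for value, c in zip(filled, censored)]
        change = max(abs(a - b) for a, b in zip(updated, filled))
        filled = updated
        if change < tolerance:
            break
    return filled


new_y = impute_censored([70, 70, 82, 95, 110, 74, 88],
                        [1, 1, 0, 0, 0, 0, 0], bound=70)
print(new_y)  # the two censored values are replaced by an estimate below 70
```

In a mixed model each censored observation has its own mean and variance from the fit, so the estimates generally differ between observations; here they coincide only because the toy model has a single common mean.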

### Action with `RESTRICT`

Restrictions are not allowed.

### References

Amemiya, T. (1984). Tobit models: A survey. *Journal of Econometrics*, **24**, 3-61.

Dempster, A.P., Laird, N.M. & Rubin, D.B. (1977). Maximum likelihood from incomplete data via the EM algorithm. *Journal of the Royal Statistical Society, Series B*, **39**, 1-38.

Taylor, J. (1973). The analysis of designed experiments with censored observations. *Biometrics*, **29**, 35-43.

Tobin, J. (1958). Estimation of relationships for limited dependent variables. *Econometrica*, **26**, 24-36.

### See also

Directives: `REML`, `VCOMPONENTS`, `VDISPLAY`, `VKEEP`.

Procedures: `CENSOR`, `GLTOBITPOISSON`, `HGTOBITPOISSON`, `RTOBITPOISSON`, `VAOPTIONS`.

Commands for: REML analysis of linear mixed models.

### Example

    CAPTION 'TOBIT example',\
            !T('Oats.gsh contains yield data from a split-plot experiment.',\
            'For details, see section 5.1 in',\
            'A Guide to Anova and Design in Genstat');\
            STYLE=meta,plain
    SPLOAD [PRINT=summary] '%Data%/Oats.gsh'

    CAPTION 'Example 1: Yield left-censored data at 70.'; STYLE=meta
    CALCULATE yield_lc = yield*(yield .GT. 70) + 70*(yield .LE. 70)
    VCOMPONENTS [FIXED=nitrogen*variety] RANDOM=blocks/wplots/subplots
    TOBIT [PLOT=scatterplot] Y=yield_lc; BOUND=70

    CAPTION 'Example 2: Yield right-censored data at 130.'; STYLE=meta
    CALCULATE yield_rc = yield*(yield .LT. 130) + 130*(yield .GE. 130)
    VCOMPONENTS [FIXED=nitrogen*variety] RANDOM=blocks/wplots/subplots
    TOBIT [DIRECTION=right; PLOT=scatterplot] Y=yield_rc; BOUND=130

    CAPTION !T('Example 3: Yield right-censored data at 130,', \
            'recorded as missing values.'); STYLE=meta
    CALCULATE yield_rc0 = REPLACE(yield_rc; 130; !s(*))
    CALCULATE censored = yield_rc0 == !s(*)
    TOBIT [DIRECTION=right; PLOT=scatterplot] Y=yield_rc0; BOUND=130; \
          CENSORED=censored

    CAPTION 'Example 4: Saving structures to produce extra output.'; STYLE=meta
    TOBIT [VPRINT=*] Y=yield_lc; BOUND=70; NEWY=newy; YCENSORED=ycen; SAVE=vsave
    VPLOT [SAVE=vsave; RMETHOD=final] METHOD=fitted; PEN=ycen+1
    VPLOT [SAVE=vsave; RMETHOD=all] METHOD=fitted; PEN=ycen+1
    FACTOR [LEVELS=!(0,1); LABELS=!T(uncensored,censored)] ycenF; VALUES=ycen
    DOTHISTOGRAM [KEYDESCRIPTION=''] newy; PENS=ycenF
    VKEEP [FITTED=fit; SAVE=vsave]
    DOTHISTOGRAM [KEYDESCRIPTION=''] fit; PENS=ycenF