Performs a Tobit linear mixed model analysis on data with fixed-threshold censoring (M.C. Hannah & V.M. Cave).
|Controls printed output (
||Controls printed output from the
||Standard errors to be printed with tables of effects and means from the
||To display a scatter plot of the data with censored observations replaced by their estimates against the observed data(
||Sets a limit on the number of iterations performed by the E-M algorithm; default
||Sets tolerance limits for convergence of the E-M algorithm on the treatment means and the variance components; default 0.1 and 0.05 for the treatment means and variance components, respectively|
||Which random terms to use when calculating the residuals during the E-step of the E-M algorithm (
||The direction of the censoring (
||Response variate to be analysed; no default, must be set|
||Censoring threshold; no default, must be set|
||Indicator variable for censored observations, with values of one where the response values are censored and zero otherwise|
||Scalar or a variate providing starting values for the censored observations in the E-M algorithm|
||Saves a copy of the response variate with the censored observations replaced by their estimates|
||Saves a logical variate indicating which
TOBIT procedure performs a linear mixed model analysis on data values that are subject to fixed threshold censoring. Such censoring occurs when a measurement cannot be taken above or below a bound. For example, chemical concentrations may be censored when they fall below a minimum level of quantification. The procedure uses an E-M algorithm to estimate values for the censored observations, and once converged, uses
REML to analyse the response variate with the censored observations replaced by their estimates.
TOBIT must be preceded by a
VCOMPONENTS command to define the fixed and random models. (Note, however, that
TOBIT does not accommodate spline terms in
VCOMPONENTS, nor linear mixed models with complex covariance structures defined by
The response variate must be supplied using the
Y parameter, and a scalar defining the fixed censoring threshold must be supplied using the
BOUND parameter. By default, the data values are assumed to be left-censored (i.e., measurements less than or equal to the value specified by the
BOUND parameter are censored). However, right-censoring (i.e., when measurements greater than or equal to the
BOUND are censored) can be specified by setting the
DIRECTION option to
right. Censored observations in
Y may be represented either by missing values or by values at or outside the
BOUND (i.e., for left-censoring, y-values ≤
BOUND, or, for right-censoring, y-values ≥
BOUND). If missing values are used, an indicator variate, with values of one corresponding to censored observations and values of zero to the non-censored observations, must be supplied using the
RMETHOD options, the
INITIAL parameter and the
VAOPTIONS procedure can be used to control various aspects of the E-M algorithm performed by
INITIAL parameter provides starting values for the estimates of the censored observations. If available, these may speed up convergence of the E-M algorithm. The values should be below the value specified by the
BOUND parameter when
DIRECTION=left, or above that value when
INITIAL can supply a scalar if a common starting value is to be used for all the censored observations. Alternatively, if different values are required,
INITIAL should supply a variate of the same length as
Y. Only the values corresponding to censored observations are used, the others are ignored. If
INITIAL if not specified, the default is to use the value specified by the
MAXCYCLE option specifies the maximum number of iterations performed by the E-M algorithm (default 30). By default, the E-M algorithm is deemed to have converged if the percentage change in each estimated treatment mean is less than 0.1%, and the percentage change in each estimated variance component is less than 0.05%. However, you can change these tolerance limits by setting the
TOLERANCE option to a variate of length two. Its first value specifies the maximum acceptable percentage change for the treatment means, and its second value specifies the maximum acceptable percentage change for the variance components.
RMETHOD specifies which random terms are used when estimating values for the censored observations during the E-step of the E-M algorithm. With
RMETHOD=all, the censored observations are estimated from the fixed effects only, whereas when
RMETHOD=final, the censored observations are estimated from the fixed and random effects; default
final. Finally, the
VAOPTIONS procedure can be used to specify the
WORKSPACE options of the
REML commands used during the M-step of the E-M algorithm.
Printed output is controlled by the
PSE options. The
summary, which prints information on the number of E-M algorithm iterations performed, the percentage of observations censored and the censoring threshold. This is the default, but you can suppress this output by setting option
PSE options control the printed output from the
REML analysis when the censored observations have been replaced by their estimates. The
VPRINT option has the same settings as the
REML directive, other than that
covariancemodels is excluded; the default is
PRINT=model,comp,Wald. Similarly, the setting of
PSE are the same as those of the
PSE option of the
REML directive; the default is
You can set option
PLOT=scatterplot to display a scatter plot of the data, plotting the new y-variate, with censored observations replaced by their estimates, against the observed response variate. When censored observations in
Y are entered as missing values, they are plotted at the value specified by the
BOUND parameter; otherwise, they are plotted at the values given in
Y. Superimposed onto this plot are a 1-1 line and a horizontal reference line at the censoring threshold defined by the
BOUND parameter. By default, no plot is produced.
NEWY parameter allows you to save a copy of the response variate with the censored observations replaced by their estimates. An indicator variable with values of one corresponding to censored observations in
Y and values of zero to non-censored observations can be saved using the
YCENSORED parameter. Note, this will be equivalent to any variate supplied by
SAVE parameter can be used to save the save structure from the
REML analysis of the data with censored observations replaced by their estimates, for later use by other
REML directives and procedures, such as
The E-M (expectation-maximization) algorithm is an iterative two step method to optimize a model. The initial expectation step uses the initial values (either
INITIAL, if given, or
BOUND) for the censored observations. In the maximization step, the current estimates of the censored values are used in the y-variate in a standard
REML analysis to estimate the fitted values and their variances. In subsequent expectation steps, the censored values are estimated as the expected value of the tail of Normal distribution with means and variances for these observations from previous M-step model. The expected deviate in the lower tail of a Normal distribution (x <
m - SQRT(v)*PRNORMAL(BOUND;m;v)/CLNORMAL(BOUND;m;v).
Restrictions are not allowed.
Amemiya, T. (1984). Tobit models: A survey. Journal of Econometrics, 24, 3-61.
Dempster, A.P., Laird, N.M. & Rubin, D.B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39, 1-38.
Taylor, J. (1973). The analysis of designed experiments with censored observations. Biometrics, 29, 35-43.
Tobin, J. (1958). Estimation of relationships for limited dependent variables. Econometrica, 26, 24-36.
CAPTION 'TOBIT example',\ !T('Oats.gsh contains yield data from a split-plot experiment.',\ 'For details, see section 5.1 in',\ 'A Guide to Anova and Design in Genstat');\ STYLE=meta,plain SPLOAD [PRINT=summary] '%Data%/Oats.gsh' CAPTION 'Example 1: Yield left-censored data at 70.'; STYLE=meta CALCULATE yield_lc = yield*(yield .GT. 70) + 70*(yield .LE. 70) VCOMPONENTS [FIXED=nitrogen*variety] RANDOM=blocks/wplots/subplots TOBIT [PLOT=scatterplot] Y=yield_lc; BOUND=70 CAPTION 'Example 2: Yield right-censored data at 130.'; STYLE=meta CALCULATE yield_rc = yield*(yield .LT. 130) + 130*(yield .GE. 130) VCOMPONENTS [FIXED=nitrogen*variety] RANDOM=blocks/wplots/subplots TOBIT [DIRECTION=right; PLOT=scatterplot] Y=yield_rc; BOUND=130 CAPTION !T('Example 3: Yield right-censored data at 130,', \ 'recorded as missing values.'); STYLE=meta CALCULATE yield_rc0 = REPLACE(yield_rc; 130; !s(*)) CALCULATE censored = yield_rc0 == !s(*) TOBIT [DIRECTION=right; PLOT=scatterplot] Y=yield_rc0; BOUND=130; \ CENSORED=censored CAPTION 'Example 4: Saving structures to produce extra output.'; STYLE=meta TOBIT [VPRINT=*] Y=yield_lc; BOUND=70; NEWY=newy; YCENSORED=ycen; SAVE=vsave VPLOT [SAVE=vsave; RMETHOD=final] METHOD=fitted; PEN=ycen+1 VPLOT [SAVE=vsave; RMETHOD=all] METHOD=fitted; PEN=ycen+1 FACTOR [LEVELS=!(0,1); LABELS=!T(uncensored,censored)] ycenF; VALUES=ycen DOTHISTOGRAM [KEYDESCRIPTION=''] newy; PENS=ycenF VKEEP [FITTED=fit; SAVE=vsave] DOTHISTOGRAM [KEYDESCRIPTION=''] fit; PENS=ycenF