Performs a Tobit linear mixed model analysis on data with fixed-threshold censoring (M.C. Hannah & V.M. Cave).
Options
PRINT = string token |
Controls printed output (summary ); default summ |
VPRINT = string tokens |
Controls printed output from the REML analysis of the data with censored observations replaced by their estimates (model , components , effects , means , stratumvariances , monitoring , vcovariance , deviance , Waldtests , missingvalues ); default mode , comp , Wald |
PSE = string token |
Standard errors to be printed with tables of effects and means from the REML analysis (differences , estimates , alldifferences , allestimates , none ); default diff |
PLOT = string token |
To display a scatter plot of the data with censored observations replaced by their estimates against the observed data(scatterplot ); default * |
MAXCYCLE = scalar |
Sets a limit on the number of iterations performed by the E-M algorithm; default 30 |
TOLERANCE = variate |
Sets tolerance limits for convergence of the E-M algorithm on the treatment means and the variance components; default 0.1 and 0.05 for the treatment means and variance components, respectively |
RMETHOD = string token |
Which random terms to use when calculating the residuals during the E-step of the E-M algorithm (final , all ); default final |
DIRECTION = string token |
The direction of the censoring (left , right ); default left (i.e., the true values for the censored observations are less than or equal to the BOUND ) |
Parameters
Y = variate |
Response variate to be analysed; no default, must be set |
BOUND = scalar |
Censoring threshold; no default, must be set |
CENSORED = variate |
Indicator variable for censored observations, with values of one where the response values are censored and zero otherwise |
INITIAL = scalar or variate |
Scalar or a variate providing starting values for the censored observations in the E-M algorithm |
NEWY = variate |
Saves a copy of the response variate with the censored observations replaced by their estimates |
YCENSORED = variate |
Saves a logical variate indicating which Y values are censored |
SAVE = REML save structure |
REML save structure from the analysis of the data with censored observations replaced by their estimates |
Description
The TOBIT
procedure performs a linear mixed model analysis on data values that are subject to fixed threshold censoring. Such censoring occurs when a measurement cannot be taken above or below a bound. For example, chemical concentrations may be censored when they fall below a minimum level of quantification. The procedure uses an E-M algorithm to estimate values for the censored observations, and once converged, uses REML
to analyse the response variate with the censored observations replaced by their estimates.
TOBIT
must be preceded by a VCOMPONENTS
command to define the fixed and random models. (Note, however, that TOBIT
does not accommodate spline terms in VCOMPONENTS
, nor linear mixed models with complex covariance structures defined by VSTRUCTURE
.)
The response variate must be supplied using the Y
parameter, and a scalar defining the fixed censoring threshold must be supplied using the BOUND
parameter. By default, the data values are assumed to be left-censored (i.e., measurements less than or equal to the value specified by the BOUND
parameter are censored). However, right-censoring (i.e., when measurements greater than or equal to the BOUND
are censored) can be specified by setting the DIRECTION
option to right
. Censored observations in Y
may be represented either by missing values or by values at or outside the BOUND
(i.e., for left-censoring, y-values ≤ BOUND
, or, for right-censoring, y-values ≥ BOUND
). If missing values are used, an indicator variate, with values of one corresponding to censored observations and values of zero to the non-censored observations, must be supplied using the CENSORED
parameter.
The MAXCYCLE
, TOLERANCE
and RMETHOD
options, the INITIAL
parameter and the VAOPTIONS
procedure can be used to control various aspects of the E-M algorithm performed by TOBIT
. The INITIAL
parameter provides starting values for the estimates of the censored observations. If available, these may speed up convergence of the E-M algorithm. The values should be below the value specified by the BOUND
parameter when DIRECTION=left
, or above that value when DIRECTION=right
. INITIAL
can supply a scalar if a common starting value is to be used for all the censored observations. Alternatively, if different values are required, INITIAL
should supply a variate of the same length as Y
. Only the values corresponding to censored observations are used, the others are ignored. If INITIAL
if not specified, the default is to use the value specified by the BOUND
parameter.
The MAXCYCLE
option specifies the maximum number of iterations performed by the E-M algorithm (default 30). By default, the E-M algorithm is deemed to have converged if the percentage change in each estimated treatment mean is less than 0.1%, and the percentage change in each estimated variance component is less than 0.05%. However, you can change these tolerance limits by setting the TOLERANCE
option to a variate of length two. Its first value specifies the maximum acceptable percentage change for the treatment means, and its second value specifies the maximum acceptable percentage change for the variance components.
The RMETHOD
specifies which random terms are used when estimating values for the censored observations during the E-step of the E-M algorithm. With RMETHOD=all
, the censored observations are estimated from the fixed effects only, whereas when RMETHOD=final
, the censored observations are estimated from the fixed and random effects; default final
. Finally, the VAOPTIONS
procedure can be used to specify the MAXCYCLE
and WORKSPACE
options of the REML
commands used during the M-step of the E-M algorithm.
Printed output is controlled by the PRINT
, VPRINT
, and PSE
options. The PRINT
option has one setting, summary
, which prints information on the number of E-M algorithm iterations performed, the percentage of observations censored and the censoring threshold. This is the default, but you can suppress this output by setting option PRINT=*
. The VPRINT
and PSE
options control the printed output from the REML
analysis when the censored observations have been replaced by their estimates. The VPRINT
option has the same settings as the PRINT
option of the REML
directive, other than that covariancemodels
is excluded; the default is PRINT=model,comp,Wald
. Similarly, the setting of PSE
are the same as those of the PSE
option of the REML
directive; the default is PSE=diff
.
You can set option PLOT=scatterplot
to display a scatter plot of the data, plotting the new y-variate, with censored observations replaced by their estimates, against the observed response variate. When censored observations in Y
are entered as missing values, they are plotted at the value specified by the BOUND
parameter; otherwise, they are plotted at the values given in Y
. Superimposed onto this plot are a 1-1 line and a horizontal reference line at the censoring threshold defined by the BOUND
parameter. By default, no plot is produced.
The NEWY
parameter allows you to save a copy of the response variate with the censored observations replaced by their estimates. An indicator variable with values of one corresponding to censored observations in Y
and values of zero to non-censored observations can be saved using the YCENSORED
parameter. Note, this will be equivalent to any variate supplied by CENSORED
. The SAVE
parameter can be used to save the save structure from the REML
analysis of the data with censored observations replaced by their estimates, for later use by other REML
directives and procedures, such as VDISPLAY
and VGRAPH
.
Options: PRINT
, VPRINT
, PSE
, PLOT
, MAXCYCLE
, TOLERANCE
, RMETHOD
, DIRECTION
Parameters: Y
, BOUND
, CENSORED
, INITIAL
, NEWY
, YCENSORED
, SAVE
Method
The E-M (expectation-maximization) algorithm is an iterative two step method to optimize a model. The initial expectation step uses the initial values (either INITIAL
, if given, or BOUND
) for the censored observations. In the maximization step, the current estimates of the censored values are used in the y-variate in a standard REML
analysis to estimate the fitted values and their variances. In subsequent expectation steps, the censored values are estimated as the expected value of the tail of Normal distribution with means and variances for these observations from previous M-step model. The expected deviate in the lower tail of a Normal distribution (x < BOUND
) is
m - SQRT(v)*PRNORMAL(BOUND;m;v)/CLNORMAL(BOUND;m;v)
.
Action with RESTRICT
Restrictions are not allowed.
References
Amemiya, T. (1984). Tobit models: A survey. Journal of Econometrics, 24, 3-61.
Dempster, A.P., Laird, N.M. & Rubin, D.B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39, 1-38.
Taylor, J. (1973). The analysis of designed experiments with censored observations. Biometrics, 29, 35-43.
Tobin, J. (1958). Estimation of relationships for limited dependent variables. Econometrica, 26, 24-36.
See also
Directives: REML
, VCOMPONENTS
, VDISPLAY
, VKEEP
.
Procedures: CENSOR
, GLTOBITPOISSON
, HGTOBITPOISSON
, RTOBITPOISSON
, VAOPTIONS
.
Commands for: REML analysis of linear mixed models.
Example
CAPTION 'TOBIT example',\ !T('Oats.gsh contains yield data from a split-plot experiment.',\ 'For details, see section 5.1 in',\ 'A Guide to Anova and Design in Genstat');\ STYLE=meta,plain SPLOAD [PRINT=summary] '%Data%/Oats.gsh' CAPTION 'Example 1: Yield left-censored data at 70.'; STYLE=meta CALCULATE yield_lc = yield*(yield .GT. 70) + 70*(yield .LE. 70) VCOMPONENTS [FIXED=nitrogen*variety] RANDOM=blocks/wplots/subplots TOBIT [PLOT=scatterplot] Y=yield_lc; BOUND=70 CAPTION 'Example 2: Yield right-censored data at 130.'; STYLE=meta CALCULATE yield_rc = yield*(yield .LT. 130) + 130*(yield .GE. 130) VCOMPONENTS [FIXED=nitrogen*variety] RANDOM=blocks/wplots/subplots TOBIT [DIRECTION=right; PLOT=scatterplot] Y=yield_rc; BOUND=130 CAPTION !T('Example 3: Yield right-censored data at 130,', \ 'recorded as missing values.'); STYLE=meta CALCULATE yield_rc0 = REPLACE(yield_rc; 130; !s(*)) CALCULATE censored = yield_rc0 == !s(*) TOBIT [DIRECTION=right; PLOT=scatterplot] Y=yield_rc0; BOUND=130; \ CENSORED=censored CAPTION 'Example 4: Saving structures to produce extra output.'; STYLE=meta TOBIT [VPRINT=*] Y=yield_lc; BOUND=70; NEWY=newy; YCENSORED=ycen; SAVE=vsave VPLOT [SAVE=vsave; RMETHOD=final] METHOD=fitted; PEN=ycen+1 VPLOT [SAVE=vsave; RMETHOD=all] METHOD=fitted; PEN=ycen+1 FACTOR [LEVELS=!(0,1); LABELS=!T(uncensored,censored)] ycenF; VALUES=ycen DOTHISTOGRAM [KEYDESCRIPTION=''] newy; PENS=ycenF VKEEP [FITTED=fit; SAVE=vsave] DOTHISTOGRAM [KEYDESCRIPTION=''] fit; PENS=ycenF