Uses the Tobit method for analysis of variance of an unbalanced design with censored data (R.W. Payne & V.M. Cave).
Options
PRINT = string tokens
Controls printed output (aovtable, effects, means, residuals, %cv, censored, monitoring); default aovt, mean, cens
FACTORIAL = scalar
Limit on number of factors in a treatment term; default 3
PFACTORIAL = scalar
Limit on number of factors in printed tables of predicted means; default 3
FPROBABILITY = string token
Printing of probabilities for variance ratios in the analysis-of-variance table (yes, no); default no
TPROBABILITY = string token
Printing of probabilities for t-tests of effects (yes, no); default no
COMBINATIONS = string token
Factor combinations for which to form predicted means (present, estimable); default esti
ADJUSTMENTS = string token
Type of adjustment to be made when predicting means (marginal, equal, observed); default marg
PSE = string token
Types of standard errors to be printed with the predicted means (differences, alldifferences, lsd, alllsd, means, ese); default diff
MAXCYCLE = scalar
Sets a limit on the number of iterations performed by the E-M algorithm; default 100
TOLERANCE = variate
Sets tolerance limits for convergence of the E-M algorithm on the estimates of the censored observations; default 0.001
DIRECTION = string tokens
Whether the data are left or right censored (left, right); default left
NOMESSAGE = string tokens
Which warning messages to suppress (dispersion, leverage, residual, aliasing, marginality, vertical, df, inflation); default * i.e. none
LSDLEVEL = scalar
Significance level (%) for least significant differences; default 5
Parameters
Y = variates
Response variates to be analysed; must be set
RESIDUALS = variates
Variate to save the residuals from each analysis
FITTEDVALUES = variate
Variate to save the fitted values from each analysis
BOUND = scalars, variates or pointers
Censoring thresholds; must be set
INITIAL = scalar or variates
Scalar or a variate providing starting values for the censored observations in the E-M algorithm for each analysis; default BOUND+1 for right-censored data and BOUND−1 for left-censored data
NEWY = variates
Saves a copy of each response variate with the censored observations replaced by their estimates
EXIT = scalars
Exit status (0 for success, 1 for failure to converge)
SAVE = regression save structures
Save structures from the analyses of the data with censored observations replaced by their estimates
Description
The AUTOBIT procedure performs an analysis of variance for an unbalanced design with censored data. For example, with the default of left-censoring, some observations may be below the reliable detection limit of a measuring device. Alternatively, with right-censoring (specified by setting option DIRECTION=right), some observations may be so large that it is impracticable to measure them exactly. You can also set DIRECTION=left,right to have censoring in both directions.
The values at which the measurements are censored must be specified by the BOUND parameter. For censoring in a single direction, this can be a scalar if all observations are censored at the same point, or a variate if they are censored at different points. If there is both left and right censoring, BOUND supplies a pointer containing, first, a scalar or variate to define the left-hand bounds, and then a scalar or variate to define the right-hand bounds.
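As an illustration of two-sided censoring, the following sketch builds the BOUND pointer from two scalars. It reuses the litter, mother and littwt structures from the Example section; the limits of 40 and 65 and the identifiers lower, upper and limits are hypothetical, chosen only for illustration.

```genstat
" Hypothetical limits: weights at or below 40, or at or above 65, are treated as censored "
SCALAR  lower; VALUE=40
SCALAR  upper; VALUE=65
" The left-hand bound must come first in the pointer, then the right-hand bound "
POINTER [VALUES=lower,upper] limits
TREATMENTSTRUCTURE litter * mother
AUTOBIT [DIRECTION=left,right] littwt; BOUND=limits
```

If the observations were censored at different points, each scalar could be replaced by a variate of the same length as the response.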
Censored observations in the data, supplied by the Y parameter, are represented as values at or outside the boundary. The NEWY parameter can save a copy of the y-variate with the censored observations replaced by their estimates.
The model to be fitted in the analysis of variance must be specified beforehand in the same way as for the AUNBALANCED procedure. The treatment terms are specified by the TREATMENTSTRUCTURE directive. Similarly, any covariates are defined by the COVARIATE directive. AUTOBIT also takes account of any blocking structure specified by the BLOCKSTRUCTURE directive. However, it cannot produce stratified analyses like those generated by ANOVA, and is able to estimate treatments and covariates only in the “bottom stratum”. So, for example, the full analysis can be produced for a randomized block design, where the treatments are all estimated on the plots within blocks, but it cannot produce the whole-plot analysis in a split-plot design. Instead you can analyse these by REML using the TOBIT procedure.
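A minimal sketch of the model set-up for a randomized block design with one covariate might look like this. All identifiers (block, treat, x, conc) and the detection limit of 0.5 are hypothetical and would need to be declared and given values beforehand.

```genstat
" Hypothetical structures: factors block and treat, covariate x, response conc "
BLOCKSTRUCTURE     block
TREATMENTSTRUCTURE treat
COVARIATE          x
" Default left-censoring: values of conc at or below 0.5 are treated as censored "
AUTOBIT [PRINT=aovtable,means,censored; FPROBABILITY=yes] conc; BOUND=0.5
```

Here the treatments and the covariate are both estimated within blocks, which is the situation that AUTOBIT can handle.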
In the Tobit model (Tobin 1958), the probabilities for the uncensored observations are standard Normal probabilities. The probabilities for right-censored observations are cumulative upper Normal probabilities for values greater than or equal to the boundary value. Probabilities for left-censored observations are cumulative lower Normal probabilities for values less than or equal to the boundary value. The Tobit method uses an E-M (expectation-maximization) algorithm to estimate values for the censored observations (see Dempster, Laird & Rubin 1977). It starts with initial estimates for the censored observations, which can be specified by the INITIAL parameter in either a variate or a scalar. For right-censored data the default is to use the boundary value plus one. For left-censored data the default is the boundary value minus one. In each iteration, the method uses the FIT directive to fit the model, saving the resulting fitted values to provide estimated means for the distributions of the censored observations. The new estimates for the censored observations are then given by the expected values for the lower or upper parts of the Normal distributions, according to whether the observations are left- or right-censored. The process continues either until the updates to the estimates are less than or equal to the value specified by the TOLERANCE option (default 0.001), or until the number of iterations equals the number specified by the MAXCYCLE option (default 100). The EXIT parameter can supply a scalar, which is set to zero for a successful fit, one for failure of the E-M algorithm to converge, or a missing value for an earlier fault.
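The expected values used to update the censored observations are the standard means of a truncated Normal distribution. They can be written out as follows, where the notation is an assumption made for illustration: b is the bound, mu the current fitted value, sigma the residual standard deviation (how AUTOBIT estimates it is not stated here), and phi and Phi are the standard Normal density and cumulative distribution function.

```latex
% Standardized bound for an observation with fitted value \mu
z = \frac{b - \mu}{\sigma}

% Right-censored observation (true value at or above b)
\mathrm{E}[\,y \mid y \ge b\,] = \mu + \sigma \, \frac{\phi(z)}{1 - \Phi(z)}

% Left-censored observation (true value at or below b)
\mathrm{E}[\,y \mid y \le b\,] = \mu - \sigma \, \frac{\phi(z)}{\Phi(z)}
```

The estimate for a right-censored observation therefore always lies above the fitted value, and that for a left-censored observation below it.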
The FACTORIAL, PFACTORIAL, FPROBABILITY, TPROBABILITY, COMBINATIONS, ADJUSTMENTS, PSE, NOMESSAGE, and LSDLEVEL options operate as in the AUNBALANCED procedure to control the operation and output of the analysis of variance. The PRINT option has the same settings as the PRINT option of the AUNBALANCED procedure, as well as a monitoring setting to print monitoring information for the E-M algorithm, and a censored setting to print the estimates of the censored observations.
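For example, the following sketch monitors the E-M algorithm and checks whether it converged. It reuses the data and model from the Example section; the values given to MAXCYCLE and TOLERANCE and the identifier status are illustrative choices only.

```genstat
" Print E-M monitoring and the estimates of the censored observations, with a tighter convergence criterion "
AUTOBIT [PRINT=aovtable,censored,monitoring; MAXCYCLE=200; TOLERANCE=0.0001;\
        DIRECTION=right] littwt; BOUND=65; EXIT=status
" status is 0 for success, 1 for failure to converge, or missing after an earlier fault "
PRINT   status
```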
Following the analysis, you can display further output, or save information, using the procedures AUDISPLAY, AUGRAPH, AUPREDICT, AUMCOMPARISON, and AUKEEP, as with the AUNBALANCED procedure. You can also use relevant regression procedures, such as RCHECK for plots of residuals. The SAVE parameter can save the regression save structure used by these procedures, so that you can display output from this analysis even if there have been other regression analyses in the intervening period.
The RESIDUALS and FITTEDVALUES parameters can save the residuals and fitted values, respectively.
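A short sketch of saving results and then requesting further output is shown below. It reuses the data and model from the Example section; the identifiers res, fit, littnew and tsave are hypothetical, and the AUDISPLAY call assumes that its PRINT and TPROBABILITY options behave as in AUNBALANCED.

```genstat
" Suppress printed output and save results from the analysis "
AUTOBIT [PRINT=*; DIRECTION=right] littwt; BOUND=65; RESIDUALS=res;\
        FITTEDVALUES=fit; NEWY=littnew; SAVE=tsave
" Compare the original weights with the copy in which censored values are replaced by estimates "
PRINT   littwt,littnew,fit,res
" Further output from the same analysis "
AUDISPLAY [PRINT=effects; TPROBABILITY=yes]
```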
Options: PRINT, FACTORIAL, PFACTORIAL, FPROBABILITY, TPROBABILITY, COMBINATIONS, ADJUSTMENTS, PSE, MAXCYCLE, TOLERANCE, DIRECTION, NOMESSAGE, LSDLEVEL.
Parameters: Y, RESIDUALS, FITTEDVALUES, BOUND, INITIAL, NEWY, EXIT, SAVE.
Action with RESTRICT
As in FIT, the y-variate or any of the model variates or factors can be restricted to analyse a subset of the data.
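As a sketch of analysing a subset, the response variate from the Example section could be restricted before calling AUTOBIT. This assumes that the factor mother has the default levels 1 to 4, so that label J corresponds to level 4; the choice of subset is purely illustrative.

```genstat
" Exclude the units with foster mother J (level 4 under the default levels) "
RESTRICT littwt; CONDITION=mother.NE.4
TREATMENTSTRUCTURE litter * mother
AUTOBIT [DIRECTION=right] littwt; BOUND=65
" Remove the restriction afterwards "
RESTRICT littwt
```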
References
Dempster, A.P., Laird, N.M. & Rubin, D.B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39, 1-38.
Tobin, J. (1958). Estimation of relationships for limited dependent variables. Econometrica, 26, 24-36.
See also
Directives: FIT.
Procedures: AUNBALANCED, AUDISPLAY, AUGRAPH, AUPREDICT, AUMCOMPARISON, AUKEEP, ATOBIT, RNTOBIT, RGTOBIT, RNBTOBIT, RTOBITPOISSON, GLTOBITPOISSON, HGTOBITPOISSON, TOBIT.
GenStat Reference Manual 1 Summary section on: Regression Analysis
Example
CAPTION 'AUTOBIT example',\
        !t('Experiment on foster feeding of rats from Scheffe (1959)',\
        'The Analysis of Variance; also see McConway, Jones & Taylor (1999)',\
        'Statistical Modelling using GENSTAT, Example 7.6.',\
        'To illustrate AUTOBIT, suppose rats larger than 65',\
        'could not be weighed accurately and are treated as censored.');\
        STYLE=meta,plain
FACTOR  [NVALUES=61; LABELS=!t('A','B','I','J')] litter
READ    litter; FREPRESENTATION=labels
A A A A A A A A A A A A A A A A A
B B B B B B B B B B B B B B B
I I I I I I I I I I I I I I
J J J J J J J J J J J J J J J :
FACTOR  [NVALUES=61; LABELS=!t('A','B','I','J')] mother
READ    mother; FREPRESENTATION=labels
A A A A A B B B I I I I J J J J J
A A A A B B B B B I I I I J J
A A A B B B I I I I I J J J
A A A A B B B I I I J J J J J :
VARIATE [NVALUES=61] littwt
READ    littwt
61.5 68.2 64 65 59.7 55 42 60.2 52.5 61.8 49.5 52.7 42 54 61 48.2 39.6
60.3 51.7 49.3 48 50.8 64.7 61.7 64 62 56.5 59 47.2 53 51.3 40.5
37 36.3 68 56.3 69.8 67 39.7 46 61.3 55.3 55.7 50 43.8 54.5
59 57.4 54 47 59.5 52.8 56 45.2 57 61.4 44.8 51.5 53 42 54 :
TREATMENTSTRUCTURE litter * mother
AUTOBIT [PRINT=aovtable,means,censored; FPROBABILITY=yes; DIRECTION=right]\
        littwt; BOUND=65