Fits a proportional hazards model to survival data as a generalized linear model (R.W. Payne).
|Controls printed output (
||Defines the full model to explore (using
|SUBJECTS|Subject corresponding to each observation|
|TIMES|Time of each observation|
|CENSORED|Contains the value 1 for censored observations, otherwise 0; if unset it is assumed that there is no censoring|
|OFFSET|Offset to include in the model|
|POOL|Whether to pool terms in the accumulated summary generated by the fit|
|TERMS|Model to fit|
The data for RPHFIT consist of a set of subjects observed at one or more times. The final observation of a subject is usually made at the time of death (or failure); otherwise, if the subject survives the trial, the observation is said to be censored. The CENSORED option can be used to specify a variate with an entry for each subject, containing one where there is censoring and zero otherwise. If this is not specified, it is assumed that there is no censoring. The SUBJECTS option can specify a factor to indicate the subject corresponding to each observation; this can be omitted if there is only one observation per subject. The time at which each observation was made is specified by the TIMES option, in either a factor or a variate.
The model to be fitted is specified using the TERMS parameter. You can modify the model later by using procedure RPHCHANGE. If you intend to use RPHCHANGE to include additional model terms, you should use the option of RPHFIT that defines the full model to explore, setting it to the largest model that you may want to consider (this option acts similarly to the TERMS directive in ordinary generalized linear modelling). You can display further output using procedure
RPHDISPLAY, and save information using procedure RPHKEEP.
The proportional hazards model (Cox 1972) assumes that the subjects share a baseline hazard function, which is modified proportionally by treatments and covariates. In RPHFIT it is assumed that the survival times follow a piecewise exponential distribution (Breslow 1974). This partitions the time axis using a set of discrete cut-points a_i, and assumes a constant baseline hazard γ_i between each successive pair of cut-points. This corresponds to an exponential distribution with mean 1/γ_i for the survival times (in the absence of treatments) within each time interval. A cut-point is defined at every time that a death (or failure) occurs and, if the covariates or treatments vary with time, also at every time when the subjects are observed.
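As a compact restatement of the assumption above, the hazard can be written as follows. The symbols x (the covariate and treatment values for a subject) and β (the regression coefficients) are notation introduced here for illustration, as is the convention a_0 = 0; only a_i and γ_i come from the description above.

```latex
h_0(t) = \gamma_i \quad \text{for } a_{i-1} < t \le a_i, \qquad
h(t \mid x) = h_0(t)\,\exp\!\left(x^{\mathsf{T}}\beta\right)
```

The baseline hazard h_0 is a step function that changes only at the cut-points, and each subject's hazard is a constant multiple of it within any interval.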
To fit a proportional hazards model as a generalized linear model, the x-variates (i.e. covariates) and factors must be expanded so that, for each subject, there is a unit for every time interval up to the last one during which the subject was observed. If (as usually happens) the subject was not observed at every cut-point, the covariates and treatments are taken to be constant during the intervals between the times of the observations.
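The expansion described above can be sketched in Python. This is only an illustration of the idea, not the code used by RPHVECTORS: the function name `expand_subject` and its arguments are hypothetical, covariates are assumed constant over time and are omitted, and time is assumed to start at zero.

```python
import math


def expand_subject(last_time, censored, cutpoints):
    """Return one (interval_end, y, log_exposure) unit per time interval.

    Hypothetical helper: y is 1 if the subject died in the interval and 0
    if it was still surviving; log_exposure is the log of the time the
    subject spent at risk within the interval.
    """
    units = []
    start = 0.0
    for end in sorted(cutpoints):
        if start >= last_time:
            break  # the subject was no longer under observation
        stop = min(end, last_time)
        exposure = stop - start
        died_here = (not censored) and stop == last_time
        units.append((end, 1 if died_here else 0, math.log(exposure)))
        start = end
    return units


# Cut-points at times 2 and 5 (the death times in a small made-up data set).
died_at_5 = expand_subject(5, False, [2, 5])
censored_at_3 = expand_subject(3, True, [2, 5])
```

Here the subject that died at time 5 contributes two units (surviving the first interval, dying in the second), while the subject censored at time 3 contributes two surviving units, the second with an exposure of one time unit.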
RPHFIT automatically produces the expanded sets of values (using procedure RPHVECTORS). These replace the original values while RPHFIT is fitting and displaying the model. The original values are then reinstated before exit from the procedure, unless a fault is generated, for example by one of the regression directives (FIT etc.). You can call RPHVECTORS directly if you want to obtain the expanded values. Alternatively, procedure RPHKEEP can save the index variate that is used to construct them.
The y-variate used within the generalized linear model is an indicator that takes the value 0 if the subject was still surviving within the time interval concerned, otherwise it has the value 1. The model also contains an offset representing the log of the exposure time within each interval. Any additional offset can be specified, if required, using the
OFFSET option. (These two variates are also obtainable from RPHVECTORS.) The settings that control printed output are the same as for the FIT directive, except that there is an extra setting
loglikelihood to print -2 times the log-likelihood. The deviances produced for the terms in the regression model can be assessed using chi-square distributions as usual, but the residual deviance cannot be used, because the maximal model assumed by the generalized linear models method is inappropriate. The residual line is therefore suppressed in the summary and the accumulated analysis of deviance. By default the terms in the model are fitted individually, so that each has its own line in the accumulated analysis of deviance. However, you can set option POOL=yes to fit them all at once.
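To illustrate how the 0/1 response and the exposure offset described earlier in this section determine the baseline hazards, the sketch below handles the simplest case, with no treatments or covariates. In that case a Poisson log-linear model with one parameter per interval and an offset of log exposure has a closed-form solution: the estimated hazard in each interval is the number of deaths divided by the total exposure. The function name `interval_hazards` and the small data set are hypothetical, and this is not the algorithm that RPHFIT itself uses.

```python
from collections import defaultdict


def interval_hazards(units):
    """units: iterable of (interval, y, exposure) tuples.

    Returns {interval: deaths / total exposure}, the maximum-likelihood
    estimate of a constant hazard within each interval when the model
    contains no treatments or covariates.
    """
    deaths = defaultdict(int)
    exposure = defaultdict(float)
    for interval, y, e in units:
        deaths[interval] += y
        exposure[interval] += e
    return {i: deaths[i] / exposure[i] for i in exposure}


# Made-up expanded data with cut-points at times 2 and 5:
units = [
    (1, 0, 2.0), (2, 1, 3.0),  # subject A: died at time 5
    (1, 0, 2.0), (2, 0, 1.0),  # subject B: censored at time 3
    (1, 1, 2.0),               # subject C: died at time 2
]
hazards = interval_hazards(units)
```

With treatments or covariates in the model there is no closed form, and the estimates are obtained by the usual iterative fitting of the generalized linear model.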
The expanded sets of values for the variates and factors in the model are formed using procedure RPHVECTORS, together with the response and offset variates that are needed. Further details of the method can be found in Aitkin et al. (1989).
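The reason the generalized linear model gives the right estimates can be outlined as follows. The notation is introduced here for illustration and is not taken from the procedure itself: y_ij is the 0/1 response of subject j in interval i, e_ij is the exposure time, x_ij holds the covariate values, β the regression coefficients, and γ_i the baseline hazard in interval i.

```latex
\ell_{\mathrm{PE}} = \sum_{i,j} \left[ y_{ij}\left(\log\gamma_i + x_{ij}^{\mathsf{T}}\beta\right)
    - e_{ij}\,\gamma_i \exp\!\left(x_{ij}^{\mathsf{T}}\beta\right) \right]

\mu_{ij} = e_{ij}\,\gamma_i \exp\!\left(x_{ij}^{\mathsf{T}}\beta\right)
\quad\Longrightarrow\quad
\ell_{\mathrm{Poisson}} = \sum_{i,j} \left[ y_{ij}\log\mu_{ij} - \mu_{ij} \right]
    = \ell_{\mathrm{PE}} + \sum_{i,j} y_{ij}\log e_{ij}
```

The piecewise exponential log-likelihood and the Poisson log-likelihood with offset log e_ij differ only by a term that does not involve γ_i or β, so both are maximized by the same parameter values.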
None of the vectors should be restricted; any restrictions will be cancelled.
Aitkin, M., Anderson, A., Francis, B. & Hinde, J. (1989). Statistical Modelling in GLIM. Oxford University Press.
Breslow, N. (1974). Covariance analysis of censored survival data. Biometrics, 30, 89-99.
Cox, D.R. (1972). Regression models and life tables (with discussion). Journal of the Royal Statistical Society Series B, 34, 187-220.
CAPTION 'RPHFIT example',\
        'Data from Gehan (1965, Biometrika, 52, 203-223).';\
        STYLE=meta,plain
VARIATE [VALUES=1,1,2,2,3,4,4,5,5,8,8,8,8,11,11,12,12,15,17,22,23,\
         6,6,6,6,7,9,10,10,11,13,16,17,19,20,22,23,25,32,32,34,35] Time
&       [VALUES=24(0),1,0,1,0,1,1,0,0,1,1,1,0,0,1,1,1,1,1] Censor
FACTOR  [LABELS=!t(control,'6-mercaptopurine'); VALUES=21(1,2)] Treat
FACTOR  [LEVELS=42; VALUES=1...42] Subject
RPHFIT  [TIMES=Time; SUBJECTS=Subject; CENSORED=Censor] Treat