KAPLANMEIER procedure


Calculates the Kaplan-Meier estimate of the survivor function (J.T.N.M. Thissen).

Options
PRINT = string tokens What output to print and whether to display the Kaplan-Meier estimate in a graph (estimate, mean, quantiles, summary, graph); default esti, grap
GRAPHICS = string token Type of graphics to use (lineprinter, highresolution); default high
TITLE = text General title for the graph; default *
WINDOW = scalar Window number for the high-resolution graph; default 1
KEYWINDOW = scalar Window number for the key (zero for no key); default 2
SCREEN = string token Whether to clear the screen before plotting or to continue plotting on the old screen (clear, keep); default clea
PROBABILITY = scalar Probability level of the confidence interval for the Kaplan-Meier estimates; default 0.95
XLOWER = scalar Lower bound for x-axis; default 0
XUPPER = scalar Upper bound for x-axis; default * i.e. a value slightly larger than the maximum of the TIME parameter (or EVENT parameter if TIME is not set) is used
PLOT = string tokens What additional plotting features to include (referenceline, censored); default * i.e. none
PERCENTILES = variate or scalar Percentiles at which to estimate quantiles of survival times; default 25,50,75

Parameters
TIME = variates Observed timepoints
CENSORED = variates Variate specifying whether the corresponding element of TIME is censored (1) or not (0); default is to assume no censoring
GROUPS = factors Factor specifying the different groups for which the survivor function is estimated
EVENT = variates Saves the distinct TIME values when TIME is set; otherwise supplies an input variate specifying the endpoint of each interval
NDEATH = variates Saves the number of deaths at each EVENT when TIME is set; otherwise supplies an input variate specifying the number of deaths in each interval
NATRISK = variates Saves the number of units at risk at each EVENT when TIME is set; otherwise supplies an input variate with the number at risk in each interval
ESTIMATE = variates Saves the Kaplan-Meier estimates of the survivor function
NEWGROUPS = factors Saves the grouping of the EVENT, NDEATH, NATRISK and ESTIMATE variates when TIME is set

Description
Survival data are data in which the response variate is the lifetime of a component or the survival time of a patient. Such data are typically censored, i.e. the survival time of some units is still unknown at the end of the study. The survivor function S(t) is a key element in the analysis of survival data. It is defined as the probability that an individual is still surviving at time t. KAPLANMEIER calculates the Kaplan-Meier estimate of the survivor function for two different types of data.

The first type of data occurs when all timepoints are accurately observed. The observed timepoints or the timepoints at which censoring took place are then specified using the TIME parameter. The CENSORED variate contains values 0 and 1 to specify whether the corresponding element of TIME is censored (1) or not (0); if there was no censoring, this need not be set. The GROUPS parameter can be used to specify a factor to indicate different groups whose survivor functions are to be estimated separately. The distinct TIME values can be saved using the EVENT parameter, and the number of deaths and the number of units at risk at each individual EVENT can be saved using parameters NDEATH and NATRISK respectively. The Kaplan-Meier estimate can be saved with the ESTIMATE parameter. The NEWGROUPS parameter can save a factor indicating the group structure of the output variates.
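As an illustration of what the saved variates contain for this first type of data, here is a minimal Python sketch of the product-limit calculation. It is not the procedure's own code: the function name `kaplan_meier` and the small data set are hypothetical, and it assumes that a unit is at risk at time t if its observed time is at least t.

```python
from collections import Counter

def kaplan_meier(time, censored=None):
    """Product-limit estimate from exact (possibly censored) times.

    time     -- observed times (death or censoring), like TIME
    censored -- 1 if the time is censored, 0 if not, like CENSORED;
                None means no censoring
    Returns lists corresponding to EVENT, NDEATH, NATRISK and ESTIMATE.
    """
    if censored is None:
        censored = [0] * len(time)
    # Count the deaths (uncensored units) at each distinct time.
    deaths = Counter(t for t, c in zip(time, censored) if c == 0)
    event, ndeath, natrisk, estimate = [], [], [], []
    s = 1.0
    for t in sorted(set(time)):
        # Assumption: a unit is at risk at t if its observed time is >= t.
        n = sum(1 for u in time if u >= t)
        d = deaths.get(t, 0)
        s *= 1.0 - d / n
        event.append(t)
        ndeath.append(d)
        natrisk.append(n)
        estimate.append(s)
    return event, ndeath, natrisk, estimate

# Hypothetical data: five units, two of them censored.
event, ndeath, natrisk, estimate = kaplan_meier([2, 3, 3, 5, 8], [0, 0, 1, 0, 1])
print(event, ndeath, natrisk)   # [2, 3, 5, 8] [1, 1, 1, 0] [5, 4, 2, 1]
print([round(s, 3) for s in estimate])
```

With a GROUPS factor, the same calculation would simply be repeated within each group.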

The second type of data arises when the units are observed only at the end of time intervals. The exact times are then unknown, and the input should be specified using the parameters EVENT, NDEATH and NATRISK. These specify the endpoint of each interval, the number of deaths in each interval and the number at risk in each interval. The GROUPS parameter can again be used to request separate group estimates.
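For this second type of data, the estimate can be sketched in Python as a running product over the intervals. This is an assumption about the calculation rather than the procedure's own code, and the function name `kaplan_meier_intervals` is hypothetical. The numbers are the Ndeath and Natrisk values from the example at the end of this page.

```python
def kaplan_meier_intervals(ndeath, natrisk):
    """Survivor function at the end of each interval.

    ndeath  -- number of deaths in each interval, like NDEATH
    natrisk -- number at risk in each interval, like NATRISK
    Assumption: the estimate is the running product of (1 - d/n).
    """
    estimate = []
    s = 1.0
    for d, n in zip(ndeath, natrisk):
        s *= 1.0 - d / n
        estimate.append(s)
    return estimate

year = [1, 2, 3, 4, 5, 6, 7, 8]
natrisk = [358, 269, 181, 136, 112, 68, 26, 6]
ndeath = [89, 88, 45, 24, 8, 12, 0, 0]
for t, s in zip(year, kaplan_meier_intervals(ndeath, natrisk)):
    print(t, round(s, 4))
```

Intervals with no deaths leave the estimate unchanged, so the last three values are equal.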

The PRINT option selects the output to be displayed with settings:

    estimate the events, number of deaths, number of units at risk and the Kaplan-Meier estimate with a confidence interval,
    summary summary of censored and uncensored observations,
    quantiles estimates quantiles of the distribution of survival times (observed timepoints only),
    mean mean and standard error (observed timepoints only),
    graph plots the Kaplan-Meier estimate against the time points.

The default is PRINT=estimate,graph.

The probability level for the confidence interval of the Kaplan-Meier estimates can be set using the PROBABILITY option; by default this is 0.95. The percentiles at which quantiles of the survival times are estimated can be set using the PERCENTILES option; by default these are 25, 50 and 75. If PRINT=graph is set, the PLOT option can be used to mark censored observations (censored) and to add a reference line at S(t)=0.5 indicating the median survival time (referenceline). With GRAPHICS=highresolution different lines are drawn for the different groups, whereas GRAPHICS=lineprinter produces a separate graph for each group. Lower and upper bounds for the x-axis can be set by options XLOWER and XUPPER, and the TITLE option can specify a title for the plots. Options WINDOW and KEYWINDOW control the windows used for high-resolution graphs, and SCREEN controls whether the screen is cleared before plotting.
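To show how a percentile relates to the estimated survivor function, here is a minimal Python sketch. It assumes one common convention: the p-th percentile of the survival time is the first event time at which the estimate has fallen to 1 - p/100 or below. The procedure itself may use a different rule (for example interpolation). The function name `km_quantile` and the data are hypothetical.

```python
def km_quantile(event, estimate, percent):
    """First event time at which S(t) <= 1 - percent/100.

    event    -- ordered event times
    estimate -- Kaplan-Meier estimate at each event time
    Returns None if the curve never falls that far.
    """
    target = 1.0 - percent / 100.0
    for t, s in zip(event, estimate):
        if s <= target:
            return t
    return None

# Hypothetical estimate at four event times.
event = [2, 3, 5, 8]
estimate = [0.8, 0.6, 0.3, 0.3]
for p in (25, 50, 75):
    print(p, km_quantile(event, estimate, p))
```

Under this convention the median (percent = 50) is where the curve crosses the S(t)=0.5 reference line. A percentile is undefined when heavy censoring keeps the estimate above the target, as happens for the 75th percentile in this example.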

Method

When TIME is set, the Kaplan-Meier estimate is calculated according to equation (1.10) in Kalbfleisch & Prentice (1980). When TIME is not set, the Kaplan-Meier estimate is directly calculated from the variates specified by EVENT, NDEATH and NATRISK. If PERCENTILES includes the median (50) then a confidence interval is displayed for the median using the method described in Brookmeyer & Crowley (1982). The mean survival time is calculated by the formula

$$\mu = \sum_{i=1}^{k} S(t_{i-1}) \, (t_i - t_{i-1})$$

where

$k$ is the number of ordered death times,

$S(t_{i-1})$ is the Kaplan-Meier estimate of the survivor function at the $(i-1)$th death time,

$t_i$ is the $i$th death time, where $t_0$ is defined to be zero.
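The mean is the area under the estimated step function up to the last death time: each term is the estimate at the previous death time multiplied by the gap to the current one. Here is a minimal Python sketch, assuming that the estimate equals 1 at t0 = 0. The function name `km_mean` and the data are hypothetical.

```python
def km_mean(death_times, estimate):
    """Mean survival time as the area under the Kaplan-Meier step function.

    death_times -- ordered distinct death times t_1..t_k
    estimate    -- S(t_1)..S(t_k)
    Assumption: S(t_0) = 1 at t_0 = 0.
    """
    mean = 0.0
    prev_t, prev_s = 0.0, 1.0
    for t, s in zip(death_times, estimate):
        # Estimate at the previous death time times the gap to this one.
        mean += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    return mean

# Hypothetical death times and estimates: 1*2 + 0.8*1 + 0.6*2
print(round(km_mean([2, 3, 5], [0.8, 0.6, 0.3]), 6))
```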

Its standard error is calculated using the formula

$$\mathrm{se}(\mu) = \sqrt{ \frac{m}{m-1} \sum_{i=1}^{k-1} \frac{A_i^2}{n_i \, (n_i - d_i)} }$$

where

$$m = \sum_{i=1}^{k} d_i$$

$$A_i = \sum_{j=1}^{k-1} S(t_{j-1}) \, (t_{j+1} - t_j)$$

Action with RESTRICT

The input variates and factor GROUPS may be restricted identically. The Kaplan-Meier estimate is based only on the units not excluded by the restriction.

References
Brookmeyer, R. & Crowley, J. (1982). A confidence interval for the median survival time. Biometrics, 38, 29-41.

Collett, D. (1994). Modelling Survival Data in Medical Research. Chapman & Hall, London.

Kalbfleisch, J.D. & Prentice, R.L. (1980). The Statistical Analysis of Failure Time Data. Wiley, New York.

See also


Commands for: Survival analysis.

Example
"Example 1: exact survival times in days for two samples; Censored = 1 marks a censored time"
FACTOR      [LEVELS=2; VALUES=19(1), 21(2)] Sample
VARIATE     [NVALUES=40] Day, Censored
READ        Day, Censored
143 0  164 0  188 0  188 0  190 0  192 0  206 0  209 0  213 0  216 0
220 0  227 0  230 0  234 0  246 0  265 0  304 0  216 1  244 1  142 0
156 0  163 0  198 0  205 0  232 0  232 0  233 0  233 0  233 0  233 0
239 0  240 0  261 0  280 0  280 0  296 0  296 0  323 0  204 1  344 1 :
AXES        WINDOW=1; YTITLE='Survivor function S'; XTITLE='Days'
KAPLANMEIER [TITLE='Data from Table 1.1 in Kalbfleisch and Prentice']\
            Day; CENSORED=Censored; GROUPS=Sample
"Example 2: interval data, with the number at risk and the number of deaths in each year"
VARIATE     [VALUES=  1,   2,   3,   4,   5,  6,  7, 8] Year
VARIATE     [VALUES=358, 269, 181, 136, 112, 68, 26, 6] Natrisk
VARIATE     [VALUES= 89,  88,  45,  24,   8, 12,  0, 0] Ndeath
"TIME is not set, so EVENT, NDEATH and NATRISK supply the input"
KAPLANMEIER EVENT=Year; NDEATH=Ndeath; NATRISK=Natrisk
Updated on March 7, 2019