# JACKKNIFE procedure

Produces Jackknife estimates and standard errors (R.W. Payne).

### Options

| Option | Description |
|---|---|
| `PRINT` = string token | Controls printed output (`estimates`, `vcovariance`); default `esti` |
| `DATA` | Data vectors from which the statistics are to be calculated |
| `ANCILLARY` | Other relevant information needed to calculate the statistics |
| `VCOVARIANCE` | Saves the variance-covariance matrix for the statistics |

### Parameters

| Parameter | Description |
|---|---|
| `LABEL` = texts | Texts, each containing a single line, to label the statistics |
| `ESTIMATE` | Saves the Jackknife estimate for each statistic |
| `SE` | Saves Jackknife estimates of the standard errors |
| `PSEUDOVALUES` | Saves the Jackknife pseudo-values |
| `ACCELERATION` | Saves the acceleration parameter for bias-corrected and accelerated bootstrap confidence intervals |

### Description

The Jackknife provides a way of reducing bias and obtaining standard errors in situations where the standard methods might be expected to be inappropriate. The basic form of the Jackknife method works by calculating the statistic (or statistics) of interest omitting each data value in turn. Thus, if there are $n$ data values, $n$ “partial estimates” $T_1, \dots, T_n$ are obtained (where $T_j$ is the estimate omitting value $j$). These are combined with the estimate $T$ obtained from all the data, to produce $n$ pseudo-values:

$$P_j = n \times T - (n - 1) \times T_j, \qquad j = 1, \dots, n$$

The Jackknife estimate of the statistic is given by the mean of the pseudo-values, and the standard error by the standard error of the mean of the pseudo-values.
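
The calculation just described can be sketched in a few lines. The following is a minimal illustration in Python rather than Genstat, and it is not the code used by `JACKKNIFE`; the function names `jackknife` and `mean` are hypothetical and chosen only for this example.

```python
import math

def jackknife(values, statistic):
    """Return (estimate, standard_error, pseudo_values) for one statistic.

    `values` is a list of data values; `statistic` maps a list to a number.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two data values")
    full = statistic(values)                              # T, from all the data
    pseudo = []
    for j in range(n):
        partial = statistic(values[:j] + values[j + 1:])  # T_j, omitting value j
        pseudo.append(n * full - (n - 1) * partial)       # pseudo-value P_j
    estimate = sum(pseudo) / n                            # mean of the pseudo-values
    variance = sum((p - estimate) ** 2 for p in pseudo) / (n - 1)
    return estimate, math.sqrt(variance / n), pseudo      # s.e. of that mean

def mean(xs):
    return sum(xs) / len(xs)

est, se, pseudo = jackknife([2.0, 4.0, 4.0, 7.0, 9.0], mean)
```

When the statistic is the sample mean, the pseudo-values reduce to the data values themselves, so the Jackknife estimate is the ordinary mean and the Jackknife standard error is the usual standard error of the mean. This makes the mean a convenient check on any implementation.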

The Jackknife can be shown to eliminate the term proportional to 1/n from a bias of the form

$$E(T) = t + a/n + O(1/n^2)$$

where $t$ is the true value of the quantity being estimated and $O(1/n^2)$ is a term of order one divided by the square of the number of observations (Quenouille 1956). However, it is not appropriate in all situations. In particular the statistic needs to be “smooth” (small changes in the data set should cause only small changes in the statistic); it will not work, for example, with medians or order statistics. Further details and advice are given by Miller (1974), Bissell & Ferguson (1975), Hinkley (1983) and Efron & Tibshirani (1993).
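
A standard illustration of the bias reduction is the variance estimate with divisor n, whose bias is exactly proportional to 1/n. Each partial estimate has the same form of bias with n replaced by n – 1, so the 1/n terms cancel when the pseudo-values are formed, and the Jackknife estimate coincides with the usual unbiased variance (divisor n – 1). The Python sketch below checks this numerically; the function names are hypothetical and the data values are arbitrary.

```python
def plug_in_variance(xs):
    """Variance with divisor n (biased downwards by a term proportional to 1/n)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def jackknife_estimate(xs, statistic):
    """Mean of the pseudo-values n*T - (n-1)*T_j."""
    n = len(xs)
    full = statistic(xs)
    pseudo = [n * full - (n - 1) * statistic(xs[:j] + xs[j + 1:])
              for j in range(n)]
    return sum(pseudo) / n

xs = [2.0, 4.0, 4.0, 7.0, 9.0]
biased = plug_in_variance(xs)                         # divisor n
corrected = jackknife_estimate(xs, plug_in_variance)  # Jackknife estimate
unbiased = biased * len(xs) / (len(xs) - 1)           # divisor n - 1
```

Here `corrected` and `unbiased` agree to rounding error. For statistics whose bias also has higher-order terms, the Jackknife removes only the 1/n term.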

The data for `JACKKNIFE` are provided as a list of vectors (variates, factors or texts) using the `DATA` option. From this, new vectors are formed omitting each unit of the original vectors in turn, and a subsidiary procedure `RESAMPLE` is called to calculate the statistics. Other relevant information can be provided for passing to `RESAMPLE`, in any type of data structure, using the `ANCILLARY` option. To use `JACKKNIFE`, you need to provide a version of `RESAMPLE` to calculate the particular statistics that you require. The default `RESAMPLE` procedure, which accompanies `JACKKNIFE` in the library, merely prints details of the syntax (also described in the Methods Section).
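
The division of labour between `JACKKNIFE` and `RESAMPLE` resembles a callback design: the driver forms the leave-one-out vectors, and a user-supplied routine calculates the statistic. The Python sketch below illustrates that pattern only; it is not Genstat code, and the names `leave_one_out`, `partial_estimates` and `ratio_of_means`, as well as the `scale` entry in the ancillary information, are hypothetical.

```python
def leave_one_out(vectors, j):
    """Copy every vector with unit j omitted (the same unit from each vector)."""
    return [v[:j] + v[j + 1:] for v in vectors]

def partial_estimates(vectors, resample, ancillary=None):
    """Call `resample` on the full data, then once for each omitted unit."""
    n = len(vectors[0])
    if any(len(v) != n for v in vectors):
        raise ValueError("all data vectors must have the same number of units")
    full = resample(vectors, ancillary)
    partial = [resample(leave_one_out(vectors, j), ancillary) for j in range(n)]
    return full, partial

def ratio_of_means(vectors, ancillary):
    """Hypothetical user-supplied statistic: scaled ratio of two means."""
    y, x = vectors
    return ancillary["scale"] * (sum(y) / len(y)) / (sum(x) / len(x))

full, partial = partial_estimates(
    [[2.0, 4.0, 6.0], [1.0, 2.0, 4.0]], ratio_of_means, {"scale": 100.0})
```

Passing the ancillary information through unchanged keeps the driver independent of whatever the statistic needs beyond the data vectors.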

A label should be provided for each statistic, using the `LABEL` parameter; by default, there is assumed to be a single statistic labelled simply as `Statistic`. The estimates, their standard errors and variates of corresponding pseudo-values for each statistic can be saved by the `ESTIMATE`, `SE` and `PSEUDOVALUES` parameters, respectively. Also, if there is more than one statistic, a variance-covariance matrix can be saved for the estimates using the `VCOVARIANCE` option.
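
With several statistics, one natural way to obtain a variance-covariance matrix is to take the covariance matrix of the pseudo-values and divide by the number of units, by analogy with the standard error of a single statistic. The Python sketch below assumes that definition; the source does not state the formula used by `JACKKNIFE`, and the names `jackknife_vcov` and `two_means` are hypothetical.

```python
def jackknife_vcov(vectors, statistics_fn):
    """Jackknife estimates and variance-covariance matrix for k statistics.

    `statistics_fn(vectors)` must return a list of k numbers.
    """
    n = len(vectors[0])
    full = statistics_fn(vectors)
    k = len(full)
    pseudo = []                                   # n rows, one column per statistic
    for j in range(n):
        part = statistics_fn([v[:j] + v[j + 1:] for v in vectors])
        pseudo.append([n * full[s] - (n - 1) * part[s] for s in range(k)])
    est = [sum(row[s] for row in pseudo) / n for s in range(k)]
    # Covariance of the pseudo-values (divisor n - 1), divided by n.
    vcov = [[sum((row[a] - est[a]) * (row[b] - est[b]) for row in pseudo)
             / ((n - 1) * n)
             for b in range(k)] for a in range(k)]
    return est, vcov

def two_means(vectors):
    return [sum(v) / len(v) for v in vectors]

est, vcov = jackknife_vcov([[1.0, 2.0, 3.0, 6.0], [2.0, 1.0, 4.0, 5.0]], two_means)
se = [vcov[s][s] ** 0.5 for s in range(len(est))]
```

Under this definition the square roots of the diagonal elements are the Jackknife standard errors of the individual statistics.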

Printed output is controlled by the `PRINT` option, with settings `estimates` for the estimates and their standard errors, and `vcovariance` for the variance-covariance matrix; by default `PRINT=estimates`.

The Jackknife is also required for the calculation of bias-corrected and accelerated confidence limits for bootstrap statistics (as given by the `BOOTSTRAP` procedure). The necessary acceleration quantities can be saved using the `ACCELERATION` parameter. For details see Efron & Tibshirani (1993, Section 14.3).
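
As a rough guide to what the acceleration quantity is, the Python sketch below uses the formula given by Efron & Tibshirani (1993, Section 14.3), which is based on the partial estimates: the sum of cubed deviations of the partial estimates from their mean, divided by six times the 3/2 power of the sum of squared deviations. Whether `JACKKNIFE` evaluates it in exactly this way is an assumption, and the function names are hypothetical.

```python
def acceleration(values, statistic):
    """Acceleration computed from the leave-one-out partial estimates."""
    n = len(values)
    partial = [statistic(values[:j] + values[j + 1:]) for j in range(n)]
    centre = sum(partial) / n                        # mean of the partial estimates
    cubes = sum((centre - t) ** 3 for t in partial)
    squares = sum((centre - t) ** 2 for t in partial)
    if squares == 0.0:
        raise ValueError("all partial estimates are equal")
    return cubes / (6.0 * squares ** 1.5)

def mean(xs):
    return sum(xs) / len(xs)

a_skewed = acceleration([1.0, 2.0, 3.0, 10.0], mean)      # right-skewed data
a_symmetric = acceleration([1.0, 2.0, 3.0, 4.0], mean)    # symmetric data
```

For the mean, the acceleration is positive for right-skewed data and zero (to rounding error) for symmetric data.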

Options: `PRINT`, `DATA`, `ANCILLARY`, `VCOVARIANCE`.

Parameters: `LABEL`, `ESTIMATE`, `SE`, `PSEUDOVALUES`, `ACCELERATION`.

### Method

The original papers describing the Jackknife technique are by Quenouille (1949, 1956) and by Tukey (1958). Good expository accounts are provided by Hinkley (1983) or Bissell & Ferguson (1975).

`JACKKNIFE` needs a subsidiary procedure `RESAMPLE` to calculate the statistics of interest. `RESAMPLE` has an option, `DATA`, which is used to supply the data vectors (variates, factors or texts) from which the statistics are to be calculated. (On the first occasion that `RESAMPLE` is called, these will be the original vectors as supplied to `JACKKNIFE`, in order to calculate the estimate T; subsequently, they will be new vectors containing all except one of the units.) Other relevant information can be supplied through the `ANCILLARY` option, which corresponds to the `ANCILLARY` option of `JACKKNIFE` itself. `RESAMPLE` can also be called by the `BOOTSTRAP` procedure, and it then has an additional `AUXILIARY` option, but this is not relevant to `JACKKNIFE`.

There are two parameters: `STATISTIC` supplies a list of scalars to store the estimates of each statistic, and `EXIT` supplies a list of scalars, each of which should be set to zero if the corresponding statistic could be estimated successfully from the supplied data vectors, or to one if it could not. If the value of `EXIT` is not calculated in `RESAMPLE`, `JACKKNIFE` assumes that the calculations succeeded. This example shows a version of `RESAMPLE` which calculates the correlation between two variates.

`PROCEDURE [PARAMETER=pointer] 'RESAMPLE'`

`OPTION 'DATA', " (I: variates, factors or texts) data`

`                         vectors from which to calculate the`

`                         statistics; no default"\`

`          'ANCILLARY'; " (I: any type of structure) other`

`                         relevant information needed to`

`                         calculate the statistics "\`

`          MODE=p; TYPE=!t(variate,factor,text),*;\`

`          SET=yes,no; LIST=yes; DECLARED=yes; PRESENT=yes`

`PARAMETER 'STATISTIC', " (O: scalars) to save the calculated`

`                         statistics "\`

`          'EXIT'; " (O: scalars) to save an exit code`

`                         to indicate failure (EXIT[i]=1) or`

`                         success (EXIT[i]=0) when calculating`

`                         each STATISTIC[i]"\`

`          MODE=p; TYPE='scalar'; SET=yes`

`CALCULATE STATISTIC[1] = CORRELATION(DATA[1]; DATA[2])`

`& EXIT[1] = STATISTIC[1]==C('missing')`

`ENDPROCEDURE`
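
For readers less familiar with Genstat, the same contract (return one value and one exit code per statistic, with exit code 1 marking a statistic that could not be calculated) can be sketched in Python. This is an analogue for illustration only, not part of the library; the function name `resample_correlation` is hypothetical, and the data are the Law School scores used in the Example section.

```python
import math

def resample_correlation(data, ancillary=None):
    """Return ([statistic], [exit_code]) for the correlation of two vectors.

    Exit code 0 means success; 1 means the statistic could not be calculated.
    """
    y, z = data
    n = len(y)
    my, mz = sum(y) / n, sum(z) / n
    syy = sum((a - my) ** 2 for a in y)
    szz = sum((b - mz) ** 2 for b in z)
    syz = sum((a - my) * (b - mz) for a, b in zip(y, z))
    if syy == 0.0 or szz == 0.0:
        return [None], [1]            # correlation undefined: report failure
    return [syz / math.sqrt(syy * szz)], [0]

Y = [576, 635, 558, 578, 666, 580, 555, 661, 651, 605, 653, 575, 545, 572, 594]
Z = [3.39, 3.30, 2.81, 3.03, 3.44, 3.07, 3.00, 3.43, 3.36, 3.13,
     3.12, 2.74, 2.76, 2.88, 2.96]
stat, exit_code = resample_correlation([Y, Z])
bad_stat, bad_exit = resample_correlation([[1.0, 1.0, 1.0], [2.0, 3.0, 5.0]])
```

The second call shows the failure case: a constant vector has no variance, so the correlation is undefined and the exit code is set to 1.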

### Action with `RESTRICT`

If any of the data vectors is restricted, `JACKKNIFE` will use only the units that are not restricted for any of the vectors.

### References

Bissell, A.F. & Ferguson, R.A. (1975). The jackknife – toy, tool or two-edged weapon. The Statistician, 24, 79-100.

Efron, B. & Tibshirani, R.J. (1993). An Introduction to the Bootstrap. Chapman & Hall, London.

Hinkley, D. (1983). Jackknife methods. In: Encyclopedia of Statistical Sciences, Volume 4 (ed: S. Kotz, N.L. Johnson & C.B. Read). Wiley, New York.

Miller, R.G. (1974). The jackknife – a review. Biometrika, 61, 1-15.

Quenouille, M.H. (1949). Approximate tests of correlation in time series. Journal of the Royal Statistical Society, Series B, 11, 18-44.

Quenouille, M.H. (1956). Notes on bias in estimation. Biometrika, 43, 353-360.

Tukey, J.W. (1958). Bias and confidence in not-quite large samples (abstract). Annals of Mathematical Statistics, 29, 614.

### See also

Procedures: `BOOTSTRAP`, `APERMTEST`, `CHIPERMTEST`, `RPERMTEST`.

### Example

```
CAPTION 'JACKKNIFE example',!t(\
'The data are scores from two tests on new admissions to Law School',\
'(Efron, 1981, The Jackknife, the Bootstrap & Other Resampling',\
' Plans. CBMS Monograph 38, SIAM, Philadelphia); listed in Table 1',\
'of Hinkley (1983, Encyclopedia of Statistics Volume 4, page 282).');\
STYLE=meta,plain
" Define RESAMPLE to calculate the correlation between the two scores."
PROCEDURE [PARAMETER=pointer] 'RESAMPLE'
OPTION    'DATA',      " (I: variates, factors or texts) data vectors from
which to calculate the statistics; no default"\
'AUXILIARY', " (I: pointers) auxiliary sets of data vectors, each
of which is to be resampled independently"\
'ANCILLARY'; " (I: any type of structure) other relevant
information needed to calculate the statistics "\
MODE=p; TYPE=!t(variate,factor,text),'pointer',*; SET=yes,no,no;\
LIST=yes; DECLARED=yes; PRESENT=yes
PARAMETER 'STATISTIC', " (O: scalars) to save the calculated statistics "\
'EXIT';      " (O: scalars) to save an exit code to indicate
failure (EXIT[i]=1) or success (EXIT[i]=0)
when calculating each STATISTIC[i]"\
MODE=p; TYPE='scalar'; SET=yes
CALCULATE STATISTIC[1] = CORRELATION(DATA[1]; DATA[2])
&         EXIT[1] = STATISTIC[1]==C('missing')
ENDPROCEDURE
VARIATE [VALUES=576,635,558,578,666,580,555,661,651,605,653,575,545,572,594] Y
&       [VALUES=3.39,3.30,2.81,3.03,3.44,3.07,3.00,3.43,3.36,3.13,\
3.12,2.74,2.76,2.88,2.96] Z
JACKKNIFE [DATA=Y,Z] 'Correlation'
```