# KCROSSVALIDATION procedure

Computes cross-validation statistics for punctual kriging (D.A. Murray & R. Webster).

### Options

| Option | Description |
|---|---|
| `PRINT` = string tokens | Controls printed output (`statistics`, `correlation`); default `stat` |
| `PLOT` | Whether to produce a scatter plot of the predicted against the true values (`scatter`); default `*` i.e. none |
| `Y` | `Y` positions or interval (not needed for 2D regular data i.e. when `DATA` is a matrix) |
| `X` | `X` positions (needed only for 2D irregular data) |
| `YOUTER` | Variate containing 2 values to define the `Y`-bounds of the region to be examined (bottom then top); by default the whole region is used |
| `XOUTER` | Variate containing 2 values to define the `X`-bounds of the region to be examined (bottom then top); by default the whole region is used |
| `RADIUS` | Maximum distance between target point and usable data |
| `SEARCH` | Type of search (`isotropic`, `anisotropic`); default `isot` |
| `MINPOINTS` | Minimum number of data points from which to compute elements; default 7 |
| `MAXPOINTS` | Maximum number of data points from which to compute elements; default 20 |
| `DRIFT` | Amount of drift (`constant`, `linear`, `quadratic`); default `cons` |
| `YXRATIO` | Ratio of `Y` interval to `X` interval |
| `SAVE` | Pointer containing model estimates saved from `MVARIOGRAM` |

### Parameters

| Parameter | Description |
|---|---|
| `DATA` = variates or matrices | Observed measurements as a variate or, for data on a regular grid, as a matrix |
| `ISOTROPY` | Form of variogram (`isotropic`, `Burgess`, `geometrical`); default `isot` |
| `MODEL` | Model fitted to the variogram (`power`, `boundedlinear`, `circular`, `spherical`, `doublespherical`, `pentaspherical`, `exponential`, `besselk1`, `gaussian`, `cubic`, `stable`, `cardinalsine`, `matern`); default `*` |
| `NUGGET` | The nugget variance |
| `SILLVARIANCES` | Sill variances of the spatially dependent component |
| `RANGES` | Ranges of the spatially dependent component |
| `GRADIENT` | Slope of the unbounded component |
| `EXPONENT` | Power of the unbounded component or power for the stable model |
| `SMOOTHNESS` | Value of ν parameter for the Matern model |
| `PHI` | Phi parameters in anisotropic model (`ISOTROPY` = `burg` or `geom`) |
| `RMAX` | Maximum gradient of an anisotropic model |
| `RMIN` | Minimum gradient of an anisotropic model |
| `MEASUREMENTERROR` | Variance of measurement error |
| `PREDICTIONS` | Saves the kriged estimates in matrices for 2D regular data, otherwise in variates |
| `VARIANCES` | Saves the estimation variances in matrices for 2D regular data, otherwise in variates |
| `STATISTICS` | Saves the cross-validation statistics |

### Description

In geostatistics one way of choosing between plausible models for variograms is to use them for kriging, and see how well the kriging predicts the true values. The observed value of z at each sampling point in the data is omitted in turn from the whole set and predicted from the others. The predictions are compared with the true values to give a mean deviation or error, and the kriging variances are compared with the squared deviations to give a mean squared deviation ratio. This process is known as “cross-validation”. The procedure `KCROSSVALIDATION` uses this principle of leave-one-out cross-validation.
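
The leave-one-out principle can be illustrated outside Genstat. The following Python sketch is not the code of `KCROSSVALIDATION`; it assumes ordinary kriging with a spherical variogram, uses every remaining point as the neighbourhood (no `RADIUS`, `MINPOINTS` or `MAXPOINTS` search), and all function names are hypothetical.

```python
import math

def spherical(h, nugget, sill, rng):
    """Spherical variogram; gamma(0) is 0 by definition."""
    if h == 0.0:
        return 0.0
    if h >= rng:
        return nugget + sill
    r = h / rng
    return nugget + sill * (1.5 * r - 0.5 * r ** 3)

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x

def krige_point(pts, vals, target, gamma):
    """Ordinary punctual kriging of one target; returns (prediction, variance)."""
    n = len(pts)
    # Kriging matrix: variogram values bordered by the unbiasedness constraint.
    a = [[gamma(math.dist(pts[i], pts[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [gamma(math.dist(p, target)) for p in pts] + [1.0]
    w = solve(a, b)  # n weights followed by the Lagrange multiplier
    pred = sum(w[i] * vals[i] for i in range(n))
    var = sum(w[i] * b[i] for i in range(n)) + w[n]
    return pred, var

def leave_one_out(pts, vals, gamma):
    """Omit each point in turn and predict it from all the others."""
    results = []
    for i in range(len(pts)):
        others = pts[:i] + pts[i + 1:]
        other_vals = vals[:i] + vals[i + 1:]
        results.append(krige_point(others, other_vals, pts[i], gamma))
    return results
```

Calling `leave_one_out(pts, vals, lambda h: spherical(h, nugget, sill, rng))` with a list of coordinate pairs and a list of values returns one (prediction, kriging variance) pair per sampling point, which is the input needed for the cross-validation statistics.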

The data are supplied, by the `DATA` parameter, in either of the two forms accepted by the `KRIGE` directive: as a matrix for data on a regular grid, or as a variate for irregularly scattered data, with the `X` and `Y` options set to variates supplying the spatial coordinates.

By default all data are considered when forming the kriging system. However, you may select a subset of the data by limiting the area to a rectangle defined by the `XOUTER` and `YOUTER` options. Each of these should be set to a variate with two values defining the lower and upper limits in the x (East-West) and y (North-South) directions respectively.
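
As an illustration of the rectangular restriction, here is a minimal Python sketch (not Genstat code). The function names are hypothetical, and treating the bounds as inclusive is an assumption; the text above does not say how points lying exactly on a bound are handled.

```python
def within_bounds(x, y, xouter=None, youter=None):
    """True if (x, y) lies in the rectangle; None means no limit on that axis."""
    if xouter is not None and not (xouter[0] <= x <= xouter[1]):
        return False
    if youter is not None and not (youter[0] <= y <= youter[1]):
        return False
    return True

def select_region(xs, ys, zs, xouter=None, youter=None):
    """Keep only the (x, y, z) triples inside the rectangle (bounds assumed inclusive)."""
    return [(x, y, z) for x, y, z in zip(xs, ys, zs)
            if within_bounds(x, y, xouter, youter)]
```

With both bounds left as `None` every point is kept, mirroring the default in which the whole region is used.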

The minimum and maximum number of points for the kriging system are set by the `MINPOINTS` and `MAXPOINTS` options. There is a minimum limit of 3 for `MINPOINTS` and a maximum of 40 for `MAXPOINTS`, and `MINPOINTS` must be less than or equal to `MAXPOINTS`. The defaults are 7 and 20 respectively. You may select data points around the point to be kriged by setting the `RADIUS` option to the radius within which they must lie. If the variogram is anisotropic, the search may be requested to be anisotropic by setting option `SEARCH` to `anisotropic`; by default `SEARCH=isotropic`.
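
The interplay of `RADIUS`, `MINPOINTS` and `MAXPOINTS` can be sketched in Python as follows. This is an assumption-laden illustration, not the procedure's search algorithm: it covers the isotropic case only, the function names are hypothetical, and returning `None` when too few points are found is a choice made for the sketch.

```python
import math

def check_limits(min_points=7, max_points=20):
    """Enforce the documented limits: 3 <= MINPOINTS <= MAXPOINTS <= 40."""
    if not (3 <= min_points <= max_points <= 40):
        raise ValueError("need 3 <= min_points <= max_points <= 40")

def select_neighbours(points, target, radius=None, min_points=7, max_points=20):
    """Indices of the nearest points to target, or None if fewer than min_points.

    At most max_points indices are returned; if radius is given, only points
    no further than radius from the target are candidates.
    """
    check_limits(min_points, max_points)
    dists = sorted((math.dist(p, target), i) for i, p in enumerate(points))
    if radius is not None:
        dists = [(d, i) for d, i in dists if d <= radius]
    chosen = [i for _, i in dists[:max_points]]
    return chosen if len(chosen) >= min_points else None
```

In a cross-validation setting the point being predicted would be removed from `points` before the search, so that it cannot be its own neighbour.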

You can invoke universal kriging for two-dimensional data by setting the `DRIFT` option to `linear` or to `quadratic`, i.e. to be of order 1 or 2 respectively. The default is `DRIFT=constant`, which gives ordinary kriging. For data on a regular grid that is not square, the ratio of the spacing in the y direction to that in the x direction should be given by the `YXRATIO` option. The default is 1.0 (i.e. square).
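
To make the three drift orders concrete, the Python sketch below lists the polynomial terms conventionally used for a two-dimensional drift of order 0, 1 or 2. It is illustrative only: the function name is hypothetical, and the ordering of the terms is a choice made here rather than anything specified by the procedure.

```python
def drift_terms(x, y, drift="constant"):
    """Polynomial drift terms at (x, y) for a 2D drift of order 0, 1 or 2."""
    if drift == "constant":
        return [1.0]                               # ordinary kriging
    if drift == "linear":
        return [1.0, x, y]                         # order 1
    if drift == "quadratic":
        return [1.0, x, y, x * x, x * y, y * y]    # order 2
    raise ValueError("drift must be 'constant', 'linear' or 'quadratic'")
```

In universal kriging each term contributes one unbiasedness constraint to the kriging system, so a quadratic drift adds six constraints where ordinary kriging has one. This is one reason a sufficiently large neighbourhood is needed when the drift order is raised.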

The variogram is specified by its type and parameters, as follows. The `MODEL` parameter may be set to `power`, `boundedlinear` (one dimension only), `circular`, `spherical`, `doublespherical`, `pentaspherical`, `exponential`, `besselk1` (Whittle’s function), `gaussian`, `cubic`, `stable` (i.e. powered exponential; see Webster & Oliver 2001), `cardinalsine` or `matern`. All models may have a nugget variance, supplied using the `NUGGET` parameter; this is the constant estimated by `MVARIOGRAM`. You can specify the variance of any measurement error using the `MEASUREMENTERROR` parameter. The parameters of the `power` function (the only unbounded model) are defined by the `GRADIENT` and `EXPONENT` parameters. The power of the `stable` model is also supplied using the `EXPONENT` parameter, and the parameter ν of the Matern model using the `SMOOTHNESS` parameter. The simple bounded models (i.e. all other settings of `MODEL` except `doublespherical`) require the `SILLVARIANCES` (the sill of the correlated variance) and `RANGES` parameters. The latter is strictly the correlation range of the `boundedlinear`, `circular`, `spherical` and `pentaspherical` models, while for the asymptotic models it is the distance parameter of the model. The `doublespherical` model requires `SILLVARIANCES` and `RANGES` to be set to variates of length two, to correspond to the two components of the model.
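
For orientation, the Python sketch below evaluates four of these models in the parameterization commonly given in textbooks such as Webster & Oliver (2001). It is not Genstat code, the function names are hypothetical, and the exact forms used by the procedure should be checked against the `MVARIOGRAM` documentation. Here `a` is the correlation range of the spherical model and the distance parameter of the asymptotic models, matching the distinction drawn above.

```python
import math

def spherical(h, nugget, sill, a):
    """Bounded model: reaches nugget + sill exactly at the range a."""
    if h == 0.0:
        return 0.0
    if h >= a:
        return nugget + sill
    r = h / a
    return nugget + sill * (1.5 * r - 0.5 * r ** 3)

def exponential(h, nugget, sill, a):
    """Asymptotic model with distance parameter a."""
    if h == 0.0:
        return 0.0
    return nugget + sill * (1.0 - math.exp(-h / a))

def gaussian(h, nugget, sill, a):
    """Asymptotic model with distance parameter a."""
    if h == 0.0:
        return 0.0
    return nugget + sill * (1.0 - math.exp(-(h * h) / (a * a)))

def power(h, nugget, gradient, exponent):
    """Unbounded model; the exponent must lie strictly between 0 and 2."""
    if not 0.0 < exponent < 2.0:
        raise ValueError("exponent must satisfy 0 < exponent < 2")
    if h == 0.0:
        return 0.0
    return nugget + gradient * h ** exponent
```

For a spherical model with nugget 0.00466, sill 0.01515 and range 10.8, `spherical(10.8, 0.00466, 0.01515, 10.8)` returns the total sill, 0.00466 + 0.01515.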

The `ISOTROPY` parameter allows the variation to be defined to be either isotropic or anisotropic in one of two ways: either Burgess anisotropy (Burgess & Webster 1980) or geometric anisotropy (Webster & Oliver 1990). The anisotropy is specified by three parameters, namely `PHI` the angle in radians of the direction of maximum variation, `RMAX` the maximum gradient of the model, and `RMIN` the minimum gradient. In the current release only the `power` function may be anisotropic.

The predictions (or estimates) and variances can be saved using the `PREDICTIONS` and `VARIANCES` parameters. The cross-validation statistics can be saved using the `STATISTICS` parameter.

The `PRINT` option can be set to `statistics` to print the cross-validation statistics, or to `correlation` to print the correlation between the predicted and true values. The `PLOT` option can be set to `scatter` to produce a plot of the predicted values against the true values.

Options: `PRINT`, `PLOT`, `Y`, `X`, `YOUTER`, `XOUTER`, `RADIUS`, `SEARCH`, `MINPOINTS`, `MAXPOINTS`, `DRIFT`, `YXRATIO`, `SAVE`.

Parameters: `DATA`, `ISOTROPY`, `MODEL`, `NUGGET`, `SILLVARIANCES`, `RANGES`, `GRADIENT`, `EXPONENT`, `SMOOTHNESS`, `PHI`, `RMAX`, `RMIN`, `MEASUREMENTERROR`, `PREDICTIONS`, `VARIANCES`, `STATISTICS`.

### Method

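The mean error is given by

$$\frac{1}{n} \sum_{i=1}^{n} \left\{ z(x_i) - \hat{z}(x_i) \right\},$$

the mean squared error is

$$\frac{1}{n} \sum_{i=1}^{n} \left\{ z(x_i) - \hat{z}(x_i) \right\}^2,$$

and the mean squared deviation ratio is

$$\frac{1}{n} \sum_{i=1}^{n} \frac{\left\{ z(x_i) - \hat{z}(x_i) \right\}^2}{\sigma^2(x_i)},$$

where $z(x_i)$ is the observed value at sampling point $x_i$, $\hat{z}(x_i)$ is its kriged prediction from the other data, $\sigma^2(x_i)$ is the corresponding kriging variance, and $n$ is the number of sampling points.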
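
The three statistics above are simple averages, as the following Python sketch shows. It is an illustration rather than the procedure's own code; the function name and the dictionary keys are hypothetical, and the inputs are assumed to be complete (no missing values).

```python
def cross_validation_statistics(observed, predicted, variances):
    """Mean error, mean squared error and mean squared deviation ratio."""
    n = len(observed)
    if n == 0 or len(predicted) != n or len(variances) != n:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [z - zhat for z, zhat in zip(observed, predicted)]
    mean_error = sum(errors) / n
    mean_squared_error = sum(e * e for e in errors) / n
    # Each squared error is scaled by the kriging variance at that point.
    msdr = sum(e * e / v for e, v in zip(errors, variances)) / n
    return {"mean_error": mean_error,
            "mean_squared_error": mean_squared_error,
            "msdr": msdr}
```

A mean error near zero indicates little bias. A mean squared deviation ratio near 1 indicates that the kriging variances agree, on average, with the squared prediction errors actually observed.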

### Action with `RESTRICT`

The vectors involved in the analysis may be restricted as for `KRIGE`.

### References

Burgess, T.M. & Webster, R. (1980). Optimal interpolation and isarithmic mapping of soil properties. I. The semi-variogram and punctual kriging. Journal of Soil Science, 31, 315-331.

Webster, R. & Oliver, M.A. (1990). Statistical Methods in Soil and Land Resource Survey. Oxford University Press, Oxford.

Webster, R. & Oliver, M.A. (2001). Geostatistics for Environmental Scientists. Wiley, Chichester.

### See also

Directives: `FVARIOGRAM`, `FCOVARIOGRAM`, `KRIGE`, `MCOVARIOGRAM`, `COKRIGE`.

Procedures: `MVARIOGRAM`, `DVARIOGRAM`, `DCOVARIOGRAM`, `DHSCATTERGRAM`.

Commands for: Spatial statistics.

### Example

```
CAPTION 'KCROSSVALIDATION example',!t(\
'Data are levels of potassium at Broom''s Barn Experimental Station',\
'(Webster, R. and Oliver, M.A. 2001. Geostatistics for Environmental',\
'Scientists, Wiley)'); STYLE=meta,plain
VARIATE [VALUES=8(1),10(2),25(3),26(4),24(5),23(6),21(7),21(8),\
27(9),29(10),29(11),29(12),29(13),28(14),28(15),27(16),26(17),25(18)] x
VARIATE [VALUES=(24...31),19,(23...31),(1...11),18,19,(23...31),\
12,13,14,(2...8),(10...19),(23...31),(4...7),(9...19),(23...31),(5...19),\
(23...30),(7...19),(23...30),(7...19),(23...30),(4...19),21,20,(22...30),\
(2...30),(2...30),(2...30),(2...30),(2...18),(20...30),(2...18),(20...30),\
(3...18),(20...30),(3...18),(20...29),(3...16),18,(20...29)] y
VARIATE [VALUES=26,22,18,19,26,23,32,28,55,19,18,17,15,16,19,15,\
24,14,28,26,23,21,22,22,24,41,30,20,22,22,26,16,18,15,16,15,16,14,20,15,70,\
20,22,24,23,20,24,20,20,34,18,18,21,18,22,28,25,28,24,23,16,18,17,16,19,21,\
13,15,15,24,23,22,25,19,20,19,20,16,16,20,28,21,28,24,18,15,14,15,19,20,15,\
14,16,28,22,26,27,19,19,15,19,18,20,19,27,29,35,25,16,14,15,16,16,16,16,15,\
24,24,24,23,22,16,19,16,20,18,27,58,24,18,14,17,17,14,18,15,28,24,23,24,21,\
16,23,20,26,18,25,20,44,24,53,12,12,15,15,16,18,21,24,20,20,18,20,24,18,23,\
32,33,27,32,27,26,54,38,*,58,96,23,17,18,20,18,17,21,23,33,24,20,18,19,21,\
20,21,20,23,31,27,29,30,21,24,60,20,24,30,32,29,25,21,28,35,20,21,24,36,29,\
24,26,22,20,20,23,24,26,30,42,38,42,38,40,38,25,24,34,25,20,21,22,25,20,27,\
27,19,32,27,28,23,22,20,21,23,24,20,29,42,36,42,37,33,35,32,30,27,27,19,21,\
32,30,27,27,28,20,38,26,24,28,28,27,29,27,33,36,27,24,27,33,40,41,36,24,25,\
24,28,26,25,20,23,22,32,29,19,54,42,41,37,35,33,39,53,42,28,27,26,26,42,38,\
36,31,20,26,26,23,28,20,19,24,34,29,18,41,30,30,35,33,26,27,41,33,36,27,28,\
32,39,39,39,27,20,26,28,23,27,24,32,32,44,28,18,39,38,32,30,28,28,35,28,24,\
29,26,31,31,36,34,31,24,25,31,26,25,35,31,28,25,24,19,38,41,30,28,39,33,29,\
25,38,23,26,28,29,38,38,28,24,25,28,24,29,19,22,29,39,24,39,38,36,33,28,27,\
26,28,31,29,24,29,30,35,38,29,30,23,23,29,29,23,20,38,36] k
CALCULATE        logk = LOG10(k)
KCROSSVALIDATION [PRINT=stat,corr; Y=y; X=x; RADIUS=5; MINP=7; MAXP=20]\
DATA=logk; MODELTYPE=spherical; NUGGET=0.00466;\
SILL=0.01515; RANGE=10.8; STAT=stats
PRINT            stats
```