# QUANTILE procedure

Calculates quantiles of the values in a variate (P.W. Lane).

### Options

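| Option | Setting | Description |
| --- | --- | --- |
| `PRINT` | string token | What to print (`quantiles`); default `quan` |
| `METHOD` | string token | Type of quantile to form (`population`, `sample`); default `samp` |
| `PROPORTION` | variate or scalar | Proportions at which to calculate quantiles; default `!(0,0.25,0.5,0.75,1)` |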

### Parameters

| Parameter | Setting | Description |
| --- | --- | --- |
| `DATA` | variates | Values whose quantiles are required; this parameter must be specified |
| `QUANTILES` | variates | Identifiers of structures to store results, if required |

### Description

Quantiles are statistics that characterize a distribution. The `DATA` parameter supplies a sample of numbers $\{x_i,\ i = 1 \dots n\}$ from which the quantiles are to be calculated, and the `METHOD` option specifies the type of quantile to form.

By default `QUANTILE` calculates quantiles of the sample itself. For a proportion p in the range [0,1], the corresponding quantile q of the sample $\{x_i\}$ has the following properties:

1) at least the proportion p of $\{x_i\}$ are less than or equal to q;

2) at least the proportion (1-p) of $\{x_i\}$ are greater than or equal to q;

3) if $q = x_i$ and $q = x_{i+1}$ both satisfy 1) and 2), then take $q = (x_i + x_{i+1})/2$.

Thus the sample quantile for proportion 0.5 is the median, for 0.0 it is the minimum, and for 1.0 it is the maximum of the sample.
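The three properties can be checked directly in code. The following is a minimal Python sketch (not Genstat code) written from the definition above; the function name `sample_quantile` and the data are made up for illustration, and it assumes that property 3 means taking the midpoint of the two data values that qualify.

```python
from fractions import Fraction


def sample_quantile(values, p):
    """Sample quantile found by testing each data value against properties 1) and 2)."""
    n = len(values)
    p = Fraction(str(p))  # exact arithmetic, so boundary cases are not lost to rounding
    candidates = sorted({
        q for q in values
        if sum(x <= q for x in values) >= p * n          # property 1
        and sum(x >= q for x in values) >= (1 - p) * n   # property 2
    })
    # Property 3: when two distinct data values qualify, take their midpoint.
    # When only one qualifies, this is just that value.
    return (candidates[0] + candidates[-1]) / 2


data = [7, 1, 4, 9, 3, 8]  # illustrative values only
print([sample_quantile(data, p) for p in (0, 0.25, 0.5, 0.75, 1)])
```

With these six values, only the proportion 0.5 has two qualifying data values (4 and 7), so the median is their midpoint; the other four proportions each pick out a single data value.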

Alternatively, you can set `METHOD=population` to estimate quantiles of the underlying population from which the data have been sampled. (This type of quantile is the one used most often elsewhere in Genstat.) The quantile is then an estimate of the value x such that a proportion p of the population has values less than or equal to x.

By default, `QUANTILE` produces the five quantiles called the “five-number summary” of a sample, corresponding to the proportions 0.0, 0.25, 0.5, 0.75, 1.0. The option `PROPORTION` can be set to a scalar or variate to request other single quantiles or sets of quantiles. By default, `QUANTILE` prints the statistics, but this can be suppressed by setting option `PRINT=*`. The quantiles can be stored in a variate using the parameter `QUANTILES`.

Options: `PRINT`, `METHOD`, `PROPORTION`.

Parameters: `DATA`, `QUANTILES`.

### Method

With `METHOD=sample`, `QUANTILE` calculates the quantiles itself, using the `SORT` and `CALCULATE` directives. First, the values are sorted into ascending order. Then, for each proportion, the two values that are candidates for the quantile are found by counting in from each end of the sorted list, so that the required number of values lies between that point and the end of the list. The quantile is the average of the two values found; when both counts arrive at the same value, the quantile is simply that value.
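As an illustration of the counting step, here is a minimal Python sketch written from the description above, not from the procedure's source. The name `sample_quantile_sorted` is hypothetical, and the two rank formulas are an assumption about what counting from each end of the sorted list amounts to.

```python
import math


def sample_quantile_sorted(values, p):
    """Sample quantile by sorting, then locating one candidate from each end."""
    xs = sorted(values)
    n = len(xs)
    pn = round(p * n, 9)  # guard against floating-point noise in p * n
    # Counting up from the bottom: first rank with at least p*n values at or below it.
    low_rank = max(1, math.ceil(pn))
    # Counting down from the top: last rank with at least (1-p)*n values at or above it.
    high_rank = min(n, math.floor(pn) + 1)
    # Average the two candidates (ranks are 1-based, Python lists are 0-based).
    return (xs[low_rank - 1] + xs[high_rank - 1]) / 2


values = [12, 5, 9, 20, 15]  # illustrative values only
print([sample_quantile_sorted(values, p) for p in (0, 0.2, 0.5, 1)])
```

The two ranks differ only when p times n is a whole number strictly between 0 and n, which is the case where two adjacent sorted values both qualify and are averaged.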

The alternative setting, `METHOD=population`, uses the Genstat `QUANTILES` function. `QUANTILES` assumes that the sorted data values are evenly distributed along the range of proportions, but with the lowest data value located at proportion 1/(2n) and the highest located at proportion 1-1/(2n), where n is the size of the sample. (This recognises that the sample is unlikely to contain the minimum and maximum values in the population.) If the required proportion p coincides with one of these sample proportions, `QUANTILES` estimates the quantile as the corresponding data value. If not, `QUANTILES` finds the nearest sample point with a proportion below p, and the nearest one with a proportion above p. It then interpolates linearly between these two points, i.e. it takes a weighted average of their data values, with weights determined by the absolute differences between their proportions and p, so that the point closer to p counts for more. However, if p lies outside the sample proportions (i.e. below the lowest or above the highest), `QUANTILES` does a linear extrapolation using the two nearest sample points.
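The interpolation rule can be sketched as follows. This is a minimal Python illustration based only on the description above, not on the `QUANTILES` function's own code; the name `population_quantile` is hypothetical, and the handling of a single-value sample is an assumption, since the text does not cover that case.

```python
import math


def population_quantile(values, p):
    """Estimate a population quantile by linear interpolation between sorted values.

    The value of rank i (1-based) is placed at proportion (i - 0.5) / n, so the
    lowest value sits at 1/(2n) and the highest at 1 - 1/(2n).
    """
    xs = sorted(values)
    n = len(xs)
    if n == 1:
        return float(xs[0])  # assumption: nothing to interpolate with one value
    pos = p * n + 0.5  # position of p on the rank scale
    # Lower of the two neighbouring ranks. Clamping to 1..n-1 means that a p
    # outside the sample proportions is extrapolated from the two nearest points.
    i = min(max(math.floor(pos), 1), n - 1)
    frac = pos - i  # 0 at rank i, 1 at rank i+1; outside [0, 1] when extrapolating
    return xs[i - 1] + frac * (xs[i] - xs[i - 1])


data = [7, 1, 4, 9, 3, 8]  # illustrative values only
print([population_quantile(data, p) for p in (0, 0.25, 0.5, 0.75, 1)])
```

For these six values the sketch gives 0.0 at p = 0 and 9.5 at p = 1, both outside the observed range of 1 to 9, which is the effect of the extrapolation at the ends.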

### Action with `RESTRICT`

If the `DATA` variate is restricted, the quantiles are formed only using the units that are not restricted out. The `PROPORTION` and `QUANTILES` variates must not be restricted.

### See also

Directive: `TABULATE`.

Procedure: `RQLINEAR`.

Function: `QUANTILES`.

Commands for: Calculations and manipulation.

### Example

```
CAPTION   'QUANTILE example',\
!t('Generate some Normal random numbers, and print the',\
'five-number summary (min, lower 25%, median, upper 25%, max).');\
STYLE=meta,plain
CALCULATE Normal = NED(URAND(37752; 500))
QUANTILE  Normal
PRINT !T('Form the 10,20...90 percent quantiles,',\
'and compare with the theoretical values.'); JUSTIFICATION=left
VARIATE   [VALUES=0.1,0.2...0.9] Proportn
&         [VALUES=-1.282,-0.8416,-0.5244,-0.2533,0,\
0.2533,0.5244,0.8416,1.282] Theory
QUANTILE  [PRINT=*; PROPORTION=Proportn] Normal; QUANTILE=Sample
PRINT     [RLPRINT=*] Proportn,Theory,Sample; DECIMALS=4
```
Updated on March 6, 2019