QUANTILE procedure

Calculates quantiles of the values in a variate (P.W. Lane).

Options

PRINT = string token              What to print (quantiles); default quan
METHOD = string token             Type of quantile to form (population, sample); default samp
PROPORTION = variate or scalar    Proportions at which to calculate quantiles; default !(0,0.25,0.5,0.75,1)

Parameters

DATA = variates                   Values whose quantiles are required; this parameter must be specified
QUANTILES = variates or scalars   Identifiers of structures to store results, if required

Description

Quantiles are statistics that characterize a distribution. The DATA parameter supplies a sample of numbers {x_i, i=1…n} from which the quantiles are to be calculated, and the METHOD option specifies the type of quantile to form.

By default QUANTILE calculates quantiles of the sample itself. For a proportion p in the range [0,1], the corresponding quantile q of the sample {x_i} has the following properties:

1) at least the proportion p of {x_i} are less than or equal to q;

2) at least the proportion (1-p) of {x_i} are greater than or equal to q;

3) if q=x_i and q=x_{i+1} both satisfy 1) and 2), where x_i and x_{i+1} are adjacent values in the sorted sample, then take q = (x_i + x_{i+1})/2.

Thus the sample quantile for proportion 0.5 is the median, for 0.0 it is the minimum, and for 1.0 it is the maximum of the sample.
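The three properties can be checked directly on a small sample. The following is a minimal sketch in Python, used here purely for illustration; it is not Genstat code, and the function name satisfies_quantile_conditions is hypothetical. Exact fractions are used so that comparisons such as 2/4 >= 0.5 are not affected by rounding.

```python
from fractions import Fraction


def satisfies_quantile_conditions(xs, p, q):
    """Return True if q meets properties 1) and 2) for the sample xs."""
    n = len(xs)
    p = Fraction(str(p))  # exact representation of the proportion
    n_le = sum(1 for x in xs if x <= q)  # values less than or equal to q
    n_ge = sum(1 for x in xs if x >= q)  # values greater than or equal to q
    return Fraction(n_le, n) >= p and Fraction(n_ge, n) >= 1 - p


sample = [1, 3, 5, 7]

# Sample values that qualify as the quantile for p = 0.5
candidates = [x for x in sample if satisfies_quantile_conditions(sample, 0.5, x)]
print(candidates)  # [3, 5]

# Property 3): two values qualify, so the quantile is their average
print((min(candidates) + max(candidates)) / 2)  # 4.0
```

With four values and p = 0.5, both 3 and 5 satisfy properties 1) and 2), so property 3) gives the median as 4.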

Alternatively, you can set METHOD=population to estimate quantiles of the underlying population from which data have been sampled. (This type of quantile is the one used most often elsewhere in Genstat.) The quantile is now an estimate of the value x such that a proportion p of the population has values less than or equal to x.

By default, QUANTILE produces the five quantiles called the “five-number summary” of a sample, corresponding to the proportions 0.0, 0.25, 0.5, 0.75, 1.0. The option PROPORTION can be set to a scalar or variate to request other single quantiles or sets of quantiles. By default, QUANTILE prints the statistics, but this can be suppressed by setting option PRINT=*. The quantiles can be stored in a variate using the parameter QUANTILES.

Options: PRINT, METHOD, PROPORTION.

Parameters: DATA, QUANTILES.

Method

With METHOD=sample, QUANTILE calculates the quantiles itself, using the SORT and CALCULATE directives. First, the values are sorted into ascending order. Then, for each proportion, the two candidate values for the quantile are found by counting in from each end of the sorted list until the required number of values lies between that point and the far end of the list. The quantile is the average of the two candidates (which are often the same value).
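The sort-and-count approach can be sketched as follows. This is an illustrative Python reading of the description, not the procedure's actual Genstat source; the function name sample_quantile and the index rule (the lower candidate at position ceil(np), the upper candidate at position floor(np)+1, both kept within 1…n) are assumptions chosen to agree with the three properties given in the Description.

```python
import math
from fractions import Fraction


def sample_quantile(data, p):
    """Sample quantile as the average of two candidates from the sorted data."""
    xs = sorted(data)
    n = len(xs)
    if n == 0:
        raise ValueError("data must contain at least one value")
    p = Fraction(str(p))  # exact arithmetic avoids rounding errors in n*p
    if not 0 <= p <= 1:
        raise ValueError("p must lie in the range [0,1]")
    # Candidate found counting up from the smallest value (1-based position)
    k_low = max(math.ceil(n * p), 1)
    # Candidate found counting down from the largest value (1-based position)
    k_high = min(math.floor(n * p) + 1, n)
    return (xs[k_low - 1] + xs[k_high - 1]) / 2


print(sample_quantile([7, 1, 5, 3], 0.5))  # 4.0

# Five-number summary of a sample of five values
print([sample_quantile([7, 1, 5, 3, 9], p) for p in (0, 0.25, 0.5, 0.75, 1)])
# [1.0, 3.0, 5.0, 7.0, 9.0]
```

In this sketch the two candidates differ only when np is a whole number strictly between 0 and n; otherwise both positions point at the same sorted value.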

The alternative setting, METHOD=population, uses the Genstat QUANTILES function. QUANTILES assumes that the sorted data values are evenly distributed along the range of proportions, but with the lowest data value located at proportion 1/(2n), and the highest one located at proportion 1-1/(2n), where n is the size of the sample; the ith sorted value is therefore located at proportion (i-0.5)/n. (This recognises that the sample is unlikely to contain the minimum and maximum values in the population.) If the required proportion p coincides with one of these sample proportions, QUANTILES estimates the quantile as the corresponding data value. If not, QUANTILES finds the nearest sample point with a proportion below p, and the nearest one with a proportion above p. It then interpolates linearly between these two points, i.e. it takes a weighted average of their data values, with each value weighted by the absolute difference between p and the proportion of the other point, so that the nearer point receives the larger weight. However, if p lies outside (i.e. above or below) the sample proportions, QUANTILES does a linear extrapolation using the two nearest sample points.
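The interpolation and extrapolation rule can be illustrated with a short Python sketch. It follows the description above but is not the source of the QUANTILES function; the function name population_quantile and the handling of a single-value sample are assumptions. Because the ith sorted value sits at proportion (i-0.5)/n, a proportion p corresponds to the fractional position h = np + 0.5 in the sorted list.

```python
import math


def population_quantile(data, p):
    """Estimate a population quantile by linear inter/extrapolation."""
    xs = sorted(data)
    n = len(xs)
    if n == 0:
        raise ValueError("data must contain at least one value")
    if n == 1:
        return float(xs[0])  # assumption: a single value is returned as is
    h = n * p + 0.5  # fractional 1-based position of p in the sorted list
    # Lower point of the pair used; clamping to 1..n-1 means that the first
    # or last two points are used for extrapolation when p is outside
    # the range [1/(2n), 1-1/(2n)].
    i = min(max(math.floor(h), 1), n - 1)
    lower, upper = xs[i - 1], xs[i]
    return lower + (h - i) * (upper - lower)


values = [10, 20, 30, 40]  # located at proportions 0.125, 0.375, 0.625, 0.875
print(population_quantile(values, 0.375))  # 20.0 (coincides with a data value)
print(population_quantile(values, 0.5))    # 25.0 (interpolated)
print(population_quantile(values, 0.0))    # 5.0  (extrapolated below the minimum)
print(population_quantile(values, 1.0))    # 45.0 (extrapolated above the maximum)
```

Unlike the sample quantiles, the estimates for proportions 0 and 1 fall outside the range of the data.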

Action with RESTRICT

If the DATA variate is restricted, the quantiles are formed using only the units that are not restricted out. The PROPORTION and QUANTILES variates must not be restricted.

See also

Directive: TABULATE.

Procedure: RQLINEAR.

Function: QUANTILES.

Commands for: Calculations and manipulation.

Example

CAPTION   'QUANTILE example',\ 
          !t('Generate some Normal random numbers, and print the',\ 
          'five-number summary (min, lower 25%, median, upper 25%, max).')\; 
          STYLE=meta,plain
CALCULATE Normal = NED(URAND(37752; 500))
QUANTILE  Normal
PRINT !T('Form the 10,20...90 percent quantiles,',\ 
  'and compare with the theoretical values.'); JUSTIFICATION=left
VARIATE   [VALUES=0.1,0.2...0.9] Proportn
&         [VALUES=-1.282,-0.8416,-0.5244,-0.2533,0,\
          0.2533,0.5244,0.8416,1.282] Theory
QUANTILE  [PRINT=*; PROPORTION=Proportn] Normal; QUANTILE=Sample
PRINT     [RLPRINT=*] Proportn,Theory,Sample; DECIMALS=4
Updated on March 6, 2019
