DESCRIBE procedure

Saves and/or prints summary statistics for variates (R.C. Butler & D.A. Murray).

Options

PRINT = string token Controls whether or not the summaries are printed (summaries); default summ
SELECTION = string tokens Selects the statistics to be produced (nval, nobs, nmv, mean, median, min, max, range, q1, q3, sd, sem, var, sevar, %cv, sum, ss, uss, skew, seskew, kurtosis, sekurtosis, all); default mean, min, max, nobs, nmv, medi, q1, q3
GROUPS = factor Allows groups to be defined, so that summaries are produced for each group in turn

Parameters

DATA = variates Data to summarize
SUMMARIES = variates or pointers To save summaries for each DATA variate, in a variate if GROUPS is unset, or in a pointer to a set of variates (one for each group) if groups have been specified; will be redefined if necessary

Description

DESCRIBE calculates up to 22 different summary statistics for values stored in a variate. The statistics may be saved, or printed, or both. The statistics to be calculated are indicated by the SELECTION option; the available settings are:

    nval number of values
    nobs number of non-missing values
    nmv number of missing values
    mean arithmetic mean
    median median
    min minimum
    max maximum
    range range (max-min)
    q1 lower quartile
    q3 upper quartile
    sd standard deviation
    sem standard error of mean
    var variance
    sevar standard error of variance
    %cv coefficient of variation
    sum total of values
    ss corrected sum of squares
    uss uncorrected sum of squares
    skew skewness (see Method)
    seskew standard error of skewness
    kurtosis kurtosis (see Method)
    sekurtosis standard error of kurtosis
    all all 22 summaries

By default the mean, min, max, nobs, nmv, median and both quartiles (q1 and q3) are calculated.

Printing is controlled by the PRINT option. The statistics are printed by default; to suppress printing, set PRINT=*.

The GROUPS option allows groups of observations to be defined. Summaries are then given for each group.

The SUMMARIES parameter allows the statistics to be saved in a variate, or a pointer to a set of variates if there are groups. These need not be declared in advance. The units of the variate(s) are labelled by the corresponding strings from the settings (in capital letters) of the SELECTION option, to simplify the subsequent access of any individual statistic. For example, the minimum value can be copied from a SUMMARIES variate v into a scalar m by

CALCULATE m = v$['MIN']
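As a minimal sketch of saving and then reusing the summaries, the commands below store three statistics without printing them, and then combine two of them by unit label. The data values and the identifiers weight, sex, v, p and se are illustrative only. The second DESCRIBE assumes that all the variates held by the pointer can be listed with the null suffix list p[]; how the pointer elements are indexed is not stated above.

```genstat
VARIATE   [VALUES=38.2,40.1,45.2,39.6,41.4,47.9,38.8,42.3,47.5,41.2] weight
FACTOR    [LABELS=!t(girl,boy); VALUES=5(1,2)] sex

" No GROUPS: the summaries are saved in a single variate v, "
" whose units are labelled 'NOBS', 'MEAN' and 'SD'. "
DESCRIBE  [PRINT=*; SELECTION=nobs,mean,sd] weight; SUMMARIES=v
CALCULATE se = v$['SD'] / SQRT(v$['NOBS'])
PRINT     se

" With GROUPS: the summaries are saved in a pointer p, "
" with one variate for each group (illustrative; p[] is an assumption). "
DESCRIBE  [PRINT=*; SELECTION=nobs,mean,sd; GROUPS=sex] weight; SUMMARIES=p
PRINT     p[]
```

The scalar se should agree with the value that the sem setting would return, which gives a simple check that the unit labels are being matched as intended.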

Options: PRINT, SELECTION, GROUPS.

Parameters: DATA, SUMMARIES.

Method

The statistics are calculated in a variate. This variate is then restricted to the statistics that were requested, both so that only those are printed and to obtain the unit numbers of the values to be copied into the SUMMARIES variate.

SE Variance is calculated as

$$\sqrt{\left[\frac{N\,(M_4 - 4 M_1 M_3 + 6 M_1^2 M_2 - 3 M_1^4)}{N-1} - \left(\frac{N\,(M_2 - M_1^2)}{N-1}\right)^2\right] / N}$$

Skewness is calculated as

$$\frac{M_3 - 3 M_1 M_2 + 2 M_1^3}{(M_2 - M_1^2)^{3/2}}$$

SE Skewness is calculated as

$$\sqrt{\frac{6N(N-1)}{(N-2)(N+1)(N+3)}}$$

Kurtosis is calculated as

$$\frac{M_4 - 4 M_1 M_3 + 6 M_1^2 M_2 - 3 M_1^4}{(M_2 - M_1^2)^2} - 3$$

SE Kurtosis is calculated as

$$\sqrt{\frac{24N(N-1)^2}{(N-2)(N-3)(N+3)(N+5)}}$$

where $M_i = \sum x^i / N$

and $N$ = NOBSERVATIONS(DATA)
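The skewness formulae can be checked by hand with CALCULATE. The sketch below is illustrative rather than the procedure's own code: it reads Mi as the i-th raw moment (the sum of the i-th powers of the values divided by N) and the denominator of the skewness as the central second moment M2 - M1*M1. The data values and the identifiers x, n, m1, m2, m3, g1 and seg1 are hypothetical.

```genstat
VARIATE   [VALUES=2,4,4,5,7,9,12] x

" Raw moments M1, M2 and M3 about zero. "
CALCULATE n  = NOBSERVATIONS(x)
CALCULATE m1 = SUM(x) / n
CALCULATE m2 = SUM(x**2) / n
CALCULATE m3 = SUM(x**3) / n

" Skewness: third central moment over the 3/2 power of the second. "
CALCULATE g1   = (m3 - 3*m1*m2 + 2*m1**3) / (m2 - m1*m1)**1.5
" Its standard error depends only on the number of observations. "
CALCULATE seg1 = SQRT(6*n*(n-1) / ((n-2)*(n+1)*(n+3)))
PRINT     g1,seg1

" Values from the procedure, for comparison. "
DESCRIBE  [SELECTION=skew,seskew] x
```

If this reading of the formulae is right, the two printed pairs of values should match. The kurtosis and its standard error can be checked in the same way by adding the fourth raw moment.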

Action with RESTRICT

The statistics are calculated for the restricted set of units from each DATA variate. Any existing restrictions are not affected by the procedure.

See also

Directive: TABULATE.
Procedures: CDESCRIBE, PTDESCRIBE, TABMODE, VSUMMARY.
Commands for: Basic and nonparametric statistics.

Example

CAPTION 'DESCRIBE example',\
  !t('1. The default statistics (mean, min, max, nobs, nmv, median, q1, q3)',\
  'are printed for a variate of 20 random numbers called data.');\
  STYLE=meta,plain
VARIATE   [NVALUES=20] data
CALCULATE data = URAND(50697; 20)
DESCRIBE  data
CAPTION !t('2. From a variate containing the weights of children and a',\
  'factor sex, use DESCRIBE to print median and quartiles of the',\
  'weights of the girls, and save them in a variate called save.')
VARIATE  [values=38.2,40.1,45.2,39.6,41.4,47.9,38.8,42.3,47.5,41.2] weight
FACTOR   [label=!t(girl,boy); values=5(1,2)] sex
RESTRICT weight; CONDITION=( sex .IN. 'girl')
DESCRIBE [SELECT=median,q1,q3] weight; SUMMARIES=save
PRINT save
RESTRICT weight
CAPTION !t('3. Use DESCRIBE to print all the summary statistics',\
  'for girls and boys separately.')
DESCRIBE [GROUP=sex; SELECTION=all] weight
CAPTION !t('4. Use DESCRIBE to save (but not print) the skewness and',\
  'its standard error in a variate called skew, for',\
  '100 Normally distributed random numbers.')
CALC normal = grnormal(100; 20; 10)
HISTOGRAM normal
DESCRIBE  [PRINT=*; SELECT=skew,seskew] normal; SUMMARIES=skew
PRINT     [RLPRINT=*] !t('skewness','s.e. of skewness'),skew
Updated on January 12, 2022
