Saves and/or prints summary statistics for variates (R.C. Butler & D.A. Murray).
Options
| PRINT = string token | Controls whether or not the summaries are printed (summaries); default summ |
|---|---|
| SELECTION = string tokens | Selects the statistics to be produced (nval, nobs, nmv, mean, median, min, max, range, q1, q3, sd, sem, var, sevar, %cv, sum, ss, uss, skew, seskew, kurtosis, sekurtosis, all); default mean, min, max, nobs, nmv, medi, q1, q3 |
| GROUPS = factor | Allows groups to be defined, so that summaries are produced for each group in turn |
Parameters
| DATA = variates | Data to summarize |
|---|---|
| SUMMARIES = variates or pointers | To save summaries for each DATA variate, in a variate if GROUPS is unset, or in a pointer to a set of variates (one for each group) if groups have been specified; will be redefined if necessary |
Description
DESCRIBE calculates up to 22 different summary statistics for the values stored in a variate. The statistics may be saved, or printed, or both. The statistics to be calculated are indicated by the SELECTION option; the available settings are:
| nval | number of values |
|---|---|
| nobs | number of non-missing values |
| nmv | number of missing values |
| mean | arithmetic mean |
| median | median |
| min | minimum |
| max | maximum |
| range | range (max-min) |
| q1 | lower quartile |
| q3 | upper quartile |
| sd | standard deviation |
| sem | standard error of mean |
| var | variance |
| sevar | standard error of variance |
| %cv | coefficient of variation |
| sum | total of values |
| ss | corrected sum of squares |
| uss | uncorrected sum of squares |
| skew | skewness (see Method) |
| seskew | standard error of skewness |
| kurtosis | kurtosis (see Method) |
| sekurtosis | standard error of kurtosis |
| all | all 22 summaries |
By default the mean, min, max, nobs, nmv, median and both quartiles are calculated.
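As an illustration of what several of these settings mean, the following is a minimal Python sketch, not Genstat code and not the procedure's own implementation. It assumes that missing values are represented by `None`, that the variance uses an N-1 divisor, and that `%cv` is 100 × sd / mean; the function name `describe_basic` is hypothetical. The quartiles and median are left out because the text does not say how they are interpolated.

```python
import math

def describe_basic(values):
    """Return a dict of simple summaries; None marks a missing value."""
    present = [x for x in values if x is not None]
    nobs = len(present)
    total = sum(present)
    mean = total / nobs
    uss = sum(x * x for x in present)            # uncorrected sum of squares
    ss = sum((x - mean) ** 2 for x in present)   # corrected sum of squares
    var = ss / (nobs - 1)                        # assumption: N-1 divisor
    sd = math.sqrt(var)
    return {
        "NVAL": len(values), "NOBS": nobs, "NMV": len(values) - nobs,
        "MEAN": mean, "MIN": min(present), "MAX": max(present),
        "RANGE": max(present) - min(present),
        "SD": sd, "SEM": sd / math.sqrt(nobs), "VAR": var,
        "%CV": 100 * sd / mean,                  # assumption: 100 * sd / mean
        "SUM": total, "SS": ss, "USS": uss,
    }

summary = describe_basic([2.0, 4.0, None, 6.0])
print(summary["NOBS"], summary["NMV"], summary["MEAN"])
```

The keys are written in capital letters to mirror the way DESCRIBE labels the units of a saved SUMMARIES variate.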
Printing is controlled by the PRINT option. The statistics are printed by default; to suppress printing, set PRINT=*.
The GROUPS option allows groups of observations to be defined. Summaries are then given for each group.
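The idea of grouped summaries can be sketched outside Genstat as follows. This is an illustrative Python analogue, not the procedure's implementation: the function name `describe_by_group` is hypothetical, only four statistics are shown, and the group labels are assumed to alternate girl, boy as the `values=5(1,2)` setting in the Example section suggests.

```python
from collections import defaultdict

def describe_by_group(data, groups):
    """Return {group label: summaries} for paired data and group labels."""
    buckets = defaultdict(list)
    for value, label in zip(data, groups):
        buckets[label].append(value)
    return {
        label: {"NOBS": len(xs), "MEAN": sum(xs) / len(xs),
                "MIN": min(xs), "MAX": max(xs)}
        for label, xs in buckets.items()
    }

# Weights taken from the Example section of this page.
weight = [38.2, 40.1, 45.2, 39.6, 41.4, 47.9, 38.8, 42.3, 47.5, 41.2]
sex = ["girl", "boy"] * 5   # assumption: labels alternate girl, boy
by_sex = describe_by_group(weight, sex)
print(by_sex["girl"]["MIN"], by_sex["boy"]["MAX"])
```

The result holds one set of summaries per group, which parallels the pointer to a set of variates (one for each group) that SUMMARIES saves when GROUPS is set.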
The SUMMARIES parameter allows the statistics to be saved in a variate, or in a pointer to a set of variates if there are groups. These need not be declared in advance. The units of the variate(s) are labelled by the corresponding strings from the settings of the SELECTION option (in capital letters), to simplify subsequent access to any individual statistic. For example, the minimum value can be copied from a SUMMARIES variate v into a scalar m by
CALCULATE m = v$['MIN']
Options: PRINT, SELECTION, GROUPS.
Parameters: DATA, SUMMARIES.
Method
The statistics are calculated in a variate which is then restricted to print only those that were required, and to obtain the unit numbers of those to be copied into the SUMMARIES variate.
SE Variance is calculated as
$\sqrt{\left[\dfrac{N\,(M_4 - 4 M_1 M_3 + 6 M_1^2 M_2 - 3 M_1^4)}{N-1} - \left(\dfrac{N\,(M_2 - M_1^2)}{N-1}\right)^2\right] / N}$
Skewness is calculated as $(M_3 - 3 M_1 M_2 + 2 M_1^3) / (M_2 - M_1^2)^{3/2}$
SE Skewness is calculated as $\sqrt{\dfrac{6N(N-1)}{(N-2)(N+1)(N+3)}}$
Kurtosis is calculated as $(M_4 - 4 M_1 M_3 + 6 M_1^2 M_2 - 3 M_1^4) / (M_2 - M_1^2)^2 - 3$
SE Kurtosis is calculated as $\sqrt{\dfrac{24N(N-1)^2}{(N-2)(N-3)(N+5)(N+3)}}$
where $M_i = \sum x^i / N$
and N = NOBSERVATIONS(DATA)
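The formulas above can be checked numerically with a short Python sketch. It is not Genstat code; it reads M_i as the i-th raw moment (the sum of x to the power i, divided by N), assumes the data contain no missing values, and uses a hypothetical function name, `moment_summaries`.

```python
import math

def moment_summaries(x):
    """Skewness, kurtosis and standard errors from raw moments M1..M4."""
    n = len(x)
    m1, m2, m3, m4 = (sum(v ** i for v in x) / n for i in (1, 2, 3, 4))
    c2 = m2 - m1 * m1                                   # M2 - M1^2
    c3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3                 # skewness numerator
    c4 = m4 - 4 * m1 * m3 + 6 * m1 * m1 * m2 - 3 * m1 ** 4
    k = n / (n - 1)                                     # N / (N-1) scaling
    return {
        "SEVAR": math.sqrt((k * c4 - (k * c2) ** 2) / n),
        "SKEW": c3 / c2 ** 1.5,
        "SESKEW": math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3))),
        "KURTOSIS": c4 / c2 ** 2 - 3,
        "SEKURTOSIS": math.sqrt(
            24 * n * (n - 1) ** 2 / ((n - 2) * (n - 3) * (n + 5) * (n + 3))
        ),
    }

result = moment_summaries([1, 2, 3, 4, 5])
print(result["SKEW"], result["KURTOSIS"])
```

For the symmetric data 1 to 5 the skewness numerator vanishes, which gives a quick sanity check on the signs in the skewness formula.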
Action with RESTRICT
The statistics are calculated for the restricted set of units from each DATA variate. Any existing restrictions are not affected by the procedure.
See also
Directive: TABULATE.
Procedures: CDESCRIBE, PTDESCRIBE, TABMODE, VSUMMARY.
Commands for: Basic and nonparametric statistics.
Example
CAPTION 'DESCRIBE example',\
  !t('1. The default statistics (mean, min, max, nobs, nmv, median, q1, q3)',\
  'are printed for a variate of 20 random numbers called data.');\
  STYLE=meta,plain
VARIATE [NVALUES=20] data
CALCULATE data = URAND(50697; 20)
DESCRIBE data
CAPTION !t('2. From a variate containing the weights of children and a',\
  'factor sex, use DESCRIBE to print median and quartiles of the',\
  'weights of the girls, and save them in a variate called save.')
VARIATE [values=38.2,40.1,45.2,39.6,41.4,47.9,38.8,42.3,47.5,41.2] weight
FACTOR [label=!t(girl,boy); values=5(1,2)] sex
RESTRICT weight; CONDITION=( sex .IN. 'girl')
DESCRIBE [SELECT=median,q1,q3] weight; SUMMARIES=save
PRINT save
RESTRICT weight
CAPTION !t('3. Use DESCRIBE to print all the summary statistics',\
  'for girls and boys separately.')
DESCRIBE [GROUP=sex; SELECTION=all] weight
CAPTION !t('4. Use DESCRIBE to save (but not print) the skewness and',\
  'its standard error in a variate called skew, for',\
  '100 Normally distributed random numbers.')
CALC normal = grnormal(100; 20; 10)
HISTOGRAM normal
DESCRIBE [PRINT=*; SELECT=skew,seskew] normal; SUMMARIES=skew
PRINT [RLPRINT=*] !t('skewness','s.e. of skewness'),skew