Summarizes a variate, with classifying factors, into a data matrix of variates and factors (D.B. Baird).
|PRINT|What to print|
|CLASSIFICATION|Factors classifying the summary groups|
|NEWCLASSIFICATION|Factors in the data matrix to classify the output variates|
|REDEFINE|Whether to redefine the DATA and CLASSIFICATION structures to hold the summaries|
|CMETHOD|How to form levels for carried factors|
||Whether to include factor combinations with no observations in summaries|
||What warnings to output|
|DATA|Data to be summarized|
|STATISTIC|What statistic to calculate|
|PERCENTILE|Percentile to be used for quantiles; default 50|
|NEWDATA|Summary statistics as variates or factors|
VSUMMARY forms data matrices containing summary statistics, rather than the usual tables created by TABULATE. This can be useful if the summary statistics are to be used in a further analysis (e.g. an analysis of variance). The CLASSIFICATION option specifies the classifying factors for the summaries, and the DATA parameter provides the variates or factors to be summarized. The STATISTIC parameter specifies the type of numerical summary: counts, totals, numbers of non-missing values, means, medians, minima, maxima, variances, quantiles, standard deviations, skewness and kurtosis coefficients, and (within-cell) standard errors of means, skewness and kurtosis. The statistic sums is a synonym of totals. The statistic
carry, which applies only to factors, can be used to create summary factors holding the level that occurs in each group. For example, in a field trial with repeated measurements on plots, we would want to carry across the factors that give the replicate and treatment of each plot. If a carried factor varies within the classification groups, a warning is given when the warnings option is set to carry; setting that option to * suppresses the warning. Where levels do vary within a group, the CMETHOD option controls how the summary level for that group is chosen, taking either the minimum or the maximum level present within the group. When quantiles are requested, the PERCENTILE parameter specifies the quantile to be calculated, as a percentage between 0 and 100.
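The carry behaviour described above can be illustrated outside Genstat. The following Python sketch is only an illustration under stated assumptions, not VSUMMARY's implementation: the function name carry_levels, its cmethod argument and the sample plot data are all hypothetical. For each classification group it keeps the level of the carried factor; where the levels vary within a group it issues a warning and keeps the minimum or maximum level.

```python
import warnings
from collections import defaultdict


def carry_levels(groups, levels, cmethod="minimum"):
    """Return one carried level per group (hypothetical helper, not Genstat code).

    groups  -- group label of each unit (the classification)
    levels  -- level of the carried factor for each unit
    cmethod -- "minimum" or "maximum": which level to keep when levels vary
    """
    if cmethod not in ("minimum", "maximum"):
        raise ValueError("cmethod must be 'minimum' or 'maximum'")
    pick = min if cmethod == "minimum" else max

    # Collect the carried levels seen in each group.
    by_group = defaultdict(list)
    for group, level in zip(groups, levels):
        by_group[group].append(level)

    carried = {}
    for group, seen in by_group.items():
        if len(set(seen)) > 1:
            warnings.warn(f"carried factor varies within group {group!r}")
        carried[group] = pick(seen)
    return carried


# Invented data: repeated measurements on three plots.
# Plot 3 has inconsistent treatment levels.
plot = [1, 1, 2, 2, 3, 3]
treatment = [1, 1, 2, 2, 1, 2]

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # analogous to suppressing the warning
    low = carry_levels(plot, treatment, cmethod="minimum")
    high = carry_levels(plot, treatment, cmethod="maximum")

print(low)   # {1: 1, 2: 2, 3: 1}
print(high)  # {1: 1, 2: 2, 3: 2}
```

Plots 1 and 2 carry their single treatment level unchanged; only plot 3, where the levels disagree, depends on the choice of minimum or maximum.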
The NEWDATA parameter saves the summary statistics, and the NEWCLASSIFICATION option saves new factors that give the levels of the classifying factors for the summaries. Neither needs to be set if you set REDEFINE=yes: the DATA and CLASSIFICATION structures are then redefined to hold the summary statistics and factors respectively.
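To make the data-matrix layout concrete, here is a rough Python sketch of the idea rather than of Genstat itself. The function summarize, the sample values and the column names are hypothetical, and missing values are not handled. It returns one row per combination of classifying levels that occurs in the data: new classifying columns plus one summary column per data variate, which is the shape that NEWCLASSIFICATION and NEWDATA are described as saving.

```python
from statistics import median


def summarize(classifiers, data, stat=median):
    """Summarize each data column within groups (hypothetical helper).

    classifiers -- dict of name -> list of factor levels, one per unit
    data        -- dict of name -> list of values, one per unit
    stat        -- function reducing a list of values to one number
    Returns a dict of columns with one entry per observed group.
    """
    names = list(classifiers)
    # One key per unit: the tuple of its classifying levels.
    keys = list(zip(*(classifiers[n] for n in names)))
    groups = sorted(set(keys))  # only combinations present in the data

    # New classifying columns: one level per group.
    out = {n: [g[i] for g in groups] for i, n in enumerate(names)}
    # Summary columns: the statistic of each variate within each group.
    for name, values in data.items():
        out[name] = [
            stat([v for k, v in zip(keys, values) if k == g]) for g in groups
        ]
    return out


# Invented data: five units classified by two factors.
factors = {
    "Gender": ["F", "F", "M", "M", "M"],
    "Qualification": ["School", "School", "School", "Degree", "Degree"],
}
variates = {"Hours": [30, 40, 38, 45, 35]}

matrix = summarize(factors, variates)
print(matrix)
```

Passing stat=max or stat=len in place of the default would give maxima or counts in the same layout.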
VSUMMARY takes account of any restrictions on the classifying factors or the data being summarized.
CAPTION 'VSUMMARY example','New Zealand income survey summaries'; \
        STYLE=meta,plain
SPLOAD  [PRINT=*] '%Data%/New Zealand Income Survey.GSH'
FOR "Group commands so print out is separate to echoed statements"
  VSUMMARY [PRINT=summaries; CLASS=Gender,Qualification; \
           NEWCLASS=gender,qualification] Age,Hours,Income; \
           STATISTIC=median; NEWDATA=age,hours,income
  VSUMMARY [CLASS=Gender,Qualification,Marital,Ethnicity; REDEFINE=yes] \
           Age,Hours,Income; STATISTIC='mean'
  CAPTION  'Class means'; STYLE=minor
  PRINT    Gender,Qualification,Marital,Ethnicity,Age,Hours,Income; \
           FIELD=7,11,10,10,6,7,8; DECIMALS=0; JUST=4(left),3(right)
ENDFOR