Performs tests of univariate and/or multivariate Normality (M.S. Ridout).
Option
PRINT = string tokens
Allows the required printed output to be selected: test statistics, tables of critical values and the flagging of significant values with stars (marginal, bivariateangle, radius, critical, stars); default marg, biva, radi
Parameter
DATA = variates or pointers
Variates whose univariate Normality is to be tested, or pointers, each to a set of variates whose univariate and/or multivariate Normality is to be tested
Description
This procedure offers three types of test of Normality.
Marginal (univariate) tests – assess the Normality of each variate in turn. The variates are standardized to have mean=0, variance=1 and then transformed with the NORMAL function. The test is based on the idea that, assuming Normality, these transformed values should look like a sample from a uniform distribution on (0,1).
Bivariate angle tests – assess the bivariate Normality of each pair of variates in turn. The variates are standardized so that they are uncorrelated and have mean=0 and variance=1. The test is based on the following idea: if x and y are the standardized values, then the angle between the x-axis and the line joining (0,0) to (x,y) should, assuming Normality, be uniformly distributed on (0,2π).
Radius test – provides a single overall test of multivariate Normality. The variates are again standardized to have mean=0 and so that their covariance matrix is the identity matrix. The test uses the fact that if z1, z2, …, zn are the standardized values then z1² + z2² + … + zn² should, under multivariate Normality, be approximately distributed as chi-square on n degrees of freedom.
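The three transformations above can be sketched for a pair of variates as follows. This is an illustrative Python sketch, not the procedure's own code: the function names are hypothetical, the variance divisor (n rather than n-1) and the Cholesky-style decorrelation are assumptions, and the radius example is restricted to two variates so that the chi-square distribution function on 2 degrees of freedom has the closed form 1 - exp(-r²/2). Each function returns values that should look like a uniform sample on (0,1) if the data are Normal.

```python
import math
from statistics import NormalDist


def standardize(x):
    """Centre and scale one variate to mean 0, variance 1 (divisor n: an assumption)."""
    n = len(x)
    m = sum(x) / n
    s = math.sqrt(sum((v - m) ** 2 for v in x) / n)
    return [(v - m) / s for v in x]


def marginal_uniforms(x):
    """Marginal test: standard Normal distribution function of each standardized value."""
    phi = NormalDist().cdf
    return [phi(z) for z in standardize(x)]


def decorrelate(x, y):
    """Standardize a pair so that both have mean 0, variance 1 and zero correlation."""
    zx, zy = standardize(x), standardize(y)
    n = len(zx)
    r = sum(a * b for a, b in zip(zx, zy)) / n  # sample correlation of the pair
    # Remove from zy the part explained by zx, then rescale to unit variance.
    wy = [(b - r * a) / math.sqrt(1 - r * r) for a, b in zip(zx, zy)]
    return zx, wy


def angle_uniforms(x, y):
    """Bivariate angle test: angle of each point from the x-axis, scaled from (0,2*pi) to (0,1)."""
    zx, wy = decorrelate(x, y)
    two_pi = 2 * math.pi
    return [(math.atan2(b, a) % two_pi) / two_pi for a, b in zip(zx, wy)]


def radius_uniforms(x, y):
    """Radius test (two variates): chi-square(2) distribution function of the squared radius."""
    zx, wy = decorrelate(x, y)
    return [1 - math.exp(-(a * a + b * b) / 2) for a, b in zip(zx, wy)]
```

For example, `marginal_uniforms(x)`, `angle_uniforms(x, y)` and `radius_uniforms(x, y)` each give one value per unit, and those values can then be compared with the uniform distribution on (0,1).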
For each type of test, the test statistics are empirical distribution function (EDF) statistics – i.e. they compare the empirical distribution function of the sample with the theoretical distribution expected under the null hypothesis. Three EDF statistics are provided for each type of test – the Anderson-Darling statistic, the Cramer-von Mises statistic and the Watson statistic. The idea is to provide good power against a wide range of alternatives. The test statistics are adjusted so that their null distribution is independent of the sample size; critical values can be printed by the procedure (option PRINT=critical).
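As an illustration of the three EDF statistics, the sketch below computes the standard unadjusted forms given by Stephens (1974) for values that should be uniform on (0,1) under the null hypothesis. It is a Python sketch with a hypothetical function name, not the procedure's code, and it deliberately leaves out the sample-size adjustments that the procedure applies before comparing with critical values.

```python
import math


def edf_statistics(u):
    """Return (A2, W2, U2): unadjusted Anderson-Darling, Cramer-von Mises and
    Watson statistics for values u expected to be uniform on (0,1)."""
    u = sorted(u)
    n = len(u)

    # Cramer-von Mises: squared distance of each ordered value from its
    # expected plotting position (2i - 1) / (2n).
    w2 = sum((ui - (2 * i - 1) / (2 * n)) ** 2
             for i, ui in enumerate(u, start=1)) + 1 / (12 * n)

    # Watson: Cramer-von Mises corrected for the mean, which makes it
    # unchanged when all values are shifted round the circle.
    ubar = sum(u) / n
    u2 = w2 - n * (ubar - 0.5) ** 2

    # Anderson-Darling: gives more weight to discrepancies in the tails.
    a2 = -n - sum((2 * i - 1) * (math.log(u[i - 1]) + math.log(1 - u[n - i]))
                  for i in range(1, n + 1)) / n

    return a2, w2, u2
```

The Watson statistic does not depend on where the origin of a circular scale is placed, which is one reason it is a natural choice for angles.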
The DATA parameter is used to indicate the variate(s) whose Normality is to be assessed. If a single variate is supplied, its Normality is tested using the marginal test. Alternatively, DATA can supply a pointer to a set of variates to be tested for multivariate Normality.
The PRINT option can be used to select the type of test using the settings marginal, bivariateangle and radius. The setting critical allows tables of critical values to be printed, and stars requests that significant values of the test statistics be flagged with stars. Settings bivariateangle and radius are relevant only when testing for multivariate Normality. By default PRINT=marginal,bivariateangle,radius.
Option: PRINT.
Parameter: DATA.
Method
The calculations are clearly set out in Aitchison (1986; Section 7.3). Bivariate angle and radius tests are described by Andrews, Gnanadesikan & Warner (1973). Stephens (1974) describes the EDF statistics used and gives tables of critical values and information on their comparative power.
Action with RESTRICT
If a variate to which the DATA parameter is set is restricted, the tests will be calculated using only the units included by the restriction. Similarly, the variates in a DATA pointer can be restricted, but they must then all be restricted in the same way. The procedure does not work properly with missing values. If missing values are present, RESTRICT should be used (before calling the procedure) to exclude all units for which any of the variates has a missing value.
References
Aitchison J.A. (1986). The Statistical Analysis of Compositional Data. London: Chapman & Hall.
Andrews D.F., Gnanadesikan R. & Warner J.L. (1973). Methods for assessing multivariate normality. In: Multivariate Analysis III (ed. P.R. Krishnaiah) 95-116. New York: Academic Press.
Stephens M.A. (1974). EDF statistics for goodness of fit and some comparisons. Journal of the American Statistical Association, 69, 730-737.
See also
Directive: DISTRIBUTION.
Procedures: EDFTEST, WSTATISTIC.
Commands for: Basic and nonparametric statistics.
Example
CAPTION 'NORMTEST example',\
        !t('Data from Aitchison (1986, The Statistical Analysis of',\
        'Compositional Data, pages 354-355), percentages by weight',\
        'of A,B,C,D,E in 25 samples of Hongite.'); STYLE=meta,plain
VARIATE [NVALUES=25] V[1...5]
READ    [SERIAL=yes] V[]
48.8 48.2 37.0 50.9 44.2 52.3 44.6 34.6 41.2 42.6 49.9 45.2 32.7 41.4 46.2 32.3 43.2 49.5 42.3 44.6 45.8 49.9 48.6 45.5 45.9 :
31.7 23.8 9.1 23.8 38.3 26.2 33.0 5.2 11.7 46.6 19.5 37.3 8.5 12.9 17.5 7.3 44.3 32.3 15.8 11.5 16.6 25.0 34.0 16.6 24.9 :
3.8 9.0 34.2 7.2 2.9 4.2 4.6 42.9 26.7 0.7 11.4 2.7 38.9 23.4 15.8 40.9 1.0 3.1 20.4 23.8 16.8 6.8 2.5 17.6 9.7 :
6.4 9.2 9.5 10.1 7.7 12.5 12.2 9.6 9.6 5.6 9.5 5.5 8.0 15.8 8.3 12.9 7.8 8.7 8.3 11.6 12.0 10.9 9.4 9.6 9.8 :
9.3 9.8 10.2 8.0 6.9 4.8 5.6 7.7 10.8 4.5 9.7 9.3 11.9 6.5 12.2 6.6 3.7 6.3 13.2 8.5 8.8 7.4 5.5 10.7 9.7 :
" Transform data, as on page 142, and test for multivariate normality."
CALCULATE Y[1...4]=LOG(V[1...4]/V[5])
NORMTEST  [PRINT=marginal,bivariateangle,radius,critical] Y
CAPTION   !t(\
  'The results agree with Table 7.4 except those for the Cramer-von Mises and',\
  'Watson forms of the bivariate angle and radius tests. This appears to be',\
  'because in forming the modified test statistics there has been a DIVISION ',\
  'by (1+1/N) or (1+0.8/N) instead of a MULTIPLICATION by these quantities',\
  '(see Aitchison, Table 7.3; also Stephens, 1974, J.A.S.A., 69, p.732).')