NORMTEST procedure

Performs tests of univariate and/or multivariate Normality (M.S. Ridout).

Option

`PRINT` = string tokens	Allows the required printed output to be selected: test statistics, tables of critical values and the flagging of significant values with stars (`marginal`, `bivariateangle`, `radius`, `critical`, `stars`); default `marg`, `biva`, `radi`

Parameter

`DATA` = variates or pointers	Variates whose univariate Normality is to be tested, or pointers, each to a set of variates whose univariate and/or multivariate Normality is to be tested

Description

This procedure offers three types of test of Normality.

Marginal (univariate) tests – assess the Normality of each variate in turn. The variates are standardized to have mean=0 and variance=1, and then transformed with the NORMAL function (the cumulative distribution function of the standard Normal distribution). Under Normality, the transformed values should resemble a sample from a uniform distribution on (0,1).
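The transformation behind the marginal test can be sketched in a few lines of Python. This is an illustration only, not the procedure's own code: the function name `marginal_uniform_values` and the sample data are hypothetical, and the use of the n-1 divisor for the variance is an assumption.

```python
import math
import statistics

def marginal_uniform_values(x):
    """Standardize x, then map it through the standard Normal CDF."""
    mean = statistics.mean(x)
    sd = statistics.stdev(x)  # assumption: divisor n-1; the procedure may differ
    z = [(v - mean) / sd for v in x]
    # Standard Normal CDF expressed through the error function
    return [0.5 * (1.0 + math.erf(zi / math.sqrt(2.0))) for zi in z]

# Hypothetical sample; under Normality the values in u should look uniform on (0,1)
sample = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.6, 4.4]
u = marginal_uniform_values(sample)
```

The values in `u` are what an EDF statistic would then compare with the uniform distribution.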

Bivariate angle tests – assess the bivariate Normality of each pair of variates in turn. The variates are standardized so that they are uncorrelated and have mean=0 and variance=1. The test is based on the following idea: if x and y are the standardized values, then the angle between the x-axis and the line joining (0,0) to (x,y) should, assuming Normality, be uniformly distributed on (0,2π).
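A minimal Python sketch of the angle calculation for one pair of variates follows. It assumes one particular standardization (the second variate is replaced by its scaled residual after removing its correlation with the first); the procedure may standardize differently, and the function name `bivariate_angles` is hypothetical.

```python
import math
import statistics

def bivariate_angles(x, y):
    """Return the angles of the standardized pairs, scaled from [0, 2*pi) to [0, 1)."""
    n = len(x)
    mx, my = statistics.mean(x), statistics.mean(y)
    sx, sy = statistics.stdev(x), statistics.stdev(y)  # assumption: divisor n-1
    zx = [(v - mx) / sx for v in x]
    zy = [(v - my) / sy for v in y]
    r = sum(a * b for a, b in zip(zx, zy)) / (n - 1)  # sample correlation
    # Remove the part of zy that is linearly predictable from zx, then rescale,
    # so that the pair (zx, wy) is uncorrelated with unit variances
    wy = [(b - r * a) / math.sqrt(1.0 - r * r) for a, b in zip(zx, zy)]
    two_pi = 2.0 * math.pi
    # atan2 returns values in (-pi, pi]; the modulo shifts them onto [0, 2*pi)
    return [(math.atan2(b, a) % two_pi) / two_pi for a, b in zip(zx, wy)]

# Hypothetical data: four points lying on the axes
angles = bivariate_angles([1.0, 0.0, -1.0, 0.0], [0.0, 1.0, 0.0, -1.0])
```

Dividing by 2π puts the angles on the unit interval, so the same uniform-based EDF statistics can be applied as in the marginal test.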

Radius test – provides a single overall test of multivariate Normality. The variates are again standardized to have mean=0 and so that their covariance matrix is the identity matrix. The test uses the fact that if z₁, z₂,…, z_n are the standardized values of the n variates for a single unit, then z₁² + z₂² + … + z_n² should, under multivariate Normality, be approximately distributed as chi-square on n degrees of freedom.
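The squared radii can be sketched in Python as below. This is an illustration under stated assumptions, not the procedure's implementation: the covariance matrix uses the n-1 divisor, the standardization uses a Cholesky factor (the squared radius does not depend on which matrix square root is chosen), and the names `squared_radii` and `chi2_cdf` are hypothetical. In the code, `p` is the number of variates (written n in the text above) and `n` is the number of units.

```python
import math

def squared_radii(rows):
    """rows: one list of p values per unit. Returns each unit's squared radius."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[j] * c[k] for c in centred) / (n - 1) for k in range(p)]
           for j in range(p)]
    # Cholesky factor: cov = L L^T with L lower triangular
    L = [[0.0] * p for _ in range(p)]
    for j in range(p):
        for k in range(j + 1):
            s = cov[j][k] - sum(L[j][m] * L[k][m] for m in range(k))
            L[j][k] = math.sqrt(s) if j == k else s / L[k][k]
    radii = []
    for c in centred:
        z = []
        for j in range(p):  # forward substitution solves L z = c
            z.append((c[j] - sum(L[j][m] * z[m] for m in range(j))) / L[j][j])
        radii.append(sum(v * v for v in z))
    return radii

def chi2_cdf(x, df):
    """Chi-square CDF via the power series for the regularized lower gamma function."""
    a, h = df / 2.0, x / 2.0
    if h <= 0.0:
        return 0.0
    term = total = 1.0 / a
    k = 0
    while term > 1e-16 * total:
        k += 1
        term *= h / (a + k)
        total += term
    return total * math.exp(-h + a * math.log(h) - math.lgamma(a))

# Hypothetical data: 6 units measured on 2 variates
rows = [[1.0, 2.0], [2.0, 1.5], [3.0, 3.5], [4.0, 3.0], [5.0, 5.5], [6.0, 4.0]]
radii = squared_radii(rows)
u = [chi2_cdf(r2, df=2) for r2 in radii]  # should look uniform under Normality
```

Passing the squared radii through the chi-square distribution function again yields values that can be compared with a uniform distribution on (0,1).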

For each type of test, the test statistics are empirical distribution function (EDF) statistics: they compare the empirical distribution function of the sample with the theoretical distribution expected under the null hypothesis. Three EDF statistics are provided for each type of test: the Anderson-Darling statistic, the Cramér-von Mises statistic and the Watson statistic. Providing all three is intended to give good power against a wide range of alternatives. The test statistics are adjusted so that their null distribution is approximately independent of the sample size; critical values can be printed by the procedure (option PRINT=critical).
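For reference, the standard unmodified forms of the three EDF statistics, for values that should be uniform on (0,1) under the null hypothesis, can be sketched in Python as follows. The function name `edf_statistics` is hypothetical, and the sample-size adjustments mentioned above are deliberately not applied here, so the results are not directly comparable with the procedure's printed values.

```python
import math

def edf_statistics(u):
    """Unmodified EDF statistics for testing uniformity on (0,1)."""
    u = sorted(u)
    n = len(u)
    # Cramer-von Mises W^2: squared distance from the expected plotting positions
    w2 = sum((ui - (2 * i - 1) / (2.0 * n)) ** 2
             for i, ui in enumerate(u, start=1)) + 1.0 / (12.0 * n)
    # Watson U^2: W^2 corrected for the shift of the sample mean from 0.5
    u2 = w2 - n * (sum(u) / n - 0.5) ** 2
    # Anderson-Darling A^2: gives extra weight to the tails
    a2 = -n - sum((2 * i - 1) * (math.log(u[i - 1]) + math.log(1.0 - u[n - i]))
                  for i in range(1, n + 1)) / n
    return {"anderson_darling": a2, "cramer_von_mises": w2, "watson": u2}

# Hypothetical input: ten perfectly evenly spaced values, the most uniform case
grid = [(2 * i - 1) / 20.0 for i in range(1, 11)]
stats = edf_statistics(grid)
```

Large values of any of the three statistics indicate departure from uniformity, and hence from Normality of the original data.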

The DATA parameter is used to indicate the variate(s) whose Normality is to be assessed. If a single variate is supplied, its Normality is tested using the marginal test. Alternatively, DATA can supply a pointer to a set of variates to be tested for multivariate Normality.

The PRINT option can be used to select the type of test using the settings marginal, bivariateangle and radius. The setting critical allows tables of critical values to be printed, and stars requests that significant values of the test statistics be flagged with stars. Settings bivariateangle and radius are relevant only when testing for multivariate Normality. By default PRINT=marginal,bivariateangle,radius.

Option: PRINT.

Parameter: DATA.

Method

The calculations are clearly set out in Aitchison (1986; Section 7.3). Bivariate angle and radius tests are described by Andrews, Gnanadesikan & Warner (1973). Stephens (1974) describes the EDF statistics used and gives tables of critical values and information on their comparative power.

Action with `RESTRICT`

If a variate to which the DATA parameter is set is restricted, the tests will be calculated using only the units included by the restriction. Similarly, the variates in a DATA pointer can be restricted, but they must then all be restricted in the same way. The procedure does not work properly with missing values. If missing values are present, RESTRICT should be used (before calling the procedure) to exclude all units for which any of the variates has a missing value.

References

Aitchison J.A. (1986). The Statistical Analysis of Compositional Data. London: Chapman & Hall.

Andrews D.F., Gnanadesikan R. & Warner J.L. (1973). Methods for assessing multivariate normality. In: Multivariate Analysis III (ed. P.R. Krishnaiah) 95-116. New York: Academic Press.

Stephens M.A. (1974). EDF statistics for goodness of fit and some comparisons. Journal of the American Statistical Association, 69, 730-737.

Example

CAPTION 'NORMTEST example',\
        !t('Data from Aitchison (1986, The Statistical Analysis of',\
        'Compositional Data, pages 354-355), percentages by weight',\
        'of A,B,C,D,E in 25 samples of Hongite.'); STYLE=meta,plain
VARIATE [NVALUES=25] V[1...5]
READ    [SERIAL=yes] V[]
48.8 48.2 37.0 50.9 44.2 52.3 44.6 34.6 41.2 42.6 49.9 45.2 32.7 41.4 46.2
32.3 43.2 49.5 42.3 44.6 45.8 49.9 48.6 45.5 45.9 :
31.7 23.8 9.1 23.8 38.3 26.2 33.0 5.2 11.7 46.6 19.5 37.3 8.5 12.9 17.5 7.3
44.3 32.3 15.8 11.5 16.6 25.0 34.0 16.6 24.9 :
3.8 9.0 34.2 7.2 2.9 4.2 4.6 42.9 26.7 0.7 11.4 2.7 38.9 23.4 15.8 40.9 1.0
3.1 20.4 23.8 16.8 6.8 2.5 17.6 9.7 :
6.4 9.2 9.5 10.1 7.7 12.5 12.2 9.6 9.6 5.6 9.5 5.5 8.0 15.8 8.3 12.9 7.8 8.7
8.3 11.6 12.0 10.9 9.4 9.6 9.8 :
9.3 9.8 10.2 8.0 6.9 4.8 5.6 7.7 10.8 4.5 9.7 9.3 11.9 6.5 12.2 6.6 3.7 6.3
13.2 8.5 8.8 7.4 5.5 10.7 9.7 :
 " Transform data, as on page 142, and test for multivariate normality."
CALCULATE Y[1...4]=LOG(V[1...4]/V[5])
NORMTEST  [PRINT=marginal,bivariateangle,radius,critical] Y
CAPTION   !t(\
'The results agree with Table 7.4 except those for the Cramer-von Mises and',\
'Watson forms of the bivariate angle and radius tests. This appears to be',\
'because in forming the modified test statistics there has been a DIVISION ',\
'by (1+1/N) or (1+0.8/N) instead of a MULTIPLICATION by these quantities',\
'(see Aitchison, Table 7.3; also Stephens, 1974, J.A.S.A., 69, p.732).')

Updated on March 7, 2019
