HANOVA procedure

Does hierarchical analysis of variance/covariance for unbalanced data (P.W. Lane).

Options

`PRINT` = string token	Which analyses to print (`all`, `some`, `none`); default `all`
`INCHANNEL` = scalar	Channel from which to read data; default `*` specifies that the data values are already stored in the factors and variates specified by the parameters of `HANOVA`
`FORMAT` = variate	Format for reading data; default `*` requests free format
`ANALYSIS` = symmetric matrix	For `PRINT=some`, this indicates which analyses to print
`SSPM` = SSPM	Stores the corrected sums of squares and products; default `*`
`COEFFICIENT` = matrix	Stores the estimated variance and covariance components; default `*`

Parameters

`VARIATES` = pointers	Variates to be analysed
`FACTORS` = pointers	Factors defining the hierarchy, the first factor of the pointer defining the first stratum, and so on

Description

Procedure HANOVA performs hierarchical analysis of variance and covariance, estimating the components of variance corresponding to each level of a nested classification. It uses the method of Gower (1962), which is based on the method of moments. This method is less efficient than REML, and may produce different results. However, it does not require the assumption of Normal distributions for the random terms.

Data are said to be classified hierarchically if the units have several groupings successively nested within each other. One way of representing such a classification is to identify the groupings in each stratum of the hierarchy by a single factor; two units with the same value for one of the factors are then required to have identical values for the factors representing the previous strata. An alternative method is to use not only the factor for the current stratum, but also the factors for previous strata, to indicate the groupings that occur there. For example, the following classifications are effectively equivalent:

	Representation (1)		Representation (2)
Unit	Factor 1	Factor 2	Factor 1	Factor 2
	(stratum 1)	(stratum 2)	(stratum 1)	(stratum 2)
1	1	1	1	1
2	1	1	1	1
3	1	2	1	2
4	2	3	2	1
5	2	4	2	2

Thus, in the second form of representation, the second factor indicates the sub-divisions within each group in the first stratum, using the same levels each time. This more efficient method is the one required by HANOVA.
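As a minimal sketch, the second form of representation in the table above could be declared as follows. The identifiers `Factor1` and `Factor2` are chosen here purely for illustration.

```genstat
"Stratum 1: the coarsest grouping of the five units"
FACTOR [LEVELS=2; VALUES=1,1,1,2,2] Factor1
"Stratum 2: sub-divisions within each group of Factor1, reusing levels 1 and 2"
FACTOR [LEVELS=2; VALUES=1,1,2,1,2] Factor2
```

In the first form, the second factor would instead need four levels (values 1,1,2,3,4), one for each distinct sub-group.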

The simplest way to use HANOVA is to set the VARIATES parameter to a single variate (or to a pointer if several variates are to be analysed), and to set the FACTORS parameter to a pointer of factors. The factors must be in the order of the hierarchy, with the first factor defining the coarsest grouping of the units and each succeeding factor nested within those before it. The units of data stored in the variates and factors can be in any order.

Since hierarchical data can often be extensive, HANOVA can be requested to read the data sequentially, tabulating them with respect to the factors, so that the data need not all be held in memory at the same time. The INCHANNEL option defines the channel number of the file from which the data are to be read; if INCHANNEL is not set, the data are assumed to be present already, in the variates and factors specified by the VARIATES and FACTORS parameters. The FORMAT option allows a variate to be specified for use in the FORMAT option of the READ directive within the procedure; if this is not set, the default format of READ (free format) is assumed.

If a unit has a missing value for any of the variates or factors, it is omitted from all the analyses. The procedure carries out analyses of variance for specified variates, and of covariance for specified pairs of variates. Variance components are calculated for each stratum: that is, the proportion of the total variance per individual ascribable to the various strata of the classification.

Output is controlled by the PRINT option: by default, the matrix of coefficients of variance components is printed, followed by an analysis of variance of each variate and of covariance of each pair of variates. To obtain only some of the analyses, option PRINT should be set to some, and the ANALYSIS option to a symmetric matrix with numbers of rows and columns equal to the number of variates. A non-zero value in the matrix indicates that the corresponding analysis of variance or covariance is to be displayed. Printed output can be suppressed by setting PRINT=none.
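The following sketch shows one way the selective printing described above might be requested, using the variates and factors declared in the Example section. The identifier `show` is hypothetical, and the sketch assumes that the rows of the matrix correspond to the variates in the order given by VARIATES. Genstat stores symmetric matrices as the lower triangle, row by row.

```genstat
"Lower triangle by rows: (v1,v1), (v2,v1), (v2,v2)"
SYMMETRICMATRIX [ROWS=2; VALUES=1,0,1] show
"Print the analyses of variance of v1 and v2, but not their covariance"
HANOVA [PRINT=some; ANALYSIS=show] !p(v1,v2); !p(f1,f2,f3)
```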

The matrix of coefficients can be saved using the COEFFICIENT option, and the sums of squares and products of the variates using the SSPM option.
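As an illustrative sketch, the results could be saved without any printed output as shown below. The identifiers `coef` and `sscp` are hypothetical, and the sketch assumes that the procedure declares the output structures itself, so that they need not be declared in advance.

```genstat
"Suppress printed output and save the results instead"
HANOVA [PRINT=none; COEFFICIENT=coef; SSPM=sscp] !p(v1,v2); !p(f1,f2,f3)
"Display the saved matrix"
PRINT coef
```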

Options: PRINT, INCHANNEL, FORMAT, ANALYSIS, COEFFICIENT, SSPM.

Parameters: VARIATES, FACTORS.

Method

HANOVA uses the method described by Gower (1962).

Action with `RESTRICT`

Account is taken of any restriction on a factor, or on the first variate in the VARIATES parameter; subsequent variates must either have the same restriction or be unrestricted.
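A minimal sketch of analysing a subset of the units, using the structures from the Example section, might look like this. The choice of subset is arbitrary and serves only to illustrate the mechanism.

```genstat
"Restrict the last factor so that units with f3 equal to 3 are excluded"
RESTRICT f3; CONDITION=f3 .NE. 3
HANOVA !p(v1,v2); !p(f1,f2,f3)
"Remove the restriction afterwards"
RESTRICT f3
```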

Reference

Gower, J.C. (1962). Variance component estimation for unbalanced hierarchical classifications. Biometrics, 18, 537-542.

Example

CAPTION  'HANOVA example',\
         !t('Analysis of variance and covariance of two',\
         'variables grouped in a four-stratum hierarchy.',\
         'Data from Gower, J.C. (1962, Biometrics 18, 537).');\
         STYLE=meta,plain
FACTOR   [LEVELS=2; VALUES=1,1,1,1,1,1,1,1,2,2,2,2,2,2] f1
&        [LEVELS=2; VALUES=1,1,1,1,2,2,2,2,1,1,1,2,2,2] f2
&        [LEVELS=3; VALUES=1,1,2,3,1,2,2,3,1,2,3,1,1,1] f3
VARIATE  [VALUES=3,4,6,3,3,2,1,2,8,12,11,14,12,15] v1
&        [VALUES=5,24,7,25,57,42,18,14,-12,-34,-14.5,-42.2,-21.5,-12] v2
HANOVA   !p(v1,v2); !p(f1,f2,f3)

Updated on March 7, 2019
