Estimates missing values for units in a multivariate data set (H.R. Simpson & R.P. White).
|MAXCYCLE|Defines the maximum allowed number of iterations; default 10|
|DATA|Each pointer contains a set of variates whose missing values are to be estimated; these will be overwritten by the estimates unless the OUT parameter is set|
|OUT|Each pointer contains a set of variates to hold the results|
MULTMISSING estimates missing values for units in a multivariate data set, using an iterative regression technique. The input for the procedure is a set of variates contained in a pointer specified by the DATA parameter. The output can be saved in a different set of variates by supplying a similar pointer with the parameter OUT; if this is absent, the output values will overwrite the values of the variates given by DATA. The maximum number of iterations is set by the option MAXCYCLE, with a default of 10. If MAXCYCLE is set to zero, missing values will be replaced by variate means calculated from the units that have no values missing for any of the variates.
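The MAXCYCLE=0 behaviour can be illustrated outside Genstat with a short Python sketch. This is not the procedure's own code: the function name fill_with_complete_case_means and the use of None to mark a missing value are assumptions made for the illustration. The data are the three variates from the example at the end of this page.

```python
def fill_with_complete_case_means(columns):
    """Replace None in each variate by the mean over fully observed units.

    Hypothetical helper for illustration only; None marks a missing value.
    """
    n_units = len(columns[0])
    # Units that have no missing value in any of the variates.
    complete = [u for u in range(n_units)
                if all(col[u] is not None for col in columns)]
    if not complete:
        raise ValueError("no unit is complete for every variate")
    # One mean per variate, using the complete units only.
    means = [sum(col[u] for u in complete) / len(complete) for col in columns]
    return [[m if x is None else x for x in col]
            for col, m in zip(columns, means)]


v = [[1, 2, 5, 6, 4], [2, None, 6, 8, 6], [3, 4, 7, None, 8]]
filled = fill_with_complete_case_means(v)
print(filled)
```

Only units 1, 3 and 5 are complete here, so each mean is taken over those three units rather than over every observed value of the variate.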
Initial estimates of the missing values in each variate are the variate means, calculated from the units that have no missing values for any variate. Estimates of the missing values for each variate are then recalculated as the fitted values from the multiple regression of that variate on all the other variates. When all the missing values have been estimated, the variate means are recalculated. If any of the means differs from the previous mean by more than a tolerance (the initial standard error divided by 1000), the process is repeated, subject to a maximum number of repetitions defined by the MAXCYCLE option.
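The iteration described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure's own code. The names solve and impute are hypothetical and None marks a missing value. Each regression is assumed to be fitted on the units where the response was observed. The "initial standard error" is assumed to be the standard error of the mean over the complete units. The "previous mean" in the first cycle is assumed to be the mean of the data after the initial fill.

```python
import math


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(m[r][i]))
        if abs(m[p][i]) < 1e-12:
            raise ValueError("regression is singular")
        m[i], m[p] = m[p], m[i]
        for r in range(i + 1, n):
            f = m[r][i] / m[i][i]
            for c in range(i, n + 1):
                m[r][c] -= f * m[i][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][c] * x[c] for c in range(i + 1, n))) / m[i][i]
    return x


def impute(columns, max_cycle=10):
    """Iterative regression imputation (illustrative sketch, see assumptions)."""
    n_units = len(columns[0])
    missing = [[x is None for x in col] for col in columns]
    complete = [u for u in range(n_units) if not any(mk[u] for mk in missing)]
    if not complete:
        raise ValueError("no unit is complete for every variate")
    k = len(complete)
    # Initial estimates: variate means over the complete units.
    start = [sum(col[u] for u in complete) / k for col in columns]
    data = [[start[j] if x is None else float(x) for x in col]
            for j, col in enumerate(columns)]
    # Assumed tolerance: standard error of each initial mean, divided by 1000.
    tol = []
    for j, col in enumerate(columns):
        ss = sum((col[u] - start[j]) ** 2 for u in complete)
        tol.append(math.sqrt(ss / (k - 1) / k) / 1000 if k > 1 else 0.0)
    means = [sum(col) / n_units for col in data]
    for _ in range(max_cycle):
        for j in range(len(data)):
            if not any(missing[j]):
                continue
            # Design matrix: constant plus all other variates (current values).
            others = [c for i, c in enumerate(data) if i != j]
            rows = [[1.0] + [c[u] for c in others] for u in range(n_units)]
            fit = [u for u in range(n_units) if not missing[j][u]]
            p = len(rows[0])
            # Normal equations X'X beta = X'y over the units used for fitting.
            xtx = [[sum(rows[u][r] * rows[u][c] for u in fit) for c in range(p)]
                   for r in range(p)]
            xty = [sum(rows[u][r] * data[j][u] for u in fit) for r in range(p)]
            beta = solve(xtx, xty)
            # Replace each missing value by its fitted value.
            for u in range(n_units):
                if missing[j][u]:
                    data[j][u] = sum(bc * xv for bc, xv in zip(beta, rows[u]))
        new_means = [sum(col) / n_units for col in data]
        converged = all(abs(n - o) <= t for n, o, t in zip(new_means, means, tol))
        means = new_means
        if converged:
            break
    return data


v = [[1, 2, 5, 6, 4], [2, None, 6, 8, 6], [3, 4, 7, None, 8]]
for column in impute(v):
    print([round(x, 2) for x in column])
```

Calling impute(v, max_cycle=0) skips the loop entirely and returns the mean-filled data, which mirrors the documented behaviour of MAXCYCLE=0. Because of the assumptions listed above, the printed estimates need not agree exactly with the values MULTMISSING produces.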
The default maximum number of iterations (10) is usually sufficient when there are few missing values, say two or three. If there are many more, 20 or so, it may be necessary to increase the maximum number of iterations to around 30.
The method is similar to that of Orchard & Woodbury (1972), but does not adjust for bias in the variance-covariance matrix as suggested by Beale & Little (1975).
All the variates must be unrestricted, or they must all be restricted to the same set of units; otherwise a fault will occur in a CALCULATE statement within the procedure.
Beale, E.M.L. & Little, R.J.A. (1975). Missing values in multivariate analysis. Journal of the Royal Statistical Society, Series B, 37, 129-145.
Orchard, T. & Woodbury, M.A. (1972). A missing information principle: theory and applications. In: Proceedings of the 6th Berkeley Symposium on Mathematical Statistics and Probability, Vol. 1, 697-715.
CAPTION 'MULTMISSING example',\
        'There are three variates, two having one missing value each.';\
        STYLE=meta,plain
VARIATE V[1...3]; VALUES=!(1,2,5,6,4),!(2,*,6,8,6),!(3,4,7,*,8)
PRINT V; FIELDWIDTH=8; DECIMALS=2
MULTMISSING V
PRINT V; FIELDWIDTH=8; DECIMALS=2