Performs generalized calibration of survey data (S.D. Langton).
|Controls printed output (
||Controls which high-resolution graphs are plotted (
||Stratification factor; default
||Factors indicating the sampling units in a two-stage design; default
||Constraint totals or tables|
||Variates corresponding to
||Final (calibration) weights|
||Method to use (
||Lower bound for g-weights; default 0.1|
||Upper bound for g-weights; default 10|
||Maximum number of iterations; default 50|
||Tolerance for convergence; default 0.0001|
||Response data for analysis|
||Saves estimated totals|
||Saves standard errors of totals|
||Saves fitted values from the regression|
SVCALIBRATE performs calibration estimation of survey data (Deville & Sarndal 1992). The sampling weights from a survey are often adjusted to ensure that they produce estimates that match known population totals. For example, if in an agricultural survey the sampling weights are applied to the areas of the sampled farms, the resulting estimate will not generally exactly equal the known total agricultural area in the population, and so an adjustment is required. Calibration calculates adjusted weights that ensure the constraints are met, while remaining as close as possible to the original sampling weights.
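The mismatch that motivates calibration can be checked directly. The following Python sketch (not Genstat code) applies the initial weights from the example at the end of this page to its four x-variates and compares the resulting totals with the constraint totals 50, 20, 230 and 35. The names DATA, TARGETS and weighted_totals are illustrative and are not part of the procedure.

```python
# Rows are (x1, x2, x3, x4, initial weight) for units k = 1..20,
# copied from the READ statement in the SVCALIBRATE example.
DATA = [
    (1, 1, 0, 0, 3), (0, 1, 0, 0, 3), (1, 0, 2, 0, 5), (0, 0, 6, 1, 4),
    (1, 0, 4, 1, 2), (1, 1, 0, 0, 5), (1, 0, 5, 0, 5), (0, 0, 6, 1, 4),
    (0, 1, 0, 0, 3), (0, 0, 3, 1, 3), (1, 0, 2, 0, 5), (1, 1, 0, 1, 4),
    (1, 0, 3, 1, 4), (1, 0, 4, 0, 3), (0, 0, 5, 0, 5), (0, 1, 0, 1, 3),
    (1, 0, 2, 1, 4), (0, 0, 6, 0, 5), (1, 0, 4, 1, 4), (0, 1, 0, 0, 3),
]
# Constraint totals from the SCALAR statement in the example.
TARGETS = [50, 20, 230, 35]


def weighted_totals(x_rows, weights):
    """Return the weighted total of each x-variate."""
    n_x = len(x_rows[0])
    return [sum(w * x[j] for x, w in zip(x_rows, weights)) for j in range(n_x)]


x_rows = [row[:4] for row in DATA]
init_w = [row[4] for row in DATA]

initial_totals = weighted_totals(x_rows, init_w)
# Amount by which each weighted total must change to meet its constraint.
shortfall = [t - s for t, s in zip(TARGETS, initial_totals)]

print(initial_totals)  # [44, 24, 213, 32]
print(shortfall)       # [6, -4, 17, 3]
```

None of the four totals matches its constraint, so the weights have to be adjusted.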
The TCONSTRAINTS option is used to specify the constraints, either in a scalar to provide a total for the whole population, or in a table specifying totals for subgroups defined by the classification factors of the table. The X option specifies a list of variates (in parallel) to which the constraints relate, with a null value indicating that the corresponding constraint relates to a count of units in the population. If STRATUMFACTOR is set, a separate calibration is performed in each stratum and TCONSTRAINTS must be set to one or more tables, classified by the stratification factor. The SAMPLINGUNITS option can be used to specify primary sampling units in a two-stage design; this information is used only for calculating the standard error of the total and does not affect the calibration process. The WEIGHTS option specifies the initial sampling weights, which will usually be the inverse of the probability of selection of each unit, whilst OUTWEIGHTS returns the adjusted weights.
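As a rough illustration of how initial weights, x-variates and constraint totals combine, the Python sketch below implements the standard linear (generalized regression) calibration of Deville & Sarndal (1992), in which each g-weight is 1 + x'lambda and lambda is chosen so that the constraints are met exactly. It is a sketch under that assumption, not the procedure's own code. The data are those of the example at the end of this page, and the function names solve and linear_calibration are hypothetical.

```python
# x-variates and initial weights for units k = 1..20 (from the example).
X_ROWS = [
    (1, 1, 0, 0), (0, 1, 0, 0), (1, 0, 2, 0), (0, 0, 6, 1), (1, 0, 4, 1),
    (1, 1, 0, 0), (1, 0, 5, 0), (0, 0, 6, 1), (0, 1, 0, 0), (0, 0, 3, 1),
    (1, 0, 2, 0), (1, 1, 0, 1), (1, 0, 3, 1), (1, 0, 4, 0), (0, 0, 5, 0),
    (0, 1, 0, 1), (1, 0, 2, 1), (0, 0, 6, 0), (1, 0, 4, 1), (0, 1, 0, 0),
]
INIT_W = [3, 3, 5, 4, 2, 5, 5, 4, 3, 3, 5, 4, 4, 3, 5, 3, 4, 5, 4, 3]
TARGETS = [50, 20, 230, 35]


def solve(a, b):
    """Solve the linear system a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (m[r][n] - tail) / m[r][r]
    return z


def linear_calibration(x_rows, weights, targets):
    """Return (adjusted weights, g-weights) for linear calibration."""
    p = len(targets)
    # Cross-product matrix: sum over units of w_k * x_k * x_k'.
    cross = [[sum(w * x[i] * x[j] for x, w in zip(x_rows, weights))
              for j in range(p)] for i in range(p)]
    current = [sum(w * x[j] for x, w in zip(x_rows, weights)) for j in range(p)]
    lam = solve(cross, [t - c for t, c in zip(targets, current)])
    g = [1.0 + sum(l * xj for l, xj in zip(lam, x)) for x in x_rows]
    return [w * gk for w, gk in zip(weights, g)], g


calibrated_w, g_weights = linear_calibration(X_ROWS, INIT_W, TARGETS)
calibrated_totals = [sum(w * x[j] for x, w in zip(X_ROWS, calibrated_w))
                     for j in range(len(TARGETS))]
```

With these weights the totals in calibrated_totals reproduce the constraint totals up to rounding error. Nothing in this calculation bounds the g-weights.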
The METHOD option controls the restrictions on the range of adjustments (the “g-weights”) used to convert the initial to the modified weights, and has three possible settings:
linear produces estimates equivalent to the usual regression estimates; the g-weights are not restricted and may be negative;
truncatedlinear restricts the g-weights to the range specified by the LOWER and UPPER options by replacing extreme values with these bounds;
logistic uses a logit-like transformation to ensure that the weights remain within the specified bounds.
These correspond to methods 1, 5 and 7 respectively of Singh & Mohl (1996). The last two methods use iterative calculations, which are controlled by the options for the maximum number of iterations and the convergence tolerance (TOLERENCE). Progress of the iterations can be viewed using the monitoring setting of PRINT. The default values of LOWER and UPPER are 0.1 and 10, thus allowing the adjusted weights to differ from the initial weights by a factor of ten in either direction.
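The difference between the three settings can be pictured through the function that maps a linear predictor u = x'lambda to a g-weight. The Python sketch below is illustrative only. The truncated form simply clips 1 + u to the bounds, as described above. The logit form is the bounded function given by Deville & Sarndal (1992), and it is an assumption that the procedure's "logit-like transformation" has exactly this shape. The function names are hypothetical, and the bounds 0.8 and 1.25 are those used in the example at the end of this page.

```python
import math


def g_linear(u):
    """Unrestricted g-weight: may fall below zero for large negative u."""
    return 1.0 + u


def g_truncated(u, lower, upper):
    """Linear g-weight with values outside [lower, upper] replaced by the bounds."""
    return min(max(1.0 + u, lower), upper)


def g_logit(u, lower, upper):
    """Bounded logit-type g-weight (Deville & Sarndal form); needs lower < 1 < upper."""
    a = (upper - lower) / ((1.0 - lower) * (upper - 1.0))
    e = math.exp(a * u)
    numerator = lower * (upper - 1.0) + upper * (1.0 - lower) * e
    denominator = (upper - 1.0) + (1.0 - lower) * e
    return numerator / denominator


LOWER, UPPER = 0.8, 1.25
for u in (-2.0, -0.1, 0.0, 0.1, 2.0):
    print(u, g_linear(u), g_truncated(u, LOWER, UPPER), g_logit(u, LOWER, UPPER))
```

All three functions equal 1 at u = 0, so units needing no adjustment keep their initial weights. Only the linear form can be solved in a single step; with the bounded forms the constraints generally have to be re-solved repeatedly, which is consistent with the iterative calculations mentioned above.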
The procedure can be run without setting any parameters, in order to produce adjusted weights for use with SVTABULATE. Alternatively the first parameter, Y, may be used to specify variates for which estimates are required. The estimates of totals and their approximate standard errors can be saved using the corresponding parameters (SETOTALS for the standard errors). More complex analyses (e.g. cross-tabulations, and two-stage analyses with a finite population correction) can be achieved by saving the OUTWEIGHTS and using them as input weights for SVTABULATE. Fitted values from the generalized regression method (METHOD=linear) are saved in FITTEDVALUES; these are needed to calculate the correct asymptotic standard errors for estimates produced using the weights by means of SVTABULATE. You can produce FITTEDVALUES without any calibration, by setting METHOD=fittedvalues; this avoids having to repeat the full calibration process when analysing additional response variates.
Any restriction on Y excludes the restricted units from the calibration process, so that their values of WEIGHTS pass unchanged to OUTWEIGHTS. TCONSTRAINTS should be based only on the unrestricted units and, if Y is set, estimates of the total are for the subpopulation defined by the restrictions on WEIGHTS. Any restrictions on X are ignored.
Deville, J.-C. & Sarndal, C.-E. (1992). Calibration estimators in survey sampling. Journal of the American Statistical Association, 87, 376-382.
Singh, A.C. & Mohl, C.A. (1996). Understanding calibration estimators in survey sampling. Survey Methodology, 22, 107-115.
Commands for: Survey analysis.
CAPTION 'SVCALIBRATE example',!t(\
        'Example from Table 1 of Survey Reweighting for Tax',\
        'Microsimulation Modelling, John Creedy',\
        'NEW ZEALAND TREASURY WORKING PAPER 03/17',\
        'http://www.treasury.govt.nz/workingpapers/2003/03-17.asp');\
        STYLE=meta,plain
VARIATE k,x[1...4],initweight
READ    k,x[1...4],initweight
 1 1 1 0 0 3
 2 0 1 0 0 3
 3 1 0 2 0 5
 4 0 0 6 1 4
 5 1 0 4 1 2
 6 1 1 0 0 5
 7 1 0 5 0 5
 8 0 0 6 1 4
 9 0 1 0 0 3
10 0 0 3 1 3
11 1 0 2 0 5
12 1 1 0 1 4
13 1 0 3 1 4
14 1 0 4 0 3
15 0 0 5 0 5
16 0 1 0 1 3
17 1 0 2 1 4
18 0 0 6 0 5
19 1 0 4 1 4
20 0 1 0 0 3 :
SCALAR  constrain[1...4]; VALUE=50,20,230,35
" This gives the output weights wk in Table 1."
SVCALIBRATE [TCONSTRAINTS=constrain; X=x; WEIGHTS=initweight;\
            OUTWEIGHTS=outlinear]
" This forms the last but one column of Table 4."
SVCALIBRATE [PRINT=#,monitor; TCONSTRAINTS=constrain; X=x;\
            WEIGHTS=initweight; OUTWEIGHTS=outlogistic; METHOD=logistic;\
            LOWER=0.8; UPPER=1.25]
PRINT   k,outlinear,outlogistic; DECIMALS=0,2(3)