Performs generalized calibration of survey data (S.D. Langton).
Options
PRINT = string token | Controls printed output (summary, totals, monitoring); default summ, tota |
---|---|
PLOT = string token | Controls which high-resolution graphs are plotted (weights); default * i.e. none |
STRATUMFACTOR = factor | Stratification factor; default * i.e. unstratified |
SAMPLINGUNITS = factor | Factor indicating the sampling units in a two-stage design; default * i.e. single-stage design |
TCONSTRAINTS = scalars or tables | Constraint totals |
X = variates | Variates corresponding to TCONSTRAINTS; * implies the equivalent constraint relates to a count |
WEIGHTS = variate | Initial weights |
OUTWEIGHTS = variate | Final (calibration) weights |
METHOD = string token | Method to use (linear, truncatedlinear, logistic, fittedvalues); default line |
LOWER = scalar | Lower bound for g-weights; default 0.1 |
UPPER = scalar | Upper bound for g-weights; default 10 |
MAXCYCLE = scalar | Maximum number of iterations; default 50 |
TOLERENCE = scalar | Tolerance for convergence; default 0.0001 |
Parameters
Y = variates | Response data for analysis |
---|---|
TOTALS = scalars | Saves estimated totals |
SETOTALS = scalars | Saves standard errors of totals |
FITTEDVALUES = variates | Saves fitted values from the regression |
Description
SVCALIBRATE performs calibration estimation of survey data (Deville & Särndal 1992). The sampling weights from a survey are often adjusted to ensure that they produce estimates that match known population totals. For example, if in an agricultural survey the sampling weights are applied to the areas of the sampled farms, the resulting estimate will not generally exactly equal the known total agricultural area in the population, and so an adjustment is required. Calibration calculates adjusted weights that ensure the constraints are met, while remaining as close as possible to the original sampling weights.
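In symbols, the idea can be sketched as follows. The notation is the one commonly used for Deville & Särndal (1992), not something defined by the procedure: d_k denotes the initial weight of sampled unit k (WEIGHTS), w_k its calibrated weight (OUTWEIGHTS), x_k the vector of X values for that unit (with 1 standing in for a null entry, i.e. a count), and t the vector of TCONSTRAINTS totals. The chi-squared distance shown is the one that leads to the linear method; that the procedure minimizes exactly this quantity is an assumption.

```latex
\min_{w}\ \sum_{k \in s} \frac{(w_k - d_k)^2}{d_k}
\quad\text{subject to}\quad
\sum_{k \in s} w_k \, x_k = t ,
\qquad\text{where } w_k = g_k \, d_k .
```

The ratios g_k = w_k / d_k are the adjustments referred to as g-weights.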
The TCONSTRAINTS option is used to specify the constraints, either in a scalar to provide a total for the whole population, or in a table specifying totals for subgroups defined by the classifying factors of the table. The X option specifies a list of variates (in parallel) to which the constraints relate, with a null value indicating that the corresponding constraint relates to a count of units in the population. If STRATUMFACTOR is set, a separate calibration is performed in each stratum and TCONSTRAINTS must be set to one or more tables, classified by the stratification factor. The SAMPLINGUNITS option can be used to specify primary sampling units in a two-stage design; this information is only used for calculation of the standard error of the total and does not affect the calibration process. The WEIGHTS option specifies the initial sampling weights, which will usually be the inverse of the probability of selection of each unit, whilst OUTWEIGHTS returns the adjusted weights.
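A stratified call might look like the sketch below. It assumes that k, x[] and initweight have already been read as in the Example section; the factor stratum, the tables Ncount and Tx3, and all four totals in them are invented purely for illustration.

```genstat
" Sketch only: the stratum factor and the totals are hypothetical. "
FACTOR [LEVELS=2; VALUES=10(1),10(2)] stratum
" Known number of units and known total of x[3] in each stratum. "
TABLE [CLASSIFICATION=stratum; VALUES=40,42] Ncount
TABLE [CLASSIFICATION=stratum; VALUES=110,120] Tx3
" The null entry in X marks the first constraint as a count of units. "
SVCALIBRATE [STRATUMFACTOR=stratum; TCONSTRAINTS=Ncount,Tx3; X=*,x[3];\
   WEIGHTS=initweight; OUTWEIGHTS=outstrat]
PRINT k,stratum,initweight,outstrat
```

Each table supplies one total per stratum, and the entries of X are matched to the TCONSTRAINTS tables in parallel.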
The METHOD option controls the restrictions on the range of adjustments (the “g-weights”) used to convert the initial weights to the modified weights, and has three settings that carry out calibration: linear produces estimates equivalent to the usual regression estimates, with g-weights that are unrestricted and may be negative; truncatedlinear restricts the g-weights to the range specified by the LOWER and UPPER options by replacing extreme values with these bounds; logistic uses a logit-like transformation to ensure that the g-weights remain within the specified bounds. These correspond to methods 1, 5 and 7 respectively of Singh & Mohl (1996). The last two methods use iterative calculations, which are controlled by the MAXCYCLE and TOLERENCE options. Progress of the iterations can be viewed using the monitoring setting of PRINT. The default values for LOWER and UPPER are 0.1 and 10, thus allowing the adjusted weights to differ from the initial weights by a factor of ten in either direction.
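The three calibrating settings can be written as functions g(u) of a linear score u_k = x_k'λ, where λ is chosen so that the constraints hold. The forms below are the standard ones from Deville & Särndal (1992); that the procedure uses exactly these expressions is an assumption. L and U stand for LOWER and UPPER, and the logistic form requires L < 1 < U.

```latex
\begin{aligned}
\text{linear:}\quad & g(u) = 1 + u \\
\text{truncatedlinear:}\quad & g(u) = \min\{\,U,\ \max\{\,L,\ 1 + u\,\}\,\} \\
\text{logistic:}\quad & g(u) = \frac{L(U-1) + U(1-L)\,e^{Au}}{(U-1) + (1-L)\,e^{Au}},
\qquad A = \frac{U-L}{(1-L)(U-1)}
\end{aligned}
```

All three give g(0) = 1, so a unit whose score is zero keeps its initial weight. The logistic form approaches L and U smoothly rather than being cut off at them, and with the default bounds A = 9.9/8.1, approximately 1.22.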
The procedure can be run without setting any parameters, in order to produce adjusted weights for use with TABULATE or SVTABULATE. Alternatively, the first parameter, Y, may be used to specify variates for which estimates are required. The estimates of totals and their approximate standard errors can be saved using the TOTALS and SETOTALS parameters. More complex analyses (e.g. cross-tabulations, and two-stage analyses with a finite population correction) can be achieved by saving the OUTWEIGHTS and using them as input weights for SVTABULATE. Fitted values from the generalized regression method (METHOD=linear) are saved in FITTEDVALUES; these are needed to calculate the correct asymptotic standard errors for estimates produced using the weights by means of SVTABULATE. You can produce FITTEDVALUES without any calibration by setting METHOD=fittedvalues; this avoids having to repeat the full calibration process when analysing additional Y variates.
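As a sketch of how the parameters fit together, the call below calibrates on three of the auxiliary variates from the Example section and estimates the total of the fourth. It assumes the Example data and the constrain[] scalars are already available; the identifiers outw, totx3, setotx3 and fitx3 are arbitrary names chosen for illustration.

```genstat
" Sketch only: calibrate on x[1], x[2] and x[4], then estimate the total of x[3]. "
SVCALIBRATE [PRINT=summary,totals; TCONSTRAINTS=constrain[1,2,4]; X=x[1,2,4];\
   WEIGHTS=initweight; OUTWEIGHTS=outw]\
   Y=x[3]; TOTALS=totx3; SETOTALS=setotx3; FITTEDVALUES=fitx3
" The saved scalars hold the estimated total and its approximate standard error. "
PRINT totx3,setotx3
```

The saved outw and fitx3 can then be kept for any later analysis with SVTABULATE.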
Options: PRINT, PLOT, STRATUMFACTOR, SAMPLINGUNITS, TCONSTRAINTS, X, WEIGHTS, OUTWEIGHTS, METHOD, LOWER, UPPER, MAXCYCLE, TOLERENCE.
Parameters: Y, TOTALS, SETOTALS, FITTEDVALUES.
Action with RESTRICT
Any restriction on WEIGHTS, OUTWEIGHTS or Y excludes the restricted units from the calibration process, so that their values of WEIGHTS pass unchanged to OUTWEIGHTS. TCONSTRAINTS should be based only on the unrestricted units and, if Y is set, estimates of the total are for the subpopulation defined by the restrictions on WEIGHTS. Any restrictions on X are ignored.
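The sketch below illustrates calibration of a subpopulation. It assumes the Example data are available and treats the units with x[4] equal to 1 as the subpopulation of interest, with constrain[4] (35) taken as the known number of such units in the population; that interpretation, and the identifier outsub, are assumptions made for illustration.

```genstat
" Sketch only: calibrate the units with x[4]=1 to a known count of 35. "
RESTRICT initweight; CONDITION=x[4].EQ.1
SVCALIBRATE [TCONSTRAINTS=constrain[4]; X=*; WEIGHTS=initweight;\
   OUTWEIGHTS=outsub]
" Remove the restriction before printing all 20 units. "
RESTRICT initweight,outsub
PRINT k,x[4],initweight,outsub
```

The initial weights of the nine selected units sum to 32, so with a single count constraint the linear method would be expected to scale each of them by 35/32, while the other eleven units keep their initial weights in outsub.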
References
Deville, J.-C. & Särndal, C.-E. (1992). Calibration estimators in survey sampling. Journal of the American Statistical Association, 87, 376-382.
Singh, A.C. & Mohl, C.A. (1996). Understanding calibration estimators in survey sampling. Survey Methodology, 22, 107-115.
See also
Procedures: SVBOOT, SVGLM, SVHOTDECK, SVREWEIGHT, SVSAMPLE, SVSTRATIFIED, SVTABULATE, SVWEIGHT.
Commands for: Survey analysis.
Example
CAPTION 'SVCALIBRATE example',!t(\
  'Example from Table 1 of Survey Reweighting for Tax',\
  'Microsimulation Modelling, John Creedy',\
  'NEW ZEALAND TREASURY WORKING PAPER 03/17',\
  'http://www.treasury.govt.nz/workingpapers/2003/03-17.asp');\
  STYLE=meta,plain
VARIATE k,x[1...4],initweight
READ k,x[1...4],initweight
  1  1  1  0  0  3
  2  0  1  0  0  3
  3  1  0  2  0  5
  4  0  0  6  1  4
  5  1  0  4  1  2
  6  1  1  0  0  5
  7  1  0  5  0  5
  8  0  0  6  1  4
  9  0  1  0  0  3
 10  0  0  3  1  3
 11  1  0  2  0  5
 12  1  1  0  1  4
 13  1  0  3  1  4
 14  1  0  4  0  3
 15  0  0  5  0  5
 16  0  1  0  1  3
 17  1  0  2  1  4
 18  0  0  6  0  5
 19  1  0  4  1  4
 20  0  1  0  0  3 :
SCALAR constrain[1...4]; VALUE=50,20,230,35
" This gives the output weights wk in Table 1."
SVCALIBRATE [TCONSTRAINTS=constrain[]; X=x[]; WEIGHTS=initweight;\
  OUTWEIGHTS=outlinear]
" This forms the last but one column of Table 4."
SVCALIBRATE [PRINT=#,monitor; TCONSTRAINTS=constrain[]; X=x[];\
  WEIGHTS=initweight; OUTWEIGHTS=outlogistic; METHOD=logistic;\
  LOWER=0.8; UPPER=1.25]
PRINT k,outlinear,outlogistic; DECIMALS=0,2(3)
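As a quick arithmetic check on the data above, the initial weights reproduce none of the four constraint totals, which is why calibration has to change them. Writing d_k for initweight and x_{jk} for the value of x[j] on unit k (notation introduced here for illustration), the weighted sums over the 20 units are:

```latex
\sum_k d_k x_{1k} = 44 \ (\text{target } 50), \qquad
\sum_k d_k x_{2k} = 24 \ (\text{target } 20), \qquad
\sum_k d_k x_{3k} = 213 \ (\text{target } 230), \qquad
\sum_k d_k x_{4k} = 32 \ (\text{target } 35).
```

After either call to SVCALIBRATE, the same sums computed with the output weights should equal the targets, up to the convergence tolerance in the logistic case.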