DEMC procedure

Performs Bayesian computing using the Differential Evolution Markov Chain algorithm (W. van den Berg & R.W. Payne).

Options

PRINT = string token What to print (results, monitoring, scatterplot, histogram); default resu, moni, scat, hist
CALCULATION = expression Calculation(s) of logposterior, involving explanatory or pointer variate; if unset, this is calculated by the procedure specified by the PROCEDURE option
LOGPOSTERIOR = scalar Identifier of scalar holding log-posterior within CALCULATION (must be set if CALCULATION is set)
MULTIPLE = scalar Number of populations is number of parameters times MULTIPLE; default 3
UNIFORMLIMIT = scalar Uniform random numbers are drawn from (-UNIFORMLIMIT, UNIFORMLIMIT) and added to candidate parameter sets; default 0.00001
DATA = identifiers Data structures used in CALCULATION or by PROCEDURE
NGENERATIONS = scalar Maximum number of iterations; default 1000
STEP1 = scalar or variate Generations for which gamma is set to 1; default 0
FRACTIONBURNIN = scalar Fraction of iterations used for burn-in; default 0.5
GRVARIANCE = scalar or variate Variance to generate populations from initial values of the parameters; default 0.1
PERCENTAGES = variate Percentages for which quantiles are to be calculated; default !(2.5, 25, 50, 75, 97.5)
PROCEDURE = identifier Identifier of procedure to calculate LOGPOSTERIOR if CALCULATION is unset; default _DEMCLOGPOSTERIOR
SEED = scalar Seed for the random numbers; default 0
NWINDOWS = scalar Number of histograms and scatterplots per screen when plotting estimates and logposterior from all iterations
SDLOGPOSTERIOR = scalar Saves the s.d. for LOGPOSTERIOR
QUANTILESLOGPOSTERIOR = variate Saves quantiles for LOGPOSTERIOR
RHATLOGPOSTERIOR = scalar Saves the convergence criterion for LOGPOSTERIOR
ALLLOGPOSTERIOR = variate Saves the values of LOGPOSTERIOR from all the iterations
IPOPULATIONS = pointers Pointer to supply initial populations of the parameters and the corresponding log-posteriors
FPOPULATIONS = pointers Pointer to save final populations of the parameters and the corresponding log-posteriors

Parameters

PARAMETER = scalars Parameters to estimate
INITIAL = scalars Initial values of the parameters; must be set unless IPOPULATIONS is set
SD = scalars Standard errors of the estimates
QUANTILES = variates Saves the quantiles for each parameter
RHAT = scalars Convergence criteria
ALLESTIMATES = variates Saves the parameter estimates from all the iterations

Description

DEMC uses the Differential Evolution Markov Chain algorithm of Ter Braak (2006) to do Bayesian computations by Markov chain Monte Carlo. The logarithm of the posterior density for each set of parameters can be calculated either by a list of expressions supplied by the CALCULATION option, or by a (user-defined) procedure whose name is specified by the PROCEDURE option (with default name _DEMCLOGPOSTERIOR). The names of the parameters and their initial values are specified by the PARAMETER and INITIAL parameters, respectively. Data structures containing information that is needed to calculate the log-posterior are supplied by the DATA option. Also, if you are using the CALCULATION option, you must define the identifier of the log-posterior (as used to store the results of the calculations) using the LOGPOSTERIOR option.

The number of populations of parameters to be generated is defined as the number of parameters multiplied by the value supplied by the MULTIPLE option (default 3). The Normal variance used to generate the initial populations from the initial values is specified by the GRVARIANCE option. You can set this to a scalar to use the same variance for each parameter, or to a variate to define different variances for the parameters; by default GRVARIANCE=0.1.
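The way the initial populations are built can be illustrated outside Genstat. The following is a minimal Python sketch, not the procedure's own code: the function name initial_populations and its arguments are hypothetical, and it assumes only what is stated above (n = d × MULTIPLE parameter sets, each drawn from independent Normal distributions with means INITIAL and variance GRVARIANCE).

```python
import random

def initial_populations(initial, multiple=3, grvariance=0.1, seed=None):
    """Draw n = d * multiple parameter sets around the initial values."""
    rng = random.Random(seed)
    d = len(initial)
    # GRVARIANCE may be one value for all parameters or one value per parameter.
    if isinstance(grvariance, (int, float)):
        variances = [grvariance] * d
    else:
        variances = list(grvariance)
    sds = [v ** 0.5 for v in variances]  # random.gauss expects a standard deviation
    n = d * multiple
    return [[rng.gauss(m, s) for m, s in zip(initial, sds)] for _ in range(n)]

# Four parameters with MULTIPLE=3 give 12 parameter sets of length 4.
populations = initial_populations([61.0, 66.0, 68.0, 61.0], seed=1)
print(len(populations), len(populations[0]))
```

Passing a list such as grvariance=[0.1, 0.1, 0.5, 0.5] corresponds to setting GRVARIANCE to a variate.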

The NGENERATIONS option defines the number of generations to form from the populations, and the FRACTIONBURNIN option defines the proportion of these that are for burn-in. (The distributions of the parameters are determined only from the generations that are produced after burn-in is complete.) The SEED option defines a seed for the random numbers that are used within DEMC. The default value 0 continues from the previous random-number generation or (if none) initializes the seed automatically. Options UNIFORMLIMIT and STEP1, which control how the new populations are formed, are explained in the Method section.
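As a small arithmetic illustration of how NGENERATIONS and FRACTIONBURNIN interact, here is a Python sketch. The function name split_generations is hypothetical, and both the rounding-down rule and the count of retained values per parameter (retained generations × number of populations) are assumptions rather than documented behaviour.

```python
def split_generations(ngenerations=1000, fractionburnin=0.5):
    """Return (burn-in generations, retained generations)."""
    nburn = int(ngenerations * fractionburnin)  # rounding down is an assumption
    return nburn, ngenerations - nburn

nburn, nkept = split_generations()
# With 7 parameters and MULTIPLE=3 there are 21 populations per generation,
# so (by assumption) each parameter has nkept * 21 values after burn-in.
values_per_parameter = nkept * 7 * 3
print(nburn, nkept, values_per_parameter)
```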

Once the generations are complete, the identifiers defined by PARAMETER are defined as scalars containing the means of the parameters over the populations generated after burn-in. Standard deviations and convergence criteria for the parameters can be saved, in scalars, using the SD and RHAT parameters. If RHAT is greater than about 1.1 for any parameter, the number of generations should be increased. The QUANTILES parameter allows you to save a variate for each PARAMETER, containing quantiles at the percentages specified by the PERCENTAGES option (by default 2.5, 25, 50, 75, 97.5). To study the parameter distributions in more detail, you can also use the ALLESTIMATES parameter to save variates containing all the values generated after burn-in for each PARAMETER. The LOGPOSTERIOR, SDLOGPOSTERIOR, RHATLOGPOSTERIOR, QUANTILESLOGPOSTERIOR and ALLLOGPOSTERIOR options allow the equivalent information to be saved for the log-posterior.
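To make the convergence criterion and the quantiles more concrete, the Python sketch below computes them from saved chains. It is illustrative only: it assumes that RHAT is the usual Gelman-Rubin potential scale reduction factor and that quantiles are found by linear interpolation, neither of which is confirmed here, and the function names rhat and percentile are hypothetical.

```python
import statistics

def rhat(chains):
    """Gelman-Rubin criterion for equal-length chains of one parameter."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5

def percentile(values, pct):
    """Quantile at percentage pct, by linear interpolation (an assumption)."""
    s = sorted(values)
    pos = (len(s) - 1) * pct / 100
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])

agreeing = [[1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0]]
separated = [[1.0, 2.0, 3.0, 4.0], [11.0, 12.0, 13.0, 14.0]]
print(rhat(agreeing) < 1.1, rhat(separated) > 1.1)
print([percentile(agreeing[0], p) for p in (25, 50, 75)])
```

Chains that explore the same region give a value close to (or below) 1, whereas chains stuck in different regions give a value well above 1.1, which is the signal to run more generations.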

The final populations and corresponding log-posteriors can be saved, in a pointer, by the FPOPULATIONS option. You can then restart DEMC from the current position, and run some more generations, by using this pointer as the setting of the IPOPULATIONS option. FPOPULATIONS[1...N] each have a number of units equal to the number of parameters d, while FPOPULATIONS[N+1] has N units, where N = MULTIPLE × d. Because these structures differ in length, problems can arise if you try to save FPOPULATIONS[] using procedure EXPORT.

Options: PRINT, CALCULATION, LOGPOSTERIOR, MULTIPLE, UNIFORMLIMIT, DATA, NGENERATIONS, STEP1, FRACTIONBURNIN, GRVARIANCE, PERCENTAGES, PROCEDURE, SEED, NWINDOWS, SDLOGPOSTERIOR, QUANTILESLOGPOSTERIOR, RHATLOGPOSTERIOR, ALLLOGPOSTERIOR, IPOPULATIONS, FPOPULATIONS.

Parameters: PARAMETER, INITIAL, SD, QUANTILES, RHAT, ALLESTIMATES.

Method

DEMC uses the DE-MC algorithm of Ter Braak (2006) to perform Markov chain Monte Carlo (MCMC); see Congdon (2001, 2003), Gelman et al. (2004) or Lee (2003). The DE-MC algorithm combines the genetic algorithm called Differential Evolution (DE) with MCMC. The values of the INITIAL parameter are used to generate n parameter sets, by generating d independent Normal deviates with means INITIAL and variance GRVARIANCE. Here, d is the number of parameters, and n is d multiplied by the value of the MULTIPLE option.

For each parameter set i (i=1…n), the algorithm selects two other parameter sets at random, and calculates the differences between their parameter values, multiplied by a parameter γ; γ generally takes the value 2.38/√(2×d), but the STEP1 option allows you to define generations in which γ takes the value 1 (by default there are none). These scaled differences, together with random numbers taken from the uniform distribution on (-UNIFORMLIMIT, UNIFORMLIMIT), are then added to the parameter values in set i to form a new candidate set of values. The candidate set replaces set i if its log-posterior is greater than the log-posterior of set i plus the logarithm of a random number from the uniform distribution on (0,1); see Ter Braak (2006).
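The update for one generation can be sketched in Python as follows. This is an illustration of the DE-MC step under stated assumptions, not the code of the procedure: following the description of the UNIFORMLIMIT option, the uniform random numbers are added to the candidate; the names default_gamma, demc_generation and std_normal_logpost are hypothetical; and the target used for the demonstration is a two-parameter standard Normal density rather than a real model.

```python
import math
import random

def default_gamma(d):
    """Scale factor gamma = 2.38 / sqrt(2 * d) for d parameters."""
    return 2.38 / math.sqrt(2 * d)

def demc_generation(populations, logposts, log_posterior, rng,
                    gamma, uniformlimit=0.00001):
    """Update every parameter set once, in place (one generation)."""
    n, d = len(populations), len(populations[0])
    for i in range(n):
        # Pick two other parameter sets at random, without replacement.
        r1, r2 = rng.sample([k for k in range(n) if k != i], 2)
        candidate = [
            populations[i][j]
            + gamma * (populations[r1][j] - populations[r2][j])
            + rng.uniform(-uniformlimit, uniformlimit)
            for j in range(d)
        ]
        cand_lp = log_posterior(candidate)
        # Metropolis rule on the log scale; 1 - random() lies in (0, 1].
        if cand_lp > logposts[i] + math.log(1.0 - rng.random()):
            populations[i], logposts[i] = candidate, cand_lp

def std_normal_logpost(x):
    """Toy log-posterior: independent standard Normal for each parameter."""
    return -0.5 * sum(v * v for v in x)

rng = random.Random(349472)
d, multiple = 2, 3
pops = [[rng.gauss(0.0, math.sqrt(0.1)) for _ in range(d)]
        for _ in range(d * multiple)]
lps = [std_normal_logpost(p) for p in pops]
kept = []
for generation in range(2000):
    demc_generation(pops, lps, std_normal_logpost, rng, default_gamma(d))
    if generation >= 1000:  # discard the first half as burn-in
        kept.extend(p[0] for p in pops)
posterior_mean = sum(kept) / len(kept)
print(round(default_gamma(7), 3), len(kept))
```

Passing gamma=1 for selected generations mimics the effect described for the STEP1 option.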

References

Congdon, P. (2001). Bayesian Statistical Modelling. Wiley, Chichester, England.

Congdon, P. (2003). Applied Bayesian Modelling. Wiley, Chichester, England.

Gelman, A., Carlin, J.B., Stern, H.S. & Rubin, D.B. (2004). Bayesian Data Analysis, 2nd Edition. Chapman & Hall, London.

Lee, P.M. (2003). Bayesian Statistics: an Introduction, 3rd Edition. Arnold, London.

Ter Braak, C.J.F. (2006). A Markov chain Monte Carlo version of the genetic algorithm Differential Evolution: easy Bayesian computing for real parameter spaces. Statistics and Computing, 16, 239-249.

See also

Procedure: BGXGENSTAT.

Commands for: Bayesian methods.

Example

CAPTION     'DEMC example',!t(\
  'Coagulation time data from Table 11.2 of Gelman, Carlin, Stern & Rubin',\
  '(2004). Bayesian Data Analysis, 2nd Edition, p. 299.'); STYLE=meta,plain
VARIATE     [VALUES=62,60,63,59,63,67,71,64,65,66,68,66,\
                    71,67,68,68,56,62,60,61,63,64,63,59] Coagulation_time
FACTOR      [LABELS=!t(A,B,C,D); VALUES=4(1),6(2,3),8(4)] Diet
VARIATE     [VALUES=4(0)] muvar
VCOMPONENTS Diet
REML        [PRINT=model,components] Coagulation_time
VKEEP       [SIGMA2=sigma2reml] Diet; COMPONENT=compreml; MEAN=mean
CALCULATE   muin=MEAN(mean)
EXPRESSION  p[1...5]; VALUE=\
  !E(muvar$[1...4] = mu1, mu2, mu3, mu4 ),\
  !E(fit = NEWL(Diet; muvar)),\
  !E(l1 = -12 * logs2 - 0.5 * SUM((Coagulation_time - fit)**2) / EXP(logs2)),\
  !E(l2 = -2 * logtau2 - 0.5 * SUM((muvar - mu)**2) / EXP(logtau2)),\
  !E(lposterior = l1 + l2 + logtau2 / 2 - 14 * LOG(2 * C('pi')))
DEMC        [PRINT=results,monitoring,histogram;\
            CALCULATION=p[]; LOGPOSTERIOR=lposterior;\
            DATA=Coagulation_time,Diet; PERCENTAGES=!(25,50,75);\
            NGENERATIONS=1000; SEED=349472; SDLOGPOSTERIOR=sdlposterior;\
            RHATLOGPOSTERIOR=rhlposterior; QUANTILESLOGPOSTERIOR=qu[8]]\
            mu1,mu2,mu3,mu4,mu,logs2,logtau2; INITIAL=#mean,muin,1,1
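
For readers who want to check the log-posterior defined by the expressions p[1...5] outside Genstat, the following Python sketch mirrors them. It is a translation made under assumptions, not part of the procedure: the function name coagulation_logposterior is hypothetical, the factor values 4(1),6(2,3),8(4) are read as 4, 6, 6 and 8 units at levels 1 to 4, and the constants 12, 2 and 14 in the expressions are taken to be n/2, g/2 and (n+g)/2 for n = 24 observations and g = 4 diets.

```python
import math

def coagulation_logposterior(params, y, diet):
    """Log-posterior mirroring expressions p[1...5] of the example."""
    mu1, mu2, mu3, mu4, mu, logs2, logtau2 = params
    muvar = [mu1, mu2, mu3, mu4]                 # p[1]
    fit = [muvar[level - 1] for level in diet]   # p[2]: diet mean for each unit
    n, g = len(y), len(muvar)
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fit))
    l1 = -(n / 2) * logs2 - 0.5 * rss / math.exp(logs2)          # p[3]
    mss = sum((m - mu) ** 2 for m in muvar)
    l2 = -(g / 2) * logtau2 - 0.5 * mss / math.exp(logtau2)      # p[4]
    return l1 + l2 + logtau2 / 2 - ((n + g) / 2) * math.log(2 * math.pi)  # p[5]

coagulation_time = [62, 60, 63, 59, 63, 67, 71, 64, 65, 66, 68, 66,
                    71, 67, 68, 68, 56, 62, 60, 61, 63, 64, 63, 59]
diet = [1] * 4 + [2] * 6 + [3] * 6 + [4] * 8
start = [61.0, 66.0, 68.0, 61.0, 64.0, 1.0, 1.0]  # illustrative values only
value = coagulation_logposterior(start, coagulation_time, diet)
print(value)
```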
Updated on June 20, 2019