Fits a model where different units follow different generalized linear models (R.W. Payne).
Options
PRINT = string tokens | Controls printed output (model, deviance, summary, estimates, correlations, fittedvalues, accumulated, monitoring); default mode, summ, esti |
---|---|
Y = variate | Response variate |
TERMS = formula | Terms in the model |
NBINOMIAL = variate | Binomial totals |
DISPERSION = scalar | Dispersion parameter; default * for DIST=norm, gamm, inve or calc, and 1 for DIST=pois, bino, mult, nega, geom, expo or bern |
WEIGHTS = variate | Prior weights; default 1 |
OFFSET = variate | Offset variate to be included in model; default * i.e. none |
CONSTANT = string token | How to treat the constant (estimate, omit, ignore); default esti |
FACTORIAL = scalar | Limit for expansion of model terms; default 3 |
FULL = string token | Whether to assign all possible parameters to factors and interactions (no, yes); default no |
DATASET = factor | Indicates which generalized linear model to apply to each unit; default defined from NVALUES |
LINEARPREDICTOR = variate | Initial values for linear predictor |
MAXCYCLE = scalar | Maximum number of iterations; default 30 |
MVINCLUDE = string token | Whether to include units with missing values in the explanatory factors and variates (explanatory); default * i.e. omit these |
SAVE = identifier | To name the regression save structure; default * |
Parameters
NVALUES = scalars | Number of units for each generalized linear model |
---|---|
DISTRIBUTION = string tokens | Error distributions (normal, poisson, binomial, gamma, inversenormal, multinomial, calculated, negativebinomial, geometric, exponential, bernoulli); default norm |
LINK = string tokens | Link functions (canonical, identity, logarithm, logit, reciprocal, power, squareroot, probit, complementaryloglog, calculated, logratio); default cano (i.e. iden for DIST=norm or calc; loga for DIST=pois; logi for DIST=bino, bern or mult; reci for DIST=gamm or expo; powe for DIST=inve; logr for DIST=nega or geom) |
EXPONENT = scalars | Exponent for power links |
Description
RMGLM is useful if you want to fit a model where there are several generalized linear models, each one applying to a different set of data units. This is required, for example, in the fitting of hierarchical generalized linear models (see HGANALYSE), and would also allow the fitting of multivariate generalized linear models.
The NVALUES parameter can specify a list of scalars defining the number of units following each generalized linear model. If NVALUES is used, the units are assumed to be ordered so that all the units with the first generalized linear model come first, then those with the second one, and so on. The DATASET option can then save a factor to indicate which generalized linear model applies to each unit. Alternatively, you can specify a list of null settings (*) for NVALUES, and supply a pre-defined factor using the DATASET option. The DISTRIBUTION parameter specifies the error distributions, the LINK parameter specifies the link functions, and the EXPONENT parameter specifies the exponent where there is a power link.
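
As a rough illustration of the second way of identifying the data sets, the sketch below reuses the stacked variates (CY, CN, CLd, CU, Constdiff) and the factor Dataset that are built in the Example section. It assumes that one null setting is given per generalized linear model, and it has not been checked against a Genstat installation.

```
"Sketch only: null NVALUES settings, with the pre-defined factor Dataset
 indicating which generalized linear model applies to each unit.
 All identifiers are those defined in the Example section."
RMGLM [Y=CY; TERMS=Constdiff+CLd+CU; NBINOMIAL=CN; DATASET=Dataset]\
      *,* ; DISTRIBUTION=binomial,gamma; LINK=probit,reciprocal
```

With this form the units presumably need not be sorted by data set, because the factor, rather than the unit order, determines which model each unit follows.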
The Y option specifies the response variate, and the NBINOMIAL option specifies the totals for binomial data. Prior weights can be supplied using the WEIGHTS option. The TERMS option specifies the terms to be fitted, and the FULL option controls the parameterization, as in the TERMS directive. The MVINCLUDE option allows units with missing values in factors or variates in the model to be included (by default these are excluded). Where this occurs, the factor or variate is taken to make no contribution to the fitted value for the unit concerned (see TERMS for more details).
The CONSTANT option indicates whether or not to fit a constant, and the FACTORIAL option specifies a limit (default 3) on the number of variates and factors in each term, as in the FIT directive. An offset can be supplied using the OFFSET option. The LINEARPREDICTOR option can supply initial values for the linear predictor, and the MAXCYCLE option can set a limit (default 30) on the number of iterations. Printed output is controlled by the PRINT option, with the same settings as in the FIT directive.
After the fit, the RDISPLAY directive can be used to generate additional output, and the RKEEP directive can be used to save information, in the usual way.
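
A minimal sketch of such a follow-up, assuming it is run directly after the RMGLM call in the Example section. The identifiers Fit and Est are hypothetical names chosen for illustration, and the sketch has not been checked against a Genstat installation.

```
"Sketch only: display further output from the fit, then save results.
 Fit and Est are hypothetical identifiers."
RDISPLAY [PRINT=estimates,correlations]
RKEEP    FITTEDVALUES=Fit; ESTIMATES=Est
PRINT    Est
```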
Options: PRINT, Y, TERMS, NBINOMIAL, DISPERSION, WEIGHTS, OFFSET, CONSTANT, FACTORIAL, FULL, DATASET, LINEARPREDICTOR, MAXCYCLE, MVINCLUDE, SAVE.
Parameters: NVALUES, DISTRIBUTION, LINK, EXPONENT.
Method
RMGLM uses the calculated settings of the DISTRIBUTION and LINK options of MODEL.
Action with RESTRICT
You can restrict the units that Genstat will use for the fit by putting a restriction on the response variate, weight variate, offset variate, binomial totals, or any explanatory variate or factor. However, you must then supply the initial values for the linear predictor (using the LINEARPREDICTOR option), as the default calculation of these values itself requires the use of RESTRICT.
See also
Commands for: Regression analysis.
Example
CAPTION 'RMGLM example',\
        'Set 1: binomial distribution, probit link.'; STYLE=meta,plain
VARIATE Dose,N,R; VALUES=!(10.2, 7.7, 5.1, 3.8, 2.6),\
                         !(  50,  49,  46,  48,  50),\
                         !(  44,  42,  24,  16,   6)
VARIATE Logdose
CALC    Logdose = LOG10(Dose)
MODEL   [DISTRIBUTION=binomial; LINK=probit] R; NBINOMIAL=N
FIT     Logdose
CAPTION 'Set 2: gamma distribution, reciprocal link.'
VARIATE [VALUES=10.22,7.37,5.72,4.78,4.3,3.85,3.74,3.54,3.39] Conc50
&       [VALUES=0.5,0.75,1,1.5,2,3,4,6,8] Antiser
SCALAR  Offset; VALUE=0.52
CALC    U = 1/(Antiser+Offset)
MODEL   [DIST=gamma] Conc50
FIT     [PRINT=m,s,e,c,f] U
CAPTION 'Fit both simultaneously.'
CALC    N1,N2 = NVAL(R,Conc50)
FACTOR  [LEVELS=2; VALUES=#N1(1),#N2(2); REFERENCE=2] Dataset
VARIATE CY,CN,CLd,CU,Constdiff; VALUES=\
        !(#R,#Conc50),!(#N,#N2(0)),!(#Logdose,#N2(0)),!(#N1(0),#U),\
        !(#N1(0),#N2(1))
RMGLM   [Y=CY; TERMS=Constdiff+CLd+CU; NBINOMIAL=CN]\
        5,9 ; DISTRIBUTION=binomial,gamma; LINK=probit,reciprocal