QMTBACKSELECT procedure


Performs a QTL backward selection for loci in multi-trait trials (M.P. Boer, M. Malosetti, S.J. Welham & J.T.N.M. Thissen).

Options


PRINT = string tokens What to print (summary, model, components, effects, means, stratumvariances, monitoring, vcovariance, deviance, Waldtests, missingvalues, covariancemodels); default summ
POPULATIONTYPE = string token Type of population (BC1, DH1, F2, RIL, BCxSy, CP); must be set
ALPHALEVEL = scalar Defines a significance level; default 0.05
VCMODEL = string token Defines the variance-covariance model for the set of traits (identity, diagonal, cs, hcs, outside, fa, fa2, unstructured); default cs
VCPARAMETERS = string token Whether to re-estimate the variance-covariance model parameters (estimate, fix); default esti
VCSELECT = string token Whether to re-select the variance-covariance model (no, yes); default no
STANDARDIZE = string token How to standardize the traits (none, normalize); default norm
CRITERION = string token Criterion to use for model selection (aic, sic); default sic
FIXED = formula Defines extra fixed effects
UNITFACTOR = factor Saves the units factor required to define the random model when UNITERROR is to be used
MVINCLUDE = string tokens Whether to include units with missing values in the explanatory factors and variates and/or the y-variates (explanatory, yvariate); default expl, yvar
MAXCYCLE = scalar Limit on the number of iterations; default 100
WORKSPACE = scalar Number of blocks of internal memory to be set up for use by the REML algorithm; default 100

Parameters


Y = variates Quantitative traits to be analysed; must be set
GENOTYPES = factors Genotype factor; must be set
FTRAITS = factors Factor indicating the trait of each y-value; must be set
UNITERROR = variates Uncertainty on trait means (derived from individual unit or plot error) to be included in QTL analysis; default * i.e. omitted
VCINITIAL = pointers Initial values for the parameters of the variance-covariance model
SELECTEDMODEL = texts VCMODEL setting for the selected covariance structure
ADDITIVEPREDICTORS = pointers Additive genetic predictors; must be set
ADD2PREDICTORS = pointers Second (paternal) set of additive genetic predictors
DOMINANCEPREDICTORS = pointers Dominance genetic predictors
CHROMOSOMES = factors Chromosomes corresponding to the genetic predictors; must be set
POSITIONS = variates Positions on the chromosomes corresponding to the genetic predictors; must be set
IDLOCI = texts Labels for the loci
IDMGENOTYPES = texts Labels for the genotypes corresponding to the genetic predictors
QTLCANDIDATES = variates Specifies the locus index numbers from which to start the selection; must be set
QTLSELECTED = variates Saves the index numbers of the selected QTLs
INTERACTIONS = variates Saves a logical variate indicating whether each selected QTL showed a significant (1) or non-significant (0) QTL-by-trait interaction
DOMSELECTED = variates Saves a logical variate indicating whether each selected QTL showed a significant (1) or non-significant (0) effect of the DOMINANCEPREDICTORS
DOMINTERACTIONS = variates Saves a logical variate indicating whether each selected QTL showed a significant (1) or non-significant (0) dominance-by-trait interaction
WALDSTATISTICS = variates Saves the Wald test statistics
PRWALD = variates Saves the associated Wald probabilities

Description


QMTBACKSELECT selects QTLs by backward selection from a list of candidate QTLs (loci) in multi-trait trials. It uses the means for each genotype-trait combination as phenotypic data, and weights can be attached to these means (see the UNITERROR parameter and the UNITFACTOR option below). The response variable must be specified by the Y parameter, and the corresponding trait and genotype factors must be specified by the FTRAITS and GENOTYPES parameters, respectively. The POPULATIONTYPE option must be set to specify the population from which the genotypes have been derived. By default, the values of each trait are standardized by dividing them by their standard deviation; set option STANDARDIZE=none to suppress this.
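
The default standardization can be written compactly as below. The symbols are introduced here for illustration and are not part of the procedure's documentation: $s_j$ denotes the standard deviation of the observed values of trait $j$, and $y^{*}_{ij}$ the standardized value. The traits are only rescaled, as the text describes; how missing values enter the calculation of $s_j$ is not stated here.

```latex
y^{*}_{ij} = \frac{y_{ij}}{s_j}
```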

Molecular information must be provided in the form of additive genetic predictors stored in variates and supplied, in a pointer, by the ADDITIVEPREDICTORS parameter. Non-additive effects can be included in the model by specifying dominance genetic predictors using the DOMINANCEPREDICTORS parameter (e.g. in an F2 population). In the case of segregating F1 populations (outbreeders), two sets of additive genetic predictors must be specified: the maternal ones by the ADDITIVEPREDICTORS parameter, and the paternal ones by the ADD2PREDICTORS parameter. The corresponding map information for the genetic predictors must be given by the CHROMOSOMES and POSITIONS parameters. The labels for the loci can be supplied by the IDLOCI parameter, and the labels for the genotypes in the marker data can be supplied by the IDMGENOTYPES parameter. If IDMGENOTYPES is set, the match between the genotypes in the phenotypic and in the marker data will be checked.
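
As a minimal sketch of how these parameters fit together for an outbreeding (CP) population, the call might look as follows. All identifiers (y, ftraits, G, matpred, patpred, dompred, mkchr, mkpos, idlocus, idmgeno, Qid, qtlsel) are hypothetical names for data structures assumed to exist already; only the option and parameter names come from the lists above.

```genstat
" Sketch only: matpred and patpred are hypothetical pointers to the "
" maternal and paternal additive predictor variates, dompred to the "
" dominance predictor variates "
QMTBACKSELECT [POPULATIONTYPE=CP] Y=y; FTRAITS=ftraits; GENOTYPES=G;\
              ADDITIVEPREDICTORS=matpred; ADD2PREDICTORS=patpred;\
              DOMINANCEPREDICTORS=dompred;\
              CHROMOSOMES=mkchr; POSITIONS=mkpos; IDLOCI=idlocus;\
              IDMGENOTYPES=idmgeno; QTLCANDIDATES=Qid; QTLSELECTED=qtlsel
```

Supplying IDMGENOTYPES here is what triggers the check that the genotypes in the phenotypic and marker data match.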

The set of candidate QTLs must be supplied by the QTLCANDIDATES parameter. The model assumes FTRAITS as a fixed term, and GENOTYPES as a random term. Extra fixed effects can be defined using the FIXED option. A multi-Normal distribution is assumed for the random genetic effects, with mean vector 0 and variance-covariance matrix Σ. The VCMODEL option defines the model to use for Σ. See the VGESELECT procedure for details of the available models; the default is to use compound symmetry. Initial values for the parameters in the variance-covariance model can be specified by the VCINITIAL parameter. The VCPARAMETERS option controls whether the variance-covariance parameters are re-estimated at each step of the backward selection (VCPARAMETERS=estimate), or whether they are fixed at the defined initial values (VCPARAMETERS=fix). The VCSELECT option defines whether an extra check is made at each step on the variance-covariance model, to assess whether a simpler model is more suitable than the current model (based on the criterion defined by the CRITERION option). The SELECTEDMODEL parameter stores the final variance-covariance model that is selected. The significance level to use at each step of the backward selection process is given by the ALPHALEVEL option (default 0.05).
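
The following sketch shows one way the variance-covariance options could be combined: starting from an unstructured model, re-estimating its parameters at each step, and letting the procedure check for a simpler model using AIC. The identifiers (y, ftraits, G, addpred, dompred, mkchr, mkpos, Qid, vcfinal, qtlsel) are assumed names, and the particular settings are illustrative rather than recommended.

```genstat
" Sketch only: vcfinal is a hypothetical text that receives the "
" VCMODEL setting of the covariance structure finally selected "
QMTBACKSELECT [POPULATIONTYPE=F2; ALPHALEVEL=0.01; VCMODEL=unstructured;\
              VCPARAMETERS=estimate; VCSELECT=yes; CRITERION=aic]\
              Y=y; FTRAITS=ftraits; GENOTYPES=G;\
              ADDITIVEPREDICTORS=addpred; DOMINANCEPREDICTORS=dompred;\
              CHROMOSOMES=mkchr; POSITIONS=mkpos; QTLCANDIDATES=Qid;\
              SELECTEDMODEL=vcfinal; QTLSELECTED=qtlsel
PRINT         vcfinal
```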

The MVINCLUDE, MAXCYCLE and WORKSPACE options operate in the same way as these options of the REML directive. The UNITERROR parameter allows uncertainty on the trait means (derived from individual unit or plot error) to be specified to include in the random model; by default this is omitted. The UNITFACTOR option allows the factor that is needed to define the unit-error term to be saved (this would be needed, for example, to save information later about the term using VKEEP).
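
A possible call that includes unit error is sketched below. The variate yerr is hypothetical: it is assumed to have the same length as y and to hold the uncertainty of each trait mean, and ufac is an assumed name for the saved units factor. The other identifiers are likewise placeholders.

```genstat
" Sketch only: yerr is a hypothetical variate holding the uncertainty "
" of each trait mean; ufac saves the factor defining the unit-error term "
QMTBACKSELECT [POPULATIONTYPE=F2; UNITFACTOR=ufac] Y=y; FTRAITS=ftraits;\
              GENOTYPES=G; UNITERROR=yerr;\
              ADDITIVEPREDICTORS=addpred; DOMINANCEPREDICTORS=dompred;\
              CHROMOSOMES=mkchr; POSITIONS=mkpos; QTLCANDIDATES=Qid;\
              QTLSELECTED=qtlsel
```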

The PRINT option specifies the output to be displayed. The summary setting prints the information about the QTLs retained in the model, and the other settings correspond to those in the PRINT option of the REML directive.

The list of selected QTLs can be saved by the QTLSELECTED parameter, and a logical variate that indicates whether each selected QTL showed a significant QTL-by-trait interaction can be saved by the INTERACTIONS parameter. This interaction is the combined effect of the ADDITIVEPREDICTORS, ADD2PREDICTORS and DOMINANCEPREDICTORS pointers, if specified. After the final step of the backward selection, extra tests are performed if the DOMINANCEPREDICTORS parameter is set. If the selected QTL has no interaction effect with trait, a test is performed of whether the dominance effect makes a significant contribution to the combined QTL effect. If dominance is significant, the corresponding units of the logical variate saved by the DOMSELECTED parameter are set to one; the other units are set to zero. If the selected QTL has a significant interaction with trait, a test is performed of whether the dominance-by-trait interaction makes a significant contribution to the combined QTL-by-trait interaction. If the dominance-by-trait interaction is significant, the corresponding units of the logical variate saved by the DOMINTERACTIONS parameter are set to one; the other units are set to zero. The Wald test statistics and associated probability values for the combined effects of the selected QTLs (including any non-significant dominance and dominance-by-trait interaction effects) can be saved by the WALDSTATISTICS and PRWALD parameters, respectively.




Method

QMTBACKSELECT starts with one of the following mixed models, each of which includes a set $L$ of candidate QTLs:

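1)       $y_{ij} = \mu + T_j + \sum_{l \in L} x_{il}^{\mathrm{add}} \alpha_{jl}^{\mathrm{add}} + GT_{ij}$

if only ADDITIVEPREDICTORS are specified

2)       $y_{ij} = \mu + T_j + \sum_{l \in L} \left( x_{il}^{\mathrm{add}} \alpha_{jl}^{\mathrm{add}} + x_{il}^{\mathrm{dom}} \alpha_{jl}^{\mathrm{dom}} \right) + GT_{ij}$

if DOMINANCEPREDICTORS are also specified

3)       $y_{ij} = \mu + T_j + \sum_{l \in L} \left( x_{il}^{\mathrm{add}} \alpha_{jl}^{\mathrm{add}} + x_{il}^{\mathrm{add2}} \alpha_{jl}^{\mathrm{add2}} + x_{il}^{\mathrm{dom}} \alpha_{jl}^{\mathrm{dom}} \right) + GT_{ij}$

if both ADD2PREDICTORS and DOMINANCEPREDICTORS are specified (for population type CP)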

where $y_{ij}$ is the value of trait $j$ for genotype $i$, $T_j$ is the trait main effect, $x_{il}^{\mathrm{add}}$ are the additive genetic predictors of genotype $i$ for locus $l$, and $\alpha_{jl}^{\mathrm{add}}$ are the associated effects. In models 2 and 3, $x_{il}^{\mathrm{dom}}$ are the dominance genetic predictors, and $\alpha_{jl}^{\mathrm{dom}}$ are the associated effects. In model 3, $x_{il}^{\mathrm{add}}$ are the additive genetic predictors for maternal genotype $i$ at locus $l$, $x_{il}^{\mathrm{add2}}$ are the additive genetic predictors for paternal genotype $i$, and $\alpha_{jl}^{\mathrm{add}}$ and $\alpha_{jl}^{\mathrm{add2}}$ are the associated effects. Genetic predictors are genotypic covariables that reflect the genotypic composition of a genotype at a specific chromosome location (Lynch & Walsh 1998). $GT_{ij}$ is assumed to follow a multi-Normal distribution with mean vector 0 and a variance-covariance matrix $\Sigma$, which can either be modelled explicitly (with an unstructured model) or by some parsimonious model (defined by option VCMODEL) as described in the VGESELECT procedure.
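
To illustrate the difference between an explicit and a parsimonious model for the variance-covariance matrix, the block below contrasts an unstructured matrix for three traits with the usual textbook compound-symmetry form. The symbols $\sigma_{jk}$, $\sigma^2$ and $\rho$ are introduced here for illustration; the exact parameterization used by the procedure is the one documented for VGESELECT and may differ.

```latex
% Unstructured: every variance and covariance is a separate parameter
% (J(J+1)/2 parameters for J traits; 6 parameters for J = 3)
\Sigma_{\mathrm{unstructured}} =
\begin{pmatrix}
\sigma_{11} & \sigma_{12} & \sigma_{13} \\
\sigma_{12} & \sigma_{22} & \sigma_{23} \\
\sigma_{13} & \sigma_{23} & \sigma_{33}
\end{pmatrix}

% Compound symmetry (textbook form): one common variance and one
% common correlation (2 parameters, whatever the number of traits)
\Sigma_{\mathrm{cs}} = \sigma^2
\begin{pmatrix}
1 & \rho & \rho \\
\rho & 1 & \rho \\
\rho & \rho & 1
\end{pmatrix}
```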

The backward selection procedure starts with the initial set of loci (defined by the QTLCANDIDATES parameter), and checks whether all loci are significant. If not, the locus with the lowest Wald test statistic is dropped from the model. This process is repeated until all loci in the model are significant. The procedure then switches to test whether the remaining QTLs show significant QTL-by-trait interaction, by breaking down the QTL effects into QTL main effects and QTL-by-trait interaction effects. If the QTL-by-trait interaction term is not significant, only a main effect is retained in the model for the corresponding QTL.
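
The breakdown of a QTL effect into a main effect and a QTL-by-trait interaction can be written as below for the additive effect at locus $l$. The symbols $\alpha_{l}^{\mathrm{add}}$ (main effect common to all traits) and $\delta_{jl}^{\mathrm{add}}$ (trait-specific deviation) are notation assumed here for illustration; the constraints the procedure uses to identify these terms are not described in this section.

```latex
% Trait-specific QTL effect = main effect + QTL-by-trait interaction
\alpha_{jl}^{\mathrm{add}} = \alpha_{l}^{\mathrm{add}} + \delta_{jl}^{\mathrm{add}}

% If the interaction is not significant, the term for locus l in model 1
% reduces to a single main effect shared by all traits:
x_{il}^{\mathrm{add}} \alpha_{jl}^{\mathrm{add}}
\;\longrightarrow\;
x_{il}^{\mathrm{add}} \alpha_{l}^{\mathrm{add}}
```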

Action with RESTRICT

Restrictions are not allowed.

Reference


Lynch, M. & Walsh, B. (1998). Genetics and Analysis of Quantitative Traits. Sinauer Associates, Sunderland, MA.

See also


Commands for: Statistical genetics and QTL estimation.

Example


" load the trait data, locus information and genetic predictors "
SPLOAD        [PRINT=*] '%GENDIR%/Examples/F2maize_traits.gsh'
&             '%GENDIR%/Examples/F2maizemarkers.GWB'; SHEET='LOCI'
&             '%GENDIR%/Examples/F2maizemarkers.GWB'; SHEET='ADDPREDICTORS'
&             '%GENDIR%/Examples/F2maizemarkers.GWB'; SHEET='DOMPREDICTORS'
" collect the predictor variates into pointers "
POINTER       [MODIFY=yes; NVAL=idlocus] addpred
POINTER       [MODIFY=yes; NVAL=idlocus] dompred
" append the traits "
SUBSET        [E.EQ.6] G,asi,eno,mflw,ph,yld
APPEND        [NEWVECTOR=y ; GROUPS=ftraits] asi,eno,mflw,ph,yld
" candidate QTL positions from QMTQTLSCAN "
VARIATE       [VALUES=  17...22, 72,101,102...105, 135...139,\
              227,236,237,238] Qid
TEXT          model; VALUE='fa'
" backward selection from the candidate loci (F2 population) "
QMTBACKSELECT [POPULATIONTYPE=F2;\
              VCMODEL=#model] Y=y; FTRAITS=ftraits; GENOTYPES=G;\
              ADDITIVEPREDICTORS=addpred; DOMINANCEPREDICTORS=dompred;\
              QTLCANDIDATES=Qid; CHROMOSOMES=mkchr; POSITIONS=mkpos;\
              IDLOCI=idlocus; QTLSELECTED=qtlsel; INTERACTIONS=qtlint;\
              DOMSELECTED=domsel; DOMINTERACTIONS=domint; WALDSTAT=stat
" print the selected loci and the logical indicator variates "
PRINT         qtlsel,qtlint,domsel,domint; DECIMALS=0
Updated on March 6, 2019
