Performs a QTL backward selection for loci in single-environment trials (M.P. Boer, M. Malosetti, S.J. Welham & J.T.N.M. Thissen).

### Options

| Option | Description |
|---|---|
| `PRINT` = string tokens | What to print (`summary`, `model`, `components`, `effects`, `means`, `stratumvariances`, `monitoring`, `vcovariance`, `deviance`, `Waldtests`, `missingvalues`, `covariancemodels`); default `summ` |
| `POPULATIONTYPE` = string token | Type of population (`BC1`, `DH1`, `F2`, `RIL`, `BCxSy`, `CP`); must be set |
| `ALPHALEVEL` = scalar | Defines a significance level; default 0.05 |
| `FIXED` = formula | Formula with extra fixed effects |
| `UNITFACTOR` = factor | Saves the units factor required to define the random model when `UNITERROR` is to be used |
| `MVINCLUDE` = string tokens | Whether to include units with missing values in the explanatory factors and variates and/or the y-variates (`explanatory`, `yvariate`); default `expl`, `yvar` |
| `MAXCYCLE` = scalar | Limit on the number of iterations; default 100 |
| `WORKSPACE` = scalar | Number of blocks of internal memory to be set up for use by the `REML` algorithm; default 100 |

### Parameters

| Parameter | Description |
|---|---|
| `TRAIT` = variates | Quantitative trait to be analysed; must be set |
| `GENOTYPES` = factors | Genotype factor; must be set |
| `UNITERROR` = variates | Uncertainty on trait means (derived from individual unit or plot error) to be included in QTL analysis; default `*` i.e. omitted |
| `ADDITIVEPREDICTORS` = pointers | Additive genetic predictors; must be set |
| `ADD2PREDICTORS` = pointers | Second (paternal) set of additive genetic predictors |
| `DOMINANCEPREDICTORS` = pointers | Dominance genetic predictors |
| `CHROMOSOMES` = factors | Chromosomes corresponding to the genetic predictors; must be set |
| `POSITIONS` = variates | Positions on the chromosomes corresponding to the genetic predictors; must be set |
| `IDLOCI` = texts | Labels for the loci |
| `IDMGENOTYPES` = texts | Labels for the genotypes corresponding to the genetic predictors |
| `QTLCANDIDATES` = variates | Specifies the locus index numbers from which to start the selection; must be set |
| `QTLSELECTED` = variates | Saves the index numbers of the selected QTLs; must be set |
| `DOMSELECTED` = variates | Logical indicator variable storing one where the selected QTLs show a significant effect of the dominance predictor, zero otherwise |
| `WALDSTATISTICS` = variates | Saves the Wald test statistics |
| `PRWALD` = variates | Saves the associated Wald probabilities |

### Description

`QSBACKSELECT` selects QTLs from a list of candidate QTLs (loci) in single-environment trials by backward selection. It uses a single observation per genotype as phenotypic data. The response variable must be specified by the `TRAIT` parameter, and the genotypes by the `GENOTYPES` parameter. The `POPULATIONTYPE` option must be set to specify the population from which the genotypes are derived.

Molecular information must be provided in the form of additive genetic predictors stored in variates and supplied, in a pointer, by the `ADDITIVEPREDICTORS` parameter. Non-additive effects can be included in the model by using the `DOMINANCEPREDICTORS` parameter to specify dominance genetic predictors (e.g. in an F2 population); again they are stored in variates and supplied in a pointer. In the case of segregating F1 populations (outbreeders) two sets of additive genetic predictors must be specified: the maternal ones by the `ADDITIVEPREDICTORS` parameter, and the paternal ones by the `ADD2PREDICTORS` parameter. The corresponding map information for the genetic predictors must be given by the `CHROMOSOMES` and `POSITIONS` parameters. The labels for the loci can be supplied by the `IDLOCI` parameter, and the labels for the genotypes in the marker data can be supplied by the `IDMGENOTYPES` parameter. If `IDMGENOTYPES` is set, the match between the genotypes in the phenotypic and in the marker data will be checked.

The set of candidate QTLs must be supplied by the `QTLCANDIDATES` parameter. The model assumes genotypes as random and QTLs as fixed effects. Extra fixed effects can be defined using the `FIXED` option. The significance level to use at each step of the backward selection process is given by the `ALPHALEVEL` option (default 0.05).

The `MVINCLUDE`, `MAXCYCLE` and `WORKSPACE` options operate in the same way as these options of the `REML` directive. The `UNITERROR` parameter allows uncertainty on the trait means (derived from individual unit or plot error) to be specified to include in the random model; by default this is omitted. The `UNITFACTOR` option allows the factor that is needed to define the unit-error term to be saved (this would be needed, for example, to save information later about the term using `VKEEP`).

The `PRINT` option specifies the output to be displayed. The `summary` setting prints the information about the QTLs retained in the model, and the other settings correspond to those in the `PRINT` option of the `REML` directive.

The list of selected QTLs can be saved by the `QTLSELECTED` parameter. If the dominance predictors have been specified, the `DOMSELECTED` parameter can save a logical indicator variate storing one where the selected QTLs show a significant effect of the dominance predictor, and zero otherwise. The Wald test and associated probability values for the selected QTLs can be saved by the `WALDSTATISTICS` and `PRWALD` parameters, respectively.

Options: `PRINT`, `POPULATIONTYPE`, `ALPHALEVEL`, `FIXED`, `UNITFACTOR`, `MVINCLUDE`, `MAXCYCLE`, `WORKSPACE`.

Parameters: `TRAIT`, `GENOTYPES`, `UNITERROR`, `ADDITIVEPREDICTORS`, `ADD2PREDICTORS`, `DOMINANCEPREDICTORS`, `CHROMOSOMES`, `POSITIONS`, `IDLOCI`, `IDMGENOTYPES`, `QTLCANDIDATES`, `QTLSELECTED`, `DOMSELECTED`, `WALDSTATISTICS`, `PRWALD`.

### Method

`QSBACKSELECT` starts with one of the following models, which includes a set $L$ of candidate QTLs:

1) if only `ADDITIVEPREDICTORS` are specified:

$$y_i = \mu + \sum_{l \in L} x_{il}^{\mathrm{add}} \alpha_l^{\mathrm{add}} + G_i$$

2) if `DOMINANCEPREDICTORS` are also specified:

$$y_i = \mu + \sum_{l \in L} \left( x_{il}^{\mathrm{add}} \alpha_l^{\mathrm{add}} + x_{il}^{\mathrm{dom}} \alpha_l^{\mathrm{dom}} \right) + G_i$$

3) if both `ADD2PREDICTORS` and `DOMINANCEPREDICTORS` are specified (for population type `CP`):

$$y_i = \mu + \sum_{l \in L} \left( x_{il}^{\mathrm{add}} \alpha_l^{\mathrm{add}} + x_{il}^{\mathrm{add2}} \alpha_l^{\mathrm{add2}} + x_{il}^{\mathrm{dom}} \alpha_l^{\mathrm{dom}} \right) + G_i$$

where $y_i$ is the trait value of genotype $i$, $x_{il}^{\mathrm{add}}$ are the additive genetic predictors of genotype $i$ for locus $l$, and $\alpha_l^{\mathrm{add}}$ are the associated effects. In models 2 and 3, $x_{il}^{\mathrm{dom}}$ are the dominance genetic predictors, and $\alpha_l^{\mathrm{dom}}$ are the associated effects. In model 3, $x_{il}^{\mathrm{add}}$ are the additive genetic predictors for maternal genotype $i$ at locus $l$, $x_{il}^{\mathrm{add2}}$ are the additive genetic predictors for paternal genotype $i$, and $\alpha_l^{\mathrm{add}}$ and $\alpha_l^{\mathrm{add2}}$ are the associated effects. Genetic predictors are genotypic covariables that reflect the genotypic composition of a genotype at a specific chromosome location (Lynch & Walsh 1998). $G_i$ is the residual unexplained genetic and environmental variation, which is assumed to follow a Normal distribution with mean 0 and variance $\sigma^2$.
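As a small worked illustration of the fixed part of models 1 and 2, the following Python sketch evaluates the sum over loci for a single genotype. The function name `fixed_part`, the locus indices and all numeric values are made up for illustration; they are not taken from `QSBACKSELECT` or from any data set, and the random term for the genotype is left out.

```python
def fixed_part(mu, x_add, alpha_add, x_dom=None, alpha_dom=None):
    """Return mu + sum over loci of x_add*alpha_add, plus dominance terms if given."""
    value = mu
    for locus, effect in alpha_add.items():
        value += x_add[locus] * effect
    if x_dom is not None and alpha_dom is not None:
        for locus, effect in alpha_dom.items():
            value += x_dom[locus] * effect
    return value


# Hypothetical values for one genotype and two candidate loci (illustration only).
mu = 10.0
x_add = {18: 1.0, 111: -1.0}      # additive genetic predictors
alpha_add = {18: 0.5, 111: 0.25}  # additive effects
x_dom = {18: 0.0, 111: 1.0}       # dominance genetic predictors
alpha_dom = {18: 0.2, 111: -0.1}  # dominance effects

model1 = fixed_part(mu, x_add, alpha_add)                    # model 1: additive only
model2 = fixed_part(mu, x_add, alpha_add, x_dom, alpha_dom)  # model 2: additive + dominance
print(model1, model2)
```

Model 3 would add a third sum of the same form for the paternal additive predictors.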

The backward selection process starts with the initial set of loci $L$ (defined by the `QTLCANDIDATES` parameter), and checks whether all the loci are significant. If not, the locus with the smallest Wald test statistic is dropped from the model. The process is repeated until all loci in the model are significant. If model 2 or 3 is specified, a further step of model reduction is performed by checking, for each of the remaining loci, whether the dominance effects can be dropped from the model.
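The selection loop described above can be sketched in Python as follows. This is only an outline under stated assumptions, not the procedure's implementation: the callback `wald_tests` is a hypothetical stand-in for refitting the mixed model, a locus is treated as significant when its p-value is below the significance level, and the dominance check is reduced to a per-locus p-value comparison.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical callback: given the loci currently in the model, refit it and
# return {locus: (wald_statistic, p_value)}. The real model fit is not shown.
WaldFn = Callable[[List[int]], Dict[int, Tuple[float, float]]]


def backward_select(candidates: List[int], wald_tests: WaldFn,
                    alpha: float = 0.05) -> List[int]:
    """Drop the locus with the smallest Wald statistic until all are significant."""
    loci = list(candidates)
    while loci:
        results = wald_tests(loci)
        # Assumption: "significant" means p-value strictly below alpha.
        if all(results[locus][1] < alpha for locus in loci):
            break
        weakest = min(loci, key=lambda locus: results[locus][0])
        loci.remove(weakest)
    return loci


def dominance_indicator(selected: List[int], dom_p_values: Dict[int, float],
                        alpha: float = 0.05) -> List[int]:
    """Return 1 where the dominance effect of a selected locus is significant, else 0."""
    return [1 if dom_p_values[locus] < alpha else 0 for locus in selected]


# Toy statistics that do not change between refits (illustration only).
toy = {18: (12.0, 0.001), 19: (0.8, 0.40), 111: (9.5, 0.004), 112: (2.1, 0.15)}


def toy_wald(loci: List[int]) -> Dict[int, Tuple[float, float]]:
    return {locus: toy[locus] for locus in loci}


selected = backward_select([18, 19, 111, 112], toy_wald, alpha=0.05)
flags = dominance_indicator(selected, {18: 0.02, 111: 0.30}, alpha=0.05)
print(selected, flags)  # → [18, 111] [1, 0]
```

In a real analysis the statistics change after each refit, which is why the callback is called again in every iteration rather than once at the start.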

### Action with `RESTRICT`

Restrictions are not allowed.

### Reference

Lynch, M. & Walsh, B. (1998). *Genetics and Analysis of Quantitative Traits*. Sinauer Associates, Sunderland, MA.

### See also

Procedures: `QSESTIMATE`, `QSQTLSCAN`.

Commands for: Statistical genetics and QTL estimation.

### Example

    CAPTION      'QSBACKSELECT example'; STYLE=meta
    SPLOAD       [PRINT=*] '%GENDIR%/Examples/F2maize_traits.gsh'
    &            '%GENDIR%/Examples/F2maizemarkers.GWB'; SHEET='LOCI'
    &            '%GENDIR%/Examples/F2maizemarkers.GWB'; SHEET='ADDPREDICTORS'
    &            '%GENDIR%/Examples/F2maizemarkers.GWB'; SHEET='DOMPREDICTORS'
    " create single environment "
    SUBSET       [E.EQ.6] G,yld
    " candidate QTL positions from QSQTLSCAN "
    VARIATE      [VALUES=18,19,111,112,236,237] Qid
    QSBACKSELECT [PRINT=summary,model,components,effects,monitoring,vcovariance,\
                 deviance,waldtests; POPULATIONTYPE=F2; ALPHA=0.10]\
                 TRAIT=yld; GENOTYPES=G;\
                 ADDITIVEPREDICTORS=addpred;\
                 DOMINANCEPREDICTORS=dompred;\
                 CHROMOSOMES=mkchr; POSITIONS=mkpos; QTLCANDIDATES=Qid;\
                 QTLSELECTED=qtlsel; DOMSELECTED=domsel
    PRINT        qtlsel,domsel; DECIMALS=0