QSQTLSCAN procedure

Performs a genome-wide scan for QTL effects (Simple and Composite Interval Mapping) in single-environment trials (M.P. Boer, M. Malosetti, S.J. Welham & J.T.N.M. Thissen).

Options

`PRINT` = string tokens	What to print (`summary`, `progress`, `model`, `components`, `effects`, `means`, `stratumvariances`, `monitoring`, `vcovariance`, `deviance`, `Waldtests`, `missingvalues`, `covariancemodels`); default `summ`
`PLOT` = string token	Whether to plot the profile along the genome (`profile`); default `prof`
`POPULATIONTYPE` = string token	Type of population (`BC1`, `DH1`, `F2`, `RIL`, `BCxSy`, `CP`); must be set
`ALPHALEVEL` = scalar	Defines a genome-wide significance level to calculate the threshold; default 0.05
`COFACTORS` = variate	Index numbers of loci to be used as cofactors for the genetic background
`COFWINDOW` = scalar	Specifies a window for cofactor exclusion from the model; default 10⁶, which means that all cofactors on the same chromosome are excluded
`THRMETHOD` = string token	Which method to use to calculate the threshold for QTL detection (`bonferroni`, `liji`, `given`); default `liji`
`THRESHOLD` = scalar	Threshold value for test statistic when `THRMETHOD=given`
`DISTANCE` = scalar	Distance between loci when `THRMETHOD=bonferroni`; default 4
`FIXED` = formula	Formula with extra fixed terms
`UNITFACTOR` = factor	Saves the units factor required to define the random model when `UNITERROR` is to be used
`STATISTICTYPE` = string token	Which test statistic to plot and save using the `STATISTICS` parameter (`wald`, `minlog10p`); default `minl`
`COLOURS` = scalar, variate or text	Colours to use for the chromosomes; default `*` uses the colours of pens 1, 2 up to the number of chromosomes
`TITLE` = text	General title for plot
`YTITLE` = text	Title for the y-axis; default uses the identifier of the `STATISTICS` variate or pointer
`XTITLE` = text	Title for the x-axis; default `'Chromosomes'`
`MVINCLUDE` = string tokens	Whether to include units with missing values in the explanatory factors and variates and/or the y-variates (`explanatory`, `yvariate`); default `expl`, `yvar`
`MAXCYCLE` = scalar	Limit on the number of iterations; default 100
`WORKSPACE` = scalar	Number of blocks of internal memory to be set up for use by the `REML` algorithm; default 100

Parameters

`TRAIT` = variates	Quantitative trait to be analysed; must be set
`GENOTYPES` = factors	Genotype factor; must be set
`UNITERROR` = variates	Uncertainty on trait means (derived from individual unit or plot error) to be included in QTL analysis; default `*` i.e. omitted
`ADDITIVEPREDICTORS` = pointers	Additive genetic predictors; must be set
`ADD2PREDICTORS` = pointers	Second (paternal) set of additive genetic predictors
`DOMINANCEPREDICTORS` = pointers	Dominance genetic predictors
`CHROMOSOMES` = factors	Chromosomes corresponding to the genetic predictors; must be set
`POSITIONS` = variates	Positions on the chromosomes corresponding to the genetic predictors; must be set
`IDLOCI` = texts	Labels for the loci
`IDMGENOTYPES` = texts	Labels for the genotypes corresponding to the genetic predictors
`IDEFFECTS` = texts	Labels for the effects along the y-axis, in the frame below the profile plot
`IDPARENTS` = texts	Labels to use to identify the parents
`QSTATISTICS` = variates	Saves test statistics for QTL effects along the genome
`QEFFECTS` = pointers	Saves QTL effects along the genome (additive effects and, if specified, also second additive and dominance effects)
`QSE` = pointers	Saves standard errors of the QTL effects
`OUTFILENAME` = texts	Name of the Genstat workbook file (`*.gwb`) to be created
`DFILENAME` = texts	Name of the graphs file for the plots

Description

QSQTLSCAN performs a genome-wide QTL scan in single-environment trials. It uses a single observation per genotype as phenotypic data. The response variable must be specified by the TRAIT parameter, and the genotypes by the GENOTYPES parameter. The POPULATIONTYPE option must be set to specify the population type.

Molecular information must be provided in the form of additive genetic predictors stored in variates and supplied, in a pointer, by the ADDITIVEPREDICTORS parameter. Non-additive effects can be included in the model by using the DOMINANCEPREDICTORS parameter to specify dominance genetic predictors (e.g. in an F2 population); again they are stored in variates and supplied in a pointer. In the case of segregating F1 populations (outbreeders), two sets of additive genetic predictors must be specified: the maternal ones by the ADDITIVEPREDICTORS parameter, and the paternal ones by the ADD2PREDICTORS parameter. The corresponding map information for the genetic predictors must be given by the CHROMOSOMES and POSITIONS parameters. The labels for the loci can be supplied by the IDLOCI parameter, and the labels for the genotypes in the marker data can be supplied by the IDMGENOTYPES parameter. If IDMGENOTYPES is set, the match between the genotypes in the phenotypic and in the marker data will be checked.

The QTL detection model assumes genotypes as random and QTLs as fixed effects. Extra fixed effects can be specified using the FIXED option. The QTL search can be performed without cofactors (Simple Interval Mapping) or with cofactors that control for genetic background effects (Composite Interval Mapping). For Composite Interval Mapping, the COFACTORS option must specify a variate containing the index numbers of the loci designated as cofactors. The COFWINDOW option defines a window around a tested position within which cofactors are temporarily excluded from the model.
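As a minimal sketch of Composite Interval Mapping, the call below reuses the data structures loaded in the Example section of this page (yld, G, mkchr, mkpos, idlocus, addpred and dompred). The cofactor index numbers 12, 45 and 78, the window of 20, and the identifiers cofidx and cimstat are illustrative assumptions only; in practice the cofactors would normally be chosen from an earlier scan.

```genstat
" hypothetical cofactor loci, given as index numbers into the map "
VARIATE   [VALUES=12,45,78] cofidx
" composite interval mapping: cofactors inside the window around the
  tested position are temporarily left out of the model "
QSQTLSCAN [PRINT=summary; POPULATIONTYPE=F2; COFACTORS=cofidx;\
          COFWINDOW=20]\
          TRAIT=yld; GENOTYPES=G; CHROMOSOMES=mkchr; POSITIONS=mkpos;\
          IDLOCI=idlocus; ADDITIVEPREDICTORS=addpred;\
          DOMINANCEPREDICTORS=dompred; QSTATISTICS=cimstat
```

Leaving COFACTORS unset gives Simple Interval Mapping, and leaving COFWINDOW at its default excludes every cofactor on the chromosome being tested.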

The MVINCLUDE, MAXCYCLE and WORKSPACE options operate in the same way as the corresponding options of the REML directive. The UNITERROR parameter allows uncertainty on the trait means (derived from individual unit or plot error) to be included in the random model; by default it is omitted. The UNITFACTOR option saves the factor that is needed to define the unit-error term (this would be needed, for example, to save information about the term later using VKEEP).
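The following sketch shows how these settings might be combined. It assumes the data structures from the Example section of this page and a hypothetical variate yldunc, created beforehand, that holds the uncertainty of each genotype mean in the same unit order as yld; the identifier ufac and the MAXCYCLE value of 200 are also illustrative.

```genstat
" yldunc is a hypothetical variate of uncertainties on the trait means;
  ufac receives the units factor that defines the unit-error term "
QSQTLSCAN [PRINT=summary; POPULATIONTYPE=F2; UNITFACTOR=ufac;\
          MAXCYCLE=200]\
          TRAIT=yld; GENOTYPES=G; UNITERROR=yldunc; CHROMOSOMES=mkchr;\
          POSITIONS=mkpos; IDLOCI=idlocus; ADDITIVEPREDICTORS=addpred;\
          DOMINANCEPREDICTORS=dompred
```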

The THRMETHOD option specifies how the threshold value is calculated, using the genome-wide error rate given by the ALPHALEVEL option (default 0.05). If THRMETHOD=given, a user-defined threshold value must be specified using the THRESHOLD option. If THRMETHOD=bonferroni, an effective number of tests is calculated using the value specified by the DISTANCE option as the step size (default 4). Alternatively, the liji setting uses the method described by Li & Ji (2005). See procedure QTHRESHOLD for details.
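The two calls below sketch the non-default threshold settings, again assuming the data structures from the Example section of this page. The values 0.01, 10 and 3.5 are arbitrary illustrations, not recommendations.

```genstat
" Bonferroni-type threshold at a 1% genome-wide level, step size 10 "
QSQTLSCAN [POPULATIONTYPE=F2; THRMETHOD=bonferroni; ALPHALEVEL=0.01;\
          DISTANCE=10]\
          TRAIT=yld; GENOTYPES=G; CHROMOSOMES=mkchr; POSITIONS=mkpos;\
          ADDITIVEPREDICTORS=addpred; DOMINANCEPREDICTORS=dompred
" user-supplied threshold for the test statistic "
QSQTLSCAN [POPULATIONTYPE=F2; THRMETHOD=given; THRESHOLD=3.5]\
          TRAIT=yld; GENOTYPES=G; CHROMOSOMES=mkchr; POSITIONS=mkpos;\
          ADDITIVEPREDICTORS=addpred; DOMINANCEPREDICTORS=dompred
```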

The PRINT option specifies the output to be displayed. The summary setting prints the information about the QTLs retained in the model, and the progress setting shows how the scan is progressing. The other settings correspond to those in the PRINT option of the REML directive.

By default QSQTLSCAN plots the test statistic associated with the effects of the genetic predictors against their position on the chromosomes, but you can set option PLOT=* to suppress this. The STATISTICTYPE option specifies what to plot along the y-axis of the upper plot, either the test statistic or the associated probability value (on a -log10 scale), and also defines what is saved in the variates specified by the QSTATISTICS parameter. The IDEFFECTS parameter can be used to label the effects, and the IDPARENTS parameter can supply labels to identify the parents.
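For instance, a scan that suppresses the plot and saves the Wald statistic rather than the -log10 probability could be sketched as follows. The data structures are those of the Example section of this page, and the identifiers waldstat and peak are hypothetical.

```genstat
" no profile plot; save the Wald statistic for each tested position "
QSQTLSCAN [PRINT=summary; PLOT=*; POPULATIONTYPE=F2;\
          STATISTICTYPE=wald]\
          TRAIT=yld; GENOTYPES=G; CHROMOSOMES=mkchr; POSITIONS=mkpos;\
          IDLOCI=idlocus; ADDITIVEPREDICTORS=addpred;\
          DOMINANCEPREDICTORS=dompred; QSTATISTICS=waldstat
" largest saved statistic along the genome "
CALCULATE peak = MAXIMUM(waldstat)
PRINT     peak
```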

The corresponding effects of each genetic predictor and their standard errors can be saved by the QEFFECTS and QSE parameters, respectively. These are saved in pointers that contain a single variate if only the ADDITIVEPREDICTORS parameter is specified, or two or three variates if the DOMINANCEPREDICTORS and/or ADD2PREDICTORS parameters are also specified. The TITLE, YTITLE and XTITLE options can specify the general title of the graph, the title of the y-axis and the title of the x-axis, respectively. The colours to use for the chromosomes in the upper graph are specified by the COLOURS option using either a text of colour names or a variate of RGB values (see the PEN directive for details). If COLOURS is not set, the default is to use the default colours of the pens 1, 2, onwards, up to the number of chromosomes. By default, the plot is sent to the screen. However, you can supply a file for the plot, using the DFILENAME parameter. You can discover the types of graphics file that are supported by running the command.
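A sketch of saving the effects and sending the plot to a file is given below, assuming the data structures from the Example section of this page. The identifiers qeff and qse, the titles and the file name are illustrative; the .png extension is an assumption that should be checked against the graphics file types supported by your installation. With additive and dominance predictors both supplied, each saved pointer is expected to hold two variates, assumed here to be in the order additive then dominance.

```genstat
QSQTLSCAN [PRINT=summary; POPULATIONTYPE=F2; TITLE='Yield QTL scan';\
          YTITLE='-log10(P)'; XTITLE='Chromosomes']\
          TRAIT=yld; GENOTYPES=G; CHROMOSOMES=mkchr; POSITIONS=mkpos;\
          IDLOCI=idlocus; ADDITIVEPREDICTORS=addpred;\
          DOMINANCEPREDICTORS=dompred; QEFFECTS=qeff; QSE=qse;\
          DFILENAME='yldprofile.png'
" effects and standard errors, one variate per type of effect "
PRINT     qeff[1],qse[1],qeff[2],qse[2]
```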

The OUTFILENAME parameter can be used to write the QSTATISTICS, QEFFECTS and QSE structures to a Genstat workbook file, in a sheet named STATISTICS. The name should be given without an extension, as the extension .gwb is added automatically.

Options: PRINT, PLOT, POPULATIONTYPE, ALPHALEVEL, COFACTORS, COFWINDOW, THRMETHOD, THRESHOLD, DISTANCE, FIXED, UNITFACTOR, STATISTICTYPE, COLOURS, TITLE, YTITLE, XTITLE, MVINCLUDE, MAXCYCLE, WORKSPACE.

Parameters: TRAIT, GENOTYPES, UNITERROR, ADDITIVEPREDICTORS, ADD2PREDICTORS, DOMINANCEPREDICTORS, CHROMOSOMES, POSITIONS, IDLOCI, IDMGENOTYPES, IDEFFECTS, IDPARENTS, QSTATISTICS, QEFFECTS, QSE, OUTFILENAME, DFILENAME.

Method

QSQTLSCAN fits the following mixed models repeatedly along the genome:

1) y_i = μ + Σ_f∈F x_if^add c_f^add + x_i^add α^add + G_i

if only ADDITIVEPREDICTORS are specified

2) y_i = μ + Σ_f∈F ( x_if^add c_f^add + x_if^dom c_f^dom ) + ( x_i^add α^add + x_i^dom α^dom ) + G_i

if DOMINANCEPREDICTORS are also specified

3) y_i = μ + Σ_f∈F ( x_if^add c_f^add + x_if^add2 c_f^add2 + x_if^dom c_f^dom ) + ( x_i^add α^add + x_i^add2 α^add2 + x_i^dom α^dom ) + G_i

if both ADD2PREDICTORS and DOMINANCEPREDICTORS are specified (for population type CP)

where y_i is the trait value of individual i, F is a set of cofactors (if cofactors are included in the model), and x_if^add and x_i^add are the additive genetic predictors of genotype i at the cofactor positions and at the tested position, respectively. The associated effects are denoted by c_f^add and α^add for the cofactors and the tested position, respectively. In models 2 and 3, x_if^dom and x_i^dom are the dominance genetic predictors of genotype i at the cofactor positions and at the tested position, respectively, with associated effects c_f^dom and α^dom. In model 3, x_if^add and x_i^add are the additive genetic predictors for the maternal genotype, for the cofactors and the tested position, respectively, and x_if^add2 and x_i^add2 are the equivalent additive genetic predictors for the paternal genotype. The associated effects in model 3 are given by c_f^add, c_f^add2 and c_f^dom for the cofactors, and α^add, α^add2 and α^dom for the tested position. Genetic predictors are genotypic covariables that reflect the genotypic composition of a genotype at a specific chromosome location (Lynch & Walsh 1998). The residual unexplained genetic and environmental effects are modelled by the G_i term, which is assumed to follow a Normal distribution with mean 0 and variance σ².

The procedure uses the REML directive iteratively to fit the model at each chromosome position, storing the Wald statistic for hypothesis testing. The resulting Wald statistic or the associated probability value (on a -log10 scale) can be plotted to produce the well-known profile plots used for interpretation.

Action with `RESTRICT`

Restrictions are not allowed.

Reference

Lynch, M. & Walsh, B. (1998). Genetics and Analysis of Quantitative Traits. Sinauer Associates, Sunderland, MA.

Example

CAPTION   'QSQTLSCAN example'; STYLE=meta
" load the trait data, then the map and the genetic predictor sheets "
SPLOAD    [PRINT=*] '%GENDIR%/Examples/F2maize_traits.gsh'
&         '%GENDIR%/Examples/F2maizemarkers.GWB'; SHEET='LOCI'
&         '%GENDIR%/Examples/F2maizemarkers.GWB'; SHEET='ADDPREDICTORS'
&         '%GENDIR%/Examples/F2maizemarkers.GWB'; SHEET='DOMPREDICTORS'
" create single environment "
SUBSET    [E.EQ.6] G,yld
" collect the additive and dominance predictors in pointers "
POINTER   [MODIFY=yes; NVALUES=idlocus] addpred
POINTER   [MODIFY=yes; NVALUES=idlocus] dompred
" genome scan with additive and dominance effects; results are saved "
QSQTLSCAN [PRINT=summary; POPULATIONTYPE=F2; THRMETHOD=liji;\
          STATISTICTYPE=minlog; THRESHOLD=thres; TITLE='yld Wald 2 d.f.']\
          TRAIT=yld; GENOTYPES=G; CHROMOSOMES=mkchr;\
          POSITIONS=mkpos; IDLOCI=idlocus;\
          ADDITIVEPREDICTORS=addpred; DOMINANCEPREDICTORS=dompred;\
          QSTATISTICS=minlog10p; QEFFECTS=qeff2; QSE=qse2;\
          OUTFILENAME='F2maize_single_out'

Updated on June 19, 2019
