Estimates linkage disequilibrium (LD) decay along a chromosome (M. Malosetti & J.T.N.M. Thissen).
Options
PRINT = string token | What to print (progress); default * |
---|---|
PLOT = string tokens | What to plot (ldmatrix, lddecay); default ldde |
RELATIONSHIPMODEL = string token | What model to use to account for genetic relatedness (eigenanalysis, subpopulations, null); default eige |
SCORES = pointer | Provides the scores of significant principal components, obtained from an eigenvalue analysis |
SUBPOPULATIONS = factor | Defines groupings of genotypes into subpopulations |
CHRANALYSE = scalar | Defines which chromosome to analyse, using a level of the CHROMOSOMES factor |
MAX%MISSING = scalar | Markers with more than the specified % of missing values will be excluded from the LD calculations; default 20 |
MAXDISTANCE = scalar | Defines the maximum distance between markers to show in LD plots; default 30 |
TITLE = text | General title for the plots |
YTITLE = text | Title for the y-axis |
XTITLE = text | Title for the x-axis |
Parameters
MKSCORES = pointers | Genotype codes for each marker; must be set |
---|---|
CHROMOSOMES = factors | Linkage groups for the markers; must be set |
POSITIONS = variates | Positions within the linkage groups of markers; must be set |
MKNAMES = texts | Names of the markers, used to label the rows and columns of the output matrices |
DISTANCES = symmetric matrices | Saves the distances between markers |
R2 = symmetric matrices | Saves the values of r² between markers |
Description
QLDDECAY estimates linkage disequilibrium (LD) between pairs of markers on a chromosome. The association between two markers is assessed by a linear regression model, with one marker set as response and the second one as regressor, and LD is expressed in terms of r² values.
The model to account for genetic relatedness between genotypes is specified by the RELATIONSHIPMODEL option, with one of the following settings:
eigenanalysis | infers the underlying genetic substructure in the population by retaining the most significant principal components from the molecular marker matrix (Patterson et al. 2006). The scores of the significant axes are used as covariables in the regression model, which effectively approximates the structuring of the genetic variance-covariance matrix by a coefficient of coancestry matrix (kinship matrix); |
---|---|
subpopulations | includes a factor supplied by the SUBPOPULATIONS option in the regression model (imposing a constant covariance between genotypes within the same subpopulation); |
null | makes no correction for genetic relatedness. |
By default RELATIONSHIPMODEL=eigenanalysis; the scores of the significant axes are then calculated by the QEIGENANALYSIS procedure with options STANDARDIZE=frequency and SCALE=none. Alternatively, scores calculated elsewhere can be supplied, in a pointer, using the option SCORES.
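As an illustration, the three relatedness models might be requested as sketched below. The identifiers scores, mkchr, mkpos and mknames are taken from the Example section; subpop (a factor giving the subpopulation of each genotype), pcscores (a pointer holding principal component scores calculated elsewhere) and the r2... matrices are hypothetical names introduced only for this sketch.

```genstat
" correct for relatedness with a grouping factor;
  subpop is a hypothetical factor, not part of the example data "
QLDDECAY [RELATIONSHIPMODEL=subpopulations; SUBPOPULATIONS=subpop;\
         CHRANALYSE=2] \
         scores; CHROMOSOMES=mkchr; POSITIONS=mkpos;\
         MKNAMES=mknames; R2=r2subpop

" supply scores calculated elsewhere instead of letting the procedure
  calculate them; pcscores is a hypothetical pointer "
QLDDECAY [RELATIONSHIPMODEL=eigenanalysis; SCORES=pcscores;\
         CHRANALYSE=2] \
         scores; CHROMOSOMES=mkchr; POSITIONS=mkpos;\
         MKNAMES=mknames; R2=r2eigen

" no correction for relatedness "
QLDDECAY [RELATIONSHIPMODEL=null; CHRANALYSE=2] \
         scores; CHROMOSOMES=mkchr; POSITIONS=mkpos;\
         MKNAMES=mknames; R2=r2null
```

Saving the r² matrices under different names, as here, would allow the effect of the correction to be compared between models.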
LD is estimated per chromosome. Markers with too many missing values are excluded from the calculation; the threshold is specified by the MAX%MISSING option (default 20, i.e. 20%). While LD is calculated along the whole of the chromosome, one expects LD to decay at relatively short distances. Therefore, when plotting r² values versus marker distances, only pairs of markers that are closer than the value specified by the MAXDISTANCE option are displayed (default 30).
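A minimal sketch of how these two thresholds might be changed is given below. The values 10 and 15 are arbitrary illustrations, and the data identifiers are those of the Example section.

```genstat
" exclude markers with more than 10% missing values, and show only
  marker pairs closer than 15 in the LD decay plot "
QLDDECAY [PLOT=lddecay; CHRANALYSE=2;\
         MAX%MISSING=10; MAXDISTANCE=15] \
         scores; CHROMOSOMES=mkchr; POSITIONS=mkpos; MKNAMES=mknames
```

Note that, according to the description above, MAXDISTANCE affects only what is displayed; LD is still calculated along the whole chromosome.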
The marker scores are supplied by the MKSCORES parameter, in a pointer containing a factor for each marker. The corresponding map information for the markers is supplied by the CHROMOSOMES and POSITIONS parameters. The CHRANALYSE option must be set to specify the chromosome for which the analysis is to be performed.
The parameter MKNAMES can be used to supply marker names that will be used to name rows and columns of output matrices. The DISTANCES parameter can save a symmetric matrix of distances between the markers, and the R2 parameter can save a symmetric matrix of r² values between markers.
The PRINT option can be set to progress, to monitor the progress of the analysis.
The PLOT option selects the graphs to plot, with settings:
lddecay | plots the r² values against the marker distance, and |
---|---|
ldmatrix | gives a shade plot of the LD matrix. |
By default PLOT=lddecay. The TITLE option can be used to provide a title for the graphs, and the YTITLE and XTITLE options can supply titles for the y- and x-axis, respectively.
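For example, both graphs could be requested with custom titles along the following lines. The title strings are invented for illustration, the data identifiers are those of the Example section, and how the axis titles are applied to each type of plot is not stated here, so this is a sketch rather than a guaranteed layout.

```genstat
" request both plots, with illustrative titles "
QLDDECAY [PLOT=lddecay,ldmatrix; CHRANALYSE=2;\
         TITLE='LD decay on chromosome 2';\
         YTITLE='r2'; XTITLE='Distance between markers'] \
         scores; CHROMOSOMES=mkchr; POSITIONS=mkpos; MKNAMES=mknames
```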
Options: PRINT, PLOT, RELATIONSHIPMODEL, SCORES, SUBPOPULATIONS, CHRANALYSE, MAX%MISSING, MAXDISTANCE, TITLE, YTITLE, XTITLE.
Parameters: MKSCORES, CHROMOSOMES, POSITIONS, MKNAMES, DISTANCES, R2.
Method
QLDDECAY handles any type of marker, taking the first allele as reference (if a bi-allelic marker) or the most frequent allele if a marker has multiple alleles. The procedure fits a linear regression with one marker taken as response and a second one used as regressor. To account for genetic relatedness, the model can also include extra covariables (either principal component scores, or a grouping factor). Models are fitted using RYPARALLEL to perform several fits in parallel. From each fit the r² value is stored as a measure of LD between the markers. Plots are produced to display results according to the settings of the PLOT option.
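The three model forms described above can be written out as follows. The notation is not from the source and is introduced only for this sketch: y_i and x_i denote the numeric codings of the response and regressor markers for genotype i (relative to the reference allele), s_ik is the score of genotype i on the k-th retained principal component, and g_j(i) is the effect of the subpopulation to which genotype i belongs. The exact marker coding and the precise definition of r² used by the procedure are not specified here.

```latex
\begin{align*}
\text{null:}           \quad & y_i = \mu + \beta x_i + e_i \\
\text{eigenanalysis:}  \quad & y_i = \mu + \sum_{k=1}^{K} \gamma_k s_{ik} + \beta x_i + e_i \\
\text{subpopulations:} \quad & y_i = \mu + g_{j(i)} + \beta x_i + e_i
\end{align*}
```

One such model is fitted for each pair of markers on the chromosome being analysed, and the r² from each fit fills the corresponding cell of the LD matrix.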
Action with RESTRICT
Restrictions are not allowed.
See also
Procedures: QEIGENANALYSIS, QMASSOCIATION, QSASSOCIATION.
Commands for: Statistical genetics and QTL estimation.
Example
CAPTION 'QLDDECAY example'; STYLE=meta
QIMPORT [POPULATION=amp] '%GENDIR%/Examples/LD_example_geno.txt';\
        MAPFILE='%GENDIR%/Examples/LD_example_map.txt';\
        MKSCORES=scores; CHROMOSOMES=mkchr; POSITIONS=mkpos;\
        MKNAMES=mknames
" calculate LD decay with eigenanalysis "
QLDDECAY [PRINT=progress; PLOT=lddecay,ldmatrix;\
         RELATIONSHIPMODEL=eigenanalysis;\
         CHRANALYSE=2; MAXDISTANCE=25] \
         scores; CHROMOSOMES=mkchr; POSITIONS=mkpos;\
         MKNAMES=mknames; DISTANCE=distance; R2=r2
PRINT r2