QLDDECAY procedure

Estimates linkage disequilibrium (LD) decay along a chromosome (M. Malosetti & J.T.N.M. Thissen).

Options

PRINT = string token What to print (progress); default *
PLOT = string tokens What to plot (ldmatrix, lddecay); default ldde
RELATIONSHIPMODEL = string token What model to use to account for genetic relatedness (eigenanalysis, subpopulations, null); default eige
SCORES = pointer Provides the scores of significant principal components, obtained from an eigenvalue analysis
SUBPOPULATIONS = factor Defines groupings of genotypes into subpopulations
CHRANALYSE = scalar Defines which chromosome to analyse, using a level of the CHROMOSOMES factor
MAX%MISSING = scalar Markers with more than the specified % of missing values will be excluded from the LD calculations; default 20
MAXDISTANCE = scalar Defines the maximum distance between markers to show in LD plots; default 30
TITLE = text General title for the plots
YTITLE = text Title for the y-axis
XTITLE = text Title for the x-axis

Parameters

MKSCORES = pointers Genotype codes for each marker; must be set
CHROMOSOMES = factors Linkage groups for the markers; must be set
POSITIONS = variates Positions within the linkage groups of markers; must be set
MKNAMES = texts Names of the markers, used to label the rows and columns of the output matrices
DISTANCES = symmetric matrices Saves the distances between markers
R2 = symmetric matrices Saves the values of r² between markers

Description

QLDDECAY estimates linkage disequilibrium (LD) between pairs of markers on a chromosome. The association between two markers is assessed by a linear regression model in which one marker is the response and the other is the regressor, and LD is expressed in terms of r² values.
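
As a rough illustration of the r² measure (not the procedure's own code), the Python sketch below computes r² for a simple regression of one marker on another. It assumes that each marker has already been coded numerically as 0/1 and has no missing values; the function name r_squared and the example data are hypothetical.

```python
def r_squared(y, x):
    """r^2 of the simple linear regression of y on x.

    With a single regressor this equals the squared Pearson correlation.
    """
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        return float("nan")  # a monomorphic marker carries no information
    return sxy ** 2 / (sxx * syy)


# Hypothetical 0/1 codes for two markers scored on eight genotypes
marker_a = [0, 0, 1, 1, 0, 1, 1, 0]
marker_b = [0, 0, 1, 1, 0, 1, 0, 1]
ld = r_squared(marker_a, marker_b)
```

Because r² from a one-regressor model equals the squared correlation, the value is the same whichever marker is taken as the response.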

The model to account for genetic relatedness between genotypes is specified by the RELATIONSHIPMODEL option, with one of the following settings:

    eigenanalysis infers the underlying genetic substructure in the population by retaining the most significant principal components from the molecular marker matrix (Patterson et al. 2006). The scores of the significant axes are used as covariables in the regression model, which effectively approximates the structuring of the genetic variance-covariance matrix by a coefficient of coancestry (kinship) matrix;
    subpopulations includes a factor supplied by the SUBPOPULATIONS option in the regression model (imposing a constant covariance between genotypes within the same subpopulation);
    null makes no correction for genetic relatedness.

By default RELATIONSHIPMODEL=eigenanalysis; the scores of the significant axes are then calculated by the QEIGENANALYSIS procedure with options STANDARDIZE=frequency and SCALE=none. Alternatively, scores calculated elsewhere can be supplied, in a pointer, using the option SCORES.
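
To show why a relatedness correction matters, the following Python sketch compares an unadjusted r² with a version adjusted for a grouping factor, in the spirit of the subpopulations setting. It is an illustration only. It assumes 0/1 marker codes, and it assumes that the adjusted LD is the squared partial correlation between the two markers after centring each within its subpopulation; the documentation does not state how QLDDECAY itself derives r² from the adjusted model. All names and data are hypothetical.

```python
def r_squared(y, x):
    """Squared Pearson correlation between two equal-length numeric lists."""
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy ** 2 / (sxx * syy)


def centre_within_groups(values, groups):
    """Subtract each group's mean, i.e. take residuals from a one-factor model."""
    totals, counts = {}, {}
    for v, g in zip(values, groups):
        totals[g] = totals.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]


def partial_r_squared(y, x, groups):
    """Assumed adjusted LD: squared partial correlation given the factor."""
    return r_squared(centre_within_groups(y, groups),
                     centre_within_groups(x, groups))


# Two hypothetical subpopulations with very different allele frequencies
subpop = [1] * 6 + [2] * 6
marker_a = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
marker_b = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0]

raw = r_squared(marker_a, marker_b)
adjusted = partial_r_squared(marker_a, marker_b, subpop)
```

In this constructed data set the two subpopulations differ strongly in allele frequency, so the unadjusted r² (about 0.11) overstates the association that remains within subpopulations (0.04).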

LD is estimated per chromosome. Markers with too many missing values are excluded from the calculation; the threshold is set by the MAX%MISSING option (default 20, i.e. 20%). Although LD is calculated along the whole of the chromosome, it is expected to decay over relatively short distances. The plot of r² values against marker distance therefore displays only pairs of markers that are closer than the value specified by the MAXDISTANCE option (default 30).
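
The two thresholds can be pictured with a small Python sketch. This is a hedged illustration of the rules described above, not the procedure's implementation: missing scores are represented by None, and the function names, marker names and positions are all hypothetical.

```python
def percent_missing(scores):
    """Percentage of missing (None) entries in one marker's scores."""
    return 100.0 * sum(s is None for s in scores) / len(scores)


def usable_markers(marker_scores, max_pct_missing=20.0):
    """Names of markers whose missing percentage does not exceed the threshold."""
    return [name for name, scores in marker_scores.items()
            if percent_missing(scores) <= max_pct_missing]


def pairs_to_plot(positions, names, max_distance=30.0):
    """(marker, marker, distance) for pairs closer than max_distance."""
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            distance = abs(positions[a] - positions[b])
            if distance < max_distance:
                pairs.append((a, b, distance))
    return pairs


marker_scores = {
    "m1": [0, 1, None, 1, 0, None, 1, 1, 0, 0],     # 20% missing: kept
    "m2": [None, 1, None, 1, 0, None, 1, 1, 0, 0],  # 30% missing: excluded
    "m3": [1, 1, 0, 1, 0, 0, 1, 1, 0, 0],
    "m4": [0, 0, 0, 1, 1, 0, 1, 0, 1, 0],
}
positions = {"m1": 0.0, "m2": 5.0, "m3": 12.5, "m4": 45.0}

kept = usable_markers(marker_scores)      # markers entering the LD calculation
plotted = pairs_to_plot(positions, kept)  # pairs shown in the decay plot
```

Note that the distance cut-off only limits what is displayed; in the description above, LD is still calculated for all retained pairs on the chromosome.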

The marker scores are supplied by the MKSCORES parameter, in a pointer containing a factor for each marker. The corresponding map information for the markers is supplied by the CHROMOSOMES and POSITIONS parameters. The CHRANALYSE option must be set to specify the chromosome for which the analysis is to be performed.

The MKNAMES parameter can supply marker names, which are used to label the rows and columns of the output matrices. The DISTANCES parameter can save a symmetric matrix of distances between the markers, and the R2 parameter can save a symmetric matrix of r² values between markers.

The PRINT option can be set to progress, to monitor the progress of the analysis.

The PLOT option selects the graphs to plot, with settings:

    lddecay plots the r² values against the distance between markers, and
    ldmatrix gives a shade plot of the LD matrix.

By default PLOT=lddecay. The TITLE option can be used to provide a title for the graphs, and the YTITLE and XTITLE options can supply titles for the y- and x-axis, respectively.

Options: PRINT, PLOT, RELATIONSHIPMODEL, SCORES, SUBPOPULATIONS, CHRANALYSE, MAX%MISSING, MAXDISTANCE, TITLE, YTITLE, XTITLE.

Parameters: MKSCORES, CHROMOSOMES, POSITIONS, MKNAMES, DISTANCES, R2.

Method

QLDDECAY handles any type of marker, taking the first allele as reference for a bi-allelic marker, or the most frequent allele for a marker with multiple alleles. The procedure fits a linear regression with one marker taken as response and a second one used as regressor. To account for genetic relatedness, the model can also include extra covariables (either principal component scores, or a grouping factor). Models are fitted using RYPARALLEL to perform several fits in parallel. From each fit the r² value is stored as a measure of LD between the markers. Plots are produced to display results according to the settings of the PLOT option.
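
The choice of reference allele can be sketched in Python as follows. This is an assumption-laden illustration rather than the procedure's code: "first allele" is taken to mean the first observed allele in sorted order, each marker is then coded as a 0/1 indicator of its reference allele, missing scores are None, and the function names are hypothetical.

```python
from collections import Counter


def reference_allele(genotypes):
    """Pick the reference allele for one marker (None marks a missing score)."""
    observed = [g for g in genotypes if g is not None]
    if not observed:
        raise ValueError("marker has no observed scores")
    alleles = sorted(set(observed))
    if len(alleles) <= 2:
        # Assumption: "first allele" means first in sorted order
        return alleles[0]
    counts = Counter(observed)
    # Most frequent allele; ties go to the allele that sorts first
    return max(alleles, key=lambda a: counts[a])


def to_indicator(genotypes):
    """Code a marker as 1 for the reference allele, 0 otherwise, None if missing."""
    ref = reference_allele(genotypes)
    return [None if g is None else int(g == ref) for g in genotypes]


biallelic = ["B", "A", "B", "B"]
multiallelic = ["A", "C", "C", None, "B", "C"]
coded = to_indicator(multiallelic)
```

A numeric coding of this kind is what allows a marker of any type to enter a linear regression as either response or regressor.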

Action with RESTRICT

Restrictions are not allowed.

See also

Procedures: QEIGENANALYSIS, QMASSOCIATION, QSASSOCIATION.

Commands for: Statistical genetics and QTL estimation.

Example

CAPTION  'QLDDECAY example'; STYLE=meta
" import the marker scores and the map information "
QIMPORT  [POPULATION=amp] '%GENDIR%/Examples/LD_example_geno.txt';\
         MAPFILE='%GENDIR%/Examples/LD_example_map.txt';\
         MKSCORES=scores; CHROMOSOMES=mkchr; POSITIONS=mkpos;\
         MKNAMES=mknames
" calculate LD decay on chromosome 2 with eigenanalysis, saving distances and r2 values "
QLDDECAY [PRINT=progress; PLOT=lddecay,ldmatrix;\
         RELATIONSHIPMODEL=eigenanalysis;\
         CHRANALYSE=2; MAXDISTANCE=25] \
         scores; CHROMOSOMES=mkchr; POSITIONS=mkpos;\
         MKNAMES=mknames; DISTANCE=distance; R2=r2
" print the saved matrix of r2 values "
PRINT    r2
Updated on March 6, 2019
