QMKDIAGNOSTICS procedure

Generates descriptive statistics and diagnostic plots of molecular marker data (D.A. Murray, S.J. Welham, M. Malosetti, M.P. Boer, L.C.P. Keizer & J.T.N.M. Thissen).

Options

PRINT = string tokens What to print (summary, missingvalues, frequencies); default summ, miss, freq
PLOT = string tokens What to plot (missingvalues, frequencies, probabilities, genotypes, map); default miss, geno, map
GEN%MISSING = scalar Threshold for printing genotypes with many missing values (i.e. genotypes with a higher percentage of missing values than the specified value); default 10
MK%MISSING = scalar Threshold for printing markers with many missing values (i.e. markers with a higher percentage of missing values than the specified value); default 10
MK%EXTREME = scalar Threshold for printing markers with rare alleles (i.e. alleles present with a lower percentage than the specified threshold); default 10
POPULATIONTYPE = string token Type of population (BC1, DH1, F2, RIL, BCxSy, CP, AMP); must be set
NGENERATIONS = scalar Number of generations for a RIL population; default 6
NBACKCROSSES = scalar Number of backcrosses; must be set for a BCxSy population
NSELFINGS = scalar Number of selfings; must be set for a BCxSy population
DCHROMOSOMES = variate, text or scalar Specifies a subset of the linkage groups to be displayed
PDIRECTION = string token How to sort the probabilities when PRINT=frequencies with BC1, DH1, F2, RIL and BCxSy populations (ascending, descending); default * i.e. no sorting

Parameters

MKSCORES = pointers Genotype codes for each marker; must be set
CHROMOSOMES = factors Linkage groups for the markers; must be set
POSITIONS = variates Positions within the linkage groups of markers; must be set
MKNAMES = texts Marker names; must be set
IDMGENOTYPES = texts Labels for genotypes corresponding to the marker scores
PARENTS = pointers Parent information
IDPARENTS = texts Labels to identify the parents
GENCHECK = variates Logical variates containing the value one for genotypes with missing value problems, according to the setting of the GEN%MISSING option, and zero otherwise
MKCHECK = variates Logical variates containing the value one for markers with missing or extreme value problems, as defined by the MK%MISSING and MK%EXTREME options, and zero otherwise
SUMMARY = pointers Saves a summary of counts and probabilities for the chi-square tests for BC1, DH1, F2, RIL and BCxSy populations

Description

QMKDIAGNOSTICS generates descriptive statistics and diagnostic plots of molecular marker data. The marker scores must be supplied in a pointer by the MKSCORES parameter. The length of the MKSCORES pointer must be equal to the number of markers, and each structure in the pointer must be a factor with labels. The population type must be specified by the POPULATIONTYPE option. For a RIL population, the number of generations is specified by the NGENERATIONS option (default 6). For a BCxSy population, the numbers of backcrosses and selfings are supplied by the NBACKCROSSES and NSELFINGS options, respectively. The labels for the genotypes corresponding to the marker scores can be supplied by the IDMGENOTYPES parameter.

The corresponding map information for the markers must be supplied by the CHROMOSOMES and POSITIONS parameters, and the labels of the markers must be supplied by the MKNAMES parameter.

The parent information must be supplied using the PARENTS parameter in a pointer to a set of texts. The first text in the pointer defines the alleles for parent 1, the second text defines the alleles for parent 2, and so on. The labels for the parents are supplied in a text using the IDPARENTS parameter.
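
As a sketch of how these inputs fit together for a BCxSy population, the call below assumes marker data from two backcrosses followed by three selfings. The identifiers bc_scores, bc_chromo, bc_pos, bc_names, bc_parents and bc_idparents are hypothetical; they are assumed to have been formed already, for example by QIMPORT as in the Example section.

```genstat
" Hypothetical BC2S3 population: all identifiers are illustrative "
QMKDIAGNOSTICS [POPULATIONTYPE=BCxSy; NBACKCROSSES=2; NSELFINGS=3]\
               MKSCORES=bc_scores; CHROMOSOMES=bc_chromo; POSITIONS=bc_pos;\
               MKNAMES=bc_names; PARENTS=bc_parents; IDPARENTS=bc_idparents
```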

The PRINT option controls printed output, with settings:

    summary to print the number of genotypes and markers, and summary statistics per chromosome,
    missingvalues to print the genotypes with percentages of missing values greater than GEN%MISSING and the markers with percentages of missing values greater than MK%MISSING,
    frequencies to print the allele frequencies of all markers with allele frequencies greater than MK%EXTREME for an AMP population, or the frequencies of genotype codes for markers for BC1, DH1, F2, RIL and BCxSy populations.

By default PRINT = summary, missingvalues, frequencies. If PRINT=frequencies or PLOT=probabilities, the output for BC1, DH1, F2, RIL and BCxSy populations includes the probabilities from the chi-square tests of Mendelian segregation; the expected ratios are defined in the Method section. The summary table of genotype code frequencies can be sorted into ascending or descending order of probabilities by setting the PDIRECTION option.
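
For instance, the following sketch prints only the frequency table for the F2 data of the Example section, sorted so that the markers with the smallest chi-square probabilities come first. It assumes that the structures m_scores2, m_chromo2 and so on have already been formed by QIMPORT, and that setting PLOT=* suppresses the graphics, following the usual Genstat convention for string-token options.

```genstat
" Frequencies only, sorted by ascending chi-square probability; no plots "
QMKDIAGNOSTICS [PRINT=frequencies; PLOT=*; POPULATIONTYPE=F2;\
               PDIRECTION=ascending]\
               m_scores2; CHROMOSOMES=m_chromo2; POSITIONS=m_pos2;\
               MKNAMES=m_names2; PARENTS=parents2; IDPARENTS=idparents2
```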

The PLOT option controls graphical output, with settings:

    missingvalues to produce a trellis plot of percentages of missing values against the map position for each linkage group and a plot of missing marker scores using the DQMKSCORES procedure,
    frequencies to produce a trellis plot of the allele frequency percentages against the map position for each linkage group (for AMP population only),
    probabilities to produce a trellis plot of the chi-square probabilities, plotted on a -log10 scale against the map position for each linkage group (for BC1, DH1, F2, RIL and BCxSy populations only),
    genotypes to plot all graphical genotypes, and
    map to plot the linkage map.

By default PLOT = missingvalues, genotypes, map.
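
As an illustration, the call below requests only the missing-value plots and the plot of chi-square probabilities for the DH1 data of the Example section. It assumes that m_scores1 and the associated structures have been formed by QIMPORT, and that PRINT=* suppresses the printed output.

```genstat
" Plots of missing values and chi-square probabilities only; no printed output "
QMKDIAGNOSTICS [PRINT=*; PLOT=missingvalues,probabilities;\
               POPULATIONTYPE=DH1]\
               m_scores1; CHROMOSOMES=m_chromo1; POSITIONS=m_pos1;\
               MKNAMES=m_names1; PARENTS=parents1; IDPARENTS=idparents1
```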

The DCHROMOSOMES option can be used to select a subset of the linkage groups to display. The setting can be either a variate or scalar to define a subset using the levels of the CHROMOSOMES factor, or a text to define a subset using its labels.
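
The two forms of the setting might be used as sketched below for the DH1 data of the Example section. The labels '1H' and '5H' are hypothetical, and should be replaced by labels that actually occur in the CHROMOSOMES factor.

```genstat
" Subset defined by a level of the CHROMOSOMES factor (scalar setting) "
QMKDIAGNOSTICS [POPULATIONTYPE=DH1; DCHROMOSOMES=3]\
               m_scores1; CHROMOSOMES=m_chromo1; POSITIONS=m_pos1;\
               MKNAMES=m_names1; PARENTS=parents1; IDPARENTS=idparents1
" Subset defined by labels of the CHROMOSOMES factor (hypothetical labels) "
QMKDIAGNOSTICS [POPULATIONTYPE=DH1; DCHROMOSOMES=!t('1H','5H')]\
               m_scores1; CHROMOSOMES=m_chromo1; POSITIONS=m_pos1;\
               MKNAMES=m_names1; PARENTS=parents1; IDPARENTS=idparents1
```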

The GENCHECK parameter can save a logical variate identifying the genotypes whose percentage of missing values is below (value zero) or above (value one) the threshold set by the GEN%MISSING option. Similarly, the MKCHECK parameter can save a logical variate identifying the markers that have problems of missing or extreme values, according to the settings of the MK%MISSING and MK%EXTREME options.
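
A minimal sketch of saving and inspecting these variates is given below, using the DH1 data of the Example section with both missing-value thresholds raised to 20%. The identifiers gencheck1, mkcheck1 and nflagged are illustrative, and mkcheck1 is assumed to have one value per marker, in the same order as m_names1.

```genstat
" Save the flags for genotypes and markers, using 20% missing-value thresholds "
QMKDIAGNOSTICS [PRINT=missingvalues; PLOT=*; POPULATIONTYPE=DH1;\
               GEN%MISSING=20; MK%MISSING=20]\
               m_scores1; CHROMOSOMES=m_chromo1; POSITIONS=m_pos1;\
               MKNAMES=m_names1; PARENTS=parents1; IDPARENTS=idparents1;\
               GENCHECK=gencheck1; MKCHECK=mkcheck1
" Count the flagged markers and list the flags alongside the marker names "
CALCULATE      nflagged = SUM(mkcheck1)
PRINT          nflagged
PRINT          m_names1,mkcheck1; DECIMALS=0
```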

The SUMMARY parameter can save a pointer containing the structures that are printed when PRINT=frequencies for F2, BC1, DH1 and RIL populations. This contains the marker number, the marker name, the chromosome number, the position on the chromosome, percentage missing, the allele frequencies and the chi-square probability.
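
The saved pointer can then be examined with standard directives. The sketch below prints each structure of summary2 in turn for the F2 data of the Example section; the SERIAL option of PRINT is used so that no assumption is needed about the structures having equal lengths.

```genstat
" Save the summary for the F2 data, then print its structures one after another "
QMKDIAGNOSTICS [PRINT=*; PLOT=*; POPULATIONTYPE=F2]\
               m_scores2; CHROMOSOMES=m_chromo2; POSITIONS=m_pos2;\
               MKNAMES=m_names2; PARENTS=parents2; IDPARENTS=idparents2;\
               SUMMARY=summary2
PRINT          [SERIAL=yes] summary2[]
```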

Options: PRINT, PLOT, GEN%MISSING, MK%MISSING, MK%EXTREME, POPULATIONTYPE, NGENERATIONS, NBACKCROSSES, NSELFINGS, DCHROMOSOMES, PDIRECTION.

Parameters: MKSCORES, CHROMOSOMES, POSITIONS, MKNAMES, IDMGENOTYPES, PARENTS, IDPARENTS, GENCHECK, MKCHECK, SUMMARY.

Method

For each marker, the segregation is evaluated against the expected allele frequencies using a chi-square test. The expected ratios are as follows:

Population   Alleles             Expected ratio
BC1          1/1 : 1/2           1 : 1
DH1          1/1 : 2/2           1 : 1
F2           1/1 : 1/2 : 2/2     1 : 2 : 1
             1/1 : 2/-           1 : 3
             2/2 : 1/-           1 : 3
RILn         1/1 : 1/2 : 2/2     2^(n-1) - 1 : 2 : 2^(n-1) - 1
             1/1 : 2/-           2^(n-1) - 1 : 2^(n-1) + 1
             2/2 : 1/-           2^(n-1) - 1 : 2^(n-1) + 1
BCxSy        1/1 : 1/2 : 2/2     2^(x+y+1) - 2^y - 1 : 2 : 2^y - 1
             1/1 : 2/-           2^(x+y+1) - 2^y - 1 : 2^y + 1
             2/2 : 1/-           2^y - 1 : 2^(x+y+1) - 2^y + 1

where 1 is the allele for parent 1, 2 is the allele for parent 2, n is the number of RIL generations, and x and y are the number of backcrosses and selfings, respectively, for a BCxSy population.
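
To illustrate the calculation, the sketch below evaluates the homozygote term of the RIL ratio for n = 6, and carries out a chi-square test for a single F2 marker with codominant scores. The counts are invented for illustration, and the test is assumed to be a Pearson chi-square with degrees of freedom equal to the number of genotype classes minus one; the calculation inside the procedure may differ in detail.

```genstat
" Homozygote term of the RIL ratio: 2**(n-1) - 1, giving 31 : 2 : 31 for n = 6 "
SCALAR    n; VALUE=6
CALCULATE nhom = 2**(n-1) - 1
PRINT     nhom
" Chi-square test of 1 : 2 : 1 segregation for one F2 marker (invented counts) "
VARIATE   [VALUES=30,50,20] observed
VARIATE   [VALUES=1,2,1] ratio
CALCULATE expected = SUM(observed) * ratio / SUM(ratio)
CALCULATE chisq = SUM((observed - expected)**2 / expected)
" Upper-tail probability on (number of classes - 1) degrees of freedom "
CALCULATE prob = CUCHISQUARE(chisq; NVALUES(observed) - 1)
PRINT     chisq,prob
```

With these counts the expected values are 25, 50 and 25, so the statistic is 2 on 2 degrees of freedom and the upper-tail probability is exp(-1), approximately 0.368.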

Action with RESTRICT

Restrictions are not allowed.

See also

Procedures: DQMAP, DQMKSCORES, DQMQTLSCAN, DQSQTLSCAN, QMKRECODE.

Commands for: Statistical genetics and QTL estimation, Graphics.

Example

CAPTION        'QMKDIAGNOSTICS example'; STYLE=meta
" SxM DH1 population "
QIMPORT        [POPULATIONTYPE=DH1] '%GENDIR%/Examples/SxM_geno.txt';\ 
               MAPFILE='%GENDIR%/Examples/SxM_map.txt';\ 
               MKSCORES=m_scores1; CHROMOSOMES=m_chromo1; POSITIONS=m_pos1;\ 
               MKNAMES=m_names1; PARENTS=parents1; IDPARENTS=idparents1
QMKDIAGNOSTICS [POPULATIONTYPE=DH1] m_scores1;\
               CHROMOSOMES=m_chromo1; POSITIONS=m_pos1;\ 
               MKNAMES=m_names1; SUMMARY=summary1; PARENTS=parents1;\
               IDPARENTS=idparents1
" F2 population "
QIMPORT        [POPULATIONTYPE=F2] '%GENDIR%/Examples/F2maize_geno.txt';\ 
               MAPFILE='%GENDIR%/Examples/F2maize_map.txt';\ 
               MKSCORES=m_scores2; CHROMOSOMES=m_chromo2; POSITIONS=m_pos2;\ 
               MKNAMES=m_names2; PARENTS=parents2; IDPARENTS=idparents2
QMKDIAGNOSTICS [POPULATIONTYPE=F2; DCHROMOSOMES=!(1,5); PDIRECTION=asce]\ 
               m_scores2; CHROMOSOMES=m_chromo2; POSITIONS=m_pos2;\ 
               MKNAMES=m_names2; SUMMARY=summary2; PARENTS=parents2;\
               IDPARENTS=idparents2
" CP population "
QIMPORT        [POPULATIONTYPE=CP] '%GENDIR%/Examples/CPapple_geno.txt';\ 
               MAPFILE='%GENDIR%/Examples/CPapple_map.txt';\ 
               MKSCORES=m_scores3; CHROMOSOMES=m_chromo3; POSITIONS=m_pos3;\ 
               MKNAMES=m_names3; PARENTS=parents3; IDPARENTS=idparents3
QMKDIAGNOSTICS [POPULATIONTYPE=CP; PDIRECTION=asce]\ 
               m_scores3; CHROMOSOMES=m_chromo3; POSITIONS=m_pos3;\ 
               MKNAMES=m_names3; SUMMARY=summary3; PARENTS=parents3;\
               IDPARENTS=idparents3
" AMP population "
QIMPORT        [POPULATIONTYPE=AMP] '%GENDIR%/Examples/LD_match_geno.txt';\ 
               MAPFILE='%GENDIR%/Examples/LD_match_map.txt';\ 
               MKSCORES=m_scores4; CHROMOSOMES=m_chromo4; POSITIONS=m_pos4; \
               MKNAMES=m_names4
QMKDIAGNOSTICS [POPULATIONTYPE=AMP; PDIRECTION=asce]\ 
               m_scores4; CHROMOSOMES=m_chromo4; POSITIONS=m_pos4;\ 
               MKNAMES=m_names4; SUMMARY=summary4
Updated on March 6, 2019
