Imports genotypic and phenotypic data for QTL analysis (D.A. Murray).
|What to print (
||Type of population (
||Character representing a missing genotype in Flapjack or R/QTL format; default
||Character separating data values in Flapjack format; default separates them by tabs|
||Character separating allele values in Flapjack format; default
||Specifies whether the genotypes or markers are stored on the rows in Flapjack format (
||Number of parents in Flapjack file; default 0 for population AMP, 4 for CP, and 2 otherwise|
||For data in Flapjack format, this sets a limit on the number of markers that may be found to contain errors before the import is abandoned; default 200|
||Whether to remove markers with errors in Flapjack format automatically (
||Name of the file for import|
||Name of the map file (Flapjack or MapQTL(R))|
||Name of the phenotypic file (MapQTL(R))|
||Saves the genotype codes for each marker|
||Saves the trait data from the phenotypic file|
||Saves linkage groups for each marker|
||Saves positions of the markers within linkage groups|
||Saves the marker names|
||Saves marker sets|
||Labels for genotypes|
||Saves the parent information|
||Saves the labels used to identify the parents|
||Specifies a file containing genotype labels for MapQTL(R) files; if unset, they are assumed to be in the
||Specifies the names of any markers to exclude from an import in Flapjack format|
||In Flapjack format, this saves the names of any markers that contain errors|
||In Flapjack format, this saves a pointer to texts that identify any errors in the marker-by-genotype (individual) scores|
||Specifies the name of a Genstat workbook (
QIMPORT loads genotypic and phenotypic data for QTL analysis. The name of the genotypic data file to be imported is specified by the
FILENAME parameter. The format of the file to be imported is specified by the file extension, and can be either a Flapjack text genotype file (
.txt), a MapQTL(R) Locus genotype file (
.loc) or a comma-delimited text (
.csv). The format of the
.csv file is an extended R/QTL separate genotype data
.csv file format, which can include an extra column for the marker sets.
If a Flapjack genotype or MapQTL(R) Locus genotype file name is supplied, the associated map information can be supplied by setting the
MAPFILENAME parameter to a file name with the extension
.txt for Flapjack or
.map for MapQTL(R). For Flapjack and R/QTL formats, the
POPULATIONTYPE option must be set to specify the population from which the genotypes come. For MapQTL(R), the population is determined from the
.loc file. The
MISSING option can specify a character to identify missing genotypes in Flapjack genotype files and R/QTL files. By default, Genstat expects the genotype data in Flapjack files to be tab-delimited, but the
SEPARATOR option can be used to specify an alternative separator. Similarly, by default, Genstat expects the alleles for each genotype to be separated using a
'/' character, but an alternative can be supplied using the
ASEPARATOR option. For the Flapjack genotype format, the
FJROWS option indicates whether the genotypes or markers are stored in the rows of the file; by default the genotypes are in the rows.
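As an illustration of the Flapjack options described above, the following sketch imports a comma-separated Flapjack file. The file names are hypothetical, and supplying the MISSING, SEPARATOR and ASEPARATOR settings as single quoted characters is an assumption rather than something stated here.

```genstat
"Sketch only: hypothetical file names; the quoted single-character"
"settings for MISSING, SEPARATOR and ASEPARATOR are assumptions"
QIMPORT [POPULATIONTYPE=F2; MISSING='-'; SEPARATOR=','; ASEPARATOR=':']\
        FILENAME='mycross_geno.txt';\
        MAPFILENAME='mycross_map.txt';\
        MKSCORES=mgenotypes; CHROMOSOMES=linkagegroups;\
        POSITIONS=lpos; MKNAMES=markers
```

FJROWS is left unset here, so the genotypes are expected in the rows of the file.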
The marker scores for the genotypes are stored in a set of factors in the pointer supplied by the
MKSCORES parameter. Each factor within the pointer will contain data for a marker, with factor labels supplied in the same order.
When importing genotypic data the linkage groups for each marker, marker names and positions are saved using the
POSITIONS parameters, respectively. If a
.csv file is imported, any marker sets within the file can be saved using the
MKSETS parameter. The grouping factor identifying marker sets in a
.csv file can be saved using the
CP populations, the parent information and associated names can be saved using the
IDPARENTS parameters respectively.
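A minimal sketch of importing an extended R/QTL .csv file and saving its marker sets might look as follows. The file name and the identifiers on the right-hand sides are hypothetical; POPULATIONTYPE is set because it is required for the R/QTL format.

```genstat
"Sketch only: mycross.csv and the saved identifiers are hypothetical"
QIMPORT [POPULATIONTYPE=F2]\
        FILENAME='mycross.csv';\
        MKSCORES=mgenotypes; CHROMOSOMES=linkagegroups;\
        POSITIONS=lpos; MKNAMES=markers;\
        MKSETS=msets
```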
The genotype labels can be saved using the
IDMGENOTYPES parameter. By default, for MapQTL(R) locus and map files, the genotype labels are the values 1 to n. However, Genstat allows individual names to be included at the bottom of the locus file, below the genotype data. The file should then include the instruction
followed by each individual name on a separate line in the same order as that in which the genotypes are specified for each locus. Alternatively, a text file containing the genotype labels can be supplied using the
IDFILENAME parameter; each individual name should then be on a separate line in the same order as that in which the genotypes are specified for each locus in the
For data in Flapjack format, markers can be excluded by setting the
EXCLUDEMARKERS parameter to a text containing the names of the markers to omit. When importing Flapjack genotypic data, the parental and individual scores are checked for errors. You can set option
MKREMOVE=yes to remove automatically, from the imported data, any markers that are found to contain errors. The
NMKERROR option sets a limit on the number of markers that may be found to contain errors before the import is abandoned; default 200. The names of any markers that contain errors in the parent or individual genotype scores can be saved, in a text, using the
MKERROR parameter. The
ERRORLOCATIONS parameter can save a pointer containing a text with marker names and a text with genotype names, identifying the marker × genotype locations of any marker score errors.
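The error-handling settings above can be combined in a single import, as in the following sketch. The file names, the marker names in the text omit, and the limit of 50 are all hypothetical values chosen for illustration.

```genstat
"Sketch only: hypothetical files and marker names; NMKERROR=50 is arbitrary"
TEXT    [VALUES='mk012','mk047'] omit
QIMPORT [POPULATIONTYPE=F2; MKREMOVE=yes; NMKERROR=50]\
        FILENAME='mycross_geno.txt';\
        MAPFILENAME='mycross_map.txt';\
        MKSCORES=mgenotypes; CHROMOSOMES=linkagegroups;\
        POSITIONS=lpos; MKNAMES=markers;\
        EXCLUDEMARKERS=omit; MKERROR=badmarkers;\
        ERRORLOCATIONS=errorloc
```

After the import, badmarkers should hold the names of any markers found to contain errors, and errorloc the pointer identifying the marker × genotype locations of those errors.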
||produces a summary listing attributes of the data that have been read and, for phenotypic data, a list of the data structures that have been imported,|
||gives a report of any errors in genotypic data that have been read in Flapjack format.|
Phenotypic data in MapQTL(R) quantitative data files (
.qua) can be imported by supplying the name of the file with the
PHEFILENAME parameter. The
TRAITS parameter can be set to a pointer to store the identifiers (i.e. column names) read from the file. The pointer can then be used to refer to the variates containing the loaded data.
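For MapQTL(R) data, a sketch of an import that reads the locus, map and quantitative data files together could be written as below. All file names are hypothetical, as is the separate label file given by IDFILENAME; POPULATIONTYPE is not set because the population is determined from the .loc file.

```genstat
"Sketch only: all file names and saved identifiers are hypothetical"
QIMPORT FILENAME='mycross.loc'; MAPFILENAME='mycross.map';\
        PHEFILENAME='mycross.qua'; IDFILENAME='mycross_ids.txt';\
        MKSCORES=mgenotypes; TRAITS=traits;\
        CHROMOSOMES=linkagegroups; POSITIONS=lpos;\
        MKNAMES=markers; IDMGENOTYPES=idgeno
```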
OUTFILENAME can specify the name of a Genstat workbook (
.gwb) file to save the marker scores and associated information.
QEXPORT procedure for further details of the file formats. Data in Flapjack format are read and checked using the
Dataload dll, and the valid data are passed back to Genstat using temporary files.
Commands for: Statistical genetics and QTL estimation.
CAPTION     'QIMPORT example'; STYLE=meta
QIMPORT     [POPULATION=F2]\
            FILENAME='%GENDIR%/Examples/F2maize_geno.txt';\
            MAPFILENAME='%GENDIR%/Examples/F2maize_map.txt';\
            MKSCORES=mgenotypes; CHROMOSOMES=linkagegroups;\
            POSITIONS=lpos; MKNAMES=markers;\
            PARENTS=parents; IDPARENTS=idparents
DQMAP       CHROMOSOMES=linkagegroups; POSITIONS=lpos; MKNAMES=markers
DQMKSCORES  [POPULATIONTYPE=F2; PLOT=all] mgenotypes;\
            CHROMOSOMES=linkagegroups; PARENTS=parents; IDPARENTS=idparents