1. Highlights

● produced in 2012

● 5 new directives, 31 new procedures and 3 new functions

● spreadsheet output of results from analysis of variance and `REML` (`ASPREADSHEET`, `VSPREADSHEET`)

● automatic checking of assumptions for `ANOVA` (`ACHECK`)

● D-optimal designs for nonlinear and generalized linear models (`AFNONLINEAR`)

● Lasso (`RLASSO`)

● selection of representative genotypes and markers (`QGSELECT`, `QMKSELECT`)

● quadratic discrimination (`QDISCRIMINATE`)

● L-splines, P-splines, penalized radial and tensor splines (`LSPLINE`, `PSPLINE`, `PENSPLINE`, `RADIALSPLINE`, `TENSORSPLINE`)

● alignment, baseline adjustment and finding of peaks in observed curves (`ALIGNCURVE`, `BASELINE`, `PEAKFINDER`)

● data mining – association rules (`ASRULES`) and *k* nearest neighbour prediction (`KNEARESTNEIGHBOURS`)

2. What’s new

2.1 Directives

`ASRULES`

derives association rules from transaction data.

`FCOPY`

makes copies of files.

`FDELETE`

deletes files.

`FRENAME`

renames files.

`GETLOCATIONS`

finds locations of an identifier within a pointer, or a string within a factor or text, or a number within any numerical data structure.

2.2 Procedures

`ACHECK`

checks assumptions for an `ANOVA` analysis.

`ADPOLYNOMIAL`

plots single-factor polynomial contrasts fitted by `ANOVA`.

`AFNONLINEAR`

forms D-optimal designs to estimate the parameters of a nonlinear or generalized linear model.

`ALIGNCURVE`

forms an optimal warping to align an observed series of observations with a standard series.

`AN1ADVICE`

aims to give useful advice if a design that is thought to be balanced fails to be analysed by `ANOVA`.

`ASPREADSHEET`

saves results from an analysis of variance in a spreadsheet.

`BASELINE`

estimates a baseline for a series of numbers whose minimum value is drifting.

`CDNBLOCKDESIGN`

constructs a block design using CycDesigN.

`CDNROWCOLUMNDESIGN`

constructs a row-column design using CycDesigN.

`DMSCATTER`

produces a scatter-plot matrix for one or two sets of variables.

`FDISTINCTFACTORS`

checks sets of factors to remove any that define duplicate classifications.

`FNCORRELATION`

calculates correlations from variances and covariances, together with their variances and covariances.

`FNLINEAR`

estimates linear functions of random variables, and calculates their variances and covariances.

`FNPOWER`

estimates products of powers of two random variables, and calculates their variances and covariances.

`G2AEXPORT`

forms a dbase file to transfer `ANOVA` output to Agronomix Generation II.

`G2AFACTORS`

redefines block and treatment variables as factors.

`G2VEXPORT`

forms a dbase file to transfer `REML` output to Agronomix Generation II.

`KNEARESTNEIGHBOURS`

classifies items or predicts their responses by examining their *k* nearest neighbours.

`LSPLINE`

calculates design matrices to fit a natural polynomial or trigonometric L-spline as a linear mixed model.

`PEAKFINDER`

finds the locations of peaks in an observed series.

`PENSPLINE`

calculates design matrices to fit a penalized spline as a linear mixed model.

`PSPLINE`

calculates design matrices to fit a P-spline as a linear mixed model.

`QDISCRIMINATE`

performs quadratic discrimination between groups, i.e. allowing for different variance-covariance matrices.

`QGSELECT`

obtains a representative selection of genotypes by means of genetic distance sampling or genetic distance optimization.

`QMKRECODE`

recodes marker scores into separate alleles.

`QMKSELECT`

obtains a representative selection of markers by means of genetic distance sampling or genetic distance optimization.

`RADIALSPLINE`

calculates design matrices to fit a radial-spline surface as a linear mixed model.

`RLASSO`

performs lasso using iteratively reweighted least-squares.

`TENSORSPLINE`

calculates design matrices to fit a tensor-spline surface as a linear mixed model.

`VFPEDIGREE`

checks and prepares pedigree information from several factors, for use by `VPEDIGREE` and `REML`.

`VSPREADSHEET`

saves results from a `REML` analysis in a spreadsheet.
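The idea behind `KNEARESTNEIGHBOURS`, classifying an item by examining its *k* nearest neighbours, can be sketched in a few lines of Python. This is only an illustration of the general technique, not the procedure's implementation: the function name `knn_classify`, the use of Euclidean distance and the majority-vote rule are all assumptions, since these notes do not describe how the procedure measures similarity.

```python
from collections import Counter
import math


def knn_classify(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    Assumption: Euclidean distance; the GenStat procedure may use other measures.
    """
    # Pair each training point's distance to the query with its label, nearest first
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Count the labels of the k closest points and return the most frequent one
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Two small, well-separated illustrative groups (made-up data)
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
labels = ["a", "a", "a", "b", "b", "b"]
predicted = knn_classify(points, labels, (0.15, 0.15), k=3)
```

Predicting a response rather than a class would replace the vote by, for example, the mean response of the *k* neighbours.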

2.3 Functions

`IPROBIT`

calculates the inverse probit transformation (result in percentages).

`PROBIT`

calculates the probit transformation for a percentage `p`.

`REPLACE`

replaces values in any numerical data structure.
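As a rough illustration of what `PROBIT` and `IPROBIT` compute, the Python sketch below works on percentages, as the descriptions above state. It assumes the plain definition of the probit as the standard Normal quantile of `p/100`, without the offset of 5 used in some older texts; these notes do not say which convention GenStat follows, and the Python function names are hypothetical.

```python
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def probit(p):
    """Probit of a percentage p (0 < p < 100): standard Normal quantile of p/100.

    Assumption: no +5 offset is applied.
    """
    return _STANDARD_NORMAL.inv_cdf(p / 100.0)


def iprobit(x):
    """Inverse probit: returns a percentage between 0 and 100."""
    return 100.0 * _STANDARD_NORMAL.cdf(x)


median_probit = probit(50.0)             # the 50% point of the Normal distribution
round_trip = iprobit(probit(97.5))       # should recover the original percentage
```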

3. What’s changed

Most of the changes are compatible with Release 14, the previous release. There are a few commands, however, where new options or parameters have been inserted into the existing lists. These may cause problems in statements where option or parameter names have been omitted or abbreviated (see Section 1.7.1 of Part 1 of the *Guide to the GenStat Command Language* for details). To avoid any difficulty, the name of the option/parameter after the new option/parameter should be given explicitly, and not abbreviated to fewer than four characters.

Any command, where changes in Release 15 may cause incompatibilities in existing programs, is marked in Sections 3.1 and 3.2 by the symbol ^{†}. The full details are given in Section 3.4.
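The abbreviation problem can be illustrated with a small Python sketch. It is a simplified model, not GenStat's parser: it assumes that an abbreviated name is compared with the option list in definition order and that the first match wins. The function `resolve` and the two option lists are hypothetical and do not belong to any real command.

```python
def resolve(abbreviation, option_list):
    """Return the first option, in definition order, that the abbreviation matches.

    Assumption: first match in definition order wins (simplified model).
    """
    target = abbreviation.upper()
    for name in option_list:
        if name.startswith(target):
            return name
    raise ValueError(f"no option matches {abbreviation!r}")


# Hypothetical option lists for an imaginary command, before and after a release
old_options = ["PRINT", "SAVE"]
new_options = ["PRINT", "SCORES", "SAVE"]  # SCORES inserted before SAVE

before = resolve("S", old_options)        # one-letter abbreviation, old list
after = resolve("S", new_options)         # same abbreviation, new list
explicit = resolve("SAVE", new_options)   # name given with four characters
```

Under this model the one-letter abbreviation silently switches to the newly inserted option, whereas the four-character name still selects the intended one.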

3.1 Directives

^{†}`AKEEP`

is now able to save variance-covariance matrices for covariate regression coefficients, and `RTERM` can save appropriate strata for assessing block terms.

`DEVICE`

can set the resolution of hard-copy devices.

^{†}`DKEEP`

can now save the lower and upper bounds for the z-axis.

`FOR`

now provides more convenient and efficient ways of specifying an index that changes in equal increments.

`FSIMILARITY`

now allows rectangular (between-group) similarity matrices to be printed.

`GET`

can obtain the name of the working directory. It can also get an integer that will be unique within the current job to use, for example, to define suitable names for temporary files.

`OPEN`

can specify custom content for the header of an HTML document.

`RCYCLE`

can set step lengths for `FITCURVE`.

^{†}`RKEEP`

can now save the fitted model, an indicator showing its type (regression, standard curve etc.) and an indicator to show whether or not a constant term was included.

`SET`

now provides more flexible ways of setting default seeds to be used to generate random numbers in the various areas of GenStat. It can also set the working directory, and increase the amount of internal data space.

`SETCALCULATE`

now enables you to control whether to substitute dummies within pointers in the expression; it also lets you suppress the warning messages that are given when data structures in the expression have no values.

`SETRELATE`

now enables you to control whether to substitute dummies within `LEFT` and `RIGHT` pointers.

`TERMS`

now allows a variate of ridge values to be supplied, one for each diagonal element of the sums-of-squares-and-products matrix; it can also save the row labels of the sum-of-squares-and-products matrix.
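The ridge values accepted by `TERMS` can be pictured with the short Python sketch below. It assumes that each ridge value is simply added to the corresponding diagonal element of the sums-of-squares-and-products matrix, as in ordinary ridge regression; these notes state only that one value is supplied per diagonal element, so the arithmetic and the function name `add_ridge` are assumptions.

```python
def add_ridge(ssp, ridge):
    """Return a copy of the square matrix `ssp` with ridge[i] added to element (i, i).

    Assumption: ridge values are added to the diagonal, as in ridge regression.
    """
    if len(ridge) != len(ssp):
        raise ValueError("need one ridge value per diagonal element")
    return [
        [value + (ridge[i] if i == j else 0.0) for j, value in enumerate(row)]
        for i, row in enumerate(ssp)
    ]


# Made-up 2 x 2 sums-of-squares-and-products matrix and ridge values
ssp = [[4.0, 1.0], [1.0, 9.0]]
ridged = add_ridge(ssp, [0.5, 0.25])
```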

3.2 Procedures

^{†}`ADSPREADSHEET`

can now colour the cells of the spreadsheet according to levels of the design factors, and can save the spreadsheet as an Excel file.

^{†}`APERMTEST`

can now plot the statistics obtained from the permutations, and can save the probabilities and critical values obtained from the permutations.

^{†}`AYPARALLEL`

now allows covariates to be included in the analysis.

`DIALLEL`

can now produce the Griffing analysis of variance.

^{†}`HGKEEP`

has a new option `IGNOREFAILURE` that lets you save information even if the fitting of the HGLM failed to converge.

`HGPREDICT`

now provides a clearer description of the predictions.

^{†}`MAANOVA`

now allows covariates to be included in the analysis.

^{†}`QIBDPROBABILITIES`

can now calculate probabilities for backcross inbred lines.

^{†}`QLDDECAY`

now uses regression to speed up the calculations, and displays quantile regression lines to help interpretation.

^{†}`QMATCH`

now lets you specify an explicit set of genotypes or markers to remove.

^{†}`QMESTIMATE`

has improved output.

^{†}`QMKDIAGNOSTICS`

can now save details of the genotypes and markers that have problems.

^{†}`QSASSOCIATION`

provides a new fast method, as an alternative to the exact method.

^{†}`QSESTIMATE`

has improved output.

^{†}`QSIMULATE`

can now simulate backcross inbred lines.

^{†}`QUANTILE`

can now form population quantiles instead of sample quantiles.

^{†}`RLFUNCTIONAL`

has been extended to provide plots, and many additional methods.

`RQUADRATIC`

can now save predictions, and plot the fitted quadratic surface.

`RSPREADSHEET`

can now save the spreadsheet as an Excel file.

`RYPARALLEL`

now allows a symmetric matrix of weights to be specified, for generalized least squares.

`TABSORT`

now lets you keep the levels of some of the classifying factors of the tables in their original order.

`T%CONTROL`

allows percentages to be calculated of the means of several control levels.

`UNSTACK`

now has a new option `MVINCLUDE`, to control whether to include null levels or data sets.

^{†}`VPLOT`

has a new option `RMETHOD` (as in `VKEEP`) to specify which random terms to use when calculating the residuals.

^{†}`VTCOMPARISONS`

can now make comparisons for every level of a groups factor.

3.3 Functions

No changes.

3.4 Incompatibilities

| Command | Change |
|---|---|
| `ADSPREADSHEET` procedure | options `FOREGROUND`, `BACKGROUND`, `CFACTORS`, `GAPFOREGROUND`, `GAPBACKGROUND`, `YFOREGROUND`, `YBACKGROUND` and `XFOREGROUND` inserted before `SPREADSHEET`. |
| `AKEEP` directive | option `CBCVCOVARIANCE` inserted before `TREATMENTSTRUCTURE`; parameter `CVCOVARIANCE` inserted before `CSSP`. |
| `APERMTEST` procedure | option `PLOT` inserted before `NTIMES`; `SAVE` parameter has now become an option. |
| `AYPARALLEL` procedure | option `COVARIATE` inserted before `FACTORIAL`. |
| `DKEEP` directive | options `YLOWER` and `YUPPER` moved to come after `XUPPER`, and new options `ZLOWER` and `ZUPPER` inserted between `YUPPER` and `FILE`. (This is to make the ordering of the `X`, `Y` and `Z` options match that elsewhere; see e.g. `D3GRAPH`.) |
| `HGKEEP` procedure | option `IGNOREFAILURE` inserted before `SAVE`. |
| `MAANOVA` procedure | option `COVARIATE` inserted before `FACTORIAL`. |
| `QIBDPROBABILITIES` procedure | options `NBACKCROSSES` and `NSELFINGS` inserted before `MAPPINGFUNCTION`. |
| `QLDDECAY` procedure | options reordered; `SCORES` and `MAX%MISSING` options and `R2` parameter added; `DEVIANCERATIO` and `MINLOG10P` parameters removed; `decay` setting of the `PLOT` option renamed `lddecay`. |
| `QMATCH` procedure | options `GENSELECTION` and `MKSELECTION` inserted before `POPULATIONTYPE`. |
| `QMESTIMATE` procedure | option `IDPARENTS` inserted before `QTLSELECTED`. |
| `QMKDIAGNOSTICS` procedure | `PLOIDY` option deleted; parameters `GENCHECK` and `MKCHECK` inserted before `SUMMARY`. |
| `QMVREPLACE` procedure | `PLOIDY` option deleted. |
| `QSASSOCIATION` procedure | options `METHOD` and `SCORES` inserted before `THRESHOLD`. |
| `QSESTIMATE` procedure | option `IDPARENTS` inserted before `QTLSELECTED`. |
| `QSIMULATE` procedure | options `NBACKCROSSES` and `NSELFINGS` inserted before `GENOMELENGTH`. |
| `QUANTILE` procedure | option `METHOD` inserted before `PROPORTION`. |
| `RKEEP` directive | options `FITMODEL`, `FITCONSTANT` and `FITTYPE` inserted before `SAVE`. |
| `RLFUNCTIONAL` procedure | extensive redesign, with many new options and parameters; in particular the `METHOD` parameter is now an option (enabling several methods to be studied at once). |
| `RQLINEAR` procedure | option `CIPROBABILITY` moved to come after `SEED` (as in `SVGLM`, `SVSTRATIFIED`, `SVTABULATE` etc.). |
| `RQNONLINEAR` procedure | options `CIPROBABILITY` and `MAXCYCLE` moved to come after `SEED` (cf. `ANOVA` etc.). |
| `RQSMOOTH` procedure | option `CIPROBABILITY` moved to come after `SEED`. |
| `VPLOT` procedure | option `RMETHOD` inserted before `INDEX`. |
| `VTCOMPARISONS` procedure | option `GROUPS` inserted before `SAVE`; option `VCOVARIANCE` inserted before `STATISTIC`. |