Release 15: new features

1. Highlights

● produced in 2012

● 5 new directives, 31 new procedures and 3 new functions

● spreadsheet output of results from analysis of variance and REML (ASPREADSHEET, VSPREADSHEET)

● automatic checking of assumptions for ANOVA (ACHECK)

● D-optimal designs for nonlinear and generalized linear models (AFNONLINEAR)

● Lasso (RLASSO)

● selection of representative genotypes and markers (QGSELECT, QMKSELECT)

● quadratic discrimination (QDISCRIMINATE)

● L-splines, P-splines, penalized radial and tensor splines (LSPLINE, PSPLINE, PENSPLINE, RADIALSPLINE, TENSORSPLINE)

● alignment, baseline adjustment and finding of peaks in observed curves (ALIGNCURVE, BASELINE, PEAKFINDER)

● data mining – association rules (ASRULES) and k nearest neighbour prediction (KNEARESTNEIGHBOURS)

2. What’s new

2.1 Directives

ASRULES derives association rules from transaction data.

FCOPY makes copies of files.

FDELETE deletes files.

FRENAME renames files.

GETLOCATIONS finds locations of an identifier within a pointer, or a string within a factor or text, or a number within any numerical data structure.

2.2 Procedures

ACHECK checks assumptions for an ANOVA analysis.

ADPOLYNOMIAL plots single-factor polynomial contrasts fitted by ANOVA.

AFNONLINEAR forms D-optimal designs to estimate the parameters of a nonlinear or generalized linear model.

ALIGNCURVE forms an optimal warping to align an observed series of observations with a standard series.

AN1ADVICE aims to give useful advice if a design that is thought to be balanced fails to be analysed by ANOVA.

ASPREADSHEET saves results from an analysis of variance in a spreadsheet.

BASELINE estimates a baseline for a series of numbers whose minimum value is drifting.

CDNBLOCKDESIGN constructs a block design using CycDesigN.

CDNROWCOLUMNDESIGN constructs a row-column design using CycDesigN.

DMSCATTER produces a scatter-plot matrix for one or two sets of variables.

FDISTINCTFACTORS checks sets of factors to remove any that define duplicate classifications.

FNCORRELATION calculates correlations from variances and covariances, together with their variances and covariances.

FNLINEAR estimates linear functions of random variables, and calculates their variances and covariances.

FNPOWER estimates products of powers of two random variables, and calculates their variances and covariances.

G2AEXPORT forms a dbase file to transfer ANOVA output to Agronomix Generation II.

G2AFACTORS redefines block and treatment variables as factors.

G2VEXPORT forms a dbase file to transfer REML output to Agronomix Generation II.

KNEARESTNEIGHBOURS classifies items or predicts their responses by examining their k nearest neighbours.

LSPLINE calculates design matrices to fit a natural polynomial or trigonometric L-spline as a linear mixed model.

PEAKFINDER finds the locations of peaks in an observed series.

PENSPLINE calculates design matrices to fit a penalized spline as a linear mixed model.

PSPLINE calculates design matrices to fit a P-spline as a linear mixed model.

QDISCRIMINATE performs quadratic discrimination between groups, i.e. allowing for different variance-covariance matrices.

QGSELECT obtains a representative selection of genotypes by means of genetic distance sampling or genetic distance optimization.

QMKRECODE recodes marker scores into separate alleles.

QMKSELECT obtains a representative selection of markers by means of genetic distance sampling or genetic distance optimization.

RADIALSPLINE calculates design matrices to fit a radial-spline surface as a linear mixed model.

RLASSO performs lasso using iteratively reweighted least-squares.

TENSORSPLINE calculates design matrices to fit a tensor-spline surface as a linear mixed model.

VFPEDIGREE checks and prepares pedigree information from several factors, for use by VPEDIGREE and REML.

VSPREADSHEET saves results from a REML analysis in a spreadsheet.

2.3 Functions

IPROBIT calculates the inverse probit transformation (result in percentages).

PROBIT calculates the probit transformation for a percentage p.

REPLACE replaces values in any numerical data structure.
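As a rough illustration of what PROBIT and IPROBIT compute, the following Python sketch treats the probit of a percentage p as the standard normal quantile of p/100, and the inverse as 100 times the standard normal distribution function. This assumes the convention without the historical offset of 5. The function names `probit` and `iprobit` are illustrative Python helpers, not the GenStat functions themselves.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1).
_STD_NORMAL = NormalDist()


def probit(p):
    """Probit of a percentage p (0 < p < 100).

    Assumption: probit(p) is the standard normal quantile of p/100,
    with no offset of 5 added.
    """
    if not 0.0 < p < 100.0:
        raise ValueError("p must lie strictly between 0 and 100")
    return _STD_NORMAL.inv_cdf(p / 100.0)


def iprobit(x):
    """Inverse probit of x, returned as a percentage."""
    return 100.0 * _STD_NORMAL.cdf(x)


if __name__ == "__main__":
    z = probit(97.5)
    print(round(z, 4))           # approximately 1.96
    print(round(iprobit(z), 4))  # back to approximately 97.5
```

Under this convention the two functions are inverses of each other, so applying `iprobit` to the result of `probit` recovers the original percentage up to rounding error.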

3. What’s changed

Most of the changes are compatible with Release 14, the previous release. There are a few commands, however, where new options or parameters have been inserted into the existing lists. These may cause problems in statements where option or parameter names have been omitted or abbreviated (see Section 1.7.1 of Part 1 of the Guide to the GenStat Command Language for details). To avoid any difficulty, the name of the option/parameter after the new option/parameter should be given explicitly, and not abbreviated to fewer than four characters.
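The abbreviation problem can be sketched in a few lines of Python. The sketch assumes that an abbreviated name is matched against the option or parameter list in its defined order and that the first name beginning with the abbreviation wins; the Guide section cited above gives the authoritative rule. The two-item "old" list is a made-up fragment for illustration, not the real parameter list of AKEEP; only the relative order of CVCOVARIANCE and CSSP is taken from Section 3.4.

```python
def resolve(abbrev, names):
    """Return the first name, in list order, that starts with abbrev.

    Assumption: this first-match rule approximates how abbreviated
    option and parameter names are resolved.
    """
    key = abbrev.upper()
    for name in names:
        if name.startswith(key):
            return name
    raise ValueError("no name matches " + repr(abbrev))


# Hypothetical fragment of a parameter list before and after a new
# parameter (CVCOVARIANCE) is inserted ahead of an existing one (CSSP).
old_params = ["TERMS", "CSSP"]
new_params = ["TERMS", "CVCOVARIANCE", "CSSP"]

print(resolve("C", old_params))     # CSSP
print(resolve("C", new_params))     # CVCOVARIANCE: the meaning has changed
print(resolve("CSSP", new_params))  # CSSP: four characters stay unambiguous
```

In this sketch the one-letter abbreviation silently switches to the newly inserted name, whereas the four-character form continues to select the intended parameter.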

Any command where changes in Release 15 may cause incompatibilities in existing programs is marked in Sections 3.1 and 3.2 by the symbol ^†. The full details are given in Section 3.4.

3.1 Directives

^†AKEEP is now able to save variance-covariance matrices for covariate regression coefficients, and RTERM can save appropriate strata for assessing block terms.

DEVICE can set the resolution of hard-copy devices.

^†DKEEP can now save the lower and upper bounds for the z-axis.

FOR now provides more convenient and efficient ways of specifying an index that changes in equal increments.

FSIMILARITY now allows rectangular (between-group) similarity matrices to be printed.

GET can obtain the name of the working directory. It can also get an integer that will be unique within the current job to use, for example, to define suitable names for temporary files.

OPEN can specify custom content for the header of an HTML document.

RCYCLE can set step lengths for FITCURVE.

^†RKEEP can now save the fitted model, an indicator showing its type (regression, standard curve etc.) and an indicator to show whether or not a constant term was included.

SET now provides more flexible ways of setting default seeds to be used to generate random numbers in the various areas of GenStat. It can also set the working directory, and increase the amount of internal data space.

SETCALCULATE now enables you to control whether to substitute dummies within pointers in the expression; it also allows you to suppress the warning messages that are given when data structures in the expression have no values.

SETRELATE now enables you to control whether to substitute dummies within LEFT and RIGHT pointers.

TERMS now allows a variate of ridge values to be supplied, one for each diagonal element of the sums-of-squares-and-products matrix; it can also save the row labels of the sum-of-squares-and-products matrix.

3.2 Procedures

^†ADSPREADSHEET can now colour the cells of the spreadsheet according to levels of the design factors, and can save the spreadsheet as an Excel file.

^†APERMTEST can now plot the statistics obtained from the permutations, and can save the probabilities and critical values obtained from the permutations.

^†AYPARALLEL now allows covariates to be included in the analysis.

DIALLEL can now produce the Griffing analysis of variance.

^†HGKEEP has a new option IGNOREFAILURE that allows you to save information even if the fitting of the HGLM failed to converge.

HGPREDICT now provides a clearer description of the predictions.

^†MAANOVA now allows covariates to be included in the analysis.

^†QIBDPROBABILITIES can now calculate probabilities for backcross inbred lines.

^†QLDDECAY now uses regression to speed up the calculations, and displays quantile regression lines to help interpretation.

^†QMATCH now allows you to specify an explicit set of genotypes or markers to remove.

^†QMESTIMATE has improved output.

^†QMKDIAGNOSTICS can now save details of the genotypes and markers that have problems.

^†QSASSOCIATION provides a new fast method, as an alternative to the exact method.

^†QSESTIMATE has improved output.

^†QSIMULATE can now simulate backcross inbred lines.

^†QUANTILE can now form population quantiles instead of sample quantiles.

^†RLFUNCTIONAL has been extended to provide plots, and many additional methods.

RQUADRATIC can now save predictions, and plot the fitted quadratic surface.

RSPREADSHEET can now save the spreadsheet as an Excel file.

RYPARALLEL now allows a symmetric matrix of weights to be specified, for generalized least squares.

TABSORT now allows you to keep the levels of some of the classifying factors of the tables in their original order.

T%CONTROL allows percentages to be calculated of the means of several control levels.

UNSTACK now has a new option MVINCLUDE, to control whether to include null levels or data sets.

^†VPLOT has a new option RMETHOD (as in VKEEP) to specify which random terms to use when calculating the residuals.

^†VTCOMPARISONS can now make comparisons for every level of a groups factor.

3.3 Functions

No changes.

3.4 Incompatibilities

`ADSPREADSHEET` procedure	options `FOREGROUND`, `BACKGROUND`, `CFACTORS`, `GAPFOREGROUND`, `GAPBACKGROUND`, `YFOREGROUND`, `YBACKGROUND` and `XFOREGROUND` inserted before `SPREADSHEET`.
`AKEEP` directive	option `CBCVCOVARIANCE` inserted before `TREATMENTSTRUCTURE`; parameter `CVCOVARIANCE` inserted before `CSSP`.
`APERMTEST` procedure	option `PLOT` inserted before `NTIMES`; `SAVE` parameter has now become an option.
`AYPARALLEL` procedure	option `COVARIATE` inserted before `FACTORIAL`.
`DKEEP` directive	options `YLOWER` and `YUPPER` moved to come after `XUPPER`, and new options `ZLOWER` and `ZUPPER` inserted between `YUPPER` and `FILE`. (This is to make the ordering of the `X`, `Y` and `Z` options match that elsewhere; see e.g. `D3GRAPH`.)
`HGKEEP` procedure	option `IGNOREFAILURE` inserted before `SAVE`.
`MAANOVA` procedure	option `COVARIATE` inserted before `FACTORIAL`.
`QIBDPROBABILITIES` procedure	options `NBACKCROSSES` and `NSELFINGS` inserted before `MAPPINGFUNCTION`.
`QLDDECAY` procedure	options reordered; `SCORES` and `MAX%MISSING` options and `R2` parameter added; `DEVIANCERATIO` and `MINLOG10P` parameters removed; `decay` setting of the `PLOT` option renamed `lddecay`.
`QMATCH` procedure	options `GENSELECTION` and `MKSELECTION` inserted before `POPULATIONTYPE`.
`QMESTIMATE` procedure	option `IDPARENTS` inserted before `QTLSELECTED`.
`QMKDIAGNOSTICS` procedure	`PLOIDY` option deleted; parameters `GENCHECK` and `MKCHECK` inserted before `SUMMARY`.
`QMVREPLACE` procedure	`PLOIDY` option deleted.
`QSASSOCIATION` procedure	options `METHOD` and `SCORES` inserted before `THRESHOLD`.
`QSESTIMATE` procedure	option `IDPARENTS` inserted before `QTLSELECTED`.
`QSIMULATE` procedure	options `NBACKCROSSES` and `NSELFINGS` inserted before `GENOMELENGTH`.
`QUANTILE` procedure	option `METHOD` inserted before `PROPORTION`.
`RKEEP` directive	options `FITMODEL`, `FITCONSTANT` and `FITTYPE` inserted before `SAVE`.
`RLFUNCTIONAL` procedure	extensive redesign, with many new options and parameters; in particular the `METHOD` parameter is now an option (allowing several methods to be studied at once).
`RQLINEAR` procedure	option `CIPROBABILITY` moved to come after `SEED` (as in `SVGLM`, `SVSTRATIFIED`, `SVTABULATE` etc.).
`RQNONLINEAR` procedure	options `CIPROBABILITY` and `MAXCYCLE` moved to come after `SEED` (cf. `ANOVA` etc.).
`RQSMOOTH` procedure	option `CIPROBABILITY` moved to come after `SEED`.
`VPLOT` procedure	option `RMETHOD` inserted before `INDEX`.
`VTCOMPARISONS` procedure	option `GROUPS` inserted before `SAVE`; option `VCOVARIANCE` inserted before `STATISTIC`.

Updated on February 9, 2022
