Release 15: new features

1. Highlights

●  produced in 2012

●  5 new directives, 31 new procedures and 3 new functions

●  spreadsheet output of results from analysis of variance and REML (ASPREADSHEET, VSPREADSHEET)

●  automatic checking of assumptions for ANOVA (ACHECK)

●  D-optimal designs for nonlinear and generalized linear models (AFNONLINEAR)

●  Lasso (RLASSO)

●  selection of representative genotypes and markers (QGSELECT, QMKSELECT)

●  quadratic discrimination (QDISCRIMINATE)

●  L-splines, P-splines, penalized radial and tensor splines (LSPLINE, PSPLINE, PENSPLINE, RADIALSPLINE, TENSORSPLINE)

●  alignment, baseline adjustment and finding of peaks in observed curves (ALIGNCURVE, BASELINE, PEAKFINDER)

●  data mining – association rules (ASRULES) and k nearest neighbour prediction (KNEARESTNEIGHBOURS)

2. What’s new

2.1 Directives

ASRULES derives association rules from transaction data.

FCOPY makes copies of files.

FDELETE deletes files.

FRENAME renames files.

GETLOCATIONS finds locations of an identifier within a pointer, or a string within a factor or text, or a number within any numerical data structure.

2.2 Procedures

ACHECK checks assumptions for an ANOVA analysis.

ADPOLYNOMIAL plots single-factor polynomial contrasts fitted by ANOVA.

AFNONLINEAR forms D-optimal designs to estimate the parameters of a nonlinear or generalized linear model.

ALIGNCURVE forms an optimal warping to align an observed series of observations with a standard series.

AN1ADVICE aims to give useful advice if a design that is thought to be balanced fails to be analysed by ANOVA.

ASPREADSHEET saves results from an analysis of variance in a spreadsheet.

BASELINE estimates a baseline for a series of numbers whose minimum value is drifting.

CDNBLOCKDESIGN constructs a block design using CycDesigN.

CDNROWCOLUMNDESIGN constructs a row-column design using CycDesigN.

DMSCATTER produces a scatter-plot matrix for one or two sets of variables.

FDISTINCTFACTORS checks sets of factors to remove any that define duplicate classifications.

FNCORRELATION calculates correlations from variances and covariances, together with their variances and covariances.

FNLINEAR estimates linear functions of random variables, and calculates their variances and covariances.

FNPOWER estimates products of powers of two random variables, and calculates their variances and covariances.

G2AEXPORT forms a dbase file to transfer ANOVA output to Agronomix Generation II.

G2AFACTORS redefines block and treatment variables as factors.

G2VEXPORT forms a dbase file to transfer REML output to Agronomix Generation II.

KNEARESTNEIGHBOURS classifies items or predicts their responses by examining their k nearest neighbours.

LSPLINE calculates design matrices to fit a natural polynomial or trigonometric L-spline as a linear mixed model.

PEAKFINDER finds the locations of peaks in an observed series.

PENSPLINE calculates design matrices to fit a penalized spline as a linear mixed model.

PSPLINE calculates design matrices to fit a P-spline as a linear mixed model.

QDISCRIMINATE performs quadratic discrimination between groups, i.e. allowing for different variance-covariance matrices.

QGSELECT obtains a representative selection of genotypes by means of genetic distance sampling or genetic distance optimization.

QMKRECODE recodes marker scores into separate alleles.

QMKSELECT obtains a representative selection of markers by means of genetic distance sampling or genetic distance optimization.

RADIALSPLINE calculates design matrices to fit a radial-spline surface as a linear mixed model.

RLASSO performs lasso using iteratively reweighted least-squares.

TENSORSPLINE calculates design matrices to fit a tensor-spline surface as a linear mixed model.

VFPEDIGREE checks and prepares pedigree information from several factors, for use by VPEDIGREE and REML.

VSPREADSHEET saves results from a REML analysis in a spreadsheet.

2.3 Functions

IPROBIT calculates the inverse probit transformation (result in percentages).

PROBIT calculates the probit transformation for a percentage p.

REPLACE replaces values in any numerical data structure.
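
As a rough illustration of what a probit transformation for percentages and its inverse compute, here is a minimal Python sketch. It assumes the probit is the standard normal quantile of p/100, with no historical offset of 5 added; the notes above do not state which convention GenStat uses. The function names `probit` and `iprobit` are lower-case stand-ins for this sketch, not GenStat syntax.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1).
_STD_NORMAL = NormalDist()


def probit(p):
    """Probit of a percentage p (0 < p < 100).

    Assumption: defined as the standard normal quantile of p/100,
    without the offset of 5 used in some older conventions.
    """
    return _STD_NORMAL.inv_cdf(p / 100.0)


def iprobit(x):
    """Inverse probit: maps a normal deviate x back to a percentage."""
    return 100.0 * _STD_NORMAL.cdf(x)


if __name__ == "__main__":
    for pct in (2.5, 50.0, 97.5):
        z = probit(pct)
        print(f"p = {pct:5.1f}%  probit = {z: .4f}  back = {iprobit(z):.1f}%")
```

Under this assumption the two functions are inverses of each other, so applying `iprobit` to the result of `probit` recovers the original percentage.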

3. What’s changed

Most of the changes are compatible with Release 14, the previous release. There are a few commands, however, where new options or parameters have been inserted into the existing lists. These may cause problems in statements where option or parameter names have been omitted or abbreviated (see Section 1.7.1 of Part 1 of the Guide to the GenStat Command Language for details). To avoid any difficulty, the name of the option/parameter after the new option/parameter should be given explicitly, and not abbreviated to fewer than four characters.

Any command where changes in Release 15 may cause incompatibilities in existing programs is marked in Sections 3.1 and 3.2 by a symbol. The full details are given in Section 3.4.
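
To see why an inserted option can silently change the meaning of an existing statement, the following Python sketch models a deliberately simplified resolution rule: an abbreviated name resolves to the first option in the list that begins with it. Both the rule and the option lists (`PRINT`, `SCORES`, `SAVE` for an imaginary command) are illustrative assumptions, not GenStat's actual implementation; the Guide section cited above gives the real rules.

```python
def resolve(abbrev, options):
    """Return the first option whose name starts with abbrev.

    Simplified, assumed rule for illustration only.
    """
    prefix = abbrev.upper()
    for name in options:
        if name.startswith(prefix):
            return name
    raise ValueError(f"unknown option: {abbrev}")


# Hypothetical option lists for an imaginary command, before and after
# a new option SCORES is inserted ahead of the existing option SAVE.
old_options = ["PRINT", "SAVE"]
new_options = ["PRINT", "SCORES", "SAVE"]

if __name__ == "__main__":
    for abbrev in ("S", "SA", "SAVE"):
        before = resolve(abbrev, old_options)
        after = resolve(abbrev, new_options)
        note = "" if before == after else "  <-- meaning changed"
        print(f"{abbrev:5s} old: {before:7s} new: {after:7s}{note}")
```

In this toy model the one-letter abbreviation `S` changes meaning once `SCORES` is inserted, whereas the name written out to four characters does not, which is the motivation for the advice above.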

3.1 Directives

AKEEP is now able to save variance-covariance matrices for covariate regression coefficients, and RTERM can save appropriate strata for assessing block terms.

DEVICE can set the resolution of hard-copy devices.

DKEEP can now save the lower and upper bounds for the z-axis.

FOR now provides more convenient and efficient ways of specifying an index that changes in equal increments.

FSIMILARITY now allows rectangular (between-group) similarity matrices to be printed.

GET can obtain the name of the working directory. It can also get an integer that will be unique within the current job to use, for example, to define suitable names for temporary files.

OPEN can specify custom content for the header of an HTML document.

RCYCLE can set step lengths for FITCURVE.

RKEEP can now save the fitted model, an indicator showing its type (regression, standard curve etc.) and an indicator to show whether or not a constant term was included.

SET now provides more flexible ways of setting default seeds to be used to generate random numbers in the various areas of GenStat. It can also set the working directory, and increase the amount of internal data space.

SETCALCULATE now enables you to control whether to substitute dummies within pointers in the expression; it also lets you suppress the warning messages that are given when data structures in the expression have no values.

SETRELATE now enables you to control whether to substitute dummies within LEFT and RIGHT pointers.

TERMS now allows a variate of ridge values to be supplied, one for each diagonal element of the sums-of-squares-and-products matrix; it can also save the row labels of the sums-of-squares-and-products matrix.

3.2 Procedures

ADSPREADSHEET can now colour the cells of the spreadsheet according to levels of the design factors, and can save the spreadsheet as an Excel file.

APERMTEST can now plot the statistics obtained from the permutations, and can save the probabilities and critical values obtained from the permutations.

AYPARALLEL now allows covariates to be included in the analysis.

DIALLEL can now produce the Griffing analysis of variance.

HGKEEP has a new option IGNOREFAILURE that lets you save information even if the fitting of the HGLM failed to converge.

HGPREDICT now provides a clearer description of the predictions.

MAANOVA now allows covariates to be included in the analysis.

QIBDPROBABILITIES can now calculate probabilities for backcross inbred lines.

QLDDECAY now uses regression to speed up the calculations, and displays quantile regression lines to help interpretation.

QMATCH now lets you specify an explicit set of genotypes or markers to remove.

QMESTIMATE has improved output.

QMKDIAGNOSTICS can now save details of the genotypes and markers that have problems.

QSASSOCIATION provides a new fast method, as an alternative to the exact method.

QSESTIMATE has improved output.

QSIMULATE can now simulate backcross inbred lines.

QUANTILE can now form population quantiles instead of sample quantiles.

RLFUNCTIONAL has been extended to provide plots, and many additional methods.

RQUADRATIC can now save predictions, and plot the fitted quadratic surface.

RSPREADSHEET can now save the spreadsheet as an Excel file.

RYPARALLEL now allows a symmetric matrix of weights to be specified, for generalized least squares.

TABSORT now lets you keep the levels of some of the classifying factors of the tables in their original order.

T%CONTROL allows percentages to be calculated relative to the means of several control levels.

UNSTACK has a new option MVINCLUDE, to control whether to include null levels or data sets.

VPLOT has a new option RMETHOD (as in VKEEP) to specify which random terms to use when calculating the residuals.

VTCOMPARISONS can now make comparisons for every level of a groups factor.

3.3 Functions

No changes.

3.4 Incompatibilities

AKEEP directive option CBCVCOVARIANCE inserted before TREATMENTSTRUCTURE; parameter CVCOVARIANCE inserted before CSSP.
APERMTEST procedure option PLOT inserted before NTIMES; SAVE parameter has now become an option.
AYPARALLEL procedure option COVARIATE inserted before FACTORIAL.
DKEEP directive options YLOWER and YUPPER moved to come after XUPPER, and new options ZLOWER and ZUPPER inserted between YUPPER and FILE. (This makes the ordering of the X, Y and Z options match that elsewhere; see e.g. D3GRAPH.)
HGKEEP procedure option IGNOREFAILURE inserted before SAVE.
MAANOVA procedure option COVARIATE inserted before FACTORIAL.
QLDDECAY procedure options reordered; SCORES and MAX%MISSING options and R2 parameter added; DEVIANCERATIO and MINLOG10P parameters removed; decay setting of the PLOT option renamed lddecay.
QMESTIMATE procedure option IDPARENTS inserted before QTLSELECTED.
QMKDIAGNOSTICS procedure PLOIDY option deleted; parameters GENCHECK and MKCHECK inserted before SUMMARY.
QMVREPLACE procedure PLOIDY option deleted.
QSASSOCIATION procedure options METHOD and SCORES inserted before THRESHOLD.
QSESTIMATE procedure option IDPARENTS inserted before QTLSELECTED.
QSIMULATE procedure options NBACKCROSSES and NSELFINGS inserted before GENOMELENGTH.
QUANTILE procedure option METHOD inserted before PROPORTION.
RKEEP directive options FITMODEL, FITCONSTANT and FITTYPE inserted before SAVE.
RLFUNCTIONAL procedure extensive redesign, with many new options and parameters; in particular the METHOD parameter is now an option (enabling several methods to be studied at once).
RQLINEAR procedure option CIPROBABILITY moved to come after SEED (as in SVGLM, SVSTRATIFIED, SVTABULATE etc.).
RQNONLINEAR procedure options CIPROBABILITY and MAXCYCLE moved to come after SEED (cf. ANOVA etc.).
RQSMOOTH procedure option CIPROBABILITY moved to come after SEED.
VPLOT procedure option RMETHOD inserted before INDEX.
VTCOMPARISONS procedure option GROUPS inserted before SAVE; option VCOVARIANCE inserted before STATISTIC.
Updated on December 1, 2017
