Release 14: new features

1. Highlights

● produced in 2011

● 4 new directives, 32 new procedures

● graphics – automatic definition of sequences of colours (DCOLOURS), 2-d graphs with marginal distribution plots alongside the x- and y-axes (DXYGRAPH), keys in trellis plots (TRELLIS)

● random permutation tests for two-dimensional contingency tables (CHIPERMTEST)

● regression – within-dose error estimation (FITINDIVIDUALLY), automatic storage of results in spreadsheets (RSPREADSHEET), Bradley-Terry model for paired-comparison preference tests (RBRADLEYTERRY), logistic ridge regression (LRIDGE)

● analysis of variance – automatic selection of best method for tricky data sets (AOVANYHOW, AOVDISPLAY)

● REML – comparison contrasts (VTCOMPARISONS), easier assessment of random models (VRACCUMULATE), easier meta-analysis specification (VRMETA)

● design – spreadsheet representation of field plans (ADSPREADSHEET)

● QTL analysis extensions – construction of genetic linkage maps (QMAP, QLINKAGEGROUPS, QRECOMBINATIONS), association mapping of data from multi-environment trials (QMASSOCIATION), construction of kinship matrices (QKINSHIPMATRIX), simulation (QSIMULATE)

● data mining – radial basis functions (RBFIT, RBDISPLAY, RBPREDICT)

● time series – Kalman filter (KALMAN), harmonic and cross-spectrum analyses (DFOURIER, MCROSSPECTRUM)

● survey analysis – CSPro data files (CSPRO)

2. What’s new

2.1 Directives

RBDISPLAY displays output from a radial basis function model fitted by RBFIT.

RBFIT fits a radial basis function model.

RBPREDICT forms predictions from a radial basis function model fitted by RBFIT.

TXREPLACE replaces a subtext within a text structure.

In addition, several directives with clearer or more appropriate names have been introduced as replacements for existing directives. The earlier names are currently retained as synonyms so that earlier programs continue to run, but they may be removed in a future release.

HREDUCE forms a reduced similarity matrix referring to the GROUPS instead of the original units (replaces REDUCE).

PCORELATE relates the observed values on a set of variables to the results of a principal coordinates analysis (replaces RELATE).

TFILTER filters time series by time-series models (replaces FILTER).

TFIT estimates parameters in Box-Jenkins models for time series (replaces ESTIMATE).

TFORECAST forecasts future values of a time series (replaces FORECAST).

2.2 Procedures

ADSPREADSHEET puts the data and plan of an experimental design into GenStat spreadsheets.

AEFFICIENCY calculates efficiency factors for experimental designs.

AOVANYHOW performs analysis of variance using ANOVA, regression or REML as appropriate.

AOVDISPLAY provides further output from an analysis by AOVANYHOW.

CHIPERMTEST performs a random permutation test for a two-dimensional contingency table.

CSPRO reads a data set from a CSPro survey data file and dictionary, and loads it into GenStat or puts it into a spreadsheet file.

DCOLOURS forms a band of graduated colours for graphics.

DFOURIER performs a harmonic analysis of a univariate time series.

DKALMAN plots results from an analysis by KALMAN.

DQRECOMBINATIONS plots a matrix of recombination frequencies between markers.

DXYGRAPH draws two-dimensional graphs with marginal distribution plots alongside the y- and x-axes.

FACCOMBINATIONS forms a factor to indicate observations with identical combinations of values of a set of variates, texts or factors.

FACUNIQUE redefines a factor so that its levels and labels are unique.

FUNIQUEVALUES redefines a variate or text so that its values are unique.

KALMAN calculates estimates from the Kalman filter.

LRIDGE does logistic ridge regression.

MCROSSPECTRUM performs a spectral analysis of a multiple time series.

MINFIELDWIDTH calculates minimum field widths for printing data structures.

QKINSHIPMATRIX forms a kinship matrix from molecular markers.

QLINKAGEGROUPS forms linkage groups using marker data from experimental populations.

QMAP constructs genetic linkage maps using marker data from experimental populations.

QMASSOCIATION performs multi-environment marker-trait association analysis in a genetically diverse population using bi-allelic and multi-allelic markers.

QMATCH matches different data structures to be used in QTL estimation.

QRECOMBINATIONS calculates the expected numbers of recombinations and the recombination frequencies between markers.

QSASSOCIATION performs marker-trait association analysis in a genetically diverse population using bi-allelic and multi-allelic markers. (This procedure replaces QASSOCIATION, which is retained as a synonym.)

QSIMULATE simulates marker data and QTL effects for single and multiple environment trials.

RBRADLEYTERRY fits the Bradley-Terry model for paired-comparison preference tests.

RSPREADSHEET puts results from a regression, generalized linear or nonlinear model into GenStat spreadsheets.

SPCOMBINE combines spreadsheet and data files, without reading them into GenStat.

TXPAD pads strings of a text structure with extra characters so that their lengths are equal.

VRACCUMULATE forms a summary accumulating the results of a sequence of REML random models.

VRMETA forms the random model for a REML meta analysis.

VTCOMPARISONS calculates comparison contrasts within a multi-way table of predicted means from a REML analysis.

3. What’s changed

Most of the changes are compatible with Release 13, the previous release. There are a few commands, however, where new options or parameters have been inserted into the existing lists. These may cause problems in statements where option or parameter names have been omitted or abbreviated (see Section 1.7.1 of Part 1 of the Guide to the GenStat Command Language for details). To avoid any difficulty, the name of the option/parameter after the new option/parameter should be given explicitly, and not abbreviated to fewer than four characters.

Any command, where changes in Release 14 may cause incompatibilities in existing programs, is marked in Sections 3.1 and 3.2 by the symbol ^†. The full details are given in Section 3.4.

3.1 Directives

ADD, DROP, FIT, FITCURVE, FITNONLINEAR, RDISPLAY, SWITCH, STEP and TRY now allow bic as a synonym for sic for requesting the Schwarz Bayes information criterion.

ADD, DROP, FIT and SWITCH also have a new option AOVDESCRIPTION that allows you to supply your own description to use for the lines that they add to the accumulated analysis of variance (or deviance) table when their POOL option is set to yes.

ANOVA option ORTHOGONAL now has settings assumed and notassumed, that should be used instead of the previous settings yes and no. This is to rationalize the syntax – the aim is that options with settings yes and no should not have any other settings (ORTHOGONAL also has a setting compulsory). However, yes is retained as a synonym for assumed, so that existing programs will still run (no is, of course, a valid abbreviation of notassumed).

DELETE now allows data structures to be removed completely; their identifiers are then deleted as well as their attributes and values.

^†FCA now saves the factor score coefficients with the COEFFICIENTS parameter, and uses the SCORES parameter to save the factor scores.

FCLASSIFICATION can now save details of the functions defined in a formula.

^†MODEL and ^†RDISPLAY now allow you to specify the number of degrees of freedom for a value specified by the DISPERSION option. You might want to use this, for example, if you had estimated the dispersion from some other data set.
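As a minimal sketch of how this might look, the statement below supplies a dispersion value together with its degrees of freedom. The option name DFDISPERSION is taken from Section 3.4; the variate name Count, the save identifier Msave and both numerical values are hypothetical. Because DFDISPERSION was inserted before SAVE, SAVE is written out in full here rather than abbreviated.

```genstat
" Sketch only: Count and Msave are hypothetical identifiers, and the
  dispersion (1.4) and its degrees of freedom (20) are made-up values,
  e.g. as if estimated from an earlier data set. "
MODEL [DISTRIBUTION=poisson; DISPERSION=1.4; DFDISPERSION=20; SAVE=Msave] Count
```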

^†PEN now allows you to select plotting symbols by name as an alternative to giving their numbers. So you can now set parameter SYMBOL='Cross' or SYMBOL='Circle' instead of SYMBOL=1 or SYMBOL=2. This should make the specification much clearer and easier. A consequence, though, is that you can no longer use the SYMBOL parameter to specify character labels to use instead of symbols. However, this has been discouraged since Release 2, when the LABELS parameter was introduced! The preferred (and now only) way to use the letter A, for example, instead of a symbol is to specify

SYMBOL=0; LABELS='A';

So you will no longer be able to set SYMBOL to a text with several lines, to plot a different label at each point (you will need to use LABELS instead). PEN will continue to allow you to set SYMBOL to a factor, e.g. F, as a synonym for

SYMBOL=0; LABELS=F;

as this is unambiguous (but still discouraged!). Finally, to emphasize that SYMBOL is designed to set a single symbol for each pen, the documentation has been changed to define the parameter name as SYMBOL instead of SYMBOLS. PEN will still recognise SYMBOLS to allow existing programs to continue to run, but the element for symbol in the SAVE pointer will now have the label 'symbol' instead of 'symbols'.
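The PEN settings described above could be written as complete statements along the following lines. This is a sketch under stated assumptions: the pen numbers and the text structure Lab are hypothetical, chosen only for illustration.

```genstat
" Pen 1: select the plotting symbol by name (new in Release 14). "
PEN NUMBER=1; SYMBOL='Circle'

" Pen 2: plot the letter A at each point instead of a symbol. "
PEN NUMBER=2; SYMBOL=0; LABELS='A'

" Pen 3: a different label at each point; Lab is a hypothetical text
  structure. Previously this could be passed to SYMBOL, but it must now
  be passed to LABELS. "
TEXT [VALUES='a','b','c'] Lab
PEN NUMBER=3; SYMBOL=0; LABELS=Lab
```

Existing programs that set SYMBOL to a multi-line text would need to be changed to the third form.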

POINTER has a new option EXTEND that allows you to extend an existing pointer e.g. to add new labels.

PREDICT now allows WEIGHTS tables classified by factors whose levels are restricted by its LEVELS parameter, and the LEVELS parameter can now specify texts (to identify the levels by their labels).

^†RKEEP can save the maximal model (as defined by TERMS).

3.2 Procedures

^†APOLYNOMIAL is now able to print equations for all the single-factor polynomial terms in an ANOVA analysis automatically.

ASCREEN can now print efficiency factors.

^†A2KEEP can now save the analysis-of-variance table.

BOXPLOT can now plot a nonparametric estimate of the standard error of the median, suppress the labels on the outlying points, and plot multiple variates at the same time as groups.

BRDISPLAY now prints %variance accounted for and %ss changes.

^†DISCRIMINATE can now estimate error rates, using cross-validation, jackknife or bootstrap.

DCORRELATION now allows weights for the units.

DREPMEASURES now allows the data to be supplied in a single variate, instead of separate variates (one for each time).

^†DQMKSCORES now allows you to supply a title.

DQMAP can now display QTLs.

^†DQMQTLSCAN and ^†DQSQTLSCAN now allow you to provide labels to identify the parents with the Data Information tool in the GenStat Graphics Viewer. DQSQTLSCAN can also now plot the QTL effects.

^†DVARIOGRAM and ^†MVARIOGRAM now allow the power, stable, exponential, Gaussian, pentaspherical, spherical, cubic and circular functions to be anisotropic (see the MODEL and ISOTROPY options).

FACLEVSTANDARDIZE can remove unused levels.

FITINDIVIDUALLY can now estimate the lack of fit of a regression model.

FITINDIVIDUALLY, FITMULTINOMIAL, NLAR1, RAR1, RNEGBINOMIAL, RQUADRATIC, RSEARCH and VAIC now allow bic as a synonym for sic for requesting the Schwarz Bayes information criterion.

FMFACTORS now allows the codes to be specified by factors (as well as texts and variates).

^†FRESTRICTEDSET now fails incompatible restrictions (instead of giving a warning). It also now allows you to redefine the levels (and labels) of factors to remove any that do not occur in the restricted subset.

FFRAME: the settings of the DEFINE option have been changed to windows and nothing (but yes is still recognised as a synonym for windows).

^†GLMM now allows you to include units with missing values in the explanatory vectors or y-variate.

HGPREDICT now allows the option setting ADJUST=marginal when a WEIGHTS table is supplied.

^†QMBACKSELECT, ^†QMESTIMATE and ^†QMQTLSCAN now provide multiple-population analyses.

TRELLIS plots now include keys when necessary to identify the items being plotted.

^†VAIC can now print (and save) changes in the coefficients.

3.3 Functions

In the 14th Edition Service Pack 1, the meaning of the third argument of the probability functions for the Studentized maximum modulus distribution (CLSMMODULUS, CUSMMODULUS, EDSMMODULUS and PRSMMODULUS) has been changed: you now supply the number of means (n), rather than the number of comparisons (n×(n-1)/2). These functions are therefore now consistent with the probability functions for the Studentized range (CLSRANGE, CUSRANGE, EDSRANGE and PRSRANGE), which removes a source of confusion.
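A small worked case may help when updating existing programs. Writing m for the number of pairwise comparisons (a symbol introduced here for illustration), the old and new values of the third argument are related as follows, shown for an illustrative n = 5:

```latex
\[
m = \frac{n(n-1)}{2}
\qquad\Longleftrightarrow\qquad
n = \frac{1 + \sqrt{1 + 8m}}{2}
\]
\[
n = 5 \;\Rightarrow\; m = \frac{5 \times 4}{2} = 10,
\qquad
m = 10 \;\Rightarrow\; n = \frac{1 + \sqrt{81}}{2} = 5 .
\]
```

So a call that previously passed 10 as the third argument would now pass 5.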

3.4 Incompatibilities

`A2KEEP` procedure	option `AOVTABLE` inserted before `RMETHOD`.
`APOLYNOMIAL` procedure	has been completely revised so that the terms whose polynomial equations are required are now specified by a model formula, using the `TERMS` parameter, instead of by the `FACTOR` and `GROUPS` parameters. Also the `SAVE` parameter is now an option. Simple commands to give the equation for a single factor are unchanged, but those for multi-factor terms must be respecified. In case of problems, however, the earlier procedure is retained as `OLDAPOLYNOMIAL`.
`CLSMMODULUS` function	meaning of the third argument has been changed in the 14th Edition Service Pack 1 so that you now supply the number of means (`n`), rather than the number of comparisons (`n`×(`n`-1)/2).
`CUSMMODULUS` function	meaning of the third argument has been changed in the 14th Edition Service Pack 1 so that you now supply the number of means (`n`), rather than the number of comparisons (`n`×(`n`-1)/2).
`DISCRIMINATE` procedure	options `VALIDATIONMETHOD`, `NSIMULATIONS`, `NCROSSVALIDATIONGROUPS` and `SEED` inserted before `YROOT`.
`DQMKSCORES` procedure	option `TITLE` inserted before `COLOURS`.
`DQMQTLSCAN` procedure	parameter `IDPARENTS` inserted before `DFILENAME`.
`DQSQTLSCAN` procedure	option `WINDOW` moved so that `WINDOW` and new option `KEYWINDOW` come before `SCREEN`; parameters `QEFFECTS`, `QSE` and `IDPARENTS` inserted before `DFILENAME`.
`DVARIOGRAM` procedure	option `ISOTROPY` inserted before `WINDOW`.
`EDSMMODULUS` function	meaning of the third argument has been changed in the 14th Edition Service Pack 1 so that you now supply the number of means (`n`), rather than the number of comparisons (`n`×(`n`-1)/2).
`FCA` directive	parameters are now in the order `DATA`, `NUNITS`, `LRV`, `SSPM`, `COMMUNALITIES`, `COEFFICIENTS`, `SCORES`, `RESIDUALS`, `CRESIDUALS` and `VRESIDUALS`; the factor score coefficients are now saved by the `COEFFICIENTS` parameter, and the `SCORES` parameter saves the factor scores.
`FRESTRICTEDSET` procedure	now fails incompatible restrictions (instead of just giving a warning).
`GLMM` procedure	option `MVINCLUDE` inserted before `MAXCYCLE`.
`MODEL` directive	option `DFDISPERSION` inserted before `SAVE`.
`MVARIOGRAM` procedure	option `SMOOTHNESS` moved to come after `CONSTANT`, option `ISOTROPY` inserted between `SMOOTHNESS` and `WINDOW`, and parameter `INITIAL` inserted before `ESTIMATES`.
`PEN` directive	no longer allows the legacy (and discouraged) use of the `SYMBOL` parameter to specify a text of character labels to plot at the points; instead you must specify these using the `LABELS` parameter, and set `SYMBOL=0`.
`PRSMMODULUS` function	meaning of the third argument has been changed in the 14th Edition Service Pack 1 so that you now supply the number of means (`n`), rather than the number of comparisons (`n`×(`n`-1)/2).
`QIBDPROBABILITIES` procedure	now takes input as GenStat data structures instead of files: parameters `FILENAME` and `MAPFILENAME` are replaced by new parameters `MKSCORES`, `CHROMOSOMES`, `POSITIONS`, `MKNAMES`, `IDMGENOTYPES`, `PARENTS`, `IDPARENTS` and `PEDIGREE`; the original parameters `CHROMOSOMES` and `POSITIONS` are renamed `SCHROMOSOMES` and `SPOSITIONS`; new parameter `MKLOCI` inserted before `NLOCI`.
`QMBACKSELECT` procedure	parameter `POPULATIONS` inserted before `UNITERROR`.
`QMESTIMATE` procedure	option `POPULATIONTYPE` inserted before `VCMODEL`; parameter `POPULATIONS` inserted before `UNITERROR`, `MKLOCI` inserted before `IDMGENOTYPES`, and `QSAVE` inserted before `SAVE`.
`QMVREPLACE` procedure	now only replaces missing values; the matching of data structures is now in `QMATCH`.
`QMKDIAGNOSTICS` procedure	option `PLOIDY` inserted before `DCHROMOSOMES`, and parameters `PARENTS` and `IDPARENTS` inserted before `SUMMARY`.
`QMQTLSCAN` procedure	parameter `POPULATIONS` inserted before `UNITERROR`.
`QSASSOCIATION` procedure	as well as the change of name (from `QASSOCIATION`), option `MINORALLELE` is inserted before `KMATRIX`, parameter `NDF` is inserted before `MINLOG10P`, and `QSAVE` is inserted before `SAVE`.
`QSESTIMATE` procedure	option `POPULATIONTYPE` inserted before `FIXED`; parameter `MKLOCI` inserted before `IDMGENOTYPES`, and `QSAVE` inserted before `SAVE`.
`RDISPLAY` directive	option `DFDISPERSION` inserted before `SAVE`.
`RKEEP` directive	option `MAXIMALMODEL` inserted before `SAVE`.
`VAIC` procedure	parameter `CHANGES` inserted before `SAVE`.

Updated on June 19, 2019
