Procedure Library: Instructions for Authors

The Genstat Procedure Library is controlled by an Editorial Committee whose current members are given at the top of the help page that lists the Authors of procedures.

Procedures submitted to the Library are checked to ensure that they are useful and, more importantly, that they are reliable. To save time and effort in refereeing, please ensure that you supply all the information listed in Section 1 below, and try to follow the guidelines of style given in Sections 3 and 4. Remember that the easier a procedure is to assess, the more likely the referee is to be sympathetic and the faster you are likely to receive a report! As explained below, the Editors try to ensure consistency of syntax between submitted procedures and the Genstat directives and the procedures already in the Library. Often this is the main area where changes are needed to a submitted procedure so, to avoid wasted effort, we are happy to advise on the syntax of proposed procedures before the algorithmic part is complete.

1. Submissions

Please send submissions to roger.payne@vsni.co.uk.

The following information should be supplied.

a)       The code of the procedure following as far as possible the style described in Section 4.

b)       Information for the on-line help, including a simple example, to illustrate how to use the procedure. Output from the example should not be too voluminous and should be interspersed with captions to explain what is going on. Further details are given in Section 8.

c)       Test programs for the referees to use to check that the procedure produces correct and accurate results. These should be well commented and should check every aspect of the procedure. Please also supply evidence that the output is correct: for example by showing that the results agree with examples taken from books or papers, or by including simple examples that can be checked by hand.

d)       Any additional information to assist the editor and referees, for example references to books or papers to explain the method, flowcharts (if appropriate), comparisons with existing procedures, and so on.

2. Arguments

Information is transferred to and from a procedure by means of the arguments specified by its options and parameters, defined using the OPTION and PARAMETER directives. Information about how these work can be found in Section 5.3 of the Guide to the Genstat Command Language, Part 1, Syntax and Data Management. In addition to the OPTION and PARAMETER directives, please also use the CALLS directive to list any Library procedures that are used within your procedure.

Please try to avoid making the user specify information that can easily be determined within the procedure itself. For example, there is no need to require the length of a vector to be specified as well as the vector itself; this can easily be determined by using the NVALUES function in a CALCULATE statement. The GETATTRIBUTE directive can also be useful in this context. Relevant information, such as the current treatment formula (as set by the TREATMENTSTRUCTURE directive), can be obtained using the GET directive. The AKEEP, RKEEP and VKEEP directives can also be useful for information concerned with analysis of designed experiments, or regression and generalized linear models or REML.

Where some of the arguments are vectors, you should consider what will happen if they are restricted. The standard rule, used in the Genstat directives, is that only one vector of those in its current set of arguments needs to be restricted for the directive to treat them all as being restricted in the same way. A fault is reported if any are restricted in different ways. See the COMPATIBLE parameter of the OPTION and PARAMETER directives for ways of automatically checking the compatibility of the restrictions of vectors set by options and parameters of the procedure.

You should also take account of the possibility that numerical structures may contain missing values.
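The points in this section can be sketched as follows. This is a minimal illustration only, not code taken from the Library: the procedure name MYSUMMARY, the dummies Y and X and the local identifiers Nval and Nmiss are hypothetical. The sketch assumes that COMPATIBLE takes a text listing the aspects to be checked against the first parameter; the exact settings accepted by OPTION and PARAMETER should be confirmed from their help pages.

```genstat
PROCEDURE 'MYSUMMARY'
  " Hypothetical example: prints the length of Y and its number of
    missing values. "
  OPTION NAME='PRINT'; MODE=t; VALUES=!t(summary); DEFAULT='summary'
  " X must match Y in length and restriction (assumed COMPATIBLE settings). "
  PARAMETER NAME='Y','X'; MODE=p; TYPE='variate'; SET=yes; DECLARED=yes;\
    PRESENT=yes; COMPATIBLE=*,!t(nvalues,restriction)
  " The length is found here, so the user does not need to supply it. "
  CALCULATE Nval = NVALUES(Y)
  " Count missing values so that later calculations can allow for them. "
  CALCULATE Nmiss = NMV(Y)
  IF 'summary' .IN. PRINT
    PRINT Nval,Nmiss
  ENDIF
ENDPROCEDURE
```

Here the length and the number of missing values are both derived from Y itself, so neither needs to be an argument of the procedure.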

3. Syntax

The rules of syntax for the use of a procedure are exactly the same as those of the standard Genstat directives. Moreover, procedures in the Library are accessed automatically so that a user need not know whether a particular statement uses a procedure or a directive. For efficiency, procedures that are very popular may be supplied as directives in future releases of Genstat (and the converse may also be true). Consequently it is important that the names of Library procedures and their options and parameters follow the same conventions of type, ordering and vocabulary as have been set for the definition of the directives.

a)       When deciding on a name for each option or parameter, check whether there is a suitable name already in use (in directives or other procedures in the Library) before you invent another one. The best way to check is to click on the Summary sub-option of the Reference Manual option of the Help menu on the Genstat menu bar, go to Chapter 4 Syntax summary, and use the Find menu in the pdf viewer. You should also ensure that its existing purpose is similar to that required in your procedure. For example, if you need an option to control the maximum number of iterations for an algorithm, you should use the name MAXCYCLE (as in directives RCYCLE, ANOVA and ESTIMATE), and not some new name. On the other hand, you should not use MAXCYCLE for a different purpose – for example to define an attribute of a cyclic design. Alternatively, you may be able to use one of the existing prefixes; see (g) below.

b)       The first 4 characters of each option name must be different from the first 4 characters of the name of any other option of the procedure; similarly, the first 4 characters of each parameter name must be distinct from those of the other parameters. Thus for example you should not have options PRINT and PRINCIPAL in the same procedure. Also the first 4 letters of the name of the procedure must not be the same as the first 4 letters of any directive or any other Library procedure.

c)       If you do use an existing name for an option or parameter, ensure that it has the same mode for its setting (identifier, string, expression or formula) as in its use elsewhere. For example, if you have an option called PRINT, its setting should be one or more strings, as for example in directives ANOVA, CVA, FIT and TABULATE. (The mode is specified by the MODE parameter of the OPTION and PARAMETER directives.)

d)       Most option and parameter settings should be of mode identifier. Strings should occur only as the settings of VALUES options for the definition of texts, or for settings where the aim is to select one, or more, values from a predetermined set. Number lists should occur only with VALUES options for the definition of numerical structures.

e)       Try to keep to the same ordering of options and of parameters as has been used in the Genstat directives. For example, you will find that PRINT always comes before CHANNEL where both occur as options of the same directive – for example ADISPLAY, DUMP, READ, RDISPLAY, STORE, and so on.

f)       Options and parameters that have possible settings yes or no must always have no as their default.

g)       Standard prefixes have been defined for use in the names of directives and their options and parameters. Try to use these where appropriate, and avoid defining further prefixes (or causing further ambiguity in the meanings of the existing prefixes) unless this is unavoidable.

    A for analysis of variance (e.g. directives ADISPLAY and AKEEP, and the ASAVE option of SET);
    AU for unbalanced analysis of variance (e.g. procedures AUDISPLAY and AUKEEP);
    B for commands operating on trees i.e. branching structures (e.g. directive BASSESS and procedures BCLASSIFICATION and BREGRESSION);
    BC for classification trees (e.g. procedures BCDISPLAY and BCIDENTIFY);
    BCF for classification forests (e.g. procedures BCFDISPLAY and BCFIDENTIFY);
    BG for WinBUGS or OpenBUGS (e.g. procedures BGPLOT and BGXGENSTAT);
    BJ for Box Jenkins (procedures BJESTIMATE, BJFORECAST and BJIDENTIFY);
    BK for identification keys (e.g. procedures BKDISPLAY and BKIDENTIFY);
    BR for regression trees (e.g. procedures BRDISPLAY and BRPREDICT);
    C for colour (e.g. parameters CSYMBOL, CLINE, CFILL and CAREA of PEN), for covariates (e.g. option CPRINT of ANOVA and ADISPLAY, and options CREGRESSION and CSSP of AKEEP), or for clustering (e.g. option CTHRESHOLD of directive HCLUSTER);
    CB for combined (e.g. option CBRESIDUALS of AKEEP);
    CL for column labels (e.g. option CLPRINT of directive PRINT) or cumulative lower (functions CLBETA, CLBINOMIAL and so on);
    CU for cumulative upper (functions CUBETA, CUBINOMIAL and so on);
    D for high-resolution graphics (directives DGRAPH, DHISTOGRAM and so on) and for deviance (e.g. option DCALCULATION of MODEL);
    DF for degrees of freedom (e.g. parameter DFCONTRASTS of AKEEP);
    ED for equivalent deviate (functions EDBETA, EDCHISQUARE and so on);
    ENV for environment (procedures QMESTIMATE and QMQTLSCAN);
    F for form (e.g. directives FSSPM and FTSM), or for factor (e.g. option FREPRESENTATION of READ), or for the F distribution (e.g. option FPROBABILITY of ADISPLAY and ANOVA);
    FAC for factor (directive FACROTATE or procedures FACAMEND and FACSORT);
    FN for function (procedures FNLINEAR and FNPOWER);
    FV for fitted values (procedure YTRANSFORM);
    G for group (e.g. parameter GTHRESHOLD of HCLUSTER, and parameter GSIMILARITY of HDISPLAY), or generate (e.g. procedures AGALPHA and GRANDOM), or generalized (procedure GINVERSE), or genotype (procedure QGSELECT);
    GE for genotype-by-environment facilities (e.g. procedure GESTABILITY);
    GR for generate random (e.g. procedures GRMULTINORMAL and GRTHIN or function GRNORMAL);
    G2 for Agronomix Generation II (e.g. procedures G2AEXPORT, G2AFACTORS and G2VEXPORT);
    H for hierarchical (e.g. directives HCLUSTER and HDISPLAY);
    HG for hierarchical generalized linear models (e.g. procedures HGANALYSE and HGDISPLAY);
    I for identifier (e.g. option IPRINT of PRINT and TABULATE), or for imaginary (e.g. parameters ISERIES and ITRANSFORM of FOURIER);
    IN for input (e.g. option INPRINT of JOB and SET, and parameter INMATRIX of FLRV and SVD);
    L for labels (e.g. parameter LPOSITION of XAXIS) or for link (e.g. option LCALCULATION of MODEL);
    LIB for Library (e.g. procedure LIBHELP);
    LL for loglikelihood (functions LLBINOMIAL, LLGAMMA and so on);
    LOW for lower (limit of axis – procedure DDENDROGRAM);
    LT for left transpose (function LTPRODUCT);
    M for multiple (e.g. procedure MCORANALYSIS) and for margin (e.g. the MNAME parameter of PRINT);
    MAX for maximum (e.g. option MAXCYCLE of ANOVA and ESTIMATE, and option MAXLAG of CORRELATE and TSUMMARIZE);
    MK for marker (e.g. parameter MKNAMES of procedure QMKSELECT);
    MV for missing value (e.g. option MVREPLACE of ESTIMATE);
    N for number (e.g. option NVALUES of FACTOR, POINTER, TEXT, VARIATE and UNITS, option NROOTS of CVA, FLRV, PCO, PCP and RELATE, and option NTIMES of EXIT, FOR and RETURN);
    NEW for new (e.g. option NEWSTRUCTURE of COMBINE, parameter NEWSTRUCTURES of EQUATE, and parameters NEWVALUES and NEWINTERVALS of INTERPOLATE);
    NL for non-linear (procedure NLCONTRASTS);
    NN for neural networks (directives NNDISPLAY, NNFIT and NNPREDICT);
    NO for no (e.g. option NOMESSAGE of ADD, DROP, FIT, FITCURVE, FITNONLINEAR, RDISPLAY, STEP, SWITCH and TRY);
    OLD for old (e.g. option OLDSTRUCTURE of COMBINE, parameter OLDSTRUCTURES of EQUATE, and parameters OLDVALUES and OLDINTERVALS of INTERPOLATE);
    OUT for output (e.g. option OUTPRINT of JOB and SET, and option OUTCHANNEL of MERGE);
    P for print (e.g. option PUNKNOWN of PRINT, and options PFACTORIAL, PCONTRASTS and PDEVIATIONS of ADISPLAY and ANOVA);
    PEN for a graphical pen (e.g. parameter PENTITLE of XAXIS) or for penalized (procedure PENSPLINE);
    PR for probability density (functions PRBETA, PRBINOMIAL and so on);
    PT for point process (procedure PTDESCRIBE);
    Q for facilities for QTL estimation and statistical genetics (e.g. procedures QDESCRIBE and QIMPORT), for questions (e.g. procedure QLIST) and for quadratic (function QPRODUCT);
    R for regression (e.g. directives RDISPLAY and RKEEP, and option RSAVE of SET) and for residual (e.g. option RMETHOD of MODEL);
    RB for radial basis functions (e.g. directives RBDISPLAY and RBFIT);
    RL for row labels (e.g. options RLPRINT and RLWIDTH of PRINT);
    ROB for robust (procedure ROBSSPM);
    RQ for quantile regression (e.g. procedures RQLINEAR, RQNONLINEAR, and RQSMOOTH);
    RSS for residual sum of squares (e.g. parameter RSS of ROTATE);
    RT for right transpose (function RTPRODUCT);
    S for sample size (e.g. procedures SSIGNTEST and STTEST) and for smoothed (e.g. parameters STERMS and SCOMPONENTS of RKEEP);
    SE for standard error (e.g. parameter SECONTRASTS of AKEEP);
    SP for statistical process control or six-sigma (e.g. procedures SPCAPABILITY and SPCCHART);
    SV for survey analysis (e.g. procedures SVSTRATIFIED and SVTABULATE);
    T for time series (e.g. directives TKEEP and TSUMMARIZE, and option TSAVE of SET), or for table (e.g. functions TMEANS and TSUMS), or for the t distribution (e.g. option TPROBABILITY of ADD);
    TAB for table (e.g. procedures TABMODE and TABSORT);
    TX for text (e.g. directives TXBREAK and TXCONSTRUCT);
    U for unadjusted (e.g. option UPRINT of ADISPLAY and ANOVA) or unbalanced (procedure AUDISPLAY);
    V for variance components (directives VCOMPONENTS, VKEEP and VDISPLAY), or variance (e.g. option VCOVARIANCE of PREDICT, RKEEP and TKEEP), or variate (e.g. functions VMEANS and VSUMS);
    W for within-group (e.g. parameter WMEANS of SSPM, parameter WMATRIX of FLRV, and parameter WSSPM of CVA);
    X for x (e.g. parameters XLOWER and XUPPER of DGRAPH and FRAME, and parameters XINPUT and XOUTPUT of ROTATE);
    Y for y (e.g. parameters YLOWER and YUPPER of DGRAPH and FRAME, and parameters YINPUT and YOUTPUT of ROTATE);
    Z for z (e.g. option ZSCALE of DSURFACE).

4. Style of the code

To help the editors and referees to assess procedures from the many different authors, we strongly encourage a consistent style for the procedures in the Library. The suggested rules are intended to make the procedures easier to read and understand, for example by consistent use of spaces, and by the use of different patterns of small and capital letters to distinguish between identifiers, strings and system words. They are also intended to minimize the changes that may be necessary for new releases of Genstat. Procedures that depart from the rules may be returned to the author for revision before any detailed refereeing is undertaken. Many of the rules are similar to those used for examples in the Guide to the Genstat Command Language; see Part 1 Section 1.9 or Part 2 Section 1.1.

a)       There should be only one statement per line. Continuation lines should be indented by at least one character (i.e. space) to the right of the initial line of a statement. There should also be a consistent indentation of at least one character for the initial line of each statement within a loop, a block-if structure or a multiple-selection structure. (In fact most of the current procedures indent these lines by two characters.)

b)       You should be consistent in layout and the use of spaces within statements. We suggest that you leave one space before the opening square bracket enclosing an option sequence and one space after the closing bracket, but do not put spaces by square brackets when they are used to enclose suffixes, nor around the $ character. We prefer not to have spaces before the equals sign when it occurs after an option or parameter name, although you may put spaces around equals signs in expressions. Do not leave any spaces before commas nor before semi-colons, but do leave a space after a semi-colon and after a colon. Sometimes clarity can be improved by using several spaces instead of one, for example to allow items in parallel lists to line up in successive continuation lines.

c)       System words (names of directives, functions, options and parameters) are more obvious if put in capitals. They should never be abbreviated to fewer than four characters and, for clarity, it is usually preferable to give the full form unless this is overlong and an abbreviation is self-explanatory: for example either TREAT or TREATMENTS would be acceptable instead of TREATMENTSTRUCTURE, but GRAPH might be better than GRAP.

d)       Omission of the names of parameters and options (with their accompanying equals signs) is not recommended other than for the first and perhaps second parameter of a directive or procedure. There is no guarantee that options and parameters of directives and procedures other than the first will retain their exact orderings from release to release of Genstat or the library, so it is best to name them explicitly. For example

GRAPH Sales; Time

or

GRAPH Sales; Time; SYMBOLS='+'

would be acceptable, but not

RKEEP Yield; LINEARPREDICTOR=Lpred; Excode

which would have changed in its effects from Release 2 to Release 3 of Genstat.

e)       Strings that occur as values of options or parameters (for example “left” in JUSTIFIED=left) should be given in full and in small letters. Again, if the full form is overlong, abbreviation to four characters is acceptable; for example PRINT=aovt instead of PRINT=aovtable in ANOVA.

f)       Structures should have identifiers that are easily associated with their purpose. Identifiers are best put in lower-case letters; it can be helpful to use a capital letter for the initial character (for example Sales) as this allows identifiers to be distinguished from strings, see (e) above. Clarity can sometimes also be enhanced by using several capital letters where an identifier is formed from two (or more) words or abbreviated words (for example MinSales). From Release 4.2 onwards the names of the dummies used to refer to the arguments of the procedure need not all be in capitals, but use of capitals is still recommended to distinguish these from local variables. Unless you specify the statement

SET [CASE=significant]

at the start of the procedure, and option RESTORE=case in the PROCEDURE statement itself, you should not use identifiers that differ from each other only because one has some of its letters in lower case instead of in capitals; the procedure could then give incorrect results if directive SET had been used in the calling program to make capital and lower-case identifiers identical.

g)       Comments should be included wherever appropriate, to explain what is going on. In particular, there should be an initial set of comment lines immediately after the PROCEDURE statement, to give the date of the current version of the procedure, the names of its authors, and a brief description of the purpose and method of use of the procedure. Also there should be an explanatory comment alongside the definition of each option and parameter. The comment should indicate whether the setting of the option or parameter is for input only (I), for output only (O), or for input and output. It should list the types of structure that may occur, also the set of allowed values (where appropriate) and the default. For example:

'PRINT', " (I: strings {test, ranks} default test)
           controls printed output "\

Examples of the source code of existing procedures can be obtained using the Examples menu in Genstat for Windows, or using procedure LIBEXAMPLE in other implementations.
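Putting rules (a) to (g) together, the start of a procedure might be laid out as in the sketch below. It is an illustration only: the name MYTEST, the header text and the dummy DATA are placeholders, and the OPTION and PARAMETER settings shown are assumptions that should be checked against the help for those directives.

```genstat
PROCEDURE [RESTORE=case] 'MYTEST'
" Date:     (date of the current version)
  Authors:  (names of the authors)
  Purpose:  (brief description of the purpose and method of use) "
SET [CASE=significant]
OPTION NAME=\
  'PRINT';  " (I: strings {test, ranks} default test)
              controls printed output "\
  MODE=t; VALUES=!t(test,ranks); DEFAULT='test'
PARAMETER NAME=\
  'DATA';   " (I: variate) data values to be tested "\
  MODE=p; TYPE='variate'; SET=yes; DECLARED=yes; PRESENT=yes
" ... statements of the procedure, one per line ... "
ENDPROCEDURE
```

System words are in capitals, strings are in small letters, every option and parameter is named explicitly, and each definition carries a comment giving its direction, type, allowed values and default.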

5. Utilities

Several utility directives and procedures have been written to simplify the writing of procedures for the library. See Section 5.4 of the Guide to the Genstat Command Language, Part 1, Syntax and Data Management for details. Please use the utility procedures, where possible, instead of writing your own code. This will help to prevent the Library becoming unduly large. Also some of the utility procedures may be replaced by directives (with identical syntax) to improve efficiency in future releases of Genstat.

In particular please use the CAPTION directive for titles within the output, as they will then be compatible with the titles printed by the Genstat directives. The titles themselves should be in “sentence” style i.e. with a capital letter at the start of the whole title and in proper names but not elsewhere (see TXCONSTRUCT for details).

6. Side effects

A procedure may change some aspects of the Genstat environment, for example the default number of units or the information printed for a Genstat diagnostic. This may be a specified purpose of the procedure; but if it is an unwanted side-effect, you should set the RESTORE option in the PROCEDURE statement to indicate those that need to be restored on leaving the procedure. Alternatively, you can use directive GET to store information about the current settings on entry to the procedure, and directive SET to reset them later.

7. Diagnostics

If faulty information is supplied to a procedure, failure will occur during execution of one of the statements of the procedure, and Genstat will give an error diagnostic. The error message produced by Genstat will explain what has gone wrong at that particular statement, but it may not be clear to the user of the procedure how this has arisen from the input supplied nor how to correct the mistake. The OPTION and PARAMETER directives allow many aspects of the arguments of the procedure to be checked automatically. If you need to make your own checks, you should indicate the faults using the FAULT directive. Explanations from FAULT should be punctuated as in ordinary text. So, all of these except notes or messages should start with a capital letter, and they should all end with a full stop. (Genstat automatically adds the prefix “* Note:” or “* Message:” to a note or message.)

You may also want to use directive SET to suppress the printing of diagnostics from some statements, and then use GET to obtain the fault code so that you can either issue your own error message or correct the situation. (Remember that you can use directive DISPLAY to print the diagnostic if it is one that you cannot handle.) However, you must reset the printing of diagnostics before the end of the procedure (see Section 6 above).
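A check of this kind might look like the sketch below. It assumes that the parameter of FAULT is a logical expression that triggers the diagnostic when it is true, and that the EXPLANATION option supplies the message; confirm both against the help for FAULT. The procedure name MYCHECK, the dummy Y and the identifier Nobs are hypothetical.

```genstat
PROCEDURE 'MYCHECK'
  PARAMETER NAME='Y'; MODE=p; TYPE='variate'; SET=yes; DECLARED=yes; PRESENT=yes
  " Number of non-missing values supplied in Y. "
  CALCULATE Nobs = NOBSERVATIONS(Y)
  " Explain the problem in terms of the user's input; the explanation
    starts with a capital letter and ends with a full stop. "
  FAULT [EXPLANATION=!t('Parameter Y must contain at least two non-missing values.')]\
    Nobs .LT. 2
ENDPROCEDURE
```

Testing the input at the start of the procedure lets the message refer to the argument the user actually set, rather than to whichever internal statement would otherwise have failed.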

8. Documentation

A help page on each procedure in the Library is entered into Genstat’s on-line help system. This should include the following information:

    Purpose: a very brief description of the purpose of the procedure;
    Authors: the list of authors of the procedure, with their addresses to include in the Authors of procedures help page;
    Options: specifications of the syntax of the options;
    Parameters: specifications of the syntax of the parameters;
    Description: a detailed description of the purpose of the procedure, with an overview of how to use it, ending with lists of the names of its options and of its parameters;
    Method: a description of the method, with references to published articles or papers for more detail;
    Restrict: information about whether the procedure takes account of restrictions on input vectors (see the RESTRICT directive);
    References: references of articles, books or papers mentioned in the Description or Method sections;
    See also: lists of similar or associated directives, procedures and/or functions;
    Example: a simple example illustrating how to use the procedure.

9. Reporting of errors

Reports of errors in a Library procedure should be sent to

support@vsni.co.uk

The report should consist of Genstat output illustrating the fault (with an explanation of why it is believed that this is incorrect), and a listing of the input and data files of the program that produced the output.

10. Updates and corrections

By submitting your procedures for inclusion in the Library, you acknowledge that you grant VSNi full rights of use, including the right to make bug corrections or enhancements to take account of new facilities within Genstat itself or in statistical methodology generally. We will try to contact you to check any serious changes. Note, though, that you can check for changes made in each new Genstat release by accessing the source code using the LIBEXAMPLE procedure. If another author makes substantial enhancements, we may acknowledge their contribution by adding them to the list of authors. However, we will try to contact you first, to check whether you would prefer to make the enhancements yourself.

Updated on March 7, 2019
