
CSPRO procedure

Reads a data set from a CSPro survey data file and dictionary, and loads it into Genstat or puts it into a spreadsheet file (D.B. Baird).


PRINT = string token What to print (catalogue); default cata
FACMETHOD = string token Which factors to create (convertall, keepandconvertall, none, noranges); default keep
MISSINGCODES = string tokens Which special values to convert to Genstat missing values (missing, na); default miss
FVALUESETS = string token Whether to form a set of columns containing all the valueset information (yes, no); default no
SUBITEMS = string token Whether to create a set of columns for the sub-items (yes, no); default no
MERGE = string token Whether to merge the records into a single set of columns all of the same length (yes, no); default no
FUNKNOWNGROUP = string token Whether to create a specific level for values not in the value set, rather than setting them to missing values (yes, no); default no
INCLUDEEXTRA = string token Whether to include a row of column descriptions in the Excel output file after the column heading row (yes, no); default no
WARNONEMPTYGROUPS = string token Whether to warn that groups in a factor are empty and offer to remove them when loading the data from a saved GWB file (yes, no); default no
DUPLICATELABELS = string token What to do with factor groups that have identical labels (combine, ignore, rename); default comb
SCOPE = string token Whether to read the data into global data structures or into data structures local to a procedure calling CSPRO (local, global); default loca
INOPTIONS = text Optional extra input options to be passed to the Dataload.dll
OUTOPTIONS = text Optional extra output options to be passed to the Dataload.dll


FILENAME = text Survey data file to be read
DICTIONARY = text Survey dictionary for interpreting the data file
OUTFILENAME = text Name of the output file to be created, if required
SURVEYLEVEL = scalar Level of the survey (1, 2 or 3) to read; default 1
RECORDS = scalar or variate Defines the records to be read within the SURVEYLEVEL; by default they are all read
ITEMS = text Names of the survey items to be read
ISAVE = text or pointer Saves the identifiers of the columns that are created


CSPRO reads data from a CSPro survey data file and dictionary, specified by the FILENAME and DICTIONARY parameters. If DICTIONARY is not set, CSPRO will look for a file with the same name as FILENAME but with a .dcf extension. You can save the data in either a Genstat workbook (.gwb) or an Excel spreadsheet (.xls), by setting the OUTFILENAME parameter to the name of the file to create; if OUTFILENAME is not specified, a temporary file is used to read the data into Genstat, and this is deleted afterwards. The ISAVE parameter can save a pointer containing the structures that have been created.
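As a minimal sketch of the call described above, the following reads a survey and keeps a copy of the data in a Genstat workbook. The file names survey.dat and survey.gwb and the identifier Cols are hypothetical. DICTIONARY is left unset, so CSPRO would look for a .dcf file with the same name as the data file.

```
"Illustrative only: survey.dat, survey.gwb and Cols are assumed names"
CSPRO   FILENAME='survey.dat'; OUTFILENAME='survey.gwb'; ISAVE=Cols
```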

CSPro surveys can have up to three levels, but most have only a single level. By default, CSPRO reads data from the first level, but you can set the SURVEYLEVEL parameter to read level 2 or 3. The RECORDS parameter specifies which records to read within the SURVEYLEVEL; by default they are all read. The ITEMS parameter can supply a text containing the names of the items to be read. You can append the character ! to the name if you want to force the item to be read as a factor, or # if you want to force it to be read as a variate. This then overrides the setting of the FACMETHOD option for that item (see below).
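For instance, a selection of items from the first record of a survey might be requested as sketched below. The file name and the item names SEX and AGE are hypothetical, and the suffixes follow the rule just described: ! forces a factor and # forces a variate.

```
"Illustrative only: survey.dat, SEX, AGE and Wanted are assumed names"
TEXT    [VALUES='SEX!','AGE#'] Wanted
CSPRO   FILENAME='survey.dat'; SURVEYLEVEL=1; RECORDS=1; ITEMS=Wanted
```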

Setting option SUBITEMS=yes allows an item to be broken down into its sub-items. For example, the item ID may be RRVVII (e.g. 120113), with the first two digits giving the region, the next two giving the village, and the last two the individual. The sub-items RR, VV and II would then also be created as separate columns region, village and individual. Dates are entered in the same way: e.g. YYYYMMDD with sub-items year, month and day.
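A sketch of the corresponding call, assuming a hypothetical data file survey.dat whose dictionary defines sub-items for ID as in the example above:

```
"Illustrative only: survey.dat is an assumed file name.
 With sub-items RR, VV and II defined for ID, the value 120113
 would also give region 12, village 01 and individual 13."
CSPRO   [SUBITEMS=yes] FILENAME='survey.dat'
```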

By default, each record will have its own set of columns and keys, possibly with different lengths. The keys will be stored in a pointer with element numbers indicating the corresponding records (e.g. Village[1], Village[2] etc.). Alternatively, if you set MERGE=yes, the columns from the different records are merged together based on the common id columns. The merged columns will then all have the same length, with just one set of keys.
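The merged form could be requested as in this sketch, in which the file name and the identifier Cols are hypothetical:

```
"Illustrative only: survey.dat and Cols are assumed names"
CSPRO   [MERGE=yes] FILENAME='survey.dat'; ISAVE=Cols
```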

An item in a CSPro file can have one or more value sets associated with it. The value set provides mappings from values to groups, which are labelled. These are more general than Genstat factors, as either series or ranges of values can be put into a single group: e.g. (1, 3 and 5) or (1 <= x <= 3). Groups can be marked as representing either a missing value or a not-applicable (NA) response. The MISSINGCODES option indicates which of these should be converted into missing values in Genstat; the default is to convert only the groups that represent missing values, and leave the not-applicable groups. By default, values that do not belong to any of the groups defined by a value set are set to missing. However, you can set FUNKNOWNGROUP=yes to create a new level of the resulting factor for these unallocated values.
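As a sketch, the call below would convert both kinds of special group to missing values, while keeping values outside the value set as a level of their own. The file name is hypothetical.

```
"Illustrative only: survey.dat is an assumed file name"
CSPRO   [MISSINGCODES=missing,na; FUNKNOWNGROUP=yes] FILENAME='survey.dat'
```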

The FACMETHOD option controls how the CSPro value sets are converted to factors, using the following settings:

    none no columns are read as factors,
    noranges only columns with single entries per group are read as factors,
    keepandconvertall the original columns are included (as variates) and, in addition, a factor is created for each value set defined for a column, and
    convertall a factor is created for each value set defined for a column, but the original column is not included (so information is lost when groups with series or ranges of values are lumped into a single group).

Note, as mentioned above, the ITEMS parameter can be used to override FACMETHOD for individual survey items.
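For example, to read as factors only those columns whose groups each contain a single value, a call might look like the sketch below. The file name is hypothetical.

```
"Illustrative only: survey.dat is an assumed file name"
CSPRO   [FACMETHOD=noranges] FILENAME='survey.dat'
```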

If you are saving to an Excel file, you can include the column descriptions by setting option INCLUDEEXTRA=yes. If you are saving to a GWB file, you can set option WARNONEMPTYGROUPS=yes to arrange for a dialogue to appear when the file is loaded into the Genstat client, offering to remove any factor groups that have no observations.
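A sketch of saving to Excel with the row of column descriptions included, using hypothetical file names:

```
"Illustrative only: survey.dat and survey.xls are assumed file names"
CSPRO   [INCLUDEEXTRA=yes] FILENAME='survey.dat'; OUTFILENAME='survey.xls'
```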

CSPro does not insist that each value set item must have a unique label. The DUPLICATELABELS option allows you to choose what to do with any duplicates, by selecting one of the following settings:

    combine combines the items into a single group,
    ignore ignores duplicate labels, and suppresses the warnings that would occur for duplicate factor labels if the data are saved and then reread from a GWB file,
    rename renames the duplicate occurrences by adding a suffix to make the labels unique.

Setting option FVALUESETS=yes creates an extra set of 7 columns (Record, Item, ValueSet, From, To, Label, Special) containing all the valueset information. Record, Item and ValueSet are factors giving the names of the record, item and value set for each group, From and To are variates giving the ranges (or single value if To is set to a missing value) for each group, Label is a text giving the label for each group, and Special is a factor indicating whether the group is a special item (Missing, NA or Other).
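The value-set columns could be requested alongside the data as sketched here, with a hypothetical file name:

```
"Illustrative only: survey.dat is an assumed file name"
CSPRO   [FVALUESETS=yes] FILENAME='survey.dat'
```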

When CSPRO is used within a procedure, the SCOPE option controls whether the structures are created locally in the procedure (default), or globally in the main program.

The options INOPTIONS and OUTOPTIONS are provided to pass extra input or output options to the Dataload.dll. These are only for very specialized use.




The request is passed to the Dataload.dll library which reads the CSPro file and returns the data via a Genstat spreadsheet book.

See also

Directive: READ.


Commands for: Input and output, Survey analysis.


CAPTION 'CSPRO examples'; STYLE=meta
CSPRO   FILE='%GENDIR%/Examples/CSPRO.dat';\
        DICTIONARY='%GENDIR%/Examples/CSPRO.dcf'; ISAVE=pData
CSPRO   FILE='%GENDIR%/Examples/CSPRO.dat';\
        DICTIONARY='%GENDIR%/Examples/CSPRO.dcf'; RECORD=1
        P08_RES95,P09_ATTEND,P10_HIGH_GR,P11_LITERACY,P12_WORKING] Inc_Cols
CSPRO   FILE='%GENDIR%/Examples/CSPRO.dat';\
        DICTIONARY='%GENDIR%/Examples/CSPRO.dcf'; ITEMS=Inc_Cols
Updated on June 20, 2019
