CSPRO procedure

Reads a data set from a CSPro survey data file and dictionary, and loads it into Genstat or puts it into a spreadsheet file (D.B. Baird).

Options

`PRINT` = string token	What to print (`catalogue`); default `cata`
`FACMETHOD` = string token	Which factors to create (`convertall`, `keepandconvertall`, `none`, `noranges`); default `keep`
`MISSINGCODES` = string tokens	Which special values to convert to Genstat missing values (`missing`, `na`); default `miss`
`FVALUESETS` = string token	Whether to form a set of columns containing all the value-set information (`yes`, `no`); default `no`
`SUBITEMS` = string token	Whether to create a set of columns for the sub-items (`yes`, `no`); default `no`
`MERGE` = string token	Whether to merge the records into a single set of columns all of the same length (`yes`, `no`); default `no`
`FUNKNOWNGROUP` = string token	Whether to create a specific level for values not in the value set, rather than setting them to missing values (`yes`, `no`); default `no`
`INCLUDEEXTRA` = string token	Whether to include a row of column descriptions in the Excel output file after the column heading row (`yes`, `no`); default `no`
`WARNONEMPTYGROUPS` = string token	Whether to warn that groups in a factor are empty and offer to remove them when loading the data from a saved GWB file (`yes`, `no`); default `no`
`DUPLICATELABELS` = string token	What to do with factor groups that have identical labels (`combine`, `ignore`, `rename`); default `comb`
`SCOPE` = string token	Whether to read the data into global data structures or into data structures local to a procedure calling `CSPRO` (`local`, `global`); default `loca`
`INOPTIONS` = text	Optional extra input options to be passed to the `Dataload.dll`
`OUTOPTIONS` = text	Optional extra output options to be passed to the `Dataload.dll`

Parameters

`FILENAME` = text	Survey data file to be read
`DICTIONARY` = text	Survey dictionary for interpreting the data file
`OUTFILENAME` = text	Name of the output file to be created, if required
`SURVEYLEVEL` = scalar	Level of the survey (1, 2 or 3) to read; default 1
`RECORDS` = scalar or variate	Defines the records to be read within the `SURVEYLEVEL`; by default they are all read
`ITEMS` = text	Names of the survey items to be read
`ISAVE` = text or pointer	Saves the identifiers of the columns that are created

Description

CSPRO reads data from a CSPro survey data file and dictionary, specified by the FILENAME and DICTIONARY parameters. If DICTIONARY is not set, CSPRO will look for a file with the same name as FILENAME but with a .dcf extension. You can save the data in either a Genstat workbook (.gwb) or an Excel spreadsheet (.xls), by setting the OUTFILENAME parameter to the name of the file to create; if OUTFILENAME is not specified, a temporary file is used to read the data into Genstat, and this is deleted afterwards. The ISAVE parameter can save a pointer containing the structures that have been created.
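
As a minimal sketch of such a call, the following reads the example survey (using the data file path from the Example section), leaves DICTIONARY unset so that the .dcf file with the same name is looked for, and saves both a workbook copy and a pointer. The output file name `CSPRO_copy.gwb` and the identifier `pData` are illustrative, and where a relative output path ends up is an assumption that depends on your working directory.

```genstat
" Read the example survey; the dictionary is found from the data file name.
  CSPRO_copy.gwb is a hypothetical output file name. "
CSPRO   FILENAME='%GENDIR%/Examples/CSPRO.dat';\
        OUTFILENAME='CSPRO_copy.gwb'; ISAVE=pData
```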

CSPro surveys can have up to three levels, but most have only a single level. By default, CSPRO reads data from the first level, but you can set the SURVEYLEVEL parameter to read level 2 or 3. The RECORDS parameter specifies which records to read within the SURVEYLEVEL; by default they are all read. The ITEMS parameter can supply a text containing the names of the items to be read. You can append the character ! to the name if you want to force the item to be read as a factor, or # if you want to force it to be read as a variate. This then overrides the setting of the FACMETHOD option for that item (see below).
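
The suffix convention might be used as in the sketch below. The item names are taken from the Example section, but how they are coded in the example dictionary is not stated here, so forcing `P03_SEX` to a factor and `P04_AGE` to a variate is an assumption made for illustration; the identifier `Forced` is also illustrative. The names are quoted because they contain the characters ! and #.

```genstat
" Force one item to be read as a factor (!) and another as a variate (#),
  whatever the FACMETHOD setting. "
TEXT    [VALUES='P03_SEX!','P04_AGE#'] Forced
CSPRO   FILENAME='%GENDIR%/Examples/CSPRO.dat';\
        DICTIONARY='%GENDIR%/Examples/CSPRO.dcf';\
        SURVEYLEVEL=1; ITEMS=Forced
```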

Setting option SUBITEMS=yes allows an item to be broken down into its sub-items. For example, the item ID may be RRVVII (e.g. 120113), with the first two digits giving the region, the next two giving the village, and the last two the individual. The sub-items RR, VV and II would then also be created as separate columns region, village and individual. Dates are often entered in the same way: e.g. YYYYMMDD with sub-items year, month and day.
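
The decomposition of the example value 120113 can be checked by hand with ordinary calculations. The sketch below is only an arithmetic illustration of what the sub-items contain, not a description of how CSPRO splits items internally; the identifier `IDcode` is illustrative.

```genstat
" Arithmetic illustration only: split RRVVII = 120113 into RR, VV and II. "
SCALAR    IDcode; VALUE=120113
CALCULATE region = INTEGER(IDcode/10000)
CALCULATE village = INTEGER(IDcode/100) - 100*region
CALCULATE individual = IDcode - 100*INTEGER(IDcode/100)
PRINT     region,village,individual
```

For this value the three parts are 12 (region), 1 (village) and 13 (individual).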

By default, each record will have its own set of columns and keys, possibly with different lengths. The keys will be stored in a pointer with element numbers indicating the corresponding records (e.g. Village[1], Village[2] etc.). Alternatively, if you set MERGE=yes, the columns from the different records are merged together based on the common id columns. The merged columns will then all have the same length, with just one set of keys.
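
A sketch of the merged form, again using the example files from the Example section; the pointer name `pMerged` is illustrative.

```genstat
" Merge the columns from all the records on their common id columns,
  so that every column has the same length. "
CSPRO   [MERGE=yes] FILENAME='%GENDIR%/Examples/CSPRO.dat';\
        DICTIONARY='%GENDIR%/Examples/CSPRO.dcf'; ISAVE=pMerged
```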

An item in a CSPro file can have one or more value sets associated with it. A value set provides mappings from values to labelled groups. These are more general than Genstat factors, as either series or ranges of values can be put into a single group: e.g. (1, 3 and 5) or (1 <= x <= 3). Groups can be marked as representing either a missing value or a not-applicable (NA) response. The MISSINGCODES option indicates which of these should be converted into missing values in Genstat; the default is to convert only the groups that represent missing values, and leave the not-applicable groups. By default, values that do not belong to any of the groups defined by a value set are set to missing. However, you can set FUNKNOWNGROUP=yes to create a new level of the resulting factor for these unallocated values.
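
For instance, to treat both kinds of special group as missing while keeping unallocated values visible as their own factor level, the two options might be combined as follows. This is a sketch using the example files from the Example section; whether those files contain NA groups or unallocated values is not known here.

```genstat
" Convert both missing and not-applicable groups to missing values, and
  give values outside the value set their own factor level. "
CSPRO   [MISSINGCODES=missing,na; FUNKNOWNGROUP=yes]\
        FILENAME='%GENDIR%/Examples/CSPRO.dat';\
        DICTIONARY='%GENDIR%/Examples/CSPRO.dcf'
```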

The FACMETHOD option controls how the CSPro value sets are converted to factors, using the following settings:

`none`	no columns are read as factors,
`noranges`	only columns with single entries per group are read as factors,
`keepandconvertall`	the original columns are included (as variates) and, in addition, a factor is created for each value set defined for a column, and
`convertall`	a factor is created for each value set defined for a column, but the original column is not included (so information is lost when groups with series or ranges of values are lumped into a single group).

Note, as mentioned above, the ITEMS parameter can be used to override FACMETHOD for individual survey items.
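
As a sketch of a non-default setting, the call below asks for factors only where every group corresponds to a single value, so that items whose groups cover series or ranges stay as variates. It uses the example files from the Example section; the pointer name `pNoRanges` is illustrative.

```genstat
" Create factors only for value sets with a single entry per group. "
CSPRO   [FACMETHOD=noranges] FILENAME='%GENDIR%/Examples/CSPRO.dat';\
        DICTIONARY='%GENDIR%/Examples/CSPRO.dcf'; ISAVE=pNoRanges
```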

If you are saving to an Excel file, you can include the column descriptions by setting option INCLUDEEXTRA=yes. If you are saving to a GWB file, you can set option WARNONEMPTYGROUPS=yes to arrange for a dialogue to appear when the file is loaded into the Genstat client, offering to remove any factor groups that have no observations.
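
A sketch of the Excel case, using the example files from the Example section; the output file name `CSPRO_described.xls` is hypothetical.

```genstat
" Save to Excel with a row of column descriptions under the headings. "
CSPRO   [INCLUDEEXTRA=yes] FILENAME='%GENDIR%/Examples/CSPRO.dat';\
        DICTIONARY='%GENDIR%/Examples/CSPRO.dcf';\
        OUTFILENAME='CSPRO_described.xls'
```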

CSPro does not insist that each value set item must have a unique label. The DUPLICATELABELS option allows you to choose what to do with any duplicates, by selecting one of the following settings:

`combine`	combines the items into a single group,
`ignore`	ignores duplicate labels, and suppresses the warnings that would occur for duplicate factor labels if the data are saved and then reread from a GWB file,
`rename`	renames the duplicate occurrences by adding a suffix to make the labels unique.
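
If the groups need to stay distinct, a call along the following lines might be used. It is a sketch based on the example files from the Example section; whether those files contain duplicate labels is not known here, and the output file name `CSPRO_renamed.gwb` is hypothetical.

```genstat
" Keep groups with identical labels separate by renaming the duplicates. "
CSPRO   [DUPLICATELABELS=rename] FILENAME='%GENDIR%/Examples/CSPRO.dat';\
        DICTIONARY='%GENDIR%/Examples/CSPRO.dcf';\
        OUTFILENAME='CSPRO_renamed.gwb'
```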

Setting option FVALUESETS=yes creates an extra set of seven columns (Record, Item, ValueSet, From, To, Label, Special) containing all the value-set information. Record, Item and ValueSet are factors giving the names of the record, item and value set for each group, From and To are variates giving the ranges (or single value if To is set to a missing value) for each group, Label is a text giving the label for each group, and Special is a factor indicating whether the group is a special item (Missing, NA or Other).

When CSPRO is used within a procedure, the SCOPE option controls whether the structures are created locally in the procedure (default), or globally in the main program.

The options INOPTIONS and OUTOPTIONS are provided to pass extra input or output options to the Dataload.dll. These are only for very specialized use.

Options: PRINT, FACMETHOD, MISSINGCODES, FVALUESETS, SUBITEMS, MERGE, FUNKNOWNGROUP, INCLUDEEXTRA, WARNONEMPTYGROUPS, DUPLICATELABELS, SCOPE, INOPTIONS, OUTOPTIONS.

Parameters: FILENAME, DICTIONARY, OUTFILENAME, SURVEYLEVEL, RECORDS, ITEMS, ISAVE.

Method

The request is passed to the Dataload.dll library, which reads the CSPro file and returns the data via a Genstat spreadsheet book.

Example

CAPTION 'CSPRO examples'; STYLE=meta
CSPRO   FILE='%GENDIR%/Examples/CSPRO.dat';\
        DICTIONARY='%GENDIR%/Examples/CSPRO.dcf'; ISAVE=pData
CSPRO   [FACMETHOD=none; SUBITEMS=yes; MISSINGCODES=missing,na]\
        FILE='%GENDIR%/Examples/CSPRO.dat';\
        DICTIONARY='%GENDIR%/Examples/CSPRO.dcf'; RECORD=1
TEXT 	  [VALUES=LINE,P02_REL,P03_SEX,P04_AGE,P05_MS,P06_MOTHER,P07_BIRTH,\
        P08_RES95,P09_ATTEND,P10_HIGH_GR,P11_LITERACY,P12_WORKING] Inc_Cols
CSPRO   FILE='%GENDIR%/Examples/CSPRO.dat';\
        DICTIONARY='%GENDIR%/Examples/CSPRO.dcf'; ITEMS=Inc_Cols

Updated on June 20, 2019
