Reads a data set from a CSPro survey data file and dictionary, and loads it into Genstat or puts it into a spreadsheet file (D.B. Baird).
|What to print (
||Which factors to create (
||Which special values to convert to Genstat missing values (
||Whether to form a set of columns containing all the valueset information (
||Whether to create a set of columns for the sub-items (
||Whether to merge the records into a single set of columns all of the same length (
||Whether to create a specific level for values not in the value set, rather than setting them to missing values (
||Whether to include a row of column descriptions in the Excel output file after the column heading row (
||Whether to warn that groups in a factor are empty and offer to remove them when loading the data from a saved GWB file (
||What to do with factor groups that have identical labels (
||Whether to read the data into global data structures or into data structures local to a procedure calling
||Optional extra input options to be passed to the
||Optional extra output options to be passed to the
||Survey data file to be read|
||Survey dictionary for interpreting the data file|
||Name of the output file to be created, if required|
||Level of the survey (1, 2 or 3) to read; default 1|
||Defines the records to be read within the
||Names of the survey items to be read|
||Saves the identifiers of the columns that are created|
CSPRO reads data from a CSPro survey data file and dictionary, specified by the FILENAME and
DICTIONARY parameters. If
DICTIONARY is not set,
CSPRO will look for a file with the same name as
FILENAME but with a
.dcf extension. You can save the data in either a Genstat workbook (
.gwb) or an Excel spreadsheet (.xls), by setting the
OUTFILENAME parameter to the name of the file to create; if
OUTFILENAME is not specified, a temporary file is used to read the data into Genstat, and this is deleted afterwards. The
ISAVE parameter can save a pointer containing the structures that have been created.
CSPro surveys can have up to three levels, but most have only a single level. By default,
CSPRO reads data from the first level, but you can set the
SURVEYLEVEL parameter to read level 2 or 3. The
RECORDS parameter specifies which records to read within the
SURVEYLEVEL; by default they are all read. The
ITEMS parameter can supply a text containing the names of the items to be read. You can append the character
! to the name if you want to force the item to be read as a factor, or
# if you want to force it to be read as a variate. This then overrides the setting of the
FACMETHOD option for that item (see below).
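As a minimal sketch of the markers described above, the following forces one item to be read as a factor and another as a variate. The item names are taken from the example data set shipped with Genstat; the identifier Sel_Cols and the choice of which items to mark are hypothetical.

```genstat
" Hypothetical selection: P03_SEX forced to a factor, P04_AGE forced to a variate, "
" P05_MS left to the FACMETHOD setting "
TEXT  [VALUES='P03_SEX!','P04_AGE#','P05_MS'] Sel_Cols
CSPRO FILE='%GENDIR%/Examples/CSPRO.dat';\
      DICTIONARY='%GENDIR%/Examples/CSPRO.dcf'; ITEMS=Sel_Cols
```

The names carrying a marker are quoted because they contain characters that are not valid in an unquoted string.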
SUBITEM=yes allows an item to be broken down into its sub-items. For example, the item
ID may be
RRVVII (e.g. 120113), with the first two digits giving the region, the next two giving the village, and the last two the individual. The sub-items RR, VV and
II would then also be created as separate columns for the region, village and
individual. Dates are often entered like this: e.g.
YYYYMMDD with sub-items year, month and day.
By default, each record will have its own set of columns and keys, possibly with different lengths. The keys will be stored in a pointer with element numbers indicating the corresponding records (e.g.
Village etc.). Alternatively, if you set
MERGE=yes, the columns from the different records are merged together based on the common id columns. The merged columns will then all have the same length, with just one set of keys.
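A merged read might look like the sketch below. It assumes that the example data file contains more than one record type with common id columns; the pointer name pMerged is hypothetical.

```genstat
" Merge the columns from all records into a single set of equal-length columns "
CSPRO [MERGE=yes] FILE='%GENDIR%/Examples/CSPRO.dat';\
      DICTIONARY='%GENDIR%/Examples/CSPRO.dcf'; ISAVE=pMerged
```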
An item in a CSPro file can have one or more value sets associated with it. A value set provides mappings from values to labelled groups. These are more general than Genstat factors, as either series or ranges of values can be put into a single group: e.g. (1, 3 and 5) or (1 <=
x <= 3). Groups can be marked as representing either a missing value or a not-applicable (NA) response. The
MISSINGCODES option indicates which of these should be converted into missing values in Genstat; the default is to convert only the groups that represent missing values, and leave the not-applicable groups. By default, values that do not belong to any of the groups defined by a value set are set to missing. However, you can set
FUNKNOWNGROUP=yes to create a new level of the resulting factor for these unallocated values.
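The two options can be combined, as in this sketch, which converts both the missing and the not-applicable groups to Genstat missing values while keeping values outside the value set as a separate factor level. The MISSINGCODES settings are those used in the example at the end of this page.

```genstat
" Treat missing and NA groups as missing; keep unallocated values as their own level "
CSPRO [MISSINGCODES=missing,na; FUNKNOWNGROUP=yes]\
      FILE='%GENDIR%/Examples/CSPRO.dat';\
      DICTIONARY='%GENDIR%/Examples/CSPRO.dcf'
```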
The FACMETHOD option controls how the CSPro value sets are converted to factors, using the following settings:
||no columns are read as factors,|
||only columns with single entries per group are read as factors,|
||the original columns are included (as variates) and, in addition, a factor is created for each value set defined for a column, and|
||a factor is created for each value set defined for a column, but the original column is not included (so information is lost when groups with series or ranges of values are lumped into a single group).|
Note, as mentioned above, the
ITEMS parameter can be used to override
FACMETHOD for individual survey items.
If you are saving to an Excel file, you can include the column descriptions by setting option
INCLUDEEXTRA=yes. If you are saving to a GWB file, you can set option
WARNONEMPTYGROUPS=yes to arrange for a dialogue to appear when the file is loaded into the Genstat client, offering to remove any factor groups that have no observations.
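The two output cases might be written as follows. The output file names survey.xls and survey.gwb are hypothetical, and the sketch assumes that a file name without a path is created in the current working directory.

```genstat
" Excel output with a row of column descriptions below the headings "
CSPRO [INCLUDEEXTRA=yes] FILE='%GENDIR%/Examples/CSPRO.dat';\
      DICTIONARY='%GENDIR%/Examples/CSPRO.dcf'; OUTFILENAME='survey.xls'
" Genstat workbook that offers to remove empty factor groups when it is opened "
CSPRO [WARNONEMPTYGROUPS=yes] FILE='%GENDIR%/Examples/CSPRO.dat';\
      DICTIONARY='%GENDIR%/Examples/CSPRO.dcf'; OUTFILENAME='survey.gwb'
```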
CSPro does not insist that each value set item must have a unique label. The
DUPLICATELABELS option allows you to choose what to do with any duplicates, by selecting one of the following settings:
||combines the items into a single group,|
||ignores duplicate labels, and suppresses the warnings that would occur for duplicate factor labels if the data are saved and then reread from a GWB file,|
||renames the duplicate occurrences by adding a suffix to make the labels unique.|
Setting FVALUESETS=yes creates an extra set of 7 columns containing all the valueset information. Record, Item and Valueset are factors giving the names of the record, item and value set for each group;
two variates, the second named To, give the range for each group (or a single value if
To is set to a missing value);
Label is a text giving the label for each group; and
Special is a factor indicating whether the group is a special item (Missing, NA or Other).
If CSPRO is used within a procedure, the
SCOPE option controls whether the structures are created locally in the procedure (default), or globally in the main program.
OUTOPTIONS and its input counterpart are provided to pass extra input or output options to the
Dataload.dll. These are only for very specialized use.
The request is passed to the
Dataload.dll library which reads the CSPro file and returns the data via a Genstat spreadsheet book.
CAPTION 'CSPRO examples'; STYLE=meta
CSPRO   FILE='%GENDIR%/Examples/CSPRO.dat';\
        DICTIONARY='%GENDIR%/Examples/CSPRO.dcf'; ISAVE=pData
CSPRO   [FACMETHOD=none; SUBITEMS=yes; MISSINGCODES=missing,na]\
        FILE='%GENDIR%/Examples/CSPRO.dat';\
        DICTIONARY='%GENDIR%/Examples/CSPRO.dcf'; RECORD=1
TEXT    [VALUES=LINE,P02_REL,P03_SEX,P04_AGE,P05_MS,P06_MOTHER,P07_BIRTH,\
        P08_RES95,P09_ATTEND,P10_HIGH_GR,P11_LITERACY,P12_WORKING] Inc_Cols
CSPRO   FILE='%GENDIR%/Examples/CSPRO.dat';\
        DICTIONARY='%GENDIR%/Examples/CSPRO.dcf'; ITEMS=Inc_Cols