Select Worksheet

Prior to the introduction of the Excel import wizard, the Select Excel Worksheet dialog was the standard way of importing Excel or Quattro Windows files. This legacy dialog is still available for users who are more comfortable using this import method.

To turn off the Excel import wizard and use the legacy dialog, select Tools | Spreadsheet Options, click the Sheets tab then deselect Use Import Wizard when opening an Excel file.

The Select Excel Worksheet dialog will then display in place of the Excel import wizard whenever you open an Excel or Quattro Windows file (WQ1, WB1/2/3). An explanation of each option on this dialog is given further down the page.

Both of these file formats can contain several worksheets. Worksheets and named ranges are shown in the preview pane prefixed by S: and R: respectively (see image below).

If a single worksheet or named range is to be opened you can specify a sub-range of the data to be loaded.

If named ranges are specified within the file then these have the advantage that both the sheet and cell range are specified permanently in the Excel file. So, these are automatically updated if further rows or columns are inserted into the cell range.

Columns whose names are suffixed with a ! character will be automatically converted into factors. Similarly, if column names end with a $ or # character, then these will be converted into a text and variate respectively. As well as supplying a conversion character, column format information can be appended to a column name. For example, a suffix of :0, :1, :2 … :9 specifies that the column should be displayed with the given fixed number of decimal places. A suffix of :D specifies the column is to be displayed in date format. This will use Genstat’s default date format. Similarly, a suffix of :T specifies that the column is to be displayed in time format.

The order of factor levels and labels can be supplied within an Excel file by adding a comment to the cell containing the column name. Within the comment the factor levels must start with a ! and labels must start with !t on a new line. The levels and labels should then be entered within brackets, for example, for levels this could be !(100,90,50,10) and for labels this could be !T(Control,A,B,C) or !t(‘Control’,’A’,’B’,’C’). The order of the items in the comment will define the order of the levels or labels in the factor. If a column just contains ordinal values (i.e. 1…n), then the comment can still be used to assign labels or levels to these groups. The first item in the comment will define the level or label for group 1 in the factor and so on.

A column description can also be supplied in the comment as a separate line, but this must not start with an exclamation mark.

The following example shows two comments that have been entered into columns within Excel (using the Excel menu item Insert | Comment). Each column will be opened in Genstat as a factor using the defined levels/labels and will include a column description.

Select all

This can be used to select all the items within the list of worksheets and named ranges.

Restricted cell range

You can load a subset of the data from a worksheet by specifying the position of diagonally opposite corners of the rectangle that spans the required range. For example B12:E18 will load seven rows and five columns starting at row 12 of the second column.

Partial column or row information can be provided also:

Just the starting cell can be given (e.g. B12) and all cells
to the lower right will be used.
Just column letters provided (e.g. B:E), and all rows in these
columns will be used.
Just row numbers provided (e.g. 12:18), and all columns with
data in these rows will be used.

Read title from cell range

When selected, the cell range (e.g. A2:B3) or named range (which must be located on the same sheet as the data) specified in the space provided will be used to read text for the spreadsheet title. Note that text for the different cells within this range will be concatenated.

Select columns for inclusion

Lets you select a subset of columns from the worksheet. This opens the Select Columns dialog once the data has been read in.

Sort factor levels

When selected, columns loaded as factors will have their levels (or labels for a text column) sorted into ascending order.

Suggest columns to be factors

When selected, if columns contain few unique repeated values then you will be prompted with a dialog where these columns can be converted to factors. The Suggest converting columns with <= N unique items option on the Tools | Spreadsheet Options | Columns tab can be used to control the limit for the number of unique items.

Drop empty rows

When selected, any rows that do not include any data are removed from the resulting spreadsheet.

Remove empty columns

By default, any columns that do not contain data are removed from the Genstat spreadsheet. However, these columns can be retained in the spreadsheet by unselecting this option.

Data contains variates & factors only

When selected, any column containing text will be converted to either a variate or factor. If the number of labels within the spreadsheet column is less than 20% of the total number of rows, then the column will be converted into a variate, otherwise the column will be converted into a factor.

Skip first X non-empty rows

When selected, the first specified number of rows will be excluded from the spreadsheet. This is useful to exclude any comments that made have been made in the spreadsheets. Note that text in these rows can still be used for column names or descriptions.

Read column names from file

Controls whether column names are specified within the file.

Yes if all labels	Read the column names from the row specified in the Column names in row option if all the cells in this row contain labels only or are missing. A default column name will be generated for a missing cell.
Yes	Use the cells specified in the Column names in row option. If a cell contains a number, then it will be prefixed with “_” to ensure it is a valid Genstat identifier name.
No	Do not read the column names from the file and generate default names for the columns.

Column names in row

Specifies the row number to use for column names.

Column descriptions in row

When selected, the cells in the specified row number will be used for column descriptions (the EXTRA keyword for Genstat structures).

Row numbers

Controls whether the row numbers are relative to the data range or relate to a whole worksheet.

Relative	Row numbers start from the top of the restricted cell range.
Absolute	Row numbers start from the top of the worksheet and thus correspond to the absolute row numbering in Excel/Quattro.

Text to number

This controls how text labels are interpreted as numbers when a label is changed to a value when converting to a variate.

Strict	Only labels that contain just numeric data are converted, otherwise they are set to a missing value (e.g. ’10’ -> 10, ‘1O’ -> *.)
Single	Single characters and labels containing just numeric data are interpreted as numeric values, otherwise they are set to a missing value (e.g. o,O -> 0, i,I,l,L -> 1, s,S-> 2, z,Z -> 5, comma -> decimal point, ’10’ -> 10, ‘Io’ -> *).
Common	Multiple characters and labels containing just numeric data are interpreted as numeric values, otherwise they are set to a missing value (e.g. ‘Io’ -> 10, ’23X’ -> *).
Standard	Multiple characters and labels containing just numeric data are interpreted as numeric values, but text is truncated at the end of a number (e.g. ’23X’ -> 23, ‘A2X3’ -> *).
Lax	Any digits are read from the text (e.g. ‘A2X3’ -> 23).

Missing value text

By default, empty cells or cells with a text consisting of a single asterisk * are read in as missing values (unless the column is a text column, in which case the * is read in as a text value ‘*’). If an alternative string to * has been used as a missing value place holder in the worksheet then entering this text here will cause this to be interpreted as a missing value in numerical columns. For example, a user may have entered ‘M’ for a missing value or may have added a note ‘Died’ in a column of live weights.

Add to book

This provides a list of book that the new spreadsheet can be added to. The default book that spreadsheets are added to can be specified using the Tools | Spreadsheet Options | Books dialog.

Action buttons

OK	Open the Excel file and close the dialog. If any factor columns contain a high proportion of missing values you will be prompted with a dialog that can be used to automatically fill the empty factor cells.
Cancel	Close the dialog without opening the file.