1. Home
  2. Merging Spreadsheets

Merging Spreadsheets

Merging allows you to combine data sets held in separate spreadsheets, as long as they have one column with the same name that can be used as a matching indicator to merge them together.

If you have two spreadsheets open in Genstat you can merge them together in different orders or at different levels of aggregation using the Merge menu. You can also merge data from more than two files.

Merged spreadsheets do not need to be the same file format e.g. you can merge a .csv file with an Excel or Genstat .gsh file.

Merging two open spreadsheets

Before you open the spreadsheets in Genstat ensure that each one has a column with the same name.

  1. Open the two spreadsheets you want to merge. To do this from the menu select File | Open then navigate to your spreadsheet files, select both by holding down Ctrl while clicking with the mouse, then click Open.

  2. Genstat will warn you that the second spreadsheet contains a column with the same name as a column in the first spreadsheet. This is what we want so click OK to close the prompt.

  3. Click anywhere in the first spreadsheet then from the menu select Spread | Manipulate | Merge. Genstat automatically puts the name of the second spreadsheet in the Merge data from sheet.
  4. Use Matching column and With column to select two columns of the same name.

  5. Set options as required (see Merge options below) or accept the defaults and click OK.

The first spreadsheet is updated with the merged data. In the images below the column ID was used as the matching column. Where an ID was found in one spreadsheet, but not the other, missing values are used to complete the rows.

Spreadsheet before merge Spreadsheet after merging

Merge options - two open spreadsheets

Merge data from sheet Lets you select which open spreadsheet to merge data from.

  • Number of columns to match – Specify how many pairs of columns are to be matched between the two spreadsheets. A column within one spreadsheet should only be selected once. Columns matched together can be selected from the two dropdown lists Matching column and With column described below.
  • Matching column – The columns in the first spreadsheet whose values are to be used in matching rows in the second spreadsheet.
  • With column – The columns in the second spreadsheet whose values are to be used in matching rows in the first spreadsheet.
Select columns to transfer

This opens a new dialog that lets you specify a subset of columns to be merged into the first spreadsheet.

  1. Move items from Available columns into Selected columns by double-clicking them. You can select multiple items by holding down Ctrl while clicking with the mouse, then click  to move them all across in one action.
  2. When you have selected your columns click OK to return to the previous dialog.

Extra rows from merged sheet Lets you specify what to do with unmatched rows from the second spreadsheet.

  • Do not transfer these rows – keep the current rows in the first spreadsheet and do not add in any new rows.
  • Add in extra rows only if matched – add an extra row from the second spreadsheet, if another row already contains this value, i.e. if there are two rows with a value 102 in spreadsheet 2, and only 1 in spreadsheet 1, then the second 102 row would only be added if this option was chosen.
  • Transfer all extra rows – add all extra rows from the second spreadsheet.
Tolerance on numeric matches

Specifies a positive value to be used as a tolerance when comparing two numerical values. If two values differ by less than or equal to this tolerance they will be recognized as a match.

Note: the value of 0 only matches identical values. Problems with apparently identical values not matching can occur when these values differ by the least significant digit, with this not being displayed, e.g. the two numbers 0.3333333333333333 and 0.3333333333333332 differ in the least significant digit, but when displayed to fewer than 16 decimal places they will appear to be equal. This problem occurs most often in calculated values, as for example the calculations 1/3 and 400/600 can give different results in the least significant bit in their binary representation.

Update existing columns in sheet
  • Update columns (by name) – columns with the same name will be updated, with the values from the second spreadsheet being placed into columns in the first spreadsheet. Otherwise columns from the second spreadsheet will just be appended. The default of using matching column names can be modified by selecting Specify matches (see explanation a few sentences further down this page).
  • Missing/empty cells only – only replace missing values in the matched columns in the first spreadsheet.
  • Insert non-missing values only– do not replace values in the first spreadsheet with missing values in the second spreadsheet.
Specify matches Select this to specify pairs of columns to be matched using the Matches button.
Matches

This opens a new dialog that lets you specify matches between columns in the two spreadsheets. Available columns contains columns from the first spreadsheet, which will have data added to it. For each column being transferred, a column from the original spreadsheet is selected. If no column in the original spreadsheet is selected, then the merged column will be added in as a new column in the current spreadsheet.

  1. Move items from Available columns into Match column with by double-clicking them. You can select multiple items by holding down Ctrl while clicking with the mouse, then click  to move them all across in one action.
  2. When you have selected your columns click OK to return to the previous menu.

Merging multiple spreadsheets

Before you open the spreadsheets in Genstat ensure that each one has a column with the same name.

  1. From the menu select Spread | New | Merge Multiple Files.
  2. Click  to navigate to your spreadsheet files.

  3. Select each file by holding down Ctrl while clicking with the mouse, then click Open.

  4. In the Identifying columns for merge field, type the names of all columns you want to use to identify matching rows between spreadsheets, pressing the keyboard Enter key after each name to put it on a new line.

    Each identifying column must appear in all files that are being merged. You can enter a maximum of 8 identifying columns.

  5. Set options as required (see Merge options below) or accept the defaults and click Open.

The spreadsheet opens to display the merged data.

Merge options - multiple spreadsheets

Filename

Lets you select which files to include in the merge. For multi-paged Genstat books (GWB) or Excel files, you will be prompted for each of these files in turn to select one or more of the sheets within the file to merge.
The files are merged in the order that they are displayed in the list. The following buttons can be used to add, remove or change the order of the files in the merge list.

  • Add – add the file in the Filename field into the list of files.
  • Change – replace the selected the file in the list with the name in the Filename field.
  • Remove – remove the selected file from the list.
  • Up – move the selected file up one place in the list.
  • Down – move the selected file down one place in the list.
  • Bottom – move the selected file to the last place in the list.
Identifying columns for merge Lets you specify the names of all columns you want to use to identify matching rows between spreadsheets. Each name must appear on a separate row. Each identifying column must appear in all files that are being merged. You can enter a maximum of 8 identifying columns.
Check columns for factor values If selected, Genstat checks all columns to see if they contain repeated values. If columns appear to contain categorical data you will be prompted to convert these to factors.
Check columns for date values If selected, Genstat checks all text columns to see if they contain data in date format. If any columns appear to contain dates you will be prompted to convert these to a date format.
Set as active sheet

This sets the new spreadsheet that is created as the active spreadsheet. Setting an active spreadsheet provides a method of avoiding multiple data updates to the server when more than one spreadsheet is open within Genstat. When a spreadsheet is set as the active spreadsheet only data from that spreadsheet is automatically updated to the Genstat server.

Another advantage of specifying an active spreadsheet is that the Spread menu options will always be enabled even if your spreadsheet does not have the cursor focus.

Add to book Any open spreadsheets will be displayed in this dropdown list. Select a spreadsheet to merge your data with or leave it at the default setting New Book to create a new workbook.

Updated on June 20, 2019

Was this article helpful?