1. Home
  2. Split or Subset a Spreadsheet

Split or Subset a Spreadsheet

Select menu: Spread | Manipulate | Split/Subset

Use this to create new spreadsheets based on a subset of rows from an existing spreadsheet.

  1. After you have imported your data, from the menu select 
    Spread | Manipulate | Split/Subset.
  2. Fill in the fields as required then click OK.

Operation

  • Split into multiple sheets – Form a new spreadsheet from the specified rows, and a second spreadsheet from the remaining rows. When a spreadsheet is split using Factor groups, one spreadsheet for each factor level will be formed. If a selection of groups has been defined, then only the selected groups will have a spreadsheet formed for them.
  • Subset to a single sheet – Form a new spreadsheet from the specified rows.

Split sheet using

  • Factor Groups – The Factor dropdown list specifies a factor to use to split the rows. If subsetting, the rows in the selected groups selected in the Factor groups to keep list will appear in the new sheet. If splitting, each selected group displayed in the Factor groups to keep list will be split into different spreadsheets. Note that if the spreadsheet contains no factors then this item will not be available. See Understanding Factors within a Spreadsheet for more help on factors.
  • Restriction – The current restriction is used to define the rows that are placed into the
    two spreadsheets, if splitting. The included rows are placed in the first spreadsheet and the excluded rows into the second. If subsetting, the Keep included rows option in the Restriction list field will take the included rows for the subset, and the Keep excluded rows option will take the excluded rows.
    This item is only available if the spreadsheet has a restriction/filter currently defined. See Spreadsheet Restrict/Filter Menu for help on defining restrictions/filters.
  • Selection – The current row selection is used to define the rows that are placed into the two spreadsheets. If splitting, the selected rows are placed in the first
    spreadsheet and the unselected rows into the second. If subsetting, the Keep selected rows option in the Selection list field will use the selected rows for the subset, and the Keep unselected rows option will use the unselected rows.
    This item is only available if the spreadsheet has a row selection currently defined. See how to select data or cells and Spreadsheet Select Menu for help on defining selections.
  • Values in a Column – The Values in dropdown list specifies a column whose values are used to split the rows. The comparison used in the selection criterion is chosen from the dropdown list of Equal to, Less than and Greater than.
    The value to compare with is specified in the field following the comparison
    operator. For the Equal to comparison, multiple values may be given in a comma or space separated list (e.g. 1,2,5,8). If subsetting, rows that satisfy the criterion will appear in the new sheet, and if splitting, the unselected rows will be placed in a second spreadsheet.
  • Random Sampling – A random selection of rows will be selected for the new spreadsheet. If splitting, any row not selected in the random sample will appear in the second spreadsheet. The Number of samples field specifies how many random
    samples will be used. The value can be interpreted as a percentage of the number of rows in the spreadsheet by selecting the % option. If the Sample with replacement option is selected, then at each random selection of a row, all the available rows are eligible, otherwise only rows not yet selected are eligible for selection. If a column in the Weighting dropdown list is selected, then the values in the selected column will be used in a weighted random sample. Rows with a weight value ≤ 0 or missing will not be included in the random sample. The Seed field defines an integer value that will be used to start the randomization. If a value of * is given for the seed, a value from the computer’s clock will be used for the seed. If sampling without replacement is used, then the number of samples must be less than the number or rows in the spreadsheet (or < 100%).

Create unique column names

When selected, then columns in each spreadsheet will have new names generated for them so that they are unique, otherwise the columns will have the same names as the original spreadsheet.

Use factor labels in column names

If splitting the spreadsheet using Factor groups, and Create unique column names is selected, then the names of the factor groups will be used in creating the new unique column names, rather than the default of appending _1, _2, etc, to the original column name. For example, if the factor labels are ‘One’, ‘Two’, ‘Three’, then a column X would generate new column names of X_One, X_Two, and X_Three.

See also

Generate a Random Subset from a Spreadsheet
Randomize Rows in a Spreadsheet
Duplicate a Spreadsheet
Spreadsheet Restrict/Filter Menu
Spreadsheet Select Menu
Spreadsheet Delete Menu
Spreadsheet Manipulate Menu

The SUBSET procedure can be used in the command language split data.

Updated on March 13, 2019

Was this article helpful?