1. Home
  2. Splitting Values into new Columns (parsing)

Splitting Values into new Columns (parsing)

Text strings or numeric values in a column can be parsed (split) into several new columns. There are two menus that allow you to create new text columns from other columns. Spread | Calculate | Text Split takes sections of text from an existing column and splits it into new columns. 

Spread | Calculate | Combine Text combines text from two or more columns into a new text column. Sometimes you may need to use combinations of these two menus to get the text you require. For example, you could split a column into separate text columns to extract just the parts of the text you need, then put these separate text items back into a single text column.

These menus do not just work on text columns but can be used for the text displayed in a variate or factor column (e.g. the factor labels). One application of this menu is to split a factor with two-part labels into two factors whose levels are specified by the first and second parts respectively. For example a factor with labels ‘N5’, ‘N10,’ ‘P5’, ‘P10’ could be converted into two factors with labels ‘N’, ‘P’ and ‘5’, ’10’ respectively.

You can specify where to split the values using specific characters, such as spaces or hyphens, or specify a location. The cell values resulting from the split can be truncated to a specific length using the width specification for each result column. The resulting columns can be automatically converted into factors if required.

  1. From the menu select Spread | Calculate | Text Split.

     

  2. Select which column you want to split from the Split column dropdown list.
  3. Choose an option for splitting the text from the Split using list e.g. if the column contains a number and a text separated by a space you can split these where the space occurs.
  4. Select the Number of splits to save to specify the number of new columns to create.
  5. Enter the names of the new columns into the Save in column names fields.
  6. Set other options as required then click OK to perform the split and close the dialog, or click Apply to keep the dialog open for another split operation.

In the image below, the text in the column Treatment has been split into three new columns, which have then been converted to factors.

Options

Split column

Lists the columns in the spreadsheet (text or factors) that can be split.

Split using

These options specify how to split the characters in each cell into the new columns.

  • Space delimiter – Split the text using spaces, e.g. ‘A B C’ would split to ‘A’, ‘B’, ‘C’
  • Comma delimiter – Split the text using commas, e.g. ‘A,B,C’ would split to ‘A’, ‘B’, ‘C’
  • Other delimiters – Split the text using delimiters specified in the Separator list, which will become visible when you select this option.
  • Specified positions – Split the text at specified positions, e.g. the string ‘ABCDE’ split at positions 1, 2 and 5 would result in ‘A’, ‘BCD’ and ‘E’. Set the positions in the Position list and specify the direction of counting using the Positions counted from option below.
  • String delimiter – Split the text using the complete string entered into Separator string, which will become visible when you select this option. For example, if ‘to’ was entered as the separator string then ‘Medium to High’ would be split to ‘Medium’ and ‘High’ (with spaces trimmed as described below).

Position list

Specify a comma or space separated list of numbers to identify the positions where to split the text. For example, the list 1,2,4 would split the string ‘ABCDE’ into ‘A’,’BC’ and ‘DE’.

Separator list

Specify a string of characters that identify where the breaks are to occur. For example, +-= would split ‘2+3=5’ to ‘2’, ‘3’ and ‘5’ (if the delimiters are not being retained as part of the split text).

Separator string

When the String delimiter option has been selected as the split method this specifies where the breaks are to occur. For example, ‘and’ would split ‘1 and 2 and 3’ to ‘1’, ‘2’, ‘3’.

Treat multiple delimiters as one

One or more delimiters (e.g. multiple spaces) are treated as a single break point in splitting the text. For example, if the delimiter is a space then the string ‘A B’ would be split into two items ‘A’ and ‘B’. However, if this item was not selected then, ‘A B’ would be split into three items ‘A’,’ ‘ and ‘B’ (the second split is a single space).

Keep delimiters

When selected, each delimiter will be retained in the split text. For example, if the string ‘A+B’ is split using the separator ‘+’, then this would be split into ‘A+’ and ‘B’. You can choose whether to include the separator on the left or right of the split text using the Delimiter stays on option.

Delimiter stays on

When the Keep delimiters option is selected, this specifies whether the delimiter is to be retained on the left or right of the split at that position. For example, if Left is selected and the delimiter is a colon ‘:’ then the string ‘A:B’ would be split into ‘A:’ and ‘B’. Alternatively, if Right is selected then the string ‘A:B’ would be split into ‘A’ and ‘:B’.

Positions counted from

When Specified Start Positions or Specified End Positions is selected as the split method, this specifies whether the positions are counted from the left or right of the text. For example, in ‘ABCDE’, ‘A’ is in position 1 when counted from the left, and in position 5 when counted from the right.

Number of splits to save

This specifies the number of splits to save. If a text splits into more items than specified here, then the splits exceeding this number will be discarded. If fewer items are created, the columns after this will contain blank entries.

Save in column names

Lets you select or provide the name of a column to save the split text into. If a column already exists in the spreadsheet, then a warning will appear before any column is over written.

Width

Setting this item for each column specifies a maximum width of the resulting text items. Items longer than this will be truncated to this length.

Change text case in new columns

These options control how the case of the text in the new columns is changed.

  • Unchanged – The case of the text is unchanged from the original column.
  • Lower – The case of the text is changed to lower case (e.g. ‘ABC’ -> ‘abc’).
  • Upper – The case of the text is changed to upper case (e.g. ‘abc’ -> ‘ABC’).
  • Title – The case of the text is changed to title case, where the first letter in each word (i.e. the first character, and those following a space) are made upper case and all other lower case (e.g. ‘ABC’ -> ‘Abc’, ‘low rate’ -> ‘Low Rate’).

Trim spaces from split columns

Removes leading, trailing and duplicate spaces from the resulting split up text

Convert saved columns to factors

Converts the resulting text columns into factors with labels.

Updated on March 1, 2019

Was this article helpful?