1. Home
  2. TXBREAK directive

TXBREAK directive

Breaks up a text structure into individual words.

Option

SEPARATOR = text Defines the characters separating the words in the original text; default ' ,;:.'

Parameters

TEXT = texts Text to break into words
WORDS = texts Saves the words contained in each text (in the order in which they occur)
COLUMNS = variates Saves the number of the column in the TEXT where each word began
LINES = variates Saves the number of the line where each word was found
PLACESINLINES = variates Saves the place of each word (first, second &c) within the line where it was found

Description

The TXBREAK directive forms a text containing all the words (including duplicates) found in a text. The original text to break up is supplied by the TEXT parameter, and the WORDS parameter saves a text storing the words that it contains. The words are stored in the order in which they occur in the original text (but, for example, you could use the SORT directive to sort them into alphabetic order). The LINES parameter can save a variate recording the line in the original text where each one was found. The COLUMNS parameter can save a variate recording the column where each word began, and the PLACESINLINES parameter can save a variate giving the place of each word (first, second &c) within the line where it was found.

By default, the words are assumed to be separated from one another by spaces or by any of the standard punctuation characters (comma, semi-colon, colon, full stop). However, you can use the SEPARATOR option to specify some other characters. For example, you could put SEPARATOR=' ,;:.?' to allow question marks as well. These characters are all removed from the words when they are stored.

Option: SEPARATOR.

Parameters: TEXT, WORDS, COLUMNS, LINES, PLACESINLINES.

Action with RESTRICT

TXBREAK takes account of any restrictions on the original text, and omits the words in the restricted lines.

See also

Directives: TEXT, CONCATENATE, EDIT, TXCONSTRUCT, TXFIND, TXPOSITION, TXREPLACE.

Procedure: TXSPLIT.

Functions: CHARACTERS, GETFIRST, GETLAST, GETPOSITION, POSITION.

Commands for: Calculations and manipulation.

Example

"Example 1:4.7.3, 1:4.7.4 and 1:4.7.6"
TEXT Intro6; VALUES=!t(\
'Genstat has very comprehensive facilities for Analysis of Variance.',\
'Almost all of these can be accessed using custom menus. In this',\
'chapter, we start with the simplest design, a one-way completely',\
'randomized experiment, before introducing factorial experiments,',\
'which have more than one treatment or fixed effect. We use an',\
'experiment with a randomized block design to show how to deal with',\
'blocks, which involve more than one stratum or source of error in',\
'the analysis, and extend this idea by analysing a split-plot design.',\
'Many other types of design can also be analysed by Genstat, and',\
'details are available in Chapter 4 of Part 2 of the Guide to',\
'Genstat. We also introduce some of Genstat''s extensive facilities',\
'for creating designed experiments, available from the Design option',\
'of the Stats menu.')
TXPOSITION Intro6; SUBTEXT='Genstat'; POSITION=Where
TXPOSITION Intro6; SUBTEXT='Genstat'; POSITION=Next; SKIP=Where
PRINT      Where,Next; DECIMALS=0
TXFIND     [DISTINCT=left,right] Intro6; SUBTEXT='the';\
           COLUMN=column; LINE=line
PRINT      [SQUASH=yes] line,column & Intro6$[line] & '!'; FIELD=column
FOR [NTIMES=999]
  TXFIND   [DISTINCT=left,right] Intro6; SUBTEXT='the';\
           COLUMN=column; LINE=line; ICOLUMN=column+1; ILINE=line
  EXIT     line .EQ. 0
  PRINT    [SQUASH=yes] line,column & Intro6$[line] & '!'; FIELD=column
ENDFOR
TXBREAK  Intro6; WORDS=Words
GROUP    [CASE=ignored; REDEFINE=yes] Words
TABULATE [PRINT=count; classification=Words]
Updated on June 17, 2019

Was this article helpful?