Locates strings within the lines of a text structure.
Options
CASE = string token | Whether to treat the case of letters as significant when searching for lines of the SUBTEXT within the TEXT (significant, ignored); default significant |
---|---|
REVERSE = string token | Whether to reverse the search to work from the end of the lines of the TEXT (yes, no); default no |
MULTISPACES = string token | Whether to treat differences between multiple spaces and single spaces as significant, or to treat them all like a single space (significant, ignored); default significant |
DISTINCT = string tokens | Whether to require the SUBTEXT to have one or more separators to its left or right within the TEXT (left, right); default * |
SEPARATOR = text | Characters to use as separators; default ' ,;:.' |
Parameters
TEXT = texts | Texts whose strings are to be searched |
---|---|
SUBTEXT = texts | Specifies a string or strings to find in each TEXT |
POSITION = variates | Position of the SUBTEXT strings within the TEXT |
WIDTH = scalars or variates | Right-most character(s) to search in the lines of each TEXT; default * searches up to the end of each line |
SKIP = scalars or variates | Number of characters to skip at the left-hand side of the lines of each TEXT; default 0 |
Description
The TXPOSITION directive allows you to search for strings of characters within the lines of a Genstat text structure. The text to search is specified by the TEXT parameter, and the SUBTEXT parameter specifies the strings that are to be found. You can set SUBTEXT to a single string (or to a text with just one line), if you want to search for the same string of characters within every line of the TEXT. You can set SUBTEXT to a text with as many lines as TEXT, if you want to search for a different string in each line of the TEXT. Finally, you can set TEXT to a single string, and SUBTEXT to a text with several lines, if you want to search the same string to see which of several strings occur there. The POSITION parameter can save a variate storing the position of the first character of the SUBTEXT string(s) in each of the TEXT lines, or zero if the string has not been found.
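The three ways of pairing TEXT and SUBTEXT described above can be sketched as follows. This is only an illustration: the identifiers Dessert, Kind, Candidate, PosSame, PosEach and PosAny, and the data values, are invented for the example and do not come from the Genstat documentation.

```genstat
" Hypothetical data: three short lines to search "
TEXT Dessert; VALUES=!t('apple pie','banana split','cherry tart')

" (a) the same string is searched for in every line of Dessert "
TXPOSITION TEXT=Dessert; SUBTEXT='an'; POSITION=PosSame

" (b) a different string for each line: Kind has as many lines as Dessert "
TEXT Kind; VALUES=!t('pie','split','cake')
TXPOSITION TEXT=Dessert; SUBTEXT=Kind; POSITION=PosEach

" (c) a single string is searched for each of several candidate strings "
TEXT Candidate; VALUES=!t('apple','pear','pie')
TXPOSITION TEXT='apple pie'; SUBTEXT=Candidate; POSITION=PosAny

PRINT PosSame,PosEach,PosAny; DECIMALS=0
```

In (b), 'cake' does not occur in 'cherry tart', so the third value of PosEach should be zero. Likewise, in (c) 'pear' does not occur in 'apple pie', so the second value of PosAny should be zero.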
By default, TXPOSITION takes account of the case of letters (small or capital) in the strings when comparing SUBTEXT with TEXT. So, for example, 'GenStat' would not match 'Genstat'. However, you can set option CASE=ignored to ignore differences in case. It also treats multiple spaces as significant by default, but you can set option MULTISPACES=ignored to treat them all like a single space. By default, the search is from left to right (i.e. from the start to the end of each line of TEXT), but you can set option REVERSE=yes to search from right to left.
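A minimal sketch of these three options, using invented data and identifiers (Sample, Pcase, Pnocase, Pspace and Plast are all hypothetical names):

```genstat
" Hypothetical data: line 1 mixes capitalizations, line 2 has a double space "
TEXT Sample; VALUES=!t('GenStat and Genstat','two  spaces here')

" Default: case is significant, so 'GenStat' at the start of line 1 is not a match "
TXPOSITION TEXT=Sample; SUBTEXT='Genstat'; POSITION=Pcase

" CASE=ignored: 'GenStat' now also counts as a match "
TXPOSITION [CASE=ignored] TEXT=Sample; SUBTEXT='Genstat'; POSITION=Pnocase

" MULTISPACES=ignored: the double space in line 2 is treated like a single space "
TXPOSITION [MULTISPACES=ignored] TEXT=Sample; SUBTEXT='two spaces';\
  POSITION=Pspace

" REVERSE=yes: the search works from the end of each line "
TXPOSITION [REVERSE=yes] TEXT=Sample; SUBTEXT='Genstat'; POSITION=Plast

PRINT Pcase,Pnocase,Pspace,Plast; DECIMALS=0
```

Comparing Pcase with Pnocase for the first line should show the effect of CASE, since the case-sensitive search has to pass over 'GenStat' before it finds 'Genstat'.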
The SKIP parameter allows you to skip characters at the start of the lines of TEXT. You can supply a scalar to skip the same number of characters in every line, or a variate if you want to make different skips in each line. (So, once you have found a SUBTEXT string, you can set SKIP to its position and check whether it occurs again.) Similarly, the WIDTH parameter specifies the right-most character(s) of the TEXT lines to search.
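The repeated-search idea in the parenthesis, together with WIDTH, might look like this. The text Phrase and the variates First, Second and InTen are hypothetical names chosen for the sketch.

```genstat
" Hypothetical data: 'the' occurs twice in line 1 and not at all in line 2 "
TEXT Phrase; VALUES=!t('the cat and the dog','no match here')

" First occurrence of 'the' in each line (zero if absent) "
TXPOSITION TEXT=Phrase; SUBTEXT='the'; POSITION=First

" Skip up to the position already found, then look for a further occurrence "
TXPOSITION TEXT=Phrase; SUBTEXT='the'; POSITION=Second; SKIP=First

" Search only as far as character 10 of each line "
TXPOSITION TEXT=Phrase; SUBTEXT='dog'; POSITION=InTen; WIDTH=10

PRINT First,Second,InTen; DECIMALS=0
```

Because First is a variate, a different number of characters is skipped in each line. 'dog' begins beyond the tenth character of the first line, so the search limited by WIDTH=10 should not find it.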
Option DISTINCT is useful if you are looking for distinct words or phrases. The left setting requires each SUBTEXT string either to begin at the start of the relevant line of TEXT, or to be preceded in that line by a separator (such as a space or comma). Similarly, the right setting requires the SUBTEXT either to be followed within the line of TEXT by a separator, or to be at the end of the line. The separators are specified by the SEPARATOR option.
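As an illustration of whole-word searching, the sketch below contrasts an unrestricted search with one that sets both DISTINCT settings, and then supplies a custom separator. The data and the identifiers Sentence, Anywhere, WholeWord and Hyphen are invented for the example.

```genstat
" Hypothetical data: 'the' also occurs inside 'other' and 'then' "
TEXT Sentence; VALUES=!t('other items, then the end','the start')

" Unrestricted search: 'the' inside 'other' counts as a match "
TXPOSITION TEXT=Sentence; SUBTEXT='the'; POSITION=Anywhere

" Require a separator (or a line boundary) on both sides of 'the' "
TXPOSITION [DISTINCT=left,right] TEXT=Sentence; SUBTEXT='the';\
  POSITION=WholeWord

" Treat the hyphen as the only separator "
TXPOSITION [DISTINCT=left,right; SEPARATOR='-'] TEXT='on-the-fly';\
  SUBTEXT='the'; POSITION=Hyphen

PRINT Anywhere,WholeWord; DECIMALS=0
PRINT Hyphen; DECIMALS=0
```

The hyphen is not among the default separators ' ,;:.', which is why the last statement sets SEPARATOR explicitly.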
Options: CASE, REVERSE, MULTISPACES, DISTINCT, SEPARATOR.
Parameters: TEXT, SUBTEXT, POSITION, WIDTH, SKIP.
Action with RESTRICT
TXPOSITION takes account of restrictions on any of the TEXT or SUBTEXT texts, and will search only the lines that are not excluded by the restriction. The values of the POSITION variate in the restricted units are left unchanged.
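A small sketch of this behaviour, assuming the POSITION variate is declared beforehand so that the untouched units are visible. The identifiers Fruit and Pos, the data, and the placeholder value -1 are all hypothetical.

```genstat
" Hypothetical data, with Pos pre-filled so that unchanged units can be seen "
TEXT Fruit; VALUES=!t('apple','pineapple','grape','banana')
VARIATE [VALUES=-1,-1,-1,-1] Pos

" Restrict Fruit to its first and third lines "
RESTRICT Fruit; CONDITION=!(1,0,1,0)
TXPOSITION TEXT=Fruit; SUBTEXT='ap'; POSITION=Pos

" Remove the restriction before printing "
RESTRICT Fruit
PRINT Pos; DECIMALS=0
```

According to the description above, the second and fourth values of Pos should still be -1 afterwards, because those lines are excluded by the restriction.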
See also
Directives: TEXT, CONCATENATE, EDIT, GETLOCATIONS, TXBREAK, TXCONSTRUCT, TXFIND, TXREPLACE.
Functions: CHARACTERS, GETFIRST, GETLAST, GETPOSITION, POSITION.
Commands for: Calculations and manipulation.
Example
" Example 1:4.7.3, 1:4.7.4 and 1:4.7.6"
TEXT Intro6; VALUES=!t(\
  'Genstat has very comprehensive facilities for Analysis of Variance.',\
  'Almost all of these can be accessed using custom menus. In this',\
  'chapter, we start with the simplest design, a one-way completely',\
  'randomized experiment, before introducing factorial experiments,',\
  'which have more than one treatment or fixed effect. We use an',\
  'experiment with a randomized block design to show how to deal with',\
  'blocks, which involve more than one stratum or source of error in',\
  'the analysis, and extend this idea by analysing a split-plot design.',\
  'Many other types of design can also be analysed by Genstat, and',\
  'details are available in Chapter 4 of Part 2 of the Guide to',\
  'Genstat. We also introduce some of Genstat''s extensive facilities',\
  'for creating designed experiments, available from the Design option',\
  'of the Stats menu.')
TXPOSITION Intro6; SUBTEXT='Genstat'; POSITION=Where
TXPOSITION Intro6; SUBTEXT='Genstat'; POSITION=Next; SKIP=Where
PRINT Where,Next; DECIMALS=0
TXFIND [DISTINCT=left,right] Intro6; SUBTEXT='the';\
  COLUMN=column; LINE=line
PRINT [SQUASH=yes] line,column
& Intro6$[line]
& '!'; FIELD=column
FOR [NTIMES=999]
  TXFIND [DISTINCT=left,right] Intro6; SUBTEXT='the';\
    COLUMN=column; LINE=line; ICOLUMN=column+1; ILINE=line
  EXIT line .EQ. 0
  PRINT [SQUASH=yes] line,column
  & Intro6$[line]
  & '!'; FIELD=column
ENDFOR
TXBREAK Intro6; WORDS=Words
GROUP [CASE=ignored; REDEFINE=yes] Words
TABULATE [PRINT=count; classification=Words]