SELECT is a separately compiled SIMULA CLASS which enables searching
of texts or files by applying BOOLEAN search criteria, for example
to find all lines or records which
contain either the word "SWEDEN" or the word
"DENMARK" but not the word "COPENHAGEN".
SELECT contains procedures to
a) Convert a text containing such a BOOLEAN search
formula into a formula tree suitable for fast searching.
b) Apply such a formula tree to a text or to
part or the whole of a text array.
The allowed operators in the formula are +(=OR), _&(=AND),
-(=NOT) and the left and right parentheses, ( and ).
Unless parentheses indicate otherwise, + is assumed to have
the lowest priority, followed by _&, - and the parentheses.
These operator characters can be changed by the user.
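The priority rules above can be illustrated with a small recursive-descent parser. This is a Python sketch that only models the described behaviour; it is not the SIMULA source of SELECT. The name parse_formula and the nested-tuple tree layout are hypothetical, and the operator characters are assumed to be plain +, & and -.

```python
# Hypothetical model of the operator priorities described above.
# Not the SELECT source: names and tree layout are assumptions.
OR, AND, NOT, LPAR, RPAR = "+", "&", "-", "(", ")"


def parse_formula(formula):
    """Return a nested tuple: ('or', a, b), ('and', a, b), ('not', a) or ('term', s)."""
    pos = 0

    def peek():
        # Current character, or "" when the formula is exhausted.
        return formula[pos] if pos < len(formula) else ""

    def parse_or():
        # '+' has the lowest priority, so it is handled at the outermost level.
        nonlocal pos
        node = parse_and()
        while peek() == OR:
            pos += 1
            node = ("or", node, parse_and())
        return node

    def parse_and():
        # '&' binds tighter than '+'.
        nonlocal pos
        node = parse_not()
        while peek() == AND:
            pos += 1
            node = ("and", node, parse_not())
        return node

    def parse_not():
        # '-' binds tighter than '&'.
        nonlocal pos
        if peek() == NOT:
            pos += 1
            return ("not", parse_not())
        return parse_primary()

    def parse_primary():
        # Parentheses bind tightest; anything else is a search word.
        nonlocal pos
        if peek() == LPAR:
            pos += 1
            node = parse_or()
            if peek() != RPAR:
                raise ValueError("missing right parenthesis")
            pos += 1
            return node
        start = pos
        while pos < len(formula) and formula[pos] not in (OR, AND, NOT, LPAR, RPAR):
            pos += 1
        if start == pos:
            raise ValueError("search word expected at position %d" % start)
        return ("term", formula[start:pos])

    tree = parse_or()
    if pos != len(formula):
        raise ValueError("unexpected character at position %d" % pos)
    return tree
```

In this model the formula SWEDEN+DENMARK&-COPENHAGEN groups as SWEDEN + (DENMARK & (-COPENHAGEN)), while (SWEDEN+DENMARK)&-COPENHAGEN forces the OR to be evaluated first.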
SELECT does not contain any input/output procedures.
Accessible attributes of the CLASS select:
.left margin 10
BOOLEAN PROCEDURE BUILD__CONDITION
translates a text string like "SWEDEN+DENMARK_&-COPENHAGEN" into
a formula tree.
This tree can later be used to scan a text through the procedures
LINE__SCAN and ARRAY__SCAN. BUILD__CONDITION
will return FALSE if the input formula has bad syntax,
e.g_. unbalanced parentheses. In that case, an appropriate error
message is returned in the text SELECT__ERRMESS.
Parameters to BUILD__CONDITION:
REF (OPERATOR) TREE__TOP returns the formula tree.
TEXT SELECTOR contains the input formula text.
SELECTOR need not contain any operators, in which case it simply means
a search for that whole text. If SELECTOR == NOTEXT, then TREE__TOP
returns NONE, which means a formula matching all scanned texts.
BOOLEAN CASESHIFT: If TRUE, upper and lower case characters will be regarded
as equal.
BOOLEAN PROCEDURE LINE__SCAN
applies a scanning formula to a TEXT. Returns TRUE if the
formula is satisfied by segments of the TEXT. The formula
"SWEDEN+DENMARK_&-COPENHAGEN" is for example TRUE for the TEXT
"DENMARK IS A EUROPEAN COUNTRY" but not TRUE for the TEXT
"COPENHAGEN IS THE CAPITAL OF DENMARK".
Parameters to LINE__SCAN:
REF (OPERATOR) TREE__TOP is a formula-tree produced by BUILD__CONDITION.
If this parameter is NONE, LINE__SCAN will always return TRUE.
TEXT INLINE is the text to which the formula is to be applied.
If TREE__TOP =/= NONE AND INLINE == NOTEXT, then FALSE will
always be returned.
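How a formula tree might be applied to a single text can be sketched as follows. This is a Python model, not the SIMULA implementation of LINE__SCAN: the function names and the nested-tuple tree layout are hypothetical, a missing tree is modelled as None, and an empty text is assumed to fail any real formula.

```python
# Hypothetical model of applying a formula tree to one text.
# Tree nodes: ('term', s), ('not', a), ('and', a, b), ('or', a, b).

def holds(node, line):
    """Evaluate one tree node against the text."""
    kind = node[0]
    if kind == "term":
        return node[1] in line          # plain substring search
    if kind == "not":
        return not holds(node[1], line)
    if kind == "and":
        return holds(node[1], line) and holds(node[2], line)
    if kind == "or":
        return holds(node[1], line) or holds(node[2], line)
    raise ValueError("unknown node kind: %r" % (kind,))


def line_scan(tree, line):
    """Model of LINE__SCAN: a missing tree matches everything."""
    if tree is None:
        return True
    if line == "":
        return False                    # assumption: empty text never matches
    return holds(tree, line)


# SWEDEN+DENMARK&-COPENHAGEN, written out by hand as a tree.
FORMULA = ("or",
           ("term", "SWEDEN"),
           ("and", ("term", "DENMARK"), ("not", ("term", "COPENHAGEN"))))

print(line_scan(FORMULA, "DENMARK IS A EUROPEAN COUNTRY"))
print(line_scan(FORMULA, "COPENHAGEN IS THE CAPITAL OF DENMARK"))
```

The two calls correspond to the two example texts given for LINE__SCAN: the first satisfies the formula, the second does not.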
BOOLEAN PROCEDURE ARRAY__SCAN
is similar to LINE__SCAN, but will apply the formula to several
lines of TEXT comprising part or the whole of a TEXT array.
Parameters to ARRAY__SCAN:
REF (OPERATOR) TREE__TOP is the formula produced by BUILD__CONDITION.
TEXT ARRAY LINES is the array of text lines to which the formula is to
be applied.
INTEGER I1, I2 are the lower and upper bounds of the lines in the TEXT ARRAY
to which the formula is to be applied. I1 may be larger than the
absolute lower bound of the array, and I2 may be less than the absolute upper
bound of the array, if you only want to apply the formula to part of the
lines in the whole array. If TREE__TOP == NONE, TRUE will always
be returned. If TREE__TOP =/= NONE AND I2 < I1,
then FALSE will always be returned.
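The bound handling of ARRAY__SCAN can be modelled in a few lines of Python. This is a sketch under stated assumptions, not the SIMULA implementation: the function names are hypothetical, the bounds are inclusive indices into a Python list, and a search word is assumed to be satisfied if it occurs in any line of the selected range.

```python
# Hypothetical model of ARRAY__SCAN over lines[i1..i2] (inclusive bounds).
# Assumption: a search word counts as found if ANY line in the range contains it.

def holds_in_lines(node, lines):
    """Evaluate one tree node against a group of lines."""
    kind = node[0]
    if kind == "term":
        return any(node[1] in line for line in lines)
    if kind == "not":
        return not holds_in_lines(node[1], lines)
    if kind == "and":
        return holds_in_lines(node[1], lines) and holds_in_lines(node[2], lines)
    if kind == "or":
        return holds_in_lines(node[1], lines) or holds_in_lines(node[2], lines)
    raise ValueError("unknown node kind: %r" % (kind,))


def array_scan(tree, lines, i1, i2):
    """Model of ARRAY__SCAN with the two special cases described above."""
    if tree is None:
        return True                     # no formula: everything matches
    if i2 < i1:
        return False                    # empty range: nothing can match
    return holds_in_lines(tree, lines[i1:i2 + 1])


LINES = ["SWEDEN IS IN EUROPE",
         "COPENHAGEN IS IN DENMARK",
         "OSLO IS IN NORWAY"]
BOTH = ("and", ("term", "SWEDEN"), ("term", "OSLO"))

print(array_scan(BOTH, LINES, 0, 2))    # whole array
print(array_scan(BOTH, LINES, 0, 1))    # first two lines only
```

Narrowing the range from the whole array to the first two lines changes the result here, because OSLO occurs only in the last line.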
PROCEDURE TREE__PRINT
will print a formula tree on SYSOUT,
fully parenthesized to show any default assumptions of operator
priorities. OUTIMAGE should be called immediately after TREE__PRINT.
Parameters to TREE__PRINT:
REF (OPERATOR) TREE__TOP is the formula tree to be output.
PROCEDURE SET__OPERATOR__CHARACTERS
will tell the package which characters are to be used as delimiters
in formulas input to BUILD__CONDITION. A default assumption is made
if SET__OPERATOR__CHARACTERS is not called from your program.
Parameters to SET__OPERATOR__CHARACTERS:
TEXT T: This TEXT should always be of length 5 and contain the
following five characters:
.left margin 20;.nofill
( LEFT PARENTHESIS
) RIGHT PARENTHESIS
.left margin 10;.fill
TEXT SELECT__ERRMESS
If BUILD__CONDITION finds a syntax error in the formula, this TEXT
will contain an appropriate error message. You can thus combine
SELECT with SAFEIO and write e.g_.:
request("Give selection criteria:",
CLASS OPERATOR
is the qualification to be used when you declare formula
tree variables in your program, e.g_.:
.skip;.left margin 20
REF (OPERATOR) selector1, selector2;
.left margin 10
.nofill;.left margin 10;.p -10,2,32
OPTIONS(/l); COMMENT demonstration example for SELECT;
COMMENT this program will list all lines in an input file
which satisfy a selection formula;
EXTERNAL TEXT PROCEDURE rest, upcase;
EXTERNAL TEXT PROCEDURE scanto, from, conc;
EXTERNAL CHARACTER PROCEDURE findtrigger;
EXTERNAL BOOLEAN PROCEDURE frontcompare, puttext;
EXTERNAL INTEGER PROCEDURE scanint, search;
EXTERNAL CLASS select;
REF (operator) formula;
outtext("Input selection formula:"); breakoutimage;
IF NOT build__condition(formula,
BEGIN outtext(select__errmess); GOTO ask__for__formula;
INSPECT NEW infile("Infile *") DO
open(blanks(150)); sysout.image:- image;
WHILE NOT endfile DO
IF line__scan(formula,image.strip) THEN outimage;
.fill;.left margin 10
You need not read this section to make the SELECT package work.
It is of interest only if you want to make your programs more efficient.
Scanning of non-significant texts takes time, especially for
complex formulas requiring many scans of the text. In such
a case you can often save time by only applying LINE__SCAN
or ARRAY__SCAN to those parts of your text which contain information,
e.g_. by suppressing non-significant blanks. The simplest case
of this is to strip your texts before scanning them.
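The effect of stripping can be shown with a fixed-length record. This is a Python sketch only: rstrip stands in for the SIMULA text attribute STRIP, the function name is hypothetical, and the record length of 150 is borrowed from the demonstration program.

```python
# Hypothetical illustration: remove trailing blanks before scanning,
# so that the scan does not have to walk over padding.

def strip_trailing_blanks(record):
    """Return the record without its trailing blanks."""
    return record.rstrip(" ")


# A record padded with blanks to a fixed image length of 150 characters.
record = "DENMARK IS A EUROPEAN COUNTRY".ljust(150)
stripped = strip_trailing_blanks(record)

print(len(record), len(stripped))
```

Every search word in the formula is looked for in the text it is given, so the saving from the shorter text grows with the number of words in the formula.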
LINE__SCAN on a long text is more efficient than ARRAY__SCAN
on several shorter texts, especially if CASESHIFT is TRUE.
Sometimes, you can let your array elements be subtexts of a
common main text, and apply LINE__SCAN on part or whole of this main
text instead of ARRAY__SCAN on the array.
However, ARRAY__SCAN is faster than LINE__SCAN if the total
length of the scanned texts becomes smaller, e.g_. if use of
ARRAY__SCAN allows you to avoid scanning of blanks at the end of
lines in the text.
LINECOPY__BUFFER is a TEXT attribute of SELECT used to keep
copies of the texts to be scanned when CASESHIFT = TRUE and when
the number of array elements to be scanned by ARRAY__SCAN
is larger than 10.
Sometimes you can improve efficiency by assigning values yourself
to this buffer. Do not make assignments to it too often.
If you want to write your own procedures similar to LINE__SCAN and
ARRAY__SCAN you need access to this attribute of SELECT, which
internally refers to the lines to be scanned.
[END OF SELECT.HLP]