Trailing-Edge - PDP-10 Archives - decuslib10-05 - 43,50337/23/select.rnh
.center
SELECT
.center
======
.skip
SELECT is a separately compiled SIMULA CLASS to enable searching
of texts or files applying BOOLEAN search criteria like
.skip
"SWEDEN+DENMARK_&-COPENHAGEN"
.skip
meaning that you want to find all lines or records which
contain either the word "SWEDEN", or the word
"DENMARK" without the word "COPENHAGEN".
.skip
SELECT contains procedures to
.skip
a) Convert a text containing such a BOOLEAN search
formula into a formula tree suitable for fast searching.
.skip
b) Apply such a formula tree to a text or to
part or the whole of a text array.
.skip
The allowed operators in the formula are +(=OR), _&(=AND),
-(=NOT) and the left and right parentheses, ( and ).
If parentheses do not indicate otherwise, + is assumed to have
the lowest priority, followed by _&, - and the parentheses.
These operator characters can be changed by the user.
.skip
SELECT does not contain any input/output procedures.
.skip
Accessible attributes of the CLASS select:
.left margin 10
.p -10,2,5
BOOLEAN PROCEDURE BUILD__CONDITION
.skip
Translates a text string like "SWEDEN+DENMARK_&-COPENHAGEN" into
a formula tree
.nofill
                      (OR)
                 SWEDEN   (AND)
                    DENMARK   (NOT)
                            COPENHAGEN

.fill
.skip
This tree can later be used to scan a text through the procedures
LINE__SCAN and ARRAY__SCAN. BUILD__CONDITION
will return FALSE if the input formula had bad syntax,
e.g_. with unbalanced parentheses. In that case, an appropriate error
message is returned in the text SELECT__ERRMESS.
.skip
Parameters to BUILD__CONDITION:
.skip
REF (OPERATOR) TREE__TOP returns the formula tree.
.skip
TEXT SELECTOR contains the input formula text.
SELECTOR need not contain any operators, in which case it simply means
a search for that whole text. If SELECTOR == NOTEXT, then TREE__TOP
returns NONE, which means a formula matching all scanned texts.
.skip
BOOLEAN CASESHIFT: If TRUE, upper and lower case characters will be regarded
as identical.
.p -10,2,5
BOOLEAN PROCEDURE LINE__SCAN
.skip
applies a scanning formula to a TEXT. Returns TRUE if the
formula is satisfied by segments of the TEXT. The formula
"SWEDEN+DENMARK_&-COPENHAGEN" is for example TRUE for the TEXT
"DENMARK IS A EUROPEAN COUNTRY" but not TRUE for the TEXT
"COPENHAGEN IS THE CAPITAL OF DENMARK".
.skip
Parameters to LINE__SCAN:
.skip
REF (OPERATOR) TREE__TOP is a formula-tree produced by BUILD__CONDITION.
If this parameter is NONE, LINE__SCAN will always return TRUE.
.skip
TEXT INLINE is the text to which the formula is to be applied.
If TREE__TOP =/= NONE AND INLINE == NOTEXT, then FALSE will
be returned.
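.skip
The two example texts above could be tested with a program like the
following. This is only a sketch: it assumes the parameter orders and
the external declarations used in the demonstration program further
on in this file. COPY is used so that no text constants are passed to
the package, which may be unnecessary. The messages are invented for
the illustration.
.nofill
.skip
BEGIN
  EXTERNAL TEXT PROCEDURE rest, upcase;
  EXTERNAL TEXT PROCEDURE scanto, from, conc;
  EXTERNAL CHARACTER PROCEDURE findtrigger;
  EXTERNAL BOOLEAN PROCEDURE frontcompare, puttext;
  EXTERNAL INTEGER PROCEDURE scanint, search;
  EXTERNAL CLASS select;
  select BEGIN
    REF (operator) formula;
    COMMENT buffer assigned as in the demonstration program;
    linecopy__buffer:- blanks(150);
    IF build__condition(formula,
    copy("SWEDEN+DENMARK_&-COPENHAGEN"),TRUE) THEN
    BEGIN
      COMMENT the text above says this one satisfies the formula;
      IF line__scan(formula,
      copy("DENMARK IS A EUROPEAN COUNTRY"))
      THEN outtext("First text selected");
      outimage;
      COMMENT and that this one does not;
      IF NOT line__scan(formula,
      copy("COPENHAGEN IS THE CAPITAL OF DENMARK"))
      THEN outtext("Second text rejected");
      outimage;
    END ELSE
    BEGIN outtext(select__errmess); outimage;
    END;
  END;
END;
.fill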
.p
BOOLEAN PROCEDURE ARRAY__SCAN
.skip
is similar to LINE__SCAN, but will apply the formula to several
lines of TEXT comprising part or the whole of a TEXT array.
.skip
Parameters to ARRAY__SCAN:
.skip
REF (OPERATOR) TREE__TOP is the formula produced by BUILD__CONDITION.
.skip
TEXT ARRAY LINES is the array of text lines to which the formula is to
be applied.
.skip
INTEGER I1, I2 are the lower and upper bound of the lines in the TEXT ARRAY
to which the formula is to be applied. I1 may be larger than the
absolute lower bound of the array, and I2 may be less than the absolute upper
bound of the array, if you only want to apply the formula to part of the
lines in the whole array. If TREE__TOP == NONE, TRUE will always
be returned. If TREE__TOP =/= NONE AND I2 < I1,
then FALSE will always be returned.
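.skip
A minimal sketch of scanning only part of a text array. The parameter
order (TREE__TOP, LINES, I1, I2) is assumed from the order of the
parameter descriptions above and is not confirmed by this text. The
array TEXTLINES, its contents and the messages are invented for the
illustration, and the external declarations are copied from the
demonstration program further on in this file.
.nofill
.skip
BEGIN
  EXTERNAL TEXT PROCEDURE rest, upcase;
  EXTERNAL TEXT PROCEDURE scanto, from, conc;
  EXTERNAL CHARACTER PROCEDURE findtrigger;
  EXTERNAL BOOLEAN PROCEDURE frontcompare, puttext;
  EXTERNAL INTEGER PROCEDURE scanint, search;
  EXTERNAL CLASS select;
  select BEGIN
    REF (operator) formula;
    TEXT ARRAY textlines(1:3);
    linecopy__buffer:- blanks(150);
    textlines(1):- copy("COPENHAGEN IS A CITY");
    textlines(2):- copy("DENMARK IS A EUROPEAN COUNTRY");
    textlines(3):- copy("SWEDEN IS ANOTHER ONE");
    IF build__condition(formula,
    copy("SWEDEN+DENMARK_&-COPENHAGEN"),TRUE) THEN
    BEGIN
      COMMENT apply the formula to elements 2 and 3 only;
      IF array__scan(formula,textlines,2,3)
      THEN outtext("Lines 2 to 3 selected")
      ELSE outtext("Lines 2 to 3 not selected");
      outimage;
    END ELSE
    BEGIN outtext(select__errmess); outimage;
    END;
  END;
END;
.fill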
.p
PROCEDURE TREE__PRINT
.skip
will print a formula tree on SYSOUT in the format
.break
(SWEDEN+(DENMARK_&(-COPENHAGEN)))
.break
i.e_. fully parenthesized to show any default assumptions of operator
priorities. Outimage should be called immediately after TREE__PRINT.
.skip
Parameters to TREE__PRINT:
.skip
REF (OPERATOR) TREE__TOP is the formula tree to be output.
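.skip
Since parentheses override the default priorities, printing two
variants of a formula is a simple way to check how each of them was
understood. The following is a sketch only, assuming the calling
conventions and external declarations of the demonstration program
further on in this file. The printed form of the second formula is
not stated in this text and is therefore not shown here.
.nofill
.skip
BEGIN
  EXTERNAL TEXT PROCEDURE rest, upcase;
  EXTERNAL TEXT PROCEDURE scanto, from, conc;
  EXTERNAL CHARACTER PROCEDURE findtrigger;
  EXTERNAL BOOLEAN PROCEDURE frontcompare, puttext;
  EXTERNAL INTEGER PROCEDURE scanint, search;
  EXTERNAL CLASS select;
  select BEGIN
    REF (operator) f1, f2;
    linecopy__buffer:- blanks(150);
    IF build__condition(f1,
    copy("SWEDEN+DENMARK_&-COPENHAGEN"),TRUE) THEN
    BEGIN
      COMMENT default priorities apply to this formula;
      tree__print(f1); outimage;
    END;
    IF build__condition(f2,
    copy("(SWEDEN+DENMARK)_&-COPENHAGEN"),TRUE) THEN
    BEGIN
      COMMENT here the parentheses group the + before the AND;
      tree__print(f2); outimage;
    END;
  END;
END;
.fill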
.p
PROCEDURE SET__OPERATOR__CHARACTERS
.skip
will tell the package which characters are to be used as delimiters
in formulas input to BUILD__CONDITION. A default assumption is made
if SET__OPERATOR__CHARACTERS is not called from your program.
.skip
Parameters to SET__OPERATOR__CHARACTERS:
.skip
TEXT T: This TEXT should always be of length 5 and contain the
following five characters:
.skip
.left margin 20;.nofill
Default Character
   +    OR
   _&    AND
   -    NOT
   (    LEFT PARENTHESIS
   )    RIGHT PARENTHESIS
.left margin 10;.fill
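.skip
A minimal sketch of replacing the default operator characters. It
assumes that the five characters are given in the order of the table
above (OR, AND, NOT, left parenthesis, right parenthesis), which this
text does not state explicitly, and that the call must precede the
call of BUILD__CONDITION. The replacement characters are chosen
arbitrarily, and the external declarations are copied from the
demonstration program further on in this file.
.nofill
.skip
BEGIN
  EXTERNAL TEXT PROCEDURE rest, upcase;
  EXTERNAL TEXT PROCEDURE scanto, from, conc;
  EXTERNAL CHARACTER PROCEDURE findtrigger;
  EXTERNAL BOOLEAN PROCEDURE frontcompare, puttext;
  EXTERNAL INTEGER PROCEDURE scanint, search;
  EXTERNAL CLASS select;
  select BEGIN
    REF (operator) formula;
    linecopy__buffer:- blanks(150);
    COMMENT assumed order: OR, AND, NOT, left and right parenthesis;
    set__operator__characters(copy("/*-[]"));
    IF build__condition(formula,
    copy("SWEDEN/[DENMARK*-COPENHAGEN]"),TRUE) THEN
    BEGIN tree__print(formula); outimage;
    END ELSE
    BEGIN outtext(select__errmess); outimage;
    END;
  END;
END;
.fill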
.p
TEXT SELECT__ERRMESS
.skip
If BUILD__CONDITION finds a syntax error in the formula, this TEXT
will return an appropriate error message. You can thus combine
SELECT with SAFEIO and write e.g_.:
.nofill
.skip
        request("Give selection criteria:",
        NOTEXT,textinput(line1__selector,
        build__condition(condition,selector,caseshift)),
        select__errmess,myhelp("SELECT"));
.fill
.p
CLASS OPERATOR
.skip
Is the qualification to be used when you declare formula
tree variables in your program, e.g_.:
.skip;.left margin 20
REF (OPERATOR) selector1, selector2;
.left margin 10
.nofill;.left margin 10;.p -10,2,32
OPTIONS(/l); COMMENT demonstration example for SELECT;
COMMENT this program will list all lines in an input file
which satisfy a selection formula;
BEGIN
  EXTERNAL TEXT PROCEDURE rest, upcase;
  EXTERNAL TEXT PROCEDURE scanto, from, conc;
  EXTERNAL CHARACTER PROCEDURE findtrigger;
  EXTERNAL BOOLEAN PROCEDURE frontcompare, puttext;
  EXTERNAL INTEGER PROCEDURE scanint, search;
  EXTERNAL CLASS select;
  select BEGIN
    REF (operator) formula;
    LINECOPY__BUFFER:- blanks(150);
    ask__for__formula:
    outtext("Input selection formula:"); breakoutimage;
    inimage;
    IF NOT build__condition(formula,
    sysin.image.strip,TRUE) THEN
    BEGIN outtext(select__errmess); GOTO ask__for__formula;
    END;
    tree__print(formula); outimage;
    INSPECT NEW infile("Infile *") DO
    BEGIN
      open(blanks(150)); sysout.image:- image;
      WHILE NOT endfile DO
      BEGIN
        inimage;
        IF line__scan(formula,image.strip) THEN outimage;
      END;
      close;
    END;
  END;
END;
.fill;.left margin 10
.p
EFFICIENCY CONSIDERATIONS:
.skip
You can ignore this section if you only want to make the SELECT package
work. It is of interest only if you want to make your programs more efficient.
.skip
Scanning of non-significant texts takes time, especially for
complex formulas requiring many scans of the text. In such
a case you can often save time by only applying LINE__SCAN
or ARRAY__SCAN to those parts of your text which contain information,
e.g_. by suppressing non-significant blanks. The simplest case
of this is to strip your texts before scanning them.
.skip
LINE__SCAN on a long text is more efficient than ARRAY__SCAN
on several shorter texts, especially if CASESHIFT is TRUE.
Sometimes, you can let your array elements be subtexts of a
common main text, and apply LINE__SCAN on part or whole of this main
text instead of ARRAY__SCAN on the array.
However, ARRAY__SCAN is faster than LINE__SCAN if the total
length of the scanned texts becomes smaller, e.g_. if use of
ARRAY__SCAN allows you to avoid scanning of blanks at the end of
lines in the text.
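.skip
The subtext technique mentioned above could be sketched as follows.
The names WHOLE and PART, the field width of 20 and the field
contents are invented for the illustration. The sketch assumes that
your application can accept the fields being scanned as one
continuous text, and it uses the external declarations of the
demonstration program above.
.nofill
.skip
BEGIN
  EXTERNAL TEXT PROCEDURE rest, upcase;
  EXTERNAL TEXT PROCEDURE scanto, from, conc;
  EXTERNAL CHARACTER PROCEDURE findtrigger;
  EXTERNAL BOOLEAN PROCEDURE frontcompare, puttext;
  EXTERNAL INTEGER PROCEDURE scanint, search;
  EXTERNAL CLASS select;
  select BEGIN
    REF (operator) formula;
    TEXT whole;
    TEXT ARRAY part(1:3);
    INTEGER i;
    linecopy__buffer:- blanks(150);
    whole:- blanks(60);
    COMMENT each element refers to 20 characters of whole;
    FOR i:= 1 STEP 1 UNTIL 3 DO
    part(i):- whole.sub((i-1)*20+1,20);
    part(1):= "SWEDEN";
    part(2):= "NORWAY";
    part(3):= "DENMARK";
    IF build__condition(formula,
    copy("SWEDEN_&DENMARK"),TRUE) THEN
    BEGIN
      COMMENT one scan of whole replaces a scan of the array;
      IF line__scan(formula,whole)
      THEN outtext("Selected")
      ELSE outtext("Not selected");
      outimage;
    END;
  END;
END;
.fill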
.skip
.p -5,2,5
TEXT LINECOPY__BUFFER
.skip
LINECOPY__BUFFER is a text attribute of SELECT used to keep
copies of the texts to be scanned when CASESHIFT = TRUE and when
the number of array elements to be scanned by ARRAY__SCAN
is larger than 10.
.skip
Sometimes you can improve efficiency by assigning values yourself
to this buffer. Do not make assignments to it too often.
.p
TEXT LINE
.skip
If you want to write your own procedures similar to LINE__SCAN and
ARRAY__SCAN you need access to this attribute of SELECT, which
internally refers to the lines to be scanned.
.skip;.fill;.center
[END OF SELECT.HLP]