
















The information in this document is subject to change  without  notice
and  should  not  be  construed  as  a  commitment by Digital Equipment
Corporation.  Digital Equipment Corporation assumes no  responsibility
for any errors that may appear in this document.

The software described in this document is furnished under  a  license
and  may  be  used or copied only in accordance with the terms of such
license.

Digital Equipment Corporation assumes no responsibility for the use or
reliability  of  its  software  on  equipment  that is not supplied by
DIGITAL.

     Copyright (C) 1974,1979 by Digital Equipment Corporation

The following are trademarks of Digital Equipment Corporation:

DIGITAL        DECsystem-10   MASSBUS
DEC            DECtape        OMNIBUS
PDP            DIBOL          OS/8
DECUS          EDUSYSTEM      PHA
UNIBUS         FLIP CHIP      RSTS
COMPUTER LABS  FOCAL          RSX
COMTEX         INDAC          TYPESET-8
DDT            LAB-8          TYPESET-10
DECCOMM        DECsystem-20   TYPESET-11
                                                                Page 2


                          table of contents


  1.0 overview                                             3
  1.1 precepts and document organization                   3
  1.2 arrays, strings, and string operations               4
  1.3 usage of "strlib"                                    6
  2.0 declarative conventions and the string data-types    8
  2.1 storage allocation                                   8
  2.2 data-typing of strings                               8
  2.3 string pointers                                     10
  2.4 bounds checking                                     11
  3.0 the level-1 routines                                12
  3.1 the comparative routines                            14
  3.2 the copying routines                                15
  3.3 routines which return substrings                    16
  3.4 the routines which search strings                   18
  3.5 miscellaneous routines                              20
  4.0 level-2 related terminology                         24
  4.1 subsidiary values                                   24
  4.2 data-directed routines -- "mode" values             24
  4.3 character search terminology                        25
  4.4 comparing strings of different lengths              26
  5.0 the level-2 routines                                27
  5.1 the data-directed comparative routine               27
  5.2 the data-directed copying routines                  29
  5.3 the data-directed string searching routine          32
  5.4 character-searching                                 35
  5.5 conversions and mappings                            39
  6.0 error conditions                                    44
  6.1 the defined conditions                              42
  7.0 implementation characteristics                      46
  7.1 "strlib" configurations                             46
  8.0 a programming example                               48
                                                                Page 3


1.0 overview

this document describes the character string manipulation facilities
provided for the fortran-10 user by the string manipulation package,
"strlib".  this initial section is devoted to outlining the interface
between the fortran user and "strlib" and to developing the
operational primitives upon which most string usage is based.

1.1 precepts and document organization

historically, fortran has been word-oriented.  but whereas the line
between "word-machines" and "character-machines" has softened under
the pressure of user needs, fortran has lagged behind.  

          consequently the bulk of the string
          manipulating capabilities one would like
          to have must be grafted onto fortran in
          a manner which is essentially
          transparent to the source language
          syntax.

in other words, one must use subprograms in lieu of string
manipulating statements, and one must establish conventions
by which the existing data-typing and storage allocating
mechanisms of fortran (e.g.  "integer" statement,
"dimension" statement) can be used to describe strings to a
string manipulation package.  the declarative conventions
employed by "strlib", and also the role of literals, are
discussed in section two of this document.

the routines constituting the string manipulation package
are divided into two groups.  the two groups will
respectively be labeled the level-1 routines and the level-2
routines.  the distinction is made purely for expositional
clarity.  the level-1 routines are intended to provide the
"basic" string manipulation capabilities, and the level-2
routines provide either more specialized routines or more
efficient mechanisms for performing certain operations.
however the added capability of level-2 is achieved at the
cost of a more complicated user interface and additional
terminology.  section three is devoted to developing the
level-1 routines and sections four and five will
respectively introduce the additional terminology and
describe the level-2 routines.

section six is important as a reference since it contains a
description of each of the run-time warnings which "strlib"
can generate.  it also tells how to control the amount of
error checking which is done.  section seven can be skipped
by most readers since it is used to delve into some of the
internal workings of string and to suggest how "strlib" can
be used in other than the fortran environment.

section eight contains two commented programming examples
which illustrate many of the capabilities within "strlib".
                                                                Page 4


1.2 arrays, strings, and string operations

a string is to a character approximately what an array is to
a word.  and in fact, even though the character-referencing
routines have not been given names that reflect it, one can
think of a string as an array of characters.  for
expositional purposes this analogy will be taken advantage
of -- i.e.  the familiar subscript notation will be used to
denote the characters of a string and to introduce the basic
string operations.
(note:  within this document a string constant (i.e.
 literal) will be represented as it is within a fortran
 program -- enclosed in single quotes.  for example, 'zzz'
 is a string constant in the same sense that 3 is a numeric
 constant.  also the term, "length of a string", will be
 used interchangeably with the term, "the number of
 characters in a string" -- e.g.  the length of 'abcde' is
 five.)

1.2.1 concatenation


def.  1.1 concatenation is the string operation for
combining a group of strings together into a single string.
(!!) will be used to denote the infix concatenation
operator.

in terms of arrays, "c = a !!  b" means the following:

                dimension a(5),b(6),c(11)
                do 100 i=1,5
        100     c(i)=a(i)
                do 200 i=6,11
        200     c(i)=b(i-5)

thus b(1) is made to immediately follow a(5) within (c).
similarly " 'aaabbbccc' = 'aaa' !!  'bbb' !!  'ccc' ".

1.2.2 lexical comparisons

just as it is useful to compare numbers, it is useful to
compare strings.  however the mechanism of comparison is
slightly different in that string comparison is character by
character and left justified.  on the other hand there are
six basic lexical relational operators just as there are six
numeric relational operators.  that is, one string can be --
equal to, not equal to, greater than, less than, greater
than or equal to, or less than or equal to -- a second
string.

in the descriptions which follow, it is always assumed that
the two strings are of equal length.  

def.  1.2.  two strings are considered equal if each
character in the first string is equal to the corresponding
                                                                Page 5


character in the second string (e.g.  'abd' = 'abd', but
'abc' not = 'abd').

def.  1.3 thru 1.7.  the comparative rule -- in terms of
arrays -- for ".op." equaling one of .ne., .lt., .gt., .ge.,
or .le.  is as follows:

                dimension a(n),b(n)
                do 100 i=1,n
                if a(i) .op. b(i) goto success
                if a(i) .ne. b(i) goto failure
        100     continue

in other words "a .op.  b" succeeds if and only if the first
encountered unequal pair of characters is related by ".op.".
for example, 'abd' is greater than 'abcc' because the first
unequal character pair -- respectively 'd' and 'c' -- is
such that the character in the first string is greater than
the character in the second string.
(note:  for strings, each a(i) and each b(i) is constrained
 to be in the range, zero to 2**n-1, where (n) is the number
 of bits in a character (i.e.  for ascii, the range is
 0-127)).

(note:  all comparative routines use the full character set.
 for instance, capital "a" is in no way considered equal to
 little "a").
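
the comparative rule above can be modeled outside of fortran.
the following is a minimal sketch in python, not part of
"strlib";  the name "lexical_compare" is hypothetical, and the
treatment of two fully equal strings (the case the array
excerpt falls through on) is an assumption of the sketch.

```python
import operator


def lexical_compare(a, b, op):
    """apply the rule of defs. 1.2 thru 1.7 to two equal-length strings.

    hypothetical helper for illustration -- not a "strlib" routine.
    """
    if len(a) != len(b):
        raise ValueError("this sketch assumes strings of equal length")
    for ca, cb in zip(a, b):
        if ca != cb:
            # the first unequal pair of characters decides the result
            return op(ord(ca), ord(cb))
    # assumption: with no unequal pair, the operator is applied to
    # two equal values (so .ge. and .le. succeed, .ne. fails)
    return op(0, 0)


print(lexical_compare('abd', 'abc', operator.gt))
print(lexical_compare('abc', 'abd', operator.ne))
print(lexical_compare('A', 'a', operator.eq))
```

the last call illustrates the note above:  capital "a" and
little "a" have different character codes and so compare
unequal.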

1.2.3 parts of strings

the converse of the concatenation operation is the ability
to deal with parts of a string.  

def.  1.8.  a substring of a string is any contiguous group
of characters within the containing string.

for instance, 'bbb' is a substring of 'aaabbbccc'.  in the
array example following, (b) is caused to equal the 2nd thru
5th elements of (a).

                dimension a(11),b(4)
                do 100 i=2,5
        100     b(i-1)=a(i)

1.2.4 input/output

although input/output is not strictly a string operation,
the word orientation of fortran does make it necessary to
make some provision of word-orienting strings for output and
"un-word-orienting" them for input.  the concept in "strlib"
central to this issue is that of "storage block", and
storage blocks are discussed in sections 2.1 and 2.2.
additionally the routines which most closely relate to this
issue are "bldstr" for input (see section 3.5) and "alnstr"
for output (see section 3.2).
                                                                Page 6


1.2.5 the null string

def.  1.9.  the null string is any string with length of
zero.

the null string can be used to locate a point within a string,
and the usefulness of this will be seen in the examples of
sections three and five.  a user can create a null string in
several ways.  the three direct ways -- in the sense that
one explicitly sets string length to zero -- are to use
"bldstr", "setstr", or "vecstr" which are described in
section three.

1.3 usage of "strlib"

the capabilities of the string manipulation package "strlib"
are accessible to the fortran programmer as user functions
and/or user subroutines, and the package exists as a library
file named "string.rel".  for example, to use "strlib" from
"ccl" one could type:

         .load user.prg,string/lib

and to use it from "link" one could type:

         .r link
         *user.prg,string/sea/go

the location of "strlib" is obviously installation
dependent, but normally one would expect it to be either
"sys:" or "new:".

within "strlib" a naming convention is upheld for the
routine names.  each routine name consists of three
descriptive letters followed by either "str" or "chr",
whichever is applicable.  for example, there is "copstr" and
"fndchr".

an alphabetical list of the level-1 routines in "strlib" is
as follows:

 aftstr         allstr          alnstr          befstr
 bldstr         bndstr          catstr          copstr
 eqlstr         geqstr          gtrstr          lenstr
 leqstr         lesstr          relstr          repstr
 revstr         setstr          trcstr          vecstr
 whistr         appstr

the list of level-2 routines is:

 aftchr         allchr          befchr          chkstr
 cmbstr         cmpstr          cnvstr          fndchr
 fndstr         mapstr          tabstr          taostr
 tazstr         tofstr          tonstr          whichr
 copchr
                                                                Page 7


2.0 declarative conventions and the string data-types

just as there are several numeric data types (e.g.  real,
complex), it is useful to define more than one string data
type.  but as noted in section one, it is necessary to use
the existing data types of fortran in conjunction with
conventions while achieving this.  similarly it is necessary
to use the existing storage allocation mechanisms of
fortran, and these are outlined next.

2.1 storage allocation

within fortran one can allocate storage in essentially three
ways:

1) in a data-typing statement (e.g.  integer, real)
2) in an array dimensioning statement (e.g.  dimension
   statement)
3) by using a constant (or literal) in an executable
   statement.

aside from obvious length restrictions as regards
unsubscripted scalars, all of these mechanisms can be used
to allocate strings.  however it is necessary to recall the
word orientation of fortran;  if one uses the statement
"dimension a(6)" to allocate space for a string, one has
actually allocated space for thirty (ascii) characters since
there are five characters per word.
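
the arithmetic behind "five characters per word" can be
sketched as follows.  this is an illustrative python model,
not "strlib" code;  it assumes the usual decsystem-10
arrangement of five 7-bit ascii characters left-justified in
a 36-bit word with the low-order bit unused, and the names
"pack_word", "unpack_word", and "capacity" are hypothetical.

```python
CHARS_PER_WORD = 5
BITS_PER_CHAR = 7


def capacity(words):
    """number of characters that fit in the given number of words."""
    return words * CHARS_PER_WORD


def pack_word(text):
    """pack up to five characters into one 36-bit word, blank padded."""
    chunk = text.ljust(CHARS_PER_WORD)[:CHARS_PER_WORD]
    word = 0
    for ch in chunk:
        word = (word << BITS_PER_CHAR) | (ord(ch) & 0x7F)
    # assumption: characters are left-justified, low-order bit unused
    return word << 1


def unpack_word(word):
    """recover the five characters held in a packed word."""
    word >>= 1
    chars = []
    for _ in range(CHARS_PER_WORD):
        chars.append(chr(word & 0x7F))
        word >>= BITS_PER_CHAR
    return ''.join(reversed(chars))


print(capacity(6))
print(repr(unpack_word(pack_word('days'))))
```

under these assumptions "dimension a(6)" corresponds to
capacity(6), i.e.  thirty characters.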

2.2 data-typing of strings

before enumerating the string data-types, argument passing
must be discussed since it is via this mechanism that all
communication between the fortran programmer and "strlib"
occurs.  there are four classes of arguments that one will
have occasion to pass to "strlib".

1) fixed-point numbers.  
2) storage blocks.  these can be created by any of the
   mechanisms discussed in section 2.1 and can be declared
   with any data type.
3) bit masks.  these are one word quantities in which each
   bit position carries independent information -- only a
   user of level-2 routines need be cognizant of this sort
   of argument.
4) strings.  again note that it is only for arguments that
   are expected to be strings that "strlib" will do any
   data-type checking.

the data-typing conventions are as follows:

1) a string argument which has been typed either "real" or
   "integer" will be treated as a string of length five
   (irrespective of any dimensioning).  for example:
                                                                Page 8


                integer istr
                real    rstr
                rstr='happy'
                istr='days'
                call a-string-routine (rstr)
                call a-string-routine (istr)

   the first call passes the string 'happy', and the second
   call passes the string 'days '.  note that fortran will
   blank pad in the assignment " istr='days' ".  
   similarly " istr='' " will set "istr" equal to five
   blanks.

2) a string argument which has been typed "double precision"
   will be treated as a string of length ten (irrespective
   of any dimensioning).

3) a string argument which has been typed "logical" will be
   treated as a data-varying string.  a data-varying string
   has the property that the length of the string is stored
   in the word preceding the string.  
   (note:  the maximum possible length that can be specified
    for a data-varying string is 2**18-1 characters.)

   storage is normally allocated for a data-varying string
   by a dimensioning statement of some sort.  also one must
   be careful to allocate room for the character count as
   well.
   consider an example:

                dimension l(7)
                logical l
        100     l(1)=9
        200     call a-string-routine( l(2) )
        300     l(1)=5
        400     call a-string-routine( l(2) )

   the above program excerpt allocates a word for the
   character count (i.e.  l(1)) and space for a string of
   thirty characters starting at l(2).  the statement at 100
   causes the call at 200 to treat l(2) as the starting
   point of a string which currently contains nine
   characters, and the statement at 300 causes the
   invocation of "a-string-routine" at 400 to treat l(2) as
   a string of length five.
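
   the behavior of the excerpt above can be modeled in a
   few lines of python.  this is a sketch only;  the names
   "make_block", "store_text", and "current_string" are
   hypothetical, and the text stored in the block is
   invented for illustration.

```python
CHARS_PER_WORD = 5


def make_block(words):
    """model "dimension l(words)": one count word, then character storage."""
    return [0] + [' '] * ((words - 1) * CHARS_PER_WORD)


def store_text(block, text):
    """place text in the character storage without touching the count."""
    if len(text) > len(block) - 1:
        raise ValueError("text does not fit in the allocated storage")
    block[1:1 + len(text)] = list(text)


def current_string(block):
    """the string a routine would see: "count" characters after the count."""
    count = block[0]
    return ''.join(block[1:1 + count])


l = make_block(7)                 # room for thirty characters
store_text(l, 'happy days are here again')
l[0] = 9                          # like statement 100
print(repr(current_string(l)))
l[0] = 5                          # like statement 300
print(repr(current_string(l)))
```

   changing only the count word changes the string that is
   seen;  the stored characters themselves are untouched.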

4) a string argument which has been typed "complex" will be
   treated as a "string pointer".  the idea of a string
   pointer is very important to the internal workings of the
   string manipulation package, but it is possible to give a
   casual user a "black box" description of what string
   pointers make possible without going into the nature of a
   string pointer.  this will be done here and a more
   detailed description will be delayed until the next
   section, section 2.3.
                                                                Page 9


   the power of string pointers is inherent in the way
   fortran-10 currently defines the concept of "function".
   within fortran, a function is a subprogram which
   (computes and) returns a value in hardware registers zero
   (and one).  consequently to avoid the necessity of having
   to say something like:

        call copy-string(tstring,any-string,2,13)
        iyesno=eql-string(tstring,string2)

   as opposed to:

        iyesno=eql-string(sub-string(any-string,2,13),string2)

   one must be able to pass the information communicated by
   an arbitrary length string within an actual argument
   containing one or two words.  and this is exactly what a
   string pointer does allow.  also string pointers allow
   references to (sub)strings to be made without copying the
   substring into a user variable or temporary area.

   there are several routines in "strlib" which return
   string pointers (see sections three and five).  to use
   these routines as "black boxes" (i.e.  as though they
   return strings), one need only do two things.

   a) declare these routines as "complex".
   b) use these routines only as arguments to other "strlib"
   routines.

5) a fortran literal (i.e.  hollerith constant) will be
   treated by each of the routines of "strlib" as a string
   constant.  however the word orientation of fortran is
   such that all literals are given a length which is a
   multiple of five characters.  for example 'aaabbbc' will
   be right padded with three blanks so that its length will
   be ten.  to circumvent the unavailability of exact length
   string constants, there is a routine in "strlib" to
   truncate (strip) trailing blanks from a string, and it is
   called "trcstr".  note that "trcstr" is one of the string
   pointer returning routines.

(note:  a literal may appear anywhere any other string can.
 consequently "strlib" will attempt to overwrite a literal
 which is used as the destination of one of the string
 modifying routines.)
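
the padding of literals described in item 5 can be sketched
as follows.  this is an illustrative python model and not
"strlib" code;  "pad_literal" and "strip_trailing_blanks" are
hypothetical names, the latter standing in for the effect
attributed to "trcstr" (which in "strlib" returns a string
pointer rather than a copy).

```python
CHARS_PER_WORD = 5


def pad_literal(text):
    """blank pad a literal on the right to a multiple of five characters."""
    words = -(-len(text) // CHARS_PER_WORD)   # ceiling division
    return text.ljust(words * CHARS_PER_WORD)


def strip_trailing_blanks(text):
    """model of the effect of truncating trailing blanks."""
    return text.rstrip(' ')


stored = pad_literal('aaabbbc')
print(repr(stored), len(stored))
print(repr(strip_trailing_blanks(stored)))
```

note that a literal which genuinely ends in blanks cannot
be told apart from one which was padded, so stripping
removes both kinds of trailing blank.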

2.3 string pointers

what information precisely describes a string?  there are
essentially two things one must know.  first of all, one
must know where the string starts -- i.e.  the address of
its first character.  secondly since a string can be an
arbitrary number of characters, one has to know its length.
consequently a string pointer contains a byte pointer (as
                                                               Page 10


its first word) since this is the decsystem-10's mechanism
of dealing with characters (i.e.  bytes), and its second word
contains the number of characters in the string pointed at.

what does a string pointer point at?  it points at a user
allocated storage block.  in other words, the processes of
declaring a string pointer and allocating storage for the
string pointed at are independent of one another.  in order
to transform a storage block and a desired initial length
into a string pointer, one must use the routine called
"bldstr".  this routine is described in detail in section
three.  for example:

                dimension block(11)
                complex bldstr,strptr
                data initl/26/
                strptr=bldstr(block,initl,0)

this example builds a string pointer which points at "block"
and has initial length of twenty-six.  note that there is
room for 11*5 = 55 characters however.

2.4 bounds checking

just as it is possible to reference an array element which
is out of bounds of the array's storage allocation, it is
possible to reference or attempt to modify a character which
is outside of the allocated length of a string.  a user of
"strlib" can cause bounds checking of string usage to occur,
when string pointers or data-varying strings are used, by
specifying a maximum length for the string when a call to
either "bldstr" or "setstr" is made -- see section 3.5 for
details.  also for string variables typed "integer", "real"
and "double precision", bounds checking is always done as a
side effect of the fixed-length nature of these sorts of
string variables.

whenever a routine of "strlib" detects an out-of-bounds
reference it will print a warning message on the user's
console -- see section six for a description of the warning
messages.
                                                               Page 11


3.0 the level-1 routines

the routines in "strlib" communicate information to their
callers in one of four ways.  the rest of this section will
be used to introduce these methods of communication and
enumerate the routines which fall into each class (including
level-2 routines).  then in sections 3.1 thru 3.5 the
level-1 routines will be grouped by function and described
individually.

3.01 string-modifying routines

since arbitrary length strings can not be returned by
functions, text movement has to occur by modification of one
of the arguments specified in the invocation of a
string-modifying routine.  for each of the string modifying
routines, the string to be modified is the first argument of
the routine.

the second point about the string-modifying routines is that
each is (potentially) aware of an attempt to overflow the
destination string.  when a call to one of these routines
does overflow the destination string, the part of the
movement which fits within bounds still occurs, and the
routine returns zero to indicate that string movement was
not completed.  completion of string movement is signalled
by returning -1 rather than 0.  for example, let
s1 be a string whose maximum is 12 characters and let s2 be
a string whose maximum is 20 characters:

        integer modif-routine
        i1=modif-routine(s1,'is 15 characs ')
        i2=modif-routine(s2,'is 15 characs ')

the first call would overflow (s1) and leave it equal to 'is
15 charac' and set i1 to equal 0.  the second call would run
to completion and leave i2 equal to -1 and s2 equal to 'is
15 characs '.
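
the overflow behavior just described can be modeled as
follows.  this is a python sketch, not "strlib" code;
"modify" is a hypothetical stand-in for the
"modif-routine" of the example, and the source literal is
written already padded to fifteen characters as section 2.2
describes.

```python
COMPLETED = -1
NOT_COMPLETED = 0


def modify(dest_maximum, source):
    """copy source into a destination limited to dest_maximum characters.

    returns the resulting destination text and the completion value.
    """
    moved = source[:dest_maximum]     # the in-bounds part is still moved
    if len(source) <= dest_maximum:
        return moved, COMPLETED
    return moved, NOT_COMPLETED


literal = 'is 15 characs  '           # fifteen characters after padding
s1, i1 = modify(12, literal)
s2, i2 = modify(20, literal)
print(repr(s1), i1)
print(repr(s2), i2)
```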

the string modifying routines are:

 alnstr         appstr          catstr          chkstr
 cmbstr         cnvstr          copchr          copstr
 mapstr         repstr          revstr

(note:  "alnstr" is a special case in that the destination
 is a storage block rather than a string.
 note:  "copstr" and "copchr" are special in that they
 return no completion value.
 note:  "revstr" returns a string pointer rather than a
 completion value.
 note:  "repstr" leaves the destination string unchanged if
 the replace cannot succeed.
 note:  if one has no need to worry about out-of-bounds
 references, one can invoke a string-modifying routine as a
                                                               Page 12


 subroutine rather than as a function if that is desired.)

3.02 routines which return a truth value or integer

these routines should be declared as integer functions;
they return information rather than strings.  the simplest
example is "lenstr" which simply returns the number of
characters in its argument.  consider some examples:

        integer eqlstr,lenstr
        if (eqlstr(string1,string2)) go to are-equal
        i=eqlstr(string1,string2)
        if (i.eq.-1) go to are-equal
        j=lenstr(string1)
        if (j.gt.lenstr(string2)) goto s1-longer

the information returning routines are:

 cmpstr         eqlstr          fndchr          fndstr
 geqstr         gtrstr          lenstr          leqstr
 lesstr         neqstr

(note:  some of these routines return subsidiary information
 -- e.g.  a second integer.  the mechanism by which this is
 done and the description of each type of subsidiary
 information will be presented in section five.  for the
 purposes of section three, each level-1 routine returns
 either an integer or a truth-value).

3.03 routines which return string pointers

this class of routines was introduced in sections 2.2 and
2.3.  the following are the routines which return string
pointers:

 aftchr         aftstr          allchr          allstr
 befchr         befstr          bldstr          bndstr
 relstr         revstr          trcstr          vecstr
 whichr         whistr

(note:  allchr and allstr are special cases in that each
 actually sets up three string pointers -- one of which is a
 return value and two of which are arguments passed to them
 by the user.
 note:  some of these routines can "fail".  when one does,
 double zero is returned rather than a string pointer.)

3.04 routines invoked as subroutines rather than functions

these routines are somewhat miscellaneous.  their
commonality is that each communicates with its caller only
by modifying one of its arguments.  the routines in this
class are:

 setstr         tabstr          taostr          tazstr
                                                               Page 13


 tofstr         tonstr

(note:  "copstr", and "cnvstr" under certain circumstances,
 can be viewed as belonging to this class.)

3.1 the comparative routines

these six routines implement the relational operators
discussed in section 1.2.2.  each of them returns ".true."
(ie.  -1) if a comparison succeeds, and ".false." (ie.  0)
if it fails.

when the length of the two strings differs, the shorter
string is padded with blanks until it is equal in length
with the longer string.  the comparison then proceeds by the
rules outlined in defs.  1.2 thru 1.7.

eqlstr

     usage:  i = eqlstr(string1,string2,0)
     (i) will be set to "true" if string1 is lexically equal
     to string2 and "false" otherwise.

geqstr

     usage:  i = geqstr(string1,string2,0)
     (i) will be set to "true" if string1 is lexically
     greater than or equal to string2 and "false" otherwise.

gtrstr

     usage:  i = gtrstr(string1,string2,0)
     (i) will be set to "true" if string1 is lexically
     greater than string2 and "false" otherwise.

leqstr

     usage:  i = leqstr(string1,string2,0)
     (i) will be set to "true" if string1 is lexically less
     than or equal to string2 and "false" otherwise.

lesstr

     usage:  i = lesstr(string1,string2,0)
     (i) will be set to "true" if string1 is lexically less
     than string2 and "false" otherwise.

neqstr

     usage:  i = neqstr(string1,string2,0)
     (i) will be set to "true" if string1 is lexically not
     equal to string2 and "false" otherwise.

examples:
        integer eqlstr, lesstr, geqstr
                                                               Page 14


        real*8 dstr
        data dstr/'aaa       '/
        istr = 'aaa'
        astr = 'bbb'
        if (eqlstr(istr,astr,0)) goto succes
        if (lesstr(istr,astr,0)) goto succes
        if (geqstr(istr,dstr,0)) goto succes

the first "goto" will not be taken and the second will be
taken.  the third will be taken because "istr" will be
padded with blanks to a length of ten.

(note:  the non-zero values of the third argument of each of
 these routines are discussed in section 5.1.)
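
the three comparisons in the examples above can be modeled
in python as follows.  this is a sketch of the blank-padding
rule only, with the third argument fixed at zero;  the
functions "eql", "les", and "geq" are hypothetical stand-ins
and not the "strlib" routines themselves.

```python
TRUE, FALSE = -1, 0


def padded(a, b):
    """blank pad the shorter string to the length of the longer one."""
    width = max(len(a), len(b))
    return a.ljust(width), b.ljust(width)


def eql(a, b):
    x, y = padded(a, b)
    return TRUE if x == y else FALSE


def les(a, b):
    x, y = padded(a, b)
    return TRUE if x < y else FALSE


def geq(a, b):
    x, y = padded(a, b)
    return TRUE if x >= y else FALSE


istr = 'aaa  '          # single-word variable: length five
astr = 'bbb  '          # single-word variable: length five
dstr = 'aaa       '     # double precision: length ten

print(eql(istr, astr), les(istr, astr), geq(istr, dstr))
```

python compares strings character by character from the
left using character codes, which matches the rule of
section 1.2.2 once the two strings have equal length.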

3.2 the copying routines

these routines will be described in terms of the
concatenation operation defined in section 1.2.1.

copstr

     usage:  call copstr(dest-string,s1)
     this is the simplest copying routine.  it implements
     the assignment statement, " dest-string = s1 ".

appstr

     usage:  i = appstr(dest-string,s1)
     this routine implements efficiently the assignment
     statement,
     " dest-string = dest-string !!  s1 ".

catstr

     usage:  i = catstr(dest-string,n,s1,s2,...,sn)
     this routine implements the assignment,

               " dest-string = s1 !! s2 !! ... !! sn ",

     where the second argument (n) is a count of the number
     of strings to be concatenated.

alnstr

     usage:  i = alnstr(storage-block,word-cnt,n,s1,...sn)
     this routine is designed to facilitate the process of
     using fortran formatted i/o to output arbitrary
     strings.  it implements the same function as "catstr"
     with two differences.  first, "storage-block" and
     "word-cnt" are used in place of "dest-string" to
     identify a destination string.  this destination string
     starts at "storage block", is word aligned, and has
     length in characters of "word-cnt * 5".  secondly, if
     the combined length of the source strings is less than
                                                               Page 15


     "word-cnt * 5", the string created at "storage-block"
     will be blank padded until its length is "word-cnt *
     5".

     the utility of "alnstr" hinges upon two facts.  the
     fortran format code "a" always assumes that a sequence
     of characters starts at the left end of a word.
     strings that start on an arbitrary character boundary
     are thus difficult to deal with.  it is also difficult
     to dynamically specify the length of a string -- hence
     the need to blank pad to a defined length.
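
     the effect of "alnstr" on its storage block can be
     sketched in python as follows.  this is a model only;
     "align" is a hypothetical name, and the completion
     value it returns follows the convention of section 3.01
     as an assumption.

```python
CHARS_PER_WORD = 5


def align(word_cnt, *sources):
    """concatenate the sources into a block of word_cnt words.

    the result always has exactly word_cnt * 5 characters:  it is
    blank padded when the sources are short and cut off when long.
    """
    size = word_cnt * CHARS_PER_WORD
    joined = ''.join(sources)
    status = -1 if len(joined) <= size else 0   # assumed convention
    return joined[:size].ljust(size), status


block, status = align(10, 'a data-varying string ', 'a d-p one ')
print(repr(block), len(block), status)
```

     because the block always holds a whole number of
     words, it can be written with a fixed format such as
     "10a5".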

examples:

        logical l1(11)
        real*8 d1
        complex c1
        dimension storag(20),outblk(10)
        complex bldstr
        integer alnstr,appstr,catstr
        data l1/22,'a data-varying string  '/
        data d1/'a d-p one'/
        call copstr (istring,' abc')
        c1 = bldstr(storag,0,0)
        i = appstr(l1(2),d1)
        if (.not. i) goto fail
        i = catstr(c1, 3, l1(2), istring,'more')
        if (.not. i) goto fail
        i = alnstr(outblk, 10, 1, l1(2))
        write (1,101) outblk
101     format(1h ,10a5)

after executing the above program excerpt, (l1) would equal
'a data-varying string a d-p one', (c1) would point at a
string equal to 'a data-varying string a d-p one abc more ',
and the write statement would have generated a record equal
to:
         ' a data-varying string a d-p one                '.

note the spaces following 'abc' and 'more' in the string
pointed at by (c1).  they arise because "integer" string
variables have length five and literals are padded to a
length which is a multiple of five.  also note the blank
padding exhibited in the execution of the write statement
because of "alnstr".  lastly, note the form of the "fail"
checks;  they are shown for illustration even though they
were not strictly necessary, since no bounds checking was
set up for any of the destination strings.

3.3 routines which return substrings

these routines (and the level-2 routines) are predicated on
the notion that string length has a geometrical basis --
that a character can be viewed as having a left side and a
right side.  in other words, the length of a string can be
                                                               Page 16


viewed as being the distance from the left side of the first
character to the right side of the last character in the
string.  what this means is that one must substitute the
concept of "position" for the concept of "subscript" when
discussing substrings.  (note however that in many
circumstances the two concepts are equivalent.)

def 3.1.  position -- a position is an index which locates a
particular point in a string.  the index assigned to the
point immediately preceding the first character in a string
is one.  in general, the (i)th position in a string is the
point to the right of the (i-1)th character and to the left
of the (i)th character in a string.

def 3.2.  starting position -- the starting position of a
substring is the point to the left of the first character in
the substring.  for example, within "example", the starting
position of "xamp" is two.

def 3.3.  ending position -- the ending position of a
substring is the point to the right of the last character in
the substring.  obviously this is equivalent to the point to
the left of the (last + 1)th character, so that the ending
position of "xamp" (see def 3.2) is (5+1) = 6.

(note the identity:  string-length = ending-position -
 starting-position.
 note:  henceforth "pos1" will be used as shorthand for
 starting-position, and "pos2" will be used as shorthand for
 ending-position.)
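
the three definitions can be checked with a small python
sketch.  python strings are indexed from zero, so the sketch
converts an offset into a position by adding one;  the name
"substring_positions" is hypothetical and is used only for
illustration.

```python
def substring_positions(host, sub):
    """Return (pos1, pos2) of the first occurrence of sub in host."""
    offset = host.find(sub)      # 0-based offset, -1 if sub is absent
    if offset < 0:
        return None
    pos1 = offset + 1            # point to the left of the first character
    pos2 = pos1 + len(sub)       # point to the right of the last character
    return pos1, pos2


pos1, pos2 = substring_positions("example", "xamp")
print(pos1, pos2, pos2 - pos1)   # 2 6 4
```

the last value printed is the string length, which
illustrates the identity length = pos2 - pos1.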

relstr

     usage:  string-ptr = relstr(string1, addrel)
     this routine returns a string pointer which points at a
     string whose starting-position within string1 equals
     "addrel + 1" and whose length is "string1-length -
     addrel" and whose maximum is equal to:
     "string1-maximum - addrel".

vecstr

     usage:  string-ptr = vecstr(string1, pos1, length)
     this routine returns a string pointer which points at a
     string whose starting position within string1 is
     "pos1", whose length is "length", and whose maximum is:
                   "maximum of string1 - pos1 + 1".

bndstr

     usage:  string-ptr = bndstr(string1, pos1, pos2)
     this routine returns a string pointer which points at
     the substring whose starting-position within string1 is
     "pos1", whose length is "pos2 - pos1", and whose
     maximum is "string1-length - pos1 + 1".  however if
     "pos2" is zero, "bndstr" will default "pos2" to equal
     the ending position of "string1" -- i.e. string1-length
     + 1.

examples:
        complex relstr, vecstr, bndstr
        complex sp1,sp2,sp3,sp4
        sp1 = relstr('1122334455',2)
        sp2 = vecstr('1122334455',1,6)
        sp3 = bndstr('1122334455',7,9)
        sp4 = bndstr('1122334455',7,0)

this excerpt causes sp1 to point at '22334455' and have
length eight.  sp2 will point at '112233' and have length
six.  sp3 will point at '44' and have length two.  lastly
sp4 will point at '4455' and have length four since the zero
third argument will be defaulted to eleven -- the ending
position of '1122334455'.
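
as a cross-check, the three routines can be modelled on
ordinary python strings.  this is a sketch of the positional
arithmetic only:  it returns copies rather than string
pointers, and it ignores the "maximum" attribute entirely.

```python
def relstr(s, addrel):
    """Substring starting at position addrel + 1."""
    return s[addrel:]


def vecstr(s, pos1, length):
    """Substring starting at pos1 with the given length."""
    return s[pos1 - 1:pos1 - 1 + length]


def bndstr(s, pos1, pos2):
    """Substring from pos1 up to pos2; pos2 == 0 means the end of s."""
    if pos2 == 0:
        pos2 = len(s) + 1
    return s[pos1 - 1:pos2 - 1]


digits = "1122334455"
print(relstr(digits, 2))      # 22334455
print(vecstr(digits, 1, 6))   # 112233
print(bndstr(digits, 7, 9))   # 44
print(bndstr(digits, 7, 0))   # 4455
```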

3.4 the routines which search strings

each of these routines will perform the identical search if
passed the same group of search-related arguments.  the way
they differ from one another is in the value of the string
pointer(s) they return.

the form of the search-related arguments of this class of
routine is:

              search-routine(host-string,n,s1,s2,...,sn)

where host-string is the string being searched, (n) is the
count of the search strings, and s1 thru sn are the search
strings.  the search can be described as follows:

                do 100 i=1,length-of-host
                do 100 j=1,num-of-search-strings
                if (host(i) .eq. s(j,1)) goto compar
        100     continue
                goto search-failed
        compar: if (eqlstr(vecstr(host,i,lenstr(s(j))),
                          s(j))) goto search-succeeded
                goto 100

s(j) is informal notation for the (j)th search string, and
s(j,1) is informal notation for the first character of the
(j)th search string, and host(i) is notation for the (i)th
character of the host.  the program excerpt illustrates that
the search works as a parallel search, finding the search
string which occurs earliest in the host-string.  for
example:

           search-routine('0123456789', 2, '56789','01234')

will find '01234' within the host string.

(note:  if none of the search-strings are found within the
 host string, a search routine will return double zero
 rather than a valid string pointer value).
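
the search can also be sketched in python.  the function
below is an illustrative model, not the library's code:
"parallel_search" is a hypothetical name, and it returns the
starting position and the index of the matching search
string instead of a string pointer ("none" stands in for
double zero).

```python
def parallel_search(host, patterns):
    """Return (pos1, j) for the earliest match, or None.

    pos1 is the 1-based starting position in host and j is the
    1-based index of the search string that matched there.
    """
    for i in range(len(host)):
        # At each host position, try the search strings in order.
        for j, pattern in enumerate(patterns, start=1):
            if host.startswith(pattern, i):
                return i + 1, j
    return None


print(parallel_search("0123456789", ["56789", "01234"]))   # (1, 2)
print(parallel_search("0123456789", ["abcde"]))            # None
```

the first call finds the second search string, because it
occurs earlier in the host than the first one does.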

befstr

     usage:  string-ptr = befstr(host, n, s1,s2,...,sn)
     the arguments are as described above.  this routine
     returns a string pointer which points at the string
     within the host which precedes the found substring
     within the host string.

whistr

     usage:  string-ptr = whistr(host, n, s1,s2,...,sn)
     the arguments are as described above.  this routine
     returns a string pointer to the string which was found
     in the host string.
     (note:  it is assumed that this routine would be used
      only when there is more than one search string and one
      wants to know "which" search string was found).

aftstr

     usage:  string-ptr = aftstr(host, n, s1,s2,...,sn)
     the arguments are as described above.  this routine
     returns a string pointer to the string within the host
     string which is after the matched string.

allstr

     usage:  string-ptr = allstr(host,bef-ptr,aft-ptr,
                                   n,s1,...,sn)
     besides bef-ptr and aft-ptr, the arguments are as
     above.  this routine combines the functions of the
     three preceding routines.  it returns the same string
     pointer as "whistr", sets up bef-ptr to be the value
     that "befstr" would have returned, and sets up aft-ptr
     to be the value that "aftstr" would have returned.
     (note:  if "allstr" fails, bef-ptr and aft-ptr are left
      unchanged).

     usage:  string-ptr = allstr(host,bef-ptr,0,n,s1,...,sn)
     this usage changes the success behavior of "allstr".
     setting the "aft-ptr" argument to a fixed-point zero
     causes "allstr" to return just two string pointer
     values:
     1) string-ptr is set to found-string !!  rest-of-string
     2) bef-ptr is set as in the first usage.

     usage:  string-ptr = allstr(host,0,aft-ptr,n,s1,...,sn)
     this usage is similar to usage-two.  this time,
     however:
     1) string-ptr is set to beginning-of-string !!
     found-string
     2) aft-ptr is set as in the first usage.

examples:
        complex sp1,sp2,sp3,sp4,sp5,sp6,sp7,sp8,sp9
        complex allstr,befstr,aftstr,whistr
        logical l1(2),l2(2),l3(2)
        real*8 digits
        data digits/'0123456789'/
        data l1/0,0/    !the null string
        data l2/2,'34'/
        data l3/2,'23'/
        sp1 = befstr('0123456789', 1, '34567')
        sp2 = aftstr('0123456789', 1, '34567')
        sp3 = whistr('0123456789', 1, '34567')
        sp4 = allstr('0123456789', sp5, sp6, 1, '34567')

        ******* end of part one ********

        sp1 = befstr(digits, 3, '56789', l2(2), l3(2))
        sp2 = befstr(digits, 3, l3(2), l2(2), '56789')
        sp3 = aftstr(digits, 1, 'abcde')
        sp4 = aftstr(digits, 1, '012')
        sp5 = whistr(digits, 2, l2(2), '34567')
        sp6 = whistr(digits, 2, '34567', l2(2))

        ****** end of part 2 ********

        sp1 = allstr(digits, sp2, sp3, 1, l1(2))
        sp4 = allstr(digits, sp5, sp6, 2, '01234', l1(2))
        sp7 = allstr(digits, sp8, sp9, 2, '23456', l1(2))

        ********** end of part 3 *********

        sp1 = allstr(digits,sp2,0, 1,l2(2))
        sp3 = allstr(digits,0,sp4, 1,l3(2))

after executing part one, sp1 would point at '012';  sp2
would point at '89';  sp3 would point at '34567';  sp4 would
equal sp3;  sp5 would equal sp1;  and sp6 would equal sp2.

after executing part two, sp1 and sp2 would point at '01'.
in both cases, l3 would be the matched string because
search-string-order only comes into play when more than one
search string starts at the same place in the host.  in
particular, sp5 would point at '34' and sp6 would point at
'34567' after the execution of the two "whistr"s.  after
execution of the two "aftstr"s, sp3 would equal zero since
'abcde' is not within "digits", and similarly sp4 would
equal zero because the search string is actually equal to
'012  '.

the calls of part three deal with the null string.  since
the null string matches anything, it would match the point
immediately preceding the first character in "digits", and
consequently both sp1 and sp2 would be set to a null string
pointing to that position.  conversely sp3 would point at
'0123456789' -- the entirety of "digits".  however in the
next call to "allstr", sp4 would be caused to point at
'01234' since search order would cause the '01234' to be
encountered before the null string l1.  this call would also
set sp5 and sp6:  to the null string and '56789'
respectively.  the result of the third call to "allstr"
would be the same as the result of the first call to
"allstr" in the sense that sp7 would equal sp1 and so on.
this is the case because '23456' does not start at the
beginning of "digits", and hence the parallel search would
encounter the null string, l1, first.

the calls of part four show the effect of setting either the
bef-ptr or aft-ptr argument to zero.  in the first call, sp1
is set to '3456789';  and sp2 is set to '012'.
in the second call, sp3 is set to '0123';  and sp4 is set to
'456789'.
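
the relationship between the four routines can be sketched
in python.  this model works on copies of the strings rather
than on string pointers, uses python lists in place of the
(n, s1,...,sn) argument list, and returns "none" where the
library returns double zero;  it reproduces the results of
parts one and three above.

```python
def search(host, patterns):
    """Return (offset, pattern) for the earliest match, or None."""
    for i in range(len(host)):
        for pattern in patterns:
            if host.startswith(pattern, i):
                return i, pattern
    return None


def allstr(host, patterns):
    """Return (found, before, after), or None if nothing matches."""
    hit = search(host, patterns)
    if hit is None:
        return None
    i, pattern = hit
    return pattern, host[:i], host[i + len(pattern):]


digits = "0123456789"
print(allstr(digits, ["34567"]))       # ('34567', '012', '89')
print(allstr(digits, ["01234", ""]))   # ('01234', '', '56789')
print(allstr(digits, ["23456", ""]))   # ('', '', '0123456789')
```

in terms of this model, "whistr", "befstr" and "aftstr"
correspond to the first, second and third items of the
returned triple.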

3.5 miscellaneous routines

bldstr

     usage:  str-ptr = bldstr(storage-blk, length, maximum)
     this routine returns a string pointer which points at
     the beginning of storage-blk.  the string pointed at is
     given a length of "length" and a maximum of "maximum".
     specifying a maximum of zero is the mechanism for
     specifying no maximum.

lenstr

     usage:  i = lenstr(string1)
     (i) is set to the length, in characters, of string1.
     (note again that lenstr('literal') will return 10
      rather than 7).

repstr

     usage:  i = repstr(string1, string2, string3)
     this routine causes string1 to be modified such that
     string2, which is a substring within string1, is
     replaced by string3.  if one makes the simplifying
     assumption that the value of string2 occurs within
     string1 in only one place, one can describe "repstr"
     as follows -- letting s1, s2, s3 be short for string1,
     string2, string3:

            s1 = befstr(s1,1,s2) !! s3 !! aftstr(s1,1,s2)

     if the replacement of string2 with string3 would cause
     the maximum of string1 to be exceeded, string1 will not
     be modified at all and (i) will be set to zero rather
     than -1.
     (note:  repstr('12345', '23', 'bc') is meaningless
      because '23' is not a substring of '12345';  for even
      though the value '23' is within '12345', the literal
      '23' is totally distinct from literal '12345' and has
      a totally distinct starting address.)

revstr

     usage:  string-ptr = revstr(string1, string2 or 0)
     this routine will reverse the source string in the
     sense that the first character of the source string
     will be made the last character of the destination
     string and vice versa, the second -- the next to last,
     and so on.  
     if the second argument is 0, string1 will be treated as
     both the source string and destination string.  if
     argument two is non-zero, string2 will be treated as
     the source string and string1 will be treated as the
     destination string.  
     (note:  differences in length between string1 and
      string2 are ignored since the returned string pointer
      will correctly identify the length of the string
      reversed.  however if the maximum of string1 is less
      than the length of string2, the reversal will not
      occur and the string pointer will be set to double
      zero.)

setstr

     usage:  call setstr(string1, length, maximum)
     this routine provides the suggested mechanism for
     (initializing and) setting either the length or maximum
     length of a string.  if "length" is non-negative,
     string1 will be given a length of "length";  and if
     "maximum" is non-negative, string1 will be given a
     maximum of "maximum".  however if the data-type of
     string1 is not "complex" or "logical", this routine
     will act as a no-op.
     (note:  specifying a maximum of 0 is the mechanism for
      specifying no maximum at all.
      note:  a 4th argument can be specified when string1 is
      a string pointer.  this argument is used to create
      non-ascii strings.  in particular it causes "setstr"
      to set the byte size of string1 to the 4th argument.
      accordingly one would usually expect it to be 6 -- for
      sixbit strings.)

trcstr

     usage:  string-ptr = trcstr(string1)
     this routine returns a string pointer to a substring of
     string1 such that string1 and the substring start at
     the same character and the substring has no trailing
     blanks.  because this is such a basic routine, a short
     non-standard name is provided in addition to "trcstr".
     one can invoke this routine as "np" -- no padding.

examples:
        integer repstr
        complex revstr,bldstr, trcstr
        complex sp1,sp2,sp3
        logical l1(2),l2(3),anynul(2)
        dimension inblk(5)
        data l1,l2/4, '1234', 7, 'abcdefg'/
        data anynul/0,0/
        read (1,101) inblk
    101 format(4a5)
        sp1 = bldstr (inblk,20, 20)
        i = lenstr(sp1)
        sp2 = revstr(l2(2), l1(2))
        sp3 = revstr(l1(2), 0)
        call setstr (l2(2), 6, 0)
        i = repstr(sp1, vecstr(sp1,2,0), 'abc')
        if (i) goto succes
        call setstr(sp1, -1, 25)
    300 i = repstr(sp1, vecstr(sp1,2,0), 'abc')
    301 i = repstr(sp1, vecstr(sp1,2,0), trcstr('abc'))
        i = repstr(l1(2), vecstr(l1(2),2,2), anynul(2))

the call to "bldstr" would point sp1 at "inblk" and give it
a length and maximum of twenty characters.  the succeeding
statement would set (i) to 20.  

the first "revstr" would set l2 to '4321efg', but would
point sp2 at '4321'.  the second "revstr" would reverse l1
in place and leave it equal to '4321'.  lastly the call to
"setstr" has the effect of truncating l2 by one and leaving
it equal to '4321ef'.

the group of calls to "repstr" attempts to illustrate the
role of the null string as well as the nature of "repstr".
the first call will fail because it is an attempt to replace
a zero-length string with a string of length five when the
destination string is already at its maximum.  this is
"gotten around" by using "setstr" to increase sp1's maximum,
while leaving its current length untouched, and repeating
the call to "repstr".  after that call, 'abc  ' would be the
2nd thru 6th characters of the string pointed at by sp1, and
that string would have length of 25.  on the other hand, if
the statement at 301 had been executed rather than the one
at 300, only the 'abc' would be inserted in the destination
string, and its new length would be 23.  the last "repstr"
truncates a destination string, removing its second and
third characters and leaving it equal to '41'.
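
the behaviour of "repstr" in these calls can be modelled in
python.  the sketch is an assumption-laden simplification:
the substring is identified by its starting position and
length (standing in for the "vecstr" pointer), the modified
string is returned rather than changed in place, and a
maximum of zero means no maximum.

```python
def repstr(s, pos1, length, replacement, maximum=0):
    """Replace `length` characters of s starting at position pos1.

    Returns (-1, new_string) on success, or (0, s) unchanged if the
    result would exceed a non-zero maximum.
    """
    start = pos1 - 1
    result = s[:start] + replacement + s[start + length:]
    if maximum and len(result) > maximum:
        return 0, s
    return -1, result


host = "0123456789" * 2                         # twenty characters
print(repstr(host, 2, 0, "abc  ", 20)[0])       # 0
print(len(repstr(host, 2, 0, "abc  ", 25)[1]))  # 25
print(len(repstr(host, 2, 0, "abc", 25)[1]))    # 23
print(repstr("4321", 2, 2, ""))                 # (-1, '41')
```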

4.0 level-2 related terminology

4.1 subsidiary values

most of the information-returning routines make available
more than one piece of information to their caller.  the
primary piece of information can always be "gotten at" by
declaring the routine as an integer function as noted in
section three.  when one wishes to get at the subsidiary
information, one must do something analogous to the
following:

        complex c1,eqlstr
        dimension ic (2)
        equivalence (c1,ic)
        c1=eqlstr(string1, string2, 0)

after execution of "eqlstr", ic(1) would contain either
"true" or "false", and ic(2) would contain one of -1, 0, 1
depending upon whether lenstr(string1) was less than,
equal to, or greater than lenstr(string2).
(note:  henceforth the primary value of an information
 returning routine will be referred to as r0, and the
 subsidiary value as r1).
(note:  each of the routines which can potentially return a
 subsidiary value has been given a second name of the form
 <id>(sts or chs) where the trailing "s" stands for
 subsidiary.  for instance, "fndchr" can also be invoked as
 "fndchs").

4.2 data-directed routines -- "mode" values

"mode" is an argument common to several of the level-2
routines.  it is a bit mask which controls the direction of
processing within a particular routine.  in all cases,
"mode" consists of some number of 1-bit switches which can
be set independently (either by or-ing or adding switches
together).  a number of the switches are antonym pairs--for
example if the "append" bit is on, string combination is
"append" mode;  if the same bit is off, string combination
is "copy" (or overwrite) mode.  when a particular bit
defines an antonym pair, the off condition will be noted in
parentheses.

defined switches (by routine):

for cmbstr and chkstr
                        append(copy) = 1 (i.e. bit 35 is on)
                        numeric (character) = 4 (bit 33)
                        pad = 8 (bit 32)

for cmpstr
                        ignore = 1 (bit 35)
                        exact = 2 (bit 34)
                        if neither "ignore" nor "exact" is
                        set, "padded" is implied.
                        translate = 4 (bit 33)
                        trace = 8 (bit 32)

for fndstr
                        idxend (idxbegin) = 1 (bit 35)
                        anchor = 2 (bit 34)
                        partial = 4 (bit 33)
                        multiple = 8 (bit 32)
                        which (length) = 16 (bit 31)

for fndchr
                        idxend (idxbegin) = 1 (bit 35)
                        anchor = 2 (bit 34)
                        partial = 4 (bit 33)
                        backwards (forwards) = 8 (bit 32)

for mapstr
                        toascii (tosixbit) = 1 (bit 35)
                        bounds = 2 (bit 34)
                        translate = 4 (bit 33)
                        yesbound (nobound) = 8 (bit 32)

for cnvstr
                        toascii (tonumeric) = 1 (bit 35)
                        zeropad (blankpad) = 2 (bit 34)
                        nofill = 4 (bit 33)
                        always = 8 (bit 32)

(note:  the fortran "include" file "string.for" contains a
 parameter statement defining a symbolic value for each of
 the numeric mode values defined above.  the symbols in
 "string.for" are the symbols used above, or if these are
 too long, the first six characters thereof).

(note:  in section five, the pseudo-mode "others" will be
 used to indicate that the switch setting being described is
 not affected by other unrelated switches being turned on).
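
the correspondence between switch values and bit indices
follows from the numbering of def. 4.1 (bit 35 is the
rightmost bit of a 36-bit word).  the python sketch below
shows the arithmetic and the combining of switches;  the
names are illustrative and are not the symbols defined in
"string.for".

```python
WORD_BITS = 36


def switch_value(bit_index):
    """Numeric value of the 1-bit switch at the given bit index."""
    return 1 << (WORD_BITS - 1 - bit_index)


APPEND = switch_value(35)    # 1
NUMERIC = switch_value(33)   # 4
PAD = switch_value(32)       # 8

mode = APPEND | PAD
print(mode, mode == APPEND + PAD)               # 9 True
print(bool(mode & PAD), bool(mode & NUMERIC))   # True False
```

or-ing and adding give the same result only because each
switch occupies a distinct bit and is included once.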

4.3 character search terminology

def.  4.1.  bit index -- there are 36 bits in a word and
each is assigned an index;  bit 0 of a word is the sign bit
of a word, and bit 35 of a word is the far right bit in a
word.

def.  4.2.  a bit-vector of length (n) is the sequence of
bits consisting of the (i)th bit of (n) consecutive words.
clearly there are 36 bit-vectors in any storage block -- one
corresponding to each bit index.

def 4.3.  a boolean character table (bct) is a bit vector of
length (128) whose purpose is to encode an arbitrary group
of (distinct) characters in such a way as to make the
execution speed of the analogue of:

              search-routine(host, n, char1, c2,..., cn)

independent of the value of (n).

def.  4.4 let bct(storage-block, bit-index) denote the
specific boolean character table starting at storage-block
and consisting of the (bit-index)th bit of each word in the
storage-block.

def.  4.5 let the notation:

              bct(block,bit-index) = c1 !! c2 !! ... cn

indicate that bct(block,bit-index) is the encoded analogue
of search string list, (n, c1, c2,...,cn).

(note:  the routines which manipulate boolean character
 tables and the character searching routines are described
 in section 5.4).
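
the idea behind a boolean character table can be sketched in
python.  the sketch assumes that the table is indexed by
character code (one word per ascii code, hence length 128)
and models the storage block as a list of integers;  all of
the function names are hypothetical.

```python
TABLE_LENGTH = 128
WORD_BITS = 36


def bct_mask(bit_index):
    """Mask selecting one bit index (bit 35 is the rightmost bit)."""
    return 1 << (WORD_BITS - 1 - bit_index)


def bct_set(block, bit_index, chars):
    """Encode chars into the table that uses the given bit index."""
    mask = bct_mask(bit_index)
    for ch in chars:
        block[ord(ch)] |= mask


def bct_contains(block, bit_index, ch):
    """One lookup per character, however many characters are encoded."""
    code = ord(ch)
    return code < TABLE_LENGTH and bool(block[code] & bct_mask(bit_index))


def find_char(host, block, bit_index):
    """Position of the first host character found in the table, else 0."""
    for i, ch in enumerate(host, start=1):
        if bct_contains(block, bit_index, ch):
            return i
    return 0


block = [0] * TABLE_LENGTH
bct_set(block, 35, "aeiou")        # one table in bit 35 ...
bct_set(block, 0, "0123456789")    # ... and another in bit 0
print(find_char("rhythm and blues", block, 35))   # 8
print(find_char("rhythm and blues", block, 0))    # 0
```

the two tables share the same storage block without
interfering, since each uses a different bit index.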

4.4 comparing strings of different lengths

as noted earlier, the only "difficult" situation that can
occur in comparing two strings is that they are not the same
length but are equal for the extent of the shorter string.
in section 3.1, one method of reacting to inequality of
length was introduced, namely padding the shorter string
with blanks.  at this point, two other reactions will be
defined.

def.  4.6 an "exact-style" comparison will consider two
strings equal only if they are identical, i.e.  contain the
same characters and have the same length.  additionally, if
the two strings are lexically equal for the extent of the
shorter, the longer string will be treated as lexically
greater than the shorter string.

def.  4.7 in an "ignore-style" comparison, the mechanism of
obtaining equality of string lengths is to truncate the
longer string to the length of the shorter string rather
than to pad the shorter string.
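
the three styles can be compared side by side with a short
python sketch.  "compare" is a hypothetical helper that
returns -1, 0 or 1 for lexically less, equal or greater;  it
relies on python's own string ordering, which for ascii text
already treats a longer string as greater than its prefix.

```python
def compare(s1, s2, style="padded"):
    """Lexical comparison of s1 and s2 under the given style."""
    if style == "padded":
        width = max(len(s1), len(s2))
        s1, s2 = s1.ljust(width), s2.ljust(width)   # pad the shorter
    elif style == "ignore":
        width = min(len(s1), len(s2))
        s1, s2 = s1[:width], s2[:width]             # truncate the longer
    elif style != "exact":
        raise ValueError(style)
    return (s1 > s2) - (s1 < s2)


print(compare("abc", "abc  ", "padded"))   # 0
print(compare("abc", "abc  ", "exact"))    # -1
print(compare("abc", "abc  ", "ignore"))   # 0
print(compare("abcde", "abc", "ignore"))   # 0
```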

5.0 the level-2 routines

the level-2 routines are grouped approximately in the same
manner as the level-1 routines were.  however, if
applicable, several usages will be presented for a routine,
and features common to all usages for a particular routine
will be described before its list of "usage:" paragraphs.

5.1 the data-directed comparative routine

each usage of "cmpstr", the data-directed comparative
routine, contains an argument, "code", which determines the
relational operator which is to be applied to the strings
being compared.  "code" is an integer from 0 to 5 such that
code equaling:

        0 ==> the operator is "equal"
        1 ==> the operator is "not equal"
        2 ==> the operator is "greater than or equal"
        3 ==> the operator is "less than or equal"
        4 ==> the operator is "greater than"
        5 ==> the operator is "less than"

(note:  "string.for" also contains parameter statements for
 each of the "code"s defined above).

"cmpstr" returns a subsidiary value in r1 which indicates
the relative lengths of the two strings it compared.  in r1,
"cmpstr" will return:  

        -1 if string1 is shorter than string2
        0 if they are equal in length
        1 if string1 is greater in length

the routine usages follow:

for mode = not exact and not ignore.

     usage:  cmpstr(string1, string2, code, not exact and
                                   not ignore)
     if the relationship between string1 and string2 in a
     "padded-style" comparison is the relationship denoted
     by "code", the comparison will be considered
     successful (i.e. -1 will be returned);  otherwise the
     comparison will be considered to have failed and will
     return 0 in r0.  for instance, "cmpstr(s1, s2, 2, 0)"
     is completely equivalent to "geqstr(s1, s2, 0)".

for mode = ignore.

     usage:  cmpstr(string1, string2, code, ignore)
     if the relationship between string1 and string2 in an
     "ignore-style" comparison is the relationship denoted
     by "code", the comparison will be considered
     successful (i.e. -1 will be returned);  otherwise the
     comparison will be considered to have failed and will
     return 0 in r0.

for mode = exact.

     usage:  cmpstr(string1,string2, code, exact)
     if the relationship between string1 and string2 in an
     "exact-style" comparison is the relationship denoted by
     "code", the comparison will be considered successful
     (i.e. -1 will be returned);  otherwise the comparison
     will be considered to have failed and will return 0 in
     r0.

for mode = trace.

     usage:  cmpstr(string1, string2, code, trace + others)
     if the relationship between string1 and string2, using
     the specified style of comparison, is the relationship
     denoted by "code", the comparison will be considered
     successful (i.e. -1 will be returned).
     otherwise the comparison will be considered to have
     failed and will return in r1 the position of the
     character which caused the comparison to fail.

for mode = translate.

     usage:  cmpstr(string1,string2, code, translate +
                                   others, translation)
     this usage will cause each character in string2, for
     the extent of the shorter string, to be translated by
     the numeric value specified in the 5th argument,
     "translation".  the "translate" mode can be used to
     compare numbers to letters, for instance;  or it can be
     used to compare ascii to sixbit strings when the
     translation factor is octal 40, and so on.

examples:
        complex cmpstr,cc
        integer ic(2)
        equivalence (ic,cc)
 100    cc=cmpstr('abcde','xyz',lss,0)
 200    cc=cmpstr('xyz','abcde',lss,0)
 300    cc=cmpstr('abcde','abc',eql,ignore)
 400    cc=cmpstr('abcde','abc',gtr,ignore)
 500    cc=cmpstr('abcde',np('abc'),eql,ignore)
 600    cc=cmpstr('abcde',np('abc'),gtr,ignore)
 700    cc=cmpstr(np('abc'),'abc',eql,exact)
 800    cc=cmpstr(np('abc'),'abc',lss,exact)
 900    cc=cmpstr(np('abc'),'abc',eql,trace)
 1000   cc=cmpstr(np('abc'),'abc',eql,trace+exact)
 1100   cc=cmpstr('abcde','12345',eql,transl,"20)
 1200   cc=cmpstr('12345','abcde',eql,transl,-"20)
 1300   cc=cmpstr('abcd0','1234 ',eql,transl,"20)
 1400   cc=cmpstr('abcd0',np('1234 '),eql,transl,"20)

the call at 100 returns "true" in r0 (i.e. ic(1)) and zero
(to indicate lengths are equal) in r1.  on the other hand
the call at 200 returns "false" in r0.

the call at 300 returns "false" in spite of the "ignore"
mode because the actual length of 'abc' is five.  conversely
the call at 500 does return "true" in r0 and 1 in r1.  the
call at 400 returns "true" in r0 because "d" is lexically
greater than " ".  however the call at 600 returns "false"
since 'abc' equals 'abc' and the "d" is irrelevant.
correspondingly 1 is returned in r1 for 600.

the call at 700 returns "false" because the two strings have
different lengths.  correspondingly -1 is returned in r1.
the call at 800 does return "true" because the two strings
are equal for the extent of the shorter (i.e. three
characters), and the first string is the shorter.

the call at 900 simply (re)pads string1 and returns "true"
in r0 and -1 in r1.  the call at 1000 notes the failure also
detected at 700 by returning 4 in r0 to indicate that the
failure was detected during the attempt to compare the
fourth characters of the two strings.

the call at 1100 returns "true" in r0 since the octal code
of "a" is 101 and the octal code of "1" is 61, etc.  the
call at 1200 notes the equivalence of simultaneously
inverting the strings and the sign of the translation.  the
next two calls, 1300 and 1400, show the difference between a
blank trailing character and blank padding when "translate"
is set.  the call at 1300 returns "true" since the codes of
blank and zero are octal 40 and 60, but the call at 1400
fails because string2 has no fifth character and the padding
blank is not translated.
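
the "translate" calls can be checked with a python sketch of
a padded-style equality test.  "translated_equal" is a
hypothetical name.  because this manual is printed in lower
case while the codes it quotes (octal 101 and so on) are
those of upper-case letters, the sketch uses upper-case
letters.

```python
def translated_equal(s1, s2, translation):
    """Padded-style equality after translating s2.

    Only the characters of s2 within the extent of the shorter
    string are translated; blank padding is added afterwards and
    is therefore left untranslated.
    """
    extent = min(len(s1), len(s2))
    shifted = "".join(chr(ord(c) + translation) for c in s2[:extent])
    shifted += s2[extent:]
    width = max(len(s1), len(shifted))
    return s1.ljust(width) == shifted.ljust(width)


print(translated_equal("ABCDE", "12345", 0o20))    # True
print(translated_equal("12345", "ABCDE", -0o20))   # True
print(translated_equal("ABCD0", "1234 ", 0o20))    # True
print(translated_equal("ABCD0", "1234", 0o20))     # False
```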

5.2 the data-directed copying routines

"cmbstr" and "chkstr" are essentially generalizations of
"catstr" -- generalizations in the sense that they have more
modes of operation.  but since the exact same modes apply to
the two routines, the modes are described only for "cmbstr".
the only difference between "cmbstr" and "chkstr" is how
they react to an attempt to extend a string beyond its
maximum.  both will return 0 rather than -1, but "chkstr"
will checkpoint the copying operation by modifying two of
its arguments as well.

chkstr

     usage:  i = chkstr(dest-string, others, start-ptr,
     n-left, n, s1,...,sn)
     start-ptr and n-left provide the information to do the
     checkpointing.  for each call, start-ptr should point
     at the character at which one wants string movement to
     start (resume), and n-left should identify which source
     string that character is in.  if s1 is that source
     string, n-left should equal (n);  if s2, then n-left
     should equal (n - 1), et cetera.  under normal
     circumstances, one would continue to call "chkstr" with
     the same arguments only so long as it continued to
     fail.  and after each failure, "chkstr" would itself
     set start-ptr to point at the next character to move
     and n-left to equal the index of the current source
     string.
     conversely, on the first attempt to concatenate the
     source strings, (n) and s1 are the "resume" values.
     consequently it is not necessary that n-left and
     start-ptr be explicitly set up since "chkstr" can get
     the appropriate values from (n) and s1.  one tells
     "chkstr" this is the first time through by setting
     n-left to zero.

cmbstr

for mode = append.

     usage:  i = cmbstr(dest, append + others, n, s1,...,sn)
     this routine usage implements the assignment:

                    dest = dest !! s1 !! ... !! sn

for mode = not append.

     usage:  i = cmbstr(dest, not append + others, n,
                                   s1,...,sn)
     this routine usage implements the assignment:

                        dest = s1 !! ... !! sn

for mode = pad.

     usage:  cmbstr(dest,pad + others, n, s1,...,sn)
     if the length of "dest" before the call is greater than
     the combined lengths of the source strings, "dest" is
     blank padded to that length after (sn) has been copied
     into (appended to) "dest".  if "dest"s length before
     the call is less than the combined lengths of the
     source strings, it is adjusted upwards to the larger
     value.

for mode = numeric.

     usage:  i = cmbstr(dest, numeric + others, n,
                                   source-array)
     source-array contains a list of characters encoded as
     fixed point numbers, and (n) is the number of items in
     the list.  this usage will cause the items in
     source-array to be decoded, concatenated and copied
     into (appended to) "dest".  for instance, 3 is the
     encodement of control-c and octal 40 is the encodement
     of "blank".

examples:
        complex sp1,relstr
        integer chkstr
        logical l1(3)
        data s1,s2,s3,s4/'1122','3344','5566','7788'/
        data left/0/
        call setstr(l1(2), 0, 7)
 100    if (chkstr(l1(2),0, sp1,left, 4,s1,s2,s3,s4)) return
        write (1,101) l1(2),l1(3)
        goto 100
 101    format(1h ,a5,a2)

        ******* part 2 ******

        complex sp1,sp2
        logical l1(9)
        integer onelet,sevlet(6)
        data sevlet/"101,"103,"105,"15,"12, 0/
        data onelet/"10/
        data l1(1),l1(2)/5, 'start'/
        call cmbstr(l1(2), append,2,'more1','more2')
        call setstr(l1(2),40,-1)
        call cmbstr(l1(2),pad,2,'first half','second half')
        call cmbstr(sp1,numeric,1,onelet)
        call cmbstr(sp2,numeric, 5,sevlet)

part one shows how to use "chkstr".  the three-statement
loop starting at 100 is keyed on the return value of
"chkstr".  also note that "left" is initialized to zero in a
data statement.
the result of executing part one is to write out seven
characters three times.  in particular '1122 33' is written
out the first time;  '44 5566' is written out the second
time;  and ' 7788 ' is written out the third time.
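
the checkpointing scheme can be modelled in python.  the
sketch below is an illustration under several assumptions:
"start" is an offset into the current source string rather
than a string pointer, the destination is returned rather
than modified in place, and copy (not append) mode is
assumed.  the four sources are written with the blank
padding that four-character literals receive.

```python
def chkstr(maximum, sources, n_left=0, start=0):
    """One checkpointed copy; returns (ok, dest, n_left, start)."""
    n = len(sources)
    if n_left == 0:                  # first time through: begin at s1
        n_left, start = n, 0
    dest = ""
    index = n - n_left               # 0-based index of the current source
    while index < n:
        room = maximum - len(dest)
        piece = sources[index][start:]
        if len(piece) > room:        # destination is full: checkpoint
            dest += piece[:room]
            return 0, dest, n - index, start + room
        dest += piece
        index, start = index + 1, 0
    return -1, dest, 0, 0


sources = ["1122 ", "3344 ", "5566 ", "7788 "]
n_left, start, records = 0, 0, []
while True:
    ok, dest, n_left, start = chkstr(7, sources, n_left, start)
    records.append(dest)
    if ok:
        break
print(records)   # ['1122 33', '44 5566', ' 7788 ']
```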

after the execution of part two, sp1 will equal "backspace"
and sp2 will equal 'ace<crlf>'.  as regards the other two
calls to "cmbstr", the first will set the length of l1 to 15
and its value to 'startmore1more2', and the second will set
its length to 40 and its value to:
              'first halfsecond half                 '.
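
the modes of "cmbstr" can be summarized in a python sketch.
it is a simplified model:  the destination is returned rather
than modified, the maximum is ignored, and "pad" is taken to
mean padding up to the destination's previous length.  the
mode values are those listed in section 4.2.  note that
octal 101, 103 and 105 are the codes of the upper-case
letters.

```python
APPEND, NUMERIC, PAD = 1, 4, 8


def cmbstr(dest, mode, sources):
    """Return the new value of dest after combining the sources."""
    if mode & NUMERIC:
        combined = "".join(chr(code) for code in sources)
    else:
        combined = "".join(sources)
    result = dest + combined if mode & APPEND else combined
    if mode & PAD:
        result = result.ljust(len(dest))   # never shortens the result
    return result


print(cmbstr("start", APPEND, ["more1", "more2"]))     # startmore1more2
padded = cmbstr(" " * 40, PAD, ["first half", "second half    "])
print(len(padded))                                     # 40
print(repr(cmbstr("", NUMERIC, [0o101, 0o103, 0o105, 0o15, 0o12])))
```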


there is another level-2 copying routine, and it is called
"copchr".  its sole purpose is to deal efficiently with
single bytes of arbitrary size.

copchr

     usage:  call copchr(str-ptr1,index1,str-ptr2,index2)
     this routine implements the assignment, string1(index1)
     = string2(index2), where string-ptr1 points at the
     string starting at string1 (i.e.  the first character
     of string1 is denoted by string1(1)) -- and similarly
     for string-ptr2 and string2.
     if index1 is less than or equal to 1, 1 is assumed.
     if index2 is less than zero, the potential for
     "negative bytes" exists.  in other words, if index2
     were -3, the third byte of string2 would be picked up
     and its leftmost bit would be extended to the left --
     treated as a sign bit.
     if index2 is zero, 1 is assumed.
     "copchr" makes no attempt to detect if either index1 or
     index2 is too large -- out-of-bounds.

examples:
        complex bldstr
        complex sp1,sp2
        sp1=bldstr(i,1,0)
        call setstr(sp1,1,0,36)
        sp2=bldstr('abcde',5,0)
        call copchr(sp1,1,sp2,4)
        call copchr(sp2,2,sp1,1)

        i=-5
        call copchr(sp2,5,sp1,1)
        i=0
        call copchr(sp1,1,sp2,-5)

one of the primary (potential) uses of "copchr" is to deal
with "compressed" numbers.  this can be done by copying such
a number into a byte whose byte size is thirty-six, i.e.  a
full word.  

the first pair of "copchr"s copies a right-justified 'd'
into "i", and then modifies the ascii string to equal
'adcde'.  the second pair of "copchr"s places a compressed
-5 in the fifth byte of sp2 and then restores that -5 to "i"
after clobbering "i" in the assignment, i=0.
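
the sign-extension rule for a negative index2 can be modeled outside
of "strlib".  the python sketch below is only an illustration under
stated assumptions:  a string is represented as a list of byte
values, and the helper names "store_byte" and "fetch_byte" are
hypothetical and are not part of the library.

```python
def store_byte(string, index, value, byte_size):
    """Store value into string(index), keeping only byte_size bits."""
    if index <= 1:          # the manual: an index1 of 1 or less means 1
        index = 1
    string[index - 1] = value & ((1 << byte_size) - 1)


def fetch_byte(string, index, byte_size):
    """Fetch string(|index|); a negative index requests sign extension."""
    signed = index < 0
    position = abs(index) if index != 0 else 1   # an index of zero means 1
    value = string[position - 1]
    if signed and (value >> (byte_size - 1)) & 1:
        value -= 1 << byte_size                  # leftmost bit acts as a sign
    return value


buffer = [0] * 5                      # five 7-bit bytes (assumed layout)
store_byte(buffer, 5, -5, 7)          # a "compressed" -5 in the fifth byte
unsigned = fetch_byte(buffer, 5, 7)   # positive index: no sign extension
restored = fetch_byte(buffer, -5, 7)  # negative index: sign extended
```

with a positive index the stored byte reads back as a small positive
number;  only the negative index recovers the original -5.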

5.3 the data-directed string searching routine

the discussion of string searching at the beginning of
section 3.4 also applies to "fndstr", the data-directed
searching routine.

fndstr

for mode = not idxend.

     usage:  fndstr(host, not idxend, s1)
     this routine usage causes "fndstr" to search the host
     string for s1.  if it is found, the starting-position
     of the matched substring is returned in r0, otherwise 0
     is returned in r0.

for mode = idxend.
                                                               Page 32


     usage:  fndstr (host, idxend, s1)
     this routine usage causes "fndstr" to search the host
     string for s1.  if it is found, the ending-position of
     the matched substring is returned in r0, otherwise 0 is
     returned in r0.

for mode = partial.

     usage:  fndstr(host, partial, pos1, pos2, s1)
     this routine usage causes "fndstr" to search only part
     of the host string for s1, and "bndstr(host, pos1,
     pos2)" is the substring searched.  if s1 is found, the
     starting-position of the matched substring within
     "host" is returned in r0, otherwise 0 is returned in
     r0.
     (note:  as before, pos2 equal to zero means assume the
      ending-position of "host").

 for mode = partial + anchor.

      usage:  fndstr(host, partial + anchor, pos1, pos2, s1)
      processing is as with "partial and not anchor" except
      that it is now only necessary that the first character
      of s1 be within the bounds specified by pos1 and pos2
      -- rather than all of s1.  in other words, this usage
      is the generalized solution of the problem posed by,
      "it's a long string, but it is known to start between
      the (pos1)th and (pos2)th characters of the editing
      buffer".

 for mode = anchor + not partia.

      usage:  fndstr(host,anchor + not partia,s1)
      this usage exists as a convenience.  it is equivalent
      to specifying "partia" and "anchor" together and
      setting pos1 to (1) and pos2 to (2).  in other words,
      the first character of the host must match the first
      character of (one of) the search string(s).

 for mode = multiple.

      usage:  fndstr(host, multiple + others, n, s1,...,sn)
      specifying the "multiple" switch tells "fndstr" to
      expect a count (n) and (n) search strings as the last
      arguments in its argument list.  all searching is as
      specified in the earlier usages except that there are
      now (n) search strings rather than one search string.

 for mode = partial + multiple.

     usage:  fndstr(host, multiple + partial + others, pos1,
                                   pos2, n, s1,...,sn)
     this usage is shown explicitly only to show the form of
     the argument list when both partial and multiple are
     specified.
                                                               Page 33


for mode = multiple and which.

     usage:  fndstr(host, which + multiple + others, n,
     s1,...,sn)
     if one of the search strings is found within the host,
     say the (i)th search string, (i) will be returned in
     r1.  otherwise zero will be returned in r1.

for mode = not which.

     usage:  fndstr(host,not which + others, s1)
     if (one of) the search string(s) is found within the
     host, its length will be returned in r1.  otherwise
     zero will be returned in r1.

examples:
        complex cc,ccext
        integer ic(2),icext(2)
        equivalence (cc,ic),(ccext,icext)
        complex fndstr,np
        logical l1(2)
        real*8 digits,filnam
        data l1/3,'123'/
        data filnam/'file.ext'/
        data digits/'0123456789'/
        cc=fndstr(digits,multip+which,2,'345',l1(2))
        cc=fndstr(digits,multip,2,'345',l1(2))
        mode=multip+which+partia
        ccext=fndstr(filnam,mode,2,8, 2,np('.'),np('['))
        if (icext(2).eq.2) goto noext

        ******* part two *******

        integer hasdev,hasppn
        real*8 filspc
        complex np
        integer fndstr
        data filspc/'d:f.x[1,2]'/
        mode=partia+anchor+idxend
        hasdev=fndstr(filspc, mode,2,8,np(':'))
        hasppn=fndstr(filspc,partia+anchor,hasdev,0,np('['))

part one, among other things, shows a potential use of a
subsidiary value.  the if-statement after the search of
"filnam" checks to see whether the filename was ended by a
directory or an extension.  in this case it is ended by an
extension.  note also that an index of 5 is returned in r0.
the first search of "digits" returns 2 in r0 and 2 in r1
also.  the second search of "digits" is identical except for
the subsidiary information returned.  this time the length
of l1, 3, is returned in r1.

the two searches in part two illustrate how a string can be
"stepped" thru.  the first search of "filspc" returns 3 in
r0 (ie.  the starting-position of the file name).  note that
                                                               Page 34


the minimum number of characters is searched -- assuming no
extraneous blanks.  the second search takes advantage of the
restricted choice provided from the first search by using
"hasdev" as its pos1.  and as noted above, the pos2 of 0
causes the rest of "filspc" to be in the search path.  the
result of this search is to set "hasppn" to 6, and the
combined information of "hasdev" and "hasppn" provides the
starting and ending position of the file name, 'f.x'.
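
the searches discussed above can be replayed with a small python
model.  this is a sketch and not the "strlib" implementation.  it
assumes, as the examples suggest, that positions are 1-based, that
the searched region runs from pos1 up to but not including pos2,
that a pos2 of zero means the end of the host, and that "idxend"
returns the position just past the match.  the keyword arguments
stand in for the mode switches and are a hypothetical interface.

```python
def fndstr(host, strings, pos1=1, pos2=0,
           idxend=False, anchor=False, which=False):
    """Return (r0, r1) for the first match of any search string."""
    n = len(host)
    low = max(pos1, 1)
    high = n + 1 if pos2 == 0 else min(pos2, n + 1)
    for start in range(low, high):
        for number, s in enumerate(strings, 1):
            end = start + len(s)              # position just past the match
            if not anchor and end > high:
                continue                      # all of s must lie in bounds
            if s and host[start - 1:end - 1] == s:
                r0 = end if idxend else start
                r1 = number if which else len(s)
                return r0, r1
    return 0, 0


digits_which = fndstr("0123456789", ["345", "123"], which=True)
digits_length = fndstr("0123456789", ["345", "123"])
extension = fndstr("file.ext", [".", "["], pos1=2, pos2=8, which=True)
hasdev, _ = fndstr("d:f.x[1,2]", [":"], 2, 8, idxend=True, anchor=True)
hasppn, _ = fndstr("d:f.x[1,2]", ["["], hasdev, 0, anchor=True)
```

under these assumptions "hasdev" and "hasppn" bracket the file name
exactly as the text describes.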


5.4 character-searching

before describing the actual searching routines, the
routines which manipulate boolean character tables will be
delineated.

5.4.1 manipulating boolean character tables

as described in section 4.3, one aspect of identifying a
particular boolean character table is its bit-index.  in
order to make it possible to specify more than one (bct)
simultaneously, all of the character-search related routines
accept the bit-index information in an encoded form.  in
particular, to identify the (i)th boolean character table of
a storage-block, one passes a bit mask which has its (i)th
bit turned on.  for example to pass a bit index of 35, one
would set the mask equal to 1;  and to pass a bit index of
zero, one would set the mask to the octal quantity "400000
000000";  and to pass both simultaneously, one would set the
mask to the octal quantity "400000 000001".
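
the encoding of bit-indexes into a mask can be written down
directly.  the python sketch below assumes a 36-bit word whose bits
are numbered 0 (leftmost) through 35 (rightmost), as the examples
above imply;  the function name "bct_mask" is hypothetical.

```python
def bct_mask(*bit_indexes):
    """Build a 36-bit mask with the given bit-indexes turned on."""
    mask = 0
    for index in bit_indexes:
        if not 0 <= index <= 35:
            raise ValueError("bit-index must be in the range 0-35")
        mask |= 1 << (35 - index)     # bit 0 is the leftmost bit
    return mask


rightmost = bct_mask(35)
leftmost = bct_mask(0)
both = bct_mask(0, 35)
```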

tazstr

     usage:  call tazstr(storage-block, mask)
     this routine will remove all characters from the
     specified table(s).

taostr

     usage:  call taostr(storage-block, mask)
     this routine will place all characters in the specified
     table(s).

tonstr

     usage:  call tonstr(storage-block, mask, string1)
     this routine will place (add) each of the characters in
     string1 into the specified table(s).

tofstr

     usage:  call tofstr(storage-block, mask,string1)
     this routine will remove each of the characters in
     string1 from the specified table(s).
                                                               Page 35


tabstr

     usage:  call tabstr(storage-block, mask, string1)
     this routine combines most of the above functions.
     calling "tabstr" with a mask in which exactly one bit
     is off will cause "tabstr" to do the equivalent of:
         call taostr(storage-block, .not.  mask)
         call tofstr(storage-block, .not.  mask, string1)
     calling "tabstr" with a mask in which at least two bits
     are off will cause "tabstr" to do the equivalent of:
     "call tonstr(storage-block, mask, string1)".

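the five table routines can be modeled in a few lines of python.
this is a sketch under assumptions:  the storage-block is taken to
be 128 words of 36 bits, one word per character code, with each bit
position belonging to one table.  the helper "in_table" is
hypothetical, and the behavior of "tabstr" for a mask with no bits
off is not stated in this manual, so the sketch simply treats it
like "tonstr".

```python
WORD = (1 << 36) - 1                  # all 36 bits on


def new_block():
    return [0] * 128                  # one word per character code


def tazstr(block, mask):
    for code in range(128):
        block[code] &= ~mask & WORD   # remove every character


def taostr(block, mask):
    for code in range(128):
        block[code] |= mask           # place every character


def tonstr(block, mask, string1):
    for ch in string1:
        block[ord(ch)] |= mask        # add the listed characters


def tofstr(block, mask, string1):
    for ch in string1:
        block[ord(ch)] &= ~mask & WORD    # remove the listed characters


def tabstr(block, mask, string1):
    off = ~mask & WORD
    if bin(off).count("1") == 1:      # exactly one bit off: inverted table
        taostr(block, off)
        tofstr(block, off, string1)
    else:                             # otherwise behave like tonstr
        tonstr(block, mask, string1)


def in_table(block, mask, ch):
    return (block[ord(ch)] & mask) != 0


table = new_block()
tabstr(table, 1, "aeiou")             # table 35 holds the vowels
tabstr(table, ~2 & WORD, " ")         # table 34 holds all but the blank
```
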
5.4.2 character-searching routines

the names of the character searching routines are patterned
after the names of the string searching routines.  for each
string searching routine, xxxstr, there is a corresponding
character searching routine, xxxchr.

the power which derives from the ability to simultaneously
pass more than one (bct) to a character searching routine
(or table manipulating routine) is not plainly apparent.
what it allows one to do is set-operations with groups of
characters (i.e.  one group = one (bct)).  for example, if
bct(block1,1) = "the vowels" and bct(block1,2) = "the
digits", passing a mask set to 3 will cause the character
searching routine to match either the vowels or the digits.

the power inherent in the ability to easily invert a boolean
character table (ie.  tofstr) is also not apparent.  for
instance, if one wished to find the first arbitrary length
sequence of blanks, tabs, carriage returns, and line feeds,
i.e.  "span(these 4 characters)", one could set a (bct) to
these 4 characters and find the first such character with
one of the character searching routines.  then one could set
a second (bct) to all characters but these four characters
and find the 1st occurrence of a character from this second
table.  the string between these two points would be the
"span".

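the two-table "span" technique described above can be sketched in
python.  python sets stand in for the two boolean character tables,
and the names "first_in" and "span" are hypothetical;  the sketch
shows the idea only and does not call any "strlib" routine.

```python
def first_in(host, table, start=1):
    """1-based position of the first character of host in table, or 0."""
    for position in range(start, len(host) + 1):
        if host[position - 1] in table:
            return position
    return 0


def span(host, characters):
    table = set(characters)
    inverse = {chr(code) for code in range(128)} - table   # inverted table
    begin = first_in(host, table)
    if begin == 0:
        return ""                       # no such character in the host
    end = first_in(host, inverse, begin)
    if end == 0:
        end = len(host) + 1             # the span runs to the end
    return host[begin - 1:end - 1]


blanks = span("ab \t\r\ncd  e", " \t\r\n")
```
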
for each of the descriptions below, let:
bct(block1, unencoded-mask) = c1 !!  c2 !!  ...  !!  cn
where (ci) is an arbitrary character.

befchr

     usage:  string-ptr = befchr(host, block1, mask)
     the output behavior of this routine is completely
     equivalent to the output behavior of:
     "befstr(host, n, c1, c2, ..., cn)".

whichr

     usage:  string-ptr = whichr(host, block1, mask)
     the output behavior of this routine is completely
                                                               Page 36


     equivalent to the output behavior of:
     "whistr(host, n, c1, c2, ..., cn)".

aftchr

     usage:  string-ptr = aftchr(host, block1, mask)
     the output behavior of this routine is completely
     equivalent to the output behavior of:
     "aftstr(host, n, c1, c2, ..., cn)".

allchr

     usage:  string-ptr = allchr(host, block1, mask,
                                   bef-ptr, aft-ptr)
     the output behavior of this routine is completely
     equivalent to the output behavior of:
     "allstr(host, bef-ptr, aft-ptr, n, c1, c2, ..., cn)".

     usage:  string-ptr = allchr(host, block1, mask,
     bef-ptr,0)
     this usage is analogous to the "allstr" usage in which
     the aft-ptr argument is zero.  string-ptr is set to the
     latter part of the host starting with the matched
     character, and bef-ptr is set to the part of the host
     before the matched character.

     usage:  string-ptr = allchr(host, block1, mask,
     0,aft-ptr)
     this usage is of course analogous to the similar
     "allstr" usage;  string-ptr is set to the beginning of
     the host thru the matched character, and aft-ptr is set
     to the remainder of the host after the matched character.
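
the two "allchr" usages with a zero argument split the host in
different places.  the python sketch below models only those two
usages as worded above.  a python string stands in for the boolean
character table, the two function names are hypothetical, and
returning "None" on failure is an assumption, since the failure
behavior is defined by "allstr" and not restated here.

```python
def allchr_bef_only(host, table):
    """aft-ptr = 0: returns (string_ptr, bef_ptr)."""
    for i, ch in enumerate(host):
        if ch in table:
            return host[i:], host[:i]     # match starts string_ptr
    return None


def allchr_aft_only(host, table):
    """bef-ptr = 0: returns (string_ptr, aft_ptr)."""
    for i, ch in enumerate(host):
        if ch in table:
            return host[:i + 1], host[i + 1:]   # match ends string_ptr
    return None


with_before = allchr_bef_only("ab(cd", "(")
with_after = allchr_aft_only("ab(cd", "(")
```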

fndchr

the data-directed character searching routine is called
"fndchr".  its possible modes are similar to "fndstr"s but
there are differences which will be outlined below.  also
"fndchr" returns a different piece of subsidiary information
in r1.  if the character search is successful, "fndchr" will
return in r1 the numeric code of the character which matched
a character in the host, otherwise it will return zero in
r1.

for mode = not idxend.

     usage:  fndchr(host, not idxend, block1, mask)
     the output behavior of this routine is completely
     equivalent to the output behavior of "fndstr(host, not
     idxend + multiple, n, c1, c2, ..., cn)" except for the
     subsidiary information in r1.

for mode = idxend.

     usage:  fndchr(host, idxend, block1, mask)
                                                               Page 37


     the output behavior of this routine is completely
     equivalent to the output behavior of "fndstr(host,
     idxend + multiple, n, c1, c2, ..., cn)" except for the
     subsidiary information in r1.

for mode = partial.

     usage:  fndchr(host, partial, block1, mask, pos1, pos2)
     the output behavior of this routine is completely
     equivalent to the output behavior of "fndstr(host,
     partial + multiple, pos1, pos2, n, c1, c2, ..., cn)"
     except for the subsidiary information in r1.
     (note:  as before, pos2 equaling zero means assume the
      ending-position of the host).

for mode = anchor + not partia.

     usage:  fndchr(host,anchor + not partia, block1, mask)
     this usage exists solely as a convenience.  it is
     equivalent to specifying "anchor" and "partia" together
     and setting pos1 to (1) and pos2 to (2).

for mode = backwards.

     usage:  fndchr(host, backwards + others, block1, mask)
     this switch setting will cause "fndchr" to find the
     last occurrence within "host" of a character in
     bct(block1, unencoded-mask), rather than the first.
     and as the switch name suggests, "fndchr" does this by
     searching the host string from its last character to
     its first character rather than vice versa.
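
the modes of "fndchr" can be modeled compactly in python.  the
sketch assumes 1-based positions, a searched region that runs from
pos1 up to but not including pos2 (zero meaning the end of the
host), and an "idxend" result that is the position just past the
matched character.  a python string stands in for the boolean
character table, and the keyword arguments are a hypothetical
replacement for the mode switches.

```python
def fndchr(host, table, pos1=1, pos2=0, idxend=False, backwards=False):
    """Return (r0, r1): the position and the code of the matched character."""
    n = len(host)
    low = max(pos1, 1)
    high = n + 1 if pos2 == 0 else min(pos2, n + 1)
    positions = range(low, high)
    if backwards:
        positions = reversed(positions)   # last character to first
    for position in positions:
        ch = host[position - 1]
        if ch in table:
            return (position + 1 if idxend else position), ord(ch)
    return 0, 0


first_sign = fndchr("  +73   24", "+-")
last_digit = fndchr("  +73   24", "0123456789", backwards=True)
```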

examples:
        integer fndchr,innrp,innlp,mask
        complex sp1,sp2,sp3
        complex bndstr,aftchr,befchr
        dimension table(128)
        real*8 numlst, expres
        data numlst /'  +73   24'/
        data expres /'(i+(j+k)-)'/
        mask="1
        call tabstr(table,mask,'aeiou')
        mask=.not. 2
        call tabstr(table,mask,' ')
        call tazstr(table,1)
        call tonstr(table,1,'1234567890')
        call tonstr(table,1,'.-+')
        call tofstr(table,2,'   ')      !a tab
        call taostr(table,4)
        call tofstr(table,4,'+-1234567890')

        sp1=aftchr(numlst,table,2)
        sp2=befchr(sp1,table,4)

        call    tabstr(table,8,np('('))
                                                               Page 38


        call tabstr(table,16,np( ')' ))
        innlp=fndchr(expres,backwa, table,8)
        innrp=fndchr(expres,partia+idxend, table, 16, innlp, 0)
        sp3 = bndstr(expres, innlp,innrp)

        sp1 = allchr(expres,table,8,sp2,0)
        innlp = fndchr(expres,anchor,table,8)

the first "tabstr" will set up a table which will match any
of "a", "e", "i", "o", "u", and the second "tabstr" will set
up a (bct) which will match the first non-blank character
encountered.  note that both (bct)'s are in the same
128-word block.

the "tazstr" and "tonstr" respectively zero the table
containing the vowels and replace it with a table containing
the digits.  the second "tonstr" illustrates that a (bct)
can be added to by including the signs within the digit
table.  the first "tofstr" shows this same principle in
converse by adding non-tab into the concept of non-blank.
lastly the "taostr" and "tofstr" create a new table which
contains everything but the digits.

the calls to "aftchr" and "befchr" are used to set up sp2 to
point at '+73'.  they use tables 34 (mask=2) and 33 (mask=4)
to respectively strip off leading blanks and then find the
first non-numeric character in the left-truncated string
pointed at by sp1.

the calls dealing with "expres" show how to find the
innermost parenthesized expression in a string.  in
particular sp3 will be caused to point at '(j+k)'.  the
technique used to setup sp3 is to find the rightmost left
parenthesis by searching backwards thru "expres".  and then
using that as a context, search forward until the first
right parenthesis is found.  note the use of "idxend" to
ensure that the positions actually returned are before the
left parenthesis and after the right parenthesis.
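
the same backward-then-forward technique can be shown with python's
built-in string searches.  this is an analogy only;  "rfind" and
"find" are python methods, not "strlib" routines, and the function
name "innermost" is hypothetical.

```python
def innermost(expression):
    """Return the innermost parenthesized part of expression, or ''."""
    left = expression.rfind("(")          # rightmost left parenthesis
    if left < 0:
        return ""
    right = expression.find(")", left)    # first right parenthesis after it
    if right < 0:
        return ""
    return expression[left:right + 1]     # include both parentheses


inner = innermost("(i+(j+k)-)")
```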

the last two calls cause sp1 to be set to 'i+(j+k)-)', sp2
to '(', and innlp to 0.  note that the second of these calls
is intuitively equivalent to saying, "is the character i am
looking at any of those i am interested in".

5.5 conversions and mappings

the routines, "cnvstr" and "mapstr", respectively implement
string <---> numeric transformations and string <---> string
transformations.  "cnvstr" will be described first.

cnvstr

for mode = not toascii.

     usage:  i = cnvstr(integer1, string1, base, not
                                                               Page 39


                                   toascii)
     this routine usage will convert string1 into a
     fixed-point number and copy it into integer1, where
     string1 is the string representation of a number in
     base, "base".  as regards string1 format, it may
     contain leading blanks and may optionally have a minus
     sign immediately preceding the high order digit of the
     number.  if string1 is not the representation of a
     legal number in base "base", (i) will be set to zero
     and integer1 will be left unchanged.
     (note:  usually one would expect "base" to be 10 or 8).

for mode = always + not toascii.

     usage:  i = cnvstr(integer1, string1, base, always +
     not toascii)
     same as before except that even if string1 is not the
     representation of a legal number, integer1 is set to
     the "converted" string.  as before, (i) is set to -1
     for a "good" number and 0 for a "bad" number.
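
the string-to-number rules can be sketched in python.  the function
name "cnvstr_to_int" and the "old_value" parameter are hypothetical;
the sketch covers only the usage without "always", supports bases up
to 10, and follows the stated format:  leading blanks, an optional
minus sign immediately before the high order digit, and nothing
after the digits.

```python
def cnvstr_to_int(string1, base, old_value=0):
    """Return (i, integer1): i is -1 on success and 0 on failure."""
    body = string1.lstrip(" ")            # leading blanks are allowed
    negative = body.startswith("-")
    digits = body[1:] if negative else body
    if not digits:
        return 0, old_value               # no digits at all
    value = 0
    for ch in digits:
        digit = "0123456789".find(ch)
        if digit < 0 or digit >= base:
            return 0, old_value           # integer1 is left unchanged
        value = value * base + digit
    return -1, (-value if negative else value)


decimal = cnvstr_to_int("  -12", 10)
octal = cnvstr_to_int("  -12", 8)
spurious = cnvstr_to_int(" -12 ", 10, old_value=7)
```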

for mode = toascii + not zeropad.

     usage:  i = cnvstr(string1, integer1, base, toascii +
                                   not zeropad)
     this routine will convert the fixed-point number,
     integer1, into the string which represents integer1 in
     base, "base".  if the number of characters needed to
     represent integer1 in base "base" is greater than the
     length of string1 at the time of the call, "cnvstr"
     will return 0 -- signalling the failure of the
     conversion.  otherwise it will return -1 and right
     justify the string representation of integer1 within
     string1 with respect to the length of string1 at the
     time of the call (i.e.  the low-order digit of integer1
     will be located at:
     "vecstr(string1,1,lenstr(string1))").
     if there is room, string1 will be padded with leading
     blanks.  if integer1 is negative, the minus sign will
     be to the right of the leading blanks, if any.

for mode = nofill + toascii.

     usage:  i = cnvstr(string1, integer1, base, nofill +
     toascii)
     the conversion is as with the previous usage.  what
     "nofill" causes is the left justification of the
     converted integer within string1.  additionally the
     length of string1 will be adjusted so that there are no
     trailing characters after the low order digit of the
     converted number.  failure will be signalled only if
     the converted number requires more characters than the
     maximum of string1.

for mode = zeropad + toascii.
                                                               Page 40


     usage:  i = cnvstr(string1,integer1, base, zeropad +
                                   toascii)
     this usage is as with "zeropad" turned off except that
     "cnvstr" will generate leading zeroes rather than
     leading blanks when there is room.  note also that if a
     minus sign is needed it will be to the left of the
     leading zeroes rather than to their right.
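
the three number-to-string layouts differ only in justification and
fill.  the python sketch below is a model under assumptions:  the
single parameter "room" stands for the current length of string1
(or, with "nofill", its maximum), bases above 10 are not handled,
and the function names are hypothetical.

```python
def digits_of(value, base):
    """Digits of abs(value) in the given base (base 10 or less)."""
    number = abs(value)
    text = ""
    while True:
        text = "0123456789"[number % base] + text
        number //= base
        if number == 0:
            return text


def cnvstr_to_ascii(value, base, room, zeropad=False, nofill=False):
    """Return (i, string1): i is -1 on success and 0 on failure."""
    sign = "-" if value < 0 else ""
    digits = digits_of(value, base)
    if len(sign) + len(digits) > room:
        return 0, None                    # the conversion does not fit
    if nofill:
        return -1, sign + digits          # left justified, length adjusted
    if zeropad:                           # minus sign left of the zeroes
        return -1, sign + digits.rjust(room - len(sign), "0")
    return -1, (sign + digits).rjust(room)    # minus right of the blanks


blank_filled = cnvstr_to_ascii(-42, 10, 6)
zero_filled = cnvstr_to_ascii(-42, 10, 6, zeropad=True)
unfilled = cnvstr_to_ascii(-42, 10, 6, nofill=True)
```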

mapstr

the usages of "mapstr" will be described now.  note that
"mapstr" conforms to the rules described in section 3.2 for
bounds checking and setting up a completion value as the
return variable.

for mode = translate.

     usage:  i = mapstr(string1, string2, translation,
                                   translate)
     with this switch setting, "mapstr" will, while copying
     string2 to string1, translate each of the characters in
     string2 by the fixed-point number, "translation".  for
     instance one could use this routine to convert a string
     of lower-case letters to a string of upper-case letters
     or vice versa.  in particular, one would respectively
     set "translation" to -32 and 32.

for mode = translate + bounds + not yesbound.

     usage: i = mapstr(string1, string2, translation,
               translate + bounds + not yesbound, bounding)
     this setting is a generalization of the previous switch
     setting.  a character in string2 will be translated
     only if it is not between the bounds specified by
     "bounding".  "bounding"s left half contains the lower
     bound and its right half contains the upper bound.  a
     character is considered outside of the bounds only if
     it is less than the lower bound or greater than the
     upper bound.  in other words the bounds are inclusive.
     this sort of call can be used to convert a mixed group
     of upper and lower case characters to either all upper
     or all lower case.

for mode = translate + bounds + yesbound.

     usage: i = mapstr(string1, string2, translation,
               translate + bounds + yesbound, bounding)
     this switch setting is identical to the previous
     setting except that translation only occurs if the
     character is in the range specified by "bounding"
     rather than outside of it.
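
the bounded translation rule can be modeled in python.  the sketch
assumes ascii character codes and a 36-bit "bounding" word with the
lower bound in its left 18 bits and the upper bound in its right 18
bits, as described above.  the function name and keyword arguments
are hypothetical, and no bounds checking of the result is modeled.

```python
def mapstr_translate(string2, translation, bounding=None, yesbound=False):
    """Return string2 with the selected characters shifted by translation."""
    result = []
    for ch in string2:
        code = ord(ch)
        if bounding is None:
            shift = True                          # plain "translate"
        else:
            lower = bounding >> 18                # left half of the word
            upper = bounding & 0o777777           # right half of the word
            inside = lower <= code <= upper       # bounds are inclusive
            shift = inside if yesbound else not inside
        result.append(chr(code + translation) if shift else ch)
    return "".join(result)


lower_a_to_z = (ord("a") << 18) | ord("z")
upper_case = mapstr_translate("MiXed 42", -32, lower_a_to_z, yesbound=True)
```

only the lower-case letters fall inside the bounds, so the blank and
the digits pass through unchanged.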

for mode = toascii.

     usage:  i = mapstr(string1, string2, 0, toascii)
                                                               Page 41


     this switch setting will cause the character (byte)
     size of string2 to be forced to 6 and the byte size of
     string1 to be forced to 7.  and then sixbit to ascii
     conversion will be done.
      (note:  the result of attempting to convert ascii
      characters in the two ranges below octal 40 and above
      octal 140 is undefined).

for mode = not toascii.

     usage:  i = mapstr(string1, string2, 0, not toascii)
     this switch setting will force byte sizes to 6 and 7
     respectively and then do ascii to sixbit conversion of
     string2 to string1.
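
sixbit codes are the ascii codes octal 40 through 137 with octal 40
subtracted.  the python sketch below illustrates that mapping on
lists of codes;  it does not model byte pointers or byte sizes, the
function names are hypothetical, and raising an error for
out-of-range characters is a choice made for the sketch, since the
manual leaves that case undefined.

```python
def ascii_to_sixbit(text):
    """Return the list of sixbit codes for an ascii string."""
    codes = []
    for ch in text:
        code = ord(ch)
        if not 0o40 <= code <= 0o137:
            raise ValueError("no sixbit code for character %r" % ch)
        codes.append(code - 0o40)
    return codes


def sixbit_to_ascii(codes):
    """Return the ascii string for a list of sixbit codes."""
    return "".join(chr(code + 0o40) for code in codes)


packed = ascii_to_sixbit("123456")
round_trip = sixbit_to_ascii(ascii_to_sixbit("FILE.EXT"))
```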

     examples:
        dimension sparea(2)
        complex sp1
        integer cnvstr,mapstr
        integer istr,inum
        i=cnvstr(inum,'  -12',10,0)
        i=cnvstr(inum,'  -12',8,0)
        i=cnvstr(inum,' -12 ',10,0)
        i=cnvstr(istr,-20,10, toasci)
        i=cnvstr(istr,-20,8,toasci)
        i=cnvstr(istr,-20,10,zeropa+toasci)
        i=cnvstr(istr,-20,10,nofill+toasci)

        i=mapstr(sp1,'12325',"40,transl)
        mode = transl+bounds+yesbou
        i=mapstr(istr,'abc45',"20,mode,"000061 000071)
        i=mapstr(istr,'abc45',-"20,mode,"000101 000111)
        call setstr(sp1,0,12,6)
        i=mapstr(istr,'123456',0,0)

 the first "cnvstr" sets "inum" to -12 while the second
 "cnvstr" sets it to -10 since "cnvstr" was told to treat
 the '-12' as an octal number.  the third "cnvstr" will fail
 (ie.  set (i) to 0) since the trailing blank is spurious.
 also "inum" would be left with its old value.

 the first time "istr" is set, it will be set to ' -20'.
 the second time it will be represented octally and set to '
 -24'.  the third string setting "cnvstr" will set "istr" to
 '-0024', and the last one will set it to '-24' followed by
 two unknown characters since an integer string variable
 cannot have its length set to other than five.  if sp1 had
 been the destination string, it would have pointed at '-24'
 and had a length of 3.

 the first "mapstr" will set "istr" to 'abcde', as will the
 second.  note that the octal code of "1" is 61 and "9", 71.
 the third "mapstr" will translate in the other direction
 and set "istr" to '12345'.  finally the last "mapstr" will
 create a sixbit string corresponding to the ascii '123456'.
                                                               Page 42


 note that the byte size of sp1 was previously set to 6 by
 the "setstr".
                                                               Page 43


6.0 error conditions

it is possible to control the error detection and error
message facilities of "strlib" in two ways.

1) assembly parameters -- if the symbol, "check" is given a
   non-zero value, no error checking will be done by any of
   the entry points of "strlib".
   if the symbol, "messag", is given a non-zero value,
   checks will be made and overflows corrected -- but no
   error messages will be generated.

2) load-time parameter -- if the global symbol, "str.nw" is
   appropriately used in a link-10 "define" switch, message
   generation can be turned off:

         .r link
         *exampl,string/search/define:str.nw:-1

   note that setting str.nw to -1 does not affect the
   process of overflow detection and correction.

6.1 the defined conditions

all messages generated by "strlib" are warnings in the sense
that they are "%" messages.  each message is associated with
a standard six character mnemonic of the form, "str<code>".
the three-letter codes are organized such that they can be
quite useful in debugging.  corresponding to each code is a
global symbol of the form <code>$ such that this is the
location branched to when the condition is encountered.

the messages:

1) %strllz.  length less than zero
   an entry point was passed a string length which was less
   than zero;  the string length is set to zero.

2) %strlem.  length exceeds maximum
   an entry point was passed a string length which exceeded
   the maximum for that string.  the string length is set
   equal to the maximum.

3) %strnss.  no source strings (count under 1)
   the count passed to one of the concatenative or
   string-search routines was less than one.  in this event
   the failure path is taken by the called routine.

4) %strciv.  code invalid value (not 0-5)
   "cmpstr" was passed an illegal code.  this causes the
   failure path to be taken.

5) %strspe.  2nd position past end of string
   in either "fndchr" or "fndstr", pos2 was greater than is
   sensible.  it is reduced to the largest meaningful value.
                                                               Page 44


6) %strsli.  1st position such that string length increased
   one of the string relative functions was used to generate
   a superstring rather than a substring.  the change is
   simply allowed to occur.

7) %strfes.  1st position exceeds second
   in effect a zero length string was being used as a host
   string;  the failure path is taken.

8) %struof.  under or overflow of string pointer length or
   maximum
   one of the string relative functions was caused to create
   an illegal value for maximum or length.  less than zero
   values are set to zero, and greater than 2**18-1 values
   are set to 2**18-1.

9) %strmli.  maximum and length inconsistent
   "bldstr" was passed inconsistent values;  it will
   increase the maximum to match the length passed it.

10) %strrpu.  replacement unsuccessful:  
   a second message is printed after this message (e.g.
   %strlem).  in any event, the failure path is taken.

11) %streps.  end of substring past end of string.
   one of the string relative functions was used to create a
   string pointer which was partially out-of-bounds of the
   original string.  the user is simply allowed to do this.

12) %stridt.  string argument has illegal data type - null
   string assumed
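
the length corrections described for %strllz, %strlem and %struof
can be summarized in a python sketch.  this is a simplified model,
not the library's checking code:  the function name is hypothetical,
a maximum of zero is taken to mean that no maximum was specified,
and the three-letter code is returned rather than printed.

```python
HALF_WORD_MAX = 2**18 - 1       # largest value an 18-bit field can hold


def correct_length(length, maximum=0):
    """Return (corrected length, message code or None)."""
    if length < 0:
        return 0, "llz"                   # length less than zero
    if maximum and length > maximum:
        return maximum, "lem"             # length exceeds maximum
    if length > HALF_WORD_MAX:
        return HALF_WORD_MAX, "uof"       # overflow of the length field
    return length, None


negative = correct_length(-3, 10)
too_long = correct_length(12, 10)
```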
                                                               Page 45


7.0 implementation characteristics

each routine in "strlib" can be invoked by any program which
makes use of the standard calling sequence (e.g.  fortran-10
and cobol).  additionally, note that any routine can be
called (if one does not need the return value) or invoked as
a function.  in any event though, all routines always
preserve registers two and up.

the internal format of a string pointer is as follows:

word 1: byte pointer to first character in the string
        such that an "ildb" would load it.
word 2: left side is 0 or the maximum allowed length
        right is the current length
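
the two-word format can be decoded with a few shifts and masks.
the python sketch below assumes the usual pdp-10 byte pointer layout
for word 1 (position in bits 0-5, byte size in bits 6-11, address in
the right half) and ignores the index and indirect fields.  the
function name is hypothetical, and the sample words are the two
octal values typed by the example program in section 8.0.

```python
def unpack_string_pointer(word1, word2):
    """Split the two 36-bit words of a string pointer into fields."""
    return {
        "position": (word1 >> 30) & 0o77,     # bits to the right of the byte
        "byte_size": (word1 >> 24) & 0o77,
        "address": word1 & 0o777777,
        "maximum": word2 >> 18,               # left half; 0 means none
        "length": word2 & 0o777777,           # right half
    }


fields = unpack_string_pointer(0o440700000143, 0o000000000031)
```

a position of 36 with a byte size of 7 is the form from which an
"ildb" loads the first 7-bit character of the word.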

fortran literals are actually a special case of asciz
strings -- strings that are terminated by a nul character.
if one is programming in macro, for instance, "lenstr" and
"setstr" can be used to access and set the length of an
asciz string.  data-type information, including type asciz,
is communicated in an argument list.  for a description of
the mechanism, see an appendix of the fortran language
manual.  note also that if no type code is specified for a
string argument, it is assumed to be a string pointer.

the full argument type table is:  (bits 9-12)

        0/      string pointer
        1/      data-varying string
        2/      integer string (ie. length always 5)
        3/      illegal
        4/      real string (ie. length always 5)
        5/      illegal
        6/      illegal
        7/      illegal
        10/     double string (ie. length always 10)
        11/     same as 10.
        12/     illegal
        13/     illegal
        14/     string ptr
        15/     string ptr
        16/     illegal
        17/     asciz string (ie. a literal)

7.1 "strlib" configurations

normally the routines of "strlib" will load into the low
segment, but by setting the assembly parameter "high" to
zero, the routines of "strlib" will reside exclusively in
the high segment.  in point of fact, one could build a .shr
file containing all of string if one wished.

with field image "strlib", one has complete byte size
generality in the sense that all routines will work
                                                               Page 46


correctly with all valid byte sizes (1-36).  if one wishes
to deal solely with ascii strings, one can set the assembly
switch "anysiz" to 1.

although bounds checking will have no effect if one never
passes a maximum length to either "setstr" or "bldstr",
there must still be checks to see if a maximum has been
specified.  if one wishes to eliminate the concept of bounds
checking, so to speak, from "strlib", one sets the assembly
switch "bnd.ch" to 1.
                                                               Page 47


8.0 a programming example

the following has been abstracted from the running of an
actual control file.

.r fortra

**exastr,tty:=exastr

00001           complex sp1,trcstr,bldstr
00002           integer allrep
00003           dimension l1(5)
00004           data l1/'aaaaaaaaaa'/
00005   
00006           sp1=bldstr(l1,10,0)
00007           i=allrep(sp1, trcstr('aa'),bldstr(
'1111',5,0))
00008           if (.not. i) type 88
00009           type 101,l1,sp1
00010   88      format(' bombed')
00011   101     format(1h ,5a5,2o12)
00012           end

subprograms called

allrep
bldstr  trcstr  


%ftnwrn   main.         no fatal errors and 1 warnings

00001           integer function allrep(sp1,sp2,sp3)
00002           complex sp1,sp2,sp3,tp
00003           integer pos1,len2,len3
00004           integer fndstr, repstr,lenstr,newpos
00005           complex vecstr, bndstr
00006   
00007           len2 = lenstr(sp2)
00008           len3 = lenstr(sp3)
00009           pos1 = 1
00010   
00011           allrep=-1
00012   10      tp=bndstr(sp1, pos1, 0)
00013           newpos=fndstr(tp,0,sp2)
00014           if (newpos.eq.0) return
00015           if (.not. repstr(sp1,
vecstr(tp,newpos,len2), sp3)) go to 88
00016           pos1 = pos1 + newpos - 1 + len3
00017           go to 10
00018   
00019   88      allrep=0
00020   89      format('?arpfai. aborted')
00021           end

subprograms called
                                                               Page 48


lenstr
bndstr  repstr  vecstr  fndstr  

allrep  no errors detected

.ex exastr,string/lib
link:   loading
[lnkxct exastr execution]
1111 1111 1111 1111 1111 440700000143000000000031

end of execution
cpu time: 0.05  elapsed time: 0.08
exit

this example consists of a "testing" main program and a
"user-written" string manipulation program which will
replace all occurrences of a given string with a second
string.  it is important to note that "allrep" expects its
three arguments to be string pointers.  this is the case
because a fortran subprogram cannot do data-type checking as
"string" can.  similarly, generic library routines like "sin"
can do numeric data-type checking, but fortran subprograms
had better receive exclusively real values or exclusively
double precision values.

the call to "allrep" in the main program asks "allrep" to
set every occurrence of 'aa' in sp1 to '1111 '.  in overview,
the technique "allrep" uses to accomplish this is to search
a shorter and shorter substring of sp1 until all occurrences
of the second argument have been found.  note also that sp1,
rather than a temporary, must appear in the call to "repstr"
so that the length of sp1 will be appropriately adjusted
(ie.  increased by 3, the difference between the lengths of
'1111 ' and 'aa') each time 'aa' is found.

the statements of the loop:

00012) sets up the host string (for the "fndstr" which
       follows) so that its first character is immediately
       after the last character of the previously found
       substring and its last character is the last
       character of sp1 (ie.  the string pointed at by sp1).

00013) searches the constructed substring of sp1 for an
       occurrence of sp2 (ie.  'aa').

00014) if "newpos" is set to zero, no occurrence of 'aa' was
       found this time thru the loop, and we are finished.

00015) the "repstr" should not fail, but if it does the
       branch to "88" will be taken.  the "vecstr" will
       return the substring of sp1 which is the current
       occurrence of 'aa', and sp3 points at '1111 '.

00016) adjusts pos1 past the inserted '1111 ' by adding its
       length (ie.  len3) and its offset within "tp" (ie.
       newpos - 1).
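
the loop described above can be sketched in a modern language
for readers who want to trace it by hand.  the following python
is only an illustration, not part of "strlib": the name
"all_replace" is hypothetical, positions are 0-based rather
than 1-based, and ordinary python slicing and "str.find" stand
in for "bndstr", "fndstr", "vecstr" and "repstr".

```python
def all_replace(host: str, old: str, new: str) -> str:
    """Replace every occurrence of `old` in `host` with `new`, left to right.

    Hypothetical stand-in for the "allrep" listing; it returns the new
    string instead of updating a string pointer in place.
    """
    if not old:
        raise ValueError("search string must not be empty")

    pos = 0  # 0-based start of the unsearched tail (pos1 - 1 in the listing)
    while True:
        tail = host[pos:]          # role of statement 00012: tp = bndstr(sp1, pos1, 0)
        found = tail.find(old)     # role of statement 00013; -1 here where the listing tests for 0
        if found < 0:              # role of statement 00014: nothing left to replace
            return host
        start = pos + found        # offset of the match within the whole host
        # role of statement 00015: replace the matched substring with `new`
        host = host[:start] + new + host[start + len(old):]
        # role of statement 00016: resume just past the inserted text
        pos = start + len(new)


result = all_replace("aaaaaaaaaa", "aa", "1111 ")
print(repr(result), len(result))
```

because the search resumes after the inserted text, a
replacement that itself contains the search string is not
scanned again, so the loop ends once the original text has
been covered.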

the output shown from executing "exastr" is the new value of
sp1 followed by the octal representation of the string
pointer, sp1, printed as two 12-digit octal words.  the
trailing "31" is octal for decimal 25, the length of sp1.
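
the octal arithmetic can be checked with a short python sketch.
the function name "decode_string_pointer" is hypothetical.  the
split into two 12-digit words follows the "2o12" format in the
main program; the field layout applied to the first word is the
standard pdp-10 byte pointer layout (6-bit position, 6-bit byte
size, 18-bit address), and treating the first word as such a
byte pointer is an assumption made here for illustration.

```python
def decode_string_pointer(octal_digits: str) -> dict:
    """Split 24 octal digits into two 36-bit words and decode them.

    Assumption: word 1 is a PDP-10 byte pointer and word 2 is the
    string length, as suggested by the sample output.
    """
    if len(octal_digits) != 24:
        raise ValueError("expected two 12-digit octal words")

    first = int(octal_digits[:12], 8)    # first 36-bit word
    second = int(octal_digits[12:], 8)   # second 36-bit word

    return {
        "position": (first >> 30) & 0o77,   # bits 0-5: bits remaining to the right of the byte
        "byte_size": (first >> 24) & 0o77,  # bits 6-11: byte size in bits
        "address": first & 0o777777,        # bits 18-35: word address
        "length": second,                   # string length in characters
    }


fields = decode_string_pointer("440700000143000000000031")
print(fields["length"], fields["byte_size"], oct(fields["address"]))
```

under that assumption the first word reads as a pointer to
7-bit bytes starting at the word at octal address 143, which is
consistent with ascii text packed five characters to a word.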