PDP-10 Archive: decus/20-0004/shallo.txt from decus_20tap1

Trailing-Edge - PDP-10 Archives - decus_20tap1_198111 - decus/20-0004/shallo.txt

There are no other files named shallo.txt in the archive.





                                        Date:  22 April 1976

to:       INTERLISP Users

From:     W. Teitelman

Subject:  Shallow Binding in INTERLISP-10

XEROX

Overview


     Beginning with the 12-APR-76 release, INTERLISP-10 will

use  the  "shallow binding" scheme for binding variables, as

opposed to the "deep binding" scheme previously used.  Under

the deep binding scheme, associated with each variable was a

special cell called its value cell which contained its  "top

level"  value.   When  a variable was rebound, its new value

was stored on the stack.   Thus,  the  current  value  of  a

variable  was  that  found  in  its  value  cell only if the

variable had not been rebound.  To obtain  the  value  of  a

variable,   the  stack  would  be  searched  to  find  where

(whether) the variable was bound, a process which  could  be

time  consuming, especially if the depth of the computation,

i.e.  stack, was great.


     In order to  minimize  this  lookup  time,  whenever  a

compiled  function  was entered, the stack would be searched

(once only) for the bindings of all of  the  variables  used

freely   by   that  function.   This  in  turn  led  to  the

introduction of the  LOCALFREEVAR  kluge  to  override  this

lookup  for variables that were used freely somewhere within

                                                    Page   2



a block but nevertheless  were  local  to  the  block,  i.e.

bound elsewhere within the block.  Similarly, the concept of

a global variable  was  introduced  into  the  deep  binding

system  in  an  attempt  to  circumvent  the  lookup time by

informing the compiler not to bother  to  search  the  stack

when  the  indicated  variable was referenced, but simply to

always reference its value cell (top level value).  This  in

turn  led  to  the  introduction of RESETVAR, so that global

variables could be rebound, albeit expensively.


     All of this goes away  under  shallow  binding.   Under

shallow  binding,  the current value of a variable is always

stored in its value cell.  When a variable is  rebound,  its

old  value  is stored on the stack, and the new value put in

its value cell.  Thus, obtaining the value of or  setting  a

variable  is  fast  and  independent  of  the  depth  of the

computation.   Essentially,  under  shallow  binding   every

variable acts like a global variable does with deep binding.

In addition, whereas the benefit of declaring a variable  to

be  global  under  deep binding is only realized in compiled

code, in the shallow binding system, variable lookup time is

eliminated   for  interpreted  as  well  as  compiled  code.

Shallow binding  also  makes  it  less  important  to  block

compile,   since   one  of  the  main  advantages  of  block

compilation in the deep binding system is the  reduction  of

time  spend  in free variable lookup, because free variables

are looked up only once for the entire block.  (However,  it

                                                    Page   3



is  still  the  case  that function calls within a block are

much faster than calls to a function outside of a block.)The

cost  of shallow binding is that bind and unbinding are more

complicated  operations  than  under  deep  binding.   As  a

result,  stack  operations  involving context switches, e.g.

RETFROM,  RETEVAL,  STKEVAL,  etc.,  are   less   efficient.

However,  preliminary  studies  indicated that far more time

was spent in variable lookup then in variable binding.   The

timing comparisons between the two systems bear this out.



Timing Comparisons


     Here are some timing results of several benchmark  test

cases (time in seconds):

(1) DWIMIFYing a fairly large function with lots  of  record
expressions and iterative statements
                    deep                shallow
block compiled      5.3                 5.6
compiled            14.8                7.6

(2) CLISPIFYing a large function
                    deep                shallow
block compiled      5.8                 5.2

(3) doing a MAKEFILE
                    deep                shallow
block compiled      13.6                13.7

(4)  J. Moore's theorem prover was converted to the  shallow
binding system and run on a typical test case.  This program
does not perform any context switching.
                    deep                shallow
compiled            20.9                15.5

(5) Various examples using  FLIP,  a  format  directed  list
processor.   This  program  has  not  been  tuned for either
system (and is in fact ten years old).
                    deep                shallow
interpreted         7.03                4.54
compiled            .83                 .42

                                                    Page   4



block compiled      .49                 .40

(6) Finally, since switching contexts is more  expensive  in
shallow  binding than deep binding, a test case was selected
which would be a "worst case" for shallow binding,  in  that
it would not use any free variables, and do a lot of context
switching.  The example is to  compare  the  leaves  of  two
trees, using coroutine and resume (the example found on page
12.12 of the INTERLISP manual).  The two trees were lists of
length  30,  whose  first  element  was  30 cars deep, whose
second element was 29 cars deep, etc.  The two trees were in
fact  equal,  so  that the program had to run through all of
their leaves.
                    deep                shallow
interpreted         5.36                5.60
compiled            .54                 .92
block compiled      .43                 .75


     In  addition,  two  large  experimental  systems  under

development  at  PARC  that  make heavy use of the spaghetti

stack capability  were  converted  to  the  shallow  binding

system.   Both  ran  slightly  faster  block  compiled under

shallow binding than deep binding, and  considerably  faster

interpreted and compiled normally.



Converting to the Shallow Binding System


     Conversion to  the  shallow  binding  system  is  quite

straightforward.   For  most  programs,  it  simply requires

recompilation.    For   the   rest,   conversion   can    be

semi-automated  using  EDITFNS,  as  described below.  In no

case has conversion taking more than a few hours.  Here  are

the things to look for:

                                                    Page   5



GETTOPVAL and SETTOPVAL


     GETTOPVAL and SETTOPVAL work  in  the  shallow  binding

system.   However, finding the top level value of a variable

involves searching the entire stack looking for bindings  of

this  variable.   If  the variable has been rebound, its top

level value will be stored on the stack at the  place  where

it  was first rebound.  Note that the entire stack will have

to be searched even in the case where the variable  was  not

rebound.   In other words, asking for the top level value of

a variable that you are not rebinding is  very  costly,  and

the  exact  opposite of what you probably intend.  Thus, you

probably will want to change all references to the top level

value  of  a  variable  to simply reference the value of the

variable.  In other words, instead of writing (GETTOPVAL X),

write  (EVAL  X)  or  (EVALV X) (the latter is much faster).

(If the top level value is being referenced as a  result  of

the  variable  being declared a GLOBALVAR, you don't have to

make any changes; GLOBALVAR declarations are simply  ignored

by  the  compiler  under  shallow  binding.)  For  backwards

compatibility with the deep binding  system,  the  functions

GETATOMVAL,  SETATOMVAL, and /SETATOMVAL are defined in both

the new shallow binding system and the current deep  binding

system.   In  deep  binding,  they  reference  the top level

value, i.e.  are  the  same  as  GETTOPVAL,  SETTOPVAL,  and

/SETTOPVAL.   In  the shallow binding system, they reference

the current value.  In both systems, they compile open where

                                                    Page   6



possible.   Thus,  to convert to the shallow binding system,

simply change each occurrence of  GETTOPVAL,  SETTOPVAL  and

/SETTOPVAL   to  GETATOMVAL,  SETATOMVAL,  and  /SETATOMVAL,

respectively.



RESETVAR and RESETSAVE


     RESETVAR works in shallow binding, but  is  inefficient

and  totally unnecessary: you can simply rebind the variable

in question.  Thus, in the shallow binding system,  RESETVAR

could  be  replaced  by  an  equivalent  PROG.  However, for

backwards compatibility, there is a new function, RESETVARS,

which  is defined in both the shallow binding system and the

current deep binding system..  The form of RESETVARS is  the

same  as that of PROG, i.e.  (RESETVARS variables expression

...  expression).  Labels are permitted,  and  a  RETURN  is

required to specify the value of the RESETVARS form.  In the

deep binding system, RESETVARS  compiles  to  an  equivalent

expression  using  RESETVAR.  In the shallow binding system,

RESETVARS is  in  fact  simply  PROG.   Since  RESETSAVE  in

conjunction with RESETLST is occasionally used to rebind and

reset global variables in the deep binding system, you  also

will  need  to  examine  all RESETSAVE expressions to see if

they are being used to reset  a  variable,  and  if  so,  to

rebind  the  variable  using  RESETVARS, and reset it with a

simple SETQ.

                                                    Page   7



RPAQ, RPAQQ, ADDTOVAR


     These functions work on the current value of a variable

in  the  shallow  binding  system.  If you don't use them in

your program, don't worry; PRETTYDEF and LOAD  are  properly

interfaced.   If  you  do use them, you will have to look at

each case and decide what is the proper thing to do.



CAR of an atom



is no longer its value.  CAR of an atom  is  now  undefined,

except  for CAR of NIL, which is guaranteed to be NIL.  (CDR

of an atom continues to be its property list, but you really

should  use  GETPROPLIST instead in case this changes in the

future.  However, you can be assured that CDR  of  NIL  will

always  be  NIL.  ) Thus, you should check your programs for

expressions of the form (CAR (QUOTE --)), (RPLACA (QUOTE --)

--),  and  (/RPLACA  (QUOTE  --)  --).   The first should be

changed to a simple variable  reference,  the  second  to  a

simple  SETQ,  and the third to (/SETATOMVAL (QUOTE --) --).

You undoubtedly will have places where you  "cleverly"  took

CAAR  of  a  list  knowing that CAR was an atom, and CAR the

value of that atom.  (I am still finding some  of  these  in

the INTERLISP code!) There isn't anyway to find these except

to run your programs.

                                                    Page   8



Automating the conversion using EDITFNS


     Thus, to convert to the shallow  binding  system,  load

your  file,  and  perform EDITFNS(filename (EXAM (CAR (QUOTE

--))  (RPLACA  (QUOTE  --)  --)  (/RPLACA  (QUOTE  --)   --)

GETTOPVAL SETTOPVAL /SETTOPVAL RESETVAR RESETSAVE RPAQQ RPAQ

ADDTOVAR)), and  make  the  indicated  changes.   Note  that

(RESETVAR  X  Y  form) needs to be changed to (RESETVARS ((X

Y))  (RETURN  form)).   However,  several  nested   RESETVAR

expressions   can   be  combined  into  a  single  RESETVARS

expression.



Block Compiling under Shallow Binding


     If  you  blockcompile,  you  may  see  some  new  error

messages  of  the  form  ****(X  SHOULD BE A SPECVAR).  This

message means exactly what it says.   The  reason  why  this

message  did  not  occur  previously  is the reason why some

spaghetti-stack programs did not work block compiled.  If  a

variable  is used freely in a block, any function within the

block that binds the variable must make  a  new  frame  that

contains  this  binding.  Otherwise, if you RESUME, RETFROM,

or otherwise change contexts  from  within  the  block,  and

later   come  back,  the  variable  will  not  be  correctly

restored.  If the variable  is  already  declared  to  be  a

LOCALFREEVAR,  you  don't  have  to do anything else (and in

fact  won't  even   see   this   error   message),   because

                                                    Page   9



LOCALFREEVARS  are  the  same  as  SPECVARS  in  the shallow

binding system.  If not, simply declare the variable  to  be

either  a  SPECVAR  or LOCALFREEVAR using the standard block

declaration format.


     GLOBALVAR declarations are treated the same as  SPECVAR

declarations  in  shallow  binding  system,  except that the

compiler will not include the names  of  GLOBALVARS  in  the

list  of  free  variables that it prints when compiling your

function.  



Tuning Block Compiled Programs for Shallow Binding


     Although  there  is  some  difference   in   efficiency

tradeoffs  between  shallow binding and deep binding, unless

you are a fanatic, it probably won't pay  to  bother  tuning

your  programs for shallow binding.  As an exercise, I spent

a lot of  time  tuning  up  things  like  DWIM  for  shallow

binding,  and  the best improvement I was able to obtain was

about 10%.  However, the  process  was  instructive,  and  I

include it here for general education.


     The main inefficiency introduced with  shallow  binding

comes  as  a  result  of  the  fact that binding is now more

expensive, especially if it results in  having  to  make  an

extra  frame  on  an  internal  function  call in the block.

(Strictly speaking, the extra frame is not  an  artifact  of

shallow  binding, but of the spaghetti stack.  If we had not

                                                    Page  10



converted to shallow binding, it would have  been  necessary

to  require  this  extra  frame  in the deep bound spaghetti

system in order to make block compiled code  work  properly,

as  described  earlier.)  For example, suppose FOO binds the

variable X, and calls FIE which uses X freely, and  FOO  and

FIE  are  both  in  the  same block.  Every time that FOO is

called within the block, a new frame will  be  (has  to  be)

made.   If  FIE  is called very often from FOO, i.e.  if FOO

does a lot of computation, then this cost  is  probably  not

significant.   However, if FOO is called often, and FOO only

calls FIE once or twice, it may be worthwhile to make  X  be

an  argument  of  FIE.  Note that this only wins if X is the

only SPECVAR that FOO binds.  If there are others, and a new

frame  is going to be made anyway, don't worry about it; the

time required to bind another variable is not significant.




     MASTERSCOPE is invaluable for finding  these  sorts  of

situations.   For  example,  you  can  say "WHO IN BLOCKS OF

function BINDS ANY IN SPECVARS OF function", and  then  look

at  those  functions which bind only one or two specvars and

see if you can get rid of these bindings.


     Another related inefficiency comes about when you  have

a  variable which really has to be a free variable, and then

also use it elsewhere within the block as a local  variable.

For  example,  suppose  FOO  calls FOO1, FOO2, and FOO3, and

                                                    Page  11



they all share the free variable TAIL,  communicating  among

one  another  by setting and resetting this variable so that

it is awkward to have to  pass  it  back  and  forth  as  an

argument.   Suppose  FOO,  FOO1,  FOO2,  and FOO3 are all in

single block along with the function FIE which also binds  a

variable  named  TAIL.   Each time that FIE is called, a new

frame will have to be made because TAIL, a SPECVAR, is being

rebound.   (As  far  as  the compiler is concerned, it's the

same TAIL.) If FIE doesn't call anybody that uses the  value

of  its  TAIL,  then  this  frame  is  unnecessary  (but the

compiler doesn't know this!).  The best thing to do in  this

case  is  to change the name of TAIL to something else, e.g.

$TAIL.


     For  this  situation,   again   I   found   MASTERSCOPE

invaluable.   Basically,  what  you  want  to  know is which

functions in a block bind a SPECVAR and don't  call  anybody

in  the  block  that  use the variable freely, without going

through a function which rebinds the variable.   I  found  a

lot  of  these in DWIM (but remember eliminating all of them

only  resulted  in  about   10%   improvement)   by   asking

MASTERSCOPE   for  each  SPECVAR  "WHO  IN  functions  BINDS

variable AND IS NOT ON PATH TO  ANY  USING  variable  FREELY

NOTRACE  ANY  BINDING variable".  Since then, Larry Masinter

has included in MASTERSCOPE a function called  MSCHECKBLOCKS

which  given  a  file  name  performs this check, as well as

several other useful checks, e.g.  functions which  are  not

                                                    Page  12



called  by  any  functions  in  the  block  and also are not

entries, functions which are called by functions outside  of

the block and are not entries, etc.  MSCHECKBLOCKS is in the

shallow binding system.  You don't even  have  to  load  the

file.



Other Miscellaneous Differences


     The stack will look a little different when  you  do  a

backtrace  in  the  shallow binding system because there are

many more frames than previously.  As described earlier,  in

the  shallow  binding system, whenever a function in a block

binds a SPECVAR or LOCALFREEVAR, it must make a  new  frame.

If  this  function  is a RETFN, the name of the frame is the

name of the function, i.e.  if you do a STKPOS searching for

that function from below that point on the stack, the search

will stop at this frame.  If the function is  not  a  RETFN,

then  its  name  will not appear on the stack.  Instead, the

function name associated with that  frame,  i.e.   the  name

printed  by  backtrace, will be the result of putting a * at

the beginning and end of the  function  name,  e.g.   *FOO*,

*CLISPATOM*.   Actually,  such "function names" can be given

as arguments  to  the  various  stack  functions,  but  this

procedure is frowned upon.  In particular, the corresponding

programs will not work interpreted.

                                                    Page  13



     When a new frame is created on a internal function call

within  a  block,  only those variables that are SPECVARS or

LOCALFREEVARS  will  have  their  names  stored   with   the

bindings.   For  example,  if  FOO  is  called from within a

block, and binds X and Y, and Y is a SPECVAR, if  you  do  a

STKSCAN for X from below FOO, the search will not see the X,

even though there is a  frame  named  *FOO*  on  the  stack,

containing  a  binding  for  Y.   The  binding for X will be

"unnamed." However, for aid in debugging, BREAK will  supply

the  names  of  such  variables  when it can, using the same

convention of putting a * in front and behind  the  variable

name.   For  example, if you go into a break below FOO (i.e.

below *FOO*), and use the  @  command  to  "point"  at  this

frame, and then type ?= you will see:

*X* = value of X

Y = value of Y

The same procedure is followed when  a  function  is  called

that  is  not inside of a block, but some of whose variables

are declared to be LOCALVARs (not LOCALFREEVARs).