Google
 

Trailing-Edge - PDP-10 Archives - decus_20tap1_198111 - decus/20-0004/shallo.rno
There are no other files named shallo.rno in the archive.
^^
.indent 40
Date:##22 April 1976
.break
.nofill
.tab stops 10
to:	INTERLISP Users
.break
From:	W. Teitelman
.break
Subject:	Shallow Binding in INTERLISP-10
.fill

XEROX
.break
^&Overview\&
.paragraph
Beginning with the 12-APR-76 release, INTERLISP-10 will use the
"shallow binding" scheme for binding variables, as opposed to
the "deep binding" scheme previously used. Under the deep binding
scheme, associated with each variable was a special cell called
its value cell which contained its "top level" value. When a
variable was rebound, its new value was stored on the stack.
Thus, the current value of a variable was that found in its value
cell only if the variable had not been rebound. To obtain the
value of a variable, the stack would be searched to find where
(whether) the variable was bound, a process which could be time
consuming, especially if the depth of the computation, i.e. stack,
was great.
.test page 8
.paragraph
In order to minimize this lookup time, whenever a compiled
function was entered, the stack would be searched (once only)
for the bindings of all of the variables used freely by that
function. This in turn led to the introduction of the LOCALFREEVAR
kluge to override this lookup for variables that were used freely
somewhere within a block but nevertheless were local to the block,
i.e. bound elsewhere within the block. Similarly, the concept
of a global variable was introduced into the deep binding system
in an attempt to circumvent the lookup time by informing the
compiler not to bother to search the stack when the indicated
variable was referenced, but simply to always reference its value
cell (top level value). This in turn led to the introduction
of RESETVAR, so that global variables could be rebound, albeit
expensively.
.test page 8
.paragraph
All of this goes away under shallow binding. Under
shallow binding, the current value of a variable is always stored
in its value cell. When a variable is rebound, its old value
is stored on the stack, and the new value put in its value cell.
Thus, obtaining the value of or setting a variable is fast and
independent of the depth of the computation. Essentially, under
shallow binding every variable acts like a global variable does
with deep binding. In addition, whereas the benefit of declaring
a variable to be global under deep binding is only realized in
compiled code, in the shallow binding system, variable lookup
time is eliminated for interpreted as well as compiled code.
Shallow binding also makes it less important to block compile,
since one of the main advantages of block compilation in the
deep binding system is the reduction of time spend in free variable
lookup, because free variables are looked up only once for the
entire block. (However, it is still the case that function calls
within a block are much faster than calls to a function outside
of a block.)The cost of shallow binding is that bind and unbinding
are more complicated operations than under deep binding. As a
result, stack operations involving context switches, e.g. RETFROM,
RETEVAL, STKEVAL, etc., are less efficient. However, preliminary
studies indicated that far more time was spent in variable lookup
then in variable binding. The timing comparisons between the
two systems bear this out.
.skip

^&Timing Comparisons\&
.test page 8
.paragraph
Here are some timing results of several benchmark test cases
(time in seconds):
.break
.spacing 1
(1) DWIMIFYing a fairly large function with
lots of record expressions and iterative statements
.nofill
.break
.tab stops 20,40
	^&deep\&	^&shallow\&
.break
block compiled	5.3	5.6
.break
compiled	14.8	7.6
.skip
(2) CLISPIFYing a large function
.break
	^&deep\&	^&shallow\&
.break
block compiled	5.8	5.2
.skip
(3) doing a MAKEFILE
.break
	^&deep\&	^&shallow\&
.break
block compiled	13.6	13.7
.skip
.fill
(4)##J.#Moore's theorem prover was converted to the shallow binding
system and run on a typical test case.  This program does not
perform any context switching.
.nofill
	^&deep\&	^&shallow\&
.break
compiled	20.9	15.5
.skip
.fill
(5)  Various examples using FLIP, a format directed list processor.
This program has not been tuned for either system (and is in fact
ten years old).
.nofill
.break
	^&deep\&\	^&shallow\&
.break
interpreted	7.03	4.54
.break
compiled	.83	.42
.break
block compiled	.49	.40
.skip
.fill
 
(6) Finally, since switching contexts is more expensive in shallow
binding than deep binding, a test case was selected which would
be a "worst case" for shallow binding, in that it would not use
any free variables, and do a lot of context switching. The example
is to compare the leaves of two trees, using coroutine and resume
(the example found on page 12.12 of the INTERLISP manual). The
two trees were lists of length 30, whose first element was 30
cars deep, whose second element was 29 cars deep, etc. The two
trees were in fact equal, so that the program had to run through
all of their leaves.
.nofill
.break
	^&deep\&	^&shallow\&
.break
interpreted	5.36	5.60
.break
compiled	.54	.92
.break
block compiled	.43	.75
.skip
.spacing 2
.fill
 
.test page 8
.paragraph
In addition, two large experimental systems under development
at PARC that make heavy use of the spaghetti stack capability
were converted to the shallow binding system. Both ran slightly
faster block compiled under shallow binding than deep binding,
and considerably faster interpreted and compiled normally.
.skip
^&Converting to the Shallow Binding System\&
.test page 8
.paragraph
Conversion to the shallow binding system is quite straightforward.
For most programs, it simply requires recompilation. For the
rest, conversion can be semi-automated using EDITFNS, as described
below. In no case has conversion taking more than a few hours.
Here are the things to look for:
.skip
.test page 8
GETTOPVAL and SETTOPVAL
.test page 8
.paragraph
GETTOPVAL and SETTOPVAL work in the shallow
binding system. However, finding the top level value of a variable
involves searching the entire stack looking for bindings of this
variable. If the variable has been rebound, its top level value
will be stored on the stack at the place where it was first rebound.
Note that the entire stack will have to be searched  even in
the case where the variable was not rebound. In other words,
asking for the top level value of a variable that you are not
rebinding is very costly, and the exact opposite of what you
probably intend. Thus, you probably will want to change all references
to the top level value of a variable to simply reference the
value of the variable. In other words, instead of writing (GETTOPVAL
X), write (EVAL X) or (EVALV X) (the latter is much faster).
(If the top level value is being referenced as a result of the
variable being declared a GLOBALVAR, you don't have to make any
changes; GLOBALVAR declarations are simply ignored by the compiler
under shallow binding.) For backwards compatibility with the
deep binding system, the functions GETATOMVAL, SETATOMVAL, and
/SETATOMVAL are defined in both the new shallow binding system
and the current deep binding system. In deep binding, they reference
the top level value, i.e. are the same as GETTOPVAL, SETTOPVAL,
and /SETTOPVAL. In the shallow binding system, they reference
the current value. In both systems, they compile open where possible.
Thus, to convert to the shallow binding system, simply change
each occurrence of GETTOPVAL, SETTOPVAL and /SETTOPVAL to GETATOMVAL,
SETATOMVAL, and /SETATOMVAL, respectively.
.test page 8
.skip
RESETVAR and RESETSAVE
.paragraph
RESETVAR works in shallow binding, but is inefficient and totally
unnecessary: you can simply rebind the variable in question.
Thus, in the shallow binding system, RESETVAR could be replaced
by an equivalent PROG. However, for backwards compatibility,
there is a new function, RESETVARS, which is defined in both
the shallow binding system and the current deep binding system..
The form of RESETVARS is the same as that of PROG, i.e. (RESETVARS
variables expression ... expression). Labels are permitted, and
a RETURN is required to specify the value of the RESETVARS form.
In the deep binding system, RESETVARS compiles to an equivalent
expression using RESETVAR. In the shallow binding system, RESETVARS
is in fact simply PROG. 

Since RESETSAVE in conjunction with RESETLST is occasionally used
to rebind and reset global variables in the deep binding system,
you also will need to examine all RESETSAVE expressions to see
if they are being used to reset a variable, and if so, to rebind
the variable using RESETVARS, and reset it with a simple SETQ.
.skip
.test page 10
RPAQ, RPAQQ, ADDTOVAR
.paragraph
These functions work on the current value of a variable in the
shallow binding system. If you don't use them in your program,
don't worry; PRETTYDEF and LOAD are properly interfaced. If you
do use them, you will have to look at each case and decide what
is the proper thing to do.
.skip
^&CAR of an atom\&
.skip
is no longer its value. CAR of an atom is now undefined, except
for CAR of NIL, which is guaranteed to be NIL. (CDR of an atom
continues to be its property list, but you really should use
GETPROPLIST instead in case this changes in the future. However,
you can be assured that CDR of NIL will always be NIL. ) Thus,
you should check your programs for expressions of the form (CAR
(QUOTE --)), (RPLACA (QUOTE --) --), and (/RPLACA (QUOTE --)
--). The first should be changed to a simple variable reference,
the second to a simple SETQ, and the third to (/SETATOMVAL (QUOTE
--) --). You undoubtedly will have places where you "cleverly"
took CAAR of a list knowing that CAR was an atom, and CAR the
value of that atom. (I am still finding some of these in the
INTERLISP code!) There isn't anyway to find these except to run
your programs.
.skip
^&Automating the conversion using EDITFNS\&
.paragraph
Thus, to convert to the shallow binding system, load your file,
and perform EDITFNS(filename (EXAM (CAR (QUOTE --)) (RPLACA (QUOTE
--) --) (/RPLACA (QUOTE --) --) GETTOPVAL SETTOPVAL /SETTOPVAL
RESETVAR RESETSAVE RPAQQ RPAQ ADDTOVAR)), and make the indicated
changes. Note that (RESETVAR X Y form) needs to be changed to
(RESETVARS ((X Y)) (RETURN form)). However, several nested RESETVAR
expressions can be combined into a single RESETVARS expression.
.skip
^&Block Compiling under Shallow Binding\&
.test page 8
.paragraph
If you blockcompile, you may see
some new error messages of the form ****(X SHOULD BE A SPECVAR).
This message means exactly what it says. The reason why this
message did not occur previously is the reason why some spaghetti-stack
programs did not work block compiled. If a variable is used freely
in a block, any function within the block that binds the variable
must make a new frame that contains this binding. Otherwise,
if you RESUME, RETFROM, or otherwise change contexts from within
the block, and later come back, the variable will not be correctly
restored. If the variable is already declared to be a LOCALFREEVAR,
you don't have to do anything else (and in fact won't even see
this error message), because LOCALFREEVARS are the same as SPECVARS
in the shallow binding system. If not, simply declare the variable
to be either a SPECVAR or LOCALFREEVAR using the standard block
declaration format.
.paragraph
GLOBALVAR declarations are treated the same
as SPECVAR declarations in shallow binding system, except that 
the compiler will not include the names of GLOBALVARS in the
list of free variables that it prints when compiling your function. 
.skip

^&Tuning Block Compiled Programs for Shallow Binding\&
.test page 8
.paragraph
Although there
is some difference in efficiency tradeoffs between shallow binding
and deep binding, unless you are a fanatic, it probably won't
pay to bother tuning your programs for shallow binding. As an
exercise, I spent a lot of time tuning up things like DWIM for
shallow binding, and the best improvement I was able to obtain
was about 10%. However, the process was instructive, and I include
it here for general education.
.test page 8
.paragraph
The main inefficiency introduced with shallow binding comes as a result of the fact that binding is now more expensive, especially if it results in having to make an extra frame on an internal function call in the block.
(Strictly speaking, the extra frame is not an artifact of shallow binding, but of the spaghetti stack.
If we had not converted to shallow binding, it would have been necessary to require this extra frame in the deep bound spaghetti system in order to make block compiled code work properly, as described earlier.) For example, suppose FOO binds the variable X, and calls FIE which uses X freely, and FOO and FIE are both in the same block.
Every time that FOO is called within the block, a new frame will be (has to be) made.
If FIE is called very often from FOO, i.e.
if FOO does a lot of computation, then this cost is probably not significant.
However, if FOO is called often, and FOO only calls FIE once or twice, it may be worthwhile to make X be an argument of FIE.
Note that this only wins if X is the only SPECVAR that FOO binds.
If there are others, and a new frame is going to be made anyway, don't worry about it; the time required to bind another variable is not significant.
.skip
.paragraph
 MASTERSCOPE is invaluable for finding these sorts of situations.
For example, you can say "WHO IN BLOCKS OF function BINDS ANY IN SPECVARS OF function", and then look at those functions which bind only one or two specvars and see if you can get rid of these bindings.
.test page 8
.paragraph
Another related inefficiency comes about when you have a variable which really has to be a free variable, and then also use it elsewhere within the block as a local variable.
For example, suppose FOO calls FOO1, FOO2, and FOO3, and they all share the free variable TAIL, communicating among one another by setting and resetting this variable so that it is awkward to have to pass it back and forth as an argument.
Suppose FOO, FOO1, FOO2, and FOO3 are all in single block along with the function FIE which also binds a variable named TAIL.
Each time that FIE is called, a new frame will have to be made because TAIL, a SPECVAR, is being rebound.
(As far as the compiler is concerned, it's the same TAIL.) If FIE doesn't call anybody that uses the value of its TAIL, then this frame is unnecessary (but the compiler doesn't know this!).
The best thing to do in this case is to change the name of TAIL to something else, e.g.
$TAIL.
.test page 8
.paragraph
 For this situation, again I found MASTERSCOPE invaluable.
Basically, what you want to know is which functions in a block bind a SPECVAR and don't call anybody in the block that use the variable freely, without going through a function which rebinds the variable.
I found a lot of these in DWIM (but remember eliminating all of them only resulted in about 10% improvement) by asking MASTERSCOPE for each SPECVAR "WHO IN functions BINDS variable AND IS NOT ON PATH TO ANY USING variable FREELY NOTRACE ANY BINDING variable".
Since then, Larry Masinter has included in MASTERSCOPE a function called MSCHECKBLOCKS which given a file name performs this check, as well as several other useful checks, e.g.
functions which are not called by any functions in the block and also are not entries, functions which are called by functions outside of the block and are not entries, etc.
MSCHECKBLOCKS is in the shallow binding system.
You don't even have to load the file.
.skip
^&Other Miscellaneous Differences\&
.test page 8
.paragraph
The stack will look a little different when you do a backtrace
in the shallow binding system because there are many more frames
than previously.
As described earlier, in the shallow binding
system, whenever a function in a block binds a SPECVAR or LOCALFREEVAR,
it must make a new frame.
If this function is a RETFN, the name
of the frame is the name of the function, i.e.
if you do a STKPOS
searching for that function from below that point on the stack,
the search will stop at this frame.
If the function is not a
RETFN, then its name will not appear on the stack.
Instead, the
function name associated with that frame, i.e.
the name printed
by backtrace, will be the result of putting a * at the beginning
and end of the function name, e.g.
*FOO*, *CLISPATOM*.
Actually,
such "function names" can be given as arguments to the various
stack functions, but this procedure is frowned upon.
In particular,
the corresponding programs will not work interpreted.
.test page 8
.paragraph
When a new
frame is created on a internal function call within a block,
only those variables that are SPECVARS or LOCALFREEVARS will
have their names stored with the bindings.
For example, if FOO
is called from within a block, and binds X and Y, and Y is a
SPECVAR, if you do a STKSCAN for X from below FOO, the search
will not see the X, even though there is a frame named *FOO*
on the stack, containing a binding for Y.
The binding for X will
be "unnamed." However, for aid in debugging, BREAK will supply
the names of such variables when it can, using the same convention
of putting a * in front and behind the variable name.
For example,
if you go into a break below FOO (i.e.
below *FOO*), and use
the @ command to "point" at this frame, and then type ?= you
will see:
.break
.spacing 1
*X* = value of X
.skip
Y = value of Y
.skip
.spacing 2
The same procedure is followed when a function is called that
is not inside of a block, but some of whose variables are declared
to be LOCALVARs (not LOCALFREEVARs).