

               Programming Implications and Methodology

                                                            Dan Murphy

The basic architecture of the DECsystem-10  family  of  processors  up
through the KL10 provided an 18-bit, 256K-word addressing space.  This
means in particular that:

     1.  The Program Counter register is 18 bits;

     2.  Each  and  every  instruction  executed  computes  an  18-bit
         effective  address.   The contents of this address may or may
         not be referenced depending on the  actual  instruction,  but
         the  algorithm  for  calculating  it  (including indexing and
         indirecting rules) is the same for all instructions.

The above is true regardless of the size of the physical  core  memory
available  on any particular configuration.  Note also that paging (or
relocation on earlier  processors)  will  determine  if  a  particular
virtual  address  corresponds  to an existing physical word in memory,
but the fundamental size of the virtual address space is a constant.

During the design of the KL10 processor, the need for a larger virtual
address  space  was recognized.  A design was developed which provides
an extended  address  space  to  new  programs  while  still  allowing
existing   unmodified  programs  to  execute  correctly  on  the  same
processor.  Although most of the essential data paths were provided in
the original KL10 implementation, various design changes caused actual
availability of extended addressing operation as described here to  be
deferred to the "model B" KL10 CPU.


Under the extended addressing design, the virtual address space of the
machine  is  now  30  bits,  or 1,073,741,824 words.  Although one can
think of this as a single homogeneous  space,  it  is  generally  more
useful  to  consider  an  address as consisting of two components, the
section number, and the word number.

          0     5  6           17 18                 35
          - - - - --------------------------------------
          !       !              !                     !
          !       !  section     !       word          !
          !       !  12 bits     !       18 bits       !
          - - - - --------------------------------------

The word (more precisely word-within-section)  field  consists  of  18
bits  and  thus  represents  a  256K-word address space similar to the
single address space on earlier machines.  The section number field is
12 bits and thus provides 4096 separate sections, each of 256K words.
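
The split of a 30-bit address into its two fields can be sketched in a
few lines of Python.  This is only an illustration of the layout shown
above; the function names are hypothetical and are not part of any DEC
software.

```python
# Illustrative model of the 30-bit address layout (names are hypothetical).
SECTION_BITS = 12
WORD_BITS = 18
WORD_MASK = (1 << WORD_BITS) - 1        # 777777 octal
SECTION_MASK = (1 << SECTION_BITS) - 1  # 7777 octal


def split_address(addr):
    """Return (section, word) for a 30-bit virtual address."""
    if not 0 <= addr < (1 << (SECTION_BITS + WORD_BITS)):
        raise ValueError("address does not fit in 30 bits")
    return (addr >> WORD_BITS) & SECTION_MASK, addr & WORD_MASK


def join_address(section, word):
    """Build a 30-bit address from a section number and a word number."""
    return ((section & SECTION_MASK) << WORD_BITS) | (word & WORD_MASK)
```

For instance, split_address(0o3001000) yields section 3 and word 1000
(octal), which is the address written 3,,1000 in this document.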

Each section is further divided into pages of 512 words each  just  as
on  earlier  machines.   The  paging  facilities  allow the monitor to
independently establish the existence and protection of  each  section
as a unit.

In order to implement this extended 30-bit address space, the PC  must
be  extended  to  hold a full address.  The PC is always considered to
consist of a section field and a word field, and the  incrementing  of
the  PC  never  allows  a  carry  from the word field into the section
field.  That is, if a program allows flow of control to reach the last
word  of a section and the instruction in that location is not a jump,
then the PC will "wrap-around", and the next instruction fetched  will
be word 0 of the same section.  This will in fact be AC 0 as described
below.  The consequence of this is that flow of control of  a  program
cannot implicitly cross section boundaries.  In general, it would be
a programming error to allow the PC to reach the last  location  of  a
section  and  execute  a non-jump instruction or to execute a skipping
instruction in either of the last two locations of a section.
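
The wrap-around rule can be modeled as follows.  This is a minimal
Python sketch of the PC increment only, under the assumption that a
skip is simply an increment by 2; next_pc is a hypothetical name.

```python
# Illustrative model: the PC increments within the 18-bit word field only.
WORD_MASK = 0o777777


def next_pc(pc, skip=False):
    """Return the PC after a non-jump instruction (or after a skip)."""
    section = pc & ~WORD_MASK                  # section field is unchanged
    word = (pc + (2 if skip else 1)) & WORD_MASK
    return section | word                      # no carry into the section
```

With this model, next_pc(0o3777777) is 0o3000000, i.e., word 0 of
section 3 rather than word 0 of section 4.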


In order to allow efficient use of the extended address space, it  was
necessary  to modify the operation of various machine instructions and
to change the algorithm for the calculation  of  effective  addresses.
Because  these changes have a high probability of causing any existing
program to malfunction, the following convention was adopted:

               If a program is executing in section  0,
               all instructions are executed exactly as
               they would be on a non-extended machine.
               If a program is executing in any section
               other than 0,  the  extended  addressing
               algorithms   are   used   for  effective
               address  calculation   and   instruction
               execution.

A program is said to be executing in section 0 when the section  field
of  the PC contains 0.  Effective address calculations and instruction
executions are performed exactly as on a non-extended machine;   hence
any  existing  program  will work correctly if run in section 0.  Note
however, that this also implies that no addresses outside of section 0
can be generated, either for data references or for jumps.  That is, a
program executing in section 0 cannot leave section  0  except  via  a
monitor call.*

It is easy however to write or generate code which is compatible  with
both extended and non-extended execution.  This was a specific goal of
the design, and in general requires only that certain precautions must
be  taken regarding previously unused fields.  Since these precautions
must be taken however, one cannot generally assume that  any  existing
program  will  have  observed  them  and  so  execute  correctly in an
extended section.

                  - - - - - - - - - - - - - - - - -

*The only exception to this is the XJRSTF instruction.


Unless  explicitly  stated  otherwise,  everything  in  the  following
discussion  refers to execution of instructions with the PC in a non-0
section;  nothing applies to instructions executed from section 0.

     1.  Instruction format

         The format of a machine instruction  is  the  same  as  on  a
         non-extended  machine.   In particular, the effective address
         computation  is  dependent  on  three  quantities  from   the
         instruction,  the Y (address) field, the X (index) field, and
         the I (indirect) field.  These are 18 bits, 4 bits, and 1 bit
         respectively:

          0        8 9   12 13 14   17 18                 35
          !         !      !  !       !                     !
          ! OPCODE  ! AC   ! I!   X   !         Y           !
          !         !      !  !       !                     !

         Depending on the format of the index and indirect words,  the
         effective  address  algorithm  will  perform either 18-bit or
         30-bit address  computations.   When  a  30-bit  quantity  is
         indicated,  an explicit section number is being specified and
         the address is called  a  global  address.   When  an  18-bit
         quantity  is  indicated,  the section field is defaulted from
         some other quantity (e.g., the PC), and the address  is  thus
         local to the default section and is called a local address.

         In the simplest case, consider an instruction which specifies
         no indexing or indirection.  E.g.,

               3,,400/     MOVE T,1000

         Here the effective address computation yields a local address
         1000,  and the section used for the reference is 3, i.e., the
         section  from  which  the  instruction  itself  was  fetched.
         Precisely stated:

               Whenever an instruction or indirect word
               specifies  a  local address, the default
               section is the section  from  which  the
               word containing that address was taken.

         This means that the default section will  change  during  the
         course   of  an  effective  address  calculation  which  uses
         indirection.  The default section will always be the  section
         from which the last indirect word was fetched.

     2.  Indexing

         The first step in the effective  address  calculation  is  to
         perform  indexing  if  specified  by  the  instruction.   The
         calculation  performed  depends  on  the  contents   of   the
         specified index register:

         1.  If the left half of the contents of X is negative  or  0,
             the  right  half of X is added to Y (from the instruction
             word) to yield an 18-bit local address.

             This  is  consistent  with  indexing  on  a  non-extended
             machine, and means, for example, that the usual AOBJN and
             stack pointer formats can be used for tables  and  stacks
             which are in the same section as the program.

         2.  If the left half of the contents of  X  is  positive  and
             non-0,  the  entire  contents  of  X are added to Y (sign
             extended) to yield a 30-bit global address.

             This means that index  registers  may  be  used  to  hold
             complete  addresses  which  are  referenced  via  indexed
             instructions.  A Y field of 0 will commonly  be  used  to
             reference  exactly  the  address  contained  in X.  Small
             positive or negative offsets (magnitude less than  2**17)
             may   also  be  specified  by  the  Y  field,  e.g.,  for
             referencing data structure items in other sections.

     3.  Indirection

         If indirection is specified by the instruction,  an  indirect
         word is fetched from the address determined by Y and indexing
         (if any).  The indirect word is considered to be "instruction
         format"  if bit 0 is a 1, and "extended format" if bit 0 is a
         0.

         1.  Instruction format indirect word (IFIW):
             This word contains Y, X, and I fields of  the  same  size
             and  in  the same position as instructions, i.e., in bits
             13-35.  Bit 1 must be 0 (its use is reserved  for  future
             hardware);   bits  2-12  are  not  used.   The  effective
             address computation continues with the quantities in this
             word  just  as  for  the  original instruction.  That is,
             indexing may be specified and  may  be  local  or  global
             depending  on  the  left  half  of the index, and further
             indirection may be  specified.   Note  that  the  default
             section  for  any  local  addresses  produced  from  this
             indirect word will be the section  from  which  the  word
             itself was fetched.

         2.  Extended Format Indirect Word (EFIW):
             This word contains Y, X, and I  fields  also,  but  in  a
             different  format  such as to allow a full 30-bit address
             field:

                0  1  2    5  6           17  18                 35
               !  !  !      !               !                      !
               !  !  !      !   <-----------!   -----------------> !
               !0 ! I!   X  !   (section)   !  Y  (word)           !
               !  !  !      !               !                      !

              If indexing is specified  in  this  indirect  word,  the
              entire  contents  of  X  are  added  to  the 30-bit Y to
              produce a global address.   A  local  address  is  never
              produced by this operation, and the type of operation is
              not dependent on the contents of X.  Hence, either Y  or
              C(X)  may  be used as an address or an offset within the
              extended address space just as is  done  in  the  18-bit
              address space.

              If further indirection is specified, the  next  indirect
              word  is  fetched from Y as modified by indexing if any.
              The next indirect word  may  be  instruction  format  or
              extended  format, and its interpretation does not depend
              on the format of the previous indirect word.

     4.  Some examples

         1.  Simple  instruction  reference  within  the  current   PC
             section:

                    3,,400/ MOVE T,1000     ;fetches from 3001000
                            JRST 2000       ;jumps to 3002000

          2.  Local tables may be scanned with standard AOBJN loops:

                            MOVSI X,-SIZ
                    LP:     CAMN T,TABL(X)  ;TABL in current section
                            JRST FOUND
                            AOBJN X,LP

          3.  Global tables may  be  scanned  with  full  address  and

                            MOVEI X,0
                    LP:     CAMN T,@[EFIW TABL,X]   ;TABL(X) in ext fmt
                            JRST FOUND
                            CAIGE X,SIZ-1
                            AOJA X,LP

          4.  Subroutine argument pointer may be passed to  subroutine
              in another section:

                    word in arglist:

                         IFIW @VAR(X)   ;indexing and indirecting
                                        ;if specified will be relative
                                        ;to the section in which this
                                        ;pointer resides, i.e., the
                                        ;section of the caller
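
The indexing and indirection rules above can be collected into one
small Python model.  It is a sketch under stated assumptions, not a
description of the hardware: the ACs are a 16-element list, memory is
a dictionary keyed by 30-bit address, indirect words are always
fetched from memory, and the reserved bit 1 of an IFIW is not checked.
All names are hypothetical.

```python
# Illustrative model of extended effective address calculation.
MASK18 = 0o777777
MASK30 = (1 << 30) - 1
SIGN18 = 0o400000


def index_step(y, x, section, acs):
    """Apply instruction-format indexing; return a 30-bit address."""
    if x == 0:                                 # no indexing: local address
        return (section << 18) | y
    c = acs[x]
    left = (c >> 18) & MASK18
    if left == 0 or left & SIGN18:             # left half 0 or negative
        return (section << 18) | ((y + c) & MASK18)   # local address
    y_signed = y - (1 << 18) if y & SIGN18 else y     # sign-extend Y
    return (c + y_signed) & MASK30             # global address


def effective_address(word, pc_section, acs, memory):
    """Follow indexing and indirection starting from an instruction word."""
    y, x, i = word & MASK18, (word >> 18) & 0o17, (word >> 22) & 1
    addr = index_step(y, x, pc_section, acs)
    while i:
        word = memory[addr]
        section = addr >> 18                   # new default section
        if word >> 35:                         # bit 0 = 1: IFIW
            y, x, i = word & MASK18, (word >> 18) & 0o17, (word >> 22) & 1
            addr = index_step(y, x, section, acs)
        else:                                  # bit 0 = 0: EFIW
            i, x, y = (word >> 34) & 1, (word >> 30) & 0o17, word & MASK30
            addr = (y + (acs[x] if x else 0)) & MASK30   # always global
    return addr
```

With the instruction of example 1 (Y = 1000, no indexing or
indirection) and a PC section of 3, the model returns 3,,1000.  An
index register holding 7,,100 combined with a Y field of 777777 (that
is, -1) gives the global address 7,,77.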


All effective address computations yield a 30-bit  address  defaulting
the  section if necessary, as described above.  Immediate instructions
use only the low-order 18 bits  of  this  as  their  operand  however,
setting  the  high-order  18  bits  to 0.  Hence, instructions such as
MOVEI, CAI, etc.  produce identical results regardless of the  section
in which they are executed.

Two immediate instructions are implemented which do retain the section
field of their effective addresses.

     1.  XMOVEI (opcode 415) Extended Move Immediate:

         This instruction loads the entire  30-bit  effective  address
         into  the  designated  AC,  setting  bits  0-5  to  0.  If no
         indexing or indirection is specified, the current PC  section
         will  appear  in  the  section  field  of  the  result.  This
         instruction would replace  MOVEI  in  those  cases  where  an
         address  (rather  than a small constant) is being loaded, and
         the full address is needed.

         Example:  calling a subroutine in another  section  (assuming
         arglist in same section as caller):

                         XMOVEI AP,ARGLIST

                         PUSHJ P,@[EFIW SUBR]

         The subroutine could reference arguments as:

                         MOVE T,@1(AP)

         or could construct argument addresses by:

                         XMOVEI T,@2(AP)

         In both cases, the arglist pointer  would  be  found  in  the
         caller's  section  because  of the global address in AP.  The
         actual section of the effective address is determined by  the
         caller,  and  is implicitly the same as the caller if an IFIW
         is used as the arglist pointer, or is explicitly given if  an
         EFIW is used.

     2.  XHLLI (opcode 501)

         This instruction replaces the left  half  of  the  designated
         accumulator with the section number of its effective address.
         It is convenient where global addresses must be constructed.
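
The difference between the ordinary and the extended immediate
instructions can be sketched as follows.  The Python functions below
are hypothetical models that take an already computed 30-bit
effective address; they illustrate only the result placed in the AC.

```python
# Illustrative models of immediate results from a 30-bit effective address.
MASK18 = 0o777777
MASK30 = (1 << 30) - 1


def movei(ea):
    """MOVEI: low-order 18 bits only; high-order 18 bits are 0."""
    return ea & MASK18


def xmovei(ea):
    """XMOVEI: full 30-bit address; bits 0-5 are 0."""
    return ea & MASK30


def xhlli(ac, ea):
    """XHLLI: section number of ea replaces the left half of the AC."""
    section = (ea >> 18) & 0o7777
    return (section << 18) | (ac & MASK18)
```

For an effective address of 3,,1000, movei returns 1000 while xmovei
returns 3,,1000.  When the section is 0, the two give the same result.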


Any reference to a local address in the range 0-17(8) will be made  to
the  hardware ACs.  Also, any global reference to an address in section
1 in the range 0-17(8) (i.e., 1000000-1000017) will  be  made  to  the
hardware  ACs.   Global references to locations 0-17(8) in any section
other than section 1 will reference memory.  Thus:

     1.  Simple addresses in the usual AC range will reference the ACs
         as  expected,  e.g.,  MOVE  2,3 will fetch from hardware AC 3
         regardless of the current section.

     2.  To pass a global pointer to an AC, a section number of 1 must
         be included.

     3.  Very large arrays may lie across  section  boundaries;   they
         will be referenced with global addresses which will always go
         to memory, never to the hardware ACs.

     4.  PC references are always considered local references;   hence
         a  jump instruction which yields an effective address of 0-17
         in any section will cause code to be executed from the ACs.
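
The AC reference rules above reduce to a short predicate.  The Python
sketch below follows the rules exactly as stated in this section; the
function name and the is_global flag are hypothetical.

```python
# Illustrative predicate: does a reference go to the hardware ACs?
def references_ac(addr, is_global):
    """addr is a 30-bit address; is_global tells how it was produced."""
    section, word = addr >> 18, addr & 0o777777
    if word > 0o17:
        return False          # outside the AC range: always memory
    if not is_global:
        return True           # local reference to 0-17: hardware ACs
    return section == 1       # global reference: ACs only in section 1
```

Thus a local reference to 3,,3 goes to AC 3, a global reference to
1,,3 also goes to AC 3, and a global reference to 3,,3 goes to memory.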


In addition to  the  differences  in  effective  address  calculation,
certain   instructions   are   affected  in  other  ways  by  extended
addressing:

     1.  Instructions which store/load the PC:  PUSHJ, POPJ, JSR, JSP.
         These  instructions  store a 30-bit PC without flags and with
         bits 0-5 of the destination word set to 0.  POPJ restores the
         entire  30-bit  PC  from the stack word.  JSA and JRA are not
         affected by extended addressing and store/load only  18  bits
         of PC.  Hence they are not useful for inter-section calls.

     2.  Stack instructions (PUSHJ, POPJ,  PUSH,  POP,  ADJSP)  use  a
         local  or  global  address  for  the  stack  according to the
         contents of the stack register following the same  convention
         as  for  indexing.   That  is,  if the left half of the stack
         pointer  is  0  or  negative  (prior   to   incrementing   or
         decrementing),  a  local  address using the right half of the
         stack pointer is computed and used for the  stack  reference.
         If  the left half of the stack pointer is positive and non-0,
         the entire word is taken as a global address.  In the  latter
         case,  incrementing  and decrementing of the stack pointer is
         done by adding or  subtracting  1,  not  1000001.   Hence,  a
         global  stack  for routines in many sections may be used in a
         similar  manner  to  present  stacks.   Stack  overflow   and
         underflow protection would be done by making the pages before
         and after the stack inaccessible, since a space count field is
         not present in a global stack pointer.

     3.  Byte  instructions.   Two  formats  of   byte   pointer   are
         recognized by the byte instructions.  The non-extended format
         is identical to the present standard byte pointer format  and
         is  recognized if bit 12 (previously unused) is 0.  If bit 12
         is 1, a two-word extended byte pointer format  is  recognized
         which contains the fields as shown:

          0      5  6       11  12   13      17  18                 35
          !       !           !    !           !                      !
          !  P    !    S      ! 1  !   MBZ     !  AVAIL TO SFTW.      !
          !       !           !    !           !                      !
          !  !  ! !                            !                      !
          !  ! I!X!  <-------------------      !  Y --------------->  !
          !  !  ! !                            !                      !
          0 1  2 5                                                  35

         Note that the second word is identical to the Extended Format
         Indirect  Word  (EFIW).   The right half of the first word is
         specifically reserved  to  software  for  byte  counts,  etc.
         Incrementing  of  this  format  of byte pointer is consistent
         with non-extended format;  the P field is reduced  until  the
         end  of  the  word  is  reached, whereupon the address in the
         second word is incremented.  Incrementing may cause  a  carry
         from  the  word  field  to  the section field of the address;
         hence extended byte arrays may lie across section boundaries.

     4.  Other new or modified instructions are LUUOs, BLT, XBLT, XCT,
         XJRSTF,  XJEN,  XPCW,  XSFM.  Some of these are valid only in
         exec mode.  Consult the System Reference  Manual  or  chapter
         2.2 of the KL10 Functional Specifications for details.
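
The stack pointer convention of item 2 can be illustrated with a
model of PUSH.  This Python sketch makes two assumptions not stated
above: a local stack address takes the current PC section as its
default section, and the local increment is modeled as a plain
addition of 1000001 (octal) to the whole word.  The name push_address
is hypothetical.

```python
# Illustrative model of the stack pointer update and address for PUSH.
MASK18 = 0o777777
MASK30 = (1 << 30) - 1
MASK36 = (1 << 36) - 1
SIGN18 = 0o400000


def push_address(sp, pc_section):
    """Return (new stack pointer, 30-bit address written by PUSH)."""
    left = (sp >> 18) & MASK18        # examined before incrementing
    if left == 0 or left & SIGN18:    # local stack pointer
        new_sp = (sp + 0o1000001) & MASK36
        # Assumption: default section is the PC section.
        return new_sp, (pc_section << 18) | (new_sp & MASK18)
    new_sp = (sp + 1) & MASK36        # global pointer: add 1 only
    return new_sp, new_sp & MASK30
```

A local pointer of 777700,,1000 used from section 3 becomes
777701,,1001 and writes to 3,,1001.  A global pointer of 4,,777777
becomes 5,,0, showing that a global stack may run across a section
boundary.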


It is possible to generate code which works correctly in both extended
and   non-extended   environments.    Such   code   may  even  utilize
inter-section references when running in an extended environment;   it
is not limited to local addressing.

The basic rule is to observe the  extended  addressing  rules  in  the
construction or computation of:

     1.  Index words:  be sure the left half is cleared or contains  a
         negative  quantity  (e.g., a count) when setting up and using
         an index register.  This will cause a local reference in  the
         extended environment which will produce the same result as on
         the non-extended machine.

     2.  Indirect words:  always set bit 0 of indirect words  used  in
         argument lists, etc.  so as to produce local addresses.

     3.  Stack pointers:  the most  common  present  format  (negative
         count   in  left  half)  works  consistently  under  extended
         addressing;  modifying a program to  use  an  extended  stack
         should  require no change to stack instructions except if the
         code expects to find the processor flags in  the  stacked  PC
         word.

When generating addresses to be passed to subroutines or used by other
code,  XMOVEI and XHLLI instructions may be used.  In the non-extended
environment, these opcodes are SETMI (equivalent to  MOVEI)  and  HLLI
respectively, which always load 0 into the left half.


Ultimately, the extended addressing hardware in conjunction  with  the
monitor  should  provide some of the power of general segmentation and
some other useful conventions and facilities.  The following are  some
of the ideas which have been advanced.

     1.  The fact that instructions are generally executed relative to
         the  current  section  suggests that subroutine libraries and
         facilities packages can be written which can  be  dynamically
         loaded  into any available section and used by many programs.
         An important point to observe in  connection  with  this  and
         most  of  the following conventions is that programs must not
         be compiled with fixed section numbers built into  the  code.
         Programs  should  be  built  so  as  to  be loadable into any
         section and to request additional  section  allocations  from
         the monitor as necessary.  This is the only reasonable way to
         ensure  that  conflicting  use  of  particular  sections   is
         avoided.

     2.  An entire section can be mapped by the monitor as quickly  as
         a  single  page.   Hence,  an entire file (up to 256K) can be
         mapped into a section, and the data therein manipulated  with
         ordinary instructions.

     3.  Programs can greatly reduce memory management logic by merely
         assuming  very  large  sizes  for  all data bases.  It is not
         however, very efficient to use many sections, each with  only
         a small amount of data.  A reasonable middle ground should be