Google
 

Trailing-Edge - PDP-10 Archives - BB-H311B-RM - swskit-documentation/bugs.mem
There are 4 other files named bugs.mem in the archive. Click here to see a list.


+---------------+
! D I G I T A L !   I N T E R O F F I C E  M E M O
+---------------+
To: List A
                                Date:   Sep 12, 1979
                                From:   Eric Osman
                                Ext:    6037
                                Dept:   LCG S.E. Dept.
                                Loc:    MR1-2/E37




                BUG Documentation Project



1.0  PURPOSE

The purpose  of  the  project  is  to  generate  a  document
describing the TOPS20 BUGCHKs, BUGHLTs, and BUGINFs.

The document will be generated in such a  way  as  to  force
programmers  to  update  the  document whenever a new BUG is
added to the monitor, and whenever the attributes of  a  BUG
are changed.

The BUG document will also be easily editable by people  who
don't  understand  how  to write programs (monitor code) but
who do know how to edit.

The intended audience of the BUG  documentation  are  people
that   understand  the  internals  of  the  TOPS20  monitor.
Generally,  this  will  include   system   programmers   and
specialists.  Because of this assumption, no attempt will be
made within a BUG's documentation to define terms used.  For
instance,  if  a  BUG having to do with magtapes involves an
inconsistency having to do with iorbs, no  attempt  will  be
made to define an iorb.



2.0  BACKGROUND

Currently,  BUGs  are  defined  within  the  monitor  source
module,  at  the location of program flow where a problem is
detected.  They are at places within the flow that shouldn't
be  reached,  so if they do, the BUG is provided to call the
malfunction to  a  person's  attention.   Depending  on  the
severity  of  the  abnormality,  a  BUG  may be one of three
types:

        BUGHLT - The monitor stops, because  this  situation
                 is  dangerous.   For  instance, if the disk
                 file data structure is  not  correct,  this
                                                      Page 2


                 would  hopefully  cause a BUGHLT before the
                 monitor writes more data onto it.

        BUGCHK - An incorrect situation, not  requiring  the
                 monitor  to  stop.   For instance, if a job
                 logging out cannot be reported  to  QUASAR.
                 This could happen if QUASAR has halted, but
                 it isn't necessary to stop the  monitor  in
                 this case.

        BUGINF - BUGINFS are the least severe  type  of  BUG
        detected.

A (contrived) BUG looks like this in the source listing:

        BUG(CHK,TSKTSK,<Bathroom light left on>,<T1>)

The first argument, "CHK" says that this  is  a  BUGCHK,  as
opposed  to  a  BUGHLT  or  a  BUGINF.  The second argument,
"TSKTSK" is the name  of  this  BUG.   The  third  argument,
"Bathroom light left on", is a short explanation of what bug
is detected.  The fourth argument is optional, "T1" in  this
case,  and  is  a list of locations to be printed out on the
operator's console if the bug  occurs.   In  this  contrived
example,  the monitor has discovered that the bathroom light
was left on when no  one  was  in  there!   T1  may  contain
information about which bathroom is involved.



3.0  GENERAL STRATEGY

The BUG definitions will be moved out of the monitor  source
listings,  and into a central source file.  This source file
will serve both as  the  definition  file  for  the  monitor
itself, and as the documentation of the BUGS.

An alternate idea was to have the monitor building procedure
create  the  BUGS  document, in the same sort of way that it
creates BUGSTRINGS.TXT, and hence each  BUG  in  the  source
code  would  have  its  documentation right there.  This has
merit, since it  is  then  convenient  to  add  a  new  BUG,
including its documentation, all in the source file.  It has
the disadvantage, however, that  editing  the  documentation
for a BUG involves editing monitor source files.  By keeping
all the information about BUGS in one file,  only  that  one
file  need  be  edited.  Hence people not trained in editing
source code can edit it.  Another disadvantage of  including
the  documentation  in  the  source  file is that the source
files become quite  large,  beyond  their  already-too-large
sizes.
                                                      Page 3


4.0  SPECIFICS OF THE PROCEDURE

The steps involved to add a BUG to the monitor with the  new
system are:

1)      Put the "instruction"

                BUG (name,<<x1,des1>,<x2,des2>...>)

        in the monitor code where you want  the  BUG  called
        "name" to occur.

        The x's are locations you want printed out when  the
        BUG  occurs  (the  optional  data).   The  des's are
        one-to-six character descriptors  for  each  of  the
        data.   For  instance,  if x1 is a temperature, des1
        could appropriately be "TEMP"  (but  leave  out  the
        quotes!).

2)      Add a DEFBUG entry to the file BUGS.MAC.  This entry
        cmopletely  defines  the  BUG,  including  its  type
        (BUGHLT, BUGCHK, or BUGINF) and documentation.

3)      Build an entire monitor in the standard way.

There are several more items that must be designated via the
DEFBUG  macro  than  were designated with the old BUG macro.
One new item is  a  documentation  argument  to  the  DEFBUG
macro,   which   documents   the  BUG  being  created.   The
documentation argument to  the  DEFBUG  macro  is  required.
MACRO  will  type  an  error message if the documentation is
omitted or blank.

Another new item which must  be  included  is  a  one-to-six
character  word  descriptor  for  each  optional datum.  The
descriptors are  put  in  with  each  optional  datam  being
specified.   Descriptors  will be required.  MACRO will type
out an error message if the descriptor of an optional  datum
is  omitted  or blank.  These descriptors must appear in the
source module also, to warn programmers which locations  are
printed  out when the BUG occurs, so that they don't clobber
the locations involved when they add new code.

The module name in which the BUG occurs must be specified as
an argument to the DEFBUG macro.

A short classification must  be  supplied  as  an  argument.
Often, this classification will be SOFT or HARD depending on
whether the BUG is expected to catch a software or  hardware
problem.   For  instance,  a  BUG  that  warns  that  a disk
controller caused a spurious interrupt would  be  classified
as  HARD.   A BUG that warns of an unexpected failure return
from a monitor  subroutine  would  be  classified  as  SOFT.
Other  classifications will be defined as need be.  If a BUG
cannot be classified, the word UNKNOWN will appear  as  this
                                                      Page 4


argument.

As an example,  here's  the  old  definition  of  OVRDTA  in
PHYSIO.MAC:

        BUG(INF,OVRDTA,<PHYSIO    -     OVERDUE     TRANSFER
        ABORTED>,<T1,T3,T2>)

Under the new scheme, all  that  appears  in  its  place  in
PHYSIO.MAC is:

        BUG(OVRDTA,<<T1,SLAVE>,<T3,MASSB>,<T2,CHN>>)

In BUGS.MAC, OVRDTA gets defined like this:

        DEFBUG(INF,OVRDTA,PHYSIO,HARD,<PHYSIO   -    OVERDUE
        TRANSFER ABORTED>,<<T1,SLAVE>,<T3,MASSB>,<T2,CHN>>,<

Cause:  Every BUG's documentation will be the last  argument
        to  the  DEFBUG macro.  This argument will be broken
        up into sections.  The "cause"  section  tells  what
        environment or chain of events will cause the BUG to
        happen.

Action: The "action" section tells  what  action  should  be
        taken  when  the  BUG  occurs.   For  some BUGs, the
        action might be to fix some directory.   For  others
        it  may  mean  calling  in  a  disk  or  tape  drive
        repairman.  For still others, it may say to send the
        dump of the crashed system to DEC with an SPR.

        If no particular action is suggested,  the  "action"
        section  will  not appear.  For instance, the CRDOLD
        BUGCHK will occur whenever a  user  program  does  a
        CRDIR  jsys  with a negative number in the left half
        of the user group word.  There's no  special  action
        needed  in  this  case,  so  the "action" section is
        omitted for CRDOLD.

Data:   SLAVE - the slave number of the hardware that caused
        the problem.

        MASSB - the massbus number.  If there is  no  number
        known,  or  this  value  is inapplicable, the number
        will be 777777777777.

        CHN -the channel the problem occurred on.

        For each datum, a paragraph starts with its name,  a
        hyphen, and some documentation.
>)

Note that the closing angle bracket closes the documentation
argument  to  the  DEFBUG macro, and the closing parenthesis
closes the entire list of arguments to the DEFBUG macro.
                                                      Page 5


In order to make listings (output from MACRO or  CREF)  more
informative  than  before,  the  BUG  macro  will  cause the
statement of the short description, plus the description  of
each optional datum to be displayed in the listing where the
BUG macro is called.  Also, the flavor of bug (INF, CHK,  or
HLT)  and  whether it's hardware or software related will be
displayed in the listing.  Hence the OVRDTA bug would appear
in the listing as

        BUG(OVRDTA)
        ;BUG Type:              hardware-related BUGINF
        ;BUG description:       PHYSIO - OVERDUE TRANSFER
        ABORTED
        ;BUG Data:              SLAVE
        ;                       MASSB
        ;                       CHN
As an improvement over the old  display  on  the  operator's
console, the system will now display the one-word descriptor
of the optional datum  before  each  one.   Here's  how  the
OVRDTA BUG would look in both formats:

Old format:
        ********************
        *BUGINF "OVRDTA" AT 27-JUL-79 10:57:42
        *PHYSIO - OVERDUE TRANSFER ABORTED
        *ADDITIONAL DATA: 0, 777777777777, 4
        ********************

New format:
        ********************
        *BUGINF "OVRDTA" AT 27-JUL-79 10:57:42
        *PHYSIO - OVERDUE TRANSFER ABORTED
        *ADDITIONAL DATA: SLAVE 0, MASSB 777777777777, CHN 4
        ********************



5.0  TRANSLATING OLD CODE

A TV macro will be used to pass over all the monitor  source
files,  finding  the  BUGs,  and  change  them  to their new
format.  As it does so, it accumulates the entries  for  the
new BUGS.MAC file.

The TV macro will  initialize  all  the  BUG  categories  to
"HARD".  Eventually, these should be changed as appropriate.
Any new BUGs added should be given the correct category.

The macro will initialize all the documentation to say "this
BUG is not documented yet".

The descriptors are all initialized by the macro to say "D".
All new bugs should be given the correct descriptors.
                                                      Page 6


6.0  HOW THE BUG MACROS WORK

The DEFBUG macro in BUGS.MAC generates a macro with the same
name as the BUG being defined.  The statement

        BUG(name)

in the source code causes the  macro  called  "name"  to  be
executed.   Macro  "name"  is  basically the same as the BUG
macro in  the  old  procedure.   Hence  the  actual  monitor
support  code for BUGs is fairly much the same as in the old
system.

Instead of having just the optional data locations specified
in   the  monitor,  the  BUG  macro  will  have  the  SIXBIT
descriptor in a word after each datum.  The monitor  routine
which currently types out the optional data will be enhanced
to understand to print out the descriptor in front  of  each
datum.