














                              TOPS-10/TOPS-20
                                SPEAR Manual








|                    VERSION 6.0 INTERIM RELEASE DRAFT
|  
|  
|  
|                              December 1984

                  This manual describes the SPEAR  product
                  (Standard Package for Error Analysis and
|                 Reporting).     SPEAR    contains    two
|                 functions  that report on the errors and
                  events  that   are   recorded   by   the
                  operating system.

                  For   TOPS-20   systems,   this   manual
|                 supersedes   the  TOPS-10/TOPS-20  SPEAR
|                 Manual, Order Number:  AA-J833A-TK.



|  OPERATING SYSTEM:   TOPS-10   7.02
|                      TOPS-20   4.1 (KS/KL MODEL A)
|                      TOPS-20   6.0 (KL Model B)
|  SOFTWARE:           SPEAR     2.0





















                                               First Printing, April 1982
|                                                  Revised, December 1984



   The information in this document is subject to change  without  notice
   and  should  not  be  construed  as  a commitment by Digital Equipment
   Corporation.  Digital Equipment Corporation assumes no  responsibility
   for any errors that may appear in this document.

   The software described in this document is furnished under  a  license
   and  may  only  be used or copied in accordance with the terms of such
   license.

   No responsibility is assumed for the use or reliability of software on
   equipment that is not supplied by DIGITAL or its affiliated companies.





|         Copyright (C) 1982, 1984, Digital Equipment Corporation.
                            All Rights Reserved.





   The postage-prepaid READER'S COMMENTS form on the last  page  of  this
   document  requests  the  user's  critical  evaluation  to assist us in
   preparing future documentation.

   The following are trademarks of Digital Equipment Corporation:

        DEC                 DECnet              IAS
        DECUS               DECsystem-10        MASSBUS
        Digital Logo        DECSYSTEM-20        PDT
        PDP                 DECwriter           RSTS
        UNIBUS              DIBOL               RSX
        VAX                 EduSystem           VMS
                                                VT

























                                      CONTENTS



   CHAPTER 1       SPEAR OVERVIEW

           1.1     INTRODUCTION . . . . . . . . . . . . . . . . . . . 1-1
           1.2     USER PROFILES AND INTERACTION  . . . . . . . . . . 1-3


   CHAPTER 2       THE SYSTEM EVENT FILE

           2.1     INTRODUCTION . . . . . . . . . . . . . . . . . . . 2-1
           2.2     ENTRY CATEGORIES . . . . . . . . . . . . . . . . . 2-2
           2.2.1     Software Entries . . . . . . . . . . . . . . . . 2-2
           2.2.2     Hardware Entries . . . . . . . . . . . . . . . . 2-2
           2.2.2.1     CPU and Memory Failures  . . . . . . . . . . . 2-3
           2.2.2.2     Channel and Controller Failures  . . . . . . . 2-4
           2.2.2.3     I/O Device Failures  . . . . . . . . . . . . . 2-4
           2.2.3     Performance Entries  . . . . . . . . . . . . . . 2-4
           2.3     RECORDING EVENTS . . . . . . . . . . . . . . . . . 2-4
           2.3.1     Record Format  . . . . . . . . . . . . . . . . . 2-5
           2.3.2     Record Conventions for Numbers and Dates . . . . 2-6


   CHAPTER 3       ISOLATING FAULTS

           3.1     INTRODUCTION . . . . . . . . . . . . . . . . . . . 3-1
           3.2     TYPES OF FAILURES  . . . . . . . . . . . . . . . . 3-1
           3.2.1     Characteristics of Solid Failures  . . . . . . . 3-2
           3.2.2     Characteristics of Intermittent Failures . . . . 3-2
           3.3     ERROR DETECTING AND ERROR CHECKING . . . . . . . . 3-2
           3.3.1     Hardware Error Detectors . . . . . . . . . . . . 3-2
           3.3.2     Software Error Checking  . . . . . . . . . . . . 3-4
           3.4     ISOLATION TECHNIQUES . . . . . . . . . . . . . . . 3-5
           3.4.1     Verification . . . . . . . . . . . . . . . . . . 3-6


   CHAPTER 4       SPEAR FUNCTIONS

           4.1     INTRODUCTION . . . . . . . . . . . . . . . . . . . 4-1
           4.2     RUNNING SPEAR  . . . . . . . . . . . . . . . . . . 4-1
           4.2.1     Prompts, Responses, and Arguments  . . . . . . . 4-2
           4.2.2     Separators and Terminators . . . . . . . . . . . 4-2
           4.2.3     Help Features  . . . . . . . . . . . . . . . . . 4-3
           4.2.4     File Specifications  . . . . . . . . . . . . . . 4-4
           4.2.5     SPEAR Switches . . . . . . . . . . . . . . . . . 4-4
           4.2.6     Exiting from SPEAR . . . . . . . . . . . . . . . 4-5
           4.3     RETRIEVE . . . . . . . . . . . . . . . . . . . . . 4-5
           4.3.1     RETRIEVE Input . . . . . . . . . . . . . . . . . 4-6
           4.3.2     RETRIEVE Output  . . . . . . . . . . . . . . . . 4-7
           4.3.3     RETRIEVE Procedure . . . . . . . . . . . . . . . 4-9
           4.3.3.1     Retrieving Selected Events . . . . . . . . .  4-10
           4.3.3.2     Sample RETRIEVE Session  . . . . . . . . . .  4-18
           4.3.3.3     Short Format . . . . . . . . . . . . . . . .  4-18
           4.3.3.4     Octal Format . . . . . . . . . . . . . . . .  4-19
           4.3.3.5     Full Format  . . . . . . . . . . . . . . . .  4-21
           4.4     SUMMARIZE  . . . . . . . . . . . . . . . . . . .  4-24
           4.4.1     The SUMMARIZE Report . . . . . . . . . . . . .  4-25
           4.4.2     Error Register Codes . . . . . . . . . . . . .  4-32
           4.4.3     SUMMARIZE Procedure  . . . . . . . . . . . . .  4-34
           4.4.4     Sample SUMMARIZE Session . . . . . . . . . . .  4-39
           4.5     TOPS-20 KLSTAT MODE  . . . . . . . . . . . . . .  4-39
           4.5.1     KLSTAT Procedure . . . . . . . . . . . . . . .  4-41


   CHAPTER 5       ENTRY DESCRIPTIONS

           5.1     INTRODUCTION . . . . . . . . . . . . . . . . . . . 5-1
           5.2     TOPS-10 ENTRIES  . . . . . . . . . . . . . . . . . 5-2
           5.2.1     System Reload  . . . . . . . . . . . . . . . . . 5-2
           5.2.2     Non-Reload Monitor Error . . . . . . . . . . . . 5-3
           5.2.3     Crash Extract  . . . . . . . . . . . . . . . . . 5-4
           5.2.4     Data Channel Error . . . . . . . . . . . . . . . 5-7
           5.2.5     DAEMON Started . . . . . . . . . . . . . . . . . 5-7
           5.2.6     MASSBUS Disk Error . . . . . . . . . . . . . . . 5-8
           5.2.7     DX20 Device Error  . . . . . . . . . . . . . . . 5-9
           5.2.8     Software Event . . . . . . . . . . . . . . . .  5-13
           5.2.9     Configuration Status Change  . . . . . . . . .  5-14
           5.2.10    System Log Entry . . . . . . . . . . . . . . .  5-15
           5.2.11    Software Requested Data  . . . . . . . . . . .  5-15
           5.2.12    Magtape System Error . . . . . . . . . . . . .  5-16
           5.2.13    Front End Device Report  . . . . . . . . . . .  5-18
           5.2.14    Front End Reload . . . . . . . . . . . . . . .  5-18
           5.2.15    KS10 Halt Status Block . . . . . . . . . . . .  5-19
           5.2.16    Magtape Statistics . . . . . . . . . . . . . .  5-19
           5.2.17    Disk Statistics  . . . . . . . . . . . . . . .  5-20
           5.2.18    DL10 Communications Error  . . . . . . . . . .  5-22
           5.2.19    KL10 Parity or NXM Interrupt . . . . . . . . .  5-22
           5.2.20    KS10 NXM Trap  . . . . . . . . . . . . . . . .  5-23
           5.2.21    KL10 or KS10 Parity Trap . . . . . . . . . . .  5-24
           5.2.22    Memory Sweep for NXM . . . . . . . . . . . . .  5-25
           5.2.23    Memory Sweep for Parity  . . . . . . . . . . .  5-26
           5.2.24    CPU Status Block . . . . . . . . . . . . . . .  5-26
           5.2.25    Device Status Block  . . . . . . . . . . . . .  5-28
           5.2.26    Line printer Error . . . . . . . . . . . . . .  5-29
           5.2.27    Unit Record Error  . . . . . . . . . . . . . .  5-30
           5.3     TOPS-20 ENTRIES  . . . . . . . . . . . . . . . .  5-30
           5.3.1     TOPS-20 System Reloaded  . . . . . . . . . . .  5-30
           5.3.2     TOPS-20 BUGCHKs and BUGHLTs  . . . . . . . . .  5-31
           5.3.3     MASSBUS Device Error . . . . . . . . . . . . .  5-33
           5.3.4     DX20 Device Error  . . . . . . . . . . . . . .  5-39
           5.3.5     Drive Statistics Entries . . . . . . . . . . .  5-41
           5.3.6     Configuration Status Change  . . . . . . . . .  5-43
           5.3.7     System Log Entry . . . . . . . . . . . . . . .  5-44
           5.3.8     Front-End Device Report  . . . . . . . . . . .  5-44
           5.3.9     Front End Reloaded . . . . . . . . . . . . . .  5-45
           5.3.10    Processor Parity Trap  . . . . . . . . . . . .  5-46
           5.3.11    Processor Parity Interrupt . . . . . . . . . .  5-47
           5.3.12    KL CPU Status Block  . . . . . . . . . . . . .  5-48
           5.3.13    MF20 Device Report . . . . . . . . . . . . . .  5-50
           5.3.14    KLERR Front End Device Report  . . . . . . . .  5-50
           5.4     DECNET ENTRIES (V2.1)  . . . . . . . . . . . . .  5-53
           5.4.1     Network Control Started  . . . . . . . . . . .  5-53
           5.4.2     Network Up-Line Dump . . . . . . . . . . . . .  5-54
           5.4.3     Network Down-Line Load . . . . . . . . . . . .  5-54
           5.4.4     Network Hardware Error . . . . . . . . . . . .  5-55
           5.4.5     Network CHECK11 Report . . . . . . . . . . . .  5-56
           5.4.6     Network Line Statistics  . . . . . . . . . . .  5-57
           5.5     DECNET ENTRIES (V3.0)  . . . . . . . . . . . . .  5-58







   APPENDIX A      SPEAR MESSAGES


   APPENDIX B      INSTALLATION PROCEDURES

           B.1     INTRODUCTION . . . . . . . . . . . . . . . . . . . B-1
           B.1.1     SPEAR Files  . . . . . . . . . . . . . . . . . . B-1
           B.1.2     Loading and Installing SPEAR . . . . . . . . . . B-1


   APPENDIX C      COMMAND AND CONTROL FILES


   APPENDIX D      EVENT CODES


   APPENDIX E      DISK SUBSYSTEM ERROR BITS


   APPENDIX F      NETWORK EVENT PARAMETERS


   APPENDIX G      GLOSSARY


   INDEX


   FIGURES

           2-1     Components of a Computer System  . . . . . . . . . 2-3


   TABLES

           4-1     Device Types . . . . . . . . . . . . . . . . . . . 4-7
           4-2     Network Event Classes  . . . . . . . . . . . . . . 4-7
           4-3     Subprompts for Device Types  . . . . . . . . . .  4-13
           4-4     Error Types  . . . . . . . . . . . . . . . . . .  4-13
           4-5     MASSBUS Disk Registers . . . . . . . . . . . . .  4-32
           4-6     Tape Registers . . . . . . . . . . . . . . . . .  4-33
           4-7     Subprompts for Device Types  . . . . . . . . . .  4-35
           5-1     Network Event Classes  . . . . . . . . . . . . .  5-58
           A-1     User Validation Messages . . . . . . . . . . . . . A-1
           A-2     Dialogue Usage Messages  . . . . . . . . . . . . . A-3
           A-3     Warning Messages . . . . . . . . . . . . . . . . . A-4
           A-4     Event File Messages  . . . . . . . . . . . . . . . A-5
           D-1     TOPS-10 and TOPS-20 Event Codes  . . . . . . . . . D-1






























                                  PREFACE



|  This manual describes Version 2.0 of SPEAR  on  TOPS-10  and  TOPS-20.
   The  primary  audience  for this manual is a person with experience in
   the following areas:

        1.  Fault isolation techniques

        2.  KL10 instruction set

        3.  All hardware  connected  to  the  various  configurations  of
            TOPS-10 or TOPS-20

   If you do not have the above experience, refer to:

        TOPS-10 Operators Guide

        TOPS-20 Operators Guide

        DECsystem-10/DECSYSTEM-20 Processor Reference Manual

        DECsystem-10 Hardware Reference Manual


   READING PATH

   This manual has three functions:  it  serves  as  a  learning  aid,  a
   user's  guide, and a reference tool for those who already have learned
   to use the SPEAR library.

   As a learning aid:  Chapters 1, 2, and 3 provide an  overview  of  the
   SPEAR  library.  They also provide background information necessary to
   understand and use the SPEAR library.

|  As a user's guide:  Chapter 4  provides  step-by-step  procedures  for
|  using  the  SPEAR  functions  RETRIEVE  and  SUMMARIZE.   This chapter
|  explains the command syntax and  the  response  parameters  associated
   with each function.

|  As a reference tool:  Chapter 5 and the appendixes  provide  reference
   material  such  as  system  event  file formats, error messages, and a
   glossary.  This material is not meant to be  read  from  beginning  to
|  end.   Use  Chapter  5 and the appendixes as a reference when you need
   them.




















                      CONVENTIONS USED IN THIS MANUAL



   The following conventions are used throughout this manual:

        Contrasting colors       Red - where examples contain  both  user
                                 input    and    computer   output,   the
                                 characters you  type  are  in  red;  the
                                 characters SPEAR prints are in black.

        Lowercase letters        Lowercase letters in  a  command  string
                                 indicate  variable  information you must
                                 supply.

        UPPERCASE LETTERS        Uppercase letters in  a  command  string
                                 indicate   fixed  (literal)  information
                                 that you must enter as shown.

        [ ]                      Square   brackets   indicate    optional
                                 information  that  you  can  omit from a
                                 command string.  Do not type the  square
                                 brackets.

        Examples                 All examples were produced on either the
                                 TOPS-10 or the TOPS-20 operating system.

                                 This symbol represents where  you  press
                                 the Escape key.

                                 This symbol represents where  you  press
                                 the RETURN key.





































                                Tab divider



                           BACKGROUND INFORMATION












                                 CHAPTER 1

                               SPEAR OVERVIEW



   1.1  INTRODUCTION

   This chapter introduces you to the SPEAR product and gives an overview
   of its use.

   The name SPEAR is an acronym for Standard Package for  Error  Analysis
   and  Reporting.   The  main  function  of SPEAR is to help isolate the
   cause of a failure through information contained in the  system  event
   file.  Most failures are intermittent; that is, they are active at one
   instant causing system malfunction and  inactive  at  another  instant
   allowing  system  operation.  The task at hand is to find the cause of
   the failure and correct the problem  in  the  least  amount  of  time.
   SPEAR helps to accomplish this task.

|  SPEAR contains functions that report on the errors and events that are
   recorded  by  the  operating system, TOPS-10 or TOPS-20.  In the past,
   the field service engineer was forced to analyze intermittent failures
   by  sorting  through  error  reports  generated by SYSERR, looking for
   common failure patterns.  For example, the engineer  examined  several
   disk  reports  looking  for  common  media  failures, common disk head
   failures, or common failures of the read/write circuitry.  Now,  SPEAR
   can do the tedious work.




























   The system event file contains entries made by  the  operating  system
   and  the communications subsystems (if any).  Each time certain events
   occur, the operating system records and stores pertinent data  in  the
   system  event  file.   The  operating  system continually monitors and
   records information about every disk, tape, and memory parity error as
   it  occurs,  along  with  errors  from  other  subsystems.   At   your
|  discretion, you can call on SPEAR to generate  a  report  of  selected
   events.

   For more information on the system event file,  refer  to  Chapter  2.
   For  samples  of  events  your  operating  system can record, refer to
|  Chapter 5.

|  The SPEAR program consists of the following functions:

         o  RETRIEVE

         o  SUMMARIZE
|  
|        o  KLSTAT (TOPS-20 only)

   These function names are also the primary commands you type to run the
   particular function of SPEAR in which you are interested.

   RETRIEVE reads the binary data in the system event file  and  produces
   an  ASCII report for each entry selected.  RETRIEVE also allows you to
   save specific entries either for later analysis and translation or for
   record-keeping purposes.

|  SUMMARIZE reads the binary data in the system event file and  produces
|  an ASCII report.  Refer to Section 4.4 for a description of SUMMARIZE.

|  Chapter  4  describes  these  functions  in  detail,  along  with   an
   additional feature available only on TOPS-20, KLSTAT mode.































   1.2  USER PROFILES AND INTERACTION

   There are three main groups of SPEAR users:

        1.  Field  Service  and  Software  Support  personnel  who   have
            specific maintenance responsibilities.

        2.  System operators who must  recognize  failures  and  initiate
            recovery procedures.

        3.  System managers who have a need  to  monitor  overall  system
            performance and schedule system use.

   These groups each have varying degrees of expertise  in  software  and
   hardware areas.  SPEAR can not only handle the needs of each group but
   can also guide the new user as well as the experienced user.

   The system operator and Field Service engineer can cooperate by  using
   SPEAR as a tool for both preventive and corrective maintenance.
























































                                 CHAPTER 2

                           THE SYSTEM EVENT FILE



   2.1  INTRODUCTION

   This chapter discusses the file that SPEAR uses for input, the  system
   event  file.   Specifically,  this  chapter  discusses what events are
   recorded, how they are recorded, and what form they take within  their
   respective files.

   Each operating system and communications subsystem has its  own  error
   logging  facility  to gather and maintain information on system errors
   and events as they  occur.   The  error  logging  facility  detects  a
   variety  of  hardware and software errors, providing a detailed record
   of system activity.   When  an  error  occurs,  the  facility  gathers
   significant  data  about  the current state of the system; the type of
   data it gathers depends on the type of error detected.  In addition to
   detecting  actual  errors,  the  facility monitors events that reflect
   other aspects of system performance.  The  recording  of  such  events
   helps to define the system context in which actual errors occur.

   The events are recorded  in  a  system  event  file,  ERROR.SYS.   The
   logical  name  for the location of this file (structure and directory)
   depends on which operating system you are using.  The  following  list
   gives you the names to use to locate your system event file:
|  
|        o  TOPS-10 V7.02    SYS:ERROR.SYS
|  
|        o  TOPS-20 V4.1     SYSTEM:ERROR.SYS
|  
|        o  TOPS-20 V6.0     SERR:ERROR.SYS
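
   The list above can be treated as a small lookup table.  The  following
   Python  sketch  is  only  an illustration:  the names
   EVENT_FILE_BY_SYSTEM and event_file_spec are hypothetical and are  not
   part of SPEAR.  Only the three file specifications come from the list.

```python
# Hypothetical lookup table built from the list above; the dictionary and
# function names are illustrative and are not part of SPEAR.
EVENT_FILE_BY_SYSTEM = {
    ("TOPS-10", "7.02"): "SYS:ERROR.SYS",
    ("TOPS-20", "4.1"): "SYSTEM:ERROR.SYS",
    ("TOPS-20", "6.0"): "SERR:ERROR.SYS",
}


def event_file_spec(system: str, version: str) -> str:
    """Return the file specification of the system event file."""
    key = (system.upper(), version)
    if key not in EVENT_FILE_BY_SYSTEM:
        raise ValueError(f"no event file location listed for {system} V{version}")
    return EVENT_FILE_BY_SYSTEM[key]
```

   For example, event_file_spec("TOPS-20", "6.0") yields  SERR:ERROR.SYS.
   Versions not in the list are rejected rather than guessed.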

   Events that occur during the operation of the system are  logged  into
   the  system  event  file  for use in preventive maintenance as well as
   corrective  maintenance.   These  events  occur  within  the   various
   hardware and software components of the system, such as:

        Hardware            Software

        CPU                 Operating system
        Memory              Memory management
        I/O                 I/O
        Console             File system

   Some of the events that  can  occur  include  parity  errors,  address
   failures,  operator  log  entries,  system  reloads, device mounts and
   dismounts.  Each time one of these events occurs, an entry is appended
   to the system event file in binary format.





   2.2  ENTRY CATEGORIES

   There are two general categories of entries in the system event  file,
   error  and  nonerror.  Both categories can be broken down further into
   the following:

        1.  Software entries

        2.  Hardware entries

        3.  Performance entries

   The following three sections describe the  entry  types  that  can  be
   found in the system event file.



   2.2.1  Software Entries

   The software error entries that SPEAR is concerned with  are  internal
   software  errors.   On  TOPS-10,  these  errors result in a STOPCD; on
   TOPS-20, these errors result in a BUGHLT, BUGCHK, or BUGINF.

   A STOPCD is represented by a 3-letter message that is printed  at  the
   operator's  terminal (CTY) when the operating system detects a serious
   error.  Sometimes the operating system crashes  immediately  following
   this message; at other times the operating system continues to run but
   halts the current job.  The action the operating system takes  depends
   on the severity of the problem.  There are five types of STOPCDs:

        1.  HALT  - The system halts and you must manually dump and
                    reload the operating system.

        2.  STOP  - All jobs are aborted, and  the  system  automatically
                    dumps and reloads itself.

        3.  CPU   - This is the same as STOP except this  message  occurs
                    on  dual  processors.   Jobs  are aborted only on the
                    processor where the error occurs.

        4.  JOB   - The current job is aborted and processing continues.

        5.  DEBUG - A message prints and processing continues.

   The list  of  all  stopcode  messages  is  documented  in  the  STOPCD
   specification in the TOPS-10 Software Notebooks.
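
   The five STOPCD types can be summarized as a table of scope and reload
   action.   The  Python  sketch below is one reading of the descriptions
   above, not a definition taken from the monitor.  The names
   StopcdAction, STOPCD_ACTIONS, and processing_continues are
   hypothetical, and the "automatic" reload shown for CPU is  an
   assumption drawn from the phrase "the same as STOP".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StopcdAction:
    scope: str   # what is affected: "system", "cpu", "job", or "none"
    reload: str  # "manual", "automatic", or "none"


# Illustrative table; field values paraphrase the list in the text.
STOPCD_ACTIONS = {
    "HALT": StopcdAction(scope="system", reload="manual"),
    "STOP": StopcdAction(scope="system", reload="automatic"),
    # Assumption: "same as STOP" is taken to include the automatic reload.
    "CPU": StopcdAction(scope="cpu", reload="automatic"),
    "JOB": StopcdAction(scope="job", reload="none"),
    "DEBUG": StopcdAction(scope="none", reload="none"),
}


def processing_continues(stopcd_type: str) -> bool:
    """True for the types whose description says processing continues."""
    return STOPCD_ACTIONS[stopcd_type.upper()].reload == "none"
```

   Such a table is convenient when sorting  retrieved  entries  by
   severity, because only JOB and DEBUG leave the system running.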

   The TOPS-20 operating system errors also range in severity.  A  BUGHLT
   is  the  most  serious.  It is a non-recoverable error detected by the
   operating system.  A BUGCHK is a recoverable  error  detected  by  the
   operating  system,  while  a  BUGINF is a message informing you that a
   certain event related to the operating system has occurred.   BUGHLTs,
   BUGCHKs, and BUGINFs are listed in the TOPS-20 Operators Guide.



   2.2.2  Hardware Entries

   The hardware entries come from a variety of subsystems:  CPU,  memory,
   I/O, console, and networks.  The number and type of components depend
   on  the  system  configuration.  In general, Figure  2-1  represents  the
   major  components  or  subsystems  that  can contribute entries to the
   system event file.



















   Figure 2-1:  Components of a Computer System


   Hardware error entries are the most frequent  type  of  error.   These
   errors  are  caused by a failure in the hardware itself.  Each time an
   event of this type occurs, an entry is  made  into  the  system  event
   file.   Hardware  error  entries  can  be  divided  into three general
   categories:

        1.  CPU-instruction and CPU-addressing failures

        2.  Controller and channel failures

        3.  I/O errors

   Because the system hardware cannot be expected to operate continuously
   without  failure,  the  design  of the hardware includes facilities to
   monitor the hardware operation.  (One  such  facility  is  the  parity
   check.)   Once  the system has detected an error, it can either signal
   the CPU and system software that an error has occurred or  attempt  to
   recover  from  the  error and notify the software if it cannot recover
   successfully.  This activity is recorded in the form of  one  or  more
   entries in the system event file.



   2.2.2.1  CPU and Memory Failures - The first  category  is  a  failure
   occurring  in  the  CPU  and main storage section of the system.  This
   type of failure is perhaps the most  difficult  to  handle  correctly.
   These  failures can easily modify either the operating system software
   or a user program or cause instructions to be incorrectly executed.  A
   failure  in an addressing section can cause the system to operate with
   incorrect data or unknowingly modify some other job's program or data.
   For  these  reasons, CPU errors ordinarily cause the crash of a job or
   the entire system, depending on whether a user or the operating system
   is in control.








   2.2.2.2  Channel and  Controller  Failures - The  second  category  of
   hardware  error  entry is a channel or controller failure.  The system
   controllers monitor and control several I/O devices of the same  type,
   and  the channels of various types connect the CPU and/or main storage
   units with the I/O controllers and devices.  These errors  are  likely
   to affect several jobs or users because each controller or channel can
   handle several I/O devices being  used  by  many  jobs  or  processes.
   Detected errors are signalled to the CPU, and the operating system may
   stop the current operation if the error is serious.  An example  is  a
   controller's  parity  check  of  a command issued by the CPU.  If this
   parity check fails, the command will not be performed, and  the  error
   will be signalled to the CPU.  Such an event is recorded in the system
   event file for subsequent retrieval by SPEAR.



   2.2.2.3  I/O Device Failures - The third category  of  hardware  error
   entry  is a failure of an I/O device.  Errors detected by a single I/O
   device are recovered in the same  manner  as  channel  and  controller
   failures, but usually the error affects only one job or task.  Some I/O
   failures are caused by faulty media.  The most frequently used form of
   error recovery in this case is to retry the failing operation.  If the
   failure continues for a specified number of consecutive  retries,  the
   job  or task is crashed.  Each failure is recorded in the system event
   file.
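
   The retry strategy described above can be sketched in  Python  as
   follows.   This  is  a minimal illustration, not the monitor's actual
   recovery code.  The function name run_with_retries, the  max_retries
   parameter,  and  the  list  used  as  an event log are all assumptions
   introduced for the example; the manual  does  not  state  the  retry
   limit.

```python
from typing import Callable, List


def run_with_retries(operation: Callable[[], bool],
                     max_retries: int,
                     event_log: List[str]) -> bool:
    """Run operation and retry it after each failure.

    Every failed attempt is appended to event_log.  Returns True as soon
    as an attempt succeeds, or False when the first attempt and all
    max_retries consecutive retries fail (the caller would then abort
    the job or task).
    """
    for attempt in range(max_retries + 1):
        if operation():
            return True
        event_log.append(f"I/O failure on attempt {attempt + 1}")
    return False
```

   An operation that fails twice and then succeeds leaves two entries  in
   the  log  and  returns  True; one that never succeeds is logged once
   per attempt before the job is given up.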



   2.2.3  Performance Entries

   The system event file contains more than just error entries.  It  also
   contains  entries  concerning  day-to-day events of the system.  These
   events vary depending on the operating system.  But  in  general,  you
   might find entries of the following nature:

        1.  System reloads

        2.  Tape and disk mounts/dismounts

        3.  Operator messages

   These entries add another  dimension  to  your  environment.   Keeping
   track  of  system  performance  can  be  a  useful  tool in preventive
   maintenance.



   2.3  RECORDING EVENTS

   The operating system continually detects and records events concerning
   every  disk,  tape,  and  memory  parity  error  as  it  occurs.   The
   operating system:

        1.  Detects the event

        2.  Identifies the type of event

        3.  Associates it with a device

        4.  Gathers information about it

        5.  Records the date and time

        6.  Stores the information as an entry by  appending  it  to  the
            system event file

        7.  In some cases, tries to recover or  find  a  way  around  the
            error

   The system event file is a sequential file; therefore, each new  entry
   is  written  to  the  end of the file.  SPEAR can format these entries
   into an ASCII report with its RETRIEVE facility.  Refer to Section 4.3
   for  information  on  RETRIEVE.   The  following section describes the
   template that each entry fills.
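
   The recording steps and the append-only nature of the  file  can  be
   modeled in a few lines of Python.  The sketch below is an illustration
   only:  the Entry and EventFile classes and their field names are
   hypothetical, and the real file holds binary entries rather than
   Python objects.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class Entry:
    event_type: str                 # step 2: the type of event
    device: str                     # step 3: the associated device
    logged_at: datetime             # step 5: date and time
    data: Dict[str, int] = field(default_factory=dict)  # step 4


class EventFile:
    """Sequential store: new entries are only ever added at the end."""

    def __init__(self) -> None:
        self._entries: List[Entry] = []

    def record(self, event_type: str, device: str,
               data: Dict[str, int],
               now: Optional[datetime] = None) -> Entry:
        entry = Entry(event_type, device, now or datetime.now(), dict(data))
        self._entries.append(entry)  # step 6: append to the event file
        return entry

    def entries(self) -> List[Entry]:
        """Return the entries in the order they were recorded."""
        return list(self._entries)
```

   Because entries are never inserted or reordered, the position of  an
   entry in the file reflects the order in which it was recorded.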



   2.3.1  Record Format

   Each entry in a TOPS-10 or TOPS-20 system event file  is  composed  of
   two  sections:   a header section and a body section.  The top section
   (enclosed in asterisks) of each entry report is the  header  section.
   It contains the following information:

        1.  The entry type

        2.  The time the entry was recorded

        3.  The operating system uptime at the time of the entry

        4.  The serial number of the CPU where the entry occurred

        5.  The record sequence number

   The record sequence number is a number indicating the position of  the
   entry  in  the  file.  SPEAR assigns the record sequence number to the
   entry when you decide to RETRIEVE it.

   For each operating system, the format of the header is the same.   The
   following  is a sample of an entry header on TOPS-20 after it has been
   translated by SPEAR:

           ************************************************************
           MASSBUS DEVICE ERROR 
            LOGGED ON FRI 13 JUN 80 03:23:15 MONITOR UPTIME WAS 2:34:08
                   DETECTED ON SYSTEM #2137.
                   RECORD SEQUENCE NUMBER:   344.
           ************************************************************
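
   A translated header such as the one above is regular enough to be read
   back by a program.  The Python sketch  below  extracts  the  five
   header  fields  from  the  sample  text.   It  is  an illustration
   only:  the function name parse_header and  the  regular  expressions
   are  assumptions based on this single sample, and other entry types or
   SPEAR versions may lay out the header differently.

```python
import re

# Patterns inferred from the sample header; they are assumptions, not a
# documented grammar.
HEADER_PATTERNS = {
    "logged": re.compile(
        r"LOGGED ON (\w{3} \d{1,2} \w{3} \d{2} \d{1,2}:\d{2}:\d{2})"),
    "uptime": re.compile(r"MONITOR UPTIME WAS (\d+:\d{2}:\d{2})"),
    "system": re.compile(r"DETECTED ON SYSTEM #\s*(\d+)\."),
    "sequence": re.compile(r"RECORD SEQUENCE NUMBER:\s*(\d+)\."),
}


def parse_header(text: str) -> dict:
    """Return the header fields of one translated entry as a dict."""
    lines = [line.strip() for line in text.splitlines()]
    # Drop blank lines and the rows of asterisks that frame the header.
    lines = [line for line in lines if line and set(line) != {"*"}]
    fields = {"entry_type": lines[0]}
    rest = "\n".join(lines[1:])
    for name, pattern in HEADER_PATTERNS.items():
        match = pattern.search(rest)
        if match is None:
            raise ValueError(f"header field not found: {name}")
        fields[name] = match.group(1)
    # A trailing decimal point marks these two values as decimal.
    fields["system"] = int(fields["system"])
    fields["sequence"] = int(fields["sequence"])
    return fields


SAMPLE = """
************************************************************
MASSBUS DEVICE ERROR
 LOGGED ON FRI 13 JUN 80 03:23:15 MONITOR UPTIME WAS 2:34:08
        DETECTED ON SYSTEM #2137.
        RECORD SEQUENCE NUMBER:   344.
************************************************************
"""
```

   The date and uptime are kept as text so that the sketch does not  have
   to guess at century or time-zone handling.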

   On TOPS-10, if the system crashed and the entry has been  copied  from
   the  CRASH.EXE  file,  the  header  states this fact at the top of the
   section.  For example:

           ***********************************************************
           **THIS ENTRY COPIED FROM A SAVED CRASH**
                           .
                           .
                           .
                           .
           ***********************************************************

   Because the information was extracted from a saved crash instead of  a
   running  operating  system,  the  date  and  time of the entry and the
   uptime listed in the header  are  the  last  values  recorded  by  the
   operating  system  before  it  crashed.   (Note  that multiple entries
   extracted from a crash will have identical DATE, TIME, and UPTIME.)

   The body section of the entry contains the  various  data  items  that
   make up the entry.  The format of the header is constant regardless of
   the entry type but the body varies according to  the  type  of  entry.
   The  amount  of  information  that is reported in the body also varies
   depending on the format you specify to RETRIEVE.  You  can  receive  a
   SHORT  version  of  an  entry  with only summary information or a FULL
   entry with all the information that  is  in  the  system  event  file.
   Refer to Section 4.3 for more information on the RETRIEVE function.



   2.3.2  Record Conventions for Numbers and Dates

   In the entries on TOPS-10 and TOPS-20, most numbers  output  by  SPEAR
   are  either decimal or octal.  If SPEAR uses another numbering system,
   it is so noted on any  report  you  request.   Decimal  values  always
   contain  a  decimal point; all other values are octal.  Values printed
   in half-word format have leading zeroes suppressed in each half of the
   word, and the halves are separated with a comma.
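
   The conventions above can be sketched in a few lines of Python.  This
   is only an illustration, not SPEAR code:  the function names are
   hypothetical, the 36-bit word with two 18-bit halves is the PDP-10
   word size, and the separator is left as a parameter because the text
   says only that the halves are separated with a comma.

```python
def spear_decimal(value: int) -> str:
    """Decimal values always carry a decimal point, for example 344 -> '344.'."""
    return f"{value}."


def spear_octal(value: int) -> str:
    """Values printed without a decimal point are octal."""
    return format(value, "o")


def spear_halfword(word: int, separator: str = ",") -> str:
    """Print a 36-bit word as two 18-bit halves with leading zeros suppressed."""
    left = (word >> 18) & 0o777777    # high-order 18 bits
    right = word & 0o777777           # low-order 18 bits
    return f"{left:o}{separator}{right:o}"


print(spear_decimal(344))               # 344.
print(spear_octal(344))                 # 530
print(spear_halfword(0o000012000345))   # 12,345
```

   The first line matches the way the sample header prints the record
   sequence number (344.).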

   Register values that are translated to text, such as the CONI value,
   have text translations only for the bits or bytes of interest; the
   whole value is also dumped.  For example, the CONI value might include
   a DONE bit and a PI assignment, but these bits are not translated to
   text.

   All dates and times printed by SPEAR are in your local time zone (for
   example, EST) unless otherwise stated.

|  Refer to Chapter 5 for samples of  entries  that  can  appear  in  the
   system event file of your operating system.


                                 CHAPTER 3

                              ISOLATING FAULTS



   3.1  INTRODUCTION

   The main reason for using SPEAR is to isolate the faults that cause
   intermittent failures of the system.  To help you find the cause of
   these failures, this chapter discusses:

        1.  The types of failures that can occur and what causes them.

        2.  The various error-checking schemes built into the system.

        3.  Some techniques to follow in isolating these failures.



   3.2  TYPES OF FAILURES

   A fault is a condition that causes  a  system  component  to  fail  to
   perform  as expected.  For example, such a condition could be a broken
   wire, a power supply fluctuation, or an unexpected interaction between
   two  or  more software routines.  As a matter of course, the operating
   system records the symptoms of these occurrences in the  system  event
   file for later reference.

   A fault is not necessarily noticeable.  A failure occurs only when a
   fault has an adverse effect on system performance, and the fault
   usually does not become apparent until then.

   You are likely to find several faults before you find the one that  is
   causing  the  failure.   Therefore,  always confirm that the fault you
   corrected is indeed the one that is causing  the  failure.   Refer  to
   Section 3.4.1 for verification techniques.

   You should also watch for changes in performance that may indicate an
   impending failure.  By running SPEAR daily and keeping a record of its
   output, you may be able to catch a problem before it causes a failure.

   There are two general categories of failures caused by  faults.   They
   are:

         o  Solid failures

         o  Intermittent failures




   3.2.1  Characteristics of Solid Failures

   A fault that affects the system in a permanent  manner  results  in  a
   solid   failure.    A  solid  failure  is  easier  to  solve  than  an
   intermittent failure.

   Because the failure is solid (that is, reproducible), you have a basis
   by which to research, identify, and eliminate the cause of the
   failure.



   3.2.2  Characteristics of Intermittent Failures

   A fault that affects the system in a temporary manner can result in an
   intermittent  failure.   An  intermittent failure is more difficult to
   solve than a solid failure.  Something must be causing the failure  to
   occur  and  something  must  be  making it go away.  The secret behind
   finding the cause of an intermittent failure is knowing that  somehow,
   somewhere, something is changing the conditions under which the system
   is running.  The  changing  conditions,  in  turn,  make  the  problem
   intermittent.

   For field service engineers:  the next  time  you  are  working  on  a
   really  tough  intermittent problem (after checking the power supplies
   and ground  system  and  running  the  appropriate  diagnostics),  try
   stepping  back  and  thinking about the problem.  Think about what the
   system is doing.  Watch it for a while.  See if you can  identify  the
   exact conditions at the time of the failure.  Use SPEAR to watch the
   conditions of the system, and check the system event file for the
   events recorded before and after the failure.

   If you can identify the conditions, then maybe you can reproduce them.
   If  you  can  reproduce  the  conditions,  then  you  have changed the
   intermittent failure into a solid failure.  Although the  approach  to
   solving  a  solid  failure  is  the same as the approach to solving an
   intermittent failure, in many cases, you  will  find  that  solving  a
   solid failure is easier.



   3.3  ERROR DETECTING AND ERROR CHECKING

   The system has several means by which to check for errors in both  the
   hardware   and   software.    The  hardware  contains  error-detection
   circuits, and the software contains error-checking routines.  Both the
   detection circuits and checking routines serve a dual purpose:  (1) to
   minimize the effects of a failure on overall system performance, and
   (2) to help isolate the cause of a failure.



   3.3.1  Hardware Error Detectors

   There are three basic types of hardware error detectors in common use:

        1.  Threshold error detectors

        2.  Timing error detectors

        3.  Parity error detectors

   Threshold error detectors monitor critical analog  circuits,  such  as
   power   supplies,   servomechanisms,   write   current  circuits,  and
   temperature probes.

   Timing error detectors monitor asynchronous events within the  system,
   such  as  data  requests to main memory or cache.  The memory or cache
   must respond to the request within a certain amount of  time.   If  it
   does  not,  the nonexistent-memory timing-error detector sets an error
   condition.  Other asynchronous  events  that  must  be  monitored  for
   proper timing are:  index and sector pulses, disk and tape up-to-speed
   operations, and internal and external clocks.

   Parity error detectors  monitor  the  transfer  of  information.   The
   parity  generator adds one or more extra bits to the information being
   transferred to satisfy a particular parity algorithm.  For example, in
   single-bit odd parity, where the information is in the form of ones
   and zeros, the extra parity bit ensures that the total number of one
   bits in the transfer is odd.  The parity error detector monitors each
   transfer.  Should a transfer ever contain an even number of one bits,
   the parity error detector raises a parity error condition.  Note that
   in some cases two bits can be dropped, which leaves the parity odd.
   The detector cannot detect this error condition.
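
   The following Python sketch works through single-bit odd parity as
   described above.  It is an illustration only; the function names and
   the sample data are hypothetical, and real detectors do this in
   hardware.

```python
def odd_parity_bit(data_bits: list[int]) -> int:
    """Return the extra bit that makes the total number of one bits odd."""
    return 0 if sum(data_bits) % 2 == 1 else 1


def parity_error(transfer: list[int]) -> bool:
    """A transfer with an even number of one bits is a parity error."""
    return sum(transfer) % 2 == 0


data = [1, 0, 1, 1, 0, 0, 1, 0]              # four one bits
transfer = data + [odd_parity_bit(data)]     # parity bit is 1: five one bits

single = transfer.copy()
single[0] ^= 1                               # one bit dropped

double = transfer.copy()
double[0] ^= 1
double[2] ^= 1                               # two bits dropped

print(parity_error(transfer))   # False
print(parity_error(single))     # True
print(parity_error(double))     # False: the damage goes undetected
```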

   Once any one of  these  detectors  detects  an  error  condition,  the
   operating  system  records  the  information as an entry in the system
   event file.  These are the kinds of events you  will  be  looking  for
   when using the SPEAR library.



   3.3.2  Software Error Checking

   There are four types of software error  checking  routines  in  common
   use:

        1.  Range checking

        2.  Validity checking

        3.  Sum checking (checksum)

        4.  Loop checking

   A range checking routine verifies that the  arguments  supplied  to  a
   routine fall between two known values.

   A validity checking routine verifies that a routine written to  accept
   only certain arguments indeed accepts only those arguments.  Any other
   response causes an error condition.

   A sum checking routine  (checksum)  checks  file  storage.   When  the
   monitor assembles a group of blocks to write contiguously on the disk,
   it checksums the first word of that group and saves that  checksum  in
   the retrieval information block (RIB).  If, when the group is read
   back, that checksum does not match the first word, the monitor assumes
   it read the wrong block.  If there are no hardware errors, this is the
   most reasonable assumption.  These errors probably indicate a disk
   addressing failure.

   If the monitor crashes before it is able to write the new  RIB  of  an
   old file, the checksum may change in core but not on disk.  An obscure
   software problem may also be responsible.  Reproducing  the  error  is
   one  way for you to narrow the problem down.  Also check the crash log
   and look for other error types.

   Note that a checksum is not a substitute for a parity check.  Its
   purpose  is  to  make  sure  that  a data set was written in the right
   place.  If it was not, either the software failed to keep track of the
   data, or the hardware failed to address the correct place.

   A loop checking routine counts the number of times a program goes
   around a loop and reports an error when a maximum count is reached,
   indicating that the loop is unable to reach a decision.
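
   As a rough illustration of range checking and loop checking, consider
   the Python sketch below.  It is not taken from the monitor:  the names
   are hypothetical, the range is assumed to include both end values, and
   the maximum count of 1000 is an arbitrary choice.

```python
class LoopCheckError(RuntimeError):
    """Raised when a loop reaches its maximum count without a decision."""


def check_range(value: int, low: int, high: int) -> int:
    """Range check: accept the argument only if it lies between two known values."""
    if not low <= value <= high:      # assumption: both end values are legal
        raise ValueError(f"{value} is outside {low}..{high}")
    return value


def wait_until_ready(poll, max_count: int = 1000) -> int:
    """Loop check: count passes through the loop and give up at max_count."""
    count = 0
    while not poll():
        count += 1
        if count >= max_count:
            raise LoopCheckError(f"no decision after {count} passes")
    return count


answers = iter([False, False, False, True])      # ready on the fourth poll
print(wait_until_ready(lambda: next(answers)))   # 3
```

   A caller that never becomes ready, such as
   wait_until_ready(lambda: False, max_count=5), raises LoopCheckError
   instead of looping forever.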

   Any time one of these error conditions is set,  the  operating  system
   records  the  event  in the system event file.  You can check on these
   events by using the SPEAR library.



   3.4  ISOLATION TECHNIQUES

   When you are faced with  the  problem  of  finding  the  cause  of  an
   intermittent  failure, you should take the time to define the problem.
   First check the symptoms:

        1.  What is not happening that should?

        2.  What is happening that should not?

        3.  What are the conditions and circumstances?

   Some possible causes of intermittent failures are:

        1.  An environmental violation  (power,  grounding,  temperature,
            humidity, contamination)

        2.  A damaged, defective, or worn component

        3.  A faulty mechanical or electrical connection

        4.  A mechanical misalignment

        5.  An electrical misadjustment

        6.  A software design oversight

        7.  A hardware design oversight

   What you have to work with are the symptoms of  the  failure  and  the
   SPEAR  library  of functions.  Hopefully, the system operator has been
   running SPEAR analysis on a daily basis so that you can get a  picture
   of  the  conditions  leading  up  to the problem.  If not, you can run
   SPEAR and receive a report within a short period of time.  With  SPEAR
   analysis and reported symptoms, you should be able to venture a guess
   as to the cause of the problem.  You might even be  able  to  pinpoint
   the failure right away.  If you are not that fortunate, your next plan
   of action is to do the following:

        1.  Devise an experiment

        2.  Predict the results

        3.  Conduct the experiment

        4.  Evaluate the results

        5.  Refine the experiment

        6.  Repeat the process

   For example, if you suspect that a disk pack is bad, move the pack  to
   another  disk drive.  If the media is bad, the error pattern will move
   to the other drive.  Once you believe you have isolated  the  failure,
   you should confirm your findings.  After moving the disk pack, run the
   system for a couple of days.  Check to see if the same error  patterns
   occur on the second drive.



   3.4.1  Verification

   There are two general methods of verifying your findings.   The  first
   method  is to reinsert the problem.  If the symptoms recur, you can be
   relatively sure that you have identified the  cause  of  the  problem,
   thereby  verifying  your  findings.  If the symptoms do not recur, you
   should proceed with the second method.

   The second method is called the time window.  You should use the  time
   window  for  intermittent  problems  or  when reinserting the probable
   cause is not feasible; that is, when reinserting  would  be  too  time
   consuming or potentially damaging to the system.

   The time window is simply a period of time during  which  you  closely
   monitor  the performance of the system.  If the problem does not recur
   during that period, then you assume the problem is  solved,  and  your
   findings are verified.

   The duration of the time window depends on  whether  the  problem  was
   solid  or  intermittent.   If  the problem was solid, then monitor the
   system for 24 hours.  If the problem was intermittent, wait  at  least
   three times as long as the usual interval between occurrences of the
   error.  Experience will dictate the method that works best for you.
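
   The rule of thumb for the time window can be written down as a short
   calculation.  The Python sketch below is only an illustration; the
   function name is hypothetical, and it assumes that the interval in
   question is the typical time between occurrences of the error.

```python
from datetime import timedelta
from typing import Optional


def time_window(solid: bool, error_interval: Optional[timedelta] = None) -> timedelta:
    """Return the minimum period to monitor the system after a fix."""
    if solid:
        return timedelta(hours=24)           # solid problem: watch for 24 hours
    if error_interval is None:
        raise ValueError("an intermittent problem needs the interval between errors")
    return 3 * error_interval                # intermittent: at least three intervals


print(time_window(solid=True))                                        # 1 day, 0:00:00
print(time_window(solid=False, error_interval=timedelta(hours=36)))   # 4 days, 12:00:00
```

   The value returned for an intermittent problem is a minimum.  A longer
   window gives more confidence that the problem is solved.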

   Your site  may  have  its  own  specific  isolation  and  verification
   techniques  that  are  tried  and  true.   If  so,  stay with the most
   successful method.

                               SPEAR LIBRARY

                                 CHAPTER 4

                              SPEAR FUNCTIONS



   4.1  INTRODUCTION

   The previous chapters introduced you to SPEAR, described  where  SPEAR
   gets  its  information,  and  listed techniques for intermittent fault
   isolation.  This chapter explains how to use the SPEAR  dialogue  with
|  its help facilities and describes the functions in the SPEAR library:

        1.  RETRIEVE

        2.  SUMMARIZE
|  
|       3.  KLSTAT (TOPS-20 only)

   SPEAR is designed so that, after you have used it a few times, you can
   move through it quickly.  It is easy to use because it is interactive:
   SPEAR has a dialogue that prompts you and gives you as much help as
   you want.



   4.2  RUNNING SPEAR

   To run SPEAR, first log in to your operating system, then type one  of
   the following:

        .R SPEAR       On TOPS-10 based systems

        @SPEAR         On TOPS-20 based systems

   SPEAR indicates that it is waiting for instructions by displaying  the
   following prompt:

        SPEAR>

   After you see the SPEAR prompt, you can type any one of the function
   names (KLSTAT is available on TOPS-20 only), type HELP or a question
   mark (?), or type EXIT to return to operating system command level.
   If you type a function name, you need only specify enough characters
   to make it unique to SPEAR.  At this prompt, the first character of
   the name is enough for SPEAR to recognize it.
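
   The idea of typing only enough characters to make a name unique can be
   sketched in Python as follows.  This is not SPEAR's own recognition
   code; the function name is hypothetical, and the lists contain only
   names that this manual mentions, so they may be incomplete.

```python
def resolve(typed: str, names: list[str]) -> str:
    """Return the one name that begins with the characters typed."""
    matches = [name for name in names if name.startswith(typed)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError(f"{typed!r} matches nothing")
    raise ValueError(f"{typed!r} is ambiguous: {', '.join(matches)}")


FUNCTIONS = ["RETRIEVE", "SUMMARIZE", "KLSTAT"]
SELECTION_TYPES = ["ERROR", "STATISTICS", "DIAGNOSTICS", "CONFIGURATION",
                   "OTHER", "ALL", "SEQUENCE"]

print(resolve("R", FUNCTIONS))          # RETRIEVE
print(resolve("ST", SELECTION_TYPES))   # STATISTICS
```

   One character is enough for the function names, but not for every
   list:  among the RETRIEVE selection types, S alone could mean
   STATISTICS or SEQUENCE, so resolve("S", SELECTION_TYPES) raises an
   error.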

   If you type a question mark (?) at this point, SPEAR prints a list  of
   the features available to you in your version of the SPEAR Library.

                                  CAUTION

           The  SPEAR  library  is   not   transportable   across
           operating  systems.   You cannot run SPEAR for TOPS-10
           on TOPS-20 or the reverse.  Consequently, you cannot use
           the system event file from one operating system with a
           SPEAR library from another system.

   SPEAR has several features to guide you in  its  use.   The  following
   subsections describe these features.



   4.2.1  Prompts, Responses, and Arguments

   Each function of SPEAR has several levels  of  questions  for  you  to
   answer.   SPEAR  prompts  you  and gives you a selection of acceptable
   responses.  The default is listed in parentheses with each prompt.

   If you are already familiar with the prompts, you can speed up the
   process by responding to all the prompts on the first line, using
   legal separators, or by specifying an indirect file containing your
   responses.

   SPEAR can process commands from a disk  file  as  well  as  from  your
   terminal.  This disk file, known as an indirect file, is useful if you
   have a set of responses you often use.  To use this feature, create a
   disk  file while at operating system command level with a text editor.
   The file should contain responses that  you  would  normally  type  to
   SPEAR on the terminal.

                                    NOTE

           Be sure to delete any line-sequence numbers from  your
           indirect file.  SPEAR will not accept them.

   Once you have created the file and saved it in your disk area, all you
   need  to  do  is to run SPEAR and type the file name preceded by an at
   sign (@).  The at sign (@) signifies an indirect  file.   The  default
   file  name  for  an  indirect  file  is SPRCMD.CMD.  Note that you can
   specify an indirect file at any prompt level of SPEAR, as long as  the
   file contains only the remaining information necessary to complete the
   SPEAR requests.

   You can choose to be prompted at every step or decide  to  supply  all
   required  information  without  prompting.   In fact, at SPEAR command
   level, you can input an entire SPEAR session on one  line,  separating
   each field with a space.  For example:

        SPEAR>RETRIEVE A0916.PAK 5,6,10 ASCII FULL /G<RET>

   By using special characters as separators, you can also speed  up  the
   process  within  the  SPEAR  dialogue.   Section 4.2.2 describes these
   characters.



   4.2.2  Separators and Terminators

   The following characters and terminal keys  have  special  meaning  to
   SPEAR:


        1.  The RETURN key <RET> -  indicates  that  you  have  completed
            input  to  a  SPEAR  prompt  in one way or another.  You have
            either input your own arguments or taken the default.

        2.  A comma (,) - indicates that you  are  inputting  a  list  of
            items  within  one  request  for input, for example a list of
            sequence numbers or packet identifiers.

        3.  A colon (:) - indicates that you have either input  a  device
            name  within  a  file  specification  or  you  have specified
            devices within an error type specification.

        4.  A plus sign (+) - separates more than one major error type on
            one line.

        5.  A semicolon (;) - indicates  that  the  next  argument  is  a
            version number in a file specification.

        6.  An exclamation point (!) - allows  you  to  insert  comments.
            SPEAR  ignores  anything it sees on the current line after an
            exclamation point.
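
   To show how some of these characters divide an input line, here is a
   small Python sketch.  It is a simplification and not SPEAR's parser:
   the function name is hypothetical, fields are assumed to be separated
   by spaces as in the one-line RETRIEVE example, and commas inside a
   TOPS-10 directory specification are not handled.

```python
def split_fields(line: str) -> list[list[str]]:
    """Split one input line into fields, and each field into list items."""
    text = line.split("!", 1)[0]     # ignore everything after an exclamation point
    return [field.split(",") for field in text.split()]


fields = split_fields("RETRIEVE A0916.PAK 5,6,10 ASCII FULL /G ! nightly check")
print(fields)
# [['RETRIEVE'], ['A0916.PAK'], ['5', '6', '10'], ['ASCII'], ['FULL'], ['/G']]
```

   The third field shows a comma-separated list (here, three sequence
   numbers) given within one request for input.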



   4.2.3  Help Features

   There are five major help features in SPEAR:  the question mark (?),
   the HELP command, the @HELP command, the question mark switch (/?),
   and the /HELP switch.

        1.  The question mark (?) provides enough information to  refresh
            your memory about the acceptable responses.

        2.  The HELP command provides detailed information  on  both  the
            prompt and on acceptable commands.

        3.  The @HELP command displays  information  concerning  indirect
            files.

        4.  The question mark switch (/?) provides a list of switches you
            can type as response to a particular prompt.

        5.  The /HELP switch provides an explanation  of  the  acceptable
            switches  that  you  can  type  as  response  to a particular
            prompt.

   You can type any of these help features after any prompt in the  SPEAR
   dialogue  and also after you have typed a response to the prompt.  For
   example, if you type a question mark in response to  a  prompt,  SPEAR
   does the following:

        1.  Lists all acceptable responses.

        2.  Gives a brief description of the desired response  if  it  is
            general (for example, file specification).

   If you type a question mark after supplying characters  to  a  prompt,
   SPEAR lists all acceptable responses matching the characters typed.

   You can also type the HELP command after any prompt.  SPEAR prints  up
   to 22 lines of information about the use of the prompt.


   The Escape key is another help feature  in  the  SPEAR  library.   The
   Escape key fills in a response if you type enough characters for SPEAR
   to know what you want.  For example:

        Output mode (ASCII):B<ESC>INARY

   If you do not supply enough information  before  typing  <ESC>,  SPEAR
   prompts  you for more input by sending a bell to the terminal.  If you
   press <ESC> without typing any characters in  response  to  a  prompt,
   SPEAR fills in the default response.  For example:

        Event file (SERR:ERROR.SYS):<ESC>SERR:ERROR.SYS
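
   The behavior of the Escape key described above can be modeled with a
   short Python sketch.  The function name is hypothetical, and treating
   an unrecognized response the same as an ambiguous one (ringing the
   bell) is an assumption.

```python
BELL = "\a"


def escape_fill(typed: str, responses: list[str], default: str) -> str:
    """Return what the Escape key adds after the characters already typed."""
    if not typed:
        return default                      # nothing typed: fill in the default
    matches = [r for r in responses if r.startswith(typed)]
    if len(matches) == 1:
        return matches[0][len(typed):]      # unique: fill in the rest of the response
    return BELL                             # not enough information: ring the bell


OUTPUT_MODES = ["ASCII", "BINARY", "OCTAL"]

print("B" + escape_fill("B", OUTPUT_MODES, default="ASCII"))   # BINARY
print(escape_fill("", OUTPUT_MODES, default="ASCII"))          # ASCII
```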

   The following keys can also help you through the SPEAR dialogue:

        1.  CTRL/U - deletes the current input line

        2.  CTRL/W - deletes back to the last punctuation character

        3.  CTRL/F - completes the next field  of  a  file  specification
                     with the default



   4.2.4  File Specifications

   The following are the formats of the file specifications that  can  be
   given  in  a SPEAR command string.  These formats are listed according
   to operating system:

        TOPS-10        dev:filename.file extension[directory]

        TOPS-20        dev:<directory>filename.file type.file version
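
   For readers who want to pick these formats apart in a script, the
   following Python sketch splits a file specification into its fields.
   It is an approximation, not the operating system's parser:  the
   pattern and function names are hypothetical, every field except the
   file name is treated as optional, and the TOPS-20 pattern accepts
   either a period or a semicolon before the version number because
   Section 4.2.2 lists the semicolon as a version separator.

```python
import re

TOPS10_SPEC = re.compile(
    r"^(?:(?P<dev>[^:]+):)?"         # dev:
    r"(?P<name>[^.\[]+)"             # filename
    r"(?:\.(?P<ext>[^\[]*))?"        # .file extension
    r"(?:\[(?P<dir>[^\]]+)\])?$"     # [directory]
)

TOPS20_SPEC = re.compile(
    r"^(?:(?P<dev>[^:<]+):)?"        # dev:
    r"(?:<(?P<dir>[^>]+)>)?"         # <directory>
    r"(?P<name>[^.;]+)"              # filename
    r"(?:\.(?P<type>[^.;]*))?"       # .file type
    r"(?:[.;](?P<version>\d+))?$"    # .file version (or ;version)
)


def parse_spec(spec: str, pattern: re.Pattern) -> dict:
    """Return the fields of a file specification; omitted fields are None."""
    match = pattern.match(spec)
    if match is None:
        raise ValueError(f"cannot parse {spec!r}")
    return match.groupdict()


print(parse_spec("SYS:ERROR.SYS", TOPS10_SPEC))
print(parse_spec("SERR:ERROR.SYS", TOPS20_SPEC))
```

   Both sample names are the default event files shown in the SPEAR
   prompts.  Fields that are omitted come back as None, which a caller
   could replace with the defaults.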



   4.2.5  SPEAR Switches

   The following is a list of the switches available in SPEAR.  Note that
   the  square  brackets indicate optional information that you can omit.
   You do not type the square brackets.

        /?             lists the available switches.

        /B[REAK]       returns you to the SPEAR> prompt.

        /G[O]          executes  the  current  SPEAR  command  with   the
                       parameters  you  have  given so far.  It takes the
                       defaults for the rest of the parameters.  This  is
                       the default switch.

        /H[ELP]        lists the available switches  and  gives  a  brief
                       explanation of their uses.

        /R[EVERSE]     returns you one level back to the previous prompt,
                       where you can change any parameters.

        /S[HOW]        shows all the parameters you have specified so far
                       and  fills  in  the defaults for the ones you have
                       not specified.

   The following is an example (from TOPS-10) using the /SHOW switch with
   the  RETRIEVE  and SUMMARIZE commands.  Note that all the defaults are
   shown because no other parameters have been specified.

        SPEAR> SUMMARIZE/SHOW


        Event file: SYS:ERROR.SYS
        Report to: DSK:SUMMAR.RPT
|       Time from: 8-Mar-83                     
        Time to: LATEST                        
|       Show Error Distribution: YES


        SPEAR> RETRIEVE/SHOW


        Event or packet file: SYS:ERROR.SYS
        Output to: DSK:RETRIE.RPT
        Merge with: NONE
        Time from: EARLIEST
        Time to: LATEST
        Selection to be: INCLUDED
        Output mode: ASCII
        Report format: SHORT
        Selection type: ALL

        SPEAR> RETRIEVE/REVERSE

        SPEAR> EXIT
        .
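
   The way /SHOW combines what you have specified with the defaults can
   be pictured with the Python sketch below.  It is an illustration only:
   the names are hypothetical, and the default values are copied from the
   TOPS-10 RETRIEVE/SHOW display above.

```python
RETRIEVE_DEFAULTS = {
    "Event or packet file": "SYS:ERROR.SYS",
    "Output to": "DSK:RETRIE.RPT",
    "Merge with": "NONE",
    "Time from": "EARLIEST",
    "Time to": "LATEST",
    "Selection to be": "INCLUDED",
    "Output mode": "ASCII",
    "Report format": "SHORT",
    "Selection type": "ALL",
}


def show(specified: dict[str, str]) -> str:
    """List every parameter, using the default where none was specified."""
    unknown = set(specified) - set(RETRIEVE_DEFAULTS)
    if unknown:
        raise KeyError(f"unknown parameters: {sorted(unknown)}")
    settings = {**RETRIEVE_DEFAULTS, **specified}
    return "\n".join(f"{prompt}: {value}" for prompt, value in settings.items())


print(show({"Report format": "FULL"}))
```

   With an empty dictionary the sketch reproduces the display above; with
   one parameter specified, only that line changes.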



   4.2.6  Exiting from SPEAR

   To exit from SPEAR, first  return  to  the  SPEAR>  prompt  by  typing
   /BREAK.   Then type the EXIT command.  You can also exit from SPEAR by
   typing CONTROL/C at any prompt.



   4.3  RETRIEVE

   RETRIEVE provides a means by which  to  convert  the  entries  in  the
   system  event  file  from  internal  binary format to a readable ASCII
   format.  It also allows you to select specific entries from the system
   event file and save them in a separate file.



   4.3.1  RETRIEVE Input

   RETRIEVE accepts the following types of input:

        1.  The system event file

        2.  A file created by the RETRIEVE process

        3.  Any file containing entries from the system event file

   With RETRIEVE, you have the option of translating  the  entire  system
   event  file  or  specific  entries in the file by sequence number.  In
   order to have more control over the selection  of  specific  types  of
|  entries,  you can use RETRIEVE to extract the entry types in which you
|  are interested and then translate them.

|  You can select entries on the basis of the following:

        1.  Date/time limits

        2.  Sequence numbers

        3.  Event codes
|  
|       4.  Error
|  
|       5.  Statistics
|  
|       6.  Configuration
|  
|       7.  Diagnostics
|  
|  Error,  Statistics,  Configuration  and  Diagnostics  can  be  further
|  subdivided into the following categories:

        1.  Mainframe (CPU, memory, front-end)
|  
|       2.  Disk
|  
|       3.  Tape
|  
|       4.  CI
|  
|       5.  NI
|  
|       6.  Unit record
|  
|       7.  Network
|  
|       8.  Operating system

        9.  Disk pack identifier

       10.  Tape reel identifier

|  Once you have defined a category, you can specify  physical  names  or
|  device  types  within  a  class,  such  as LPT for unit record device.
   Table 4-1 lists the available device types that you can specify:

|  Table 4-1:  Device Types
|  
|  
           Category                    Device Types


          Mainframe       ALL, MEM, FE, CPU

          Disk            ALL, RM03,  RM05,  RP04,  RP05,  RP06,  RP07,
|                         RS04, RP20, RA60, RA80, RA81

|         Tape            ALL, TU16,  TU45,  TU70,  TU71,  TU72,  TU73,
|                         TU77, TU78, TA78
|  
|         CI              CI20, HSC50

          Unit Record     ALL, LPT, CDR

          Network         ALL, Decimal number in range 0-511 (see Table
                          4-2)


|  Table 4-2 lists the classes available for selection of DECnet events:
|  
|  
|  Table 4-2:  Network Event Classes
|  
|  
     Class             Description

     0                 Management layer
     1                 Application layer
     2                 Session Control layer
     3                 Network services layer
     4                 Transport layer
     5                 Data link layer
     6                 Physical link layer
     007-031           Reserved for other common event classes
     032-063           Reserved for RSTS specific event classes
     064-095           Reserved for RSX specific event classes
     096-127           Reserved for TOPS-20 specific event classes
     128-159           Reserved for VMS specific event classes
     160-191           Reserved for RT specific event classes
     192-479           Reserved for future use
     480-511           Reserved for Customer specific event classes
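
   Table 4-2 lends itself to a simple range lookup.  The Python sketch
   below copies the table; the names are hypothetical, and the class
   numbers are taken to be decimal, as the range 0-511 in Table 4-1
   suggests.

```python
NETWORK_EVENT_CLASSES = [
    (0, 0, "Management layer"),
    (1, 1, "Application layer"),
    (2, 2, "Session Control layer"),
    (3, 3, "Network services layer"),
    (4, 4, "Transport layer"),
    (5, 5, "Data link layer"),
    (6, 6, "Physical link layer"),
    (7, 31, "Reserved for other common event classes"),
    (32, 63, "Reserved for RSTS specific event classes"),
    (64, 95, "Reserved for RSX specific event classes"),
    (96, 127, "Reserved for TOPS-20 specific event classes"),
    (128, 159, "Reserved for VMS specific event classes"),
    (160, 191, "Reserved for RT specific event classes"),
    (192, 479, "Reserved for future use"),
    (480, 511, "Reserved for Customer specific event classes"),
]


def describe_class(number: int) -> str:
    """Return the Table 4-2 description for a network event class."""
    for low, high, description in NETWORK_EVENT_CLASSES:
        if low <= number <= high:
            return description
    raise ValueError(f"{number} is outside the range 0-511")


print(describe_class(4))     # Transport layer
print(describe_class(100))   # Reserved for TOPS-20 specific event classes
```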


|  For more information concerning  network  entries  from  DECnet  V3.0,
|  refer to the DECnet documentation for system managers and operators.
|  
|  If you specify Error as an entry selection, you can  also  specify  an
|  error type.  See Table 4-4 for a list of error types.



   4.3.2  RETRIEVE Output

   RETRIEVE output can be in the following forms:

        1.  One or two lines containing the most pertinent data in  ASCII
            format.

        2.  All data about each event, in ASCII format.

        3.  All data about each event in octal dump format.  This  format
            is useful only for debugging the error-reporting system.

        4.  Specific events saved in binary format, for future reference.

   Your default output can be an ASCII  file,  RETRIE.RPT,  or  a  binary
   file, RETRIE.SYS.

   You should be aware that user-defined  entries  that  are  unknown  to
   SPEAR cannot be translated into ASCII.  You can, however, get an octal
   dump of these entries by specifying OCTAL to the  Output  Mode  prompt
   when running RETRIEVE.

   An unusual event you may find in the system  event  file  is  a  KLERR
   entry.   The  KLERR entries are different from most entries in that it
   takes several event file records to make up one complete entry.   This
   is  because  the front-end must send information in pieces through the
   DTE interface along with all communications,  console,  and  hard-copy
   data.   Because  of  this, there is a chance that not all records will
   actually get through to the event file.  When SPEAR sees that a KLERR
   entry is incomplete, it types a nonfatal error message and translates
   all available data anyway.

   Each record of a KLERR entry uses one sequence number.  When looking at a RETRIEVE
   report,  you may notice gaps between sequence numbers even if you have
   selected ALL entries.  A KLERR entry  is  listed  using  the  sequence
   number  of  the  first record in the entry, but it is not listed until
   all records of the entry have been received.   Because  other  entries
   may  enter the event file before the front-end has sent all records of
   one KLERR entry, the KLERR entry will appear to be  out  of  sequence.
   For example, you may find entries with the following sequence numbers:

        1.  Configuration status change

        3.  Disk error

        6.  Tape error

        2.  KLERR

        8.  Reload
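
   The ordering in this example can be reproduced with a short Python
   simulation.  It is a sketch under two assumptions that the manual does
   not spell out:  that each record of a KLERR entry consumes its own
   sequence number (which would account for the gaps), and that only one
   KLERR entry is in progress at a time.  The record layout and names are
   hypothetical.

```python
def listing_order(records: list[tuple[int, str, bool]]) -> list[tuple[int, str]]:
    """Each record is (sequence number, entry type, final record of its entry)."""
    listed = []
    klerr_first = None                   # sequence number of the KLERR entry in progress
    for sequence, entry_type, final in records:
        if entry_type != "KLERR":
            listed.append((sequence, entry_type))
            continue
        if klerr_first is None:
            klerr_first = sequence       # remember where the KLERR entry began
        if final:
            listed.append((klerr_first, "KLERR"))   # listed only when complete
            klerr_first = None
    return listed


event_file = [
    (1, "Configuration status change", True),
    (2, "KLERR", False),
    (3, "Disk error", True),
    (4, "KLERR", False),
    (5, "KLERR", False),
    (6, "Tape error", True),
    (7, "KLERR", True),
    (8, "Reload", True),
]

print([sequence for sequence, _ in listing_order(event_file)])   # [1, 3, 6, 2, 8]
```

   Sequence numbers 4, 5, and 7 never appear in the listing because they
   belong to later records of the KLERR entry, which is reported once
   under number 2.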

   For step-by-step procedures  for  using  RETRIEVE,  refer  to  Section
   4.3.3.



   4.3.3  RETRIEVE Procedure

   RETRIEVE lets you convert events in the system event file into an
   ASCII format for listing on the terminal or line printer.  RETRIEVE
   begins by prompting with one or more of the following guidewords:

        RETRIEVE Mode
        _____________

                Event or packet file(SERR:ERROR.SYS):

                Selection to be (INCLUDED):

                Selection type (ALL):

                Sequence numbers:

                Event codes:

|               Category (ALL):
|  
|               Next category (FINISHED):

                Mainframe devices (ALL):

                Disk drives (ALL):

                Tape drives (ALL):

|               CI controller (ALL):

                Unit record devices (ALL):

|               Disk (structure IDs):
|  
|               Tape (reel IDs):

                Time from (EARLIEST):

                Time to (LATEST):

                Output mode (ASCII):

                Merge with (NONE):

                Report format (SHORT):

                Output to (DSK:RETRIE.RPT):



   4.3.3.1  Retrieving Selected Events - If you  want  to  take  all  the
   defaults, type R/G to the SPEAR> prompt; otherwise, read the following
   procedure.

   STEP 1

|  After you type RETRIEVE at the SPEAR> prompt, RETRIEVE asks for the
|  name of the input file:

        Event or packet file (SERR:ERROR.SYS):       TOPS-20

                        or

        Event or packet file (SYS:ERROR.SYS):        TOPS-10

   Type one of the following:

        1.  The RETURN key - to select  the  default,  the  system  event
            file.

        2.  Any file name, in the proper format, containing events stored
            in binary.

        3.  The name of a previous file  that  you  RETRIEVEd  in  BINARY
            mode.

   STEP 2

   RETRIEVE then prompts for the method of selection:

        Selection to be (INCLUDED):

   Type one of the following:

        1.  The RETURN key - to select the default I[NCLUDED].   INCLUDED
            moves a few selected entries of various types into a separate
            file.

        2.  E[XCLUDED] - to select all but a few entry types.
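
   The difference between the two selection methods can be sketched in
   a few lines of Python.  This is only an illustration of the idea; the
   function name and the list-of-strings entries are assumptions, not
   part of SPEAR.

```python
def select_entries(entries, matches, mode="INCLUDED"):
    """Return the entries kept by an INCLUDED or EXCLUDED selection."""
    if mode not in ("INCLUDED", "EXCLUDED"):
        raise ValueError("mode must be INCLUDED or EXCLUDED")
    keep_matching = (mode == "INCLUDED")
    # INCLUDED keeps what matches; EXCLUDED keeps everything else.
    return [e for e in entries if bool(matches(e)) == keep_matching]


entries = ["disk", "tape", "disk", "reload"]   # hypothetical entry types
is_disk = lambda entry: entry == "disk"

included = select_entries(entries, is_disk, "INCLUDED")
excluded = select_entries(entries, is_disk, "EXCLUDED")
```

   INCLUDED suits the case where you want a few entry types; EXCLUDED
   suits the case where you want everything except a few.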

   STEP 3

   After selecting  INCLUDED  or  EXCLUDED,  you  receive  the  following
   prompt:

        Selection type (ALL):

|  At this prompt, you have two separate  lists  from  which  to  choose.
|  Type one or more of the following from the first group:
|  
|       1.  E[RROR] - to select entries that contain actual failure data.
|  
|       2.  ST[ATISTICS] - to select statistic entries.
|  
|       3.  D[IAGNOSTICS] - to select entries created by a diagnostic.
|  
|       4.  CON[FIGURATION] - to select configuration entries.
|  
|       5.  O[THER] - to select entries that do not fit  into  the  other
|           types.
|  
|  If you choose more than one of  these  types,  separate  each  with  a
|  comma.
|  
|  Or type one of the following from the second group:
|  
|       1.  The RETURN key or A[LL] - to select the default that extracts
|           all  entries.   You  will  be  asked for date and time limits
|           next.
|  
|       2.  SE[QUENCE] - to select entries by sequence number.
|  
|           If you choose SEQUENCE, RETRIEVE prompts further with:
|  
|                Sequence numbers:
|  
|           Here you can specify one number, several numbers separated by
|           commas, or a range of numbers separated by a hyphen.
|  
|       3.  COD[E] - to select entries on the basis of their  octal  code
|           number.   These  numbers  are  listed in Table D-1 and in the
|           SPEAR Reference card.
|  
|           If you choose CODE, RETRIEVE prompts you further with:
|  
|                Event codes:
|  
|           Here you can specify one number, several numbers separated by
|           commas, or a range of numbers separated by a hyphen.
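
   The list syntax accepted at the "Sequence numbers:" and "Event
   codes:" prompts (single numbers, comma-separated numbers, hyphenated
   ranges) can be sketched as follows.  The function name is
   hypothetical, and treating a range as inclusive at both ends is an
   assumption; the manual does not describe how SPEAR itself parses the
   response.

```python
def parse_number_list(text, base=10):
    """Expand '3,6-8' into [3, 6, 7, 8]; base=8 reads the numbers as octal."""
    numbers = []
    for item in text.split(","):
        item = item.strip()
        if "-" in item:
            low, high = (int(part, base) for part in item.split("-", 1))
            if low > high:
                raise ValueError(f"range {item!r} runs backwards")
            numbers.extend(range(low, high + 1))   # assumed inclusive
        else:
            numbers.append(int(item, base))
    return numbers


sequence_numbers = parse_number_list("1249,1713-1715")
event_codes = parse_number_list("101,111-114", base=8)   # codes are octal
```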
|  
|  If you chose ERROR, STATISTICS, CONFIGURATION, OTHER, or a combination
|  of these, proceed to Step 3A.  If you chose ALL or CODE,  proceed  to
|  Step 4.  If you chose SEQUENCE, proceed to Step 6.
|  
|  STEP 3A
|  
|  If  you  choose  ERROR,  STATISTICS,  CONFIGURATION,   OTHER,   or   a
|  combination of these types, you receive the following prompt:
|  
|       Category (ALL):
|  
|  Type one of the following:
|  
|       1.  The RETURN key or A[LL] - to select all the categories.  This
|           is the default.
|  
|       2.  M[AINFRAME]  -  to  select  errors  occurring   in   specific
|           mainframe components.
|  
|       3.  D[ISK] - to select entries occurring on  disk  subsystems  or
|           individual drives.
|  
|       4.  T[APE] - to select entries occurring on  tape  subsystems  or
|           individual drives.
|  
|       5.  CI - to select entries occurring on the  CI  interconnect  or
|           the HSC50 disk controller.
|  
|       6.  NI - to select entries occurring on the NI.
|  
|       7.  U[NITRECORD] - to select  entries  occurring  on  unit-record
|           devices such as card readers and line printers.
|  
|       8.  NE[TWORK] - to select entries occurring on the network nodes.
|  
|       9.  O[PERATING-SYSTEM] - to  select  entries  that  are  software
|           related.
|  
|      10.  CO[MM]  -  to  select  entries  occurring  on  communications
|           devices.
|  
|      11.  P[ACKID] - to  select  entries  occurring  on  specific  disk
|           packs.
|  
|      12.  R[EELID] - to  select  entries  occurring  on  specific  tape
|           reels.
|  
|  All categories except COMM and NI prompt further for  specific  device
|  types.  Table 4-3 lists the subprompts you can expect:




|  Table 4-3:  Subprompts for Device Types
|  
|  
|    Device Type       Subprompt
|  
|    MAINFRAME         Mainframe devices (ALL):
|    DISK              Disk drives (ALL):
|    TAPE              Tape drives (ALL):
|    CI                CI controllers (ALL):
|    UNITRECORD        Unit record devices (ALL):
|    NETWORK           Event class and type (ALL):
|    OPERATING-SYSTEM  Operating System codes (ALL):
|    PACKID            Disk (structure IDs):
|    REELID            Tape (reel IDs):
|  
|  
|  Type ?  at the subprompt level to get a list of acceptable  responses,
|  or refer to Table 4-1 in this manual.
|  
|  If you chose ERROR as one of the selection types in STEP  3,  you  can
|  also  specify  the particular error types for which you are looking in
|  relation to the specific device.  Table 4-4 lists the error types  for
|  the devices:
|  
|  
|  Table 4-4:  Error Types
|  
|  
|    Prompts                     Error Types
|  
|    Disk error type (ALL):      OFFLINE
|                                WRITE-LOCK
|                                UNSAFE
|                                MICROPROCESSOR
|                                SOFTWARE
|                                BUS
|                                CHANNEL-CONTROLLER
|                                READ-WRITE
|                                SEEK-SEARCH
|                                TIMING
|                                OTHER
|  
|    Tape error type (ALL):      READ
|                                WRITE
|                                DEVICE-FORMATTER
|                                BUS
|                                CHANNEL-CONTROLLER
|                                SOFTWARE
|                                OFFLINE
|                                OPERATOR
|                                OTHER
|  
|    CI error type (ALL):
|    for CI20                    EBUS
|                                MBUS
|                                CRAM-PARITY
|                                CHANNEL-ERROR
|                                SERDES-OVERRUN
|                                EDS
|                                INCONSISTENT-DATA
|  
|    CI error type (ALL):
|    for HSC50                   SERDES-OVERRUN
|                                EDC
|                                INCONSISTENT-DATA
|  
|    NI error type (ALL):        EBUS
|                                MBUS
|                                CRAM-PARITY
|                                CHANNEL-ERROR
|  
|  
   STEP 3B

|  RETRIEVE keeps prompting you for  categories  until  you  either  type
   FINISHED or press the RETURN key:

|       Next category (FINISHED):

   Type one of the following:
|  
|       1.  The RETURN key or F[INISHED] to take the default.
|  
|       2.  Another category.

   Note that you can select disk entries by either  DISK  or  PACKID  and
|  tape  entries  by  either  TAPE  or  REELID.  If you are interested in
   media, use PACKID or REELID; otherwise, use  DISK  or  TAPE.   If  you
   specify both DISK and PACKID (or TAPE and REELID), you select all disk
   entries (or tape entries), not just  those  that  match  the  selected
   media.   If  you  want  to  select  entries with a specific device and
   media, you must run RETRIEVE twice.
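
   The following Python sketch illustrates why two runs are needed.  It
   assumes that selecting both a device and a medium behaves like a
   union of the two criteria, which is one reading of the note above;
   the entry records and the retrieve function are hypothetical.  The
   second run is assumed to read the BINARY output file of the first.

```python
entries = [
    {"seq": 1, "device": "RP06", "pack": "WORK"},
    {"seq": 2, "device": "RP06", "pack": "PUBLIC"},
    {"seq": 3, "device": "RP07", "pack": "WORK"},
]


def retrieve(entries, device=None, pack=None):
    """Keep an entry if it matches ANY criterion that was supplied."""
    return [
        e for e in entries
        if (device is not None and e["device"] == device)
        or (pack is not None and e["pack"] == pack)
    ]


# One run with both criteria: the selection widens instead of narrowing.
one_run = [e["seq"] for e in retrieve(entries, device="RP06", pack="WORK")]

# Two runs: select by device first, then select by pack from that result.
first_pass = retrieve(entries, device="RP06")
two_runs = [e["seq"] for e in retrieve(first_pass, pack="WORK")]
```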


































|  You can specify more than one device  name  by  separating  them  with
   commas.  For example:

|       Disk drives (ALL):DISK:RP06,RM03,RP05

   You can always  come  back  to  error  category  selection  (by  using
   /REVERSE)  to add parameters.  Everything typed here remains until you
|  type CTRL/U or CTRL/W.

|  Note that supplying a device type (RP06, RM03) causes SPEAR to  search
   a  different  field  than  if you had supplied a physical name (DP130,
   MTA1, and so forth).  If the name you supply does not match one of the
   known device types, SPEAR assumes that it is a physical name.
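
   The rule for telling device types from physical names can be
   sketched as follows.  The set of known device types shown here is an
   illustrative subset taken from the examples in this section, not
   SPEAR's actual table, and the function name is hypothetical.

```python
KNOWN_DEVICE_TYPES = {"RP05", "RP06", "RP07", "RM03"}   # illustrative subset


def classify_device_names(response):
    """Split a response such as 'RP06,DP130' into (device types, physical names)."""
    device_types, physical_names = [], []
    for name in response.split(","):
        name = name.strip().upper()
        if not name:
            continue
        if name in KNOWN_DEVICE_TYPES:
            device_types.append(name)
        else:
            # Anything that is not a known type is taken as a physical name.
            physical_names.append(name)
    return device_types, physical_names


types, names = classify_device_names("RP06,RM03,DP130,MTA1")
```

   One consequence of this rule is that a mistyped device type is not
   rejected; it is silently treated as a physical name.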

   STEP 4

   RETRIEVE then prompts you for the date and time limits of the  entries
   you want to select:

        Time from (EARLIEST):

   Type one of the following:

        1.  The RETURN key or E[ARLIEST] - to select the beginning of the
            file.  This is the default.

        2.  A date and time in the format dd-mmm-yy hh:mm:ss - to signify
            where to begin extracting entries.  A date by itself defaults
            to one second after midnight.

        3.  A day offset in the format -nn  to  indicate  a  reference
            point  prior  to  the  current  date.  For example, -7 causes
            RETRIEVE to begin extracting entries from seven days prior to
            the current day.

   STEP 5

   RETRIEVE then prompts for the end of the time period:

        Time to (LATEST):

   Type one of the following:

        1.  The RETURN key or L[ATEST] - to select the end of  the  file.
            This is the default.

        2.  A date and  time  in  the  format  dd-mmm-yy  hh:mm:ss  -  to
            indicate  the  last  date  for  extracted entries.  A date by
            itself defaults to one second after midnight.

        3.  A day offset in the format -nn  to  indicate  a  reference
            point  prior  to  the  current date.  For example, -13 causes
            RETRIEVE to stop extracting entries  recorded  thirteen  days
            before the current date.
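
   The three kinds of response accepted at the "Time from" and "Time
   to" prompts can be sketched in Python as shown below.  The function
   name is hypothetical.  The manual does not say which time of day a
   -nn response refers to, so keeping the current time of day is an
   assumption, and keyword abbreviations are not handled.

```python
from datetime import datetime, timedelta


def parse_time_limit(text, now):
    """Return a datetime for a time-limit response, or None for the default."""
    text = text.strip()
    if text == "" or text.upper() in ("EARLIEST", "LATEST"):
        return None                       # default: start or end of the file
    if text.startswith("-"):
        # -nn: nn days before the current date (time of day is assumed).
        return now - timedelta(days=int(text[1:]))
    if " " in text:
        return datetime.strptime(text, "%d-%b-%y %H:%M:%S")
    # A date by itself defaults to one second after midnight.
    return datetime.strptime(text, "%d-%b-%y").replace(second=1)


now = datetime(1984, 3, 6, 15, 57, 46)
window_start = parse_time_limit("23-Feb-84", now)
exact = parse_time_limit("23-Feb-84 03:12:43", now)
week_ago = parse_time_limit("-7", now)
```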











   STEP 6

   RETRIEVE next prompts for style of output:

        Output mode (ASCII):

   Type one of the following:

        1.  The RETURN key or A[SCII] - to  convert  entries  into  ASCII
            format.  This is the default.

        2.  B[INARY] - to retain the entries in their internal format.

   If you choose ASCII, proceed to STEP 7.  If you choose BINARY, skip to
   STEP 8.

   STEP 7

   After choosing ASCII, RETRIEVE  prompts  you  for  the  form  of  your
   output:

        Report format (SHORT):

   Type one of the following:

        1.  The RETURN key or S[HORT] -  to  select  the  default.   This
            selection  produces  a  report  with  only the most essential
            information.  No entry will be longer than three lines of  72
            columns.

        2.  F[ULL] - to display all the information  that  the  operating
            system recorded for that entry.

        3.  O[CTAL] - to produce a ones and zeros ASCII report.  The ones
            and  zeros represent the actual binary contents of the entry.
            Unless you are familiar  with  the  internal  format  of  the
            individual  entries,  this format has very little value.  Its
            primary purpose is to aid  in  debugging  the  SPEAR  program
            library.

   STEP 8

   If you specified BINARY as output style,  RETRIEVE  then  prompts  for
   another file name to give you an opportunity to combine two files into
   one for record-keeping purposes.  The merged output file  will  be  in
   the  proper chronological order.  Both files must be in binary format.
   The prompt is:

        Merge with (NONE):

   Type one of the following:

        1.  The RETURN key - to select the default of NONE.

        2.  A file name of  another  file  containing  entries  from  the
            system event file.
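
   Merging two files into chronological order amounts to a merge of two
   already-sorted lists.  The Python sketch below shows the idea with
   hypothetical (timestamp, sequence number) pairs; it says nothing
   about the binary file format or about how RETRIEVE performs the
   merge internally.

```python
import heapq


def merge_event_files(first, second):
    """Merge two lists of (timestamp, seq) pairs, each already in time order."""
    return list(heapq.merge(first, second, key=lambda entry: entry[0]))


# ISO-style timestamps sort chronologically as plain strings.
file_a = [("1984-02-23 03:12:43", 1249), ("1984-02-24 13:14:20", 328)]
file_b = [("1984-02-23 08:15:49", 1713), ("1984-02-25 10:43:36", 85)]

merged = merge_event_files(file_a, file_b)
merged_sequence_numbers = [seq for _, seq in merged]
```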









   STEP 9

   The last thing RETRIEVE asks for is the destination of the output.  If
   you chose ASCII, the prompt is:

        Output to (DSK:RETRIE.RPT):

   If you chose BINARY, the prompt is:

        Output to (DSK:RETRIE.SYS):

   Type one of the following:

        1.  The  RETURN  key  -  to  select  the  default  RETRIE.RPT  or
            RETRIE.SYS.

        2.  TTY:  - to direct ASCII formatted  output  to  the  terminal.
            You  should not request BINARY formatted output to be printed
            on the terminal.

        3.  Any file name in the proper format for your system.

   After you select the output destination and press RETURN,  SPEAR  asks
   you to confirm your decision:

|       Type <cr> to confirm (/GO):

|  At this point, you can:

        1.  Press RETURN or type /GO to execute the RETRIEVE process.

        2.  Type /SHOW to list the parameters you have chosen.

        3.  Type /REVERSE to return to the previous prompt.

        4.  Type /BREAK to return to SPEAR> level.

        5.  Type question mark (?), HELP, the question mark switch  (/?),
            or /HELP to find out what your options are.

   If your output is formatted in ASCII and you decide to output the file
   to  your  disk area, you can list the file on the lineprinter by doing
   the following:

        Return to operating system command level by typing  EXIT  to  the
        SPEAR> prompt.

        Use  the  PRINT  command  with  any  options  available  on  your
        operating system.
















   4.3.3.2  Sample RETRIEVE Session - The following is a sample  RETRIEVE
   session using the TOPS-20 system event file for input:

|  @spear
|  
|  Welcome to SPEAR for TOPS-20. Version 2(605)
|  Type "?" for help.
|  
|  
|  SPEAR> retrieve
|  
|  RETRIEVE mode
|  -------------
|      Event or packet file (SERR:ERROR.SYS): 
|  
|      Selection to be (INCLUDED): 
|  
|      Selection type (ALL): error,diagnostic
|  
|          Category (ALL): disk
|  
|              Disk drives (ALL): RP07
|  
|                  Disk error type (ALL): ?
|  
|                    One or more of the following:
|                    ALL
|                    OFFLINE
|                    WRITE-LOCK
|                    UNSAFE
|                    MICROPROCESSOR
|                    SOFTWARE
|                    BUS
|                    CHANNEL-CONTROLLER
|                    READ-WRITE
|                    SEEK-SEARCH
|                    TIMING
|                    OTHER
|                    HELP
|  
|                  Disk error type (ALL): read-write
|  
|          Next Category (FINISHED): 
|  
|      Time from (EARLIEST): 
|  
|      Time to (LATEST): 
|  
|      Output mode (ASCII): 
|  
|          Report format (SHORT): full
|  
|      Output to (DSK:RETRIE.RPT): 
|  
|  Type <cr> to confirm (/GO): 



   4.3.3.3  Short Format - The following is a sample of a RETRIEVE report
   in short format:





|  @ty retrie.RPT
|  
|  SPEAR Version 2(565). Retrieval from SERR:ERROR.SYS
|    Report generated  6-Mar-84 15:57:46-EST
|    As directed by user
|    Selected window: 23-Feb-84 00:00:01-EST to 26-Feb-84 00:00:01-EST.
|    Selected records are included
|    Selection type is ERRORS,
|    Report sent to DSK:RETRIE.RPT
|  
|  
|  SEQ    TIME    Thu 23 Feb 84
|  
|  1249. 03:12:43 DP100 WORK: RP07 SERIAL #2861. CONI RH= 0,222715
|                      CHN STS= 540100,174632 SR= 0,51700 ER= 0,100000
|                      CYL/SURF/SEC= 212./27./3.
|  1713. 08:15:49 DP040 RP06 SERIAL #0125. CONI RH= 0,202615
|                      CHN STS= 500000,305600 SR= 0,51700 ER= 0,100000
|                      CYL/SURF/SEC= 0./0./1.
|  1875. 11:26:39 DP000 SERR: RP06 SERIAL #0941. CONI RH= 0,222615
|                      CHN STS= 540100,174024 SR= 0,51700 ER= 0,100000
|                      CYL/SURF/SEC= 603./10./16.
|  
|  SEQ    TIME    Fri 24 Feb 84
|  
|  328. 13:14:20 DP010 PUBLIC: RP06 SERIAL #0484. CONI RH= 0,222615
|                      CHN STS= 540100,174066 SR= 0,51700 ER= 0,100000
|                      CYL/SURF/SEC= 93./12./0.
|  372. 17:04:09 DP000 SERR: RP06 SERIAL #0941. CONI RH= 0,222615
|                      CHN STS= 540100,174024 SR= 0,51700 ER= 0,100000
|                      CYL/SURF/SEC= 361./15./16.
|  
|  SEQ    TIME    Sat 25 Feb 84
|  
|  85. 10:43:36 DP110 GALAXY: RP07 SERIAL #251D. CONI RH= 0,322615
|                      CHN STS= 540100,174632 SR= 0,51700 ER= 0,400
|                      CYL/SURF/SEC= 623./15./35.
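
   If you post-process a short-format report with a script, the first
   line of each entry can be picked apart with a regular expression.
   The pattern below is a sketch inferred only from the sample lines
   above (sequence number, time, unit name, optional volume name, unit
   type, serial number); other entry types may be laid out differently.
   The lines are shown without the change bar that this manual prints
   in the left margin.

```python
import re

# Pattern inferred from the sample report; not an official format definition.
ENTRY_LINE = re.compile(
    r"^\s*(?P<seq>\d+)\.\s+(?P<time>\d\d:\d\d:\d\d)\s+(?P<unit>\S+)\s+"
    r"(?:(?P<volume>\S+):\s+)?(?P<type>\S+)\s+SERIAL\s+#(?P<serial>\w+)\."
)


def parse_entry_line(line):
    """Return the fields of an entry's first line, or None if it does not match."""
    match = ENTRY_LINE.match(line)
    return match.groupdict() if match else None


with_volume = parse_entry_line(
    "1249. 03:12:43 DP100 WORK: RP07 SERIAL #2861. CONI RH= 0,222715")
without_volume = parse_entry_line(
    "1713. 08:15:49 DP040 RP06 SERIAL #0125. CONI RH= 0,202615")
```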
|  
|  
|  
|  4.3.3.4  Octal Format - The following is a sample of a RETRIEVE report
|  in octal format.
|  
|  
|  SPEAR Version 2(565). Retrieval from SERR:ERROR.SYS
|    Report generated  6-Mar-84 16:08:12-EST
|    As directed by user
|    Selected window: 23-Feb-84 00:00:01-EST to 26-Feb-84 00:00:01-EST.
|    Selected records are included
|    Selection type is ERRORS,
|    Report sent to DSK:RETRIE.OCTAL
|  
|  
|  
|  Sequence # 1249 -- Record HEADER: 
|  0/      111001,,125124
|  1/      131271,,257140
|  2/      0,,116617
|  3/      0,,5467
|  4/      0,,2341
|  
|  Record BODY: 
|  0/      0,,0
|  1/      675762,,530000
|  2/      1242,,440147
|  3/      1,,74014
|  4/      100000,,1
|  5/      0,,222715
|  6/      0,,2415
|  7/      0,,35624
|  10/     1,,234156
|  11/     0,,172464
|  12/     0,,0
|  13/     0,,0
|  14/     0,,0
|  15/     732200,,177471
|  16/     732200,,177471
|  17/     720000,,15403
|  20/     720000,,15403
|  21/     0,,715652
|  22/     600001,,0
|  23/     0,,1
|  24/     0,,0
|  25/     0,,0
|  26/     0,,0
|  27/     0,,324
|  30/     0,,2214
|  
|                  .
|                  .
|                  .
|  
|  Sequence # 1713 -- Record HEADER: 
|  0/      111001,,125124
|  1/      131271,,432751
|  2/      0,,272430
|  3/      0,,5467
|  4/      0,,3261
|  
|  Record BODY: 
|  0/      0,,0
|  1/      0,,0
|  2/      1242,,440146
|  3/      0,,1
|  4/      100000,,1
|  5/      0,,202615
|  6/      0,,2415
|  7/      0,,0
|  10/     0,,466
|  11/     0,,0
|  12/     0,,0
|  13/     0,,0
|  14/     0,,0
|  15/     732204,,177771
|  16/     732204,,177771
|  17/     720004,,1
|  20/     720004,,1
|  21/     0,,715436
|  22/     200001,,0
|  23/     0,,1
|  24/     0,,0
|  25/     0,,0
|  26/     0,,0
|  27/     0,,0
|  30/     0,,1
|  
|                  .
|                  .
|                  .
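
   Each word in the sample is printed as two octal halfwords separated
   by two commas (left half,,right half).  The Python sketch below joins
   the two 18-bit halves into one 36-bit value.  The function name is
   hypothetical.  The interpretation of individual words is inferred by
   comparing the sample with the full-format report for the same entry
   in Section 4.3.3.5: body word 3 of sequence number 1249 appears to
   hold the LBN (1074014), and body word 27 the cylinder (324 octal,
   212 decimal).

```python
def parse_word(text):
    """Convert 'left,,right' (octal halfwords) into one 36-bit integer."""
    left, right = (int(half, 8) for half in text.split(",,"))
    if left >= 1 << 18 or right >= 1 << 18:
        raise ValueError("each half must fit in 18 bits")
    return (left << 18) | right


lbn = parse_word("1,,74014")       # body word 3 of sequence number 1249
cylinder = parse_word("0,,324")    # body word 27 of sequence number 1249
```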



|  4.3.3.5  Full Format - The following is a sample of a RETRIEVE report
|  in full format:



                              RETRIEVE SESSION

|  
|  SPEAR Version 2(565). Retrieval from SERR:ERROR.SYS
|    Report generated  6-Mar-84 16:02:31-EST
|    As directed by user
|    Selected window: 23-Feb-84 00:00:01-EST to 26-Feb-84 00:00:01-EST.
|    Selected records are included
|    Selection type is ERRORS,
|    Report sent to DSK:RETRIE.FULL
|  
|  
|  
|  
|  ***********************************************
|  MASSBUS DEVICE ERROR       
|   LOGGED ON Thu 23 Feb 84 03:12:43      MONITOR UPTIME WAS  3:41:34
|          DETECTED ON SYSTEM # 2871.
|          RECORD SEQUENCE NUMBER: 1249.
|  ***********************************************
|          UNIT NAME:      DP100
|          UNIT TYPE:      RP07
|          UNIT SERIAL #:  2861.
|          VOLUME ID:      WORK
|          LBN AT START OF XFER:           1074014  =
|          CYL:   212.     SURF:   27.     SECT:   3.
|          OPERATION AT ERROR:      DEV.AVAIL., GO +  READ DATA(70)
|          FINAL ERROR STATUS:     100000,1
|          RETRIES PERFORMED:      2.
|          ERROR:  RECOVERABLE 
|  DRIVE EXCEPTION,CHN ERROR, IN CONTROLLER CONI
|                  DCK, IN DEVICE ERROR REGISTER
|  
|  CONTROLLER INFORMATION:
|  CONTROLLER:     RH20 # 1
|  CONI AT ERROR:  0,222715 =
|          DRIVE EXCEPTION,CHN ERROR,
|  CONI AT END:    0,2415 =
|           NO ERROR BITS DETECTED
|          DATAI PTCR AT ERROR:    732200,177471
|          DATAI PTCR AT END:      732200,177471
|          DATAI PBAR AT ERROR:    720000,15403
|          DATAI PBAR AT END:      720000,15403
|  
|  CHANNEL INFORMATION:
|  CHAN STATUS WD 0:       200000,174567
|          CW1:  0,0  CW2:  0,0
|  CHN STATUS WD 1:        540100,174632 =
|          NOT SBUS ERR,NOT WC = 0,LONG WC ERR,
|  CHN STATUS WD 2:        614005,377200
|  
|  DEVICE REGISTER INFORMATION:
|          AT ERROR        AT END          DIFF.
|  CR(00): 4070            4070               0            
|           DEV.AVAIL., READ DATA(70)
|  SR(01): 51700           11700           40000           
|          ERR,MOL,PGM,DPR,DRY,VV,
|  ER(02): 100000             0            100000          
|          DCK,
|  MR(03):    0               0               0            
|  AS(04):    0               0               0
|  DA(05): 15404           15407           3               
|          D. TRK = 33, D.SECT. = 4
|  DT(06): 24042           24042              0            
|  LA(07): 1700            700             1000            
|  SN(10): 24141           24141              0            
|  OF(11):    0               0               0            
|  DC(12): 324             324                0            
|          212.
|  CC(13): 324             324                0            
|          212.
|  E2(14):    0               0               0            
|          NO ERROR BITS DETECTED
|  E3(15):    0               0               0            
|          NO ERROR BITS DETECTED
|  EP(16): 1454               0            1454            
|  PL(17): 2400               0            2400            
|  
|  DEVICE STATISTICS AT TIME OF ERROR:
|  # OF READS:     342126. # OF WRITES:    62772.  # OF SEEKS:     15252.
|  # SOFT READ ERRORS:     1.      # SOFT WRITE ERRORS:    0.
|  # HARD READ ERRORS:     0.      # HARD WRITE ERRORS:    0.
|  # SOFT POSITIONING ERRORS:      0.
|  # HARD POSITIONING ERRORS:      0.
|  # OF MPE:  0.   # OF NXM:  0.   # OF OVERRUNS:  0.
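
   Two details of the device register table can be checked by hand.  In
   this sample the DIFF. column is consistent with a bitwise exclusive
   OR of the AT ERROR and AT END values, and the decoded line under
   DA(05) is consistent with a track number in the upper byte and a
   sector number in the lower byte.  Both readings are inferred from
   the sample values only; the Python sketch below, with hypothetical
   function names, reproduces them.

```python
def register_diff(at_error, at_end):
    """Bits that differ between the two register readings (assumed XOR)."""
    return at_error ^ at_end


def decode_da(value):
    """Split a DA register value into (track, sector); layout is assumed."""
    return (value >> 8) & 0o377, value & 0o377


sr_diff = register_diff(0o51700, 0o11700)    # SR(01) in the sample
da_diff = register_diff(0o15404, 0o15407)    # DA(05) in the sample
track, sector = decode_da(0o15404)           # report shows D. TRK = 33
```

   The track value 33 in the report is octal; it equals 27 decimal,
   which agrees with the SURF: 27. field printed earlier in the entry.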









































   4.4  SUMMARIZE

   SUMMARIZE reads the system event  file  and  summarizes  its  contents
   according to the following categories:

        1.  Event code

        2.  STOPCODE (TOPS-10)

        3.  BUGCHK, BUGHLT, BUGINF (TOPS-20)

        4.  Front-end reloads

        5.  Channel errors

        6.  Disk errors

        7.  Magnetic tape errors

   The SUMMARIZE report also contains Error Distribution  tables.   These
   tables  show  a  24-hour  distribution  of events listed according to
   subsystem.  With these tables, you can determine when a  large  number
   of events occurs.  Once you know the subsystem  (Mainframe,  Disk,
   Tape, and so forth) and the timeframe, you can use RETRIEVE or ANALYZE
   to pinpoint the specific device that is causing the problem.
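
   The idea behind an Error Distribution table, counting events per
   hour and per subsystem, can be sketched in Python as follows.  The
   event list and function name are hypothetical; this is not how
   SUMMARIZE is implemented.

```python
from collections import Counter
from datetime import datetime


def hourly_distribution(events):
    """Count (hour, subsystem) pairs for a list of (timestamp, subsystem)."""
    return Counter((when.hour, subsystem) for when, subsystem in events)


events = [
    (datetime(1984, 3, 14, 1, 22, 13), "Network"),
    (datetime(1984, 3, 14, 6, 5, 0), "Disk"),
    (datetime(1984, 3, 14, 6, 40, 0), "Disk"),
]

table = hourly_distribution(events)
busiest = max(table, key=table.get)   # the hour and subsystem with most events
```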

   After reading the  file,  SUMMARIZE  produces  an  ASCII  report  file
   containing  the  summaries and Error Distribution tables and stores it
   in your disk area (or wherever you specify).  You can then  print  the
   report  on  the  lineprinter  for  inspection.  You can also print the
   report on the terminal by specifying TTY:  to SPEAR's request for  the
   output destination.

   SUMMARIZE allows you to pinpoint the timeframe  of  the  summaries  by
   requesting  a  beginning  date and an ending date to search for in the
   system event file.  In addition, you can also specify  a  binary  file
   created with the RETRIEVE process (RETRIE.SYS) for input.  See Section
   4.3 for information on RETRIEVE.



   4.4.1  The SUMMARIZE Report

|  The following example is representative of a SUMMARIZE report in  that
|  it contains:
|  
|        o  File environment information
|  
|        o  Entry occurrence counts
|  
|        o  System  event  codes,  shown  in  parentheses   under   entry
|           occurrence counts
|  
|        o  Summaries of bugchecks and subsystems
|  
|        o  Error distribution tables
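
   An entry occurrence line has the form "count. name ...(code)", for
   example "36. MASSBUS ERROR ...(111)".  If you want to process such
   lines with a script, the Python sketch below shows one way to do it.
   The pattern is inferred from the sample report that follows, the
   function name is hypothetical, and the code is read as octal because
   event codes are octal.

```python
import re

# Pattern inferred from the sample report; not an official format definition.
COUNT_LINE = re.compile(r"^\s*(\d+)\.\s+(.+?)\s*\.\.\.\((\d+)\)\s*$")


def parse_count_line(line):
    """Return (count, name, event code) or None if the line does not match."""
    match = COUNT_LINE.match(line)
    if match is None:
        return None
    count, name, code = match.groups()
    return int(count), name, int(code, 8)


sample = [
    "     9. SYSTEM RELOAD ...(101)",
    "   496. MONITOR BUG ...(102)",
    "    36. MASSBUS ERROR ...(111)",
    "   120. STATISTICS ...(114)",
    "     8. CONFIGURATION CHANGE ...(115)",
    "   102. FRONT END DEVICE ERROR ...(130)",
    "     1. CPU PARITY INTERRUPT ...(162)",
    "   294. PHASE III DECNET ENTRY ...(240)",
    "    62. HSC50 ERROR LOG ...(243)",
]
parsed = [parse_count_line(line) for line in sample]
total = sum(count for count, _, _ in parsed)
```

   In the sample report the counts add up to the "Number of entries
   processed" figure, which is a quick consistency check.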
|  
|  Note that when a report includes media identification and  the  media
|  name cannot be identified, SUMMARIZE prints one of three placeholders:
|  
|       1.  <unknown> - if SUMMARIZE does not find a mount record in  the
|           error file prior to the time of the error.
|  
|       2.  <none> - if a series of mount and dismount  records  indicate
|           no  medium  was  mounted at the time of the error, such as an
|           error occurring during the mount process.
|  
|       3.  <blank> - if  SUMMARIZE  finds  a  mount   record   but   the
|           medium-name field of the mount record is empty.
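
   The three cases above can be summarized as a small decision
   function.  The Python sketch below is only a restatement of the
   list; the function and parameter names are hypothetical.

```python
def media_name(mount_record_found, medium_mounted, name):
    """Return the media name to print, or one of the three placeholders."""
    if not mount_record_found:
        return "<unknown>"    # no mount record before the time of the error
    if not medium_mounted:
        return "<none>"       # records show nothing mounted at that time
    if not name.strip():
        return "<blank>"      # mount record found, but its name field is empty
    return name


cases = [
    media_name(False, False, ""),
    media_name(True, False, ""),
    media_name(True, True, "   "),
    media_name(True, True, "WORK"),
]
```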
|  
|  Note that the error register codes listed in the report are  described
|  in Section 4.3.3.
|  
|  File Environment
|  
|       SPEAR Version 2(613)
|       Input file:   SERR:ERROR.SYS Created: 12-Mar-84 08:49:00-EST
|       Output file:  DSK:SUMMAR.RPT
|  
|       Selection Criteria: ALL
|  
|       Date of first entry processed: 14-Mar 01:22:13
|       Date of last entry processed:  14-Mar 23:53:38
|  
|       Number of entries processed: 1128.
|       Number of inconsistencies detected in error file: 0.
|  
|  Entry Occurrence Counts:
|  
|       9. SYSTEM RELOAD ...(101)
|     496. MONITOR BUG ...(102)
|      36. MASSBUS ERROR ...(111)
|     120. STATISTICS ...(114)
|       8. CONFIGURATION CHANGE ...(115)
|     102. FRONT END DEVICE ERROR ...(130)
|       1. CPU PARITY INTERRUPT ...(162)
|     294. PHASE III DECNET ENTRY ...(240)
|      62. HSC50 ERROR LOG ...(243)
|  
|  
|  Monitor Detected Errors and Reloads:
|  
|      43. BUGCHK
|       4. BUGHLT
|     449. BUGINF
|  Monitor Error and Reload Breakdown:
|  
|  BUGCHK Breakdown
|       8. FLKTIM
|       2. KLPERR
|      17. MSCORO
|       3. NODDMP
|       5. PI2ERR
|       4. SCACVC
|       4. SCATMO
|  
|  BUGHLT Breakdown
|       1. ILPSEC
|       1. NOTOFN
|       1. SKDPF1
|       1. UNPGF2
|  
|  BUGINF Breakdown
|       8. CFCONN
|       4. KLPCVC
|      29. KLPNUP
|       1. KLPRRQ
|       1. KLPSTR
|      28. MSCAVA
|       2. MSCDSR
|       7. MSCPTG
|     324. NSPBAD
|      29. NSPLAT
|       2. NTOHNG
|       1. SPRZRO
|       1. TM8AEI
|      12. TTYSTP
|  
|  Front-end Summary:
|  
|      10. CD20
|      10. DH11
|      10. DL11C
|      10. DM11
|       1. DM11-3
|       6. KLCPU
|      45. KLERR records forming 5. full entries
|      10. LP20
|  
|  DECnet Phase III Summary:
|  
|    Class.Type    Count    Description
|  
|        0.0         10.    Event records lost
|        0.3          8.    Automatic line service
|        2.0          2.    Local node state change
|        4.0         29.    Aged packet loss
|        4.1        233.    Node unreachable packet loss
|        4.4          1.    Packet format error
|        4.7          6.    Circuit down, circuit fault
|        4.10         5.    Circuit up
|  
|  
|  RH20 Channel/Controller Summary:
|  
|                  Hard    Soft
|       # 1           0.      1.
|       # 2           5.     30.
|  
|  
|  RP07 Summary:
|  
|                  Hard    Soft
|    S/N 2861
|        DP100        0.      1.
|  
|  
|  TM78 Summary:
|  
|                  Hard    Soft
|    S/N 4404
|        MT200        2.      4.
|    S/N 5242
|        MT210        3.     26.
|  
|  
|                     RH20 Breakdown (CONI)
|  
|               PAR       LWC  SWC  CHN  RES       OVR
|               ERR  EXC  ERR  ERR  ERR  ERR  RAE  RUN
|  
|  
|  DP100
|       SOFT          1.             1.
|  
|  MT200
|       HARD          2.
|  MT200
|       SOFT          4.
|  
|  MT210
|       HARD          3.
|  MT210
|       SOFT         26.
|  
|                *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
|                *                                           *
|                *       Disk Subsystem Error Summary        *
|                *                                           *
|                *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
|  
|  
|  Disk Subsystem Error Entries Summarized by Device, then Error Type.
|          Where the Error Types are the following:
|  
|                 OTHER    =   OTHER
|                 TIMIN    =   TIMING
|                 SK-SR    =   SEEK-SEARCH
|                 READ     =   READ-WRITE
|                 CH-CO    =   CHANNEL-CONTROLLER
|                 BUS      =   BUS
|                 SOFT     =   HARDWARE DETECTED SOFTWARE ERROR
|                 MICRO    =   MICROPROCESSOR DETECTED ERROR
|                 UNSAF    =   UNSAFE
|                 WRTLK    =   WRITE LOCK
|                 OFFLI    =   OFFLINE
|  
|  
|  
|            OTHER TIMIN SK-SR READ  CH-CO  BUS  SOFT  MICRO UNSAF WRTLK OFFLI
|            ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- -----
|  
|  DP100
|                                 1.
|  DU-7-14-17
|              36.                            3.
|  DU-7-3-17
|              19.                            3.                1.
|  
|  Read Data Errors further summarized by Drive and Media ID.
|  
|     Drive     Media      Error Totals
|     -------   ------     --------
|  
|      DP100      WORK           1.
|  
|  
|  *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
|  *                                                                   *
|  * This report summarizes all Read Data Errors by Drive and Media ID *
|  *                                                                   *
|  *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
|  
|  
|  
|  
|    DRIVE   MEDIA   CYL TRK SECT  HARD  SOFT RETRIES            LBN
|    -----   -----   --- --- ----  ----  ---- -------            ---
|  
|    DP100    WORK  565.  5.  15.    0.    1.      2.      2,,756704
|  
|  
|  RP07 BREAKDOWN:
|  
|                               Error Register 1
|  
|                 D   U   O   D   W   I   A   H   H   E   W   F   P   R   I   I
|                 C   N   P   T   L   A   O   C   C   C   C   E   A   M   L   L
|                 K   S   I   E   E   E   E   R   E   H   F   R   R   R   R   F
|                                             C
|  
|    S/N 2861
|       DP100  S   1.
|  
|                *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
|                *                                           *
|                *       Tape Subsystem Error Summary        *
|                *                                           *
|                *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
|  
|  
|  Tape Subsystem Error Entries Summarized by Device, then Error Type.
|          Where the Error Types are the following:
|  
|                 OTHER    =   OTHER
|                 READ     =   READ
|                 WRITE    =   WRITE
|                 FORMT    =   DEVICE FORMAT
|                 CH-CO    =   CHANNEL-CONTROLLER
|                 BUS      =   BUS
|                 SOFT     =   HARDWARE DETECTED SOFTWARE ERROR
|                 OPER     =   OPERATOR
|                 OFFLI    =   OFFLINE
|  
|  
|  
|        OTHER READ  WRITE FORMT CH-CO BUS   SOFT  OPER  OFFLI
|        ----- ----- ----- ----- ----- ----- ----- ----- -----
|  
|  MT200                6.
|  
|  MT210               29.
|  
|         *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
|         *                                                         *
|         * SUMMARY of all Errors sorted by Media and Drive by      *
|         * Operation.                                              *
|         *                                                         *
|         *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
|  
|  
|  
|  
|  Operation : WRITE Related
|  
|        MEDIA
|          ID                    UNIT ID
|  
|                    MT200    MT210    TOTAL     
|                   ------   ------   ------
|        unknown  !     6. !    29. !    35.
|  
|      TOTAL      !     6. !    29. !    35.
|  
|  
|  
|  TM78 Breakdown:
|    (Interrupt and Failure Codes are OCTAL)
|            Interrupt      Failure        Hard     Soft
|              Code          Code
|  
|    S/N 4404
|       MT200   22 (WRITE)     7             0.       3.
|       MT200   22 (WRITE)    10             0.       1.
|       MT200   22 (WRITE)    14             2.       0.
|    S/N 5242
|       MT210   22 (WRITE)     1             0.       7.
|       MT210   22 (WRITE)     4             0.      10.
|       MT210   22 (WRITE)     7             0.       1.
|       MT210   22 (WRITE)    10             0.       8.
|       MT210   22 (WRITE)    14             3.       0.
|  
|  
|                            Error distribution
|  
|                   Main-|Disk |Tape |Unit |Comm |Net- |Soft-|Crash|Totals
|     14-Mar-84     frame|     |     |rec  |     |work |ware |     |
|                   -----+-----+-----+-----+-----+-----+-----+-----+-----
|    1:00 -  2:00        |     |     |     |     |   6.|     |   5.|  11.
|    6:00 -  7:00        |   7.|     |     |     |   6.|     |  12.|  25.
|    8:00 -  9:00     19.|  35.|     |     |     |  13.|     |  64.| 133.
|    9:00 - 10:00        |  20.|     |     |     |   5.|     |  31.|  56.
|   10:00 - 11:00      9.|     |     |     |     |  10.|     |   7.|  28.
|   11:00 - 12:00      9.|     |     |     |     |   6.|     |   6.|  22.
|   12:00 - 13:00        |     |     |     |     |     |     |   3.|   3.
|   13:00 - 14:00        |     |     |     |     |   1.|     |   9.|  10.
|   14:00 - 15:00        |     |     |     |     |   3.|     |   7.|  10.
|   15:00 - 16:00      9.|     |   4.|     |     |  27.|     |  45.|  86.
|   16:00 - 17:00        |     |     |     |     |  91.|     |  76.| 167.
|   17:00 - 18:00        |     |     |     |     |  19.|     |   6.|  25.
|   18:00 - 19:00        |     |   2.|     |     |  22.|     |  38.|  62.
|   19:00 - 20:00        |     |  11.|     |   1.|  17.|     |  43.|  72.
|   20:00 - 21:00        |   1.|   8.|     |     |  21.|     |  39.|  69.
|   21:00 - 22:00        |     |   4.|     |     |  19.|     |  38.|  61.
|   22:00 - 23:00        |     |   2.|     |     |  12.|     |  38.|  52.
|   23:00 -  0:00        |     |   4.|     |     |  16.|     |  38.|  58.
|                   -----+-----+-----+-----+-----+-----+-----+-----+-----
|      Totals         46.|  63.|  35.|     |   1.| 294.|     | 505.| 950.
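
   As an illustration only, the following Python sketch shows one way a
   site-written script might read a single hourly row of the Error
   distribution table.  The function names, the column labels, and the
   assumption that cells are separated by "|" exactly as printed above
   (without the change bar in the left margin of this manual) are choices
   made for this example; they are not part of SPEAR.

```python
import re

# Column labels as they appear in the table heading above.
COLUMNS = ["Mainframe", "Disk", "Tape", "Unit rec", "Comm",
           "Network", "Software", "Crash"]

# An hourly row starts with a label such as " 8:00 -  9:00".
ROW = re.compile(r"^\s*(\d{1,2}:\d{2})\s*-\s*(\d{1,2}:\d{2})(.*)$")


def to_count(cell):
    """Convert a cell such as '  35.' to 35; an empty cell counts as 0."""
    text = cell.strip().rstrip(".")
    return int(text) if text else 0


def parse_distribution_row(line):
    """Return (start, end, counts_by_column, printed_total) for one row."""
    match = ROW.match(line)
    if match is None:
        raise ValueError("not an hourly row: %r" % line)
    start, end, rest = match.groups()
    values = [to_count(cell) for cell in rest.split("|")]
    if len(values) != len(COLUMNS) + 1:
        raise ValueError("unexpected number of cells: %r" % line)
    # The last cell is the Totals column; the others map to COLUMNS.
    return start, end, dict(zip(COLUMNS, values)), values[-1]


row = "  20:00 - 21:00        |   1.|   8.|     |     |  21.|     |  39.|  69."
start, end, counts, total = parse_distribution_row(row)
```

   The sketch returns the printed total rather than recomputing it.  In
   the sample above, a few rows show a printed total that is slightly
   larger than the sum of the visible cells (the 8:00 - 9:00 row lists
   cells that add up to 131 against a printed total of 133), so a script
   should not assume the two always agree.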
|  
|  Because of the addition of the CI and HSC50, the SUMMARIZE report uses
|  an additional format for listing the names  of  disks.   In  the
|  previous report, for example, you will find the following names:
|  
|       DU-7-14-17
|       DU-7-3-17
|  
|  Reading from left to right, the four fields of each name represent the
|  following:
|  
|       Field one           Device type  DU = RA80, RA81
|                                             DJ = RA60
|                                             ??  = unknown
|  
|       Field two           RH slot number for the CI20.  This is  always
|                           number 7.
|  
|       Field three         HSC50 node number on the CI.
|  
|       Field four          Drive number on  the  push  button.   If  the
|                           HSC50 cannot get this number, the number 4095
|                           appears in this field.
|  
|  Note that Appendix E contains a description of  the  Disk  Subsystem
|  Error Bits.
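
   The four-field disk name described above can be taken apart
   mechanically.  The following Python sketch is one possible way to do
   so; the names parse_ci_disk_name, CiDiskName, and DRIVE_UNKNOWN are
   hypothetical, and the sketch assumes that the three numeric fields are
   printed in decimal, which this manual does not state explicitly.

```python
from typing import NamedTuple, Optional

# Device-type prefixes listed under "Field one" above.
DEVICE_TYPES = {"DU": "RA80 or RA81", "DJ": "RA60", "??": "unknown"}

# Value that appears when the HSC50 cannot get the drive number.
DRIVE_UNKNOWN = 4095


class CiDiskName(NamedTuple):
    device_type: str        # field one, for example "DU"
    rh_slot: int            # field two, always 7 for the CI20
    hsc_node: int           # field three, HSC50 node number on the CI
    drive: Optional[int]    # field four, None when reported as 4095


def parse_ci_disk_name(name):
    """Split a name such as 'DU-7-14-17' into its four fields."""
    fields = name.strip().split("-")
    if len(fields) != 4:
        raise ValueError("expected four fields: %r" % name)
    prefix, slot, node, drive = fields
    if prefix not in DEVICE_TYPES:
        raise ValueError("unknown device type: %r" % prefix)
    # Assumption: the numeric fields are decimal.
    drive_number = int(drive)
    return CiDiskName(
        device_type=prefix,
        rh_slot=int(slot),
        hsc_node=int(node),
        drive=None if drive_number == DRIVE_UNKNOWN else drive_number,
    )


example = parse_ci_disk_name("DU-7-14-17")
```

   Returning None for a drive number of 4095 keeps "number not
   available" distinct from a real drive number in later processing.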



   4.4.2  Error Register Codes

   The following tables contain brief explanations of  the  abbreviations
   of  the  error  register  codes  (MASSBUS disk registers for RP04s and
   RP06s and tape registers for TU45s, TU77s, and TE16s).
|  
|  
|  Table 4-5:  MASSBUS Disk Registers
|  
|  
                              Error Register 1


|          Code         Meaning


           DCK          Data Check
           UNS          Unsafe
           OPI          Operation Incomplete
           DTE          Drive Timing Error
           WLE          Write Lock Error
           IAE          Invalid Address Error
           AOE          Address Overflow Error
           HCRC         Header CRC Error
           HCE          Header Compare Error
           ECH          ECC Hard Error
           WCF          Write Clock Fail 
           FER          Format Error
           PAR          Parity Error
           RMR          Register Modification Refused
           ILR          Illegal Register
           ILF          Illegal Function


                                 Error Register 2


|          Code         Meaning


           ACU          RP04 - AC Unsafe
                        RP06 - Unused
           PLU          Phase Locked Oscillator Unsafe
           30VU         RP04 - 30 Volts Unsafe
                        RP06 - Unused
           IXE          Index Error
           NHS          No Head Select
           MHS          Multiple Head Select
           WRU          Write Ready Unsafe
           FEN          RP04 - Failsafe Enabled
           ABS          RP06 - Abnormal Stop
           TUF          Transition Unsafe
           TDF          Transition Detector Failure
           MSE          RP04 - Motor Sequence Error
           R&W          RP06 - Read and Write
           CSU          Current Switch Unsafe
           WSU          Write Select Unsafe
           CSF          Current Sink Failure
           WCU          Write Current Unsafe


                                 Error Register 3


|          Code         Meaning


           OCYL         Off Cylinder
           SKI          Seek Incomplete
           OPE          RP04 - Unused
                        RP06 - Operator Plug Error
           ACL          AC Voltage Unsafe
           DCL          DC Voltage Unsafe
           DIS          RP04 - Unused
           35V          35 Volts Unsafe
           UWR          RP04 - Any Unsafe Except Read/Write
                        RP06 - Unused
           VUF          RP04 - Velocity Unsafe
           WOF          RP06 - Write and Unsafe
           PSU          RP04 - Pack Speed Unsafe
           DCU          RP06 - DC Voltage Unsafe
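
   When a raw register value is available, the Error Register 1 codes
   can be looked up programmatically.  The Python sketch below is an
   illustration under a stated assumption: it treats the order of the
   Error Register 1 table as running from bit 15 (DCK) down to bit 0
   (ILF) of a 16-bit register.  This manual does not give bit positions,
   so verify the layout against the drive documentation before relying
   on it.  The names ER1_CODES and decode_er1 are hypothetical.

```python
# Codes and meanings copied from the Error Register 1 table above.
# Assumption: the table order corresponds to bit 15 down to bit 0.
ER1_CODES = [
    ("DCK", "Data Check"),
    ("UNS", "Unsafe"),
    ("OPI", "Operation Incomplete"),
    ("DTE", "Drive Timing Error"),
    ("WLE", "Write Lock Error"),
    ("IAE", "Invalid Address Error"),
    ("AOE", "Address Overflow Error"),
    ("HCRC", "Header CRC Error"),
    ("HCE", "Header Compare Error"),
    ("ECH", "ECC Hard Error"),
    ("WCF", "Write Clock Fail"),
    ("FER", "Format Error"),
    ("PAR", "Parity Error"),
    ("RMR", "Register Modification Refused"),
    ("ILR", "Illegal Register"),
    ("ILF", "Illegal Function"),
]


def decode_er1(value):
    """Return the (code, meaning) pairs whose bits are set in value."""
    if not 0 <= value < (1 << 16):
        raise ValueError("Error Register 1 is treated as 16 bits wide")
    top_bit = len(ER1_CODES) - 1
    return [
        pair
        for index, pair in enumerate(ER1_CODES)
        if value & (1 << (top_bit - index))
    ]


# Octal is used here because register contents are commonly shown in octal.
codes = [code for code, _meaning in decode_er1(0o100400)]
```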


|  
|  
|  Table 4-6:  Tape Registers
|  
|  
|          Code         Meaning


           COR/CRC      PE - Correctable Data Error
                        NRZI - CRC Does Not Match Computed CRCC
           UNS          Unsafe
           OPI          Operation Incomplete
           DTE          Drive Timing Error
           NEF          Nonexecutable Function
           CS/ITM       PE - Correctable Skew
                        NRZI - Illegal Tape Mark
           FCE          Frame Count Error
           NSG          Nonstandard Gap Tape Character
           PEF/LRC      PE - Format Error
                        NRZI - Longitudinal Redundancy Check
           INC/VPE      PE - Noncorrectable Data Error
                        NRZI - Vertical Parity Error
           DPA          Data Bus Parity Error
           FMT          Format Error
           PAR          Control Bus Parity
           RMR          Register Modification Refused
           ILR          Illegal Register
           ILF          Illegal Function
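
   Several tape codes in Table 4-6 have one meaning for phase-encoded
   (PE) tapes and another for NRZI tapes.  The following Python sketch
   shows one way to hold that distinction in a lookup table.  Only the
   dual-meaning codes and two single-meaning codes are included to keep
   the example short; the names TAPE_CODES and tape_code_meaning are
   hypothetical.

```python
# Meanings copied from Table 4-6.  A dict value means the meaning
# depends on the recording mode; a string means it does not.
TAPE_CODES = {
    "COR/CRC": {"PE": "Correctable Data Error",
                "NRZI": "CRC Does Not Match Computed CRCC"},
    "CS/ITM": {"PE": "Correctable Skew",
               "NRZI": "Illegal Tape Mark"},
    "PEF/LRC": {"PE": "Format Error",
                "NRZI": "Longitudinal Redundancy Check"},
    "INC/VPE": {"PE": "Noncorrectable Data Error",
                "NRZI": "Vertical Parity Error"},
    "FCE": "Frame Count Error",
    "DPA": "Data Bus Parity Error",
}


def tape_code_meaning(code, mode):
    """Return the meaning of a tape code for mode 'PE' or 'NRZI'."""
    if mode not in ("PE", "NRZI"):
        raise ValueError("mode must be 'PE' or 'NRZI'")
    meaning = TAPE_CODES[code]
    return meaning[mode] if isinstance(meaning, dict) else meaning


pe_meaning = tape_code_meaning("CS/ITM", "PE")
nrzi_meaning = tape_code_meaning("CS/ITM", "NRZI")
```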



   4.4.3  SUMMARIZE Procedure

   SUMMARIZE prompts with one or more of the following guidewords:

   SUMMARIZE Mode
   ______________

        Event file (SERR:ERROR.SYS):

|       Category (ALL):

        Time from (EARLIEST):

        Time to (LATEST):

|       Show Error Distribution (YES):

        Report to (DSK:SUMMAR.RPT):

|  If you want to take all the defaults, type S/G to the  SPEAR>  prompt;
   otherwise, read the following procedure:

   STEP 1

|  After you type SUMMARIZE to the SPEAR> prompt, SUMMARIZE requests  the
   name of the input file:

        Event file (SERR:ERROR.SYS):       TOPS-20

                    or

        Event file (SYS:ERROR.SYS):        TOPS-10

   Type one of the following:

        1.  The RETURN key - to take the default, the system event file.

        2.  The name of a file you have previously RETRIEVEd,  in  binary
            format, for example RETRIE.SYS.

        3.  Any file in binary format containing events from  the  system
            event file.

|  STEP 2
|  
|  SUMMARIZE asks for the category  of  the  summary  in  which  you  are
|  interested:
|  
|       Category (ALL):
|  
|  Type one of the following:
|  
|       1.  The RETURN key  or  A[LL]  -  to  take  the  default  of  all
|           categories.
|  
|       2.  M[AINFRAME] - to select a summary for mainframe events.
|  
|       3.  D[ISK] - to select a summary for disk devices.

|       4.  T[APE] - to select a summary of tape devices.
|  
|       5.  CI - to select a summary of CI-related events.
|  
|       6.  NI - to select a summary of NI-related events.
|  
|       7.  U[NITRECORD] - to select a summary of hard-copy devices.
|  
|       8.  NE[TWORK] - to select a summary of network-related events.
|  
|       9.  O[PERATING-SYSTEM] - to select a summary of  software-related
|           events.
|  
|      10.  CO[MM] - to select a summary of communication devices.
|  
|      11.  P[ACKID] - to select a summary of specific disk packs.
|  
|      12.  R[EELID] - to select a summary of specific tape reels.
|  
|  All categories except for COMM  and  NI  prompt  for  specific  device
|  types.  Table 4-7 lists the subprompts you can expect.
|  
|  
|  Table 4-7:  Subprompts for Device Types
|  
|  
|    Device Type       Subprompt
|  
|    MAINFRAME         Mainframe devices (ALL):
|    DISK              Disk drives (ALL):
|    TAPE              Tape drives (ALL):
|    CI                CI controllers (ALL):
|    UNITRECORD        Unit record devices (ALL):
|    NETWORK           Event class and type (ALL):
|    OPERATING-SYSTEM  Operating System codes (ALL):
|    PACKID            Disk (structure IDs):
|    REELID            Tape (reel IDs):
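
   In the category list of STEP 2, the bracket notation marks the part
   of each keyword that may be omitted: NE[TWORK] means that NE is the
   shortest accepted form.  The Python sketch below shows how such a
   notation can be interpreted.  It is an illustration only and may not
   match SPEAR's own command parser in every case; the function names
   are hypothetical.

```python
# Keyword specifications as listed in STEP 2.
CATEGORY_SPECS = [
    "A[LL]", "M[AINFRAME]", "D[ISK]", "T[APE]", "CI", "NI",
    "U[NITRECORD]", "NE[TWORK]", "O[PERATING-SYSTEM]", "CO[MM]",
    "P[ACKID]", "R[EELID]",
]


def split_spec(spec):
    """Return (required_prefix, full_keyword) for a spec like 'NE[TWORK]'."""
    required, _bracket, optional = spec.partition("[")
    return required, required + optional.rstrip("]")


def match_category(text):
    """Return the full keyword selected by text, or None if none matches.

    An empty response stands for the RETURN key, which takes the
    default of ALL.
    """
    typed = text.strip().upper()
    if not typed:
        return "ALL"
    for spec in CATEGORY_SPECS:
        required, full = split_spec(spec)
        # The response must contain at least the required prefix and
        # must not contradict the rest of the keyword.
        if len(typed) >= len(required) and full.startswith(typed):
            return full
    return None


selected = [match_category(word) for word in ("main", "disk", "ne", "co")]
```

   A response of N alone is rejected by this sketch because both NI and
   NE[TWORK] require two characters.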
|  
|  STEP 3
|  
|  SUMMARIZE keeps prompting you for categories  until  you  either  type
|  FINISHED or press the RETURN key:
|  
|       Next Category (FINISHED):
|  
|  Type one of the following:
|  
|       1.  The RETURN key or F[INISHED] - to take the default.
|  
|       2.  Another category.
|  
|  STEP 4

   After you have finished specifying categories,  SUMMARIZE  prompts you
   for the date and time at which you want the summary to begin:

        Time from (EARLIEST):

   Type one of the following:

        1.  The RETURN key - to take  the  default  EARLIEST,  the  first
            event in the file.

        2.  A date and time in the format dd-mmm-yy hh:mm:ss - to signify
            where to begin extracting entries.  A date by itself defaults
            to one second after midnight.

        3.  A relative date in the format -nn - to indicate  a  reference
            point  prior  to  the  current  date.  For example, -7 causes
            SUMMARIZE to begin extracting entries seven days prior to the
            current day.
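
   The three kinds of response described above can be summarized in a
   short Python sketch.  It is an illustration, not SPEAR's own logic:
   the function name parse_time_from is hypothetical, the current date
   and time are passed in explicitly, and the treatment of the time of
   day for the -nn form (the sketch simply subtracts whole days) is an
   assumption that this manual does not confirm.

```python
from datetime import datetime, timedelta


def parse_time_from(text, now):
    """Interpret a response to the "Time from" prompt.

    Returns None for the default (EARLIEST), otherwise a datetime.
    """
    text = text.strip()
    if not text or text.upper() == "EARLIEST":
        return None
    if text.startswith("-"):
        # Relative form: -nn means nn days before the current date.
        return now - timedelta(days=int(text[1:]))
    if " " in text:
        # Full form: dd-mmm-yy hh:mm:ss
        return datetime.strptime(text, "%d-%b-%y %H:%M:%S")
    # A date by itself defaults to one second after midnight.
    return datetime.strptime(text, "%d-%b-%y").replace(second=1)


now = datetime(1984, 3, 14, 12, 0, 0)
week_ago = parse_time_from("-7", now)
date_only = parse_time_from("14-Mar-84", now)
```

   Note that %b matches month abbreviations in the current locale, and
   that Python maps the two-digit years 69 through 99 to 1969 through
   1999, which suits event files of this period.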

|  STEP 5

   SUMMARIZE then prompts for the end of the time period:

        Time to (LATEST):

   Type one of the following:

        1.  The RETURN key - to take the default LATEST, the  last  entry
            in the system event file.

        2.  A date and  time  in  the  format  dd-mmm-yy  hh:mm:ss  -  to
            indicate  the  last  date  for  extracted entries.  A date by
            itself defaults to one second after midnight.

        3.  A relative date in the format -nn - to indicate  a  reference
            point  prior  to  the  current date.  For example, -13 causes
            SUMMARIZE to stop extracting entries recorded  thirteen  days
            before the current date.

|  STEP 6
|  
|  After specifying a timeframe, you can choose whether or not to receive
|  the error distribution tables:
|  
|       Show Error Distribution (YES):
|  
|  Type one of the following:
|  
|       1.  The RETURN key or Y[ES] - to take  the  default.   This  will
|           give  you  all  the error distribution charts relevant to the
|           time constraints you specify.
|  
|       2.  N[O] - to suppress the error  distribution  charts  from  the
|           report.
|  
|  STEP 7

   The last thing SUMMARIZE asks for is the destination of the output:

        Report to (DSK:SUMMAR.RPT):

   Type one of the following:

        1.  The RETURN key - to take the default DSK:SUMMAR.RPT.

        2.  Any file name in the proper format.

        3.  TTY:  - to have the report printed on  your  terminal.   Note
            that if you specify TTY:, SUMMARIZE does not save the file in
            your disk area.

   After you select the output destination and press RETURN,  SPEAR  asks
   you to confirm your decision.

|       Type <cr> to confirm (/GO):

   At this point you can:

        1.  Press RETURN or type /GO to execute the SUMMARIZE process.

        2.  Type /SHOW to list the parameters you have chosen.

        3.  Type /REVERSE to return to the previous prompt.

        4.  Type /BREAK to return to SPEAR level.

        5.  Type question mark (?), HELP, the question mark switch  (/?),
            or /HELP to find out what your options are.

   To read the SUMMARIZE report, you can list the file on the lineprinter
   by doing the following:

        Return to operating system command level by typing  EXIT  to  the
        SPEAR> prompt.

        Use  the  PRINT  command  with  any  options  available  on  your
        operating system.

   Note that if you specified TTY:  to the Report to:  prompt,  you  will
   not have a file saved in your area to print.



   4.4.4  Sample SUMMARIZE Session

   The following is a sample of a  SUMMARIZE  session  using  the  system
   event file for input:

|  @spear
|  
|  Welcome to SPEAR for TOPS-20. Version 2(605)
|  Type "?" for help.
|  
|  SPEAR> summarize
|  
|  SUMMARIZE mode
|  --------------
|      Event file (SERR:ERROR.SYS): 
|  
|          Category (ALL): main
|  
|              Mainframe devices (ALL): cpu
|  
|          Next Category (FINISHED): disk
|  
|              Disk drives (ALL): rp07
|  
|          Next Category (FINISHED): 
|  
|      Time from (EARLIEST): 
|  
|      Time to (LATEST): 
|  
|      Show Error Distribution (YES): no
|  
|      Report to (DSK:SUMMAR.RPT): 
|  
|  Type <cr> to confirm (/GO): 
|  INFO - Summarizing ST:GIDNEY.02-27
|  INFO - Now sending summary to DSK:SUMMAR.RPT
|  INFO - Summary output finished
|  
|  
|  SPEAR> ex



   4.5  TOPS-20 KLSTAT MODE

   On TOPS-20, there is an additional troubleshooting  aid  that  can  be
   helpful  if severe intermittent faults do not leave enough information
   in the system event file.  This feature is the KLSTAT mode.  When  you
   turn  KLSTAT on, you are actually turning on a monitor flag that tells
   the monitor to record additional information  into  the  system  event
   file when any CPU, memory, or MASSBUS errors occur.

   Note that turning on this flag causes severe system  degradation  (the
   system goes down while KLSTAT is collecting data), so you should  turn
   it on only when absolutely necessary.  In fact, you must have  special
   privileges to turn it on or off.

   When the KLSTAT mode is in  operation,  the  system  event  file  will
   contain  KL  CPU STATUS BLOCK entries.  For a sample of such an entry,
   turn to Section 5.3.12.  For the KLSTAT procedure, read the  following
   section, Section 4.5.1.



   4.5.1  KLSTAT Procedure

   The KLSTAT mode  has  three  functions:   ON,  OFF,  and  CHECK.   The
   following procedure describes their use:

   STEP 1

   First,  enable  your  special  privileges  at  monitor  level,  either
   OPERATOR  or  WHEEL privileges.  Then access SPEAR.  (Note, you do not
   need privileges to CHECK the status of KLSTAT.)

   STEP 2

   Once at the SPEAR prompt, type K[LSTAT]:

        SPEAR>KLSTAT

   SPEAR responds with:

        SPEAR>KLSTAT

        KLSTAT mode
        ___________

        Extra reporting (CHECK):

   STEP 3

   At this point, type one of the three options.  Pressing the Escape key
   gets  you  the  default,  CHECK.   If  you  type ON, you will get this
   message:

        The following should be noted before proceeding!
        This function can cause SEVERE system degradation!

   If you decide not to risk it, type /R to return to the SPEAR prompt.

   STEP 4

   If you respond with one of the three choices, SPEAR prompts with:

|       Type <cr> to confirm (/GO):

   If you chose ON or OFF, SPEAR returns you to the SPEAR prompt.  If you
   chose CHECK, the default, SPEAR prints one of the following:

        (KLSTAT) Extra error reporting is currently enabled.

                                or

        (KLSTAT) Extra error reporting is currently disabled.

   To examine the information gathered while KLSTAT mode  was  on,  look
   for the KL CPU STATUS BLOCK entries in the system  event  file.   See
   Section 5.3.12.



                                 CHAPTER 5

                             ENTRY DESCRIPTIONS



   5.1  INTRODUCTION

   This chapter provides a sample of most  of  the  events  that  can  be
   recorded  in  the system event file.  These samples appear just as you
   see them when you use RETRIEVE to translate  entries  from  binary  to
   ASCII.   Although  the  entries  may  differ in format, they each have
   sections in common, some more than others depending on  the  operating
   system  involved.   Each entry may contain from one to six sections of
   information:

        Section 1  Entry Description
        Section 2  Unit Identification
        Section 3  Software Status
        Section 4  Controller Status
        Section 5  Device or Unit Status
        Section 6  Statistical Information

   Every entry has at least a Section 1, Entry Description.  This section
   contains:

        1.  Type of entry and/or type of error

        2.  Date and time at which the entry was logged

        3.  Monitor uptime

        4.  System serial number

   Entries may also contain Sections 2 through 6.   Section  2  contains
   the following information:

        1.  Unit logical name

        2.  Unit physical name

        3.  Unit type

        4.  Media identification

   Section 3 contains the following:

        1.  Highest process requesting service (user)

        2.  Lowest process requesting service (author)

        3.  User/process  identification  (user  identification,  program
            name, file name, program location in memory, and so forth)

        4.  Pertinent system registers (processor flags, program counter,
            and so forth) before and/or after error as applicable

        5.  Disposition of event (retry  count,  recovered  or  not,  the
            point in the retry algorithm where recovery was effected, and
            so forth)

        6.  Other I/O activity at error time

   Section 4 contains the following:

        1.  Controller name and/or address

        2.  Controller type

        3.  Name  and  value  of  all  information  available  from   the
            controller

   Section 5 contains the following:

        1.  Name and value of all status information available  from  the
            unit

        2.  Function that was active at error time

        3.  Logical and physical address of the unit before error

        4.  Logical and physical address of the unit at error

        5.  Transfer  size  and  starting  memory  location  of  I/O   if
            applicable

   Section 6 contains unit activity since start-up.

   The default radix in these entries is decimal; however,  some  entries
   may have numbers displayed in octal or binary.



   5.2  TOPS-10 ENTRIES

   The following sections list both the FULL and SHORT  versions  of  the
   entries that TOPS-10 can record in its system event file.



   5.2.1  System Reload

   The monitor generates a System Reload entry into the system event file
   whenever  it  is  loaded.   Note  that  HALT,  STOP,  and CPU stopcode
   information is also recorded in this entry, if applicable.

                                    FULL

   ***********************************************
   SYSTEM RELOAD              
    LOGGED ON  5-Aug-80 AT  0:16:39      MONITOR UPTIME WAS  0:00:38
           DETECTED ON SYSTEM # 1026.
           RECORD SEQUENCE NUMBER: 190.
   ***********************************************
   CONFIGURATION INFORMATION
           SYSTEM  NAME:               RZ064A KL #1026/1042
           MONITOR BUILT ON:       07-23-80
           CPU SERIAL #:           1026.
           STATES WORD:            771165,0
           MONITOR VERSION  %701(0)

   RELOAD BREAKDOWN
           CAUSE:                  SCHED
           COMMENTS                ;PUT 1
   MEMORY ON-LINE AT RELOAD:
   FROM:  0 P  TO:  2048 P

                                   SHORT

   SEQ    TIME     5-Aug-80

   190.  0:16:39 RELOAD OF     RZ064A KL #1026/1042 VERSION (70100)
                       BUILT ON 07-23-80 REASON SCHED
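
   Every entry begins with the same LOGGED ON line, so a script that
   post-processes RETRIEVE output can extract the logging time and the
   monitor uptime in one step.  The Python sketch below is illustrative
   only: the name parse_entry_header is hypothetical, and the pattern
   assumes the line is laid out exactly as in the sample above, with the
   uptime given as hours, minutes, and seconds.

```python
import re
from datetime import datetime, timedelta

# Matches the second line of an entry header, as shown in the sample.
HEADER = re.compile(
    r"LOGGED ON\s+(\d{1,2}-[A-Za-z]{3}-\d{2})\s+AT\s+(\d{1,2}:\d{2}:\d{2})"
    r"\s+MONITOR UPTIME WAS\s+(\d+):(\d{2}):(\d{2})"
)


def parse_entry_header(line):
    """Return (logged_at, uptime) from a LOGGED ON header line."""
    match = HEADER.search(line)
    if match is None:
        raise ValueError("not a LOGGED ON line: %r" % line)
    date_text, time_text, hours, minutes, seconds = match.groups()
    logged_at = datetime.strptime(
        date_text + " " + time_text, "%d-%b-%y %H:%M:%S"
    )
    uptime = timedelta(
        hours=int(hours), minutes=int(minutes), seconds=int(seconds)
    )
    return logged_at, uptime


line = " LOGGED ON  5-Aug-80 AT  0:16:39      MONITOR UPTIME WAS  0:00:38"
logged_at, uptime = parse_entry_header(line)

# Subtracting the uptime gives the approximate time the monitor started.
monitor_started = logged_at - uptime
```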



   5.2.2  Non-Reload Monitor Error

   Each time a JOB or DEBUG stopcode  occurs,  the  monitor  records  the
   information  as  a  Non-Reload Monitor Error in the system event file.
   The JOB stopcode endangers the integrity of the job currently running;
   therefore,  the  monitor  aborts  the  current job, then continues.  A
   DEBUG stopcode is not immediately harmful to any job  or  the  system;
   therefore,  the  monitor prints the stopcode message on the operator's
   terminal (CTY) and then continues processing.

                                    FULL

   ***********************************************
   NON-RELOAD MONITOR ERROR   
    LOGGED ON  5-Aug-80 AT 10:51:49      MONITOR UPTIME WAS  2:26:26
           DETECTED ON SYSTEM # 1042.
           RECORD SEQUENCE NUMBER: 863.
   ***********************************************
           SYSTEM NAME:        RZ64C  KL #1026/1042
           SYSTEM SERIAL #:        1026.
           MONITOR DATE:   07-23-80
           MONITOR VERSION  %701(0)
           STOPCD NAME:    BAZ
           RESULT:         
                   JOB #:  6.
                   USER'S ID:      [1,2]
                   TTY NAME:          470
                   PROGRAM NAME:   ACTDAE

           CONTENTS OF AC'S AT STOPCD:

       0:  20,0
       1:  777642,377507
       2:  0,100
       3:  5777,371000
       4:  526200,340000
       5:  664145,663167
       6:  440004,0
       7:  0,50
      10:  0,0
      11:  0,505273
      12:  0,250255
      13:  47040,1
      14:  0,1
      15:  0,1
      16:  0,4
      17:  0,146

           PI STATUS:      440004,0

                                   SHORT

   SEQ    TIME     5-Aug-80

   863. 10:51:49 STOPCD    BAZ ON CPU SERIAL # 1026 FOR JOB # 6 ON    470
                       USER WAS [1,2] RUNNING ACTDAE
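
   Values such as 777642,377507 in the AC listing above are printed as
   two numbers separated by a comma.  The Python sketch below assumes
   that each pair is the left and right 18-bit halves of a 36-bit word,
   written in octal, which is the usual PDP-10 convention; this chapter
   does not spell the convention out, so treat it as an assumption.  The
   function name parse_halfword_pair is hypothetical.

```python
HALF_BITS = 18
HALF_LIMIT = 1 << HALF_BITS   # each half holds six octal digits


def parse_halfword_pair(text):
    """Combine 'left,right' (or 'left,,right') octal halves into one word."""
    parts = text.strip().replace(",,", ",").split(",")
    if len(parts) != 2:
        raise ValueError("expected two halves: %r" % text)
    left, right = (int(part, 8) for part in parts)
    if not (0 <= left < HALF_LIMIT and 0 <= right < HALF_LIMIT):
        raise ValueError("half does not fit in 18 bits: %r" % text)
    return (left << HALF_BITS) | right


ac1 = parse_halfword_pair("777642,377507")
ac2 = parse_halfword_pair("0,100")
```

   The same function accepts the double-comma form used elsewhere in
   SPEAR reports, for example the LBN 2,,756704 in the SUMMARIZE sample.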



   5.2.3  Crash Extract

   A Crash Extract becomes a part of the system event file  whenever  the
   program  DAEMON  starts.   When  DAEMON  starts,  it checks the system
   search list for a CRASH.EXE file.  If it finds one,  it  extracts  the
   information and appends it to the system event file.

                                    NOTE

           It is strongly recommended that, each time the monitor
           is  started,  you  save  a dump as a CRASH.EXE file so
           that DAEMON/SPEAR can provide a  complete  picture  of
           system  activity.   You  can  do  this  by saving each
           monitor core image (dumping the crash) after each run;
           that  is,  before  PM  or CM periods, before scheduled
           reloads, after stand-alone periods, and so forth.   To
           save the core image, use the /D command to MONBTS.

   Because DAEMON extracted the information from a saved crash, the  date
   and  time  and  the  monitor  uptime in the header are the last values
   recorded by the monitor before the crash.



   5.2.4  Data Channel Error

   When a channel detects an error or a device  connected  to  a  channel
   detects  an  error  during  a  data  transfer, the monitor logs a Data
   Channel Error into the system event file.  The entry is  made  at  the
   time  of  first  error; thus, the entry can be a soft or a hard error.
   Because the monitor programs the channel to stop when it encounters an
   error   (except   on  the  last  retry),  this  entry  gives  valuable
   information about the word in error and its address,  whether  or  not
   the error was detected by the channel.

   The Data Channel Error is generated only for DF10 data channels and is
   not generated for devices using the KL10 internal channels (RH20).

                                    FULL

   ***********************************************
   DATA CHANNEL ERROR         
    LOGGED ON  1-Oct-80 AT  9:03:12      MONITOR UPTIME WAS  1:02:10
           DETECTED ON SYSTEM # 1026.
           RECORD SEQUENCE NUMBER: 3122.
   ***********************************************
   DATA CHANNEL ERROR TOTALS
           NXM'S AND OVERRUNS:     1.
           MEM PE SEEN BY CHANNEL: 0.
           CONTROLLER DATA PE
           OR CCW TERM CHK FAILS:  0.

   CHANNEL COMMAND LIST BREAKDOWN
           DEVICE  USING CHANNEL:  RPA5
           INITIAL CONTROL WORD:   0,454
           TERMINATION WD WRITTEN: 11323,313216
           EXPECTED TERM. WORD:    11323,313413
           CHANNEL COMMAND LIST:   0,454
                                   774003,313213
                                   0,0
           3RD FROM LAST DATA WORD:0,0
           2ND FROM LAST DATA WORD:0,0
           LAST DATA WORD XFERRED: 0,0

                                   SHORT

   SEQ    TIME     1-Oct-80

   3122.  9:03:12 RPA5 CHANNEL ERROR COUNTS: NXM/MPE/DPE 1/0/0
                       WRITTEN TERM WD = 11323,313216
                       EXPECTED TERM WD = 11323,313413



   5.2.5  DAEMON Started

   The monitor logs this entry into  the  system  event  file  each  time
   DAEMON  is  started,  either  after  a  system  reload or a restart of
   DAEMON.  If DAEMON is modified  at  the  site,  the  customer  version
   number should be edited to track the modifications.

                                    FULL

   ***********************************************
   DAEMON STARTED             
    LOGGED ON  5-Aug-80 AT  0:16:30      MONITOR UPTIME WAS  0:00:28
           DETECTED ON SYSTEM # 1026.
           RECORD SEQUENCE NUMBER: 184.
   ***********************************************
           DAEMON VERSION  20(757)

                                   SHORT

   SEQ    TIME     5-Aug-80

   184.  0:16:30 DAEMON STARTED--VERSION 20(757)



   5.2.6  MASSBUS Disk Error

   Any time the monitor detects an error in any portion  of  the  MASSBUS
   system  (either hardware or software), DAEMON is called to collect and
   record all pertinent hardware and software information in  the  system
   event file.

   In this entry, the MEDIA ID is  the  value  given  to  the  disk  when
   structured  with ONCE or TWICE.  The STR ID is the logical name of the
   media such as DSKB0.  Both are recorded in the HOME  block.   The  LBN
   (logical  block  number)  is  the  location  of the first block in the
   transfer.  If LBN n,  n+1,  n+2,  and  n+3  were  transferred,  it  is
   possible  that LBN n, n+1, and n+2 are all right, but LBN n+3 is bad.
   This value is broken into either the cylinder #, surface #, and sector
   # (for disks) or the track # and sector # (for RS04s) to determine the
   physical location of the failure.
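
   The breakdown of an LBN into cylinder, surface, and sector is plain
   integer arithmetic once the drive geometry is known.  The Python
   sketch below shows the calculation under simplifying assumptions: one
   logical block per sector, numbering from zero, and no spare or
   skipped sectors.  The function name lbn_to_chs is hypothetical, and
   the geometry used in the example (32 surfaces, 43 sectors per track)
   is chosen for illustration and is not taken from this manual.

```python
def lbn_to_chs(lbn, surfaces, sectors_per_track):
    """Convert a logical block number to (cylinder, surface, sector)."""
    if lbn < 0 or surfaces <= 0 or sectors_per_track <= 0:
        raise ValueError("arguments must be non-negative and non-zero")
    blocks_per_cylinder = surfaces * sectors_per_track
    cylinder, remainder = divmod(lbn, blocks_per_cylinder)
    surface, sector = divmod(remainder, sectors_per_track)
    return cylinder, surface, sector


# LBN 2,,756704 (octal halves) from the SUMMARIZE sample in Chapter 4.
lbn = (0o2 << 18) | 0o756704
location = lbn_to_chs(lbn, surfaces=32, sectors_per_track=43)
```

   With the assumed geometry, this LBN maps to cylinder 565, surface 5,
   sector 13.  The cylinder and track agree with the values printed in
   the SUMMARIZE sample; the sample lists sector 15, which would be
   consistent with the failing block lying two blocks after the first
   block of the transfer.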

   The OPERATION AT ERROR is the text translation  of  the  last  command
   issued  to  the  device  before the error was detected (presumably the
   command that caused the error).  The text translation should match the
   translation  of the bits in DATAI RHCR AT ERROR for the RH10 and DATAI
   PTCR AT ERROR for an RH20.  If the information does  not  match,  look
   for an error in the control bus.

                                    NOTE

           Because of dual-port capabilities for disk drives, the
           physical  device  number  can  change according to the
           port assignment.  For example, on dual-ported  drives,
           one drive may be RPA3 on PORT A and RPC3 on PORT B.

   MASSBUS devices  store  and  make  available  significant  amounts  of
   device-dependent  information.   The  contents  of  all  registers are
   listed in the entry both at error time and after the last retry, along
   with  the  difference  between  the two values.  Text translations are
   always from the AT ERROR  value  with  the  exception  of  the  OFFSET
   Register; offsets are not normally used.

   Note that software errors are checked  only  after  the  hardware  has
   completed the transfer without a detected error.



   5.2.7  DX20 Device Error

   The monitor records a DX20 Device Error in the system event file  when
   it  detects an error in any portion of the MASSBUS system connected to
   the DX20 channel interface.

   In this entry, the MASSBUS REGISTER INFORMATION contains  the  nonzero
   contents of all registers both at error time and after the last retry.
   Also, the SB (sense bytes) describe, in octal, the type and status  of
   the device attached to the DX20.



   5.2.8  Software Event

   This entry is logged into the system  event  file  when  a  user  with
   special privileges, for example the system operator, issues one of the
   following monitor  calls:   POKE,  RTTRP,  SNOOP,  or  TRPSET.   These
   monitor calls have the following effects:

        1.  POKE changes the value of a word in monitor core.

        2.  RTTRP connects a device to or releases it from  the  realtime
            interrupt facility.

        3.  SNOOP allows privileged programs to insert breakpoints in the
            monitor  that  trap to a user program.  The user program must
            be locked in core when the trap occurs.  This feature is used
            for   fault   insertion,   performance  analysis,  and  trace
            functions.

        4.  TRPSET prevents jobs other than the calling job from running.
            You  can use this call to guarantee fast response to realtime
            interrupts.

   For more information on monitor calls, refer to  the  TOPS-10  Monitor
   Calls Manual.

                                    FULL

   ***********************************************
   SOFTWARE EVENT             
    LOGGED ON 14-Jul-80 AT  8:56:45      MONITOR UPTIME WAS  0:42:42
           DETECTED ON SYSTEM # 1026.
           RECORD SEQUENCE NUMBER: 1.
   ***********************************************
           EVENT TYPE: POKE
           JOB #: 46.
           USER PPN: [10,5324]
           LOCATION OF USER:
                   NODE:26
                   LINE:154
                   TTY154
           PROGRAM: SPICE
   STORED DATA VALUES:
           0,34030

                                   SHORT

   SEQ    TIME    14-Jul-80

   1.  8:56:45 SOFTWARE EVENT  TYPE:   POKE  BY JOB 46 USER WAS [10,5324]
                       RUNNING SPICE AT NODE: 26 LINE: 154 TTY154
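
   Every entry header carries both the time the entry was logged and the
   monitor uptime, so the time the monitor was loaded can be derived by
   subtraction.  The sketch below is one way to do that for TOPS-10 style
   headers.  The regular expression and function name are illustrative
   assumptions, and the code relies on English month abbreviations and on
   Python's default handling of two-digit years (80 is read as 1980).

```python
import re
from datetime import datetime, timedelta

# Matches headers such as:
#   LOGGED ON 14-Jul-80 AT  8:56:45      MONITOR UPTIME WAS  0:42:42
#   LOGGED ON  3-Nov-80 AT  9:44:10      MONITOR UPTIME WAS 2 DAYS 14:37:29
HEADER = re.compile(
    r"LOGGED ON\s+(\d{1,2}-[A-Za-z]{3}-\d{2})\s+AT\s+(\d{1,2}:\d{2}:\d{2})"
    r"\s+MONITOR UPTIME WAS\s+(?:(\d+) DAYS?\s+)?(\d{1,2}):(\d{2}):(\d{2})"
)


def monitor_load_time(header_line):
    """Return the logged time minus the uptime as a datetime."""
    match = HEADER.search(header_line)
    if match is None:
        raise ValueError("not a TOPS-10 style entry header")
    date_text, time_text, days, hours, minutes, seconds = match.groups()
    logged = datetime.strptime(f"{date_text} {time_text}", "%d-%b-%y %H:%M:%S")
    uptime = timedelta(
        days=int(days or 0),
        hours=int(hours),
        minutes=int(minutes),
        seconds=int(seconds),
    )
    return logged - uptime


line = " LOGGED ON 14-Jul-80 AT  8:56:45      MONITOR UPTIME WAS  0:42:42"
print(monitor_load_time(line))
```

   Entries that share a load time came from the same monitor run, which
   can help when grouping errors by reload.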



   5.2.9  Configuration Status Change

   The monitor records a Configuration Status Change whenever the  system
   operator  marks  disk  units  and  sections  of core memory on-line or
   off-line.  The system operator uses either  the  CONFIG  program  or
   the  SET  command to change the system configuration.  These tools are
   useful because they can prevent further errors to users until  a  unit
   can  be repaired, or they can be used to split and later join dual CPU
   systems.  For more information on the CONFIG  program,  refer  to  the
   file CONFIG.DOC.

   With the SET command, the system operator can also give a  2-character
   reason  for  the  change  in configuration.  Any two characters can be
   used, but the following codes are suggested:

        PM  -  preventive maintenance

        CM  -  corrective maintenance

        DN  -  unit is down

        OT  -  other

                                  CAUTION

           When the system operator adds memory  to  the  system,
           the  monitor  checks to verify the availability of the
           specified addresses.  Mistakes  are  reported  at  the
           operator's  terminal  (CTY),  but  the  error  logging
           system treats these as valid NXMs  and  generates  the
           appropriate  NXM  reports.   You  can  identify  an NXM
           report of this type  because  no  physical  memory  is
           placed off-line and the user's directory is [1,2].

                                    FULL

   ***********************************************
   CONFIGURATION STATUS CHANGE
    LOGGED ON  4-Aug-80 AT 14:06:05      MONITOR UPTIME WAS  1:44:50
           DETECTED ON SYSTEM # 1026.
           RECORD SEQUENCE NUMBER: 15.
   ***********************************************
   COMMAND:DETACH
           DEVICE:RNA0

                                   SHORT

   SEQ    TIME     4-Aug-80

   15. 14:06:05 CONFIGURATION CHANGE  DETACHED RNA0



   5.2.10  System Log Entry

   The monitor records a System Log Entry when the system operator enters
   a log entry into the system event file with the OPR program.

   A system operator, or anyone with operator  privileges,  can  make  an
   entry into the system event file by doing the following:

        1.  Run the OPR program

                 .OPR<RET>
                 OPR>

        2.  When you see the prompt, specify the REPORT command:

                 OPR>REPORT

        3.  Use the following syntax:

                 OPR>REPORT user text <RET>

            where user can be a directory name, a device name, or both, and
            text can be a single-line or multiple-line response.

   For more information on OPR, refer to the TOPS-10  Operator's  Command
   Language Reference Manual.

                                    FULL

   ***********************************************
   SYSTEM LOG ENTRY           
    LOGGED ON 15-Sep-80 AT 10:40:12      MONITOR UPTIME WAS  5:30:10
           DETECTED ON SYSTEM # 1026.
           RECORD SEQUENCE NUMBER: 37.
   ***********************************************
   ENTRY CREATED BY:
           JOB #, TTY #:   77,502
           P,PN:           [27,2617]
           WHO:            MASELL
           DEV:            TTY
           MESSAGE:        : THIS IS A TEST.

                                   SHORT

   SEQ    TIME    15-Sep-80

   37. 10:40:12 SYSTEM LOG ENTRY BY MASELL FOR DEVICE TTY ON TTY # 502
                       MESSAGE: : THIS IS A TEST.



   5.2.11  Software Requested Data

   At certain times during system operation, some problems can arise that
   are not easily understood.  Most frequently, the source of the problem
   is a hardware failure that is detected  by  the  software.   To
   troubleshoot this type of failure, you may require additional
   data from the monitor.  You can obtain this  information  by  patching
   the monitor to collect the information at the proper point and passing
   it to the system event file for listing.

                                  CAUTION

           Patching  a  monitor  can  easily   produce   drastic,
           undesired  results  such  as  loss  of  customer data,
           system crashes, and so forth.   Be  EXTREMELY  CAREFUL
           and  enlist  the  help of someone who is familiar with
           the monitor structure and internal workings.

   SPEAR lists the information in this entry in octal and sixbit.

   ***********************************************
   SOFTWARE REQUESTED DATA    
    LOGGED ON  4-Jan-81 AT  6:50:34      MONITOR UPTIME WAS  3:13:34
           DETECTED ON SYSTEM # 2263.
           RECORD SEQUENCE NUMBER: 1.
   ***********************************************
           OCTAL VALUE     SIXBIT VALUE
           504554,545700   HELLO
           675762,544400   WORLD
           123456,654321   *<NUC1
           654321,123456   UC1*<N
           555762,450063   MORE S
           517042,516400   IXBIT
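
   The SIXBIT column can be reproduced from the octal column by hand or
   with a short program.  The sketch below assumes the usual SIXBIT
   convention (six 6-bit characters per 36-bit word, each code offset by
   octal 40 from ASCII); the function names are illustrative and are not
   part of SPEAR.

```python
def parse_halfwords(text):
    """Turn 'left,right' octal halfwords into one 36-bit integer."""
    left, right = text.split(",")
    return (int(left, 8) << 18) | int(right, 8)


def sixbit_to_text(word):
    """Decode a 36-bit word as six SIXBIT characters."""
    chars = []
    for shift in range(30, -1, -6):      # leftmost character first
        code = (word >> shift) & 0o77    # one 6-bit character code
        chars.append(chr(code + 0o40))   # SIXBIT 0 is the ASCII space
    return "".join(chars)


for value in ("504554,545700", "675762,544400", "123456,654321"):
    print(value, repr(sixbit_to_text(parse_halfwords(value))))
```

   A word whose octal value was never meant as text still decodes to
   six characters, which is why entries such as 123456,654321 show
   apparently random SIXBIT.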



   5.2.12  Magtape System Error

   The monitor records any magtape errors it detects as a Magtape  System
   Error.   Errors  that  are  non-recoverable  are  classified  as HARD;
   recoverable errors are classified as SOFT.

   If  the  monitor  detects  a  data  channel  error,  it  records   the
   appropriate  information  under  error  code  6 (Data Channel Error).
   After a user issues an UNLOAD command or UUO, the monitor records  the
   performance  statistics  for  the  tape, including the total number of
   characters transferred and the  number  of  errors  (soft  read,  soft
   write, hard read, hard write) encountered.

   Note that if someone mounts unlabelled tapes  without  specifying  any
   kind of ID, there will be no MEDIA identified in the error file.



   5.2.13  Front End Device Report

   You will find a Front End Device Report in the system event file  when
   the  front  end  passes  a packet of error information to the monitor.
   This information contains errors detected by the front end  and  KLCPU
   hardware  and software.  If the device being reported on is unknown to
   SPEAR, the entry is reported in octal.

                                    FULL

   ***********************************************
   FRONT END DEVICE REPORT     
    LOGGED ON  3-Nov-80 AT  9:44:10      MONITOR UPTIME WAS 2 DAYS 14:37:29
           DETECTED ON SYSTEM # 1026.
           RECORD SEQUENCE NUMBER: 67.
   ***********************************************
           CPU #,DTE #:            0,0
           FE SOFTWARE VER:        0.
           DEVICE: KLCPU
           STD. STATUS:    100 = ERROR LOG REQUEST,
           KL RELOAD STATUS FROM FRONT END:  0      =  NO ERROR BITS DETECTED

                                   SHORT

   SEQ    TIME     3-Nov-80

   67.  9:44:10 KLCPU STD STAT=100 RELOAD STAT=0



   5.2.14  Front End Reload

   The monitor logs a Front End Reload entry into the system  event  file
   when  it determines that one of its front ends (attached to a DTE on a
   KL10 only) has crashed and has attempted to reload.  Before  rebooting
   the front end, the monitor dumps the crashed front end's core image to
   a disk file for later analysis.



   5.2.15  KS10 Halt Status Block

   The monitor records a KS10 Halt Status Block  entry  into  the  system
|  event  file  when  the  KS10  microcode  executes  a HALT stopcode.  A
   snapshot of the condition of the system is taken  just  prior  to  the
   HALT, and this information is written as the entry.

                                    FULL

   ***********************************************
   KS10 HALT STATUS BLOCK     
    LOGGED ON  9-Feb-81 AT 14:21:55      MONITOR UPTIME WAS  0:01:12

           DETECTED ON SYSTEM # 4145.
           RECORD SEQUENCE NUMBER: 1.
   ***********************************************
   HALT STATUS CODE:       2
   PROGRAM  COUNTER:       1000
   HALT STATUS BLOCK
           MAG:    0,2
           PC:     0,1000
           HR:     777756,4
           AR:     0,0
           ARX:    377777,777777
           BR:     0,1000
           BRX:    254000,1000
           ONE:    241200,200000
           EBR:    0,1
           UBR:    0,31463
           MASK:   774777,470177
           FLAGS,,PAGE FAIL WORD:  0,1
           PI STATUS:      400060,120000
           XWD1:   500101,553000
           T0:     777777,777777
           T1:     4000,0
           VMA:    0,177

                                   SHORT

   SEQ    TIME     9-Feb-81

   1. 14:21:55 HALT STATUS CODE =  PC = 0,1000 HR = 254000,1000
                       PAGE FAIL = 4000,0 PI = 0,177 FLAGS,,VMA = 0,0



   5.2.16  Magtape Statistics

   Each time an UNLOAD UUO or monitor command is given to a  tape  drive,
   the  monitor creates a Magtape Statistics entry.  The same information
   is printed in summary  form  on  both  the  user's  terminal  and  the
   operator's terminal (CTY).

   In this entry, the REEL IDENTIFICATION is the  name  supplied  to  the
   monitor  at  the time the tape was mounted.  It has nothing to do with
   any label information found on the tape.  The CHARS READ is the number
   of  characters  or  frames  of  tape  read on this unit since the last
   UNLOAD command was issued to this unit.   The  CHARS  WRITTEN  is  the
   number  of characters or frames of tape written on this unit since the
   last UNLOAD command was issued.

                                    FULL

   ***********************************************
   MAGTAPE STATISTICS         
    LOGGED ON  4-Aug-80 AT 13:40:05      MONITOR UPTIME WAS  1:18:50
           DETECTED ON SYSTEM # 1026.
           RECORD SEQUENCE NUMBER: 5.
   ***********************************************
   MAGTAPE STATISTICS
           UNIT NAME:               MTB261
           REEL IDENTIFICATION:     
           USER'S P,PN:             1,2
           CHARS READ:              2720.
           CHARS WRITTEN:           0.
           SOFT READ ERRORS:        0.
           HARD READ ERRORS:        1.
           SOFT WRITE ERRORS:       0.
           HARD WRITE ERRORS:       0.

                                   SHORT

   SEQ    TIME     4-Aug-80

   5. 13:40:05 MTB261 STATISTICS  READ CH/H/S: 2720/1/0 WRITE CH/H/S: 0/0/0
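
   The counts in this entry can be turned into an error rate for the
   reel.  The sketch below parses the SHORT form and expresses errors per
   million characters.  It assumes that CH/H/S stands for characters,
   hard errors, and soft errors, which agrees with the FULL listing
   above; the function names and the choice of unit are illustrative.

```python
import re

STATS = re.compile(r"(READ|WRITE) CH/H/S:\s*(\d+)/(\d+)/(\d+)")


def parse_magtape_short(line):
    """Extract read and write counters from a SHORT statistics line."""
    result = {}
    for operation, chars, hard, soft in STATS.findall(line):
        result[operation.lower()] = {
            "chars": int(chars),
            "hard": int(hard),
            "soft": int(soft),
        }
    return result


def errors_per_million(chars, errors):
    """Return errors per million characters, or None if nothing moved."""
    if chars == 0:
        return None
    return errors * 1_000_000 / chars


line = "5. 13:40:05 MTB261 STATISTICS  READ CH/H/S: 2720/1/0 WRITE CH/H/S: 0/0/0"
stats = parse_magtape_short(line)
read = stats["read"]
print(errors_per_million(read["chars"], read["hard"] + read["soft"]))
```

   A rate computed from a short transfer such as this one is not
   statistically meaningful; totals gathered over many entries for the
   same unit or reel give a more useful figure.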



   5.2.17  Disk Statistics

   This entry reports the performance of each disk unit since the monitor
   was  loaded.   It is useful for computing the disk error rate and disk
   throughput.  This information is usually not recorded by DAEMON in the
   system  event  file  because  it  takes  up  a  great  deal  of space.
   Installations that want this entry should reassemble DAEMON  with  the
   conditional assembly switch FTUSN set.

   The monitor records this entry type for each disk unit on  the  system
   each hour.  You can find the same type of information for each monitor
   run in the Crash Extract entry (Section 5.2.3).



   5.2.18  DL10 Communications Error

   The monitor records a DL10 Communications Error into the system  event
   file when the DL10 detects an error on the communications link.

                                    FULL

   ***********************************************
   DL10 COMMUNICATIONS ERROR  
    LOGGED ON  4-Aug-80 AT 16:45:09      MONITOR UPTIME WAS  4:23:54
           DETECTED ON SYSTEM # 1026.
           RECORD SEQUENCE NUMBER: 86.
   ***********************************************
           UNIT:           DC76
           DL10 PORT:      0
           ERROR:   NO ERROR BITS DETECTED

           11 PROGRAM NAME:        DC76
   CONTROLLER INFORMATION:
           CONI DLC:       60,200204 = P1 ENB,
           DATAI DLC:      0,750 =  NO ERROR BITS DETECTED
           CONI DLB (R=0): 0,5037
           CONI DLB (R=1): 40000,6005
           CONI DLB (R=2): 2000,46401
           CONI DLB (R=3): 577777,46400
           DATAI DLB (R=1)(MB):    0,0

                                   SHORT

   SEQ    TIME     4-Aug-80

   86. 16:45:09 DL10 ERROR ON PDP11 # 0 CONI DLC = 60,200204
                       DATAI DLC = 0,750



   5.2.19  KL10 Parity or NXM Interrupt

   The monitor records a KL10 Parity or NXM Interrupt in the system event
   file  when  the  KL10 detects a parity error or an attempt to access a
   nonexistent memory location.

   The PC AT INTERRUPT is the status of the program counter at  the  time
   of  the  parity  or  nonexistent  memory  interrupt.   The  CONI PI AT
   INTERRUPT is the status of the Priority Interrupt system at  the  time
   of the parity or nonexistent memory interrupt.



   5.2.20  KS10 NXM Trap

   When the KS10 detects a read on a  nonexistent  memory  location,  the
   monitor  records  a  KS10 NXM Trap into the system event file.  A trap
   stops execution during the current instruction.

                                    FULL

   ***********************************************
   KS10 NXM TRAP              
    LOGGED ON 22-Mar-81 AT  0:11:50      MONITOR UPTIME WAS  0:23:18
           DETECTED ON SYSTEM # 4608.
           RECORD SEQUENCE NUMBER: 1.
   ***********************************************
   ERROR DETECTED ON CPS0
   PC AT TRAP:     1,145267
   CONI PI AT TRAP:        0,2377
           PAGE FAIL WORD: 200013,770000
           PAGE FAIL CODE: 20 = I-O NXM
   PHYSICAL MEMORY ADDRESS AT TRAP:        0,0
   USER'S ID AT TRAP:      [307,5515]
   USER'S PROGRAM:         TSTUBA
   # OF RECOVERABLE TRAPS: 0.
   # OF NON-RECOVERABLE TRAPS:     0.

                                   SHORT

   SEQ    TIME    22-Mar-81

   1.  0:11:50 NXM TRAP  PFW = 200013,770000 PMA = 0,0 NON
                       RECOVERABLE FAILURE  RETRYS: 31
                       USER AT TRAP [307,5515] RUNNING TSTUBA



   5.2.21  KL10 or KS10 Parity Trap

   The monitor records a KL10 or KS10 Parity Trap when either the KL10 or
   KS10 detects an internal parity error, not necessarily in memory.

   In this entry, the PHYSICAL MEMORY ADDRESS AT TRAP gives the  location
   of the parity error where the trap occurred.

                                    FULL

   ***********************************************
   KL10 OR KS10 PARITY TRAP   
    LOGGED ON  4-Feb-81 AT 17:37:14      MONITOR UPTIME WAS  0:03:13
           DETECTED ON SYSTEM # 2136.
           RECORD SEQUENCE NUMBER: 1.
   ***********************************************
   ERROR DETECTED ON CPL0
   PC AT TRAP:     316000,230
   CONI PI AT TRAP:        0,377
   PHYSICAL MEMORY ADDRESS AT TRAP:        547001,436241
   USER'S ID AT TRAP:      [1,2]
   USER'S PROGRAM:         KLPAR4
           PAGE FAIL WORD: 767000,241
           PAGE FAIL CODE: 36 = AR
           BAD DATA WORD:  252525,252525
           GOOD DATA WORD: 0,0
           DIFFERENCE:     252525,252525
           RECOVERY:       CRASH USER
           RETRY COUNT:    
                   W CACHE:        4.
                   W-O CACHE:      0.ERROR DURING CACHE SWEEP TO CORE
   # OF RECOVERABLE TRAPS: 0.
   # OF NON-RECOVERABLE TRAPS:     3.

                                   SHORT

   SEQ    TIME     4-Feb-81

   1. 17:37:14 PARITY TRAP  PFW = 767000,241 PMA = 547001,436241
                       NON RECOVERABLE FAILURE  USER AT TRAP [1,2]
                       RUNNING KLPAR4 RETRIES: 4
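
   The DIFFERENCE line shows which bits of the data word were wrong.  The
   sketch below assumes that the difference is the bitwise exclusive OR
   of the bad and good words; the manual does not state the operation,
   although the sample above is consistent with it.  Bits are numbered
   in the PDP-10 style, with bit 0 as the most significant bit.  The
   function names are illustrative.

```python
def parse_halfwords(text):
    """Turn 'left,right' octal halfwords into one 36-bit integer."""
    left, right = text.split(",")
    return (int(left, 8) << 18) | int(right, 8)


def format_halfwords(word):
    """Format a 36-bit integer as 'left,right' octal halfwords."""
    return f"{word >> 18:o},{word & 0o777777:o}"


def differing_bits(bad, good):
    """Return the XOR of two words and the list of bit numbers that differ."""
    diff = bad ^ good
    # Bit 0 is the leftmost (most significant) of the 36 bits.
    bits = [bit for bit in range(36) if diff & (1 << (35 - bit))]
    return diff, bits


bad = parse_halfwords("252525,252525")
good = parse_halfwords("0,0")
diff, bits = differing_bits(bad, good)
print(format_halfwords(diff), bits)
```

   A difference confined to one or two bit positions across many entries
   suggests a specific failing data path, while a pattern such as the
   one above is typical of a diagnostic test pattern.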



   5.2.22  Memory Sweep for NXM

   When the monitor detects an attempt to  access  a  nonexistent  memory
   location  in user core, it scans core by doing a memory sweep, looking
   for more NXMs.  The monitor then records the results of this scan as a
   Memory Sweep for NXM in the system event file.

   The ADDRESSES DETECTED BY SWEEP gives you the locations,  if  any,  of
   more attempts to access nonexistent memory locations.

                                    FULL

   ***********************************************
   MEMORY SWEEP FOR NXM       
    LOGGED ON  1-Oct-80 AT  9:03:14      MONITOR UPTIME WAS  1:02:21
           DETECTED ON SYSTEM # 1026.
           RECORD SEQUENCE NUMBER: 3124.
   ***********************************************
   NXM CORE SWEEP TOTALS FOR CPL0
           REPRODUCIBLE:   0.
           NON-REPRODUCIBLE:       0.
           DETECTED BY DATA
            CHANNEL BUT NOT
            BY CPU:                20.
   SWEEP INFORMATION:
           ERRORS DETECTED:        0.
           LOGICAL "AND" OF BAD
            PHYSICAL ADDRESSES:    777777,777777
           LOGICAL "OR" OF BAD
            PHYSICAL ADDRESSES:    0,0
   MEMORY PLACED OFF-LINE:

                                   SHORT

   SEQ    TIME     1-Oct-80

   3124.  9:03:14 NXM SWEEP ON CPL0 # OF ERRORS SEEN = 0



   5.2.23  Memory Sweep for Parity

   When the monitor detects a parity error on a read attempt,  it  sweeps
   memory  looking  for  additional parity errors.  The results of the sweep are
   recorded in the system event file as a Memory Sweep for Parity.

   The SWEEP INFORMATION contains the number  of  words  found  with  bad
   parity.   It  also  contains the logical AND and logical OR of the bad
   addresses and bad contents.

                                    FULL

   ***********************************************
   MEMORY SWEEP FOR PARITY    
    LOGGED ON  4-Nov-80 AT  8:39:53      MONITOR UPTIME WAS  0:35:34
           DETECTED ON SYSTEM # 1026.
           RECORD SEQUENCE NUMBER: 2026.
   ***********************************************
   DATA PARITY CORE SWEEP TOTALS FOR CPL0
           REPRODUCIBLE:   0.
           NON-REPRODUCIBLE:       0.
           USER ENABLED:   0.
           CORE SWEEPS:    1.
           DETECTED BY DATA
            CHANNEL BUT NOT
            BY CPU:                1.
   SWEEP INFORMATION:
           ERRORS DETECTED:        0.
           LOGICAL "AND" OF BAD
            PHYSICAL ADDRESSES:    777777,777777
           LOGICAL "OR" OF BAD
            PHYSICAL ADDRESSES:    0,0
           LOGICAL "AND" OF BAD DATA:      777777,777777
           LOGICAL "OR" OF BAD DATA:       0,0

                                   SHORT

   SEQ    TIME     4-Nov-80

   2026.  8:39:53 DATA PARITY CORE SWEEP FOR CPL0 # OF ERRORS SEEN = 0
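
   The logical AND and logical OR values summarize every bad address (or
   bad data word) found by the sweep in two words.  The sketch below
   shows how such a summary can be accumulated.  It is an illustration of
   the arithmetic only, not the monitor's code, and the function name is
   hypothetical.  Starting the AND at all ones and the OR at zero also
   explains the 777777,777777 and 0,0 values shown above when no errors
   are detected.

```python
ALL_ONES = (1 << 36) - 1   # 777777,777777 in halfword notation


def sweep_summary(bad_addresses):
    """Accumulate the logical AND and OR of a list of 36-bit values."""
    and_acc = ALL_ONES
    or_acc = 0
    for address in bad_addresses:
        and_acc &= address   # bits that are 1 in every bad address
        or_acc |= address    # bits that are 1 in at least one bad address
    return and_acc, or_acc


no_errors = sweep_summary([])
print(f"{no_errors[0]:o} {no_errors[1]:o}")

some_errors = sweep_summary([0o1000, 0o1001, 0o3000])
print(f"{some_errors[0]:o} {some_errors[1]:o}")
```

   Bits set in the AND are common to every failing address, and bits
   clear in the OR are clear in every failing address, so the two words
   together can point to a common memory module or address line.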



   5.2.24  CPU Status Block

   The monitor records this  entry  into  the  system  event  file  after
   recovering  from a system crash.  At the time of the crash, a snapshot
   is taken of the condition of all the components of the  CPU  (such  as
   controllers,  channels,  RH20s,  the  pager,  and so forth).  When the
   system recovers,  the  monitor  extracts  this  information  from  the
   CRASH.EXE  file and places it in the system event file as a CPU Status
   Block.

   This entry contains the condition of the registers and  channels  just
   prior  to  the  crash.  Also, the SBDIAG FUNCTIONS column contains the
   SBUS diagnostic functions.

                                    FULL

   ***********************************************
     ** THIS ENTRY COPIED FROM A SAVED CRASH **
   CPU STATUS BLOCK           
    LOGGED ON  5-Aug-80 AT  0:11:25      MONITOR UPTIME WAS 11:50:09
           DETECTED ON SYSTEM # 1026.
           RECORD SEQUENCE NUMBER: 185.
   ***********************************************
   APRID = 231,342002
   CONI APR = 7760,3
   RDERA = 604000,7427
   CONI PI = 0,10377
   DATAI PAG = 701100,3
   CONI PAG = 0,620001
   CONI RH0 THRU RH7 
           000000,,002445  000000,,006400  000000,,002445  000000,,002445
           000000,,000000  000000,,000000  000000,,000000  000000,,000000
   CONI DTE0 THRU DTE3 
           000000,,020014  000000,,100000  000000,,100014  000000,,100014
   EPT LOCATIONS 0 THRU 37 (CHANNEL LOGOUT AREA)
           200000,,000454  500000,,000456  600000,,000000  000000,,000000
           000000,,000000  000000,,000000  000000,,000000  000000,,000000
           200000,,000454  500000,,000455  600001,,457000  000000,,000000
           200000,,000454  500000,,000455  600001,,014660  000000,,000000
           000000,,000000  000000,,000000  000000,,000000  000000,,000000
           000000,,000000  000000,,000000  000000,,000000  000000,,000000
           000000,,000000  000000,,000000  000000,,000000  000000,,000000
           000000,,000000  000000,,000000  000000,,000000  000000,,000000
   EPT LOCATIONS 140 THRU 177 (DTE CONTROL BLOCKS)
           141000,,413160  241000,,223676  264000,,057516  000000,,000000
           000000,,000442  000000,,057054  000000,,000030  000000,,057136
           000000,,000000  000000,,000000  264000,,057556  000000,,000000
           000000,,000443  000000,,057053  000000,,000030  000000,,057166
           241000,,224302  341000,,224563  264000,,057616  000000,,000000
           000000,,000444  000000,,057052  000000,,000030  000000,,057216
           341000,,232743  141000,,224000  264000,,057656  000000,,000000
           000000,,000445  000000,,057051  000000,,000030  000000,,057246
   UPT LOCATIONS 424 THRU 427 (UUO AREA)
           000000,,000000  000000,,000000  000000,,000000  000000,,000000
   UPT LOCATIONS 500 THRU 503 (PAGE FAIL AREA)
           000000,,000000  304000,,112667  004000,,566102  000000,,000000
   AC BLOCK 6 LOCATIONS 0 THRU 3 AND 12
           000000,,000000  000000,,000000  000000,,000000  000000,,000000
           000000,,000000
   AC BLOCK 7 LOCATIONS 0 THRU 2
           255000,,000000  000000,,640010  000000,,000000

   SBDIAG FUNCTIONS
           CTRLR   FUNCTION 0      FUNCTION 1
           4       005740,,041736  000200,,000000  

                                   SHORT

   SEQ    TIME     5-Aug-80

   185.  0:11:25 CPU STATUS BLOCK  APRID = 231,342002 CONI APR = 7760,3
                       CONI PI = 0,10377 CONI PAG = 0,620001
                       DATAI PAG = 701100,3



   5.2.25  Device Status Block

   The monitor records this  entry  into  the  system  event  file  after
   recovering  from a system crash.  At the time of the crash, a snapshot
   is  taken  of  the  condition  of  all  the  I/O  devices   (such   as
   lineprinters,  cardreaders,  disk  drives,  and  so  forth).  When the
   system recovers,  the  monitor  extracts  this  information  from  the
   CRASH.EXE  file  and  places  it  in the system event file as a Device
   Status Block.

                                    FULL

   ***********************************************
     ** THIS ENTRY COPIED FROM A SAVED CRASH **
   DEVICE STATUS BLOCK        
    LOGGED ON  5-Aug-80 AT  0:11:25      MONITOR UPTIME WAS 11:50:09
           DETECTED ON SYSTEM # 1026.
           RECORD SEQUENCE NUMBER: 186.
   ***********************************************
   CONI 20 : 117,63202
   CONI 24 : 0,32003
   CONI 120 : 0,0
   CONI 104 : 0,0
   CONI 100 : 0,0
   CONI 240 : 0,0
   CONI 320 : 0,410000
   CONI 324 : 770010,4100
   CONI 150 : 3,0
   CONI 124 : 0,2400
   CONI 140 : 0,40
   CONI 344 : 0,0
   CONI 340 : 0,0
   CONI 220 : 1,420004
   CONI 170 : 0,0
   CONI 174 : 0,0
   CONI 270 : 0,0
   CONI 274 : 4000,5
   CONI 360 : 0,0
   CONI 250 : 0,0
   CONI 254 : 0,0
   CONI 260 : 0,0
   CONI 264 : 0,0
   CONI 334 : 0,0
   CONI 330 : 0,0
   CONI 64 : 60,200224
   CONI 60 : 0,5037
   CONI 164 : 0,0
   CONI 160 : 0,0
   CONI 110 : 0,400000
   CONI 154 : 2,0
   CONI 234 : 0,0
   CONI 230 : 307620,32400
   CONI 144 : 0,0
   DATAI 0 : 0,0
   DATAI 170 : 0,0
   DATAI 174 : 0,0
   DATAI 270 : 0,0
   DATAI 274 : 4003,3
   DATAI 360 : 0,0
   DATAI 250 : 0,0
   DATAI 254 : 0,0
   DATAI 260 : 0,0
   DATAI 264 : 0,0
   DATAI 64 : 0,770
   DATAI 60 : 0,162
   DATAI 164 : 0,0
   DATAI 160 : 0,0

                                   SHORT

   SEQ    TIME     5-Aug-80

   186.  0:11:25 DEVICE STATUS BLOCK 
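
   Most of the words in a FULL Device Status Block are zero, so it can be
   convenient to list only the devices that returned a nonzero status.
   The sketch below does this for lines in the format shown above.  The
   regular expression and function name are assumptions made for this
   example; the device code is kept as the text printed in the entry.

```python
import re

STATUS_LINE = re.compile(
    r"^\s*(CONI|DATAI)\s+([0-7]+)\s*:\s*([0-7]+),([0-7]+)\s*$"
)


def nonzero_status(lines):
    """Return (operation, device, word) for each nonzero status line."""
    found = []
    for line in lines:
        match = STATUS_LINE.match(line)
        if match is None:
            continue                      # skip headers and other text
        operation, device, left, right = match.groups()
        word = (int(left, 8) << 18) | int(right, 8)
        if word != 0:
            found.append((operation, device, word))
    return found


sample = [
    "   CONI 20 : 117,63202",
    "   CONI 120 : 0,0",
    "   DATAI 274 : 4003,3",
]
for operation, device, word in nonzero_status(sample):
    print(operation, device, f"{word >> 18:o},{word & 0o777777:o}")
```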
|  
|  
|  
|  5.2.26  Line printer Error
|  
   The monitor records any errors detected by the LP100 controller  as  a
|  Line  printer  Error  in the system event file.  Note that if the line
|  printer is taken off-line to add paper or change  forms,  the  monitor
   does not record this event.

   The LAST DATA WORD SENT can help to determine the location of  a  data
   parity error, if one exists.  Also, the CONI AT ERROR text translation
   contains significant error bits to describe the mode of operation when
   the failure occurred.

                                    FULL

   ***********************************************
   LINE PRINTER ERROR         
    LOGGED ON 22-Mar-81 AT  0:11:50      MONITOR UPTIME WAS  0:23:18
           DETECTED ON SYSTEM # 1536.
           RECORD SEQUENCE NUMBER: 1.
   ***********************************************

           UNIT NAME:      LPT0
           CONTROLLER TYPE:        LP100
           LAST DATA WORD SENT:    0,123
           CONI AT ERROR:  200045,226465 = NOT READY,VFU ERROR,OFF LINE,
           VFU TYPE:       DIRECT ACCESS
           CHARACTER SET:  VARIABLE
           PAGE COUNTER:   37.

                                   SHORT

   SEQ    TIME    22-Mar-81

   1.  0:11:50 LPT0 LP100 ERROR  CONI LP = 200045,226465



   5.2.27  Unit Record Error

   The monitor logs a Unit Record Error into the system event  file  when
|  it  detects an error on a unit-record device such as a line printer, a
|  card reader, a card punch, or a plotter.

                                    FULL

   ***********************************************
   UNIT RECORD ERROR          
    LOGGED ON  8-Sep-80 AT 12:06:44      MONITOR UPTIME WAS  3:58:38
           DETECTED ON SYSTEM # 1026.
           RECORD SEQUENCE NUMBER: 314.
   ***********************************************
   UNIT NAME:              LPT262
   CONTROLLER TYPE:        LP100
   DEVICE TYPE:            LPT
   USER ID:                [1,2]
   PROGRAM NAME:           LPTSPL
   VFU TYPE:               DAVFU
   CHARACTER SET:          96 CHARACTER
   CONI AT ERROR:          307216,632444   NOT READY,VFU ERROR,OFF LINE,
   LAST DATA WD:           0,0

                                   SHORT

   SEQ    TIME     8-Sep-80

   314. 12:06:44 LPT262 ERROR FOR USER [1,2] RUNNING LPTSPL
                       CONI LP100 = 307216,632444



   5.3  TOPS-20 ENTRIES

   The following sections list both the FULL and SHORT  versions  of  the
   entries  that  TOPS-20 can record in its system event file.  Note that
   the network entries for DECnet-20 version 2.1 are listed separately in
   Section  5.4.  Network entries for DECnet-20 version 3.0 are listed in
   Section 5.5.



   5.3.1  TOPS-20 System Reloaded

   Every time the monitor is loaded, a TOPS-20 System Reloaded  entry  is
   written  into  the  system  event  file, explaining why the system was
   reloaded.  If the system is on auto-reload and a  BUGHLT  occurs,  the
   BUGHLT  address is listed and the TOPS-20 BUGHLT-BUGCHK entry, Section
   5.3.2, is also written into the system event file.

                                    FULL

   ***********************************************
   TOPS-20 SYSTEM RELOADED    
    LOGGED ON Mon 23 Jun 80 08:46:31      MONITOR UPTIME WAS  0:00:22
           DETECTED ON SYSTEM # 2116.
           RECORD SEQUENCE NUMBER: 22.
   ***********************************************
   CONFIGURATION INFORMATION
           SYSTEM NAME:    System 2116 TOPS-20 Monitor 4(3230)
           MONITOR BUILT ON:       Wed 28 Nov 79 11:00:01
           CPU SERIAL #:           2116.
           MONITOR VERSION:        4(3230)
           U-CODE VERSION:         0
   RELOAD BREAKDOWN:

                                   SHORT

   SEQ    TIME    Mon 23 Jun 80

   22. 08:46:31 RELOAD OF System 2116 The Big Orange Welcomes You, TOPS-20
                       Monitor 4(3230) VERSION 4(3230)
                       BUILT ON Wed 28 Nov 79 11:00:01 REASON
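
   The SHORT listings in this chapter share a visible layout: a sequence
   number followed by a period, the time of day, and the entry text, with
   any further text on indented continuation lines.  The following Python
   sketch groups such lines into entries.  It is an illustration based
   only on the sample listings shown here; the function name and the line
   pattern are assumptions, not part of SPEAR.

```python
import re

# Assumed pattern for the first line of a SHORT entry:
# "<sequence>. <HH:MM:SS> <text>"
ENTRY_START = re.compile(r"^\s*(\d+)\.\s+(\d{2}:\d{2}:\d{2})\s+(.*)$")


def parse_short_listing(lines):
    """Group SHORT listing lines into (sequence, time, text) tuples."""
    entries = []
    for line in lines:
        if line.split()[:2] == ["SEQ", "TIME"]:
            continue  # column heading that carries the date
        match = ENTRY_START.match(line)
        if match:
            sequence, time, text = match.groups()
            entries.append([int(sequence), time, text.strip()])
        elif line.strip() and entries:
            # Indented continuation line: append it to the current entry.
            entries[-1][2] += " " + line.strip()
    return [tuple(entry) for entry in entries]


sample = [
    "   SEQ    TIME    Mon 23 Jun 80",
    "",
    "   22. 08:46:31 RELOAD OF System 2116 The Big Orange Welcomes You, TOPS-20",
    "                       Monitor 4(3230) VERSION 4(3230)",
    "                       BUILT ON Wed 28 Nov 79 11:00:01 REASON",
]
entries = parse_short_listing(sample)
print(entries[0][0], entries[0][1])
```

   The sketch keeps the date heading out of the entry text and joins
   continuation lines with single spaces.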




   5.3.2  TOPS-20 BUGCHKs and BUGHLTs

   When the monitor detects a BUGHLT, BUGCHK, or BUGINF monitor software
   error, it records a TOPS-20 BUGHLT-BUGCHK entry in the system event
   file.  The most serious of the three errors is a BUGHLT, which crashes
   the system.  At this point, something is seriously wrong, and the
   monitor does not have enough integrity to attempt any further error
   recovery.  The monitor does, however, collect pertinent information
   for error recording.  When the system is reloaded, the information is
   extracted from a crash dump and recorded in the system event file.

   BUGCHK   and   BUGINF   are   less   serious,   perhaps   correctable,
   monitor-detected  errors that can affect only particular users instead
   of the entire system.  These errors may or may not  crash  the  system
   depending on the error that occurs.

   The number of errors since reload is included in  this  entry  because
   only  five occurrences of this entry type are allowed in the monitor's
   error recording buffer at any one time.   In  the  case  of  an  error
   occurring  in  a tight loop, more than five entries could overflow the
   buffer, and the information for the first occurrence  might  be  lost.
   These  numbers should increment by one for each entry; however, if the
   sequence is broken, it indicates that more than five entries  occurred
   before the error-logger module of the monitor could empty the buffer.
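
   The check described above can be sketched in Python.  The function
   name and the sample counts are hypothetical; the sketch assumes only
   that the "# OF ERRORS SINCE RELOAD" values have already been collected
   in listing order, and that a count that does not increase marks a
   reload rather than a loss.

```python
def find_lost_entries(error_counts):
    """Return (previous, current, missing) for each break in the sequence."""
    gaps = []
    for previous, current in zip(error_counts, error_counts[1:]):
        if current <= previous:
            # Assumption: the count restarted after a reload; not a loss.
            continue
        missing = current - previous - 1
        if missing > 0:
            gaps.append((previous, current, missing))
    return gaps


counts = [1, 2, 3, 9, 10]  # hypothetical values read from successive entries
gaps = find_lost_entries(counts)
print(gaps)
```

   Here the jump from 3 to 9 is reported as five entries that never
   reached the system event file.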

   The FORK # and JOB # in the entry are the numbers associated with  the
   current  user  at  the  time  of  the  error.  A value of -1 or 777777
   indicates that the monitor was performing an overhead  function  (such
   as scheduling) and that there was no current user.  Note that the FORK
   # and JOB # indicate the current user, and not  necessarily  the  user
   being serviced by the monitor interrupt-level routines.
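
   The two values named above are the same bit pattern when the field is
   treated as an 18-bit halfword, because -1 in 18-bit two's-complement
   form is octal 777777.  The following Python sketch shows the
   equivalence; the 18-bit field width and the helper names are
   assumptions made for illustration.

```python
HALFWORD_MASK = 0o777777  # 18 bits, all ones


def to_halfword(value):
    """Reduce a signed integer to its 18-bit two's-complement pattern."""
    return value & HALFWORD_MASK


def has_current_user(fork, job):
    """Return False when either field holds the 'no current user' pattern."""
    return HALFWORD_MASK not in (to_halfword(fork), to_halfword(job))


print(oct(to_halfword(-1)))
print(has_current_user(0o72, 0))
```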

   All BUGHLTs now reside in a monitor module, BUGS.MAC.  This module
   includes a description of what might have caused the BUGHLT and also
   some corrective action that you can take.

                                    FULL

   ***********************************************
   TOPS-20 BUGHLT-BUGCHK      
    LOGGED ON Mon 16 Jun 80 11:10:19      MONITOR UPTIME WAS  3:10:48
           DETECTED ON SYSTEM # 2137.
           RECORD SEQUENCE NUMBER: 25.
   ***********************************************

   ERROR INFORMATION:
           DATE-TIME OF ERROR:     Mon 16 Jun 80 11:10:09
           # OF ERRORS SINCE RELOAD:       1.
           FORK # & JOB #:         72,0
           USER'S LOGGED IN DIR:   OPERATOR
           PROGRAM NAME:           SYSJOB
           ERROR:          BUGINF
           ADDRESS OF ERROR:       644111
           NAME:                   DN20ST
           DESCRIPTION:            DTESRV- DN20 STOPPED
           CONI APR:       7740,3 =  NO ERROR BITS DETECTED
           CONI PAG:       0,660132
           DATAI PAG:      700100,1246
           CONTENTS OF AC'S:
       0:  0,0
       1:  777775,1
       2:  0,1
       3:  0,0
       4:  0,0
       5:  0,0
       6:  0,0
       7:  0,0
      10:  0,0
      11:  0,0
      12:  0,0
      13:  0,0
      14:  0,0
      15:  0,0
      16:  60000,0
      17:  777505,335504
           PI STATUS:      0,177
           ADDITIONAL DATA ITEMS:  1
           0,1

           ERA:            602000,5504 = WD #3 MEMORY READ
           BASE PHY. MEM ADDR.
            AT FAILURE:    5504

                                   SHORT

   SEQ    TIME    Mon 16 Jun 80

   25. 11:10:19 BUGINF DN20ST AT Mon 16 Jun 80 11:10:09 USER OPERATOR
                       RUNNING SYSJOB CONI APR= 7740,3 CONI PAG= 0,660132
                       ERA= 602000,5504



   5.3.3  MASSBUS Device Error

   Every time the monitor detects an error in the MASSBUS system, a
   MASSBUS Device Error is recorded in the system event file.  The
   MASSBUS system includes the MASSBUS devices RP04, RP05, RP06, TU45,
   and RM03; the RH20 controller (RH11 and UBA for the 2020); and certain
   errors occurring in the channel logic.

   The unit name in this entry refers to the physical MASSBUS unit active
   at the time of the error.  This is a 5-character name in the format:

        xxabc

   where

        xx     is the device type: DP (disk pack) or MT (magtape).  For
               example, DP220 refers to disk pack 220.

        a      is the logical address of the RH20 controller for this
               device (0-7); in a 2020 configuration, this is the RH11
               and UBA.

        b      is the logical MASSBUS address for this device (0-7).  For
               magtape units, this is the TM02 address on the MASSBUS.

        c      is the slave number of a magnetic tape unit.   For  RP04s,
               RP05s, and RP06s, this number is always 0.
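
   The unit-name format can be decoded mechanically.  The following
   Python sketch splits a name such as DP220 into its four parts.  The
   class and function names are hypothetical, and the sketch assumes that
   the slave number is a single octal digit, which the text above does
   not state.

```python
from dataclasses import dataclass

DEVICE_TYPES = {"DP": "disk pack", "MT": "magtape"}


@dataclass
class MassbusUnit:
    device_type: str      # xx: DP or MT, spelled out
    controller: int       # a: logical RH20 address (0-7)
    massbus_address: int  # b: logical MASSBUS (or TM02) address (0-7)
    slave: int            # c: slave number of a magnetic tape unit


def parse_unit_name(name):
    """Decode a 5-character unit name of the form xxabc."""
    if len(name) != 5:
        raise ValueError(f"unit name must be 5 characters: {name!r}")
    prefix, digits = name[:2], name[2:]
    if prefix not in DEVICE_TYPES:
        raise ValueError(f"unknown device type: {prefix!r}")
    if any(digit not in "01234567" for digit in digits):
        raise ValueError(f"expected three octal digits: {digits!r}")
    controller, massbus_address, slave = (int(digit) for digit in digits)
    return MassbusUnit(DEVICE_TYPES[prefix], controller, massbus_address, slave)


disk = parse_unit_name("DP220")
tape = parse_unit_name("MT301")
print(disk)
print(tape)
```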

   The following is a MASSBUS Device Error from an RP07 disk drive:

   The following MASSBUS Device Error is from a TU78 magnetic tape drive:


                                   FULL

   ***********************************************
   MASSBUS DEVICE ERROR       
    LOGGED ON Mon 31 Aug 81 15:42:02      MONITOR UPTIME WAS  0:08:46
           DETECTED ON SYSTEM # 2137.
           RECORD SEQUENCE NUMBER: 161.
   ***********************************************
           UNIT NAME:      MT000
           UNIT TYPE:      TU78
           UNIT SERIAL #:  0175.
           VOLUME ID:      
           LOCATION:    RECORD  #  1.    OF FILE #  0.
           USER'S LOGGED IN DIR NUMBER:    5
           USER'S PGM:     SYSJOB
           OPERATION AT ERROR:     DEV.AVAIL.  GO +  READ FWD(70)
           FINAL ERROR STATUS:     0,0
           RETRIES PERFORMED:      0.
           ERROR:  NON-RECOVERABLE 
   DRIVE EXCEPTION,CHN ERROR, IN CONTROLLER CONI
                   
   M8960 u-CODE REVISION LEVELS:
           0  (    0- 3777)        005
           1  ( 4000- 7777)        005
           2  (10000-13777)        005
           3  (14000-17777)        003
           4  (20000-23777)        002
           5  (24000-27777)        003
           6  (30000-33777)        007
           7  (34000-37777)        003

   CONTROLLER INFORMATION:
   CONTROLLER:     RH20 # 0
   CONI AT ERROR:  0,222415 =
           DRIVE EXCEPTION,CHN ERROR,
   CONI AT END:    0,222415 =
           DRIVE EXCEPTION,CHN ERROR,
           DATAI PTCR AT ERROR:    732200,177771
           DATAI PTCR AT END:      732200,177771
           DATAI PBAR AT ERROR:    720000,113000
           DATAI PBAR AT END:      720000,113000

   CHANNEL INFORMATION:
   CHAN STATUS WD 0:       200000,272774
           CW1:  0,0  CW2:  0,0
   CHN STATUS WD 1:        540100,272775 =
           NOT SBUS ERR,NOT WC = 0,LONG WC ERR,
   CHN STATUS WD 2:        420003,170000

   DEVICE REGISTER INFORMATION:
           AT ERROR        AT END          DIFF.
   CMD 00: 4070            4070               0            
           DEV.AVAIL.  READ FWD(70)
   DST 01: 4415            4415               0            
|       Interrupt code: NOT CAPABLE
|       ID Burst neither PE or GCR
           
   CNT 02: 30004           30004              0            
           SKIP COUNT = 0.  RECORD COUNT = 1.  DRIVE # 0
   DG1 03:    0               0               0            
   ATN 04:    0               0               0            
   BCT 05: 113000          113000             0            
           38400. BYTES
   DTR 06: 142101          142101             0            
   STA 07: 166200          166200             0            
           RDY, PRES, ONL, PE, BOT, AVAIL, 
   SER 10: 565             565                0            
   DG2 11:    0               0               0            
   DG3 12:    0               0               0            
   NST 13: 1               1                  0            
        Interrupt code: DONE
        Extended sense data not updated
                                  
           
   NC1 14: 406             406                0            
           CMD COUNT = 1. Rewind(06)
   NC2 15: 10              10                 0            
           CMD COUNT = 0. Sense(10)
   NC3 16: 10              10                 0            
           CMD COUNT = 0. Sense(10)
   NC4 17: 10              10                 0            
           CMD COUNT = 0. Sense(10)
   MPA 20: 2034            2034               0            
   MPD 21: 100000          100000             0            

   EXTENDED SENSE BYTE DATA NOT SUPPLIED FOR THIS ENTRY

   DEVICE STATISTICS AT TIME OF ERROR:
   # OF READS:     0.      # OF WRITES:    0.      # OF SEEKS:     0.
   # SOFT READ ERRORS:     0.      # SOFT WRITE ERRORS:    0.
   # HARD READ ERRORS:     1.      # HARD WRITE ERRORS:    0.
   # SOFT POSITIONING ERRORS:      0.
   # HARD POSITIONING ERRORS:      0.
   # OF MPE:  0.   # OF NXM:  0.   # OF OVERRUNS:  0.

|  The soft read errors and hard read errors in this entry are counted as
|  of the last volume mount.

                                   SHORT

   161. 15:42:02 MT000 TU78 SERIAL #0175. OPERATOR RUNNING SYSJOB
                       CONI RH= 0,222415 CHN STS= 540100,272775 SR= 0,4415
                       ER= 0,30004 FILE/RECORD 0./1.



   5.3.4  DX20 Device Error

   When the monitor detects an error in any portion of the MASSBUS system
   connected  to  the  DX20  tape  controller,  the  DX20 Device Error is
   recorded in the system event file.

   This entry contains the octal values of the CONI and  DATAI  from  the
   controller  both  when the error was first detected and after the last
   retry.



   5.3.5  Drive Statistics Entries

   Drive Statistics Entries are written into the system event file to
   record the activity on a drive.  For example, mounts and dismounts,
   reloads, and drive shutdowns are recorded as drive statistics.

                                    FULL

   ***********************************************
   DRIVE STATISTICS ENTRIES
    LOGGED ON  5-Oct 10:52:28      MONITOR UPTIME WAS 367.
           DETECTED ON SYSTEM # 2137.
           RECORD SEQUENCE NUMBER: 361.
   ***********************************************

           Volume ID: SPARE  Reason recorded: Disk pack mount

           Channel info(CDB): RH20 # 4 on PI level 5
           Device info(UDB):  RP20,  DP401 PIA: 0

                      READS    WRITES     SEEKS

           TOTAL :        8.                  1.

   ***********************************************
   DRIVE STATISTICS ENTRIES
    LOGGED ON  5-Oct 11:20:24      MONITOR UPTIME WAS 5454.
           DETECTED ON SYSTEM # 2137.
           RECORD SEQUENCE NUMBER: 374.
   ***********************************************

           Volume ID: CDM    Reason recorded: Magtape unload

           Channel info(CDB): RH20 # 3 on PI level 5
           Device info(UDB):  TU70, MTA1, MT301 PIA: 0

                      READS    WRITES

           TOTAL :   353600.  7610560.

           NRZI  :
           PE    :   353600.  7610560.
           GCR   :

                                   SHORT

   361. 10:52:28 STATS DRIVE: DP401 VOLID: SPARE  REASON: Disk pack mount.
   374. 11:20:24 STATS DRIVE: MT301 VOLID: CDM    REASON: Magtape unload.


   5.3.6  Configuration Status Change

   The monitor records a Configuration  Status  Change  when  the  system
   operator  takes  disk  units and/or sections of core memory on-line or
   off-line, thus changing the configuration of the system.   The  system
   operator   can   give   a   2-character   reason  for  the  change  in
   configuration.  The following codes are suggested:

        PM - preventive maintenance

        CM - corrective maintenance

        DN - unit is down

        OT - other

   This entry lists what device was affected, what action was taken,  and
   where  the  action  was  performed (channel number, controller number,
   unit number).

                                  CAUTION

           When the system operator adds memory  to  the  system,
           the  monitor  checks to verify the availability of the
           specified addresses.  Mistakes  are  reported  to  the
           operator at the operator's terminal, CTY; however, the
           error-logging system treats these as  valid  NXMs  and
           records them as NXM entries.  You can identify an NXM
           entry of this type by the fact that no physical memory
           is off-line and the user's directory is [1,2].

                                    FULL

   ***********************************************
   CONFIGURATION STATUS CHANGE
    LOGGED ON Mon 23 Jun 80 08:50:21      MONITOR UPTIME WAS 2 DAYS  8:34:54
           DETECTED ON SYSTEM # 2137.
           RECORD SEQUENCE NUMBER: 1.
   ***********************************************
    DETACH TU72 S/N:28410
    AS MTA2 AT CHANNEL #0 CONTROLLER #0 UNIT #2
    REASON:

                                   SHORT

   SEQ    TIME    Mon 23 Jun 80

   1. 08:50:21 DETACH TU72 S/N:28410 AS MTA2 AT CHANNEL #0 CONTROLLER #0
                       UNIT #2 REASON:


   5.3.7  System Log Entry

   The monitor records a System Log Entry when the system operator enters
   a log entry into the system event file with the OPR program.

   A system operator, or anyone with operator  privileges,  can  make  an
   entry into the system event file by doing the following:

        1.  Run the OPR program:

                 @OPR<RET>
                 OPR>

        2.  When you see the prompt, specify the REPORT command:

                 OPR>REPORT<RET>

        3.  Use the following syntax:

                 OPR>REPORT user text <RET>

            where user can be a directory name and/or a device name, and
            text can be a single-line or multiple-line response.

   For more information on OPR, refer to the TOPS-20  Operator's  Command
   Language Reference Manual.

                                    FULL

   ***********************************************
   SYSTEM LOG ENTRY           
    LOGGED ON Tue 1 Jul 80 11:37:37      MONITOR UPTIME WAS  0:09:48
           DETECTED ON SYSTEM # 2116.
           RECORD SEQUENCE NUMBER: 32.
   ***********************************************
   ENTRY CREATED BY:
           JOB #, TTY #:   11,17
           DIRECTORY:      SCHMITT
           WHO:            SCHMIT
           DEV:            NUL
           MESSAGE:        : testing

                                   SHORT

   SEQ    TIME    Tue 1 Jul 80

   32. 11:37:37 SYSTEM LOG ENTRY BY SCHMIT FOR DEVICE NUL ON TTY # 17
                       MESSAGE: : testing



   5.3.8  Front-End Device Report

|  You find a Front-End Device Report in the system event file  when  the
   front  end  passes a packet of error information to the monitor across
   the DTE-20.  This information contains errors detected  by  the  front
   end  and  KLCPU hardware and software.  Currently, entries are created
   for the following devices:  LP20,  CD20,  DH11,  KLCPU,  KLERROR,  and
   KLINIK.

   If the FORK # and JOB # associated with the error  are  777777,777777,
   this indicates that the TOPS-20 monitor knows of this device but it is
   not currently assigned to any fork or job.  If the FORK #  and  JOB  #
   are  777776,777776,  this  indicates  that  the  monitor does not know
   anything about this device.
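
   The two reserved FORK # and JOB # pairs can be told apart with a
   simple comparison.  The following Python sketch classifies the field
   as it appears in an entry; the function name and the returned labels
   are hypothetical, and the sketch assumes the field is printed as two
   comma-separated values.

```python
def classify_owner(field):
    """Classify a 'FORK #,JOB #' field from a Front-End Device Report."""
    fork, job = (part.strip() for part in field.split(","))
    if fork == job == "777777":
        return "known device, not assigned to any fork or job"
    if fork == job == "777776":
        return "device unknown to the monitor"
    return "assigned"


for field in ("777777,777777", "777776,777776", "72,0"):
    print(field, "->", classify_owner(field))
```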

   The front end generates  a  standard-status  word  for  each  transfer
   across  the DTE-20.  The ERROR LOG REQUEST bit in this word causes the
   packet to be recorded into the system event file.

   The information in the entry varies depending on the type of device
   being reported on.  If SPEAR does not know how to list a device, this
   fact is stated in the entry, and the data is listed in octal.



   5.3.9  Front End Reloaded

   Each time the KLCPU detects that the front end has halted or is in a
   loop, a Front End Reloaded entry is recorded in the system event file.
   The KL attempts to copy a crash dump file onto disk from the front
   end's memory and then reboots the front end.

   The front-end number is the logical  address  of  the  front  end  and
   indicates  whether this front end is privileged.  The status at reload
   describes, in  text,  any  errors  that  occurred  during  the  reboot
   process.   The  file name of the core dump is listed if the crash dump
   was successful.

                                    FULL

   ***********************************************
   FRONT END RELOADED         
    LOGGED ON Tue 1 Jul 80 00:18:51      MONITOR UPTIME WAS  0:02:24
           DETECTED ON SYSTEM # 2102.
           RECORD SEQUENCE NUMBER: 126.
   ***********************************************
           FRONT END #:    0
           STATUS AT RELOAD:        NO ERROR BITS DETECTED
           RETRIES:        0
           REASON FOR RELOAD:      B03
           FILENAME FOR DUMP:      <SYSTEM>0DUMP11.BIN.17, 1-Jul-80 00:18:45

                                   SHORT

   SEQ    TIME    Tue 1 Jul 80

   126. 00:18:51 FRONT END RELOAD ON PDP11 #0 RELOAD STATUS,,RETRIES 0,0
                       PDP11 HALT CODE B03



   5.3.10  Processor Parity Trap

   The monitor records a Processor Parity Trap each time a page-fail trap
   occurs  in  the  CPU  as  a result of an AR, ARX, or PAGE TABLE parity
   error.

   The information contained in the GOOD DATA WORD is valid only  if  the
   error  is  recoverable;  otherwise, the data is 0,0 and the DIFFERENCE
   DATA is a copy of the BAD DATA WORD.  The DIFFERENCE is the result  of
   an XOR between the bad data and the good data words.  Note that if the
   user is unknown, 777777,777777 will be the FORK and JOB numbers.
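
   The DIFFERENCE calculation can be reproduced by hand or with a short
   program.  The following Python sketch treats each value as two octal
   halfwords separated by a comma, as they are printed in the entry; the
   helper names are hypothetical.

```python
def parse_word(text):
    """Parse 'left,right' octal halfwords into one 36-bit integer."""
    left, right = (int(part, 8) for part in text.split(","))
    return (left << 18) | right


def format_word(word):
    """Format a 36-bit integer as 'left,right' octal halfwords."""
    return f"{word >> 18:o},{word & 0o777777:o}"


def difference(bad, good):
    """XOR the bad data word with the good data word."""
    return format_word(parse_word(bad) ^ parse_word(good))


recoverable = difference("252525,252525", "525252,525252")
unrecoverable = difference("252525,252525", "0,0")
print(recoverable)
print(unrecoverable)
```

   With a good data word of 0,0 the XOR leaves the bad data word
   unchanged, which agrees with the statement that the DIFFERENCE is
   then a copy of the BAD DATA WORD.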

                                    FULL

   ***********************************************
   PROCESSOR PARITY TRAP      
    LOGGED ON Tue 8 Jul 80 11:14:04      MONITOR UPTIME WAS  8:51:58
           DETECTED ON SYSTEM # 2102.
           RECORD SEQUENCE NUMBER: 320.
   ***********************************************
   STATUS AT ERROR:
           BAD DATA DETECTED BY:   AR
           PAGE FAIL WD AT TRAP:   763000,313
           BAD DATA WORD:  252525,252525
           GOOD DATA WORD: 525252,525252
           DIFFERENCE:     777777,777777
           PHYSICAL MEM ADDR.
            AT FAILURE:    563003,277313
           RECOVERY:       CONT. USER
           RETRY COUNT:    1.
           CACHE IN USE

           FORK #  & JOB #:                53,17
           USER'S LOGGED IN DIR:   EIBEN
           PROGRAM NAME:           KLPAR1

                                   SHORT

   SEQ    TIME    Tue 8 Jul 80

   320. 11:14:04 PARITY TRAP  PAGE FAIL WORD;763000,313
                       PHYSICAL MEMORY ADDRESS;563003,277313
                       FAILURE TYPE,,RETRIES;40000,1



   5.3.11  Processor Parity Interrupt

   When the monitor detects an APR interrupt because of a  parity  error,
   it  records a Processor Parity Interrupt in the system event file.  It
   records the entry after it has scanned all physical memory looking for
   more  errors.   If the original error also generates a page-fail trap,
   the monitor also creates a Processor Parity Trap entry.

   The CONI APR and ERA values are the contents of these registers at the
   time of the first error.  The PC AT INTERRUPT value includes the flags
   in the left half.  The BASE PHYsical MEMory ADDRess AT FAILURE is from
   the right half of the contents of the ERA.
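
   In the sample entry for this section, the ERA is 36001,520314 and the
   reported base address is 1520314, which is wider than one 18-bit
   halfword and matches the low-order 22 bits of the ERA word.  The
   following Python sketch extracts the address on that assumption; the
   22-bit field width is inferred from the sample values, and the
   function name is hypothetical.

```python
ADDRESS_MASK = (1 << 22) - 1  # assumed width of the physical address field


def era_base_address(era_text):
    """Return the assumed address field of an ERA printed as 'left,right'."""
    left, right = (int(part, 8) for part in era_text.split(","))
    word = (left << 18) | right
    return word & ADDRESS_MASK


address = era_base_address("36001,520314")
print(f"{address:o}")
```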

   The # OF ERRORS on this sweep refers to the number  of  parity  errors
   during  this  sweep  of  physical  memory.   If the value is zero, the
   monitor did not detect any errors, and 777777,777777  is  the  logical
   AND  function  for  both  bad  addresses and bad data.  The logical OR
   function, in this case, is 0,0.
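
   The values reported for an empty sweep are the identity values of the
   two operations: an AND over no items leaves all bits set, and an OR
   over no items leaves all bits clear.  The following Python sketch
   shows this with a fold; the function name is hypothetical, and the
   addresses are those of the sample entry in this section.

```python
from functools import reduce

WORD_MASK = (1 << 36) - 1  # 777777,777777: all 36 bits set


def summarize(values):
    """Return (logical AND, logical OR) over a list of 36-bit values."""
    logical_and = reduce(lambda a, b: a & b, values, WORD_MASK)
    logical_or = reduce(lambda a, b: a | b, values, 0)
    return logical_and, logical_or


empty_and, empty_or = summarize([])
addr_and, addr_or = summarize([0o1520304, 0o1520314])
print(f"{empty_and:o} {empty_or:o}")
print(f"{addr_and:o} {addr_or:o}")
```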

   The  SYSTEM   MEMORY   CONFIGURATION   lists   the   physical   memory
   configuration  and any detected errors at the time of the first error.
   These are the results of S-BUS DIAGNOSTIC  FUNCTIONS  for  all  memory
   controllers on this CPU.

                                    FULL

   ***********************************************
   PROCESSOR PARITY INTERRUPT 
    LOGGED ON Tue 8 Jul 80 11:21:35      MONITOR UPTIME WAS  8:59:29
           DETECTED ON SYSTEM # 2102.
           RECORD SEQUENCE NUMBER: 323.
   ***********************************************
           CONI APR:       7740,413 = MB PAR ERR,
           ERA:            36001,520314 = WD #0 CACHE WRITE
           BASE PHY. MEM ADDR.
            AT FAILURE:    1520314

           PC FLAGS AT INTERRUPT:  300000,0
           PC AT INTERRUPT:        67320
           # ERRORS ON THIS SWEEP  2.
           LOGICAL AND OF
           BAD ADDRESSES:  1,520304
           LOGICAL OR OF
           BAD ADDRESSES:  1,520314
           LOGICAL AND OF 
           BAD DATA:       252525,252525
           LOGICAL OR OF
           BAD DATA:       252525,252525
   SYSTEM MEMORY CONFIGURATION:
   CONTROLLER:  #0  MB20  128 K
   F0:     6000,0  F1:     36300,36012
           INTERLEAVE MODE:        4-WAY
           REQ ENABLED:    0 2 
           LOWER ADDRESS BOUNDARY: 0
           UPPER ADDRESS BOUNDARY: 777777
           ERRORS DETECTED:        NONE
   CONTROLLER:  #1  MB20  128 K
   F0:     6000,0  F1:     36300,36005
           INTERLEAVE MODE:        4-WAY
           REQ ENABLED:    1 3
           LOWER ADDRESS BOUNDARY: 0
           UPPER ADDRESS BOUNDARY: 777777
           ERRORS DETECTED:        NONE
   CONTROLLER:  #2  MB20  128 K
   F0:     6000,0  F1:     36301,36012
           INTERLEAVE MODE:        4-WAY
           REQ ENABLED:    0 2 
           LOWER ADDRESS BOUNDARY: 1000000
           UPPER ADDRESS BOUNDARY: 1777777
           ERRORS DETECTED:        NONE
   CONTROLLER:  #3  MB20  128 K
   F0:     6000,0  F1:     36301,36005
           INTERLEAVE MODE:        4-WAY
           REQ ENABLED:    1 3
           LOWER ADDRESS BOUNDARY: 1000000
           UPPER ADDRESS BOUNDARY: 1777777
           ERRORS DETECTED:        NONE
   CONTROLLER:  #10  MF20  
   F0:     26123,277313    F1:     500,1000
   LAST WORD REQUEST:      RQ3 WRITE 
   LAST ADDRESS HELD:      3277313
   CONTROLLER STATUS:       SF2 & SF1= 2
           ERRORS DETECTED:        WRITE PARITY
   CONTROLLER:  #11  MF20  
   F0:     7747,631734     F1:     500,1000
   LAST WORD REQUEST:      RQ0RQ1RQ2RQ3- READ 
   LAST ADDRESS HELD:      7631734
   CONTROLLER STATUS:       SF2 & SF1= 2
           ERRORS DETECTED:        NONE
   ERRORS DETECTED DURING SWEEP:
   ADDRESS BAD DATA        GOOD DATA       DIFFERENCE
   1520304 252525,252525           GOOD DATA NOT FOUND
   1520314 252525,252525           GOOD DATA NOT FOUND

                                   SHORT

   SEQ    TIME    Tue 8 Jul 80

   323. 11:21:35 PARITY INTERRUPT-CONI APR;7740,413 ERA;36001,520314
                       PC AT INTERRUPT;0,67320 # OF ERRORS;2.



   5.3.12  KL CPU Status Block

   This entry is written into ERROR.SYS on TOPS-20 if KLSTAT is turned on
   at the time of a system crash.  (See Section 4.5.1 for this
   procedure.)

   At the time of a crash,  a  snapshot  of  the  condition  of  all  the
   components  of  the  CPU  (such  as  controllers, channels, RH20s, the
   pager, and so  forth)  is  taken.   When  the  system  recovers,  this
   information  is  extracted  from  the CRASH.EXE file and written as an
   entry  in  ERROR.SYS.   This  entry  displays  the  condition  of  the
   registers and channels at the time of the crash.

                                    FULL

   ***********************************************
   KL CPU STATUS BLOCK        
    LOGGED ON Mon 15 Sep 80 15:03:19      MONITOR UPTIME WAS 17:49:02
           DETECTED ON SYSTEM # 2137.
           RECORD SEQUENCE NUMBER: 26.
   ***********************************************
   APRID = 600236,364131
   CONI APR = 7740,3
   RDERA = 202000,132276
   CONI PI = 0,2377
   DATAI PAG = 701000,3201
   CONI PAG = 0,660124
   CONI RH0 THRU RH7 
           000000,,002445  000000,,002445  000000,,002445  000000,,002445
           000000,,002000  000000,,002000  000000,,002000  000000,,002000
   CONI DTE0 THRU DTE3 
           000000,,001016  000000,,101016  000000,,002000  000000,,002000
   EPT LOCATIONS 0 THRU 37 (CHANNEL LOGOUT AREA)
           200000,,225566  540100,,225567  620003,,477000  254340,,726001
           200000,,074442  500000,,074443  600000,,460000  254340,,726421
           200000,,075064  500000,,075065  600001,,053000  254340,,727011
           200000,,075522  500000,,075523  600001,,573000  254340,,727501
           000000,,000000  000000,,000000  000000,,000000  000000,,000000
           000000,,000000  000000,,000000  000000,,000000  000000,,000000
           000000,,000000  000000,,000000  000000,,000000  000000,,000000
           000000,,000000  000000,,000000  000000,,000000  000000,,000000
   EPT LOCATIONS 140 THRU 177 (DTE CONTROL BLOCKS)
           241000,,223711  241000,,730250  254340,,002135  000000,,000000
           000000,,000000  000000,,223434  000000,,000030  000000,,223516
           000000,,000000  041000,,731556  254340,,002147  000000,,000000
           000000,,000226  000000,,223433  000000,,000030  000000,,223546
           000000,,000000  000000,,000000  000000,,000000  000000,,000000
           000000,,000000  000000,,000000  000000,,000000  000000,,000000
           000000,,000000  000000,,000000  000000,,000000  000000,,000000
           000000,,000000  000000,,000000  000000,,000000  000000,,000000
   UPT LOCATIONS 424 THRU 427 (UUO AREA)
           310100,,057200  000000,,700000  000000,,000000  601000,,003201
   UPT LOCATIONS 500 THRU 503 (PAGE FAIL AREA)
           411000,,742000  000000,,000162  000006,,611327  000000,,027543
   AC BLOCK 6 LOCATIONS 0 THRU 3 AND 12
           000770,,000007  301000,,002520  000000,,127000  000000,,153764
           011003,,276223
   AC BLOCK 7 LOCATIONS 0 THRU 2
           000000,,000000  000000,,000000  000000,,000000

   SBDIAG FUNCTIONS
           CTRLR   FUNCTION 0      FUNCTION 1
           0       006000,,000000  036300,,036012  
           1       006000,,000000  036300,,036005  
           10      007743,,201500  000500,,001000

                                   SHORT

   SEQ    TIME    Mon 15 Sep 80

   26. 15:03:19 KL CPU STATUS BLOCK  APRID = 600236,364131
                       CONI APR = 7740,3 RDERA = 202000,132276
                       CONI PAG = 0,660124 DATAI PAG = 701000,3201 



   5.3.13  MF20 Device Report

   This entry is written to ERROR.SYS when a MOS memory error occurs.
   The monitor calls a program named TGHA every time a MOS memory error
   occurs; TGHA is responsible for recovering from the error.  If TGHA
   places memory off-line or substitutes a spare bit, these events are
   recorded as an entry in ERROR.SYS.  The TGHA entry is actually an
   ASCII text report describing the attempt to recover from an error in
   MOS memory.

                                    FULL

   ***********************************************
   MF20 DEVICE REPORT         
    LOGGED ON Mon 30 Jun 80 10:02:41      MONITOR UPTIME WAS 1 DAY 11:39:06
           DETECTED ON SYSTEM # 2102.
           RECORD SEQUENCE NUMBER: 21.
   ***********************************************
           TEXT FROM TGHA:  
    
   A NEW MF20 KNOWN ERROR HAS BEEN DECLARED. DATA:
   STORAGE MODULE SERIAL NUMBER: 8320021 
   BLOCK: 3, SUBBLOCK: 1, BIT IN FIELD (10): 5,

   ROW: 174, COLUMN: 52, E NUMBER: 109, ERROR TYPE: CELL

                                   SHORT

   SEQ    TIME    Mon 30 Jun 80

   21. 10:02:41 MF20 REPORT



   5.3.14  KLERR Front End Device Report

   The following entry is written into the system event file when the KL
   clock stops for any of several errors (FAST MEMORY PARITY ERROR, CRAM
   PARITY ERROR, DRAM PARITY ERROR, or FIELD SERVICE STOP).  Any
   significant error signal is listed just after the header.
|  
|  
|  
|  5.4  DECNET ENTRIES (V2.1)
|  
   The following sections list both the FULL and SHORT versions of the
|  network entries (Version 2.1) that TOPS-10 or TOPS-20 can record in
   the system event file.



   5.4.1  Network Control Started

   Whenever NETCON is loaded and started, the monitor records  a  Network
   Control Started entry into the system event file.  This entry includes
   the version number and the node on which NETCON is running.

                                    FULL

   ***********************************************
   NETWORK CONTROL STARTED    
    LOGGED ON Mon 23 Jun 80 11:37:08      MONITOR UPTIME WAS 2 DAYS 11:21:41
           DETECTED ON SYSTEM # 2137.
           RECORD SEQUENCE NUMBER: 15.
   ***********************************************
           PROGRAM NAME:           NETCON
           PROGRAM VERSION:  4(22)
           NODE NAME:              KL2137

                                   SHORT

   SEQ    TIME    Mon 23 Jun 80

   15. 11:37:08 NCU STARTED PROGRAM: NETCON VER:4(22)
                       STARTED ON NODE KL2137


   5.4.2  Network Up-Line Dump

   Whenever NETCON dumps a node, the monitor records the name of the node
   involved,  the  line used, the dump-file specification, and any return
   code as a Network Up-Line Dump entry in the system event file.

                                    FULL

   ***********************************************
   NETWORK UP-LINE DUMP       
    LOGGED ON Mon 23 Jun 80 11:07:53      MONITOR UPTIME WAS 2 DAYS 10:52:26
           DETECTED ON SYSTEM # 2137.
           RECORD SEQUENCE NUMBER: 11.
   ***********************************************
           TARGET NODE NAME:       DN20L
           SERVER NODE NAME:       KL2137
           SERVER LINE DESIG.:     DTE20_1_0
           FILE NAME DUMPED:       PS:<SROBINSON>DN20L-R4-26.DMP

                                   SHORT

   SEQ    TIME    Mon 23 Jun 80

   11. 11:07:53 UP-LINE DUMP OF NODE DN20L BY NODE KL2137
                       LINE DESIGNATION DTE20_1_0
                       FILE DUMPED TO PS:<SROBINSON>DN20L-R4-26.DMP



   5.4.3  Network Down-Line Load

   Whenever NETCON loads a node, the monitor records the name of the node
   involved,  the  line used, the load-file specification, and any return
   code as a Network Down-Line Load entry in the system event file.

                                    FULL

   ***********************************************
   NETWORK DOWN-LINE LOAD     
    LOGGED ON Mon 23 Jun 80 11:10:33      MONITOR UPTIME WAS 2 DAYS 10:55:06
           DETECTED ON SYSTEM # 2137.
           RECORD SEQUENCE NUMBER: 13.
   ***********************************************
           TARGET NODE NAME:       DN20L
           SERVER NODE NAME:       KL2137
           SERVER LINE DESIG.:     DTE20_1_0
           FILE NAME LOADED:       PS:<NEXT-RELEASE>DN20L-R4-26.SYS.1

                                   SHORT

   SEQ    TIME    Mon 23 Jun 80

   13. 11:10:33 DOWN-LINE LOAD OF NODE DN20L BY NODE KL2137
                       LINE DESIGNATION DTE20_1_0
                       FILE LOADED PS:<NEXT-RELEASE>DN20L-R4-26.SYS.1











   5.4.4  Network Hardware Error

   Whenever NETCON detects an error in any hardware device connected to a
   node, the monitor records this information as a Network Hardware Error
   in the system event file.


   5.4.5  Network CHECK11 Report

   Whenever the DN20 or DN200 is loaded, CHECK11 (a hardware test module)
   is  started.   All  messages  that CHECK11 issues at that time are
   recorded as one entry in the system event file.

   Note that the log data in this entry is an ASCIZ  CHECK11  message  of
   arbitrary length.

                                    FULL

   ***********************************************
   NETWORK CHECK11 REPORT     
    LOGGED ON Mon 23 Jun 80 11:09:56      MONITOR UPTIME WAS 2 DAYS 10:54:28
           DETECTED ON SYSTEM # 2137.
           RECORD SEQUENCE NUMBER: 12.
   ***********************************************
           MSG SENT FROM:  KL2137
           MSG REC'D AT:  KL2137
           HDWR TYPE:  UNKN   SOFTWARE TYPE:  UNKN 
            PARENT SYSTEM TYPE:  UNKN 
            MSG SEQUENCE # FROM XMIT NODE: 2.
   TEXT FROM CHK11 REPORT: 

   CHK11 HARDWARE TEST
   version 2A(21) of 10-AUG-79 by LDW
   Testing begins...

   THE PROCESSOR SEEMS TO BE A KD11-E (11/34)
      CHK11 EXPECTED AN 11/34

   KT11 memory management test

   PHYSICAL MEMORY HAS ABSOLUTE LIMITS OF 
      0 - 757777
      FOR A TOTAL OF 124KW (DECIMAL)

   MAPPED PHYSICAL MEMORY TEST...
         ...COMPLETE

   KW11-L checked

   device scan report assumes
          DN20
          DN21
          DN25 fixed assignments (no floating)
   1 Fixed DTE20 at 174440, vector at 774
   1 Fixed KMC11 at 160540, vector at 540
   2 Fixed DUP11s from 160300, vector at 570
   2 Fixed DMC11s from 160740, vector at 670

      CHK11 complete

                                   SHORT

   SEQ    TIME    Mon 23 Jun 80

   12. 11:09:56 NETWORK CHECK11 REPORT 
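
   The memory total in the sample CHECK11 report can be checked by hand.
   The sketch below assumes that the absolute limits are octal byte
   addresses and that a word is two bytes (16 bits), which is consistent
   with the total that CHECK11 prints.  The function name memory_kwords
   is hypothetical.

```python
BYTES_PER_WORD = 2      # assumption: 16-bit words, byte addressing
WORDS_PER_K = 1024


def memory_kwords(low_octal, high_octal):
    """Size in K words of an inclusive range of octal byte addresses."""
    low = int(low_octal, 8)
    high = int(high_octal, 8)
    size_bytes = high - low + 1
    return size_bytes // BYTES_PER_WORD // WORDS_PER_K


# Limits from the sample report: 0 - 757777
print(memory_kwords("0", "757777"))   # prints 124
```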








   5.4.6  Network Line Statistics

   Periodically, NETCON records the status of each  communications  line,
   and this information becomes an entry in the system event file.

                                    FULL

   ***********************************************
   NETWORK LINE STATISTICS    
    LOGGED ON Mon 16 Jun 80 08:34:19      MONITOR UPTIME WAS  0:34:48
           DETECTED ON SYSTEM # 2137.
           RECORD SEQUENCE NUMBER: 1.
   ***********************************************
           MSG SENT FROM:  DN20L
           MSG REC'D AT:  KL2137
           HDWR TYPE:  DTE-20  SOFTWARE TYPE:  UNKN 
            PARENT SYSTEM TYPE:  UNKN 
           LINE ID:  DTE_1_0_0

           REASON FOR ENTRY:  PERIODIC ENTRY
   1802.   SECONDS SINCE LAST ZEROED
   808.    BLOCKS RECEIVED
   814.    BLOCKS SENT
   0.      NON - LINE ERROR RETRANSMISSIONS

                                   SHORT

   SEQ    TIME    Mon 16 Jun 80

   1. 08:34:19 NETWORK LINE COUNTERS  FROM NODE DN20L FOR LINE DTE_1_0_0
                       LINE ERROR RETRANS  RECV LINE ERRORS
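
   The counter lines in the FULL entry have a regular shape: a number
   followed by a period, then the counter name.  The following sketch
   collects them into a dictionary and derives a rough block rate.  It is
   an illustration only; parse_counters is a hypothetical name, and the
   sketch assumes that every counter line looks like the ones in the
   sample above.

```python
import re

# A counter line is "<digits>." followed by the counter name.
COUNTER_RE = re.compile(r"^\s*(\d+)\.\s+(.+?)\s*$")


def parse_counters(lines):
    """Map counter name to integer value for every counter line found."""
    counters = {}
    for line in lines:
        match = COUNTER_RE.match(line)
        if match:
            counters[match.group(2)] = int(match.group(1))
    return counters


sample = """\
        REASON FOR ENTRY:  PERIODIC ENTRY
1802.   SECONDS SINCE LAST ZEROED
808.    BLOCKS RECEIVED
814.    BLOCKS SENT
0.      NON - LINE ERROR RETRANSMISSIONS
""".splitlines()

counters = parse_counters(sample)
elapsed = counters["SECONDS SINCE LAST ZEROED"]
rate = (counters["BLOCKS RECEIVED"] + counters["BLOCKS SENT"]) / elapsed
print(counters)
print("blocks per second: %.2f" % rate)
```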


































|  5.5  DECNET ENTRIES (V3.0)
|  
|  The DECnet V3.0 module Event Logger records  any  significant  network
   events in the system event file.  The headers for DECnet V3.0 entries
   have the title:

        PHASE III DECNET ENTRY

   The body of each entry contains numbers that  correspond  to  specific
   event classes and event types.  Tables 5-1 and 5-2 list the meaning of
   the numbers in the entry.  Refer to Section 4.3.3 for  information  on
   how to RETRIEVE network entries by event class.
|  
|  
|  Table 5-1:  Network Event Classes
|  
|  
           Event Class       Description


                 0           Network Management Layer
                 1           Applications Layer
                 2           Session Control Layer
                 3           Network Services Layer
                 4           Transport Layer
                 5           Data Link Layer
                 6           Physical Link Layer
             7-31            Reserved for other common event classes
             32-63           Reserved for RSTS specific event classes
             64-95           Reserved for RSX specific event classes
             96-127          Reserved for TOPS-20 specific event
                             classes
             128-159         Reserved for VMS specific event classes
             160-191         Reserved for RT specific event classes
             192-479         Reserved for future use
             480-511         Reserved for Customer specific event
                             classes
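
   Because most of Table 5-1 consists of reserved ranges, a lookup by
   range is a natural way to turn an event class number into its
   description.  The following is a small sketch that transcribes the
   table; the names EVENT_CLASS_RANGES and describe_event_class are
   hypothetical and are not part of SPEAR.

```python
# (lowest class, highest class, description), transcribed from Table 5-1.
EVENT_CLASS_RANGES = [
    (0, 0, "Network Management Layer"),
    (1, 1, "Applications Layer"),
    (2, 2, "Session Control Layer"),
    (3, 3, "Network Services Layer"),
    (4, 4, "Transport Layer"),
    (5, 5, "Data Link Layer"),
    (6, 6, "Physical Link Layer"),
    (7, 31, "Reserved for other common event classes"),
    (32, 63, "Reserved for RSTS specific event classes"),
    (64, 95, "Reserved for RSX specific event classes"),
    (96, 127, "Reserved for TOPS-20 specific event classes"),
    (128, 159, "Reserved for VMS specific event classes"),
    (160, 191, "Reserved for RT specific event classes"),
    (192, 479, "Reserved for future use"),
    (480, 511, "Reserved for Customer specific event classes"),
]


def describe_event_class(event_class):
    """Return the Table 5-1 description for an event class number."""
    for low, high, description in EVENT_CLASS_RANGES:
        if low <= event_class <= high:
            return description
    raise ValueError("event class out of range: %d" % event_class)


print(describe_event_class(4))
print(describe_event_class(100))
```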




























|  Table 5-2:  Network Events
|  
|  
            Class   Type     Entity          Event Text


              0       0      none            Event records lost
              0       1      node            Automatic node counters
              0       2      line,circuit    Automatic data link
                                             counters
              0       3      line,circuit    Automatic data link
                                             service
              0       4      line,circuit    Data link counters zeroed
              0       5      node            Node counters zeroed
              0       6      line,circuit    Passive loopback
              0       7      line,circuit    Aborted service request

              2       0      none            Local node state change
              2       1      none            Access control reject

              3       0      none            Invalid message
              3       1      none            Invalid flow control
              3       2      node            Data base reused

              4       0      none            Aged packet loss
              4       1      circuit         Node unreachable packet
                                             loss
              4       2      circuit         Node out-of-range packet
                                             loss
              4       3      circuit         Oversized packet loss
              4       4      circuit         Packet format error
              4       5      circuit         Partial routing update
                                             loss
              4       6      circuit         Verification reject
              4       7      circuit         Circuit down, circuit
                                             fault
              4       8      circuit         Circuit down, software
                                             fault
              4       9      circuit         Circuit down, operator
                                             fault
              4       10     circuit         Circuit up
              4       11     circuit         Initialization failure,
                                             circuit fault
              4       12     circuit         Initialization failure,
                                             software fault
              4       13     circuit         Initialization failure,
                                             operator fault
              4       14     node            Node reachability change

              5       0      line,circuit    Locally initiated state
                                             change
              5       1      line,circuit    Remotely initiated state
                                             change
              5       2      line,circuit    Protocol restart received
                                             in maintenance mode
              5       3      line,circuit    Send error threshold
              5       4      line,circuit    Receive error threshold
              5       5      line,circuit    Select error threshold
              5       6      line,circuit    Block header format error
              5       7      line,circuit    Selection address error
              5       8      line,circuit    Streaming tributary
              5       9      line,circuit    Local buffer too small

              6       0      line            Data set ready transition
              6       1      line            Ring indicator transition
              6       2      line            Unexpected carrier
                                             transition
              6       3      line            Memory access error
              6       4      line            Communications interface
                                             error
              6       5      line            Performance error

|  The following are examples of three DECnet Version 3.0 entries
   in FULL format:


   ***********************************************
   PHASE III DECNET ENTRY
    LOGGED ON 7-Dec 03:01:49      MONITOR UPTIME WAS 0 DAY(S) 9:9:33
           DETECTED ON SYSTEM # 2102.
           RECORD SEQUENCE NUMBER: 19.
   ***********************************************

   Event type 4.10  Line up
   From node 118. (MCB), occurred 7-DEC-1981 0:00:00.400
   CIRCUIT = DMC-0

     NODE = 121


   ***********************************************
   PHASE III DECNET ENTRY
    LOGGED ON 7-Dec 03:01:50      MONITOR UPTIME WAS 0 DAY(S) 9:9:35
           DETECTED ON SYSTEM # 2102.
           RECORD SEQUENCE NUMBER: 20.
   ***********************************************

   Event type 4.14  Node reachability change
   From node 118. (MCB), occurred 7-DEC-1981 0:00:00.466
   REMOTE NODE = 103 ()

     STATUS = REACHABLE


   ***********************************************
   PHASE III DECNET ENTRY
    LOGGED ON 7-Dec 03:02:02      MONITOR UPTIME WAS 0 DAY(S) 9:9:47
           DETECTED ON SYSTEM # 2102.
           RECORD SEQUENCE NUMBER: 21.
   ***********************************************

   Event type 5.3  Send error threshold
   From node 118. (MCB), occurred 7-DEC-1981 0:00:18.000
   CIRCUIT = KDP-0-0





|  The following are the same three DECnet Version 3.0  entries,  listed
   in SHORT format:

   19. 03:01:49 DECNET Event type 4.10  Line up
                       From node 118. (MCB)
                       occurred 7-DEC-1981 0:00:00.400


   20. 03:01:50 DECNET Event type 4.14  Node reachability change
                       From node 118. (MCB)
                       occurred 7-DEC-1981 0:00:00.466


   21. 03:02:02 DECNET Event type 5.3  Send error threshold
                       From node 118. (MCB)
                       occurred 7-DEC-1981 0:00:18.000
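
   In both formats the event is identified as "Event type class.type",
   where the two numbers correspond to Tables 5-1 and 5-2.  The sketch
   below splits such a line into its class, type, and text, and then
   keeps only the entries of one class.  It is an illustration that works
   on report text; it is not how RETRIEVE itself selects entries, and the
   names parse_event_type and select_by_class are hypothetical.

```python
import re

# "Event type <class>.<type>  <event text>"
EVENT_RE = re.compile(r"Event type (\d+)\.(\d+)\s+(.*\S)")


def parse_event_type(line):
    """Return (event_class, event_type, text), or None if no event is named."""
    match = EVENT_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2)), match.group(3)


def select_by_class(lines, wanted_class):
    """Return the parsed events whose class equals wanted_class."""
    events = (parse_event_type(line) for line in lines)
    return [event for event in events
            if event is not None and event[0] == wanted_class]


short_report = [
    "19. 03:01:49 DECNET Event type 4.10  Line up",
    "20. 03:01:50 DECNET Event type 4.14  Node reachability change",
    "21. 03:02:02 DECNET Event type 5.3  Send error threshold",
]

for event in select_by_class(short_report, 4):
    print(event)
```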

|  The following DECnet entry lists packet header information:
|  
|  ***********************************************
|  PHASE III DECNET ENTRY
|   LOGGED ON 27-Feb-84 07:23:29-EST      MONITOR UPTIME WAS 1 DAY(S) 0:2:17
|          DETECTED ON SYSTEM # 2871.
|          RECORD SEQUENCE NUMBER: 120.
|  ***********************************************
|  
|  Event type 4.1  Node unreachable packet loss
|  From node 143. (GIDDN), uptime was 1 day(s) 16:56:39
|  
|    Packet Header = 2 / 142 / 143 / 6
|  
|  From left to right, the four fields listed with the packet header mean
|  the following:
|  
|       Field one (2)     - is  a  hexadecimal  value   one   byte   long
|                           representing the message flags.
|  
|       Field two (142)   - is a decimal (unsigned) value two bytes  long
|                           representing the destination node address.
|  
|       Field three (143) - is a decimal (unsigned) value two bytes  long
|                           representing the source node address.
|  
|       Field four (6)    - is  a  hexadecimal  value   one   byte   long
|                           representing the forwarding data.
|  
|  Note that if the packet is a control packet, the packet header contains
|  only  two  fields,  the  message flags (Field one) and the source node
|  address (Field three).
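
   The field rules above can be expressed as a short decoder.  This is a
   sketch only.  It assumes that the fields are separated by slashes as
   in the sample, and it assumes that a control packet is printed as
   "flags / source" in that order; the manual does not show such a line.
   The names PacketHeader and parse_packet_header are hypothetical.

```python
from typing import NamedTuple, Optional


class PacketHeader(NamedTuple):
    flags: int                    # field one, printed in hexadecimal
    destination: Optional[int]    # field two, decimal; None for control packets
    source: int                   # field three, decimal
    forwarding: Optional[int]     # field four, hexadecimal; None for control packets


def parse_packet_header(line):
    """Decode a 'Packet Header = a / b / c / d' line from a DECnet entry."""
    _, _, value = line.partition("=")
    fields = [field.strip() for field in value.split("/")]
    if len(fields) == 4:
        return PacketHeader(int(fields[0], 16), int(fields[1]),
                            int(fields[2]), int(fields[3], 16))
    if len(fields) == 2:
        # Assumed control-packet layout: message flags, then source node.
        return PacketHeader(int(fields[0], 16), None, int(fields[1]), None)
    raise ValueError("unexpected packet header: %r" % line)


header = parse_packet_header("  Packet Header = 2 / 142 / 143 / 6")
print(header)
```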
|  
|  For more information on network event parameters, see Appendix F.
|  
|  For more information concerning DECnet Version 3.0 entries,  refer  to
|  the DECnet documentation for system managers and operators.





















                                 APPENDIX A

                               SPEAR MESSAGES



   There are four general categories of SPEAR messages:  User  Validation
   Messages,  Dialogue  Usage  Messages, Warning Messages, and Event File
   Messages.  The following tables  list  these  messages  and  suggested
   actions.


   Table A-1:  User Validation Messages


     The following messages can occur because  of  an  error  on  the
     user's part.  Each message is preceded by the header:


|                         ?USER Validation failed
|  
|  
|    CODE or SEQUENCE not allowed in list of responses.
|  
|         You have selected CODE or SEQUENCE as  a  response  and  have
|         attempted to add another selection type.


     Does not match any valid response

          Typed a response that  did  not  match  any  of  the  valid
          responses.

     End time must be later than begin time

          Typed an ending date/time that is prior to or the same as the
|         beginning date/time in RETRIEVE.

     Invalid date format

          Typed date incorrectly.  The correct format is  dd-mmm-yy  or
          -dd.

     Invalid time format

          Typed time incorrectly.  The correct format is hh:mm:ss.

     Matches more than one valid response

          Typed a response that was not  unique.   Need  to  type  more
          characters before pressing the RETURN key or ESCAPE key.





     May not select all at this prompt

          You tried to select ALL when you must respond  with  specific
          names or numbers.


     No recognition for this prompt

          Typed ESCAPE key where  it  is  impossible  to  fill  in  the
          blanks.

     Not a valid name or number

          If a name, typed a special character or more than the maximum
          number of characters.  If a number, typed a special character
          or alphabetic character or more than the  maximum  number  of
          digits.

     That function is not available

|         You typed a function name that does not  exist  in  the  same
|         directory as SPEAR.
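
   Several of the messages in Table A-1 follow from simple rules: a
   response must abbreviate exactly one valid response, dates and times
   must use the stated formats, and the end time must be later than the
   begin time.  The following sketch shows one way such checks could be
   written.  It is not SPEAR's implementation; the function names and the
   sample response list are hypothetical, two-digit years are assumed to
   be 19yy, and the relative date format (-dd) is not handled.

```python
import re
from datetime import datetime

MONTHS = "JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC".split()


def recognize(typed, valid_responses):
    """Expand an abbreviation to the one valid response it matches."""
    key = typed.strip().upper()
    exact = [r for r in valid_responses if r.upper() == key]
    if exact:
        return exact[0]
    matches = [r for r in valid_responses if r.upper().startswith(key)]
    if not matches:
        raise ValueError("Does not match any valid response")
    if len(matches) > 1:
        raise ValueError("Matches more than one valid response")
    return matches[0]


def parse_date(text):
    """Parse dd-mmm-yy into (year, month, day)."""
    match = re.fullmatch(r"(\d{1,2})-([A-Za-z]{3})-(\d{2})", text.strip())
    if match is None or match.group(2).upper() not in MONTHS:
        raise ValueError("Invalid date format")
    month = MONTHS.index(match.group(2).upper()) + 1
    return 1900 + int(match.group(3)), month, int(match.group(1))


def parse_time(text):
    """Parse hh:mm:ss into (hour, minute, second)."""
    match = re.fullmatch(r"(\d{1,2}):(\d{2}):(\d{2})", text.strip())
    if match is None:
        raise ValueError("Invalid time format")
    return tuple(int(group) for group in match.groups())


def check_period(begin_date, begin_time, end_date, end_time):
    """Return (begin, end) as datetimes, or raise with the Table A-1 text."""
    begin = datetime(*parse_date(begin_date), *parse_time(begin_time))
    end = datetime(*parse_date(end_date), *parse_time(end_time))
    if end <= begin:
        raise ValueError("End time must be later than begin time")
    return begin, end


responses = ["ERROR", "EARLIEST", "STATISTICS"]   # illustrative list only
print(recognize("er", responses))
print(check_period("23-Jun-80", "11:07:53", "23-Jun-80", "11:37:08"))
```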




   Table A-2:  Dialogue Usage Messages


     The following messages can occur when you are  responding  to  the
     dialogue  incorrectly.  They are meant to give you some insight as
     to what the correct response is to the current prompt.


     Not one of the recognized types

          At RETRIEVE level, when specifying a device, you  typed  a  ?
          after  typing  a few characters.  SPEAR did not recognize the
          device as one of its physical devices.

     Please select function first

          Typed a switch that  requires  some  function  to  have  been
          selected  first  (for  example,  /GO  or /SHOW) at the SPEAR>
          prompt.

     Unable to complete this response

          You typed an ESCAPE to a prompt that SPEAR does not know  how
          to  complete.   This is true whenever the response is not one
          of a fixed list of possible responses, for example,  time  of
          day or file specification.

     No default response for this prompt

          Typed the ESCAPE key or another delimiter  where  there  is  no
          default (at SPEAR> prompt, for example).




   Table A-3:  Warning Messages


     The following is a list of warning messages you may receive during
     a  SPEAR operation.  Each message is introduced with the following
     sentence:


           -- The following should be noted before proceeding --


     Impossible to input event records from the terminal!

          You specified TTY:  in response  to  a  request  for  a  file
          specification.

     The input file will be superseded!

          In RETRIEVE, you named the output file the same name  as  the
          input file.  This means you will overwrite your input file if
          you proceed.

     Will overwrite input file with ASCII output!

          In RETRIEVE, you specified the same name for both  input  and
          output  files  and also specified ASCII as the output format.
          If you proceed, the input file  (which  is  binary)  will  be
          overwritten with ASCII output.

     Binary output to terminal is unreadable!

          In RETRIEVE, you requested the BINARY report format and  then
          specified TTY:  in response to Output to:

     Merging with self causes duplicate records!

          In RETRIEVE, you specified the same name for both  the  input
          file  and  the  merge  file.  If you proceed, you will end up
          with a file containing duplicate records.

     Will create an exact copy of the input file!

          In RETRIEVE, you selected all the events in the system  event
          file  and  then  requested  them  in  BINARY format.  If you
          proceed, the output file will be  an  exact  duplicate  of  the
          system event file.

     Will create an empty output file!

          In  RETRIEVE,  you  have  excluded  everything   during   the
          selection process.

     This function can cause SEVERE system degradation!

          You have turned on the KLSTAT switch, which slows down  system
          operation  in  order to gather extra data for the system event
          file.




   Table A-4:  Event File Messages


     The following messages can occur as the result of an error in  the
     system event file.  Each of them indicates a recoverable error, and
     each is preceded by the following header:


        %SPEAR Event file error detected in module ____routine ____


     Bad header found - RESYNCHing

          Lost synchronization in file, resynchronizing  in  next  file
          block.  Some data has been lost.

     EOF encountered while skipping an entry

          Error file is truncated for some reason.  Some data has  been
          lost.

     Internal EOF found - RESYNCHing

          Internal end-of-file mark detected, but the  file  still  has
          data.   (This  can  happen if files are appended to each other.)
          No data is lost.

     Premature EOF detected in error file!

          Encountered an EOF in the middle of a header or entry.   File
          is truncated.  Some data is lost.


   You can also receive fatal error messages in the form:

        ?SPEAR Program error in module ____routine ____

   where the blanks are filled in with the module and routine names.

   These are SPEAR program errors over which you have no control.  If you
   receive  such  an  error,  fill  out  a  Software  Performance  Report
   describing the error and the situation leading up to the error.

   Another error over which you have no  control  is  an  error  from  an
   internal  program called XPORT.  XPORT does not identify itself in the
   message.  However,  the  message  is  preceded  by  a  question  mark,
   indicating,  in this case, that this is a fatal error.  If you receive
   an  XPORT  error  message,  you  should  also  fill  out  a   Software
   Performance Report.

   Other possible messages you can receive originate from  the  operating
   system.  For example:

|       ?SPEAR Monitor call failed         TOPS-20

        ?SCNxxx message                    TOPS-10

|  On TOPS-20, you should refer to the Monitor Calls Manual for a list of
   these   messages.    On   TOPS-10,   you  should  refer  to  the  SCAN
   documentation for a list of SCAN messages.

|  
|  
|  
|  
|  
|  
|  
|  
|  
|  
|  
|  
|                                APPENDIX B
|  
|                         INSTALLATION PROCEDURES
|  
|  
|  
|  B.1  INTRODUCTION
|  
|  SPEAR consists of RETRIEVE, SUMMARIZE, and KLSTAT functions.
|  
|  SPEAR is distributed on the TOPS-10 and TOPS-20  monitor  distribution
|  tape  in  two  savesets,  <DOCUMENTATION>  and <SUBSYS>, which together
|  contain all the SPEAR files.
|  
|  
|  
|  B.1.1  SPEAR Files
|  
|  The documentation files included in <DOCUMENTATION> for SPEAR are:
|  
|        o  SPEAR.DOC - SPEAR installation document
|  
|        o  DEFINE.LIS - Event file documentation
|  
|  The files included in <SUBSYS> for SPEAR are:
|  
|        o  SPEAR.SPE  - Help file used during user interface
|  
|        o  SPEAR.EXE  - User interface and main control routines
|  
|        o  RFB.EYE    - Internal definitions for RETRIEVE package
|  
|        o  MSGARG.SPT - Binary file for RETRIEVE package
|  
|        o  RETRFB.SPE - Text file for RETRIEVE package
|  
|        o  SPRRET.SPE - Text file for RETRIEVE package
|  
|        o  SPRRET.EXE - Error file manipulation and translation  package
|                        for RETRIEVE
|  
|        o  SPRSUM.SPE - Text file for SUMMARIZE package
|  
|        o  SPRSUM.EXE - Device summarization package for SUMMARIZE
|  
|  
|  
|  B.1.2  Loading and Installing SPEAR
|  
|  Both  the  documentation  saveset  <DOCUMENTATION>   and   the   SPEAR
|  executable  saveset  <SUBSYS>  are located on the <SUBSYS> area of the
|  monitor distribution  tape.   Therefore,  you  need  not  worry  about
|  installing  SPEAR  separately;  it is part of the monitor installation
|  package.
|  
|  All the files  listed  in  Section  B.1.1  must  reside  in  the  same
|  directory for SPEAR to operate properly.

|  
|  
|  
|  
|  
|  
|  
|  
|  
|  
|  
|  
|                                APPENDIX C
|  
|                        COMMAND AND CONTROL FILES
|  
|  
|  
|  Because of dialogue changes in RETRIEVE and  SUMMARIZE,  if  you  have
|  existing SPEAR V1.0 command or control files, you must change them for
|  SPEAR V2.0 or they will not run.
|  
|  For RETRIEVE, the changes from V1.0 to V2.0 are in the Selection type,
|  Error  and  Nonerror fields.  No changes are necessary if your command
|  or control file specified a Selection type of Error, All.  See Section
|  4.3.3 for the RETRIEVE dialogue changes.
|  
|  You can maintain the same functionality  for  an  error  selection  by
|  changing the V1.0 dialogue to the following V2.0 dialogue:
|  
|       SPEAR V1.0          SPEAR V2.0
|  
|       @SPEAR              @SPEAR
|       *RETRIEVE           *RETRIEVE
|       *SERR:ERROR.SYS     *SERR:ERROR.SYS
|       *INCLUDED           *INCLUDED
|       *ERROR              *ERROR
|       *DISK               *DISK
|       *RP06               *RP06
|       *FINISHED           *ALL (Here's the difference.)
|       *EARLIEST           *FINISHED
|       *LATEST             *EARLIEST
|       *DSK:RETRIE.RPT     *LATEST
|       */GO                *DSK:RETRIE.RPT
|  
|  To RETRIEVE the events for a specific device error type,  replace  the
|  ALL  in  the  previous V2.0 control file with one or more device error
|  types, for example, Software, Bus, Channel-controller.
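
   For an error selection, the only difference between the two dialogues
   above is the device error type response that V2.0 expects before
   FINISHED.  The following sketch applies that one change to the lines
   of a V1.0 control file.  It assumes the file has the same shape as the
   example above; the function name and the error_types parameter are
   hypothetical.

```python
def upgrade_retrieve_error_file(v1_lines, error_types="ALL"):
    """Insert the V2.0 device error type response before *FINISHED."""
    v2_lines = list(v1_lines)
    # list.index raises ValueError if the file has no *FINISHED line.
    position = v2_lines.index("*FINISHED")
    v2_lines.insert(position, "*" + error_types)
    return v2_lines


v1 = ["@SPEAR", "*RETRIEVE", "*SERR:ERROR.SYS", "*INCLUDED", "*ERROR",
      "*DISK", "*RP06", "*FINISHED", "*EARLIEST", "*LATEST",
      "*DSK:RETRIE.RPT", "*/GO"]

for line in upgrade_retrieve_error_file(v1):
    print(line)
```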
|  
|  For Nonerror selection, you can now select specific devices.   Instead
|  of Nonerror, specify Statistics, Configuration, Diagnostics, Other, or
|  a combination of these, separated by commas.
|  
|       SPEAR V1.0          SPEAR V2.0
|  
|       @SPEAR              @SPEAR
|       *RETRIEVE           *RETRIEVE
|       *SERR:ERROR.SYS     *SERR:ERROR.SYS
|       *INCLUDED           *INCLUDED
|       *NONERROR           *STATISTICS,DIAGNOSTICS (Change)
|       *EARLIEST           *DISK (Change)
|       *LATEST             *RA60,RA80,RA81 (Change)
|       *DSK:RETRIE.RPT     *FINISHED (Change)
|       */GO                *EARLIEST
|                           *LATEST
|                           *DSK:RETRIE.RPT
|                           */GO
|  
|  For SUMMARIZE, two new  prompts  have  been  added  to  the  dialogue,
|  Category  and  Show  Error  Distribution.   You  can maintain the same
|  functionality by changing the V1.0  dialogue  to  the  following  V2.0
|  dialogue:
|  
|       SPEAR V1.0          SPEAR V2.0
|  
|       @SPEAR              @SPEAR
|       *SUMMARIZE          *SUMMARIZE
|       *SERR:ERROR.SYS     *SERR:ERROR.SYS
|       *EARLIEST           *ALL (Change)
|       *LATEST             *EARLIEST
|       *DSK:SUMMAR.RPT     *LATEST
|       */GO                *YES (Change)
|                           *DSK:SUMMAR.RPT
|                           */GO
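
   The SUMMARIZE change amounts to two added responses: the Category
   after the event file name, and the Show Error Distribution answer
   before the output file name.  The following sketch performs that
   conversion for a control file with exactly the seven lines shown in
   the V1.0 column above.  The function name and its parameters are
   hypothetical.

```python
def upgrade_summarize_file(v1_lines, category="ALL", show_distribution=True):
    """Convert a seven-line V1.0 SUMMARIZE control file to V2.0."""
    if len(v1_lines) != 7 or v1_lines[1].upper() != "*SUMMARIZE":
        raise ValueError("unexpected V1.0 SUMMARIZE control file layout")
    command, function, event_file, begin, end, output, go = v1_lines
    answer = "YES" if show_distribution else "NO"
    return [command, function, event_file,
            "*" + category,      # new Category prompt
            begin, end,
            "*" + answer,        # new Show Error Distribution prompt
            output, go]


v1 = ["@SPEAR", "*SUMMARIZE", "*SERR:ERROR.SYS", "*EARLIEST", "*LATEST",
      "*DSK:SUMMAR.RPT", "*/GO"]

for line in upgrade_summarize_file(v1):
    print(line)
```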
|  
|  To get summaries for a specific device or class  of  devices,  replace
|  ALL in the previous V2.0 dialogue with device selection.  For example:
|  
|       SPEAR V2.0
|  
|       @SPEAR
|       *SUMMARIZE
|       *SERR:ERROR.SYS
|       *DISK
|       *RA60,RA80
|       *FINISHED
|       *EARLIEST
|       *LATEST
|       *YES
|       *DSK:SUMMAR.RPT
|       */GO
|  
|  To suppress the error distribution charts, change the YES to NO in the
|  dialogue.

|  
|  
|  
|  
|  
|  
|  
|  
|  
|  
|  
|  
|                                APPENDIX D
|  
|                               EVENT CODES
|  
|  
|  
|  The following table contains the current list of TOPS-10  and  TOPS-20
|  event  codes  along  with  their  internal  class.   The  dashes (---)
|  indicate that the event  code  does  not  exist  under  the  specified
|  operating system.
|  
|  
|  Table D-1:  TOPS-10 and TOPS-20 Event Codes
|  
|  
|    -10     Name                    -20     Internal        Subsystem
|    Code                            Code    Class
|  
|    001     SYSTEMRELOAD            101     ERROR           MONITOR
|    002     MONITORBUGDATA          102     ERROR           MONITOR
|    005     EXTRACTEDCRASHINFO      ---     ERROR           MONITOR
|    006     CHANNELERRORREPORT      ---     ERROR           MAINFRAME
|    007     DAEMONSTARTED           ---     CONFIG          SOFTWARE
|    010     OLD DISK ERROR          ---     ERROR           DISK
|    011     MASSBUSERR              111     ERROR           DISK/TAPE
|    012     DX20ERR                 ---     ERROR           DISK/TAPE
|    014     SOFTWAREEVENT           ---     ERROR           SOFTWARE
|    ---     STATISTICS              114     STATISTICS      DISK/TAPE
|    015     CONFIGCHANGE            115     CONFIG          (ALL)
|    016     SYSERRORLOG             116     ERROR           SOFTWARE
|    017     SOFTWAREREQDATA         ---     ERROR           SOFTWARE
|    021     TAPEERR                 ---     ERROR           TAPE
|    030     FEDEVICE-ERR            130     ERROR/CONFIG    MAIN/UNIT/COMM
|    031     FERELOAD                131     CONFIG          MAINFRAME
|    033     KSHALTSTATUS            133     ERROR           MAINFRAME
|    040     OLDDISKSTATS            ---     STATISTICS      DISK
|    042     TAPESTATS               ---     STATISTICS      TAPE
|    045     DISKSTATS               ---     STATISTICS      DISK
|    050     DLHARDWAREERROR         ---     ERROR           COMM
|    052     KLPARNXMINT             ---     ERROR           MAINFRAME
|    054     KSNXMTRAP               ---     ERROR           MAINFRAME
|    055     KLORKSPARTRAP           ---     ERROR           MAINFRAME
|    056     NXMMEMORYSWEEP          ---     ERROR           MAINFRAME
|    057     PARMEMORYSWEEP          ---     ERROR           MAINFRAME
|    061     CPUPARTRAP              160     ERROR           MAINFRAME
|    062     CPUPARINT               162     ERROR           MAINFRAME
|    063     KLCPUSTATUS             163     ERROR           CRASH
|    064     DEVICESTATUS            ---     ERROR           CRASH
|    ---     MF20ERR                 164     ERROR           MAINFRAME
|    066     OLDKLADDRESSFAIL        ---     ERROR           MAINFRAME
|    067     KLADDRESSFAIL           ---     ERROR           MAINFRAME
|    071     LP100ERR                ---     ERROR           UNITRECORD
|    072     HARDCOPYERR             ---     ERROR           UNITRECORD
|    201     NETCONSTARTED           201     CONFIG          NETWORK
|    202     NODEDOWNLINELOAD        202     CONFIG          NETWORK
|    203     NODEDOWNLINEDUMP        203     CONFIG          NETWORK
|    210     NETHARDWAREERR          210     ERROR           NETWORK
|    211     NETSOFTWAREERR          211     ERROR           NETWORK
|    220     NETOPRLOGENTRY          220     ERROR           NETWORK
|    221     NETTOPOLOGYCHANGE       221     CONFIG          NETWORK
|    222     NETCHECK11REPORT        222     CONFIG          NETWORK
|    230     NETLINESTATS            230     STATISTICS      NETWORK
|    231     NETNODESTATS            231     STATISTICS      NETWORK
|    232     OLDDN64STATS            232     STATISTICS      NETWORK
|    233     DN6XSTATS               233     STATISTICS      NETWORK
|    234     DN6XENABLEDISABLE       234     CONFIG          NETWORK
|    240     PHASE III DECNET        240     ERROR           NETWORK
|    242     HSC50 END PACKET        242     ERROR           DISK/TAPE
|    243     HSC50 ERROR LOG         243     ERROR           DISK/TAPE
|    244     KLIPA EVENT             244     ERROR           CI
|    245     MSCP ERROR              245     ERROR           CI
|    250     DIAGNOSTIC EVENT        250     DIAGNOSTIC      (ALL) 
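
   As an illustration only, a few rows of the table above can be held in
   a small lookup structure. This sketch assumes that the first column
   is the TOPS-10 code, the third column is the TOPS-20 code, and the
   last two columns are the event class and subsystem. The names EVENTS
   and find_event are hypothetical and are not part of SPEAR.

```python
# Hypothetical excerpt of the event-code table; rows copied from the table above.
# Assumed column order: TOPS-10 code, event name, TOPS-20 code, class, subsystem.
# None stands for the "---" entries (no code on that operating system).
EVENTS = [
    ("015", "CONFIGCHANGE", "115", "CONFIG", "(ALL)"),
    ("031", "FERELOAD", "131", "CONFIG", "MAINFRAME"),
    ("033", "KSHALTSTATUS", "133", "ERROR", "MAINFRAME"),
    ("045", "DISKSTATS", None, "STATISTICS", "DISK"),
    (None, "MF20ERR", "164", "ERROR", "MAINFRAME"),
    ("210", "NETHARDWAREERR", "210", "ERROR", "NETWORK"),
]


def find_event(code):
    """Return (name, class, subsystem) for an octal event code, or None.

    The code may be given as an octal string ("131") or as an integer.
    """
    wanted = int(code, 8) if isinstance(code, str) else code
    for tops10, name, tops20, event_class, subsystem in EVENTS:
        for candidate in (tops10, tops20):
            if candidate is not None and int(candidate, 8) == wanted:
                return (name, event_class, subsystem)
    return None
```

   For example, find_event("131") and find_event("031") both return the
   FERELOAD row, because either column may carry the code.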











































                                    D-2

|  
|  
|  
|  
|  
|  
|  
|  
|  
|  
|  
|  
|                                APPENDIX E
|  
|                        DISK SUBSYSTEM ERROR BITS
|  
|  
|  
|  The following charts list the categories into which the SUMMARIZE
|  report for Disk Subsystems groups the error bits.
|  
|  For example, if the SUMMARIZE report states that your RP06 has 6 SK-SR
|  (SEEK-SEARCH) errors, you may want to know which specific RP06 error
|  bits belong to this category.  If you go to the SK-SR chart and look
|  under DEVICE for RP04,5,6 (which means RP04, RP05, or RP06), you will
|  see that any one of the 3 error bits listed counts as a SEEK-SEARCH
|  error.
|  
|  The headings have the following meanings:
|  
|       ERROR NAME          The name listed in the KL10 Maintenance Guide
|  
|       DEVICE              The device type
|  
|       REG                 The register containing the error bit
|  
|       BIT                 The position of the error bit
|  
|       COMMENTS            Any qualifiers if applicable
|  
|  The charts appear in the following order, each under its abbreviation:
|  
|       TIMIN    =   TIMING
|       SK-SR    =   SEEK-SEARCH
|       READ     =   READ-WRITE
|       CH-CO    =   CHANNEL-CONTROLLER
|       BUS      =   BUS
|       SOFT     =   HARDWARE DETECTED SOFTWARE ERROR
|       MICRO    =   MICROPROCESSOR DETECTED ERROR
|       UNSAF    =   UNSAFE
|       WRTLK    =   WRITE LOCK
|       OFFLI    =   OFFLINE
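
   The RP06 example above can be sketched in code. The entries are
   copied from the SK-SR chart for RP04,5,6. The sketch assumes 16-bit
   registers in which bit 00 is the least significant bit; the charts
   do not state the numbering convention. The names SK_SR_RP04_5_6 and
   matching_errors are hypothetical.

```python
# SK-SR entries for RP04,5,6 from the SK-SR chart: (error name, register, bit).
SK_SR_RP04_5_6 = [
    ("SEEK INC", "ERR 3", 14),
    ("OFF CYL", "ERR 3", 15),
    ("HEADER COMP ERR", "ERR 1", 7),
]


def matching_errors(chart, registers):
    """Return the names of the chart entries whose error bit is set.

    registers maps a register name (as in the REG column) to its value.
    Assumption: bit 00 is the least significant bit of the register.
    """
    hits = []
    for name, reg, bit in chart:
        if (registers.get(reg, 0) >> bit) & 1:
            hits.append(name)
    return hits
```

   A drive whose registers produce a non-empty result would be counted
   under SK-SR; the other charts can be represented the same way.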
|  
|  
|  
|                *-*-*-*-*-*-*-*-*-*-*
|                *                   *
|                *       TIMIN       *
|                *                   *
|                *-*-*-*-*-*-*-*-*-*-*
|  
|  ERROR NAME        DEVICE     REG     BIT     Comments
|  __________________________________________________
|  
|  OP INC            RP04,5,6   ERR 1   13
|  DRIVE TIMING ERR  RP04,5,6   ERR 1   12
|  INDEX ERROR       RP04,5,6   ERR 2   11
|  
|  INDEX UNSAFE      RP07       ERR 3   06
|  DRIVE TIMING ERR  RP07       ERR 1   12
|  OP INC            RP07       ERR 1   13
|  
|  OP INC            RM03,5     ERR 1   13
|  
|  OP INC            RK07       RKER    13
|  DRIVE TIMING ERR  RK07       RKER    12
|  
|  E0                RL02       RLCS            See note after last chart
|  E3                RL02       RLCS            See note after last chart
|  
|  
|  
|  
|                *-*-*-*-*-*-*-*-*-*-*
|                *                   *
|                *       SK-SR       *
|                *                   *
|                *-*-*-*-*-*-*-*-*-*-*
|  
|  
|  
|  
|  ERROR NAME        DEVICE     REG     BIT     Comments
|  __________________________________________________
|  
|  SEEK INC          RP04,5,6   ERR 3   14
|  OFF CYL           RP04,5,6   ERR 3   15
|  HEADER COMP ERR   RP04,5,6   ERR 1   07
|  
|  SEEK INC          RP07       ERR 3   14
|  LOSS CYL ERROR    RP07       ERR 3   09
|  HEADER COMP ERR   RP07       ERR 1   07
|  
|  HEADER COMP ERR   RM03,5     ERR 1   07
|  SEEK INC          RM03,5     ERR 2   14
|  
|  SEEK INCOMPLETE   RK07       RKER    01
|  DRIVE OFF TRACK   RK07       RKDS    05
|  HEADER VERTICALRC RK07       RKER    08
|  
|  SEEK TIME OUT     RL02       RLMP    12
|  E1                RL02       RLCS            See note after last chart
|  
|  
|  
|  
|  
|  
|  
|  
|                *-*-*-*-*-*-*-*-*-*-*
|                *                   *
|                *       READ        *
|                *                   *
|                *-*-*-*-*-*-*-*-*-*-*
|  
|  
|  
|  
|  
|  ERROR NAME        DEVICE     REG     BIT     Comments
|  __________________________________________________
|  
|  DATA CHECK        RP04,5,6   ERR 1   15
|  HEADER CRC ERR    RP04,5,6   ERR 1   08
|  FORMAT ERR        RP04,5,6   ERR 1   04
|  
|  BAD SECTOR ERR    RP07       ERR 3   15
|  DATA CHECK        RP07       ERR 1   15
|  HEADER CRC ERR    RP07       ERR 1   08
|  FORMAT ERR        RP07       ERR 1   04
|  SYNC BYTE ERROR   RP07       ERR 3   02
|  
|  BAD SECTOR ERR    RM03,5     ERR 2   15
|  DATA CHECK        RM03,5     ERR 1   15
|  HEADER CRC ERR    RM03,5     ERR 1   08
|  FORMAT ERR        RM03,5     ERR 1   04
|  
|  BAD SECTOR ERR    RK07       RKER    07
|  DATA CHECK        RK07       RKER    15
|  ECC HARD ERR      RK07       RKER    06
|  FORMAT ERR        RK07       RKER    04
|  
|  E2                RL02       RLCS            See note after last chart
       
|  
|  
|  
|                *-*-*-*-*-*-*-*-*-*-*
|                *                   *
|                *       CH-CO       *
|                *                   *
|                *-*-*-*-*-*-*-*-*-*-*
|  
|  
|  
|  
       
|  ERROR NAME        DEVICE     REG     BIT     Comments
|  __________________________________________________
|  
|  CHAN ERR          RH10       CONI    20
|  OVER RUN          RH10       CONI    22      and no drive errors
|  
|  CHAN ERR          RH20       CONI    22
|  OVER RUN          RH20       CONI    26      and no drive errors
|  
|  IS TIMEOUT        RH780      MBA SR  01
|  RD SUB            RH780      MBA SR  02
|  INV MAP           RH780      MBA SR  04
|  MAP PE            RH780      MBA SR  05
|  DATA LATE         RH780      MBA SR  11      and no drive errors
|  
|  NON EX MEM        RH750      MBA SR  01
|  SPE               RH750      MBA SR  14
|  INV MAP           RH750      MBA SR  04
|  MAP PE            RH750      MBA SR  05
|  DATA LATE         RH750      MBA SR  11      and no drive errors
|  
|  NON EX MEM        RK07       RKCS2   11
|  DATA LATE         RK07       RKCS2   15
|  WRITECHECK        RK07       RKCS2   14      and Not Data Check
|  
|  E4                RL02       RLCS            See note after last chart
       
|  
|  
|  
|  
|                *-*-*-*-*-*-*-*-*-*-*
|                *                   *
|                *       BUS         *
|                *                   *
|                *-*-*-*-*-*-*-*-*-*-*
|  
|  
|  
|  
|  ERROR NAME        DEVICE     REG     BIT     Comments
|  __________________________________________________
|  
|  RAE                RH10      CONI    29
|  MDPE               RH10      CONI    18
|  PARITY ERR         RH10      ERR 1   03
|  
|  RAE                RH20      CONI    24   
|  MDPE               RH20      CONI    18      and no Class B device errors
|  PARITY ERR         RH20      ERR 1   03
|  
|  MCPE               RH780     MBA SR  17   
|  NON EX DRIVE       RH780     MBA SR  18   
|  MDPE               RH780     MBA SR  06   
|  PARITY ERR         RH780     ERR 1   03
|  
|  MCPE               RH750     MBA SR  17   
|  NON EX DRIVE       RH750     MBA SR  18   
|  MDPE               RH750     MBA SR  06   
|  PARITY ERR         RH750     ERR 1   03
|  
|  PARITY ERR         RP07      ERR 1   03
|  DATA PARITY ERROR  RP07      ERR 3   03
|  
|  NON EX DRIVE       RK07      RKCS2   12
|  DR TO CNTRL PE     RK07      RKCS1   13
|  CNTRL TO DR PE     RK07      RKER    03
|  CONTROLLER TIMEOUT RK07      RKCS1   11
|  MULTIPLE DRIVE SEL RK07      RKCS2   09
|  UNIT FIELD ERR     RK07      RKCS2   08
|  
|  DRIVE SEL ERR      RL02      RLMP    08
|  
|  
       
|  
|  
|                *-*-*-*-*-*-*-*-*-*-*
|                *                   *
|                *       SOFT        *
|                *                   *
|                *-*-*-*-*-*-*-*-*-*-*
|  
|  
|  
|  
|  
|  ERROR NAME        DEVICE     REG     BIT     Comments
|  __________________________________________________
|  
|  INVALID ADDR ERR  RP04,5,6   ERR 1   10
|  ADDR OVERFLOW ERR RP04,5,6   ERR 1   09
|  REG MOD RFSD      RP04,5,6   ERR 1   02
|  ILL REG           RP04,5,6   ERR 1   01
|  ILL FUNCTION      RP04,5,6   ERR 1   00
|  
|  INVALID ADDR ERR  RP07       ERR 1   10
|  ADDR OVERFLOW ERR RP07       ERR 1   09
|  REG MOD RFSD      RP07       ERR 1   02
|  ILL REG           RP07       ERR 1   01
|  ILL FUNCTION      RP07       ERR 1   00
|  PROG ERR          RP07       ERR 2   15
       
|  INVALID ADDR ERR  RK07       RKER    10
|  PROGRAM ERROR     RK07       RKCS2   10
|  ADR OVERFLOW ERR  RK07       RKER    09
|  DRIVE TYPE ERR    RK07       RKER    05
|  NONEXECUTABLE FNC RK07       RKER    02
|  ILL FUNCTION      RK07       RKER    00
|  
|  
|  
|  
|                *-*-*-*-*-*-*-*-*-*-*
|                *                   *
|                *       MICRO       *
|                *                   *
|                *-*-*-*-*-*-*-*-*-*-*
|  
|  
|  
|  
|  
|  ERROR NAME        DEVICE     REG     BIT     Comments
|  __________________________________________________
|  
|  CROM PARITY ERR   RP07       ERR 2   14
|  MP UNSAFE         RP07       ERR 2   13
|  DEFECT SKIP ERR   RP07       ERR 3   13
|  CONTROL LGIC FAIL RP07       ERR 3   11
|  LOSS OF BIT CLOCK RP07       ERR 3   10
|  MP HANDSHAKE      RP07       ERR 3   08
|  SERDES DATA FAIL  RP07       ERR 3   04
|  SYNC CLOCK FAIL   RP07       ERR 3   01
|  RUNTIME OUT       RP07       ERR 3   00
|  FAULT CODE        RP07       ERR 2   00-07   Any nonzero value
       
       
|  
|                *-*-*-*-*-*-*-*-*-*-*
|                *                   *
|                *       UNSAF       *
|                *                   *
|                *-*-*-*-*-*-*-*-*-*-*
|  
|  
|  
|  
|  
|  ERROR NAME        DEVICE    REG     BIT     Comments
|  __________________________________________________
|  
|  AC LOW            RP04,5,6  ERR 3   06
|  DC LOW            RP04,5,6  ERR 3   05
|  WR OS             RP05,6    ERR 3   01
|  DC UN             RP05,6    ERR 3   00
|  NO H SEL          RP04,5,6  ERR 2   10
|  MULTI H SEL       RP04,5,6  ERR 2   09
|  TRAN UNSF         RP04,5,6  ERR 2   06
|  TRAN DET F        RP04,5,6  ERR 2   05
|  C_SW_UNSF         RP04,5,6  ERR 2   03
|  W SEL UNSF        RP04,5,6  ERR 2   02
|  C SK UNSF         RP04,5,6  ERR 2   01
|  ACUN              RP04      ERR 2   15
|  PLO UNS           RP04,5,6  ERR 2   13
|  30VU              RP04      ERR 2   12
|  WRITE UNSF        RP04,5,6  ERR 2   08
|  WR C UNSF         RP04,5,6  ERR 2   00
       
|  UNSAFE            RP07      ERR 1   14    REG 2<11-13>RD/WRT1-3,REG3<5>DC UNS
|  R/W 3 UNSAFE      RP07      ERR 2   12
|  R/W 2 UNSAFE      RP07      ERR 2   11
|  R/W 1 UNSAFE      RP07      ERR 2   10
|  WRITE OVERRUN     RP07      ERR 2   09
|  WRITE READY UNSAF RP07      ERR 2   08
|  WRITE CURENT FAIL RP07      ERR 3   12
|  DC UNSAFE         RP07      ERR 3   05
       
|  UNSAFE            RM03,5    ERR 1   14
|  DEVICE CHK        RM03,5    ERR 2   07
|  
|  UNSAFE            RK06,7    RKER    14
|  SPEED LOSS        RK06,7    RKDS    04
|  ACLO              RK06,7    RKDS    03
       
|  WRITE DATA ERR    RL01,2    RLMP    15
|  CURRENT HEAD ERR  RL01,2    RLMP    14
|  SPEN ERR          RL01,2    RLMP    11
|  WRITE GATE ERR    RL01,2    RLMP    10   and Not Write Locked
       
|  
|  
|  
|  
|  
|  
|                *-*-*-*-*-*-*-*-*-*-*
|                *                   *
|                *       WRTLK       *
|                *                   *
|                *-*-*-*-*-*-*-*-*-*-*
|  
|  
|  
|  
|  
|  ERROR NAME        DEVICE     REG     BIT     Comments
|  __________________________________________________
|  
|  WRITE LOCK ERR    RP04,5,6   ERR 1   11
    
|  WRITE LOCK ERR    RP07       ERR 1   11
       
|  WRITE LOCK ERR    RM03,5     ERR 1   11
|  
|  WRITE LOCK ERR    RK07       RKER    11
       
|  WRITE LOCK        RL02       RLMP    13    and Write Gate Error
|  
|  
|                *-*-*-*-*-*-*-*-*-*-*
|                *                   *
|                *       OFFLI       *
|                *                   *
|                *-*-*-*-*-*-*-*-*-*-*
|  
|  
|  
|  
|  ERROR NAME        DEVICE     REG     BIT     Comments
|  __________________________________________________
|  
|  MEDIUM ON LINE    RP04,5,6   DS      12      OFFLINE when not true
|  
|  MEDIUM ON LINE    RP07       DS      12      OFFLINE when not true
|  
|  MEDIUM ON LINE    RM03,5     DS      12      OFFLINE when not true
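
   The OFFLI entries differ from the other charts in that the condition
   is reported when the bit is clear. A minimal sketch of that test,
   assuming a 16-bit DS register in which bit 00 is the least
   significant bit (the name is_offline is hypothetical):

```python
MOL_BIT = 12  # MEDIUM ON LINE in the DS register, per the OFFLI chart


def is_offline(ds):
    """Return True when MEDIUM ON LINE is clear ("OFFLINE when not true").

    Assumption: bit 00 is the least significant bit of the DS register.
    """
    return ((ds >> MOL_BIT) & 1) == 0
```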
|  
|  
|  
|  
|  !*****  RL02 NOTE ****
|  !
|  ! NOTE THAT BITS 10, 11, AND 12 OF THE CS REG ARE TAKEN TOGETHER
|  ! TO DETERMINE THE ERROR AS FOLLOWS (x means the state of the bit does not matter)
|  !  12   11      10      RESULT
|  !  DLT  CRC     OPI
|  !   0    0       1 = OPI                                E0
|  !   x    1       1 = HEADER CHECK                       E1
|  !   x    1       0 = DATA CRC IF READ OPERATION         E2
|  !                    WRITE CHECK IF WRITE OPERATION
|  !   1    x       1 = HEADER NOT FOUND                   E3
|  !   1    x       0 = DATA LATE                          E4
|  !
|  !*****
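
   The RL02 note can be expressed as a small decoding routine. This is
   a sketch under stated assumptions, not SPEAR's own logic: bit 00 is
   taken as the least significant bit of the CS register, and because
   rows containing "x" can overlap (for example, all three bits set),
   the rows are tried in the order listed and the first match wins.
   The category names come from the charts in which E0 through E4
   appear. The names rl02_error and RL02_CATEGORY are hypothetical.

```python
# Chart in which each RL02 code appears (from the TIMIN, SK-SR, READ, CH-CO charts).
RL02_CATEGORY = {"E0": "TIMIN", "E1": "SK-SR", "E2": "READ",
                 "E3": "TIMIN", "E4": "CH-CO"}


def rl02_error(cs, is_read=True):
    """Decode CS register bits 10 (OPI), 11 (CRC), 12 (DLT).

    Returns (code, meaning), or None when none of the three bits is set.
    Assumptions: bit 00 is the least significant bit; rows are evaluated
    in the order given in the note, first match wins.
    """
    opi = (cs >> 10) & 1
    crc = (cs >> 11) & 1
    dlt = (cs >> 12) & 1
    if not dlt and not crc and opi:
        return ("E0", "OPI")
    if crc and opi:
        return ("E1", "HEADER CHECK")
    if crc and not opi:
        return ("E2", "DATA CRC" if is_read else "WRITE CHECK")
    if dlt and opi:
        return ("E3", "HEADER NOT FOUND")
    if dlt and not opi:
        return ("E4", "DATA LATE")
    return None
```

   The resulting code can then be looked up in RL02_CATEGORY to find
   the SUMMARIZE category, for example RL02_CATEGORY["E4"] for a data
   late condition.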















                                    E-7

|  
|  
|  
|  
|  
|  
|  
|  
|  
|  
|  
|  
|                                APPENDIX F
|  
|                         NETWORK EVENT PARAMETERS
|  
|  
|  
|  Network Management Layer Event Parameters - Class 0
|  
|       Type               Keywords
|  
|       0                  SERVICE
|                          0 = LOAD  1 = DUMP
|       1                  STATUS
|                           Return code
|                            0 = REQUESTED
|                            >0 = SUCCESSFUL
|                            <0 = FAILED
|                           Error detail (if error)
|                           Error message (optional)
|       2                  OPERATION
|                           0 = INITIATED
|                           1 = TERMINATED
|       3                  REASON
|                           0 = Receive timeout    
|                           1 = Receive error
|                           2 = Line state change by higher level
|                           3 = Unrecognized request
|                           4 = Line open error
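
   The Class 0 STATUS return code is interpreted by its sign, while
   REASON is a simple enumeration. A minimal sketch of both, using
   hypothetical names (status_text, CLASS0_REASON) and assuming the
   return code has already been extracted as a signed integer:

```python
# REASON values for Class 0 events, copied from the list above.
CLASS0_REASON = {
    0: "Receive timeout",
    1: "Receive error",
    2: "Line state change by higher level",
    3: "Unrecognized request",
    4: "Line open error",
}


def status_text(return_code):
    """Map a STATUS return code to REQUESTED, SUCCESSFUL, or FAILED."""
    if return_code == 0:
        return "REQUESTED"
    return "SUCCESSFUL" if return_code > 0 else "FAILED"
```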
|  
|  
        
|  Session Control Layer Event Parameters - Class 2
|  
|       Type               Keywords
|  
|       0                  REASON
|                           0 = Operator command
|                           1 = Normal operation
|       1                  OLD STATE
|                           0 = ON   2 = SHUT
|                           1 = OFF  3 = RESTRICTED
|       2                  NEW STATE
|                           0 = ON   2 = SHUT
|                           1 = OFF  3 = RESTRICTED
|       3                  SOURCE NODE
|       4                  SOURCE PROCESS
|       5                  DESTINATION PROCESS
|       6                  USER
|       7                  PASSWORD (0 means password set; no
|                           parameter means not set)
|       8                  ACCOUNT
|  


                                    F-1

|                         NETWORK EVENT PARAMETERS


|  Network Services Layer Event Parameters - Class 3
|  
|       Type               Keywords
|  
|       0                  MESSAGE
|                           Message flags
|                           Destination link address
|                           Source link address
|                           Data
|       1                  CURRENT FLOW CONTROL
|                           0 = No flow control
|                           1 = Segment flow control
|                           2 = Message flow control
|  
|  Routing Layer Event Parameters - Class 4
|  
|       Type               Keywords
|  
|       0                  PACKET HEADER
|                           Message flags
|                           Destination node address
|                           (not for control packet)
|                           Source node address
|                           Forwarding data
|                           (not for control packet)
|       1                  PACKET BEGINNING
|       2                  HIGHEST ADDRESS
|       3                  NODE
|       4                  EXPECTED NODE
|       5                  REASON
|                           0 = Line synchronization lost
|                           1 = Data errors
|                           2 = Unexpected packet type
|                           3 = Routing update checksum error
|                           4 = Adjacent node address change
|                           5 = Verification receive timeout
|                           6 = Version skew
|                           7 = Adjacent node address out of range
|                           8 = Adjacent node block size too small
|                           9 = Invalid verification seed value
|                          10 = Adjacent node listener received timeout
|                          11 = Adjacent node listener received invalid
|                               data
|       6                  RECEIVED VERSION
|       7                  STATUS
|                           0 = REACHABLE  1 = UNREACHABLE
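
   The Class 4 list lends itself to a table-driven decoder. The sketch
   below renders a (type, value) pair as text; it assumes the
   parameter type and value have already been extracted from the event
   as integers, and the names CLASS4_TYPES and describe are
   hypothetical.

```python
# Parameter types and coded values for Class 4, copied from the list above.
CLASS4_TYPES = {
    0: "PACKET HEADER", 1: "PACKET BEGINNING", 2: "HIGHEST ADDRESS",
    3: "NODE", 4: "EXPECTED NODE", 5: "REASON",
    6: "RECEIVED VERSION", 7: "STATUS",
}
CLASS4_REASON = {
    0: "Line synchronization lost",
    1: "Data errors",
    2: "Unexpected packet type",
    3: "Routing update checksum error",
    4: "Adjacent node address change",
    5: "Verification receive timeout",
    6: "Version skew",
    7: "Adjacent node address out of range",
    8: "Adjacent node block size too small",
    9: "Invalid verification seed value",
    10: "Adjacent node listener received timeout",
    11: "Adjacent node listener received invalid data",
}
CLASS4_STATUS = {0: "REACHABLE", 1: "UNREACHABLE"}

# Only REASON (5) and STATUS (7) carry coded values in the list.
CODED = {5: CLASS4_REASON, 7: CLASS4_STATUS}


def describe(param_type, value):
    """Render a Class 4 parameter as 'KEYWORD = meaning'."""
    name = CLASS4_TYPES.get(param_type, "type %d" % param_type)
    table = CODED.get(param_type)
    if table is None:
        return "%s = %r" % (name, value)
    return "%s = %s" % (name, table.get(value, "unknown (%d)" % value))
```

   The same pattern would extend to the other classes by adding one
   dictionary per class.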
|  
|  Data Link Layer Event Parameters - Class 5
|  
|       Type               Keywords
|  
|       0                  OLD STATE
|                           0 = HALTED   3 = RUNNING
|                           1 = ISTRT    4 = MAINTENANCE
|                           2 = ASTRT
|       1                  NEW STATE
|                           0 = HALTED   3 = RUNNING
|                           1 = ISTRT    4 = MAINTENANCE
|                           2 = ASTRT
|       2                  HEADER
|       3                  SELECTED TRIBUTARY
|       4                  PREVIOUS TRIBUTARY
|       5                  TRIBUTARY STATUS
|                           0 = Streaming
|                           1 = Continued send after timeout
|                           2 = Continued send after deselect
|                           3 = End streaming
|       6                  RECEIVED TRIBUTARY
|       7                  BLOCK LENGTH
|       8                  BUFFER LENGTH
|       9                  DTE
|      10                  REASON
|      11                  (Reserved)
|      12                  (Reserved)
|      13                  PARAMETER TYPE
|      14                  CAUSE
|      15                  DIAGNOSTIC
|  
|  Physical Line Layer Event Parameters - Class 6
|  
|       Type               Keywords
|  
|       0                  DEVICE REGISTER
|       1                  NEW STATE
|                           0 = OFF
|                           1 = ON






































                                    F-3













                                 APPENDIX G

                                  GLOSSARY



   The following is a list of terms explained within the context of  this
   document.

                Term                           Explanation

        Body section             The data portion  of  an  entry  in  the
                                 system event file.

        BUGCHK                   A  recoverable  error  detected  by  the
                                 TOPS-20 operating system.

        BUGHLT                   A non-recoverable error detected by  the
                                 TOPS-20 operating system.

        BUGINF                   A message informing you that  a  certain
                                 event  relating to the TOPS-20 operating
                                 system has occurred.

        CTY                      The system operator's terminal.

        Dump format              One of the three  output  forms  of  the
                                 RETRIEVE procedure.

        Entry type               The type of entry within a system  event
                                 file,  for  example,  a  MASSBUS  Device
                                 Error, or a Crash Restart Error.

        ERROR.SYS                The name of the  system  event  file  in
                                 both  the  TOPS-10 and TOPS-20 operating
                                 systems.

        Event code               The octal code assigned to a  particular
                                 event in the system event file.

        FRU                      An acronym for Field  Replaceable  Unit.
                                 This  is  a  piece  of hardware that the
                                 Field Service engineer  can  replace  on
                                 the spot.

        Full format              A complete and detailed  listing  of  an
                                 event,   translated   into   ASCII    by
                                 RETRIEVE.

        Hard error               A non-recoverable error.

        Header section           The top  portion  of  an  entry  in  the
                                 system  event  file, after SPEAR formats
                                 it.

        MTTR                     An acronym for Mean Time To Repair.  The
                                 average  time  it  takes a Field Service
                                 engineer to isolate and repair a  system
                                 malfunction.

        NXM error                An  attempt  to  address  a  nonexistent
                                 memory location.

        Parity error             An error in which one or more bits  have
                                 been picked up or dropped,  so  that the
                                 parity of  the  data  is  no  longer
                                 correct.

        RETRIE.RPT               A file containing entries converted from
                                 binary to ASCII.

        RETRIE.SYS               A  file  in  binary  format   containing
                                 entries  extracted from the system event
                                 file.

        Retry count              The number  of  times  an  operation  is
                                 tried, in addition to the first time.

        Sequence number          The number given  to  an  entry  in  the
                                 system event file.

        Short format             A brief  version  of  an  entry  in  the
                                 system   event  file,  after  SPEAR  has
                                 translated it.

        Snapshot                 The   information   gathered   by    the
                                 operating   system   immediately   after
                                 recovering from a crash.

        Soft error               A recoverable error.

        Stopcode                 A message  containing  a  3-letter  code
                                 printed  at  the  CTY  indicating that a
                                 serious  error  has  occurred   in   the
                                 operating system's data base.

        System event file        The  file  where  the  operating  system
                                 records hardware and software events.

        Sweep                    A scan of memory that the operating
                                 system performs after certain events
                                 (such as a parity or NXM error) to look
                                 for further errors of the same kind.


























































                                    G-3

                                        


                                   INDEX



   ABS, 4-32                           Detecting
   ACL, 4-32                             error, 3-1
   ACU, 4-32                           Device status block, 5-28
   AOE, 4-32                           Device types, 4-6
                                       Dialogue
   Body section, 2-5, G-1                SPEAR, 4-2
   /BREAK switch, 4-4                  Dialogue usage messages, A-3
   BUGCHK, 2-2, 5-30, G-1              DIS, 4-32
   BUGHLT, 2-2, 5-30, G-1              Disk statistics, 5-20
   BUGINF, 2-2, G-1                    DL10 communications error, 5-22
                                       DN, 5-14, 5-43
   Channel failures, 2-3               DPA, 4-33
   Checking                            DTE, 4-32, 4-33
     error, 3-1                        Dump format, G-1
     loop, 3-4                         DX20 device error, 5-9, 5-39
     range, 3-4
     software error, 3-4               ECH, 4-32
     sum, 3-4                          Entries
     validity, 3-4                       hardware, 2-2
   Checksum, 3-4                         performance, 2-4
   CM, 5-14, 5-43                        software, 2-2
   Command                               TOPS-10, 5-2
     HELP, 4-3                           TOPS-20, 5-30
   Command and Control Files, C-1      Entry descriptions, 5-1
   Completing next field, 4-4          Entry type, G-1
   Conclusive statement, G-1           Error bits, E-1
   CONFIG program, 5-14                Error checking, 3-1
   Configuration status change, 5-14,  Error detecting, 3-1
       5-43                            Error detectors
   Controller failures, 2-3              hardware, 3-1
   Conventions                           parity, 3-4
     record, 2-6                         threshold, 3-4
   COR/CRC, 4-33                         timing, 3-4
   CPU failures, 2-3                   Error register codes, 4-32
   CPU status block, 5-26              ERROR.SYS, G-1
   Crash extract, 5-4                  Event Codes, D-1
   CS/ITM, 4-33                        Event codes, G-1
   CSF, 4-32                           Event file, 4-10
   CSU, 4-32                           Event file messages, A-5
   CTRL/F, 4-4                         Executing SPEAR, 4-4
   CTRL/U, 4-4                         Exiting from SPEAR, 4-5
   CTRL/W, 4-4                         Extra error reporting, 4-41
   CTY, G-1
                                       Failures
   DAEMON started, 5-7                   channel, 2-3
   Data channel error, 5-7               controller, 2-3
   DCK, 4-32                             CPU, 2-3
   DCL, 4-32                             I/O device, 2-3
   DCU, 4-32                             intermittent, 3-1
   Deleting current line, 4-4            memory, 2-3
   Deleting previous field, 4-4          solid, 3-1








                                  Index-1

                                        


     types of, 3-1                     IXE, 4-32
   FCE, 4-33
   Features                            KL CPU status block, 5-49
     HELP, 4-3                         KL10 parity interrupt, 5-22
   FEN, 4-32                           KL10 parity trap, 5-24
   FER, 4-32                           KLERR entry, 4-9
   Field                               KLERR front end report, 5-53
     completing next, 4-4              KLSTAT function, 4-1
     deleting previous, 4-4            KLSTAT mode, 5-49
   File specifications, 4-4            KLSTAT procedure, 4-41
   Files                               KLSTAT switch, 5-49
     indirect, 4-2                     KS10 Halt status block, 5-19
   FMT, 4-33                           KS10 NXM trap, 5-23
   Format
     full, 4-23                        Library
     octal, 4-19                         SPEAR, 4-1
     record, 2-5                       Line printer error, 5-29
     short, 4-18                       Loop checking, 3-4
   Front end reload, 5-18
   Front end reloaded, 5-45            Magtape statistics, 5-19
   Front-end device report, 5-18,      Magtape system error, 5-16
       5-44                            MASSBUS device error, 5-33
   FRU, G-2                            MASSBUS disk error, 5-8
   Full format, 4-23, G-2              MASSBUS disk registers, 4-32
   Function                            Memory failures, 2-3
     KLSTAT, 4-1                       Memory sweep for NXM, 5-25
     RETRIEVE, 4-1, 4-5                Memory sweep for parity, 5-26
     SUMMARIZE, 4-1, 4-24              MF20 device report, 5-50
                                       MHS, 4-32
   Glossary, G-1                       Minimum analysis, G-2
   /GO switch, 4-4                     MSCP, G-2
                                       MSE, 4-32
   Hard error, G-2                     MTTR, G-2
   Hardware entries, 2-2
   Hardware error detectors, 3-1       NEF, 4-33
   HCE, 4-32                           NETCON, 5-53
   HCRC, 4-32                          Network CHECK11 report, 5-56
   Header                              Network control started, 5-53
     sample, 2-5                       Network down-line load, 5-54
   Header section, 2-5, G-2            Network entries, 5-53
   HELP command, 4-3                   Network event Classes, 4-7
   Help features, 4-3                  Network event Parameters, F-1
   /HELP switch, 4-3, 4-4              Network hardware error, 5-55
                                       Network line statistics, 5-57
   I/O device failures, 2-3            Network up-line dump, 5-54
   IAE, 4-32                           NHS, 4-32
   ILF, 4-32, 4-33                     Non-reload monitor error, 5-3
   ILR, 4-32, 4-33                     NSG, 4-33
   INC/UPE, 4-33                       NXM error, G-2
   Indirect files, 4-2
   Input                               Octal format, 4-19
     RETRIEVE, 4-5                     OCYL, 4-32
   Installation procedures, B-1        OPE, 4-32
   Intermittent failures, 3-1          OPI, 4-32, 4-33
   Isolation techniques, 3-5           OPR, 5-15, 5-44








                                  Index-2

                                        


   OT, 5-14, 5-43                      Software error checking, 3-4
                                       Software event, 5-13
   PAR, 4-32, 4-33                     Software requested data, 5-15
   Parity error, G-2                   Solid failures, 3-1
   Parity error detectors, 3-4         SPEAR dialogue, 4-2
   PEF/LRC, 4-33                       SPEAR library, 4-1
   Performance entries, 2-4            SPEAR messages, A-1
   PLU, 4-32                           SPEAR switches, 4-4
   PM, 5-14, 5-43                      STOPCD, 2-2, G-2
   Procedure                           Stopcodes, 2-2, G-2
     installation, B-1                 Sum checking, 3-4
     KLSTAT, 4-41                      SUMMARIZE function, 4-1, 4-24
     RETRIEVE, 4-9                     SUMMARIZE procedure, 4-34
     SUMMARIZE, 4-34                   SUMMARIZE report, 4-25
   Processor parity interrupt, 5-47    Switch
   Processor parity trap, 5-46           /GO, 4-4
   PSU, 4-32                             /HELP, 4-3, 4-4
                                         question mark, 4-4
   Question mark switch (/?), 4-4        /REVERSE, 4-4
                                         /SHOW, 4-5
   R&W, 4-32                           System event file, 5-1, G-2
   Range checking, 3-4                 System log entry, 5-15, 5-44
   Record conventions, 2-6             System reload, 5-2
   Record format, 2-5
   Report                              TDF, 4-32
     SUMMARIZE, 4-25                   Techniques
   RETRIE.RPT, G-2                       isolation, 3-5
   RETRIE.SYS, G-2                       verification, 3-6
   RETRIEVE error class, 4-6           Terminators, 4-2
   RETRIEVE function, 4-1, 4-5         TGHA, 5-50
   RETRIEVE input, 4-5                 Threshold error detectors, 3-4
   RETRIEVE output, 4-7                Time window, 3-6
   RETRIEVE procedure, 4-9             Timing error detectors, 3-4
   Retry count, G-2                    TOPS-10 entries, 5-2
   Returning to previous prompt, 4-4   TOPS-20 entries, 5-30
   Returning to SPEAR prompt, 4-4      TOPS-20 system reloaded, 5-30
   /REVERSE switch, 4-4                TUF, 4-32
   RMR, 4-32, 4-33                     Types of failures, 3-1
   RP06, 5-9
   Running SPEAR, 4-1                  Unit record error, 5-30
                                       UNS, 4-32, 4-33
   Sample header, 2-5                  User validation messages, A-1
   Sample RETRIEVE session, 4-18       UWR, 4-32
   Sample SUMMARIZE session, 4-39
   Section                             35V, 4-32
     body, 2-5                         Validity checking, 3-4
     header, 2-5                       Verification techniques, 3-6
   Separators, 4-2                     30VU, 4-32
   Sequence number, G-2                VUF, 4-32
   Short format, 4-18, G-2
   /SHOW switch, 4-5                   Warning messages, A-4
   SKI, 4-32                           WCF, 4-32
   Snapshot, G-2                       WCU, 4-32
   Soft error, G-2                     WLE, 4-32
   Software entries, 2-2               WOF, 4-32








                                  Index-3

                                        


   WRU, 4-32                           XPORT messages, A-7
   WSU, 4-32



                                  Index-4