Trailing-Edge
-
PDP-10 Archives
-
SRI_NIC_PERM_SRC_3_19910112
-
stanford/5-swskit/debugging-galaxy.mem
There are 6 other files named debugging-galaxy.mem in the archive. Click here to see a list.
Debugging the GALAXY System
---------------------------
1.0 INTRODUCTION
The GALAXY system presents a unique problem to the software specialist
who is trying to debug one of its components. Usually, any user mode
program can be debugged under TOPS-20 by running a copy of it, loaded
with DDT, taking appropriate care that nothing is done which will
affect any users of the system. For GALAXY, however, it is very
difficult to not affect users of the system. For example, if you are
trying to debug BATCON, you will find that QUASAR will very happily
schedule batch jobs submitted by other users to be run by your BATCON.
If you are not careful, you can cause those batch jobs to be lost, or
at least slowed down, while you are debugging.
Debugging QUASAR or ORION would be even worse. Users would see PRINT,
SUBMIT, etc. commands hang when you hit a breakpoint in QUASAR.
Operators would be unable to control any system components if you were
breakpointed in ORION. On top of this, the monitor knows about
QUASAR, and you may lose messages which happen when users close a
spooled lineprinter file, or when a job logs out.
To solve these problems, the concept of a "private GALAXY system" has
been implemented by software engineering in version 4 of GALAXY. When
a private GALAXY system is operating, all of its components are
completely independent of the primary GALAXY system. QUASAR, the
queue maintainer, keeps queues that are separate from the system
queues and are failsofted to a different master queue file. This
QUASAR communicates only with other components in the same private
system. It is even possible to run several complete private GALAXY
systems, with the restrictions that:
1. All components in a private system must run under the same
user name.
2. Only one private system may be run by a given user.
3. Each private QUASAR must be connected to a different
directory.
4. Each private ORION must be connected to a different
directory.
Page 2
BUILDING A PRIVATE GALAXY SYSTEM
2.0 BUILDING A PRIVATE GALAXY SYSTEM
Since the changes necessary to create a private GALAXY system were
implemented in the version 4 source code, it is relatively simple to
build the system. The recommended procedure is as follow:
1. Create a directory to for the private GALAXY system.
2. Restore the file EXEC-FOR-DEBUGGING-GALAXY.EXE from the
SWSKIT to this newly created directory. For Release 5 of the
EXEC, the distributed EXEC replaces the need for this special
program.
3. Restore each of the following files from the "Subsys files
for TOPS20 V5" saveset on the TOPS-20 distribution tape to
this directory.
BATCON.EXE
CDRIVE.EXE
GLXLIB.EXE
LPTSPL.EXE
OPR.EXE
ORION.EXE
PLEASE.EXE
QMANGR.EXE
QUASAR.EXE
SPRINT.EXE
SPROUT.EXE
4. For each component in the above list except GLXLIB.EXE and
QMANGR.EXE, perform the following steps:
1. Give the EXEC command "GET xxxxxx.EXE"
2. Give the command "DEPOSIT 135 -1"
3. Give the command "SAVE xxxxxx"
Page 3
EXAMPLE OF A PRIVATE GALAXY BUILD
3.0 EXAMPLE OF A PRIVATE GALAXY BUILD
It is not strictly necessary to restore all of the GALAXY components
for a one time only debugging session. To debug a component like
BATCON, you would need at a minimum:
1. Your own copy of BATCON
2. Your own copy of QUASAR for BATCON to speak to
3. Your own copy of ORION for BATCON and QUASAR to speak to
4. A copy of OPR to speak to ORION to control BATCON
5. An EXEC which knows about your QUASAR to make queue entries
The following is a log of an example build of a private GALAXY system:
TOPS-20 Command processor 4(560)
@ENABLE (CAPABILITIES)
$!
$! First connect to a debugging directory
$!
$CONNECT (TO DIRECTORY) MISC:<HEMPHILL.GALAXY.DEBUG>
$!
$! Now build and save debugging .EXE files
$!
$! QUASAR, the queue maintainer
$!
$GET (PROGRAM) SYS:QUASAR.EXE.55
$DEPOSIT (MEMORY LOCATION) 135 (CONTENTS) -1
[Shared]
$SAVE (ON FILE) QUASAR.EXE.1 !New file! (PAGES FROM)
QUASAR.EXE.1 Saved
$!
$! ORION, the message clearinghouse
$!
$GET (PROGRAM) SYS:ORION.EXE.53
$DEPOSIT (MEMORY LOCATION) 135 (CONTENTS) -1
[Shared]
$SAVE (ON FILE) ORION.EXE.1 !New file! (PAGES FROM)
ORION.EXE.1 Saved
$!
$! OPR, the operator interface
$!
$GET (PROGRAM) SYS:OPR.EXE.55
$DEPOSIT (MEMORY LOCATION) 135 (CONTENTS) -1
[Shared]
$SAVE (ON FILE) OPR.EXE.1 !New file! (PAGES FROM)
OPR.EXE.1 Saved
$!
$! BATCON, the batch controller
$!
Page 4
EXAMPLE OF A PRIVATE GALAXY BUILD
$GET SYS:BATCON.EXE.39
$DEPOSIT (MEMORY LOCATION) 135 (CONTENTS) -1
[Shared]
$SAVE (ON FILE) BATCON.EXE.1 !New file! (PAGES FROM)
BATCON.EXE.1 Saved
$!
$! Now a directory of what we've got
$!
$VDIRECTORY (OF FILES) *.*.*
MISC:<HEMPHILL.GALAXY.DEBUG>
BATCON.EXE.1;P777700 16 8192(36) 13-Feb-80 22:00:37
EXEC-FOR-DEBUGGING-GALAXY.EXE.1;P777700
82 41984(36) 13-Feb-80 04:33:50
OPR.EXE.1;P777700 31 15872(36) 13-Feb-80 22:00:09
ORION.EXE.1;P777700 44 22528(36) 13-Feb-80 21:59:45
QUASAR.EXE.1;P777700 40 20480(36) 13-Feb-80 21:59:27
Total of 213 pages in 5 files
$
Page 5
RUNNING THE PRIVATE GALAXY SYSTEM
4.0 RUNNING THE PRIVATE GALAXY SYSTEM
Starting and running a private GALAXY system is similar to running
GALAXY in the usual manner. First QUASAR and ORION are started, then
the component you wish to debug. You will also need OPR to issue
operator commands and the modified EXEC to make queue entries. Since
you will need about five jobs, it is usually most convenient to run
each component as a separate subjob under PTYCON.
4.1 Starting QUASAR
QUASAR and ORION should be started before everything else. Nothing
evil happens if you start them last, but all the other components will
be waiting for these two to start. A suggested procedure is:
1. Define a subjob "Q"
2. Connect to it
3. LOGIN a job under the same user name
4. CONNECT that job to the directory in which you did the
private GALAXY build
5. ENABLE
6. RUN QUASAR
4.2 Starting ORION
Starting ORION is as painless as starting QUASAR:
1. Define a subjob "O"
2. Connect to it
3. LOGIN a job under the same user name
4. CONNECT that job to the directory in which you did the
private GALAXY build
5. ENABLE
6. RUN ORION
Page 6
Starting OPR
4.3 Starting OPR
OPR starts up using the same formula as QUASAR and ORION:
1. Define a subjob "OPR"
2. Connect to it
3. LOGIN a job under the same user name
4. CONNECT that job to the directory in which you did the
private GALAXY build
5. ENABLE
6. RUN OPR
7. You may now type OPR commands to see if QUASAR and ORION
appear to be healthy.
4.4 Starting The Component To Be Debugged
If the component you wish to debug is QUASAR, ORION, or OPR, then you
have already started it. Breakpoints could have been set, and when
they were hit, the component could have been debugged without any
noticable affect on other users of the system. If you wish to debug
PLEASE, BATCON, LPTSPL, CDRIVE, SPRINT, or SPROUT, do the following:
1. Define a subjob with an appropriate ID (e.g. B for BATCON)
2. Connect to it
3. LOGIN a job under the same user name
4. CONNECT that job to the directory in which you did the
private GALAXY build
5. ENABLE
6. GET the component
7. Enter DDT
8. Set breakpoints, then start the program
Page 7
Starting the Modified EXEC
4.5 Starting The Modified EXEC
The file "EXEC-FOR-DEBUGGING-GALAXY.EXE" which has been supplied on
the SWSKIT has exactly two commands added to its repertoire. These
are "^ESET DEBUGGING-GALAXY" and "^ESET NO DEBUGGING-GALAXY". The
effect of these commands is to select which one of two PIDs (Process
IDs) to communicate with: the system QUASAR or the private QUASAR.
If "NO DEBUGGING-GALAXY" is set, then PRINT, SUBMIT, CANCEL, MODIFY,
and the INFORMATION commands will all cause communication with the
system QUASAR. If "DEBUGGING-GALAXY" is set for this EXEC, then the
commands listed will communicate with the private QUASAR run by that
user. For Release 5 of TOPS-20, the distributed EXEC incorporates
this functionality in the "^ESET PRIVATE-QUASAR" and "^ESET NO
PRIVATE-QUASAR" commands, and the special EXEC is unneeded.
1. Define a subjob "E"
2. Connect to it
3. LOGIN a job under the same user name
4. CONNECT that job to the directory in which you did the
private GALAXY build
5. RUN EXEC-FOR-DEBUGGING-GALAXY (or the Release 5 EXEC)
6. ENABLE
7. ^ESET DEBUGGING-GALAXY (or PRIVATE-QUASAR)
Page 8
EXAMPLE DEBUGGING SESSION
5.0 EXAMPLE DEBUGGING SESSION
The following is a log of a sample debugging session:
TOPS-20 Command processor 4(560)
@!
@! First run PTYCON, so we can control five jobs from one terminal
@!
@PTYCON.EXE.7
PTYCON> !
PTYCON> ! Now start up QUASAR as subjob Q
PTYCON> !
PTYCON> DEFINE (SUBJOB #) 0 (AS) Q
PTYCON> CONNECT (TO SUBJOB) Q
[CONNECTED TO SUBJOB Q(0)]
2102 Development System, TOPS-20 Monitor 4(3245)
@LOG HEMPHILL (PASSWORD)
Job 21 on TTY222 13-Feb-80 22:18:05
Structure PS: mounted
Structure MISC: mounted
@ENABLE (CAPABILITIES)
$!
$! Connect to directory where debugging .EXE files are
$!
$CONNECT (TO DIRECTORY) MISC:<HEMPHILL.GALAXY.DEBUG>
$!
$! Finally run the component
$!
$RUN (PROGRAM) QUASAR.EXE.1
% QUASAR GLXIPC Becoming [HEMPHILL]QUASAR (PID = 66000031)
% QUASAR GLXIPC Waiting for ORION to start
^X
PTYCON> !
PTYCON> ! Now start up ORION as subjob O
PTYCON> !
PTYCON> DEFINE (SUBJOB #) 1 (AS) O
PTYCON> CONNECT (TO SUBJOB) O
[CONNECTED TO SUBJOB O(1)]
2102 Development System, TOPS-20 Monitor 4(3245)
@LOG HEMPHILL (PASSWORD)
Job 22 on TTY223 13-Feb-80 22:19:25
Structure PS: mounted
Structure MISC: mounted
@ENABLE (CAPABILITIES)
$!
$! Connect to directory where debugging .EXE files are
$!
$CONNECT (TO DIRECTORY) MISC:<HEMPHILL.GALAXY.DEBUG>
$!
$! Finally run the component
$!
Page 9
EXAMPLE DEBUGGING SESSION
$RUN (PROGRAM) ORION.EXE.1
% ORION GLXIPC Alternate [HEMPHILL]QUASAR (PID = 66000031)
% ORION GLXIPC Becoming [HEMPHILL]ORION (PID = 70000032)
**** Q(0) 22:19:58 ****
% QUASAR GLXIPC Alternate [HEMPHILL]ORION (PID = 70000032)
**** O(1) 22:19:58 ****
^X
PTYCON> !
PTYCON> ! Now start up OPR as subjob OPR
PTYCON> !
PTYCON> DEFINE (SUBJOB #) 2 (AS) OPR
PTYCON> CONNECT (TO SUBJOB) OPR
[CONNECTED TO SUBJOB OPR(2)]
2102 Development System, TOPS-20 Monitor 4(3245)
@LOG HEMPHILL (PASSWORD)
Job 23 on TTY224 13-Feb-80 22:20:29
Structure PS: mounted
Structure MISC: mounted
@ENABLE (CAPABILITIES)
$!
$! Connect to directory where debugging .EXE files are
$!
$CONNECT (TO DIRECTORY) MISC:<HEMPHILL.GALAXY.DEBUG>
$!
$! Finally run the component
$!
$RUN (PROGRAM) OPR.EXE.1
% OPR GLXIPC Alternate [HEMPHILL]QUASAR (PID = 66000031)
% OPR GLXIPC Alternate [HEMPHILL]ORION (PID = 70000032)
OPR>
22:19:59 -- Network Node 1031 is Online --
22:19:59 -- Network Node 2137 is Online --
22:19:59 -- Network Node 4097 is Online --
22:19:59 -- Network Node DN20A is Online --
22:19:59 -- Network Node MILL20 is Online --
22:19:59 -- Network Node SYS880 is Online --
OPR>!
OPR>! Let's take a look at our brand new queues
OPR>!
OPR>SHOW QUEUES
OPR>
22:21:21 --The Queues are Empty--
OPR>SHOW STATUS PRINTER
OPR>
22:21:27 --There are no Devices Started--
OPR>^X
PTYCON> !
PTYCON> ! Now start up BATCON as subjob B
Page 10
EXAMPLE DEBUGGING SESSION
PTYCON> !
PTYCON> DEFINE (SUBJOB #) 3 (AS) B
PTYCON> CONNECT (TO SUBJOB) B
[CONNECTED TO SUBJOB B(3)]
2102 Development System, TOPS-20 Monitor 4(3245)
@LOG HEMPHILL (PASSWORD)
Job 24 on TTY225 13-Feb-80 22:21:49
Structure PS: mounted
Structure MISC: mounted
@ENABLE (CAPABILITIES)
$!
$! Connect to directory where debugging .EXE files are
$!
$CONNECT (TO DIRECTORY) MISC:<HEMPHILL.GALAXY.DEBUG>
$!
$! Finally run the component
$!
$RUN (PROGRAM) BATCON.EXE.1
% BATCON GLXIPC Alternate [HEMPHILL]QUASAR (PID = 66000031)
% BATCON GLXIPC Alternate [HEMPHILL]ORION (PID = 70000032)
^X
PTYCON> !
PTYCON> ! Now start up special EXEC as subjob E
PTYCON> !
PTYCON> DEFINE (SUBJOB #) 4 (AS) E
PTYCON> CONNECT (TO SUBJOB) E
[CONNECTED TO SUBJOB E(4)]
2102 Development System, TOPS-20 Monitor 4(3245)
@LOG HEMPHILL (PASSWORD)
Job 19 on TTY226 13-Feb-80 22:23:00
Structure PS: mounted
Structure MISC: mounted
@CONNECT (TO DIRECTORY) MISC:<HEMPHILL.GALAXY.DEBUG>
@!
@! Run the special EXEC, which is provided on the SWSKIT
@!
@RUN (PROGRAM) EXEC-FOR-DEBUGGING-GALAXY.EXE.1
TOPS-20 Command processor 4(560)-1
@ENABLE (CAPABILITIES)
$!
$! Make this EXEC switch from system queues to private queues
$!
$^ESET DEBUGGING-GALAXY
$!
$! Use ordinary EXEC commands to examine private queues
$!
$INFORMATION (ABOUT) OUTPUT-REQUESTS
[The Queues are Empty]
$INFORMATION (ABOUT) BATCH-REQUESTS
[The Queues are Empty]
$!
Page 11
EXAMPLE DEBUGGING SESSION
$! Now switch back to look at system queues
$!
$^ESET NO DEBUGGING-GALAXY
$INFORMATION (ABOUT) OUTPUT-REQUESTS
Printer Queue:
Job Name Req# Limit User
-------- ---- ----- ------------------------
* KLERR 6 1197 DEUFEL On Unit:0
Started at 22:05:47, printed 314 of 1197 pages
XXX 3 18 KAMANITZ /Dest:4097
MS-OUT 18 117 BRAITHWAITE /Unit:0
There are 3 Jobs in the Queue (1 in Progress)
$INFORMATION (ABOUT) BATCH-REQUESTS
Batch Queue:
Job Name Req# Run Time User
-------- ---- -------- ------------------------
* DUMP 16 02:00:00 OPERATOR In Stream:0
Job# 17 Running DUMPER Last Label: A Runtime 0:23:55
BATCH 2 00:05:00 BLIZARD /Proc:FOO
SOURCE 8 00:05:00 BLOUNT /After:14-Feb-80 0:00
SRCCOM 12 00:05:00 MURPHY /After:14-Feb-80 0:00
QJD4R 13 00:05:00 SROBINSON /After:19-Feb-80 0:00
QAR 10 00:05:00 BLOUNT /After:19-Feb-80 0:14
SAVE 1 00:05:00 FICHE /After:19-Feb-80 9:10
There are 7 Jobs in the Queue (1 in Progress)
$!
$! Now let's submit a batch job to our own BATCON
$!
$^ESET DEBUGGING-GALAXY
$!
$! Make a trivial batch control file
$!
$COPY (FROM) TTY: (TO) A.CTL.1 !New file!
TTY: => A.CTL.1
@SY A
^Z
$!
$! And submit the job
$!
$SUBMIT (BATCH JOB) A.CTL.1
[Job A Queued, Request-ID 1, Limit 0:05:00]
$!
$! Now examine private queues
$!
$INFORMATION (ABOUT) BATCH-REQUESTS
Batch Queue:
Job Name Req# Run Time User
-------- ---- -------- ------------------------
Page 12
EXAMPLE DEBUGGING SESSION
A 1 00:05:00 HEMPHILL
There is 1 Job in the Queue (None in Progress)
$!
$! Our job is in the batch queue, but no batch-streams have been started
$!
$^X
PTYCON> CONNECT (TO SUBJOB) OPR
[CONNECTED TO SUBJOB OPR(2)]
OPR>START (Object) BATCH-STREAM (Stream Number) 0
OPR>
22:25:40 Batch-Stream 0 --Startup Scheduled--
22:25:40 Batch-Stream 0 --Started--
OPR>
22:25:40 Batch-Stream 0 --Begin--
Job A Req #1 for HEMPHILL
OPR>
22:25:51 Batch-Stream 0 --End--
Job A Req #1 for HEMPHILL
OPR>
^X
PTYCON> !
PTYCON> ! Cleaning up is easy
PTYCON> !
PTYCON> KILL (SUBJOB) ALL
PTYCON> EXIT (FROM PTYCON)
@
Page 13
TECHNICAL DETAILS
6.0 TECHNICAL DETAILS
This section is to explain what happens differently when a component
has had location 135 (.JBOPS) poked to -1, and to present a few
helpful tidbits of information about debugging some of the programs.
.JBOPS incidentally is the word in the job data area (defined under
TOPS-10) which is reserved for a program's OTS. GALAXY references
this location by the symbol "DEBUGW".
6.1 GLXLIB
GLXLIB is the GALAXY library. It consists of a code segment which
starts at address 400000 and a data segment at address 600000. Each
of the programs QUASAR, ORION, OPR, PLEASE, BATCON, LPTSPL, CDRIVE,
SPRINT, and SPROUT uses it. Part of the initialization code of each
of these programs maps in GLXLIB as a "high segment". This is in
effect an object time system for GALAXY, with many commonly used
routines. Most of the support for the private GALAXY system is in
this library, enough so that OPR, PLEASE, BATCON, LPTSPL, SPRINT and
SPROUT actually have no code which cares whether they are part of a
private GALAXY. The initialization code in each component looks in
three places to find GLXLIB.EXE: first on the structure and directory
that the component itself came from, second on DSK:, third on SYS:.
This search order is the same for both the system GALAXY and the
private one. The actual changes implemented for the private GALAXY
are as follows:
1. Ordinarily, a component which stopcodes will save a crash
file on disk. When debugging, however, the crash file is not
written. In either case, if DDT is loaded with the program,
the stopcode will invoke a jump to DDT.
2. GALAXY components do not require receiving privileged packets
under debugging.
3. Ordinarily, QUASAR and ORION get special system PIDs for IPCF
communications. When debugging, they get PIDs with names of
the form "[username]QUASAR" and "[username]ORION". All
GALAXY components will then look for these PID names. Even a
pseudo-GALAXY component, such as MOUNTR or IBMSPL, will be
able to find these PIDs if its location 135 has been poked to
-1, simply because it uses GLXLIB.
4. GALAXY components print messages like:
"% QUASAR GLXIPC Waiting for ORION to start"
only while debugging.
5. ORION and QUASAR print messages about PIDs they acquire,
like:
"% QUASAR GLXIPC Becoming [HEMPHILL]QUASAR (PID =
66000031)"
Page 14
TECHNICAL DETAILS
6. All components print messages about the special PIDs they
find for QUASAR and ORION, like:
"% ORION GLXIPC Alternate [HEMPHILL]QUASAR (PID =
66000031)"
6.2 QUASAR
1. QUASAR reads and writes private queues from its connected
directory. The full filespec is
"DSK:PRIVATE-MASTER-QUEUE-FILE.QUASAR"
2. QUASAR does absolutely no privilege checking. Anyone can
modify or kill any request in the queues (if they know how to
speak to this private QUASAR).
6.3 ORION
1. ORION will create a log file under the name of
"DSK:ORION-TEST.LOG" instead of
"PS:<SPOOL>ORION-SYSTEM-LOG.001", and does no renaming of any
old log files present.
2. ORION will not set up any NSP servers when debugging. It
therefore will not speak to remote nodes to run OPRs for
them. However, there are hooks for ORION to initialize
"SRV:128" instead of the usual "SRV:47" when debugging.
6.4 QMANGR
QMANGR has also been modified to look for a private QUASAR's PID if
the low segment has a non-zero entry in .JBOPS.
6.5 CDRIVE
CDRIVE can pose a problem to debug, since it has potentially many
inferior forks all executing the same code, so each fork automatically
loads SDDT into its address space and jumps to it when it starts up.
After setting any breakpoints or otherwise modifying this fork's code,
the debugger types "GO<ESC>G" to resume the fork. While debugging, if
the fork terminates (crashes), CDRIVE will not go through its normal
purging of the crashed fork, so that its status can be examined.
Page 15
EXAMINING GALAXY CRASH FILES
7.0 EXAMINING GALAXY CRASH FILES
All GALAXY components use the stopcode facility supplied by GLXLIB.
This facility dumps the ACs, program error codes, associated error
messages, program version numbers, and the last nine locations of the
stack onto the controlling terminal of the program executing the
stopcode. In addition, a crash file is created with the name of the
form: PS:<SPOOL>program-stopcode-CRASH.EXE. This .EXE file contains
the entire core image of the program which has crashed, and is
extremely useful in determining the cause of the crash. In
particular, there is a block of data referred to as the "crash block"
which usually contains the information most pertinent to the debugger.
This information can be read with either DDT or FILDDT. Its contents
are tabulated as follows:
Location Data
.SPC PC of stopcode
.SCODE SIXBIT name of stopcode
.SERR Last TOPS-20 error code
.SACS Contents of the sixteen accumulators
.SPTBL Base address of page table used by
GLXMEM
.SPRGM Name of program in SIXBIT
.SPVER Program version number
.SPLIB GLXLIB version number
.LGERR Last GALAXY error code
.LGEPC PC of last GALAXY error return