monitor-sources/bugs.mac
; BUG list created by MAKBUG 1(25) on 27-Sep-90 06:04:16
; TOPS-20AN DECnet System, TOPS-20AN Monitor 7(21733)
; Switches are: DEBUG=OFF, DCN=ON, NETN=ON
BUG.(HLT,ABKSKD,PAGEM,SOFT,<Address break from scheduler context>,,<
Cause: A page failure occurred while the monitor was running in scheduler
context, and the page fail word indicated an address break. Address
breaks can only be set in code that runs in process context.
>)
BUG.(CHK,ACJDIE,MEXEC,SOFT,<ACJ fork has crashed>,,<
Cause: The ACJ fork was running under monitor context but has terminated.
The monitor is killing the fork.
Action: Find out why the ACJ fork died. It is possible it crashed because
of a coding error within the program. Debug the ACJ in user mode.
The ACJ can be restarted in monitor context.
>,,<DB%NND>)
BUG.(HLT,AGSETX,PAGEM,SOFT,<AGESET - XB needs checking>,,<
Cause: An index block that has been "forced out" needs to be swapped in and
checked.
>)
BUG.(CHK,ANIOPF,IMPANX,HARD,<IO page fail from AN20>,<<P1,NCT>>,<
Cause: The AN20 was found to be the cause of an IO Page Fail.
This means that the hardware got a page fault during an interrupt
instruction. This may indicate a hardware problem with the AN20.
Data: NCT - NCT at the time of the BUGCHK.
>,,<DB%NND>)
BUG.(HLT,APRAPE,APRSRV,HARD,<Address parity error>,,<
Cause: An APR interrupt occurred because a memory controller signaled that it
received an address with even parity from the processor. There is a
description of the problem on the CTY.
Action: This is usually seen with broken hardware. Field Service should check
the system. Using SBUS diag 0 for all memory controllers, check the
address parity error bit(s). Test the bus and controller.
>)
BUG.(HLT,APRNX1,APRSRV,HARD,<NXM detected by APR>,,<
Cause: An APR interrupt occurred because the processor attempted to access a
memory that did not respond within a preset time. This can indicate
broken hardware or a software bug. The monitor has printed a
description of the problem on the CTY.
Action: This BUGHLT is usually caused by faulty hardware, and Field Service
should check the system.
The analysis of this BUGHLT is extremely complicated. The physical
address from the error register is printed on the CTY ("ERA="). If
there is physical memory at this address, the problem is probably in
the hardware. If the address does not exist, the problem may be in
either hardware or software.
One software problem that has led to this BUGHLT in the past is code
that returns an SPT slot to the free pool while leaving a pointer to
that slot in some page table. The content of the SPT entry, instead of
being a pointer to memory, is a pointer to another SPT slot. In this
case, a page fault has occurred just before the interrupt. The
PC points into the page fault handler. The page fault word and PC
(TRAPSW and TRAPPC, respectively) indicate the virtual address and
instruction at the time of the page fault. Tracing this virtual address
to the SPT produces the erroneous SPT entry.
If this BUGHLT is seen with healthy hardware and is reproducible,
submit an SPR along with the dump and instructions on reproducing the
problem.
>)
BUG.(HLT,APRNX2,APRSRV,HARD,<NXM detected by APR>,,<
Cause: A page fault occurred, indicating that the processor attempted to
access a memory that did not respond within a preset time. The monitor
is presently processing an interrupt or running in the scheduler and
the interrupt system is turned on. Since non-existent memory also
produces an APR interrupt, which results in an APRNX1 BUGHLT, this
BUGHLT does not normally occur.
Action: This is usually a hardware problem. See the action for APRNX1. Note,
however, that the occurrence of this BUGHLT instead of APRNX1 may
indicate a failure in the interrupt system.
>)
BUG.(CHK,ARCASS,JSYSF,HARD,<ARCF - File directory and mapped directory do not match>,,<
Cause: The directory number of the currently mapped directory does not
match the directory number of the file that we are attempting to set
tape information for in an ARCF% .ARRST request.
>)
BUG.(CHK,ARCVER,IPCF,HARD,<ARCMSG - NOUT failed>,,<
Cause: ARCMSG attempted to NOUT the generation number of a file and it
failed.
Action: If this problem becomes chronic, then change this BUGCHK to a BUGHLT.
Determine why NOUT% failed by looking at the error it has returned.
It is possible that the disk we are trying to write to is having
hardware problems. If this is the case, have Field Service look at
the disk.
>)
BUG.(HLT,ARSTXX,JSYSF,SOFT,<ARRST - FDB disappeared for destination file>,,<
Cause: The FDB for a file being restored from offline does not exist.
>)
BUG.(CHK,ASAASG,DSKALC,SOFT,<DSKASA - Assigning already assigned disk address>,<<T1,STRCOD>,<T2,SECTOR>>,<
Cause: The sector being assigned on the disk is already assigned.
This may happen during creation of structure when assigning
swapping space, and the sector is already assigned in the
BAT blocks. This can also be caused by redundant BAT block entries.
Data: STRCOD - Structure Unique Code
SECTOR - Sector Number on Disk Relative to Start of Structure
>,,<DB%NND>)
BUG.(CHK,ASGBAD,DSKALC,SOFT,<DSKASA - Assigning bad disk address>,<<T3,STRCOD>,<T2,SECTOR>>,<
Cause: The sector being assigned was not within the legal range
of sector numbers.
Data: STRCOD - Structure Unique Code
SECTOR - Sector Number on Disk Relative to Start of Structure
>)
BUG.(CHK,ASGBPG,DSKALC,SOFT,<INIBTB - Failed to assign bad page(s)>,<<T1,STRCOD>,<T2,AMOUNT>>,<
Cause: The bit table is being initialized; home blocks, pages
in the BAT blocks, and swapping space are being assigned.
Address(es) in the BAT blocks were not assigned.
Data: STRCOD - Structure unique code
AMOUNT - Number of addresses not assigned
>,,<DB%NND>)
BUG.(HLT,ASGFR0,FREE,SOFT,<ASGFRE - Illegal to assign 0 free space>,,<
Cause: An illegal request for free space is being made. The calling routine
is asking for zero words of free space.
Action: Look at the dump. By backing up the stack you should be able to
tell what routine called for the illegal free space.
>)
BUG.(CHK,ASGINT,FREE,SOFT,<ASGFRE called OKINT>,<<C,CALLER>>,<
Cause: This is a free space problem. Calls to swappable free space
routines should be made only while the calling process is NOINT. The
calling routine is not protecting itself from losing free space. It is
OKINT and it could get interrupted and never return,
thus losing the free block assigned.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed. The dump shows the routine
which is calling OKINT. Make the routine be NOINT until it has
ensured that the block is freed when it is interrupted
(e.g. JSB stack).
Data: CALLER - The address of the calling routine
>)
BUG.(CHK,ASGREP,FREE,SOFT,<Illegal priority given to ASGRES>,,<
Cause: This is a free space problem. The caller is asking for resident
free space. In T3 the caller gives a priority for this request.
The priority determines how ASGRES is going to handle this request
when free space is low. This priority is out of range.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>)
BUG.(CHK,ASGREQ,FREE,SOFT,<Illegal pool number given to ASGRES>,,<
Cause: This is a free space problem. The caller is requesting resident
free space. In T2 the caller is providing the pool number from
which the free space should come. This pool number is incorrect.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>)
BUG.(HLT,ASGSW2,PAGEM,SOFT,<SWPOMG - Cannot assign reserved drum address>,,<
Cause: The monitor is swapping a group of pages to a set of contiguous pages
in the swapping space. The swapping space manager has provided a
starting address for a block of free pages. An attempt to assign one
of the pages has failed. This indicates an inconsistency in the
monitor's data or a race condition. For example, a context switch may
have occurred when it was not expected.
>)
BUG.(CHK,ASGSWB,SWPALC,SOFT,<SWPINI - Cannot assign bad address>,,<
Cause: Cannot assign bad drum address because it is an illegal address or
already assigned.
Action: There could be a problem with the boot structure on this system. If
this BUGCHK is reproducible with a healthy boot structure, change it to
a BUGHLT and submit an SPR along with instructions on reproducing the
problem.
>)
BUG.(CHK,ASNTVJ,TVTSRV,SOFT,<Invalid TTACTL entry>,<<T2,TTYLIN>>,<
Cause: Routine ASNTVT has discovered a bad entry in TTACTL. The monitor will
ignore this TVT and keep looking for a free one.
Data: TTYLIN - TTY line number that was passed to ASNTVT
>)
BUG.(HLT,ASOFNF,DISC,SOFT,<DELFIL - ASGOFN gave fail return for long file XB>,,<
Cause: A long file was being deleted and ASGOFN could not assign a system file
number (OFN). This usually happens because there were not enough OFN
slots.
Action: If this happens frequently, rebuild the monitor with more OFN slots.
>)
BUG.(HLT,ASTJFN,LOOKUP,SOFT,<GETFDB - Called for JFN with output stars>,,<
Cause: The monitor tried to get a pointer to the FDB of an output file, and
the file specification used as an argument to GTJFN contains
asterisks.
>)
BUG.(CHK,BADADR,MNETDV,SOFT,<No NCT for address>,<<T1,ADR>>,<
Cause: The multinet output queuing mechanism was called for a local
address that has not been defined.
Data: ADR - Host address
>)
BUG.(CHK,BADBAK,FILINI,HARD,<FILIN2 - Backup copy of ROOT-DIRECTORY is not good>,,<
Cause: The backup copy of the root directory on the boot structure is bad.
Action: Re-create the backup copy of the root directory. If this is not
possible, rebuild the boot structure. If the trouble persists, have
Field Service check the system.
>,,<DB%NND>)
BUG.(CHK,BADBAT,DSKALC,HARD,<BAT blocks unreadable>,,<
Cause: BAT block header contains bad information.
>,,<DB%NND>)
BUG.(HLT,BADBUF,MNETDV,SOFT,<Null buffer address>,,<
Cause: Multinet has been called to queue an output buffer with a null
address.
>)
BUG.(HLT,BADDAC,DIRECT,SOFT,<INSACT - Null account string seen>,,<
Cause: A null account string was given for insertion into the FDB by the
monitor during the creation of a file or while executing a SACTF JSYS.
>)
BUG.(HLT,BADFEV,PHYSIO,SOFT,<CHKPDB - Wrong or bad front-end version>,,<
Cause: The RSX20F front-end did not send a type 40 message to identify the
front-end disk serial numbers.
Action: Run a more recent copy of the RSX20F front-end software. Version 16-00
of RSX20F was released with TOPS-20 version 7.0.
>)
BUG.(CHK,BADIDX,FILINI,HARD,<IDXINI - Partially unsuccessful index table rebuild>,,<
Cause: IDXINI failed to create the index-table file during file structure
creation. Some of the directories may have been entered into the file
before this failure occurred.
Action: There appears to be a hardware problem on this system. If the trouble
persists, have Field Service check the system. If the problem can be
reproduced on healthy hardware, send in an SPR along with a dump and
instructions on reproducing the problem.
>,,<DB%NND>)
BUG.(HLT,BADIRB,PHYSIO,SOFT,<Bad IORB passed to GIVIRB>,,<
Cause: An IORB was passed to GIVIRB that does not have a legal address. This
indicates a software problem in the monitor.
>)
BUG.(HLT,BADPTR,PAGEM,SOFT,<Bad section pointer - SECMAP>,,<
Cause: A caller to SECMAP is trying to delete a section pointer and the
section pointer being deleted was not one of the types (share or
indirect) expected.
>)
BUG.(HLT,BADREC,FILINI,HARD,<FILINI - Reconstruction of ROOT-DIRECTORY failed>,,<
Cause: One of the following failures occurred during attempted reconstruction
of the root-directory during system startup: could not get OFN for
backup root-directory; could not get OFN for the root-directory; could
not assign a page in the job data area to build the backup index block;
or the backup root-directory is clobbered.
Action: There could be a hardware or software problem. If the trouble
persists, have Field Service check the system. If the problem can be
reproduced on healthy hardware, send in an SPR along with a dump and
instructions on reproducing the problem.
>)
BUG.(HLT,BADROT,FILINI,HARD,<FILIN2 - ROOT-DIRECTORY is invalid>,,<
Cause: When the system was coming up, BLKSCN was called to check the
consistency of the root directory. This error means that BLKSCN found
that the root directory had an unrecognizable type, the last block did
not have the expected length, or some block had an incorrect format.
Action: Use EDDT to break after BLKSCN and examine the error code in AC1. This
code indicates what is wrong with the root directory. If
other BUGCHKs or BUGINFs occur with this BUGHLT, they also can provide
helpful information.
If the system can be brought up using another structure as PS:, the bad
structure can sometimes be fixed with various tools such as FILDDT. If
this restoration fails, the pack can be started afresh and pertinent
DUMPER backup tapes can be used to restore the structure.
If the trouble persists, have Field Service check the system. If the
problem can be reproduced on healthy hardware, send in an SPR along
with a dump and instructions on reproducing the problem.
>)
BUG.(CHK,BADTAB,JSYSA,HARD,<VERACT - Spurious hash table encountered>,,<
Cause: This BUG indicates that a block has been found in the monitor's account
data base that is not an account block. The account data base is
corrupted.
Action: There is a problem with the accounts data base. A new accounts data
base should be installed with ACTGEN. If this BUGCHK is reproducible,
set this bug dumpable and submit an SPR along with the dump and
instructions on reproducing the problem.
>)
BUG.(HLT,BADTTY,TTYSRV,SOFT,<Transfer to nonexistent TTY code>,,<
Cause: The transfer vector for a non-existent TTY line type was referenced.
The stack should indicate which routine caused the reference.
>)
BUG.(HLT,BADTYP,TAPE,HARD,<Bad label field desc>,,<
Cause: This is a bug in TAPE. The internal routines in TAPE have a table with
codes that describe the type of data in particular label fields
(octal, string, decimal). One of these tables has a code that is out of
range. Try to find out where the out-of-range code came from.
>)
BUG.(HLT,BADXT1,FILINI,HARD,<FILINI - Index table missing and can not be created>,,<
Cause: During system startup, MAKIDX failed to recreate the index table file.
This will occur, for instance, if one of the following conditions
exists: ASGJFR fails to get free space; STRST fails to create a
filespec; GTJFN fails to create the file; or OPENF fails to open it.
Action: Select a different disk pack to build the system on. If the trouble
persists, have Field Service check the system.
>)
BUG.(INF,BADXT2,FILINI,HARD,<FILINI - Index table missing and was created>,,<
Cause: During system startup, FNDIDX failed to get an OFN for the index table
file so MAKIDX has been called to create a new one.
>,,<DB%NND>)
BUG.(HLT,BADXTB,FILINI,SOFT,<FILINI - Could not initialize index table>,,<
Cause: This can happen either because IDXINI failed during normal system
startup, or because MAKIDX failed during a special startup while the
boot structure was being created.
>)
BUG.(HLT,BKUPDF,PAGEM,SOFT,<BKUPD - Bad CST1 entry or inconsistent CST>,,<
Cause: A routine has been called to swap a core page to disk or drum. It has
decided to swap to the disk. The BUGHLT indicates that no back address
was found in the CST. This usually indicates bad data in the CST or a
bad pointer in a page table.
>)
BUG.(CHK,BLKF1,IO,HARD,<BYTINA - BLKF set before calling service routine>,,<
Cause: This is a consistency check in BYTINX. The environment is in IO where
sequential input is being processed. The code is getting ready to
jump to the device dependent code. Before doing so it sees if a bit
(BLKF) is set in STS (AC 8). This bit indicates that the service
routine wants to block. Therefore, no matter what the device
dependent routines do, the process ultimately blocks. It is
unlikely that this is being done on purpose. It is more likely that
somewhere BLKF is not being cleaned up properly.
Action: If this is becoming a problem, change the BUGCHK to a BUGHLT and
look at the dump. If FILSTS for the current JFN has the bit on then
the problem gets a little tricky since the previous use of it left
BLKF on. If BLKF is off in FILSTS then somewhere past the call to
CHKJFN it is being turned on.
>)
BUG.(CHK,BLKF2,IO,HARD,<BYTOUA - BLKF set before call to service routine>,,<
Cause: This is a consistency check in BYTOUA. The environment is in IO just
before it gets ready to call the device dependent routines to do
output. Bit BLKF in STS (AC 8) is on. It should be off. It causes
the process to block. It is unlikely that this sort of knowledge
is available. It is more likely that this is an error.
Action: If the problem persists, change the BUGCHK to a BUGHLT and look at
the dump. If FILSTS for the current JFN has the BLKF bit on then
the last one to use the JFN left it in that state, which is a hard
problem to find. If BLKF is off in FILSTS then somewhere after the
call to CHKJFN the bit is being set and not reset.
>)
BUG.(CHK,BLKF3,JSYSF,HARD,<CLZDO - BLKF set before call to service routine>,,<
Cause: BLKF has been set before the call to the device dependent service
routine which should be responsible for setting this bit.
Action: If this BUGCHK persists, change it to a BUGHLT and find out where
the bit is being set.
>)
BUG.(CHK,BLKF4,JSYSF,HARD,<.GDSTS - BLKF set before call to device routine>,,<
Cause: The bit indicating that a device routine wishes to block has been
set before the call to the device routine has been made. This bit
must be set to zero before the call so we do not block needlessly
(maybe never to wake up).
Action: The bit is being cleared. If this problem persists, change the
BUGCHK to a BUGHLT and find out where BLKF is being set.
>)
BUG.(CHK,BLKF5,JSYSF,HARD,<.MTOPR - BLKF set before call to device routine>,,<
Cause: The bit indicating that a device routine wishes to block has been
set before the call to the device routine has been made. This bit
must be set to zero before the call so we do not block needlessly
(maybe never to wake up).
Action: The bit is being cleared. If this problem persists, change the
BUGCHK to a BUGHLT and find out where BLKF is being set.
>)
BUG.(CHK,BLKF6,JSYSF,HARD,<.SDSTS - BLKF set before call to device routine>,,<
Cause: The bit indicating that a device routine wishes to block has been
set before the call to the device routine has been made. This bit
must be set to zero before the call so we do not block needlessly
(maybe never to wake up).
Action: The bit is being cleared. If this problem persists, change the
BUGCHK to a BUGHLT and find out where BLKF is being set.
>)
BUG.(INF,BREAKI,JSYSA,SOFT,<Password guess threshold exceeded>,<<T1,CTRLTT>,<T2,USERNO>,<T3,STRNAM>,<T4,DIRNUM>>,<
Cause: Someone has typed more than MXFLCT incorrect passwords. All password
validation attempts by this job are denied for 3 minutes (MINTVL).
It is possible the person is trying to guess passwords.
Action: See if someone is trying to guess a user's password or if the user is
really making an honest mistake.
Data: CTRLTT - The line number of the job
USERNO - The user number if the job is logged in
STRNAM - The sixbit name of the structure of the target
DIRNUM - The directory number of the target
>,,<DB%NND>)
BUG.(HLT,BTBCR1,FILINI,HARD,<FILINI - No bit table file and unable to create one>,,<
Cause: MNTBTB failed, so the system restart logic called CRTBTB to create a
new bit table. CRTBTB failed too, so the BTBCR1 BUGHLT happened.
CRTBTB fails if INIBTB fails. This can happen if DSKASA fails to
assign a disk page, or if SWPASN fails to assign swapping space.
Action: Select a different disk pack to build the system on. If the trouble
persists, have Field Service check the system.
>)
BUG.(HLT,BTBCRT,FILINI,HARD,<FILRFS - Could not initialize bit table for Boot Structure>,,<
Cause: During special system startup in which PS: was being refreshed, CRTBTB
failed to build a new bit table. See BTBCR1 for more detail on why
CRTBTB fails.
Action: Select a different disk pack to build the system on. If the trouble
persists, have Field Service check the system.
>)
BUG.(INF,CCBROT,DIRECT,HARD,<CPYBAK - Can't copy backup root-directory>,<<T1,LSTERR>>,<
Cause: The monitor has detected a problem with the backup root-directory and
is attempting to copy the primary root-directory to the backup. The
copy failed.
Action: Determine which disk was being used at the time and have Field Service
check the device to see if it is working properly.
Data: LSTERR - Error returned from CPYBAK
>,R,<DB%NND>)
BUG.(HLT,CDILVT,CDRSRV,HARD,<Illegal device function code>,,<
Cause: In CDRSRV an illegal function code was specified. The
function codes allow for opening, closing, resetting, etc. A code
was specified that is out of range.
Action: Use the stack to find the routine that is specifying the wrong code.
The code is usually specified as the address field of a CALL
instruction.
>)
BUG.(HLT,CFACCF,CFSSRV,SOFT,<CFSSRV - SC.ACC failed>,<<T2,NODE>,<T1,CID>,<T3,ERR>>,<
Cause: The call to SC.ACC to accept a connection failed. This indicates a
possible problem with SCAMPI.
Data: NODE - Node Number
CID - Connect ID
ERR - Error returned by SC.ACC
>)
BUG.(HLT,CFANAE,CFSSRV,SOFT,<CFSSRV - No allocation entry>,,<
Cause: The caller wanted to update a directory allocation entry, and no such
entry could be found.
>)
BUG.(HLT,CFAOFM,CFSSRV,SOFT,<CFSSRV - OFN mismatch in cached token>,,<
Cause: A cached access token has been found for a certain OFN but the OFN
stored in the hash block does not agree.
>)
BUG.(HLT,CFBAFN,CFSSRV,SOFT,<CFSSRV - Bad function to CFSDAU>,,<
Cause: CFSDAU was called with an invalid function.
>)
BUG.(CHK,CFCCLZ,CFSUSR,SOFT,<CFSSRV - Can't close CFS connection>,<<T2,NODE>,<T1,CID>,<T3,ERROR>>,<
Cause: A "set CI offline" has been requested and SCA refuses to close a CFS
connection. The call to SC.DIS to disconnect from the remote node
failed. This may result in a CFRECN BUGHLT when the CI is put on-line.
Action: It is unlikely that this can be caused by a software problem. Field
Service should run diagnostics on the CI20. Field Service should make
sure that you have up to date hardware (most recent link boards) in
your CI20. If this problem is reproducible, set this BUGCHK dumpable
and send in an SPR along with the dump and instructions on reproducing
the problem.
Data: NODE - node number of remote
CID - connect ID
ERROR - error code returned by SCA
>)
BUG.(INF,CFCCML,CFSSRV,SOFT,<CFSSRV - Cluster cease message lost>,,<
Cause: Another system sent a "cluster cease" that could not be queued because
there was no available resident free space.
Action: If this becomes persistent, find out why there is no freespace
available. Run SYSDPY and use the RE display to see if the general
pool is being used. If there is no one particular freespace hog,
consider building your monitor with an increase in the freespace
general pool. If this problem is reproducible, set this BUGINF
dumpable and submit an SPR along with the dump and instructions on
reproducing the problem.
>,,<DB%NND>)
BUG.(INF,CFCDCF,CFSSRV,SOFT,<CFSSRV - Cluster dump connect attempt failed>,<<Q2,NODE>,<T1,ERR>>,<
Cause: An attempt to connect to the cluster dump listener on another node has
failed. Thus, when this node crashes, the remote node does not.
Chances are, this BUGINF does not appear on the CTY or in ERROR.SYS,
but it should be queued in the dump.
Action: If this problem is reproducible, set this BUGINF dumpable and submit an
SPR along with the dump and instructions on reproducing the problem.
Data: NODE - CI node number of remote system
ERR - Error code returned by SC.CON
>,,<DB%NND>)
BUG.(HLT,CFCLDP,CFSSRV,SOFT,<CFSSRV - Forced cluster dump>,,<
Cause: A call was made to CFSDMP to force a cluster dump. The other systems
in the cluster should have crashed with a KLPDMP BUGHLT. This occurs
when location 67 has a non-zero value or if the code actually calls
into CFSDMP directly.
Action: If the cluster dump was not performed legitimately, then it is possible
that the monitor mistakenly trashed location 67 causing the cluster
dump. In this case, an SPR should be submitted along with the dump(s)
from the systems and any instructions on reproducing the problem.
>)
BUG.(INF,CFCONN,CFSSRV,SOFT,<CFSSRV - CFS connection>,<<T2,NODE>,<T1,CID>,<T3,SERNUM>>,<
Cause: A CFS connection has been received from another node on the CI20.
Data: NODE - Number of connecting node
CID - Connect ID
SERNUM - Serial number of remote node
>,,<DB%NND>)
BUG.(HLT,CFCTNF,CFSSRV,SOFT,<CFSSRV - Could not find cached token while uncaching>,,<
Cause: CFSUNC has been called to uncache the token for a certain OFN. The
file access token should have been cached and in the hash table but it
was not found. To diagnose the problem, attempt to locate the token in
question and find out where it is (probably on the free list). It
is not easy to determine how the token got there.
>)
BUG.(CHK,CFDDSN,CFSSRV,HARD,<CFSSRV - Duplicate DSN detected>,<<T1,HGHDSN>,<T2,LOWDSN>,<T3,ALIAS>>,<
Cause: Routine CFMDSN was called to register a DSN for a disk mount. However,
the DSN supplied was already in use by a structure of another name.
The most likely cause of this BUGCHK is that there is more than one
drive with the same serial number available to the system.
Action: Contact Field Service and have them change the serial number on one of
the drives.
Data: HGHDSN - High order disk serial number
LOWDSN - Low order disk serial number
ALIAS - Alias name of the disk being mounted
>)
BUG.(HLT,CFDGON,CFSSRV,SOFT,<CFMDSN - DSN token has disappeared>,,<
Cause: In routine CFMDSN, CFSGET has granted us access to an already existing
DSN token. But when we tried to look up the token in the CFS data base, it
could not be found. Examination of the dump should try to determine
how the DSN token was released. The only way a DSN token can be
released is by dismounting the structure to which it belongs. This
should not happen because we have the device tables locked while we are
in CFMDSN.
>)
BUG.(INF,CFDISC,CFSSRV,SOFT,<CFSSRV - CFS disconnect>,<<T2,NODE>,<T1,CID>,<T3,SERNUM>>,<
Cause: A CFS disconnect request has been received from a remote node on the
CI20.
Data: NODE - Remote node number
CID - Connect ID
SERNUM - Serial number of remote node
>,,<DB%NND>)
BUG.(INF,CFDLSF,CFSSRV,SOFT,<CFSSRV - Failed to get dump listener>,<<T1,ERR>>,<
Cause: The call to SC.LIS to set up the dump listener failed. This system
does not participate in any cluster dump.
Action: If this problem is reproducible, set this BUGINF dumpable and submit an
SPR along with the dump and instructions on reproducing the problem.
Data: ERR - Error code returned by SC.LIS
>)
BUG.(HLT,CFEQHF,CFSSRV,SOFT,<ENQ token not found>,,<
Cause: An ENQ token was just created and one already existed so the block
which was passed has been released. Now, we are attempting to find the
original block and this has failed. To diagnose this problem, look in
CFS free space to try to find the remains of the original block. Try
to determine how the block could have been released even though this
routine acquired it.
>)
BUG.(HLT,CFEQSF,CFSSRV,SOFT,<CFSSRV - Could not convert OFN to JFN>,,<
Cause: ENQSET was called to set up the root and qualifier for the ENQ token
for a certain OFN. This OFN was dismounted so the routine must look in
the JFN block for the structure name because the STRTAB entry is zero.
However, the conversion from OFN to JFN failed. To diagnose this
problem, investigate the dump to try to determine why the call to
OFNJFN failed.
>)
BUG.(HLT,CFGARD,CFSSRV,SOFT,<CFSSRV - Vote packet address is bad>,,<
Cause: A bad vote packet address has been given to CFSWDN. The guard word did
not contain 252525.
>)
BUG.(HLT,CFKBNS,CFSSRV,SOFT,<CFSSRV - Keep bit not set>,,<
Cause: CFSAWT/CFSAWP was called to get a CFS resource block for a token. This
routine ALWAYS sets the HSHKPH bit when it obtains this token. This
BUGHLT was arrived at when the routine was returning to its caller and
the keep bit was not set. The dump should reveal how the keep bit got
cleared or who cleared it.
>)
BUG.(HLT,CFLISF,CFSSRV,SOFT,<CFSSRV - SC.LIS failed>,<<T1,ERR>>,<
Cause: The call to SC.LIS failed in CFSLSN. This indicates a possible problem
with SCAMPI.
Data: ERR - Error code returned by SC.LIS
>)
BUG.(HLT,CFNLTK,CFSSRV,SOFT,<CFSSRV - Null disk address given to CFSAWT>,,<
Cause: A call was made to create an OFN access token but SPTH for the OFN is
not set up.
>)
BUG.(HLT,CFRECN,CFSSRV,SOFT,<CFSSRV - Illegal reconnect>,<<T2,NODE>,<T1,CID>>,<
Cause: The VC between this system and another has been continued illegally.
This BUGHLT only occurs in a cluster and on the system
with the lowest CPU serial number. It happens when a system joins the
cluster and that system has been away longer than the CFS vote delay
period. This time period is calculated to be
(DLYTIM*maximum_CI_node_seen)+NDSTTM. Currently this is 5 seconds per
CI node plus 10 seconds grace period.
Action: This BUGHLT usually occurs when a system has been hung for a long
period of time on the CI or if someone has stopped and continued a
halted system in the cluster (using EDDT breakpoints or other methods).
If this problem is persistent, there is a good chance that one of the
systems in the cluster is having CI problems. Another step which could
be taken would be to increase the value in DLYTIM and see if this
provides enough time for the cluster to stabilize.
Data: NODE - Number that has re-established a connection
CID - Connect ID
>)
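; Worked example of the CFRECN vote delay period described above. This is an
; illustrative sketch only: the value 4 for maximum_CI_node_seen is an assumed
; figure chosen for the arithmetic, not taken from the monitor sources.
;	delay = (DLYTIM * maximum_CI_node_seen) + NDSTTM
;	      = (5 seconds * 4) + 10 seconds
;	      = 30 seconds
; Under that assumption, a system that has been away from the cluster for
; longer than 30 seconds would exceed the vote delay period.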
BUG.(HLT,CFSBNO,CFSSRV,SOFT,<CFSSRV - Broadcast of unknown OFN>,,<
Cause: CFSBEF was called to broadcast the EOF pointer for an OFN. This OFN
does not have an entry in the CFSOFN table.
>)
BUG.(HLT,CFSBTP,CFSSRV,SOFT,<CFSSRV - Bad token packet>,,<
Cause: CFSAWT has been called to acquire an access token for an OFN. The OFN
access token already exists on this system and the block address is in
CFSOFN. But the OFN recorded in the block does not match the one
passed into CFSAWT.
>)
BUG.(HLT,CFSICN,CFSSRV,SOFT,<CFSSRV - Illegal configuration>,,<
Cause: This system has detected an illegal configuration. There may be too
many nodes in the network. The caller of this routine should be
examined for more details.
>)
BUG.(HLT,CFSIGT,CFSSRV,SOFT,<CFSSRV - Illegal return from CFSGET>,,<
Cause: A call to CFSGET, CFSGTT or CFSGTL returns +1 even though a
wait-until-successful was requested.
>)
BUG.(CHK,CFSILJ,CFSUSR,SOFT,<CFSSRV - Illegal Local Job Number>,,<
Cause: LCL2GL has been called to convert a local job number to a global index,
but the local job number is invalid.
Action: If this problem is reproducible, set this BUGCHK dumpable and send in
an SPR along with the dump and instructions on reproducing the problem.
>)
BUG.(HLT,CFSION,CFSSRV,SOFT,<CFSSRV - Invalid OFN number>,,<
Cause: CFSAWT or CFSAWP has been called with an invalid OFN number. These
routines expect the OFN number to be in the right half of T1 and to be
in the expected range of values. The caller in the dump should be
examined to determine why it passed a bad OFN number.
>)
BUG.(HLT,CFSKPD,CFSSRV,HARD,<CFSSRV - The KLIPA failed>,,<
Cause: The KLIPA hardware or the CI has failed and CFS cannot continue.
Action: Have Field Service check out the system's CI20.
>)
BUG.(HLT,CFSMPB,CFSSRV,SOFT,<CFSSRV - CFSMAP returned in-use entry>,,<
Cause: CFSMAP has returned a resource block that is already in use. This bug
is a debugging check.
>)
BUG.(CHK,CFSMTF,CFSSRV,SOFT,<CFSSRV - MTA resource lock fouled>,,<
Cause: CFSMTA was called to acquire the MTA access token for an MTA device,
but the share count for the token was not zero. This indicates that
CFSMTA has been called twice for the same MTA access token.
Action: If this problem is reproducible, set this BUGCHK dumpable and submit an
SPR along with the dump and instructions on reproducing the problem.
>,CFSRSK)
BUG.(CHK,CFSMTO,CFSSRV,SOFT,<CFSSRV - MTA resource not owned at release>,,<
Cause: CFSMTR was called to release the MTA access token for an MTA device,
but the resource was not locked. This indicates a problem in getting
the resource at the time that the MTA device was assigned to the job.
Action: If this problem is reproducible, set this BUGCHK dumpable and submit an
SPR along with the dump and instructions on reproducing the problem.
>)
BUG.(CHK,CFSMTU,CFSSRV,SOFT,<CFSSRV - CFSMTA called for assigned device>,,<
Cause: CFSMTA was called to acquire the MTA access token for an MTA device,
but the MTA device was already assigned to a different job. This check
should have been done by the calling code.
Action: If this problem is reproducible, set this BUGCHK dumpable and submit an
SPR along with the dump and instructions on reproducing the problem.
>,RSKP)
BUG.(HLT,CFSNAF,CFSSRV,SOFT,<Allocation entry not found>,,<
Cause: An allocation entry has just been created and CFSSRV can't find it in
the hash table.
>)
BUG.(HLT,CFSNOT,CFSSRV,SOFT,<CFSSRV - OFN token table and hash table disagree>,,<
Cause: CFSSRV is trying to remove a file access token and has found the token
in the hash table but not in the OFN token table. This indicates that
one of the data bases is incorrect.
>)
BUG.(HLT,CFSOAC,CFSSRV,SOFT,<CFSSRV - Invalid access to cached OFN>,,<
Cause: CFS has received a message from a remote system that an OFN has changed
and needs to be verified again. But CFS has found that the OFN is
cached and the OFNLAC bit is not set. This should not happen since
this bit should be set when the other system was granted access.
>)
BUG.(HLT,CFSOFB,CFSSRV,SOFT,<CFSSRV - OFN owned at CFSOFC>,,<
Cause: CFS has received a message from a remote system that an OFN has changed
and needs to be verified again. But CFS has found that it owns the OFN.
It should never have received such a message.
>)
BUG.(HLT,CFSRNM,CFSSRV,SOFT,<CFRDSN - Could not rename DSN entry>,,<
Cause: A pack of a mounted structure has been moved to a new unit and the new
CFS mount resource already exists for the new drive. Or, a drive on
which there is a pack of a mounted structure has been given a new drive
serial number and the new CFS mount resource already exists for the new
drive. This indicates either the CFS database is wrong, or PHYSIO's
database is wrong.
>)
BUG.(HLT,CFSSEZ,CFSSRV,SOFT,<CFSSRV - Section 0>,,<
Cause: HSHLOK was called from section zero. HSHLOK must be called from a
non-zero section. Examine the stack and change the caller of HSHLOK to
run in a non-zero section.
>)
BUG.(HLT,CFSSUF,CFSSRV,SOFT,<CFSSUG - Could not find entry to upgrade>,,<
Cause: A request was made to change the mount type of a structure, and the CFS
data base has no record of the structure being mounted.
>)
BUG.(HLT,CFSTCM,CFSSRV,SOFT,<CFSSRV - Access token cached but not marked>,,<
Cause: A cached access token is being requested. As expected, it is in the
hash table and not in CFSOFN. But HSHTAM is not set as expected.
>)
BUG.(HLT,CFSTND,CFSSRV,SOFT,<CFSSRV - Access token not deleted>,,<
Cause: CFSCON was called to verify that an access token has been deleted
before an OFN is released. This BUGHLT indicates that the token has not
been deleted or cached.
>)
BUG.(HLT,CFSTUC,CFSSRV,SOFT,<CFSSRV - Unexpected error encountered during structure operation>,<<T2,CODE>>,<
Cause: A structure mount or dismount failed and generated an unexpected or
illegal error code of zero. This should never happen.
Data: CODE - Bogus error code
>)
BUG.(HLT,CFSUCM,CFSSRV,SOFT,<CFSSRV - Uncaching mismatch>,,<
Cause: CFSUNC has been called to uncache the token for a certain OFN. We
found the access token for the OFN but the OFN stored in the resource
block does not match the one we should be uncaching.
>)
BUG.(HLT,CFSUCN,CFSSRV,SOFT,<CFSSRV - Uncaching non-cached token>,,<
Cause: CFSUNC has been called to uncache the token for a certain OFN. We
found the access token for the OFN but it is not cached.
>)
BUG.(HLT,CFSVFL,CFSSRV,SOFT,<CFSSRV - Structure verify failed>,,<
Cause: CFS could not verify an existing structure resource during the join
operation. This probably means there is a structure naming conflict.
There is one known scenario for this BUGHLT. If the CFS joining
process did not complete properly, then this system may have acquired
some of the resources exclusively when they were also held on other
nodes. By the time STRVVT is called to verify the structure access,
all the CFS connections have completed. So, now that all the other
cluster systems are voted with, the verification of exclusive access
has failed.
Action: Ensure that there is no naming conflict or drive serial number conflict
with the structures. Ensure that the BS: and PS: are not mounted
exclusively by any other system in the cluster. If the structures all
appear to be in order, submit an SPR with the dump and a copy of
MONITR.EXE.
>)
BUG.(HLT,CFSWMC,APRSRV,HARD,<Wrong UCODE for CFS>,,<
Cause: The KL10 microcode currently running does not support CFS. KL10
microcode edit 442 or later is suggested, as this microcode implements
PMOVE/PMOVEM instructions.
Action: Install the correct microcode on the front end and reload the
system. Be sure to answer "YES" to the "RELOAD MICROCODE" prompt
from KLI.
>)
BUG.(HLT,CFTYAM,CFSSRV,SOFT,<CFSSRV - Type of Access Mismatch>,,<
Cause: The local system was going to broadcast to all other systems that
it was about to update an OFN. However, it was found that the local
system does not have exclusive access to the OFN. If an OFN has been
changed in any way, it should have acquired the write token and this
should theoretically never happen.
>)
BUG.(HLT,CFWTNF,CFSSRV,SOFT,<CFSSRV - Cached OFN not found when freed>,,<
Cause: CFSFWT was called to free the file access token for an OFN and the OFN
is cached. But the call to HSHLOK did not find the resource block for
the file access token. One should be there.
>)
BUG.(HLT,CFZCNT,CFSSRV,SOFT,<CFSSRV - Zero HSHCNT before decrement>,,<
Cause: A routine wants to decrement the resource share count but the count is
already zero.
>)
BUG.(INF,CGROFN,DIRECT,SOFT,<CHKBAK - Can't get root-directory OFN>,<<T1,LSTERR>>,<
Cause: An OFN cannot be assigned for the backup Root-Directory of a file.
Action: There may be insufficient OFNs on your system. If this problem
persists, increase NOFN and rebuild your monitor. If this does not
help, then use the DOB% facility to take a dump and submit an SPR.
Data: LSTERR - Error returned from ASGOFN
>,R,<DB%NND>)
BUG.(HLT,CHKRNR,SCHED,HARD,<CHKR fork not run for too long>,,<
Cause: The monitor creates a fork in job zero that exists for the life of the
system. This fork runs periodically to perform essential system
functions. The BUGHLT occurs when the scheduler detects that the CHKR
fork has not run for too long a time.
Possible causes for CHKR not running include:
1. A disk failure that prevents job 0 from updating the disk
2. Removal of a disk that is mounted
3. An HSC or MSCP server disk is hung
4. Logic errors in the monitor.
Action: Check the console output from this system. Try to find out if any disk
problems are blocking DDMP. It is unlikely that this is a software
problem. Examination of the dump will probably show that some fork is
at the top of the go list and is blocked and NOSKED. If this fork is
DDMP then it is a disk problem. If CHKR is at the top of the go list
there is probably a software problem.
>)
BUG.(CHK,CINACF,CIDLL,SOFT,<Accept failed>,<<T1,ERRCOD>>,<
Cause: CIDLL decided to accept an incoming connection, but the accept
call to SCA failed.
Data: ERRCOD - The error code returned by SCAMPI
>)
BUG.(CHK,CINBCD,CIDLL,SOFT,<Bad CID>,<<CID,CID>,<MB,DSPTCH>>,<
Cause: SCAMPI supplied a bad connect ID on a callback to CIDLL.
CIDLL does not have any connection open with the particular
connect ID. This bugcheck may occur if one system crashes
while a DECnet/CI message is outstanding.
Data: CID - The bad connect ID
DSPTCH - The function dispatch word
>,RTN)
BUG.(CHK,CINBSC,CIDLL,SOFT,<Unexpected SCA callback>,<<T1,SCAFUN>,<CID,SCACID>>,<
Cause: SCAMPI issued a callback to CIDLL with an unexpected function code.
Data: SCAFUN - The bad function code
SCACID - The connect ID that SCAMPI supplied on the call
>,RTN)
BUG.(CHK,CINFRB,CIDLL,SOFT,<Failed to recycle buffer>,<<CID,SCACID>,<P1,BUFADR>>,<
Cause: CIDLL has received a datagram, and failed to return the buffer to
the SCA receive queue. One less buffer is now on the port
datagram receive queue.
Data: SCACID - The connect ID
BUFADR - The address of the buffer that could not be posted
>)
BUG.(CHK,CINLER,CIDLL,SOFT,<Local port # equal to remote>,,<
Cause: The ACTIVATE routine noticed that we tried to connect to ourselves.
CIDLL should have detected this before.
>,RTN)
BUG.(CHK,CINLIE,CIDLL,SOFT,<Listen failed>,<<T1,ERRCOD>>,<
Cause: CIDLL asked SCA for a "promiscuous listen" but the call failed.
As a consequence, the system does not accept any future incoming
DECnet/CI connections.
Data: ERRCOD - The error code returned from SCAMPI
>,RTN)
BUG.(CHK,CINNIC,CIDLL,SOFT,<Received illegal packet format>,<<T4,FLAGS>>,<
Cause: CIDLL received a DECnet datagram, but the packet mode was not
industry compatible.
Data: FLAGS - The flags and mode of the packet
>,RTN)
BUG.(INF,CINRRL,CIDLL,SOFT,<Remote rejected our protocol version>,<<T1,LOCVER>>,<
Cause: The remote port does not understand the protocol version of DECnet/CI.
The other system may be running another version of TOPS-20, or the
remote system may be running VMS. DECnet/CI between TOPS-20
and VMS is not supported.
Action: If this is caused by a VMS system, then remove the system from
your cluster or disable the CI circuit to that system. If it is
caused by a TOPS-20 system, then you should make both systems
run the same version of the software or disable the CI circuit
between them.
Data: LOCVER - TOPS-20 DECnet/CI protocol version
>)
BUG.(INF,CINRWP,CIDLL,SOFT,<Remote supplied wrong protocol version>,<<T1,REMVER>,<T2,LOCVER>>,<
Cause: The remote port is not running the same protocol version of DECnet/CI.
This may happen if you are running DECnet/CI between two different
versions of TOPS-20, or if you are running DECnet/CI between TOPS-20
and a VMS system. DECnet/CI between TOPS-20 and VMS is not
supported.
Action: If this is caused by a VMS system, then remove the system from
your cluster or disable the CI circuit to that system. If it is
caused by a TOPS-20 system, then you should make both systems
run the same version of the software or disable the CI circuit
between them.
Data: REMVER - The remote end's protocol version
LOCVER - The local TOPS-20 system's protocol version
>,RTN,<DB%NND>)
BUG.(CHK,CINUCB,CIDLL,SOFT,<Unexpected SCA callback>,<<T1,STATE>,<CID,SCACID>,<MB,ROUADR>>,<
Cause: SCA issued a callback to DECnet/CI that was not expected in the
current state of the connection.
Data: STATE - The DECnet/CI connection state
SCACID - The connect ID for the connection in question
ROUADR - The address of the SCA callback processing routine
>,RTN)
BUG.(CHK,CINUDR,CIDLL,SOFT,<Unexpected datagram receive>,<<CID,SCACID>>,<
Cause: CIDLL received a datagram with the connection state not being RUN.
Data: SCACID - The connect ID
>,RTN)
BUG.(CHK,CINUEC,CIDLL,SOFT,<Unexpected connect response>,<<CID,SCACID>>,<
Cause: CIDLL received a callback from SCA stating that a connection response
was available. CIDLL was not expecting any such callback for
the port.
Data: SCACID - The connect ID
>,RTN)
BUG.(CHK,CINWNB,CIDLL,SOFT,<Wrong number of buffers>,<<T2,COUNT>>,<
Cause: CIDLL asked SCAMPI to allocate 1 buffer, but received more than
one. The extra buffers are now lost.
Data: COUNT - The number of buffers SCAMPI allocated and returned
>)
BUG.(CHK,CIPBAD,IPCIDV,SOFT,<IP host number conflicts with CI node number>,<<T1,HOST>,<T2,NODE>>,<
Cause: The low order octet of the local internet host address for the CI
interface (from the SYSTEM:INTERNET.ADDRESS file) disagrees with the
system's CI node number.
Action: Fix the SYSTEM:INTERNET.ADDRESS file. The low order (right most)
octet in the address must be the same as the CI node number.
Data: HOST - CI node number from SYSTEM:INTERNET.ADDRESS file
NODE - CI node number reported by the CI20 hardware
>,,<DB%NND>)
BUG.(CHK,CIPBLP,IPCIDV,SOFT,<IPCIDV input buffer list problem>,<<T1,CNT>,<T2,BFR>>,<
Cause: The internet SCA interface requested a buffer for an incoming datagram
and none were available.
Action: If the problem does not clear up in a short period of time, change
the BUGCHK to a BUGHLT and submit an SPR. It is possible that there
are not enough input buffers for IPCI.
Data: CNT - Number of IPCI buffers
BFR - Pointer to first buffer
>)
BUG.(INF,CIPDFQ,PHYKLP,SOFT,<PHYKLP - Datagram free queue empty>,,<
Cause: The CI20 port found the datagram free queue empty.
Action: It is likely that there is a CI20 microcode or monitor software problem.
It is less likely that there is a CI20 hardware problem. If this
problem can be reproduced, change this to a BUGHLT and send in an SPR
with a dump and instructions on how to reproduce the problem.
>,,<DB%NND>)
BUG.(INF,CIPDNS,IPCIDV,SOFT,<Datagram not sent>,<<T1,ERROR>>,<
Cause: The internet SCA interface attempted to queue a buffer to SCA
and the buffer was refused. This indicates a problem with SCA.
Action: If this problem becomes chronic, change the BUGINF to a BUGHLT
and submit an SPR.
Data: ERROR - Error from SCA.
>)
BUG.(INF,CIPMFQ,PHYKLP,SOFT,<PHYKLP - Message free queue empty>,,<
Cause: The port found the message free queue empty.
Action: It is likely that there is a CI20 microcode or monitor software problem.
It is less likely that there is a CI20 hardware problem. If this
problem can be reproduced, change this to a BUGHLT and send in an SPR
with a dump and instructions on how to reproduce the problem.
>)
BUG.(CHK,CIPNBA,IPCIDV,SOFT,<No IPCI input buffers available.>,<<T1,BFRCNT>>,<
Cause: The internet SCA interface was unable to assign any buffers from
the internet free space manager. This should be a temporary condition.
Action: If the problem does not rectify itself in a reasonable period of
time, then it is possible that there are insufficient input buffers
for IPCI.
Data: BFRCNT - Number of buffers available.
>)
BUG.(CHK,CKPLEN,JSYSM,HARD,<USGINI - Illegal checkpoint entry length>,,<
Cause: While executing USGINI, an active checkpoint entry was found with an
illegal length. This could be caused by a trashed checkpoint file.
Action: Delete ACCOUNT:CHECKPOINT.BIN and reload system.
>)
BUG.(CHK,CLABIU,CLUDGR,SOFT,<CLUDGR - All buffers in use>,<<T2,ID>,<T3,BUF>>,<
Cause: The CLUDGR SYSAP is only allowed to use 2*NFKS buffers per CID.
When it reaches this limit, it cannot queue up any more. The CLUDGR
fork is supposed to return buffers when it is done with them. Also,
any user process that uses SCA buffers is supposed to return them.
This BUGCHK is telling you that all buffers are in use by the
CLUDGR SYSAP.
Action: Determine who is using up all of the SCA buffers.
Data: ID - Connect ID using lots of buffers
BUF - Number of buffers in use by this CID
>)
BUG.(INF,CLDISC,CLUDGR,SOFT,<CLUVAC - CLUDGR SYSAP disconnect>,<<T2,NODE>,<T1,CID>>,<
Cause: A CLUDGR disconnect request has been received from a remote
node on the CI-20.
Data: NODE - Remote CI node number
CID - Connect ID.
>,,<DB%NND>)
BUG.(HLT,CLFNSB,CLUFRK,SOFT,<No SCA buffers for request>,,<
Cause: This BUG indicates that a request came in from a remote system
for the local CLUDGR fork to perform. However, after the CL%ALL
bit was set in the request block, there were no SCA buffers to
reassemble. CL%ALL must have been erroneously set.
>)
BUG.(CHK,CLNOLA,CLUDGR,SOFT,<CLUINI - SCA set online failed>,<<T1,ERR>>,<
Cause: Calling SC.SOA to notify SCA of CLUDGR's online address table
failed. This indicates a problem with SCA.
Data: ERR - Error code as returned by SCA.
>)
BUG.(HLT,CLNOLS,CLUDGR,SOFT,<CLULSN - CLUDGR listener not created>,<<T1,ERR>>,<
Cause: Calling SC.LIS to establish a CLUDGR listener failed. This
indicates a problem with SCA.
Data: ERR - Error code as returned by SCA.
>)
BUG.(INF,CLORBF,CLUDGR,SOFT,<Orphaned buffer received by CLUDGR>,,<
Cause: The CLUDGR SYSAP received an SCA buffer from a remote system
but it could not find a request to associate it with. Somehow,
the request block disappeared. This could be due to a cluster
state transition.
>)
BUG.(HLT,CLOUTB,CLUFRK,SOFT,<System out of SCA buffers>,,<
Cause: The CLUDGR fork was trying to send a response to a remote
system. However, it could not obtain the SCA bufferage to
do so.
Action: Find out who is eating up the SCA buffers.
>)
BUG.(CHK,CLUACF,CLUDGR,SOFT,<CLUDGR - Accept connect failed>,<<T1,FAIL>>,<
Cause: The call to SC.ACC failed and the CLUDGR SYSAP failed to
accept a legitimate connection.
Action: If this problem becomes malignant, change this BUGCHK to a BUGHLT
and submit an SPR.
Data: FAIL - Error code as returned by SC.ACC
>)
BUG.(INF,CLUCON,CLUDGR,SOFT,<CLUDGR - Connection completed>,<<T2,NODE>,<Q2,CID>>,<
Cause: A connection to the CLUDGR SYSAP has been completed with
another TOPS-20 node on the CI-20.
Data: NODE - CI node number of connector
CID - Connect ID
>,,<DB%NND>)
BUG.(INF,CLUEIN,CLUDGR,SOFT,<CLUDGR - Establishing initial connection>,<<Q1,NODE>>,<
Cause: This BUG. appears when the CLUDGR joining code stumbles on a CI node
that a CLUDGR connection has not been established for. CLUJYN then
attempts to make a CLUDGR connection to this node. Theoretically,
SCA should have notified CLUDGR of all nodes that are online. This
BUG. is simply telling you that one of them has been missed by SCA
and CLUDGR noticed it.
Action: CLUJYN will attempt to establish a connection to the missing node.
There is no manual intervention required.
Data: NODE - CI node that SCA didn't tell us about
>)
BUG.(HLT,CLUFNC,CLUFRK,SOFT,<CLSFRK - Could not create CLUDGR fork>,<<T1,ERROR>>,<
Cause: CFORK% was unable to create a fork for CLUDGR or MSFRK%
was not able to start the CLUDGR fork in monitor mode.
Data: ERROR - Error code returned from JSYS
>)
BUG.(CHK,CLULES,CLUDGR,SOFT,<No CLUDGR connection established with remote node>,<<T1,WHY>>,<
Cause: The call to SC.CON failed to allow this machine to establish a
CLUDGR connection with the remote machine or there are no more
entries in the CLUHST table.
Action: If this is due to no more entries in CLUHST (WHY=0) then there may
be too many systems in this cluster. If the cause is a failing
call to SC.CON (WHY<>0) and this problem becomes chronic then
change this to a BUGHLT and submit an SPR.
Data: WHY - 0 if CLUHST table is full or
Error code from call to SC.CON
>)
BUG.(CHK,CLUNFE,CLUDGR,SOFT,<CLUDGR - No free entry in table>,,<
Cause: CLULSN was called to set up a CLUDGR listener for the local
system. However, this routine could not find a free entry in
the CLUHST table. This indicates a possible coding problem,
SCA malfunction, or an oversized cluster (possibly too many
KLs in the cluster).
Action: Check to see if there are more than the supported number of
KLs in the cluster. If so, remove the excess machines. If this
is not the cause and this problem becomes persistent, change
the BUGCHK to a BUGHLT and submit an SPR.
>,,<DB%NND>)
BUG.(CHK,CLUNKR,CLUDGR,SOFT,<CLUDGR - Unknown callback, returning>,<<T1,CBACK>>,<
Cause: The CLUDGR SYSAP received an unexpected callback from SCA. It
was not prepared to handle this callback. Therefore, a BUGCHK
is issued and the monitor simply returns to SCA. This could be
due to a malfunction in SCA.
Action: If this BUGCHK occurs often, change it to a BUGHLT and submit
an SPR.
Data: CBACK - Callback function from SCA
>)
BUG.(HLT,CLUNSB,CLUFRK,SOFT,<No SCA buffers for response>,,<
Cause: This BUG indicates that a response for an INFO% request was
received by the local system from a remote system and the CL%ALL
bit was set in the request block. However, there were no SCA
buffers with the data. CL%ALL might have been erroneously set
somehow.
>)
BUG.(HLT,CLUOSB,CLUFRK,SOFT,<CLUDGR fork could not get an SCA buffer>,,<
Cause: A request was made for CLUDGR fork to do something. Unfortunately,
this request failed for one reason or another. When CLUDGR fork
tried to tell the remote system it failed, the fork could not
get an SCA buffer to reply. This should not happen as SCA uses
an entire section for buffers.
>)
BUG.(HLT,CLUSCM,CLUFRK,SOFT,<CLUDGR fork send unconditionally failed>,,<
Cause: The local system attempted to send a response to a remote system
MAXTRY times. This BUGHLT indicates that the send failed each
time due to lack of receive credit on the remote system. The
system has used up all of the credits, and the send ties up a fork
on the remote system.
Action: It is possible that CI problems caused this BUGHLT.
>)
BUG.(HLT,CLUWTF,CLUDGR,SOFT,<CLUDGR - Wrong type of format for connection>,<<T2,NODE>>,<
Cause: Some node tried to connect to our CLUDGR SYSAP. Unfortunately,
the connecting node is lower than us. This BUGHLT indicates something
is definitely wrong as this is not the way CLUDGR was designed.
Data: NODE - CI node number of offending node
>)
BUG.(CHK,CLZABF,JSYSF,HARD,<CLZFFW - Service routine blocked on an abort close>,,<
Cause: The device dependent service routine for a CLOSF% wants to block,
but the user has specified an abort close.
Action: The user will block anyway in an attempt to close the file.
>)
BUG.(INF,CNTOUT,DIAG,HARD,<Read of performance counter timed out>,,<
Cause: The KLIPA did not respond to a read of the performance counters
in the allotted time.
>)
BUG.(CHK,COMBNN,D36COM,SOFT,<Bad local node number>,,<
Cause: The node number set with the NODE command in the CONFIG file
is higher than the DECNET MAXIMUM-ADDRESS value set in the same file.
DECnet cannot initialize.
Action: Make the startup file consistent.
>,RTN)
BUG.(CHK,COMCID,D36COM,SOFT,<Couldn't initialize DECNET>,,<
Cause: SCTINI has found some reason to object about the DECnet environment.
See SCTINI for the reasons it takes a non-skip return.
>,RTN)
BUG.(CHK,COMDNP,D36COM,SOFT,<DNGPOS called with bad MS>,,<
Cause: The AC MS points to memory not used by message blocks.
This was found during range checking.
Action: Trace back to the caller and find out why there is a bad pointer.
>,RTN)
BUG.(HLT,COMDT1,DTESRV,SOFT,<MOVSLJ failed>,,<
Cause: DTESRV attempted to execute a MOVSLJ instruction, but it failed.
Action: Look at the dump and try to determine why the instruction failed.
>)
BUG.(HLT,COMDTE,DTESRV,SOFT,<MOVSLJ failed>,,<
Cause: DTESRV attempted to execute a MOVSLJ instruction, but it failed.
Action: Look at the dump and try to determine why the instruction failed.
>)
BUG.(CHK,COMFWZ,D36COM,SOFT,<Tried to free words at zero>,,<
Cause: DNFWDS was called with a 0 pointer.
Action: Find the caller on the stack and determine why it has no valid
pointer to free space.
>,RTN)
BUG.(CHK,COMIEL,D36COM,SOFT,<Illegal end of list pointer>,,<
Cause: CHAVL, the available count, indicated there was at least one block
on the free list, but the first pointer was zero.
Action: A forward pointer in a block which was returned some time ago was
probably smashed.
>,DONRET)
BUG.(HLT,COMMMS,D36COM,SOFT,<Bad pointer passed to memory manager>,<<T1,BUFFER>,<T2,CALLER>>,<
Cause: When DNGWDS gives out a block of memory, a check word is left
right before the first word of memory given to the user. This
word contains the length of the block in the right half, and a "check"
quantity in the left to verify that this block is what is expected.
This bug means that this word has either been trashed, or the
pointer we have been passed is bad.
Action: First determine if the pointer is bad, if the check word is trashed,
or if the check word is 63D. If the check word is 63D the memory has
already been returned and we are trying to return it again. If the
check word is trashed then possibly the owner trashed it or the user
of the memory block previous to this one wrote too far. If FTD36MM=0
then the owner of the memory block is recorded in the block's header.
It has also been determined that COMMMS BUGHLTs can occur because
AC 0 got trashed.
For more detail see FREE.MAC.
Data: BUFFER - Address of faulty buffer
CALLER - Address of caller that provided the buffer
>,RTN)
BUG.(CHK,COMMTS,D36COM,SOFT,<New message block too short>,,<
Cause: A MOVSLJ instruction in D36COM has failed.
Action: If this problem persists and the DOB% facility does not produce
a dump, then change this BUGCHK to a BUGHLT and submit an SPR.
It is possible that there could be a KL microcode bug here so
be sure to include the version you are running in the SPR.
>,RTN)
BUG.(CHK,COMMZP,D36COM,SOFT,<DNMINI was passed a zero pointer>,,<
Cause: Some caller probably meant to ask for zero bytes of user data in T2
and mistakenly put the count in T1, which is supposed to be the
pointer to the message block to refresh.
Action: Find caller on the stack and fix it.
>,RTN)
BUG.(CHK,COMODP,D36COM,SOFT,<DNGOPS called with bad MS>,,<
Cause: The AC MS points to memory not used by message blocks.
This was found during range checking.
Action: Trace back to the caller and find out why there is a bad pointer.
>,RTN)
BUG.(CHK,COMSTB,D36COM,SOFT,<Smear request too big>,,<
Cause: The caller has requested that a very large block be smeared.
Action: Find out what the caller really wanted to smear and fix the call.
>,DNSWD1)
BUG.(CHK,CPTMAP,PAGUTL,SOFT,<SETCPT - CPTPG already mapped>,,<
Cause: A routine was called to setup CPTPG while CPTPG was already setup. All
callers should call RELCPT if CPTPG is mapped.
Action: RELCPT has been called and the system continues to run. If this
bug is reproducible, set it dumpable, and send in an SPR with the dump
and how to reproduce the problem.
>)
BUG.(CHK,CRDBAK,JSYSF,HARD,<CRDIR3 - Could not make backup copy of ROOT-DIRECTORY>,,<
Cause: CPYBAK failed to create a backup copy of the root directory during
a CRDIR% call. CPYBAK is called to re-create the backup copy of the
root directory if the root directory is the superior of the
directory being manipulated with CRDIR%.
Action: The backup copy of the root directory is now corrupted. It must be
fixed by hand.
>,,<DB%NND>)
BUG.(CHK,CRDBK1,JSYSF,HARD,<CRDIR4 - Could not make backup copy of ROOT-DIRECTORY>,,<
Cause: During an attempt to delete a directory directly inferior to the
root directory, CPYBAK failed to update the backup copy of the root
directory.
Action: The backup copy of the root directory is now corrupt and must be
repaired by hand.
>,,<DB%NND>)
BUG.(CHK,CRDNOM,JSYSF,HARD,<CRDIR - Failed to make MAIL.TXT file>,,<
Cause: While creating a new directory that is not FILES-ONLY, CRDIR% could
not create the MAIL.TXT.1 file.
Action: Find out why the MAIL.TXT.1 file could not be created. It is
most likely a problem with the disk.
>)
BUG.(CHK,CRDOLD,JSYSF,HARD,<CRGDGB - Old format CRDIR is illegal>,,<
Cause: The old format of specifying user groups to CRDIR% has been
attempted. This format is no longer supported.
Action: Change the application to use the format specified in the current
Monitor Calls Reference Manual.
>)
BUG.(CHK,CRDSDF,JSYSF,HARD,<CRDIR1 - SETDIR failed on new directory>,,<
Cause: SETDIR failed to map in a directory which has been newly created by
CRDIR%. The CRDIR% call fails and the directory is not created.
>)
BUG.(CHK,CRSPAG,JSYSA,HARD,<VERACT - Account data block crosses a page boundary>,,<
Cause: The monitor's account data base illegally crossed a page boundary.
This indicates a problem with the account data base.
Action: There is a problem with the accounts data base. A new accounts data
base should be installed with ACTGEN. If this BUGCHK is reproducible,
set this bug dumpable and submit an SPR along with the dump and
instructions on reproducing the problem.
>,,<DB%NND>)
BUG.(CHK,CSHCLR,PAGUTL,HARD,<CLROFN - Cached OFN found at CLROFN>,,<
Cause: The monitor is removing an OFN by calling CLROFN. It is found that the
OFN is "cached". CLROFN calls FRECFS and this is incorrect for cached
OFNs since FRECFS has already been called. This BUG is for debugging
and the OFN deassignment does not proceed.
Action: If this problem can be reproduced, set the bug dumpable, and send in an
SPR along with the dump and how to reproduce the problem.
>,R)
BUG.(CHK,CSHSCF,PAGUTL,SOFT,<Unable to flush cached pages to disk>,,<
Cause: An OFN is being cached so SCNOFK was called to write its pages to disk
but not flush them from memory. However, SCNOFK was not able to write
the OFN pages to disk. This BUGCHK should have followed an ILIBPT
BUGCHK since this is the only way that SCNOFK can fail.
>)
BUG.(HLT,CSKBUG,SCHED,SOFT,<ECSKED when not CSKED>,,<
Cause: An ECSKED was done when the code was not really CSKED. This is clearly
a software problem. This may cause sensitive code to be ruined because
of races.
>)
BUG.(HLT,CSONTV,TTYSRV,SOFT,<TVTCSO called with non-TVT>,,<
Cause: TVTCSO has been called to start output on a TVT with a terminal number
that does not appear to be a TVT line.
>)
BUG.(HLT,CST2I1,PAGEM,SOFT,<Page table core pointer and CST2 fail to correspond>,,<
Cause: A routine has been called to change the map for a page of a process.
The page is being mapped to a file page that is not already shared.
The code is going to create an entry for the file page in the SPT so
that the destination can have a share pointer. The page pointer in the
index block contains a core address. The BUGHLT indicates that the
owner of the core page is not the file page that points to it. This
means that there is an inconsistency in the monitor's data.
>)
BUG.(HLT,CST2I2,PAGEM,SOFT,<MVPT - CST2 inconsistent>,,<
Cause: A routine has been called to move a page from one page table to
another. The source page table has an immediate pointer to a page in
memory. The BUGHLT indicates that the CST entry for that page contains
a different owner from the source identifier that points to it. This
indicates an inconsistency in the monitor's data.
>)
BUG.(HLT,CST2I3,PAGEM,SOFT,<Page table core pointer and CST2 fail to correspond>,,<
Cause: A routine has been called to remove a page from a process's map. The
map contains a share pointer to a file page. The SPT entry to which the
map points contains a core page number. The BUGHLT indicates that the
CST entry for that core page does not point back to the SPT entry.
This is an inconsistency in the monitor's database.
>)
BUG.(HLT,CTDCHB,CTHSRV,SOFT,<CTERM hibernate routine called>,,<
Cause: The CTERM hibernate routine was called by a misguided DECnet.
It should never be called.
>)
BUG.(INF,CTDEPF,CTHSRV,SOFT,<CTERM host enter passive failed>,,<
Cause: There was a free space allocation failure during an enter passive
for a CTERM host.
Action: Go into SYSDPY's RE display and see which freespace pool is
being used up. If this happens frequently, there may be a
software bug losing the freespace. However, there may be
insufficient freespace in the pool that has run out. You
could try to increase that pool's size in your monitor.
>)
BUG.(CHK,CTDFRK,MEXEC,SOFT,<Cannot create CTERM fork>,,<
Cause: The CTERM system fork could not be created and started at system
startup.
>)
BUG.(CHK,CTDFSA,CTHSRV,SOFT,<Can't get free space for CTERM>,,<
Cause: During system startup CTERM couldn't get enough free space.
Action: Go into SYSDPY's RE display and see which freespace pool is
being used up. If this happens frequently, there may be a
software bug losing the freespace. However, there may be
insufficient freespace in the pool that has run out. You
could try to increase that pool's size in your monitor.
>)
BUG.(CHK,CTDILS,CTHSRV,SOFT,<CTERM link is in an unexpected state>,,<
Cause: A CTERM link is in one of these states: Connect Sent, Connect
Rejected, or some illegal state.
Action: The DOB% facility should produce a dump for this bug. If not,
then you have to change the BUGCHK to a BUGHLT to get a
dump before submitting an SPR.
>)
BUG.(INF,CTDPRR,CTHSRV,SOFT,<CTERM protocol error>,<<T2,COUNT>,<T4,BEGIN>,<CDB,CDB>>,<
Cause: A server has sent TOPS-20 a message which it does not like.
Action: The DOB% facility should have taken a dump of this BUG. If
not and this BUGINF persists, change it to a BUGHLT. Examine the
message in the dump to determine the problem.
Data: COUNT - The current byte count
BEGIN - The pointer to the beginning of the message
CDB - The CDB
>)
BUG.(CHK,CTYSTK,TTYSRV,HARD,<FE reload requested because CTY is stuck>,,<
Cause: A job 0 fork was trying to output to the console, but was unable to.
The job entered the J0TCOT scheduler test to wait for the CTY to clear,
so that output could begin again. However, the CTY has remained hung
for a while and a FE reload has been requested.
Action: Check the CTY to see if it is functioning properly, has not run out of
paper, and has not been left at the RSX20F PARSER prompt. If the
problem persists, contact Field Service to have them check out the CTY
and front end hardware.
>,,<DB%NND>)
BUG.(CHK,DDMINT,MEXEC,SOFT,<Unexpected interrupt in DDMP process>,<<ITFPC,ITFPC>,<LSTERR,LSTERR>>,<
Cause: An unexpected error has occurred in the process which handles
migration of pages to disk. The error handler attempts
to reinitialize the context and resume processing. The
stack may be examined for an indication of where the error
occurred.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
Data: ITFPC - PC when error occurred.
LSTERR - Last error code in fork.
>)
BUG.(HLT,DDMPNR,SCHED,HARD,<DDMP fork not run for too long>,,<
Cause: The monitor creates a fork in job zero that exists for the life of the
system. This fork runs periodically to move pages from the swapping
space to files on disk. This is an essential system function. The
BUGHLT occurs when the scheduler detects that the DDMP fork has not run
in too long a time.
Possible causes for DDMP not running include:
1. A disk failure that prevents DDMP from updating the disk
2. Removal of a disk that is mounted causing DDMP to block
3. An HSC or MSCP server disk is hung causing DDMP to block
4. Logic errors in the monitor.
Action: Check the console output from this system. Try to find out if any disk
problems are blocking DDMP. It is unlikely that this is a software
problem. Examination of the dump will probably show that some fork is
at the top of the go list and is blocked and NOSKED. If this fork is
DDMP then it is a disk problem. If CHKR is at the top of the go list
there is probably a software problem.
>)
BUG.(HLT,DDXFRK,MEXEC,SOFT,<Cannot create CHKR fork>,<<T1,ERRCOD>>,<
Cause: CFORK% failed to create the old "Job 0" fork that runs CHKR or the
fork could not be started in monitor mode with the MSFRK% JSYS.
Data: ERRCOD - Error code returned from JSYS
>)
BUG.(HLT,DDXIN,PAGUTL,HARD,<DDMP - Bad XB>,,<
Cause: DDXBI was called to swap in a forced out index block but the index
block is bad.
Action: Field Service should run SPEAR to examine the SYSERR file for errors on
the boot structure.
>)
BUG.(CHK,DEABAD,DSKALC,SOFT,<DSKDEA - Deassigning bad disk address>,<<T3,STRCOD>,<T2,SECTOR>>,<
Cause: The sector being deassigned was not within the legal
range of sector numbers.
Data: STRCOD - Structure Unique Code
SECTOR - Sector Number on Disk Relative to Start of Structure
>)
BUG.(CHK,DEAUNA,DSKALC,SOFT,<DEDSK - Deassigning unassigned disk address>,<<T1,STRCOD>,<T2,SECTOR>>,<
Cause: The disk address being deassigned was never assigned.
Data: STRCOD - Structure Unique Code
SECTOR - Sector Number on Disk Relative to Start of Structure
>,,<DB%NND>)
BUG.(INF,DELBDD,JSYSF,HARD,<DELDIR - Bad directory deleted>,<<A,STRNAM>>,<
Cause: After a bad directory has been deleted, the attempt to delete and
expunge its contents has failed. The bit table is now incorrect.
Action: Use CHECKD to rebuild the structure's bit table.
Data: STRNAM - sixbit structure name
>,,<DB%NND>)
BUG.(CHK,DEVUCF,DEVICE,SOFT,<DEVAV - Unexpected CHKDES failure>,,<
Cause: While checking to see if a device is available to the job,
an invalid device designator was passed to a subroutine.
Action: If this problem persists, submit a dump provided by the DOB%
facility or create a dump by changing this to a BUGHLT. Look
at the dump and see who the caller is that is passing in bad
information.
>)
BUG.(HLT,DGUTPG,DIAG,HARD,<DIAG - Locked page list page locked at DIAG unlock>,,<
Cause: The subroutine DGUNLK was called to release the interlock for the
DIAG JSYS. In the case that user pages were locked down, the left
half of the location DIAGFK contains the page containing a list of
the locked pages. The routine DGEXFL should have been called
previously to release this page. However, DGUNLK found that the
page was still assigned.
>)
BUG.(HLT,DGZTPA,DIAG,HARD,<DIAG - Locked page list page was zero>,,<
Cause: The routine DGEXFL in the module DIAG was called to unlock any
user pages when terminating use of the DIAG JSYS. A pointer to
a list of these pages should be in the left half of location
DIAGFK. This pointer was zero. This BUGHLT should never occur,
since DGEXFL returns if the pointer is zero.
>)
BUG.(CHK,DIRACT,DIRECT,SOFT,<ACTBAD - Illegal format for directory account block in directory>,<<A,DIRNUM>,<B,STRNAM>,<C,ADDR>>,<
Cause: The file account string block is not correct in the symbol table.
Action: Delete and expunge file, then restore it.
Data: DIRNUM - Directory Number
STRNAM - Sixbit Structure Name
ADDR - Address in directory
>,,<DB%NND>)
BUG.(CHK,DIRB2L,DIRECT,SOFT,<RLDFB2 - Directory free block too large in directory>,<<A,DIRNUM>,<B,STRNAM>,<C,ADDR>>,<
Cause: A bad directory block is being returned.
Action: No immediate action is required. Run CHECKD to reclaim lost pages.
Data: DIRNUM - Directory Number
STRNAM - Sixbit Structure Name
ADDR - Address in directory
>,,<DB%NND>)
BUG.(CHK,DIRB2S,DIRECT,SOFT,<RLDFB1 - Directory free block too small in directory>,<<A,DIRNUM>,<B,STRNAM>,<C,ADDR>>,<
Cause: A bad directory block is being returned. Disk space is lost until
CHECKD is run on the structure.
Action: No immediate action is required. Run CHECKD to reclaim lost space.
Data: DIRNUM - Directory Number
STRNAM - Sixbit Structure Name
ADDR - Address in directory
>,,<DB%NND>)
BUG.(CHK,DIRBAF,DIRECT,SOFT,<RLDFB5 - Block already on directory free list in directory>,<<A,DIRNUM>,<B,STRNAM>,<C,ADDR>>,<
Cause: The directory block being returned is already on the free list.
Data: DIRNUM - Directory Number
STRNAM - Sixbit Structure Name
ADDR - Address in directory
>,,<DB%NND>)
BUG.(CHK,DIRBCB,DIRECT,SOFT,<RLDFB3 - Directory free block crosses page boundary in directory>,<<A,DIRNUM>,<B,STRNAM>,<C,ADDR>>,<
Cause: A bad directory block is being returned.
Action: No immediate action is required. Run CHECKD to reclaim lost pages.
Data: DIRNUM - Directory Number
STRNAM - Sixbit Structure Name
ADDR - Address in directory
>,,<DB%NND>)
BUG.(CHK,DIRBLK,DIRECT,SOFT,<BLKSCN - Illegal block type in directory>,<<A,DIRNUM>,<B,STRNAM>,<C,ADDR>>,<
Cause: There is an unknown code in a directory block.
Action: Use the DELETE command with subcommand DIRECTORY to delete the
directory file, then rebuild the directory.
Data: DIRNUM - Directory Number
STRNAM - Sixbit Structure Name
ADDR - Address in directory
>,,<DB%NND>)
BUG.(CHK,DIRDNL,CFSSRV,SOFT,<CFSSRV - Directory not locked>,,<
Cause: CFSRDR was called to unlock a directory, but the directory is not
locked.
Action: If this problem is reproducible, set this BUGCHK dumpable and submit an
SPR along with the dump and instructions on reproducing the problem.
>)
BUG.(CHK,DIREXT,DIRECT,HARD,<EXTBAD - Illegal format for directory extension block in directory>,<<A,DIRNUM>,<B,STRNAM>,<C,ADDR>>,<
Cause: The file extension block is not correct in the symbol table.
Action: Delete and expunge file, then restore it.
Data: DIRNUM - Directory Number
STRNAM - Sixbit Structure Name
ADDR - Address in directory
>,,<DB%NND>)
BUG.(CHK,DIRFDB,DIRECT,HARD,<Illegal format for FDB in directory>,<<A,DIRNUM>,<B,STRNAM>,<C,ADDR>>,<
Cause: The format for an FDB in a directory is incorrect.
Action: The directory should be rebuilt.
Data: DIRNUM - Directory Number
STRNAM - Sixbit Structure Name
ADDR - The FDB address within the directory
>,,<DB%NND>)
BUG.(CHK,DIRFRE,DIRECT,SOFT,<FREBAD - Illegal format for directory free block in directory>,<<A,DIRNUM>,<B,STRNAM>,<C,ADDR>>,<
Cause: The directory free block is not correct.
Action: Use the DELETE command with subcommand DIRECTORY to delete the
directory file, then rebuild the directory.
Data: DIRNUM - Directory Number
STRNAM - Sixbit Structure Name
ADDR - Address in directory
>,,<DB%NND>)
BUG.(CHK,DIRITD,DIRECT,HARD,<GETIDX - Structure INDEX-TABLE has been damaged>,<<A,STRNAM>>,<
Cause: The non-storage related bits in the INDEX-TABLE are not 0. The
structure's INDEX-TABLE is damaged.
Action: Determine the structure name (it's in SIXBIT in the additional data)
and RECONSTRUCT the INDEX-TABLE of this structure with CHECKD.
Data: STRNAM - SIXBIT structure name
>,,<DB%NND>)
BUG.(CHK,DIRNAM,DIRECT,HARD,<NAMBAD - Illegal format for directory name block in directory>,<<A,DIRNUM>,<B,STRNAM>,<C,ADDR>>,<
Cause: The file name block is not correct in the symbol table.
Action: Delete and expunge file, then restore it.
Data: DIRNUM - Directory Number
STRNAM - Sixbit Structure Name
ADDR - Address in directory
>)
BUG.(CHK,DIRPG0,DIRECT,HARD,<DR0CHK - Illegal format for directory page 0 in directory>,<<A,DIRNUM>,<B,STRNAM>>,<
Cause: The directory header contains incorrect information.
Action: Delete directory and rebuild it.
Data: DIRNUM - Directory Number
STRNAM - Sixbit Structure Name
>,,<DB%NND>)
BUG.(CHK,DIRPG1,DIRECT,HARD,<DRHCHK - Directory header block is bad in directory>,<<A,DIRNUM>,<B,STRNAM>,<C,ADDR>>,<
Cause: The directory header contains incorrect information.
Action: Delete the directory and rebuild it.
Data: DIRNUM - Directory Number
STRNAM - Sixbit Structure Name
ADDR - Address in directory
>,,<DB%NND>)
BUG.(CHK,DIRRHB,DIRECT,SOFT,<RLDFB6 - Attempting to return a header block in directory>,<<A,DIRNUM>,<B,STRNAM>,<C,ADDR>>,<
Cause: The address of a block being returned is illegal.
Action: There is an inconsistency in either the monitor's data structure or on
the file structure. Dismount the structure and run CHECKD on it. If
this does not fix the problem, and this BUGCHK is reproducible on a
healthy file structure, set this bug dumpable and submit an SPR along
with the dump and instructions on reproducing it.
Data: DIRNUM - Directory Number
STRNAM - Sixbit Structure Name
ADDR - Address in directory
>,,<DB%NND>)
BUG.(CHK,DIRRNA,JSYSA,SOFT,<Remote node alias list inconsistency>,<<T1,DIRNUM>>,<
Cause: GTDRN1 was called to allocate space for the user's remote node alias
block but the pointer to the monitor's remote node alias block provided
by the caller does not contain the correct block type.
Action: There could be a problem with the directory named in the BUGCHK or
there could be a software problem. If no hardware problem is
suspected, and this BUGCHK is reproducible, set this bug dumpable and
submit an SPR along with the dump and instructions on
reproducing the problem.
Data: DIRNUM - Directory Number
>)
BUG.(CHK,DIRSY1,DIRECT,HARD,<DELDL8 - Directory symbol table fouled up for directory>,<<A,DIRNUM>,<B,STRNAM>>,<
Cause: A disordered directory symbol table was found while expunging a
directory or rebuilding a symbol table.
Action: Rebuild the symbol table. If that fails, delete directory with DELETE
command using the DIRECTORY subcommand and rebuild the directory.
Data: DIRNUM - Directory Number
STRNAM - Sixbit Structure Name
>,,<DB%NND>)
BUG.(CHK,DIRSY2,DIRECT,SOFT,<MDDNAM - Symbol table fouled up in directory>,<<A,DIRNUM>,<B,STRNAM>>,<
Cause: A bad symbol table format was found when looking up a directory.
Action: Use the EXPUNGE command with subcommand REBUILD to rebuild index table
of the directory listed in the additional data. If this doesn't cure
the problem, delete the directory and rebuild it.
Data: DIRNUM - Directory Number
STRNAM - Sixbit Structure Name
>,,<DB%NND>)
BUG.(CHK,DIRSY3,DIRECT,HARD,<LOOKUP - Symbol search fouled up in directory>,<<C,DIRNUM>,<B,STRNAM>>,<
Cause: A disordered symbol table was found while looking for string in a
directory.
Action: Use the EXPUNGE command with subcommand REBUILD to rebuild index table
of the directory listed in the additional data. If this doesn't cure
the problem, delete the directory and rebuild it.
Data: DIRNUM - Directory Number
STRNAM - Sixbit Structure Name
>,,<DB%NND>)
BUG.(CHK,DIRSY4,DIRECT,SOFT,<NAMCM4 - Directory symbol table fouled up in directory>,<<A,DIRNUM>,<B,STRNAM>>,<
Cause: A disordered symbol table was found while comparing name strings.
Action: Use the EXPUNGE command with subcommand REBUILD to rebuild index table
of the directory listed in the additional data. If this doesn't cure
the problem, delete the directory and rebuild it.
Data: DIRNUM - Directory Number
STRNAM - Sixbit Structure Name
>,,<DB%NND>)
BUG.(CHK,DIRSY5,DIRECT,HARD,<SYMBAD - Illegal format for directory symbol table in directory>,<<A,DIRNUM>,<B,STRNAM>>,<
Cause: A symbol table header contains incorrect information.
Action: Use the EXPUNGE command with subcommand REBUILD to rebuild index table
of the directory listed in the additional data. If this doesn't cure
the problem, delete the directory and rebuild it.
Data: DIRNUM - Directory Number
STRNAM - Sixbit Structure Name
>,,<DB%NND>)
BUG.(CHK,DIRSY6,DIRECT,SOFT,<RBLDST - Prematurely ran out of room in symbol table in directory>,<<A,DIRNUM>,<B,STRNAM>>,<
Cause: Symbol table space was exhausted while rebuilding symbol table on a
DELDF JSYS.
Action: Move some files out of the directory.
Data: DIRNUM - Directory Number
STRNAM - Sixbit Structure Name
>,,<DB%NND>)
BUG.(CHK,DIRULK,DIRECT,HARD,<ULKMD2 - Attempt to unlock illegally formatted directory>,<<T1,DIRNUM>,<T2,STRNAM>>,<
Cause: Either there was an attempt to unlock a directory that is disordered,
or a bad argument was given to a subroutine to unlock directory.
Action: Use the DOB% facility to take a dump of this BUGCHK. If you have a
reliable case for reproducing this problem, please include this
procedure when you submit the dump as an SPR.
Data: DIRNUM - Directory Number
STRNAM - Sixbit Structure Name
>,,<DB%NND>)
BUG.(CHK,DIRUNS,DIRECT,HARD,<UNSBAD - Illegal format for directory user name block in directory>,<<A,DIRNUM>,<B,STRNAM>>,<
Cause: The user name string block is incorrect in the symbol table.
Action: Use the DELETE command with subcommand DIRECTORY to delete the
directory file, then rebuild the directory.
Data: DIRNUM - Directory Number
STRNAM - Sixbit Structure Name
>,,<DB%NND>)
BUG.(INF,DLDEF,RSXSRV,HARD,<Logical name define failed for FE CTY>,,<
Cause: A CRLNM was performed to define the logical names for the FE but it
failed.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>)
BUG.(HLT,DLLBPA,NISRV,SOFT,<Illegal Portal supplied by PHYKNI>,<<UN,UN>,<PR,PR>>,<
Cause: PHYKNI returned an illegal portal block address when it called back
NISRV.
Data: UN - UN block
PR - Bad portal block address
>)
BUG.(CHK,DMPIOM,DISC,SOFT,<DSKDM - I/O disk dump mode I/O called from monitor>,,<
Cause: DSKDMI or DSKDMO was called and the previous context indicates
an exec mode DUMPI% or DUMPO% JSYS, where there aren't any.
Action: If this becomes persistent, change this to a BUGHLT and submit
an SPR. Look at the code because something is broken.
It is also possible that some code has been changed to
do dump mode I/O.
>)
BUG.(INF,DN20ST,DTESRV,HARD,<DTESRV - DN20 stopped>,<<B,DTENO>>,<
Cause: A DN20 has crashed.
Action: Reload the DN20.
Data: DTENO - DTE number
>,,<DB%NND>)
BUG.(CHK,DNDCGE,DNADLL,SOFT,<Couldn't get emergency buffer for DLL>,,<
Cause: DNADLL requires that the memory manager save at least 2 buffers
per link for DNADLL; one for the routing messages ROUTER keeps for
each circuit and one to guarantee some level of route-through ability.
DNADLL was asked to open a data link, but the memory manager could
not guarantee the buffers.
Action: Allocate more memory or settle for fewer circuits.
>)
BUG.(INF,DNDCGV,DNADLL,SOFT,<Couldn't get memory for event arg block>,,<
Cause: DECnet has used all of its available memory and could not give us any.
Action: Try to determine who is using all the memory and why. Setting FTDEBUG
to non-zero gives more information about who is using each block
of memory. You may also consider building a monitor with more DECnet
freespace incorporated.
>,RTN)
BUG.(CHK,DNDCIZ,DNADLL,SOFT,<Callback ID is zero>,,<
Cause: DTESRV has lost the callback ID for this line or never had one.
Action: See if protocol was started when Router thought the circuit state was
off. Or, check DCNCID in DTESRV to see what it has for a callback ID.
>,RTN)
BUG.(CHK,DNDEMF,DNADLL,SOFT,<Enable Ethernet multicast address failed>,<<T1,ERRCOD>>,<
Cause: NISRV returned an error when trying to enable a multicast address.
Action: Check the error code returned in T1 and investigate the problem in
NISRV.
Data: ERRCOD - Error code returned by NISRV
>)
BUG.(CHK,DNDNCE,DNADLL,SOFT,<Error from NISRV when closing portal>,<<T1,ERRCOD>>,<
Cause: NISRV returned an error when asked to close our portal.
Action: Check error code returned in T1. It may be caused by DNADLL trying
to close a portal that wasn't open or a problem in NISRV.
Data: ERRCOD - Error code returned by NISRV
>)
BUG.(CHK,DNDNNF,DNADLL,SOFT,<Network management failed>,<<T1,ERRCOD>>,<
Cause: NISRV returned an error when asked to read network management
parameters or counters.
Action: Check error code returned in T1 to see what NISRV thinks is wrong.
Data: ERRCOD - Error code returned by NISRV
>)
BUG.(CHK,DNDNOF,DNADLL,SOFT,<Attempt to open an ethernet portal failed>,<<T1,ERRCOD>>,<
Cause: NISRV returned an error when trying to open a portal.
Action: Check error code returned in T1 to see what NISRV thinks is wrong.
Data: ERRCOD - Error code returned by NISRV
>,NIFOF)
BUG.(CHK,DNDRLF,DNADLL,SOFT,<Read channel list failed>,<<T1,ERRCOD>>,<
Cause: NISRV returned an error when asked to return the channel list.
Action: Check error code returned in T1 to see what NISRV thinks is wrong.
Data: ERRCOD - Error code returned by NISRV
>,NIDIN1)
BUG.(CHK,DNDXMF,DNADLL,SOFT,<Transmit message to Ethernet failed>,<<T1,ERRCOD>>,<
Cause: NISRV returned an error when trying to queue a message for transmit.
Action: Check error code returned in T1 and investigate problem in NISRV.
Data: ERRCOD - Error code returned by NISRV
>,FREUNB)
BUG.(HLT,DNSBPB,D36COM,SOFT,<DNSBP called with OWGBP>,,<
Cause: DNSBP was called with a one-word global byte pointer. DNSBP is
only set up to handle local one-word and two-word byte pointers,
without indexing or indirection.
Action: Either change the caller to pass a two-word byte pointer or upgrade
DNSBP to handle OWGBPs.
>)
BUG.(CHK,DNSLJ,CIDLL,SOFT,<MOVSLJ failed>,,<
Cause: A MOVSLJ instruction did not skip.
Action: If this problem persists and the DOB% facility does not produce
a dump, then change this BUGCHK to a BUGHLT and submit an SPR.
It is possible that there could be a KL microcode bug here so
be sure to include the version you are running in the SPR.
>)
BUG.(CHK,DNSURE,MNETDV,SOFT,<DNS UDP receive error>,<<T1,ERROR>>,<
Cause: When reading a UDP DNS reply from a host, an error was returned from
the RCVIN% JSYS. If the error code is SNDIX1 (600732), this indicates
that the message was too big to fit in the DNS reply buffer of 1280
eight bit bytes.
Action: If the error is SNDIX1, find out why the message is too large and
have the DNS name server host return less data. DNS messages can get
very large when there are a large number of "additional" and
"authority" records. If the error is not SNDIX1, and other TCP/IP
software is functioning normally, and the problem can be reproduced,
set this BUG dumpable and submit an SPR along with the dump,
MONITR.EXE, and the system's Internet configuration files.
Data: ERROR - RCVIN% JSYS error code
>,,<DB%NND>)
BUG.(CHK,DRMFUL,PAGEM,SOFT,<Drum completely full>,,<
Cause: The monitor is attempting to swap a core page to the drum. There is no
space available. The general handling of drum assignments should
ensure that there are always a few pages available for "critical"
assignments such as this case.
Action: When this bug is seen when reloading the system it can safely be
ignored. It is possible that some user program could overtax the
normal reserves and cause this failure.
If no user program can be found to blame for running out of swapping
space, or this bug is seen whenever a particular system is being
reloaded, set this bug dumpable, get a dump, and send in an SPR describing
how to reproduce the problem.
>,,<DB%NND>)
BUG.(HLT,DRMIBT,SWPALC,HARD,<DRMASN - Bit table inconsistent>,,<
Cause: During the assignment of a drum page, DRMCNT for a track showed there
was space on that track. However, there was no free space available
according to the bit table for the track.
>)
BUG.(HLT,DRMNFR,SWPALC,HARD,<DRMAM - Cannot find page when DRMFRE non-0>,,<
Cause: During assignment of multiple contiguous drum addresses, DRMFRE said
there was space on the drum. However, none of the DRMCNT's for each
track showed any free space.
>)
BUG.(CHK,DRXRNA,DIRECT,SOFT,<DIRRNA - Illegal formatted remote alias block in directory>,<<A,DIRNUM>,<B,STRNAM>,<C,ADDR>>,<
Cause: An illegally formatted remote alias block was found in the directory.
Action: Use the DELETE command with subcommand DIRECTORY to delete the
directory file, then rebuild the directory.
Data: DIRNUM - Directory Number
STRNAM - Sixbit Structure Name
ADDR - Address in directory
>)
BUG.(HLT,DSKBRP,DSKALC,SOFT,<DSKDEA - Pages on multiple cylinders>,<>,<
Cause: DSKDEA has been called to delete a number of pages, and the pages
are on multiple cylinders. This is not allowed.
>)
BUG.(CHK,DSKBT1,DSKALC,HARD,<DSK bit table fouled, cannot find free page on track with non-0 count>,<<T2,STRCOD>,<T3,CYLNDR>>,<
Cause: The bit table for this disk cylinder indicated there
were free pages for assignment. However, none could be
found.
Action: Find out which structure this is for and run CHECKD on the disk.
Then check the consistency of bits of the disk.
Data: STRCOD - Structure Unique Code
CYLNDR - Cylinder Number
>,,<DB%NND>)
BUG.(CHK,DSKBT3,DSKALC,SOFT,<DISK Bit table already locked at LCKBTB>,<<T1,SPTIDX>>,<
Cause: A structure bit table being locked is already locked.
Data: SPTIDX - Offset in SPT for entry to copy into BTBBAS SPT and SPTH
slots.
>)
BUG.(HLT,DST2SM,SWPALC,SOFT,<SWPINI - DST too small>,,<
Cause: There are more pages for swapping than there are entries in the DST.
Action: Either rebuild the monitor after changing STG.MAC to have a larger
value for NDST or use a boot structure with fewer swapping pages.
>)
BUG.(INF,DTEBWS,DTESRV,SOFT,<DTE MCB handshake incorrect>,<<A,DTE>,<B,PC>>,<
Cause: The KL detected that the MCB's init bit was not correct during a QP2
protocol initialization handshake.
Action: Try again. If it still doesn't work, check the MCB software. Failing
that, have field service check out the DTE.
Data: DTE - DTE number
PC - PC of caller
>)
BUG.(CHK,DTECAR,DTESRV,HARD,<Carrier FNC with no line number>,<<A,DTENO>>,<
Cause: A TO-10 transfer completion interrupt from the DTE under RSX20F
protocol that indicates a line has hung up or dialed up was
received. The packet contains no line number due to a format
error.
Action: If problem persists, contact Field Service.
Data: DTENO - DTE number
>)
BUG.(INF,DTECDM,DTESRV,HARD,<DTESRV - TO-10 counts do not match>,<<A,DTENO>>,<
Cause: TOPS-20 received a doorbell interrupt from the DTE but the TO-10
counter in the front end does not match the counter kept by TOPS-20.
Action: TOPS-20 is reloading the FE.
Data: DTENO - DTE number
>)
BUG.(CHK,DTECGB,DTESRV,SOFT,<DTE MCB initialization timed out>,<<A,DTE>,<B,PC>>,<
Cause: Couldn't allocate memory for section zero input or output buffers.
Action: Try again later.
Data: DTE - DTE number
PC - PC of caller
>,,<DB%NND>)
BUG.(CHK,DTEDAT,DTESRV,HARD,<TAKTOD - Illegal format for time/date>,,<
Cause: A TO-10 transfer completion interrupt from the DTE
under RSX20F protocol that indicates the -11 is providing the time
of day was received. The packet format is incorrect.
>)
BUG.(CHK,DTEDEV,DTESRV,HARD,<Illegal device>,<<A,DTENO>>,<
Cause: A TO-10 transfer completion interrupt from the DTE under RSX20F
protocol that indicates the -11 is sending line allocation
information was received. The device code provided is out of
range.
Action: If the problem persists, contact Field Service.
Data: DTENO - DTE number
>)
BUG.(INF,DTEDIN,DTESRV,HARD,<DTESRV - TO-10 in progress on doorbell>,<<A,DTENO>>,<
Cause: The 10 received a doorbell from the 11, but a TO-10 transfer was
already in progress.
Data: DTENO - DTE number
>)
BUG.(INF,DTEDME,DTESRV,HARD,<DTESRV - Zero queue count>,<<A,DTENO>>,<
Cause: A transfer from the 11 to the 10 is about to be started, but the
count of free bytes in the transfer queue is zero.
Action: Examine the DTE to see if it has hardware problems.
Data: DTENO - DTE number
>)
BUG.(CHK,DTEERR,DTESRV,HARD,<DTESRV - DTE device error>,<<A,DTENO>,<F,STATUS>>,<
Cause: A packet has been received from a DTE that has a flag lit
indicating a TO-10 or TO-11 error.
Action: Contact Field Service.
Data: DTENO - DTE number
STATUS - Result of CONI DTEN,
>,,<DB%NND>)
BUG.(CHK,DTEIDP,DTESRV,HARD,<Bad indirect packet>,<<A,DTENO>>,<
Cause: An indirect packet from a DTE was received but the DTE status word
indicated that there is no transfer active.
Data: DTENO - DTE number
>,,<DB%NND>)
BUG.(CHK,DTEIFR,DTESRV,HARD,<DTESRV - Illegal FNC request>,<<A,DTENO>>,<
Cause: A packet from a DTE was received that contained an invalid function
code.
Action: If the problem persists, call Field Service.
Data: DTENO - DTE number
>)
BUG.(INF,DTEKPA,DTESRV,SOFT,<DTE keep alive fail>,<<B,DTENO>>,<
Cause: The DTE keep alive counter is not being updated by the 11.
Action: Check the 11 to see if it is running properly.
Data: DTENO - DTE number
>,,<DB%NND>)
BUG.(INF,DTELPI,DTESRV,HARD,<DTECHK - DTE lost PI assignment>,<<B,DTENO>>,<
Cause: A CONI of the DTE has indicated it has lost its channel assignment.
Action: Have Field Service check the DTE.
Data: DTENO - DTE number
>,,<DB%NND>)
BUG.(CHK,DTEMCC,DTESRV,SOFT,<DOFRGM - DN20 disagrees with count>,<<A,D>,<P5,REST>,<C,RGN10>,<P2,RGN11>>,<
Cause: Either the monitor has calculated a different value than it was
given by the front end, or an invalid count (.LE 0) has been found.
Action: Check for DTE hardware problems; if there is none, the code in the
DN20 should be investigated.
Data: D - Comm region address
REST - residual count of transfer
RGN10 - Flags from -10's region
RGN11 - Flags from -11's region
>)
BUG.(CHK,DTEODD,DTESRV,HARD,<TAKLC - Odd byte count for line characters>,,<
Cause: A TO-10 transfer completion interrupt was received from the DTE
under RSX20F protocol that indicates the -11 has sent line
characters and the byte count is odd.
>)
BUG.(CHK,DTEP2S,DTESRV,HARD,<TO10DN - Packet too small>,,<
Cause: The packet size field in a TO10 packet from a DTE contains an
invalid length.
Action: If the problem persists, call Field Service.
>,,<DB%NND>)
BUG.(CHK,DTEPGF,DTESRV,HARD,<DTE transfer page fail>,<<A,DTENO>>,<
Cause: A transfer from a DTE has generated a page fail. The FE is now
marked as dead.
Action: Check the DTE for hardware problems.
Data: DTENO - DTE number
>)
BUG.(INF,DTEPNR,DTESRV,HARD,<DTESRV - Incorrect indirect setup>,<<A,DTENO>>,<
Cause: While trying to complete an indirect from an RSX20F FE the
TO-10 count provided by RSX did not match the count kept by
TOPS-20.
Data: DTENO - DTE number
>)
BUG.(INF,DTESUI,DTESRV,SOFT,<Front end requested reload or init>,<<A,DTE>,<B,STATUS>>,<
Cause: An -11 has requested a reload or init but the enabled
protocol for this DTE is not DECnet.
Data: DTE - DTE number.
STATUS - Status word from -11's comm region
>,,<DB%NND>)
BUG.(CHK,DTETIP,DTESRV,HARD,<DTETDN - TO-10 done received with no transfer in progress>,<<A,DTE>>,<
Cause: The KL received indication from the DTE that a TO-10 transfer has
completed but the DTE status did not indicate that a transfer was
in progress.
Action: If the problem persists, call Field Service.
Data: DTE - DTE number
>,,<DB%NND>)
BUG.(INF,DTETPR,DTESRV,SOFT,<DTE protocol terminated>,<<Q2,DTENO>>,<
Cause: The protocol on the DTE has been terminated due to a BOOT% request.
Data: DTENO - DTE number
>,,<DB%NND>)
BUG.(CHK,DTETTY,DTESRV,HARD,<Non-TTY device>,<<A,DTENO>>,<
Cause: A TO-10 transfer completion interrupt from the DTE was received
under RSX20F protocol that indicates the -11 has sent line
characters, but the device type provided in the packet is not a
TTY.
Data: DTENO - DTE number
>)
BUG.(CHK,DTEUIF,DTESRV,HARD,<DTESRV - Unimplemented function from 11>,<<A,DTENO>>,<
Cause: A packet from a DTE was received that contained a function code
that is not supported by TOPS-20.
Data: DTENO - DTE number
>)
BUG.(CHK,DVCHRX,JSYSF,HARD,<DVCHR1 - Unexpected CHKDES failure within .DVCHR>,,<
Cause: CHKDES failed to get the device code for a TTY or PTY after using
either TTYPTY to convert a TTY number to a PTY number or PTYTTY to
convert a PTY number to a TTY.
Action: If this persists, use the DOB% facility to obtain a dump of this
BUGCHK and submit an SPR.
>)
BUG.(CHK,DX2DIE,PHYX2,HARD,<PHYX2 - DX20 halted>,<<T2,CHAN>,<T3,DX20>,<T1,REG26>>,<
Cause: During a check for DX20 errors, HARCHK discovered that the DX20 was not
running. This could be due to one or more of the following:
o The DX20 has been powered down. Reload the microcode with DX20LD.
o The DX20 microcode has detected a fatal error.
o The microcode could have been halted by a program such as DX20LD.
o The DX20 is seeing microbus parity errors while fetching an
instruction from its memory.
Action: The DX20 may be having problems, look for other DX2xxx BUGCHKs. If the
DX20 won't run or starts halting frequently, call Field Service.
Data: CHAN - Channel number
DX20 - DX20 number
REG26 - Extended status register (register 26) contents
>,,<DB%NND>)
BUG.(CHK,DX2DNF,PHYX2,HARD,<PHYX2 - Drive number not found in UDBs>,<<T1,CHAN>,<Q2,DX20>,<T4,UNIT>>,<
Cause: A DX20 returned an 8-bit drive number, and routine DRVSRC was called to
determine which UDB was associated with that drive number. None of the
currently existing UDBs had that number.
Action: Field Service should check the hardware. There probably is a hardware
problem with the DX20 or TX02. Look for other DX2xxx BUGCHKs.
Data: CHAN - Channel number
DX20 - DX20 number
UNIT - The unit number that was not found
>,,<DB%NND>)
BUG.(CHK,DX2FGS,PHYX2,HARD,<PHYX2 - Fail to get sense bytes>,<<T1,CHAN>,<T2,DRIVE>,<T3,REG21>,<T4,REG31>>,<
Cause: TOPS-20 could not read the sense bytes for a magtape drive.
Action: Field Service should check the hardware. There could be a DX20, TX02,
or drive problem. Look out for other DX2xxx BUGCHKs.
Data: CHAN - Channel number
DRIVE - DX20,,slave number that the monitor is working with
REG21 - Reg 21 of DX20 (unit and slave that DX20 is working with)
REG31 - Reg 31 of DX20 (PC of the DX20)
>,,<DB%NND>)
BUG.(CHK,DX2FUS,PHYX2,HARD,<PHYX2 - Fail to update sense bytes>,<<T1,CHAN>,<Q2,DX20>>,<
Cause: GETEXS could not update the DX20 sense bytes because the DX20 would
not respond to the request.
Action: Field Service should check the DX20 and TX02. The DX20 may be broken.
Look out for other DX2xxx BUGs.
Data: CHAN - Channel number
DX20 - DX20 number
>,,<DB%NND>)
BUG.(CHK,DX2HLT,PHYX2,HARD,<PHYX2 - DX20 halted>,<<T1,CHAN>,<T2,DX20>,<T3,REG1>,<T4,2AND26>>,<
Cause: The DX20 controller's microcode is no longer running. This could be
due to one or more of the following:
o The DX20 has been powered down. Reload the microcode with DX20LD.
o The DX20 microcode has detected a fatal error.
o The microcode could have been halted by a program such as DX20LD.
o The DX20 is seeing microbus parity errors while fetching an
instruction from its memory.
Action: The DX20 may be having problems; look for other DX2xxx BUGCHKs. If the
DX20 won't run or starts halting frequently, call Field Service.
Data: CHAN - Channel number
DX20 - DX20 number
REG1 - Contents of device register 1 (status register).
2AND26 - Device register 2 in left half (error register), and
device register 26 in right half (possible error code).
>,,<DB%NND>)
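; The 2AND26 item above packs two 18-bit DX20 registers into one 36-bit word.
; A minimal sketch of splitting it, not taken from PHYX2 and assuming the word
; has been copied into T4 (T1 and T2 are free scratch ACs in this sketch):
;	HLRZ T1,T4		;T1/ left half: device register 2 (error register)
;	HRRZ T2,T4		;T2/ right half: device register 26 (possible error code)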
BUG.(CHK,DX2IDM,PHYX2,HARD,<PHYX2 - Illegal data mode at done interrupt>,<<T1,CHAN>,<T3,DX20>,<T2,DATMOD>>,<
Cause: At a done interrupt, the data mode in the IORB was found to be illegal.
Action: Field Service should check the DX20; it could be broken or flakey.
Look for other DX2xxx BUGCHKs.
Data: CHAN - Channel number
DX20 - DX20 number
DATMOD - Data mode in IORB
>)
BUG.(CHK,DX2IDX,PHYX2,HARD,<PHYX2 - Illegal retry byte pointer index>,<<T1,CHAN>,<T2,DX20>,<T3,UNIT>,<T4,FUNC>>,<
Cause: While trying to do error recovery for a non-fatal DX20 error, it was
found that the byte pointer to the retry type was zero. This indicates
that a retry is not possible for this error type. The index is
retrieved from IRBSTS of the IORB.
Action: Field Service should check the DX20 and TX02, as they may be broken or
flakey. Look for other DX2xxx BUGCHKs.
Data: CHAN - Channel number
DX20 - DX20 number
UNIT - Tape drive unit
FUNC - Bad function code from the IORB
>)
BUG.(CHK,DX2IEC,PHYX2,HARD,<PHYX2 - Illegal error class code>,<<T2,CHAN>,<T2,DX20>,<T1,DXERR>>,<
Cause: The error class code returned by the DX20 is illegal.
Action: Field Service should check the DX20. Look for other DX2xxx BUGCHKs.
Data: CHAN - Channel number
DX20 - DX20 number
DXERR - DX20 error register
>)
BUG.(CHK,DX2IFS,PHYX2,SOFT,<PHYX2 - Illegal function at start IO>,<<T1,CHAN>,<T2,DX20>,<T3,UNIT>,<Q1,FNCCOD>>,<
Cause: DX2SIO was called to start IO for a DX20. This bug indicates that
either the function code in the IORB was invalid or the short form
(PAGEM) bit was set in the IORB status word.
Action: If this BUGCHK persists, change it to a BUGHLT, and submit an SPR along
with a dump. Examination of the dump tells who is setting up the
IORB incorrectly.
Data: CHAN - Channel number
DX20 - DX20 number
UNIT - Tape drive unit
FNCCOD - Function code
>)
BUG.(CHK,DX2IRF,PHYX2,HARD,<PHYX2 - Illegal function during retry>,<<T1,CHAN>,<T2,DX20>,<T3,UNIT>>,<
Cause: During error recovery, an illegal retry function was discovered.
Action: Field Service should check the DX20 and TX02, as they may be broken or
flakey. Look for other DX2xxx BUGCHKs.
Data: CHAN - Channel number
DX20 - DX20 number
UNIT - Tape drive unit
>)
BUG.(CHK,DX2MCF,PHYX2,HARD,<PHYX2 - DX20 microcode check failure>,<<T1,CHAN>,<Q2,DX20>>,<
Cause: A check of CRAM locations 7 through 11 in the DX20 indicates that
the microcode running in the DX20 is bad.
Action: Unless the DX20 has just been powered up, the DX20 could be broken.
Try reloading the microcode and watch for other DX2xxx BUGCHKs. If
this error persists, the DX20 is broken; call Field Service.
Data: CHAN - Channel number
DX20 - DX20 unit number
>,,<DB%NND>)
BUG.(INF,DX2N2S,PHYX2,HARD,<PHYX2 - More TU70s than table space, excess ignored>,,<
Cause: DX2INI was called to build a KDB and a UDB for a tape drive. This BUG
indicates that the MTCUTB table of CDBs and UDBs, which is of length
MTAN, is already full.
Action: Change MTAN in STG and rebuild the monitor to accommodate more tape
drives.
>,,<DB%NND>)
BUG.(CHK,DX2NRT,PHYX2,HARD,<DX2ERR - IS.NRT set on successful retry>,<<T1,CHAN>,<T2,DX20>,<T3,UNIT>>,<
Cause: A retry from a DX20 error was performed successfully, but the IORB
indicates that this should have been a hard error.
Action: Field Service should check the DX20 and TX02, as they may be broken or
flakey. Look for other DX2xxx BUGCHKs.
Data: CHAN - Channel number
DX20 - DX20 number
UNIT - Tape drive unit
>)
BUG.(CHK,DX2NUD,PHYX2,HARD,<PHYX2 - Channel done interrupt but no unit active>,<<T1,CHAN>,<T2,DX20>>,<
Cause: DX2INT was called to process a channel done interrupt for a DX20, but
there was no known active UDB on the controller.
Action: Field Service should check the DX20; it is probably broken. Look for
other DX2xxx BUGCHKs.
Data: CHAN - Channel number
DX20 - DX20 number
>,,<DB%NND>)
BUG.(CHK,DX2NUE,PHYX2,HARD,<PHYX2 - DX20 detected hardware problem>,<<T2,CHAN>,<Q2,DX20>,<T4,STSREG>,<T1,ERRREG>>,<
Cause: A channel done interrupt for a DX20 was being processed, but there was
no known active UDB on the controller and the composite error bit in
the DX20 status register was set. The DX20 runs internal diagnostics,
and sets this bit when it discovers an error. If the status
register contains 100 (octal) and the error register contains 6000
(octal) then the DX20 is probably seeing internal microbus parity
errors. Otherwise, the error register contains the error code.
Action: Look at the error register. Field Service should check the DX20, as
the DX20 is probably broken. Look for other DX2xxx BUGCHKs.
Data: CHAN - Channel number
DX20 - DX20 number
STSREG - Status register
ERRREG - Error register
>,,<DB%NND>)
BUG.(CHK,DX2RFU,PHYX2,HARD,<PHYX2 - Error recovery confused>,<<T1,CHAN>,<T2,DX20>,<T3,UNIT>>,<
Cause: The error recovery procedure specified in the UDB is incorrect.
Action: Field Service should check the DX20 and TX02, as they may be broken or
flakey. Look for other DX2xxx BUGCHKs.
Data: CHAN - Channel number
DX20 - DX20 number
UNIT - Tape drive unit
>)
BUG.(CHK,DX2UNA,PHYX2,HARD,<PHYX2 - Attention interrupt and UDB not active>,<<T1,CHAN>,<T2,DX20>,<T3,UNIT>>,<
Cause: At a done interrupt, the UDB marked as active in the KDB was not itself
marked as active.
Action: Field Service should check the DX20 and TX02, as they may be broken or
flakey. Look for other DX2xxx BUGCHKs.
Data: CHAN - Channel number
DX20 - DX20 number
UNIT - Tape drive unit
>,,<DB%NND>)
BUG.(CHK,DX2UPE,PHYX2,HARD,<PHYX2 - Fail to update sense bytes during initialization>,<<T1,CHAN>,<Q2,DX20>>,<
Cause: GETEXS failed to read in the extended status bytes for a DX20 during
initialization.
Action: Field Service should check the DX20 and TX02. Look for other DX2xxx
BUGCHKs.
Data: CHAN - Channel number
DX20 - DX20 number
>,,<DB%NND>)
BUG.(CHK,DXBASD,PHYP2,HARD,<PHYP2 - Asynchronous status from non-positioning drive>,<<T1,CHAN>,<T2,CTRL>,<T3,UNIT>>,<
Cause: An asynchronous interrupt occurred from a DX20 drive that was not in
the process of doing a seek operation.
Action: There probably is a hardware problem with the RH20, DX20B, or RP20
hardware and Field Service should be notified.
Data: CHAN - Channel number
CTRL - Controller number
UNIT - Unit number
>)
BUG.(CHK,DXBDIE,PHYP2,HARD,<PHYP2 - DX20B microcode halted>,<<T1,CHAN>,<Q2,DX20>>,<
Cause: The microcode in a DX20B has halted. This could be due to one or more
of the following:
o The DX20 has been powered down. Reload the microcode with DX20LD.
o The DX20 microcode has detected a fatal error.
o The microcode could have been halted by a program such as DX20LD.
o The DX20 is seeing microbus parity errors while fetching an
instruction from its memory.
Action: The DX20 may be having problems; look for other DXBxxx BUGCHKs. If the
DX20 won't run or starts halting frequently, call Field Service.
Data: CHAN - Channel number
DX20 - DX20 number
>,,<DB%NND>)
BUG.(CHK,DXBDMI,PHYP2,SOFT,<PHYP2 - DX20B microcode is invalid>,<<T1,CHAN>,<Q2,DX20>>,<
Cause: TOPS-20 could not verify the microcode in a DX20B (RP20 controller).
Action: Reload the microcode in the DX20B. If the problem still occurs, Field
Service should check out the DX20 and the RH20 it connects to.
Data: CHAN - Channel number
DX20 - DX20 number
>,,<DB%NND>)
BUG.(CHK,DXBEUI,PHYP2,HARD,<PHYP2 - Error trying to initialize a unit>,<<T1,CHAN>,<Q2,DX20>,<T2,UNIT>>,<
Cause: TOPS-20 detected a drive error while trying to initialize an RP20
disk.
Action: Field Service should check out the drive specified in the additional
data, as it appears to have a hardware problem.
Data: CHAN - Channel number
DX20 - DX20 number
UNIT - Drive number
>,,<DB%NND>)
BUG.(CHK,DXBEWC,PHYP2,HARD,<PHYP2 - Error present when connecting to a unit>,<<T1,CHAN>,<T2,CTRL>,<T3,UNIT>,<T4,20AND2>>,<
Cause: One or more flags is turned on in a DX20 drive error register. TOPS-20
is ignoring the errors and proceeding.
Action: There is probably a hardware problem with the RP20 disk named in the
additional data, and it should be checked out by Field Service. The
fourth additional data item contains error information.
Data: CHAN - Channel number
CTRL - Controller number
UNIT - Unit number
20AND2 - RH of DX20 ending status,,RH of DX20 error register
>,,<DB%NND>)
BUG.(HLT,DXBFEX,PHYP2,SOFT,<PHYP2 - Illegal function starting IO>,,<
Cause: The routine DX2SIO in PHYP2 was called to start a transfer operation
for an IORB but the function code from the IORB was illegal.
>)
BUG.(CHK,DXBFGS,PHYP2,HARD,<PHYP2 - Failed to get sense bytes>,,<
Cause: A timeout occurred while waiting for an attention interrupt from a
DX20B after requesting the sense bytes.
Action: Field Service should be called to check out the DX20 and RP20 hardware.
>,,<DB%NND>)
BUG.(CHK,DXBFUS,PHYP2,HARD,<PHYP2 - Failed to update sense bytes>,,<
Cause: A timeout occurred while waiting for a DX20 to update the sense bytes
provided to it by TOPS-20.
Action: Field Service should be called to check out the DX20 and RP20 hardware.
>,,<DB%NND>)
BUG.(INF,DXBHLT,PHYP2,HARD,<PHYP2 - DX20B controller halted>,<<T1,CHAN>,<T2,DX20>,<T3,REG1>,<T4,2AND26>>,<
Cause: The DX20B controller's microcode is no longer running. This could be
due to one or more of the following:
o The DX20B has been powered down. Reload the microcode with DX20LD.
o The DX20B microcode has detected a fatal error.
o The microcode could have been halted by a program such as DX20LD.
o The DX20B is seeing microbus parity errors while fetching an
instruction from its memory.
Action: The DX20B may be having problems; look for other DXBxxx BUGCHKs. If
the DX20B won't run or starts halting frequently, Field Service should
check out the DX20B.
Data: CHAN - Channel number
DX20 - DX20 number
REG1 - DX20B status register
2AND26 - Right half of error register,,RH of error reason register
>,,<DB%NND>)
BUG.(CHK,DXBIEC,PHYP2,SOFT,<PHYP2 - Unknown error code from DX20>,<<T2,CHAN>,<Q2,DX20>,<T1,STATUS>>,<
Cause: A transfer operation on an RP20 drive failed. This indicates
a drive or controller error but the error code provided by the DX20 is
not valid.
Action: Field Service should check out the DX20 and RP20 hardware.
Data: CHAN - Channel number
DX20 - DX20 number
STATUS - DX20 error register
>,,<DB%NND>)
BUG.(HLT,DXBIF2,PHYP2,SOFT,<PHYP2 - Illegal function stacking IO>,,<
Cause: The routine DX2STK in PHYP2 was called to stack a second transfer
command but the function in the IORB was illegal.
>)
BUG.(HLT,DXBILF,PHYP2,SOFT,<PHYP2 - Illegal function at Done interrupt>,,<
Cause: The routine DX2INT in PHYP2 was called to handle a done interrupt for a
drive. The IORB which finished I/O contained a function code which was
illegal.
>)
BUG.(HLT,DXBLTF,PHYP2,SOFT,<PHYP2 - Latency optimization failure>,,<
Cause: The routine DX2LAT in PHYP2 was called by PHYSIO to find the best IORB
for a unit. However, after scanning all IORBs in the transfer wait
queue for the unit, no IORB was found that could be returned.
>)
BUG.(HLT,DXBMSR,PHYP2,SOFT,<PHYP2 - Multiple sectors indicated in ECC recovery>,,<
Cause: The routine ECCERR in PHYP2 was called to recover from an ECC error on
a unit. After correcting the error, the routine ECCUCL in PHYSIO was
called to update the CCW list. That routine skipped, indicating that
more sectors must be read to complete the transfer. However, the RP20
is formatted in pages and no transfer is ever longer than a page, so
the skip return should never occur.
>)
BUG.(CHK,DXBNUD,PHYP2,HARD,<PHYP2 - No unit active for done interrupt>,<<T1,CHAN>,<Q2,DX20>>,<
Cause: A done interrupt occurred on a DX20 but there is no active UDB for this
controller.
Action: There probably is a hardware problem with the RH20, DX20B, or RP20
hardware and Field Service should be notified. If an OVRDTA BUGCHK
preceded this BUGCHK with the same channel and controller number, then
the drive number called out in the OVRDTA is the one to suspect first.
Data: CHAN - Channel number
DX20 - DX20 number
>,,<DB%NND>)
BUG.(HLT,DXBTNF,PHYP2,SOFT,<PHYP2 - Unit type not found in table>,,<
Cause: The routine DX2INI was called to initialize a UDB for an RP20 disk. It
converted the hardware drive type into the internal drive type and then
looked in the physical parameter table (DSKUTP) for that type so that
the disk parameters could be obtained. The drive type could not be
found. There is a hardware problem or an illegal drive type connected
to the RP20 controller.
Action: Field Service should check out the RH20, DX20B, and RP20 hardware.
>)
BUG.(CHK,DXBTTS,PHYP2,HARD,<PHYP2 - Tables too small for this many drives>,<<T3,NUMDRV>>,<
Cause: The number of RP20 drives on a DX20 controller exceeds the number
supported by TOPS-20. Since NUMDRV is 16 decimal, this is not expected
to happen.
Action: Rebuild the monitor with a larger NUMDRV in PHYP2 or reduce the number
of RP20 drives accessible through this controller.
Data: NUMDRV - Number of drives allowed per controller
>,,<DB%NND>)
BUG.(CHK,DXBUA1,PHYP2,HARD,<PHYP2 - Done interrupt and unit was not active>,<<T1,CHAN>,<T2,CTRL>,<T3,UNIT>>,<
Cause: A done interrupt occurred on a DX20 but the UDB for the drive marked in
the KDB as active is not itself marked as active.
Action: There probably is a hardware problem with the RH20, DX20B, or RP20
hardware and Field Service should be notified. If an OVRDTA BUGCHK
preceded this BUGCHK with the same channel and controller number, then
the drive number called out in the OVRDTA is the one to suspect first.
Data: CHAN - Channel number
CTRL - Controller number
UNIT - Unit number
>,,<DB%NND>)
BUG.(CHK,DXBUNA,PHYP2,HARD,<PHYP2 - Attention interrupt and unit was not active>,<<T1,CHAN>,<T2,CTRL>,<T3,UNIT>>,<
Cause: An attention interrupt occurred for a DX20 drive but the unit listed in
the KDB as active is not itself marked as active.
Action: There probably is a hardware problem with the RH20, DX20B, or RP20
hardware and Field Service should be notified. If an OVRDTA BUGCHK
preceded this BUGCHK with the same channel and controller number, then
the drive number called out in the OVRDTA is the one to suspect first.
Data: CHAN - Channel number
CTRL - Controller number
UNIT - Unit number
>,,<DB%NND>)
BUG.(CHK,DXBZEC,PHYP2,HARD,<PHYP2 - Zero ECC byte returned>,,<
Cause: An ECC correctable error occurred on an RP20 drive. The byte number in
error in the sector was zero. This indicates the error was in the ECC
byte itself and the data is actually correct.
Action: If this BUGCHK is seen often, Field Service should check the RP20
hardware for error recovery problems.
>,,<DB%NND>)
BUG.(CHK,ENQLNL,CFSSRV,SOFT,<CFSSRV - ENQ Database Lock not locked>,,<
Cause: CFEQUL was called to unlock the ENQ Database Lock, but the Lock
was found not to be locked.
Action: If this problem is reproducible, set this BUGCHK dumpable and submit an
SPR along with the dump and instructions on reproducing the problem.
>)
BUG.(HLT,EQBLNK,ENQSRV,SOFT,<ENQSRV - Bad list of SCA buffers>,,<
Cause: The .PKFLI pointer in the second buffer points to yet another
buffer. Presently, the code only expects to send 2 SCA messages
for any Request Message Set.
Action: Examine the dump for signs of a miscalculation in SC.BRK.
>)
BUG.(HLT,EQLTOT,ENQSRV,SOFT,<ENQSRV - VRQA value of EBTOTT is too big>,,<
Cause: The value in variable BUFL indicates that SC.BRK thought that the
VRQA would require multiple buffers. The value of EBTOTT
indicates that the rest of VRQA requires more than 1 more buffer.
Action: Examine the dump for signs of a miscalculation in either SC.BRK
or the setting of EBTOTT.
>)
BUG.(CHK,EQNOTF,ENQSRV,SOFT,<ENQSRV - Rescheduling notification failed>,,<
Cause: Routine EQLKSD was called to notify the other systems in the cluster
that a lock has been rescheduled. This did not occur because
of a lack of resources, namely SCA message buffers. There may be a fork on
another system which is hung waiting to get access to this lock and it
won't unhang until that system receives a rescheduling notification.
This BUGCHK should not occur and is evidence of a much more serious
SCA problem.
Action: In order to unhang any forks which may be waiting for access to the
lock which was just rescheduled, the following can be done. On
every system, set bit EN.SDO in every Lock-Block which is on the
EQLBLT. Then increment the right half of EQFKFL by the number of
blocks just set. This can be done by setting EQCSTF to -1 on
each system. This makes the system think that a cluster
state change occurred and forces a rescheduling of all known
lock blocks.
>,,<DB%NND>)
BUG.(HLT,EQNOVW,ENQSRV,SOFT,<ENQSRV - NO bit set no vote required>,,<
Cause: The VRPA contains the replies to a vote request and VPNO and VPNOV
are both set. This should never happen since VPNOV means that no
other node knows about the lock but VPNO means that another node
rejected the vote request; so it must have known about the lock.
This is a problem in the Lock-Block caching algorithm.
Action: A cluster dump may be required here to solve the problem. Also,
check all the places where VPNOV and VPNO are set for a possible
problem. This logic resides mostly in routines EVEOKR and EIEOKR.
>)
BUG.(HLT,EQNVRB,ENQ,SOFT,<ENQ - Could not get buffer for VRB>,<<T1,ERR>>,<
Cause: SC.ABF was called to acquire a buffer for use as the VRB. However,
no buffers were available in the SCA message pool. We cannot
continue without a VRB.
Action: Examine the dump for signs of damage to the SCA message pool.
Data: ERR - Error code returned by SCA.
>)
BUG.(HLT,EQSTOT,ENQSRV,SOFT,<ENQSRV - VRQA value of EBTOTT is too small>,,<
Cause: The value in variable BUFL indicates that SC.BRK thought that the
VRQA would require multiple buffers. But, the value of EBTOTT is
less than or equal to zero which means that there is nothing more
to send.
Action: Examine the dump for signs of a miscalculation in either SC.BRK
or the setting of EBTOTT.
>)
BUG.(CHK,EXILGO,MEXEC,SOFT,<EXECI - Interrupt during login or logout>,,<
Cause: Control has passed to the mini-exec because the top fork hit a
terminating condition or monitor interrupt. The top fork EXEC may
have been wiped out. In addition, the job was trying to log in or
out. The fork is put into an infinite wait state since any other
action might lead to further itraps, interrupts, looping, etc.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>)
BUG.(HLT,EXPAFK,MEXEC,SOFT,<EXPALL - Job 0 CFORK failed>,,<
Cause: This happens if the CFORK JSYS fails to create a fork
for doing the system-wide expunge of structure BS:. This could most
likely happen if all the fork slots are used up.
>)
BUG.(CHK,EXPRCD,MEXEC,HARD,<EXPALL - RCDIR failure>,,<
Cause: RCDIR% failed to translate the first directory of BS:<*> to a
directory number in routine EXPALL.
Action: The system-wide expunge of PS: has not been done. There may be
hardware or directory structure problems with the boot structure.
If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>)
BUG.(HLT,FATCDP,APRSRV,HARD,<Fatal cache directory parity error>,<<A,CONIAP>>,<
Cause: An APR interrupt occurred because a physical page number with even
parity was encountered in the cache directory.
Action: Have Field Service check the system to make sure that it is functioning
properly. Particular attention should be given to the cache directory.
Backplane problems have also been known to cause these.
Data: CONIAP - Result of CONI APR
>)
BUG.(HLT,FATMER,APRSRV,HARD,<Fatal memory error>,,<
Cause: An APR interrupt occurred indicating an SBUS error and no MB parity
error. However, no MOS controller reports an error. The monitor is
unable to determine the cause of the SBUS error. The monitor has
printed a description of the problem on the CTY.
Action: Field Service should check out all memory on the system.
>)
BUG.(CHK,FEBAD,FESRV,HARD,<FEHSD - Wrong FE>,<<C,FEDTE>,<A,SRCDTE>>,<
Cause: FEHSD was called to pass string data from an FE: device to the user
program but the source DTE disagreed with the DTE that was
associated with this instance of FE: device.
Data: FEDTE - DTE associated with FE:
SRCDTE - Source DTE
>)
BUG.(CHK,FEBFOV,FESRV,HARD,<FEHSD - Buffer overflow>,<<A,NBYTES>,<C,BYTCNT>>,<
Cause: FEHSD was called to transfer a string from a front end device to a
user program but the user's buffer could not hold the number of bytes
the front end was passing.
Data: NBYTES - Number of bytes in the string
BYTCNT - Count of free bytes in user buffer
>)
BUG.(CHK,FEOCPB,DSKALC,SOFT,<FEFSYS - Failed to backup ROOT-DIRECTORY>,<<T1,STRCOD>>,<
Cause: A copy of the Root-Directory was not made due to one of the
following errors:
1. Not enough free space
2. Could not get JFN
3. Root-Directory or symbol table is bad
Action: Determine the cause (1, 2 or 3) and try to fix the problem.
Data: STRCOD - Structure unique code
>,,<DB%NND>)
BUG.(CHK,FEUSTS,FESRV,HARD,<FESSTS - Unknown status>,<<A,STATUS>>,<
Cause: TOPS-20 received an FE: device status message that did not indicate
an end of file condition.
Data: STATUS - Status byte from RSX20F
>)
BUG.(CHK,FILBAK,FILINI,HARD,<FILCRD - Could not create backup of ROOT-DIRECTORY>,,<
Cause: CREBAK failed to create the backup copy of the root directory during a
file system creation.
Action: There appears to be a hardware problem on this system. If the trouble
persists, have Field Service check the system. If the problem can be
reproduced on healthy hardware, send in an SPR along with a dump and
instructions on reproducing the problem.
>,,<DB%NND>)
BUG.(INF,FILBAT,DISC,HARD,<DSKCLZ - File marked as possibly bad>,<<T4,DIRNUM>,<T2,STR>>,<
Cause: A file is being closed and the OFN for the file contains a bit
indicating a possible error. The file's FDB is marked.
Data: DIRNUM - directory number
STR - structure name in SIXBIT
>,,<DB%NND>)
BUG.(CHK,FILBOT,FILINI,HARD,<FILRFS - Could not create BOOTSTRAP.BIN file>,,<
Cause: BOTSYS failed to create a BOOTSTRAP.BIN file during a BS: refresh. See
the documentation for FILFEF for possible reasons for this failure.
Action: Select a different disk pack to build the system on. If the trouble
persists, have Field Service check the system. If the problem can be
reproduced on healthy hardware, send in an SPR along with a dump and
instructions on reproducing the problem.
>,,<DB%NND>)
BUG.(HLT,FILBTB,FILINI,HARD,<FILRFS - Unable to write bit table file>,,<
Cause: This BUGHLT occurs when FILRFS is refreshing PS:. FILRFS calls
WRTBTB to write the bit table, and WRTBTB fails for a reason other than
MSTRX6 (home blocks are bad). WRTBTB can fail for several reasons,
including: GETBTB failed to get a JFN on the bit table file; the OPENF
failed; or CHFDB, GTFDB, or MODHOM failed.
Action: Select a different disk pack to build the system on. If the trouble
persists, have Field Service check the system. If the problem can be
reproduced on healthy hardware, send in an SPR along with a dump and
instructions on reproducing the problem.
>)
BUG.(CHK,FILCCD,FILINI,HARD,<FILCRD - Could not create directory>,,<
Cause: FILCRD failed to create one of the standard system directories during a
file system creation because STRST failed to form a complete directory
string. STRTS fails if STRCNV cannot create a unique code from the
structure number or if a DEVST fails to convert the structure number to
a string.
Action: There appears to be a hardware problem on this system. If the trouble
persists, have Field Service check the system. If the problem can be
reproduced on healthy hardware, send in an SPR along with a dump and
instructions on reproducing the problem.
>,,<DB%NND>)
BUG.(CHK,FILFEF,FILINI,HARD,<FILRFS - Could not create front end file system>,,<
Cause: FEFSYS failed to create a front end file system. FEFSYS fails for
several reasons. For example, ASGPAG can fail or the count of front
end pages in the home blocks can be negative.
Action: Select a different disk pack to build the system on. If the trouble
persists, have Field Service check the system. If the problem can be
reproduced on healthy hardware, send in an SPR along with a dump and
instructions on reproducing the problem.
>,,<DB%NND>)
BUG.(CHK,FILHOM,FILINI,HARD,<FILRFS - Unable to rewrite home blocks in WRTBTB>,,<
Cause: FILRFS is attempting to refresh BS:. WRTBTB is called to write the bit
table. This BUGCHK indicates that WRTBTB failed because the home blocks
are bad.
Action: The home blocks must be repaired. Select a different disk pack to
build the system on. If the trouble persists, have Field Service check
the system. If the problem can be reproduced on healthy hardware, send
in an SPR along with a dump and instructions on reproducing the
problem.
>,,<DB%NND>)
BUG.(HLT,FILIRD,FILINI,HARD,<FILRFS - Could not initialize the ROOT-DIRECTORY>,,<
Cause: This occurs during special system startup if FILRFS, while trying
to build BS:, gets a failure return from DIRINI, which is trying to
initialize the root-directory. DIRINI fails if its call to MAPDIR
fails, or if the SETZM which first touches the directory fails. MAPDIR
fails if either the structure or directory number is out of range,
or if MAPIDX fails to map in the index table.
Action: Select a different disk pack to build the system on. If the trouble
persists, have Field Service check the system. If the problem can be
reproduced on healthy hardware, send in an SPR along with a dump and
instructions on reproducing the problem.
>)
BUG.(CHK,FILJB1,FILINI,HARD,<FILCRD - No room to create standard system directories>,,<
Cause: FILCRD could not create the standard system directories during a file
structure creation because it could not get JSB free space for use
during CRDIR calls.
Action: None of the system directories were created. If the problem can be
reproduced on healthy hardware, set this bug dumpable, send in an SPR
along with a dump and instructions on reproducing the problem.
>,,<DB%NND>)
BUG.(HLT,FILMAP,FILINI,HARD,<FILIN2 - Could not map in ROOT-DIRECTORY>,,<
Cause: During standard system startup, SETDIR failed to map in the root
directory for consistency checking. SETDIR fails if CNVSTR fails
to convert structure number data or if MAPDIR fails to map in the
directory, or if DR0CHK finds a directory header inconsistency.
Action: There appears to be a hardware problem with the BS: file structure. If
the trouble persists, have Field Service check the system. If the
problem can be reproduced on healthy hardware, send in an SPR along
with a dump and instructions on reproducing the problem.
>)
BUG.(HLT,FILRID,FILINI,SOFT,<FILRFS - Index table already set up for ROOT-DIRECTORY>,,<
Cause: This BUGHLT occurs if, during a refresh in the FILRFS routine
during system startup, the SETIDX call fails. That call is trying to
set up the index table for the root-directory for BS:. SETIDX fails if
it is passed a directory number that is out of range, or if the index
table is already set up but at a different spot than that requested in
the current call.
>)
BUG.(CHK,FIXBAD,FILINI,HARD,<FILIN3 - Could not re-write home blocks to point to FE file system>,,<
Cause: FIXFES has failed to re-write the pointer to the front-end file system
during startup. FIXFES can fail if it cannot get free space or the
front-end file system file is bad.
Action: There appears to be a hardware problem with the boot structure on this
system. If the trouble persists, have Field Service check the system.
If the problem can be reproduced on healthy hardware, send in an SPR
along with a dump and instructions on reproducing the problem.
>,,<DB%NND>)
BUG.(CHK,FIXBDB,FILINI,HARD,<FILIN3 - Could not re-write home blocks to point to BOOTSTRAP.BIN>,,<
Cause: FIXBOT failed to re-write the pointer to the BOOTSTRAP.BIN file into
the home blocks during system startup. FIXBOT can fail if it cannot
get free space or if it cannot get a JFN for BOOTSTRAP.BIN.
Action: There appears to be a hardware problem with the boot structure on this
system. If the trouble persists, have Field Service check the system.
If the problem can be reproduced on healthy hardware, send in an SPR
along with a dump and instructions on reproducing the problem.
>,,<DB%NND>)
BUG.(CHK,FKCTNZ,FORK,SOFT,<Fork lock nest count non-zero>,<<JOBNO,JOB>,<FORKN,JBFORK>>,<
Cause: The FLOCK routine found the nest count for the fork lock to be
non-zero, which should not happen, since the lock has just been
locked for the first time. This is probably due to some other
software not having cleared the nest count from some previous lock.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
Data: JOB - Internal Job number whose fork discovered the non-zero
nest count.
JBFORK - Jobwide fork index of the discovering fork.
>)
BUG.(CHK,FKWSP1,SCHED,SOFT,<LOADBS - Unreasonable FKWSP>,<<T1,FKWSS>,<T2,COUNT>,<T3,FKCSIZ>,<FX,FORK>>,<
Cause: The value of FKCSIZ for this fork was found to be incorrect.
Specifically, the value of FKWSS was found to be less than the value of
FKCSIZ for this fork. The correct value is being computed and saved in
FKCSIZ. This problem is difficult to diagnose.
Action: If this BUGCHK can be reproduced, change it to a BUGHLT and submit an
SPR along with a dump and how to reproduce the problem.
Data: FKWSS - Fork's reserve working set size
COUNT - Actual count of pages belonging to this fork
FKCSIZ - Saved count of pages belonging to this fork
FORK - Fork number
>)
BUG.(INF,FLKINT,FORK,SOFT,<FLOCK - Called while NOINT>,,<
Cause: The routine FLOCK was called while the calling process was
unable to be interrupted. The calling fork was not nesting the lock
nor was it the top fork of the job. This indicates a logic error
because if this fork is unable to acquire the lock it will DISMS
while NOINT. This can cause a deadly embrace where the fork which
owns the lock will not relinquish it until the fork which has dismissed
is interrupted, which never happens because the fork is NOINT.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>)
BUG.(CHK,FLKNS,FORK,SOFT,<FUNLK - Lock not set>,<<FORKN,JOBFRK>>,<
Cause: The FUNLK routine, which unlocks the fork lock, detected that the
lock was already unlocked. This should not be, since anyone
calling FUNLK to unlock the lock presumably first called FLOCK to
lock it. This BUG is usually preceded by a FLKTIM BUGCHK. See
the description of FLKTIM for more details.
Action: No action is required for this BUG, especially if it was preceded
by a FLKTIM BUGCHK, unless a real problem in the fork lock logic is
suspected. If this is the case, make the BUG dumpable and submit an
SPR with the dump and a copy of MONITR.EXE. If possible, include
any known method for reproducing the problem and/or the state of the
system at the time the BUG was observed.
Data: JOBFRK - Job fork number of fork desiring the lock
>)
BUG.(CHK,FLKTIM,FORK,SOFT,<FLOCK - Fork lock timeout>,<<FORKN,JOBFRK>,<JOBNO,JOB>,<FLKOWN,OWNER>>,<
Cause: A fork has been waiting a "long time" for the fork lock.
This BUGCHK announces that the system is assuming that some fork has
neglected to unlock the fork lock and the waiting fork is being
given the lock even though someone else still has it.
The code could be in error here. The measure of a "long time" is
calculated arbitrarily and can be changed. It is parameter FLKTMV.
Action: This BUG appears if the fork owning the lock is hung due
to some other event (unit offline, CFS voting freeze, etc.). Usually,
this is not evidence of a real problem but just a temporary system
event which caused the fork timeout value to expire. This BUG is
usually followed by a FLKNS BUGCHK since this fork acquires and
unlocks the lock and then the fork which had it before attempts
to unlock the lock and finds it already unlocked.
There is no need to take any action due to this BUG unless a real
problem in the fork lock logic is suspected. If action is desired,
first, try increasing FLKTMV in STG.MAC and rebuilding the monitor.
If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
Data: JOBFRK - Job fork number of fork desiring the lock
JOB - Internal Job number desiring the lock
OWNER - Job fork number of fork currently holding the lock
>)
BUG.(INF,FORCED,DOB,SOFT,<DOB - Requested BUGINF with continuable dump>,<<CTRLTT,CTRLTT>,<GBLJNO,GBLJNO>>,<
Cause: This BUGINF has been requested by a user running the DOBOPR program
or executing the DOB% JSYS function .DBIMD. There is no other
way that this BUGINF can occur. The name of the user who requested
the BUGINF has been printed on the CTY as part of the BUGINF output.
The purpose of this BUGINF is to force a continuable dump of memory.
A continuable dump should follow this BUGINF.
Data: CTRLTT - the controlling terminal of the user who requested this.
GBLJNO - the job number of the user who is requesting this.
Action: Examine the dump.
>,,<DB%REQ!DB%IGN>)
BUG.(HLT,FPTMXX,PAGEM,SOFT,<FPTA - Process address in sched context>,<<T1,ADR>>,<
Cause: FPTA has been called in scheduler context and given an address that is
part of the process/job context area.
Data: ADR - Given address
>)
BUG.(CHK,FRKBAL,PAGEM,SOFT,<AGESET - Fork not in BALSET>,,<
Cause: While adding a page to a process's working set, AGESET detected that
the working set is not in memory.
>)
BUG.(CHK,FRKNDL,SCHED,SOFT,<HLTFRK - Fork not properly deleted>,,<
Cause: HLTFRK was called to complete the deletion process for a fork but a
check of FKCSIZ showed that not all pages belonging to this fork have
been deleted. This indicates an inconsistency in the monitor's data
base.
Action: If this BUGCHK can be reproduced, set it dumpable and submit an SPR
along with a dump and how to reproduce the problem.
>)
BUG.(HLT,FRKPTE,PAGUTL,HARD,<BADCPG - Fatal error in fork PT page>,,<
Cause: A hardware error (AR/ARX parity error or MB parity error) was detected
when the monitor referenced a page in memory that contained a process's
page table. The monitor has printed an analysis of the error on the
CTY, and a SYSERR entry is created when the monitor is rebooted.
Action: Field Service should check out the system.
>)
BUG.(HLT,FRKSLF,FORK,SOFT,<SUSFK - Given self as argument>,,<
Cause: Some routine in the monitor has erroneously tried to suspend
itself with SUSFK.
>)
BUG.(HLT,FSICFS,DSKALC,SOFT,<Could not register BS with CFS>,,<
Cause: Some other CFS system has this structure mounted exclusively
or as an alias and is preventing this system from mounting the
structure. This is usually an administrative problem.
Action: Find out which other system in the cluster has this system's BS:
mounted and then dismount it from that system or set the access to
shared. Note, it could be more than one system in the cluster so
make sure you check each system in the cluster. Since OPR may
indicate that the structure is dismounted (since the system is down),
STRTST (or some other tool) might have to be used to dismount the
structure or change the access.
>)
BUG.(HLT,FSPANN,FREE,SOFT,<ASGFRE called OKINT>,<<T4,POOLN>,<T3,CALRPC>>,<
Cause: This is a free space problem. Calls to swappable free space
routines should be made only while the calling process is NOINT. The
calling routine is not protecting itself from losing free space. It is
OKINT. Since it is OKINT it could get interrupted and never return,
thus losing the free block assigned.
Action: The data supplied gives the address of the calling routine. Make
the routine NOINT until it has ensured that the block is
freed when it is interrupted (e.g. JSB stack).
Data: POOLN - Pool number
CALRPC - Caller of RELFSP
>)
BUG.(HLT,FSPARB,FREE,SOFT,<RELFSP - Bad block being released>,<<Q1,POOLN>,<T4,CALRPC>,<P2,BLKADR>>,<
Cause: The caller is attempting to release a block that has already been
released.
Action: Look at the stack to show the caller. It is possible that the
length of the current block is incorrect. It is equally likely that
the block(s) before this block (in free space) have had incorrect
lengths on return. Thus, the caller may not be the culprit.
Data: POOLN - Pool number
CALRPC - PC of caller of RELFSP
BLKADR - Address of user block
>)
BUG.(HLT,FSPBBS,FREE,SOFT,<Bad blocksize>,<<P1,POOLN>,<P2,BLKADR>>,<
Cause: This is a free space problem. The block size is either smaller
than the minimum block size for this pool, or larger than the entire
amount of space allocated to the pool.
Action: If the condition is noticed when a block is being returned to the list,
it is simply marked deassigned (and bad) but not returned to the list.
If the list header is large enough to include it, the PC, job
number and fork number of the assigner and deassigner are stored in
the header/trailer. These may give a clue as to what code caused the
problem.
Data: POOLN - Pool number
BLKADR - Address of the block. Zero indicates it is the pool
descriptor itself that contains the bad pointer.
>)
BUG.(HLT,FSPBLK,FREE,SOFT,<Block damaged>,<<T1,POOLN>,<P2,BLKADR>>,<
Cause: This is a free space problem. The header of the block does not match
its trailer.
Action: If the condition is noticed when a block is being returned to the list,
it is simply marked deassigned (and bad) but not returned to the list.
If the list header is large enough to include it, the PC, job
number and fork number of the assigner and deassigner are stored in
the header/trailer. These may give a clue as to what code caused the
damage.
Data: POOLN - Pool number
BLKADR - Address of the block. Zero indicates it is the pool
descriptor itself that contains the bad pointer.
>)
BUG.(HLT,FSPBND,FREE,SOFT,<RELFSP - Block out of range>,<<Q1,POOLN>,<T3,CALRPC>,<P2,BLKADR>>,<
Cause: This is a free space problem. The caller to the free space
routines is trying to return a block that was not given
out by the free space manager. The block is outside the
range of free space management.
Action: Look through the dump. By looking at the stack you
should be able to determine who called for the releasing
of the block.
Data: POOLN - Pool number
CALRPC - PC of caller to RELFSP
BLKADR - Address of block being returned
>)
BUG.(HLT,FSPBPC,FREE,SOFT,<RELFSP - Bad pool count>,<<Q1,POOLN>,<T3,CALRPC>,<P2,BLKADR>>,<
Cause: This is a free space problem. The caller to the free space
routines is trying to return a block so that when the pool count
is augmented by the blocksize, an invalid number results. The
blocksize may be in error, or the pool count may already be in
error.
Action: Look through the dump. If the blocksize is wrong, then study the
code at the calling PC for possible errors. If the pool count is
wrong, then more investigation is required. The history buffer for the
pool may contain helpful data.
Data: POOLN - Pool number
CALRPC - PC of caller to RELFSP
BLKADR - Address of block being returned
>)
BUG.(CHK,FSPBPN,FREE,SOFT,<FSPREM - Bad pool number>,<<T1,POOLN>,<T2,CALLER>>,<
Cause: This BUGCHK occurs when a routine calls routine FSPREM in FREE
to determine how much free space is left in a given pool and
the pool number supplied is either out of range (greater than or
equal to FSPTBL) or does not exist (FSPTAB entry equals zero).
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed. Use the dump to examine the
calling routine and fix it to supply a valid pool number.
Data: POOLN - Pool number
CALLER - Address of calling routine
>)
BUG.(HLT,FSPDNN,FREE,SOFT,<RELFSP called OKINT>,<<T4,POOLN>,<T3,CALRPC>>,<
Cause: This is a free space problem. The calling routine is trying to release
a swappable free space block while it is OKINT. This is dangerous since
it could get interrupted and lose the block. All free space actions
should occur while NOINT.
Action: The data supplied gives the address of the calling routine. Make
the routine become NOINT when it removes the address of the block
about to be released from the database. The routine can be made
OKINT when control is returned to it.
Data: POOLN - Pool number
CALRPC - PC of caller of RELFSP
>)
BUG.(INF,FSPOUT,FREE,SOFT,<Freespace pool exhausted>,<<T2,POOLN>>,<
Cause: This is a free space problem. There is no more space available
in the freespace pool.
Action: The data supplied gives the pool number (pool descriptor index) of
the pool in question. If the pool repeatedly runs out of space,
the pool size must be increased and the monitor rebuilt. The pool
size is specified in STG.MAC as the third argument in the FSPPL.
macro used to build the freespace pools.
Data: POOLN - Freespace pool number
>)
BUG.(HLT,FSPPRE,FREE,SOFT,<RELFSP - Bad block being released>,<<Q1,POOLN>,<T4,CALRPC>,<P2,BLKADR>>,<
Cause: This is a free space problem. The block being returned does not fit
into the free pool. The block would overlap the preceding block
in the pool.
Action: Look at the stack to show the caller. It is possible that the
length of the current block is incorrect. It is equally likely that
the block(s) before this block (in free space) have had incorrect
lengths on return. Thus, the caller may not be the culprit.
Data: POOLN - Pool number
CALRPC - PC of caller of RELFSP
BLKADR - Address of user block
>)
BUG.(HLT,FSPSCC,FREE,SOFT,<RELFSP - Bad block being released>,<<Q1,POOLN>,<T1,CALRPC>,<P2,BLKADR>>,<
Cause: This is a free space problem. The block being returned does not fit
into the free pool. The block would overlap the succeeding block
in the pool.
Action: Look at the stack to show the caller. It is possible that the
length of the current block is incorrect. It is equally likely that
the block(s) before this block (in free space) have had incorrect
lengths on return. Thus, the caller may not be the culprit.
Data: POOLN - Pool number
CALRPC - PC of caller of RELFSP
BLKADR - Address of user block
>)
BUG.(HLT,FSPZER,FREE,SOFT,<ASGFSP - Illegal to assign 0 FREE space>,<<T1,POOLN>,<T3,CALRPC>>,<
Cause: An illegal request for free space is being made. The calling routine
is asking for zero words of free space.
Action: Look at the dump. By backing up the stack you
should be able to tell what routine called for the illegal
free space.
Data: POOLN - Pool number
CALRPC - PC of caller of ASGFSP
>)
BUG.(CHK,GALCHF,CLUDGR,SOFT,<GALCHK failed>,<<T1,ERROR>>,<
Cause: The call to GALCHK failed because GALCHK could not get
enough free space from the system pool to do an MUTIL%
JSYS or the call to MUTIL% failed. Therefore, it could
not verify whether or not this job was part of GALAXY.
Action: If this BUGCHK appears, use the DOB% facility to take a
dump and submit an SPR about this problem.
Data: ERROR - Error code returned from MUTIL% or ASGRES.
>)
BUG.(CHK,GIVTMR,JSYSA,SOFT,<GIVOK timeout>,<<T2,FUNC>>,<
Cause: The access control job has not responded with a GIVOK within the
designated time period.
Action: If this consistently happens with the same function code, you
should see if the processing of the function can be made faster.
If there is no obvious function code pattern, you may need to increase
the timeout period or rework the way in which the access control
program operates.
Data: FUNC - the GETOK function code
>,,<DB%NND>)
BUG.(HLT,GLFNF,SCHED,SOFT,<GLREM - Fork not found>,,<
Cause: The scheduler is trying to remove a process from its linked list of
runnable processes (the GOLIST). The BUGHLT occurs because the
scheduler does not find the process in the GOLIST. This indicates an
inconsistency in the scheduler's data base.
>)
BUG.(CHK,GTFDB1,DISC,SOFT,<DSKINS - GETFDB failure.>,,<
Cause: The newly created file data block (FDB), which is needed to mark the
file as temporary, cannot be found.
>)
BUG.(HLT,GTFDB2,DISC,SOFT,<NEWLFP - GETFDB failure for open file>,,<
Cause: The FDB for a long file cannot be found, even though the FDB for
that file was found previously. The file is opened, but the FDB is
gone.
>)
BUG.(HLT,GTFDB3,DISC,SOFT,<DSKREN - GETFDB failure for open file>,,<
Cause: The RNAMF JSYS has detected a monitor internal error. It has created
an FDB for the destination file, and an internal routine that
finds an FDB in a directory has returned with a failure,
indicating an inconsistency in the newly-created FDB.
>)
BUG.(HLT,GTFDB6,JSYSF,SOFT,<CRDI0A - Cannot do GETFDB on ROOT-DIRECTORY >,,<
Cause: There was an error in creating the Root-Directory. Either the
FDB could not be mapped or the index table could not be set up.
Action: Use CHECKD to determine if the disk is OK. If you cannot repair
the structure with CHECKD, then it may need to be rebuilt.
>)
BUG.(CHK,GWYFNB,IPIPIP,SOFT,<FNDGWY returned unconnected gateway>,,<
Cause: The internet gateway lookup routine has returned the address for
a gateway that is not a neighbor.
>)
BUG.(CHK,HARDCE,APRSRV,HARD,<Hard cache errors--cache deselected>,,<
Cause: The hardware has detected an AR or ARX parity error that occurs only
when an address is referenced through the cache. An attempt to
reference the same address from memory with the cache turned off has
succeeded. This has happened more than the allowable maximum number of
times. The monitor turns off the cache and proceeds.
The monitor has printed a description of the problem on the CTY and
created a SYSERR block, which is written into the SYSERR file.
Action: Field Service should look at the system. The monitor continues to
run without the cache. However, when the front end reloads the monitor
at some future time, the front end enables the cache. Change the
configuration file to avoid this BUGCHK until the cache is fixed.
>,,<DB%NND>)
BUG.(HLT,HOMGON,PHYSIO,SOFT,<FRTHOB - Missing homeblock IORB>,<<P1,CHN>,<P2,KONT>,<P3,UNIT>>,<
Cause: A homeblock IORB is missing when one is expected to be on the PWQ.
Data: CHN - The channel number
KONT - The controller number
UNIT - The unit number
>)
BUG.(CHK,HPSCHK,SCHED,SOFT,<Excessive time in high priority>,<<T2,JOBNO>,<FX,FRKNO>>,<
Cause: A fork has entered a high priority scheduling condition (PIBMP, CSKED,
or JP%SYS), and has remained compute-bound for more than 5 seconds.
The fork has probably malfunctioned in some way, and the high
scheduling priority is affecting overall system response. The high
priority status is disabled until the fork itself clears the condition.
Action: The additional data contains the job number and system fork number.
The program should be changed to either not be so compute bound or not
set itself as high priority.
Data: JOBNO - Job number
FRKNO - Fork number
>)
BUG.(CHK,HSHERR,JSYSA,HARD,<VERACT - Hash value out of range>,,<
Cause: An account string was being hashed by routine HSHNAM in JSYSA in an
effort to validate an account. This BUG. indicates that HSHNAM returned
a hash value that is illegal.
Action: If this BUGCHK is reproducible, set this bug dumpable and submit an SPR
along with the dump and instructions on reproducing the problem.
>)
BUG.(HLT,HSYFRK,JSYSM,SOFT,<HSYS - Job 0 CFORK failed>,,<
Cause: This occurs if the CFORK JSYS fails to create a fork for
shutting down the system.
This failure occurs if the forks are totally used up, or if job
0 has used the maximum number of forks permitted. NUFKS contains this
maximum number.
Action: This is almost always caused by running GALAXY under job 0.
Investigate why there are too many forks under job 0 and move
some of them out.
>)
BUG.(HLT,IBCPYW,PAGEM,SOFT,<COPY - Write pointer in index block>,,<
Cause: A page fault occurred because a process attempted to write into a page
whose access was copy-on-write. The BUGHLT indicates that the page
table is an index block that should never have copy-on-write access.
>)
BUG.(HLT,IBOFNF,FILINI,SOFT,<FILINI - ASNOFN failure for ROOT-DIRECTORY index block>,,<
Cause: During normal system startup, the call to SETRDO failed to set an OFN
for the PS: root-directory. SETRDO fails if there is no SDB for the
structure, or if ASROFN fails to assign an OFN.
>)
BUG.(HLT,ICMBDE,IPIPIP,SOFT,<ICMERR -- Bad type code>,<<T1,D>>,<
Cause: The ICMERR routine was called to send an ICMP error message with
a message type code that is not supported by the monitor.
Data: D - Error code
>)
BUG.(HLT,ICMNST,IPIPIP,SOFT,<No storage for ICMP>,,<
Cause: During TCP/IP initialization the monitor was unable to obtain
the free space needed for ICMP message processing. This
probably indicates that internet free space is corrupted.
>)
BUG.(HLT,IDXNOS,FILINI,SOFT,<FILRFS - Could not assign free space for IDXTAB>,,<
Cause: During a refresh start (BS: is being built), if the call to ASGPAG for
getting buffer space for the index table fails, this BUGHLT happens.
ASGPAG fails if JBCOR has no 1-bits left in it, meaning that there
are no free pages left in free space.
>)
BUG.(HLT,ILAGE,PAGEM,SOFT,<Bad age field in CST0>,,<
Cause: The age of a memory page contains an unexpected value. One of the
following happened:
1. A page fault occurred and the age was either PSDEL or an undefined
age LESS THAN PSASN.
2. A process attempted to assign the page and its age was PSDEL, PSSPQ,
or an undefined age LESS THAN PSASN.
>)
BUG.(HLT,ILCHS1,PHYSIO,HARD,<PHYSIO - Illegal channel status at SIO>,,<
Cause: The STRTIO routine was called to begin IO for an IORB, but the channel
status indicated that the channel was already active doing a stacked
command.
Action: Field Service should check the system. It is unlikely that a software
problem could cause this BUGHLT.
>)
BUG.(HLT,ILCHS2,PHYSIO,HARD,<PHYSIO - Illegal channel state at STKIO>,,<
Cause: The STKIO routine was called to set up a second command for a channel,
but the channel status indicated it already had a second command in
progress.
Action: Field Service should check the system. It is unlikely that a software
problem could cause this BUGHLT.
>)
BUG.(HLT,ILCNSP,PHYSIO,HARD,<PHYSIO - Illegal call to CONSPW>,,<
Cause: The routine CONSPW was called to remove an element from the position
wait queue of a unit, but the arguments are illegal. Either the
arguments are null, or CONSPW is trying to remove more than one element
because it was passed more than one argument.
Action: Field Service should check the system. It is unlikely that a software
problem could cause this BUGHLT.
>)
BUG.(HLT,ILCNST,PHYSIO,HARD,<PHYSIO - Illegal call to CONSTW>,,<
Cause: The routine CONSTW was called to remove an element from the transfer
wait queue of a unit, but the arguments are illegal. Either the
arguments are null, or CONSTW is trying to remove more than one element
because it was passed more than one argument.
Action: Field Service should check the system. It is unlikely that a software
problem could cause this BUGHLT.
>)
BUG.(HLT,ILCST1,PAGUTL,SOFT,<Illegal address in CST1 entry, cannot restart>,,<
Cause: The monitor is attempting to complete I/O that was taking place when
the system crashed. The backup address in the CST is invalid for some
core page. Note: This code is executed only if the monitor is manually
started at location EVRST. This is not a recommended procedure.
>)
BUG.(HLT,ILDEST,PAGEM,SOFT,<Illegal destination identifier to SETMPG or SETPT>,,<
Cause: A routine has been called to change the map for a page. The caller
provided a source identifier for a page table (an SPT index) rather
than a single page. The BUGHLT indicates that the caller provided a
file page as a destination. This is illegal when the source is a page
table.
>)
BUG.(CHK,ILDRA1,SWPALC,HARD,<DASDRM - Illegal or unassigned drum address>,,<
Cause: DASDRM was called to deassign a drum address, but the drum address
provided by the caller is invalid or already unassigned.
Action: If this BUGCHK can be reproduced, change it to a BUGHLT and submit an
SPR along with instructions on reproducing the problem.
>)
BUG.(HLT,ILDRA2,SWPALC,SOFT,<DRMIAD - Illegal drum address>,,<
Cause: An illegal drum address was given to the DRMIAD subroutine, which
computes disk tracks and sectors.
>)
BUG.(INF,ILDSTF,DATIME,SOFT,<Illegal Daylight Saving Time flag>,<<T1,DSTFLG>>,<
Cause: Location DSTFLG contains an illegal value. The most likely cause of
this bug is a new way of confusing DST that subroutine DSTCHK wasn't
informed about.
Action: If this BUGINF occurs, the Daylight Saving Time flag is reset
to zero, using the default system in the monitor. Patch DSTFLG
to a legal value in your monitor to avoid this BUGINF.
Data: DSTFLG - Daylight Saving Time flag
>)
BUG.(HLT,ILESCD,PAGEM,SOFT,<Monitor section pointer not shared>,<<T1,POINTER>,<Q2,SECTION>>,<
Cause: A pointer for a monitor section has been found that is not a share
pointer. Only share pointers are expected. If other pointer types are
used, the code at FPTMSS must be enhanced. It is possible that the
monitor section table has been clobbered.
Data: POINTER - The pointer
SECTION - The monitor section for which it was found
>)
BUG.(HLT,ILFPTE,PAGEM,SOFT,<ILLFPT - Illegal section number referenced>,,<
Cause: A routine was called to translate a virtual address into an internal
identifier. The BUGHLT indicates that the caller provided a monitor
address that contained an invalid section number. This can mean one of
the following: (1) The section number is larger than the maximum
possible section or (2) On a machine that does not support extended
addressing, a non-zero section number was provided.
>)
BUG.(HLT,ILGDA1,SWPALC,SOFT,<GDSTX - Bad address>,,<
Cause: A bad drum address was given to the GDSTX routine, which converts drum
addresses into indexes into the DST table. Consequently, the GDSTX
routine did not try to compute the index.
>)
BUG.(HLT,ILGDA2,SWPALC,SOFT,<GDSTX - Bad address>,,<
Cause: A bad index into the DST was computed from a drum address that was given
to GDSTX for conversion.
>)
BUG.(CHK,ILGOKM,JSYSA,SOFT,<Illegal function for GETOKM call>,<<T1,GOKFCN>>,<
Cause: The GETOKM routine was called with an unknown function code. GETOKM
handles internal GETOK requests from the monitor.
Action: Set this bug dumpable and submit an SPR along with the dump and
instructions on reproducing the problem.
Data: GOKFCN - GETOK function code
>)
BUG.(CHK,ILIBPT,PAGUTL,HARD,<Bad pointer type in index block>,,<
Cause: SCNOFN was called to scan an index block and move all its pages to disk
but the index block contains a pointer that is not an immediate
pointer.
Action: This problem is usually seen when there is a hardware problem with a
disk or channel. Field Service should run SPEAR to read the SYSERR
file and check for problems.
However, if the hardware checks out and the problem is reproducible,
set this bug dumpable and submit an SPR along with the dump and how to
reproduce the problem.
>,,<DB%NND>)
BUG.(HLT,ILIRBL,PHYSIO,HARD,<PHYSIO - IORB link not null at ONFPWQ>,,<
Cause: The routine ONFPWQ was called to place an IORB at the front of the
position wait queue for a unit. But the link field in the IORB
pointing to the next IORB was not null.
Action: Field Service should check the system. It is unlikely that a software
problem could cause this BUGHLT.
>)
BUG.(CHK,ILJRFN,FORK,SOFT,<JFKRFH - Bad JRFN, ignored>,,<
Cause: Routine JFKRFH was erroneously called with a fork number which
is out of range. The correct range is a value less than NUFKS.
JFKRFH changes a fork number into a fork handle.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>)
BUG.(HLT,ILLDMS,APRSRV,SOFT,<BADDMS - Illegal DMS JSYS from monitor context>,<<KIMUPC,PC>>,<
Cause: The monitor has issued a JSYS that requests a service of the RMS-20
package. These JSYSs are legal in user mode only. An illegal
instruction trap is given to the current process.
Data: PC - PC in monitor address space where JSYS was invoked
>)
BUG.(HLT,ILLGO,PHYH2,HARD,<Invalid channel logout>,,<
Cause: The routine CKERR was called to check for channel errors after an I/O
operation. The operation supposedly succeeded according to the IORB
status bits. But in verifying for a short style IORB that the I/O was
done correctly, the page number contained in the channel logout area
did not match the number of the page on which the IORB wanted to
perform I/O.
Action: This is usually caused by an RH20 or drive problem. Have Field Service
check the hardware, particularly the RH20 and RP0x DCL.
>)
BUG.(CHK,ILLMJS,APRSRV,SOFT,<JSYS with E GTR 1000 executed in monitor>,<<FPC,PC>>,<
Cause: A JSYS with E greater than 1000 has been executed in the monitor.
There should be no such cases.
Action: If you can reproduce this BUGCHK, set it dumpable and send in an SPR
with the dump and instructions on reproducing the problem.
Data: PC - PC of JSYS
>)
BUG.(CHK,ILLTAB,LOGNAM,SOFT,<TABLK2 - Table not in proper format>,<<Q1,TABADD>>,<
Cause: A logical name table is not in the proper alphabetic order.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
Data: TABADD - Address of logical name table
>)
BUG.(HLT,ILLUUO,APRSRV,SOFT,<Illegal UUO from monitor context>,<<KIMUFL,FLAGS>,<KIMUPC,PC>,<KIMUEF,EFFADR>>,<
Cause: The monitor has executed an instruction that the microcode treats as an
MUUO. The op code is not 104 (for a JSYS) or one of the KA10 floating
point instructions.
Action: This bug is commonly caused by a software problem, but can be caused by
bad hardware. If the hardware checks out OK, and the BUGHLT is
reproducible, then send in an SPR along with the dump and instructions
on reproducing the problem.
Data: FLAGS - Processor flags when MUUO was executed
PC - PC in monitor address space where MUUO was executed
EFFADR - Effective address of MUUO
>)
BUG.(HLT,ILMNRF,PAGEM,SOFT,<Illegal reference to monitor address space>,<<T1,PFW>,<T2,FLAGS>,<T3,PC>>,<
Cause: The monitor made an illegal reference to an address in its map and was
not prepared to handle the error. The possible errors include illegal
read, write, and section number. See the page fail word for the reason
code.
This BUGHLT can also occur if an unrecoverable AR/ARX parity error is
detected on certain monitor references. In this case, the analysis of
the error has been printed on the CTY.
Action: If this BUGHLT was not preceded by an unrecoverable AR/ARX parity
error, please submit an SPR with a dump and how to reproduce the
problem.
Data: PFW - Page Fail Word
FLAGS - PC flags
PC - PC
>)
BUG.(HLT,ILOFN1,PAGEM,SOFT,<MSCANP - Illegal OFN>,,<
Cause: A routine has been called to scan the pages of a file to find the first
non-zero page. Its arguments include an OFN associated with the file.
The BUGHLT occurs because the caller has passed a 0.
>)
BUG.(HLT,ILOKSK,SCHED,SOFT,<OKSKED executed when not NOSKED>,,<
Cause: A process has declared itself to be OKSKED and ready to cease running
(dismiss) until some event occurs. This is clearly a software problem.
This BUGHLT occurs because the process is OKSKED, indicating a mismatch
of NOSKED and OKSKED states.
>)
BUG.(HLT,ILPAG1,PAGEM,SOFT,<SWPOT0 - Invalid page>,,<
Cause: A routine was called to swap out a page in core. The BUGHLT indicates
that the caller provided a bad argument, resulting in one of the
following:
1. The page is not in core.
2. The page is part of the resident monitor.
3. The page is locked in memory.
4. The page is already being swapped.
>)
BUG.(HLT,ILPAGN,PAGUTL,SOFT,<MRKMPG - Invalid page number>,,<
Cause: A routine has been called to mark a page as modified (to set the CORMB
flag in CST0). The BUGHLT indicates that the core page number provided
by the caller is invalid.
>)
BUG.(HLT,ILPDAR,PHYSIO,HARD,<PHYSIO - Illegal disk address in PAGEM request>,,<
Cause: The routine PHYSIO was called to queue up an IORB for PAGEM, but the
disk or swapping address, or unit was illegal. All such arguments
should have been checked by the caller.
Action: Field Service should check the system. It is unlikely that a software
problem could cause this BUGHLT.
>)
BUG.(CHK,ILPID1,IPCF,HARD,<CREPID - Attempt to create illegal PID>,,<
Cause: CREPID called PUTPID with an illegal PID number.
Action: If these persist, change the CHK to a HLT and examine the dump to
determine why GETPID returned an illegal PID value.
>)
BUG.(CHK,ILPID2,IPCF,HARD,<DELPID - Validated PID turned illegal>,,<
Cause: A PID which had previously been blessed by VALPID was found to be
illegal by PUTPID.
Action: If the problem continues, change this CHK to a HLT and try to
determine from a dump why the PID was OKed by VALPID and rejected
by PUTPID.
>)
BUG.(HLT,ILPLK1,PAGUTL,SOFT,<MLKPG - Illegal arguments>,,<
Cause: A routine was called to create a page in the monitor's address space.
But the caller provided a page identifier that pointed to an existing
page.
>)
BUG.(HLT,ILPPT1,PAGUTL,HARD,<UPDOFN - Bad pointer in page table>,,<
Cause: The monitor is updating the disk index block for a file. The index
block contains an address of a file page that is incorrect for one of
the following reasons:
1. It is a memory address of non-existent memory or in the
resident monitor.
2. There is no disk address for the page.
Action: This problem can be caused by bad hardware; Field Service should check
the system. If the hardware is not at fault, send in an SPR along with
a dump and any information on how to reproduce the problem.
>)
BUG.(HLT,ILPPT3,PAGEM,SOFT,<Bad pointer in page table>,,<
Cause: A page fault occurred because a process touched a page whose map entry
contained access bits of 0, indicating non-existent page. But when the
monitor mapped the page table, the page's entry was not 0. A
non-existent page should always be represented by an all-zero entry.
>)
BUG.(HLT,ILPSEC,APRSRV,SOFT,<Illegal section number>,<<UPTPFO,PC>,<UPTPFW,PFW>>,<
Cause: While running in scheduler context, the monitor made a reference to an
address whose section number exceeded 37.
Data: PC - PC when instruction was executed
PFW - page fail word
>)
BUG.(HLT,ILPTN1,PAGEM,SOFT,<MRPACS - Illegal PTN>,,<
Cause: A routine has been called to determine the possible access to a page.
Its arguments include the SPT index for the page table associated with
the page. The BUGHLT occurs because the caller has passed a 0.
>)
BUG.(HLT,ILRBLT,PHYSIO,HARD,<PHYSIO - IORB link not null at ONF/STWQ>,,<
Cause: One of the routines ONFTWQ or ONSTWQ was called to insert an IORB into
the transfer wait queue, but the link word for that IORB was not zero.
IORBs should always contain a null link when they are created or
removed from a queue, so that many queue handling errors can be
detected.
Action: Field Service should check the system. It is unlikely that a software
problem could cause this BUGHLT.
>)
BUG.(HLT,ILSPTH,PAGEM,SOFT,<SETPT - SPTH inconsistent with XB>,,<
Cause: A routine has been called to change the map for a page of a process.
The page is being mapped to a file page for which the index block has a
share pointer. The share pointer points to an SPT slot. The BUGHLT
indicates that the SPT slot is not owned by the file page whose map
word points to it. This indicates an inconsistency in the monitor's
data.
>)
BUG.(HLT,ILSPTI,PAGEM,SOFT,<Illegal SPT index given to SETMXB>,,<
Cause: A routine has been called to change the map for a page. The caller
provided a source identifier for a page table (an SPT index) rather
than a single page. The BUGHLT indicates that the source identifier is
an invalid SPT index, larger than the maximum value allowed.
>)
BUG.(HLT,ILSRC,PAGEM,SOFT,<Illegal source identifier given to SETPT>,,<
Cause: A routine has been called to change the map for a page. The caller is
expected to provide an identifier for the source that is of the form
(SPT index,,page number). The BUGHLT indicates that the right half of
the identifier contains an illegal value (that exceeds 777).
>)
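; Note: a minimal sketch of the identifier check described for ILSRC, assuming
; the usual 36-bit word split into two 18-bit halves and that "777" is octal.
; The names split_identifier and is_legal_source are hypothetical.

```python
HALF_MASK = 0o777777    # one 18-bit half of a 36-bit word
MAX_PAGE = 0o777        # largest page number the ILSRC text allows


def split_identifier(word: int):
    """Split (SPT index,,page number) into its left and right halves."""
    left = (word >> 18) & HALF_MASK
    right = word & HALF_MASK
    return left, right


def is_legal_source(word: int) -> bool:
    """True when the right half (the page number) does not exceed 777."""
    _, page = split_identifier(word)
    return page <= MAX_PAGE
```

; For example, an identifier whose right half is 777 is accepted, while one
; whose right half is 1000 is rejected.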
BUG.(HLT,ILSWPA,PAGEM,SOFT,<SWPIN - Illegal swap address>,,<
Cause: A routine has been called to swap a page into core. The backup address
for the page is of an illegal format. This indicates a software
problem.
>)
BUG.(HLT,ILTWQ,PHYSIO,HARD,<PHYINT - TWQ or PWQ incorrect>,,<
Cause: In the PHYINT routine to handle an interrupt, after the lower level
interrupt code has returned, a check is made to see if the IORB
returned matched the first element of either the position wait queue or
the transfer wait queue. The returned IORB did not match the first
element in the queue checked.
Action: Field Service should check the system. It is unlikely that a software
problem could cause this BUGHLT.
>)
BUG.(HLT,ILTWQP,PHYSIO,HARD,<PHYSIO - PWQ or TWQ tail pointer incorrect>,,<
Cause: The pointer to the last element in the position wait queue or transfer
wait queue (UDBPWQ or UDBTWQ) points to an IORB which has a non-null
link to further IORBs. This is checked in various routines such as
ONTWQ, ONPWQ, ONSTWQ, ONFPWQ, CONSTW, or CONSPW.
Action: Field Service should check the system. It is unlikely that a software
problem could cause this BUGHLT.
>)
BUG.(HLT,ILULK1,PAGUTL,SOFT,<MULKPG - Tried to unlock page not locked>,,<
Cause: A routine was called to unlock a core page, but the page was not in
core, indicating it could not have been locked.
>)
BUG.(HLT,ILULK2,PAGUTL,SOFT,<Tried to unlock page not locked>,,<
Cause: A routine was called to unlock a core page, but the page's lock count
was 0.
>)
BUG.(HLT,ILULK3,PAGUTL,SOFT,<MULKMP - Illegal monitor address>,,<
Cause: A routine was called to unlock a core page in the monitor's address
space but the caller provided a page identifier that did not point to
the monitor's map.
>)
BUG.(HLT,ILULK4,PAGUTL,SOFT,<MULKCR - Illegal core page number>,,<
Cause: A routine was called to unlock a core page. The caller provided a page
number that was illegal because of one of the following:
1. The page is never locked because it is part of the resident monitor.
2. The page does not exist in physical memory.
>)
BUG.(HLT,ILUST1,PHYSIO,HARD,<PHYSIO - Unit status inconsistent at SIO>,,<
Cause: The STRTIO routine was called to start IO on a unit for an IORB, but
the unit or controller status indicated that the unit was already
active. IO should never be started on an active drive.
Action: Field Service should check the system. It is unlikely that a software
problem could cause this BUGHLT.
>)
BUG.(CHK,ILUST2,PHYSIO,HARD,<PHYSIO - Unit status inconsistent at SPS>,,<
Cause: The routine STRTPS was called to begin a positioning request for a
unit, but the status indicated that the unit was already active and the
transfer wait queue was nonempty.
Action: Field Service should check the system. It is unlikely that a software
problem could cause this BUGCHK. If this BUGCHK persists and no
hardware problem can be found, change this BUGCHK to a BUGHLT and send
in an SPR with the dump and how to reproduce the problem.
>)
BUG.(HLT,ILUST3,PHYSIO,HARD,<PHYSIO - SCHSEK - Impossible unit status>,,<
Cause: The SCHSEK routine was called to start a position request for a unit,
but the status of the unit indicated it was not idle. SCHSEK should
only be called when a unit becomes inactive.
Action: Field Service should check the system. It is unlikely that a software
problem could cause this BUGHLT.
>)
BUG.(HLT,ILUST4,PHYSIO,HARD,<PHYSIO - Controller active at SPS>,,<
Cause: The routine STRTPS was called to begin positioning on a unit, but the
controller status indicated it was already busy.
Action: Field Service should check the system. It is unlikely that a software
problem could cause this BUGHLT.
>)
BUG.(HLT,ILUST5,PHYSIO,SOFT,<PHYSIO - Illegal channel or controller state at STKIO>,,<
Cause: The STKIO routine was called to stack up a second command for a
channel, so that the CDB and KDB (if it exists) should have been marked
as active. However, at least one of them wasn't active.
Action: Field Service should check the system. It is unlikely that a software
problem could cause this BUGHLT.
>)
BUG.(HLT,ILUST6,PHYSIO,SOFT,<PHYSIO - Illegal unit state at STKIO>,,<
Cause: The STKIO routine was called to stack up a second command for a
channel, but the unit either was not active or was doing positioning.
Action: Field Service should check the system. It is unlikely that a software
problem could cause this BUGHLT.
>)
BUG.(HLT,ILXBP,PAGEM,SOFT,<SETPT - Bad pointer in XB>,,<
Cause: A routine has been called to change the map for a page of a process.
The page is being mapped to a file page. The BUGHLT indicates that the
index block for the file contains an indirect pointer in memory. Only
share pointers and immediate pointers are legal for index blocks.
>)
BUG.(INF,IMINX1,IMPANX,HARD,<Unusual ANI interrupt>,<<T1,D>>,<
Cause: The monitor received an interrupt from the input side of the AN20
when it was supposed to be idle. This may indicate a hardware
problem with the AN20.
Data: D - CONI ANI
>,,<DB%NND>)
BUG.(HLT,IMPBAD,IPFREE,SOFT,<Internet free space - Attempt to return a buffer not in range>,,<
Cause: The Internet 1822 buffer facility has been called to return a buffer
which is not an 1822 buffer. From looking at the dump, find the caller
of this routine and figure out what kind of packet the caller is
really dealing with.
>)
BUG.(INF,IMPERN,IMPDV,HARD,<IMPDV: Received error notification message>,<<T1,HOST>,<T2,LINK>,<T3,TYPE>,<T4,SUBTYP>>,<
Cause: The IMP has detected an error in the last message transmitted to it.
The error is after the leader but before the end of the message. This
may indicate possible hardware problems in the AN20.
Action: Have field service look at the AN20.
Data: HOST - Host number
LINK - Link
TYPE - Error type
SUBTYP - Error subtype.
>)
BUG.(INF,IMPHNW,IMPDV,SOFT,<IMPDV: LHOSTN disagrees with the IMP>,,<
Cause: The monitor has received a NOP message from the IMP with an address
that disagrees with our known address. The IMP has been known to
send corrupted NOP messages in the past but the problem is probably
that the SYSTEM:INTERNET.ADDRESS file has the wrong address for the
AN20 interface.
Action: Ensure proper operation of the IMP and then check this system's
SYSTEM:INTERNET.ADDRESS file and see if the address for the AN20
interface is correct.
>,,<DB%NND>)
BUG.(HLT,IMPIBF,IMPANX,SOFT,<Internet buffers fouled>,<<T3,D>>,<
Cause: The monitor was trying to obtain an internet buffer and none were
available. This is a problem because the available buffer count
was non-zero.
Data: D - Pointer to buffer
>)
BUG.(INF,IMPINC,IMPDV,SOFT,<IMPDV: Received incomplete transmission message>,<<T1,HOST>,<T2,LINK>,<T3,TYPE>,<T4,SUBTYP>>,<
Cause: The IMP has declared that the last message transmitted to it was
incomplete. This may indicate possible hardware problems
with the AN20 or the following conditions (subtypes):
0. The destination host did not respond quickly enough to the
message.
1. The message was too long.
2. The AN20 took more than 15 seconds to transmit the message
to the IMP.
3. The message was lost in the network due to an IMP or circuit
failure.
4. The IMP could not accept the message within 15 seconds due
to unavailable resources.
5. The IMP had an IO failure during the receipt of this message.
Action: This BUG has been known to spuriously occur. If you get many at
one time, check your connections to the IMP and also check to
make sure that it is functioning correctly. The following
monitor data will be updated to help track this problem:
IMP9 - total number of incomplete transmissions
IMP9X - Count of each subtype indexed by the subtype number
IMP9LS - Last subtype seen
IMP9LH - IP address of failing host
Data: HOST - IP address of failing host
LINK - Link
TYPE - Error type
SUBTYP - Error subtype
>,,<DB%NND>)
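; Note: a minimal sketch of how the IMPINC subtypes and tracking cells could
; be modeled when reading a dump. The subtype descriptions are paraphrased
; from the entry above. The IncompleteCounters class is hypothetical; its
; fields only mirror the names IMP9, IMP9X, IMP9LS and IMP9LH, and the
; handling of subtypes outside 0-5 is an assumption.

```python
INCOMPLETE_SUBTYPES = {
    0: "destination host did not respond quickly enough",
    1: "message was too long",
    2: "AN20 took more than 15 seconds to transmit the message",
    3: "message lost in the network (IMP or circuit failure)",
    4: "IMP could not accept the message within 15 seconds",
    5: "IMP had an IO failure while receiving the message",
}


class IncompleteCounters:
    """Illustrative stand-in for the IMP9, IMP9X, IMP9LS and IMP9LH cells."""

    def __init__(self):
        self.imp9 = 0                                  # total incomplete transmissions
        self.imp9x = [0] * len(INCOMPLETE_SUBTYPES)    # count per subtype
        self.imp9ls = None                             # last subtype seen
        self.imp9lh = None                             # address of failing host

    def record(self, host, subtype):
        """Update the counters and return a readable description."""
        self.imp9 += 1
        if subtype in INCOMPLETE_SUBTYPES:
            self.imp9x[subtype] += 1                   # unknown subtypes only hit the total
        self.imp9ls = subtype
        self.imp9lh = host
        return INCOMPLETE_SUBTYPES.get(subtype, "unknown subtype")
```

; A burst of records with the same subtype shows up as a single large entry
; in the per-subtype list, which is the pattern the Action text suggests
; looking for.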
BUG.(HLT,IMPIWW,IMPDV,SOFT,<IMPDV: Internet buffer word size wrong>,,<
Cause: The monitor has detected an illegal size in the NBBSZ field of an
internet buffer. This indicates the buffer is probably smashed.
This is probably a software problem.
>)
BUG.(HLT,IMPLKF,IMPDV,SOFT,<IMPDV: Attempt to lock buffer on freelist>,,<
Cause: The monitor has attempted to lock a buffer into memory in preparation
for IO and has determined that the buffer is not assigned or has been
smashed. This probably indicates a software problem.
>)
BUG.(CHK,IMPRMI,IMPDV,SOFT,<IMPDV: Regular message on irregular queue>,,<
Cause: The monitor has detected a type zero message on the irregular message
queue. This is not supposed to happen and indicates a software problem
in the monitor.
Action: If this problem persists, change this BUGCHK to a BUGHLT and submit
an SPR.
>)
BUG.(HLT,IMPULF,IMPDV,SOFT,<IMPDV: attempt to unlock buffer on freelist>,,<
Cause: The monitor has either attempted to unlock a buffer on the free
buffer list or the buffer is smashed. This probably indicates a
software problem.
>)
BUG.(HLT,IMPUUO,APRSRV,HARD,<Impossible MUUO>,,<
Cause: The monitor was called at its MUUO handler because the user executed an
MUUO. However, the op code reported by the microcode is in the range
1-37, which should have caused an LUUO.
Action: This bug is commonly caused by a hardware problem, but can be caused by
bad software. Field Service should check out the system. If the
hardware checks out OK, and the BUGHLT is reproducible, then send in an
SPR along with the dump and instructions on reproducing the problem.
>)
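; Note: a minimal sketch of the op code test described for IMPUUO, assuming
; the op code is the top 9 bits of a 36-bit instruction word and that the
; range 1-37 is octal. The names opcode_of and is_luuo are hypothetical.

```python
OPCODE_SHIFT = 27       # the op code is the high 9 bits of a 36-bit word
OPCODE_MASK = 0o777


def opcode_of(instruction: int) -> int:
    """Extract the 9-bit op code from a 36-bit instruction word."""
    return (instruction >> OPCODE_SHIFT) & OPCODE_MASK


def is_luuo(opcode: int) -> bool:
    """True for op codes 1-37 (octal), which should trap as LUUOs."""
    return 0o1 <= opcode <= 0o37
```

; The BUGHLT corresponds to the MUUO handler seeing an op code for which
; is_luuo would return true.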
BUG.(HLT,IMPVBD,IPFREE,SOFT,<Internet free space - Attempt to return a buffer with the address smashed>,,<
Cause: The internet 1822 buffer facility has been called to return a buffer
which has a bad address.
>)
BUG.(INF,IMPXBO,IMPDV,SOFT,<IMPDV: Irreg msg buffer overflow>,,<
Cause: The irregular message buffer has overflowed and the monitor has had
to discard an irregular message (message type non zero) from the IMP.
This tends to indicate a possible hardware problem with the AN20 or a problem
with the IMP.
Action: Analysis of other BUGxxx information should shed light on the real
problem.
>)
BUG.(INF,IMPXUT,IMPDV,SOFT,<IMPDV: Received irreg msg with unknown link or type>,<<T1,HOST>,<T2,LINK>,<T3,TYPE>,<T4,SUBTYP>>,<
Cause: The monitor received an irregular message that either could not be
identified or is not supported by the monitor.
Data: HOST - Host number
LINK - Link
TYPE - Error type
SUBTYP - Error subtype.
>)
BUG.(INF,INDCNT,DTESRV,HARD,<DTESRV - Bad indirect count>,<<A,DTENO>>,<
Cause: The DTE was attempting to complete an indirect but the count in
the data being sent does not match the count in the indirect
packet.
Data: DTENO - DTE number.
>,,<DB%NND>)
BUG.(HLT,INGGP0,IPIPIP,SOFT,<GWYINI: Crucial storage missing>,,<
Cause: During TCP/IP initialization the monitor was unable to obtain the
free space needed for gateway message processing. This probably
indicates that internet free space is corrupted.
>)
BUG.(HLT,INGWA1,IPIPIP,SOFT,<INQINI: Free Storage gone>,,<
Cause: During system initialization no internet free space was available
for the initialization of the internet special queues.
>)
BUG.(HLT,INTBUF,IPIPIP,SOFT,<IPIPIP: Packet size smashed when unlocking Internet Buffer>,,<
Cause: The internet buffer locking facility was called to unlock
a buffer which appears to have a smashed local header.
>)
BUG.(HLT,INTDHF,IPIPIP,SOFT,<INTDWN - Impossible failure of NETHSH>,,<
Cause: The internet network hashing routine has failed to find the local
network. This probably indicates that the network hash table is
corrupted.
>)
BUG.(HLT,INTFR0,IPFREE,SOFT,<Internet free space - Block size clobbered>,,<
Cause: An attempt has been made to return a block to Internet free storage but
the block has an illegal size.
>)
BUG.(HLT,INTFR1,IPFREE,SOFT,<Internet free space - Block hash clobbered>,,<
Cause: An attempt has been made to return a block to Internet free storage but
the block's hash code appears to have changed indicating that the block
has been smashed.
>)
BUG.(HLT,INTFR2,IPFREE,SOFT,<Internet free space - Invalid block pointer>,,<
Cause: An attempt has been made to return a block to Internet free storage
with a pointer not between INTFRE and INTFRZ.
>)
BUG.(HLT,INTFR3,IPFREE,SOFT,<Internet free space - Returning free block>,,<
Cause: The internet free space facility is trying to return a free space block
that appears to be free already.
>)
BUG.(HLT,INTFR4,IPFREE,SOFT,<Internet free space - Block size requested too small>,,<
Cause: The internet free space facility made an internal call for a zero or
negative length block. An examination of the dump should include
a trace back on the stack to determine why the caller wants a 0 length
block.
>)
BUG.(HLT,INTFR5,IPFREE,SOFT,<Internet free space - Bad block size request>,,<
Cause: The internet free space facility made an internal call for a block
larger than the maximum supported size. Examination of the dump
should include a trace back on the stack to find the caller. There
is probably a mathematical error in that routine when it calculates
how much freespace to obtain.
>)
BUG.(INF,INTFR6,IPFREE,SOFT,<Internet free space - Free storage exhausted>,,<
Cause: Internet free space is totally exhausted. This could be caused
by someone not returning freespace or an insufficient amount of
Internet freespace available.
Action: If this problem can be reproduced, change this BUGINF to a BUGHLT
and submit an SPR along with an unrun monitor, the dump, and any
other information on reproduction of the problem.
>)
BUG.(HLT,INTFR8,IPFREE,SOFT,<Internet free space - Smashed free block pattern>,,<
Cause: An attempt has been made to return a block to Internet free storage but
the check pattern in the block appears to have changed indicating that
the block has been smashed.
>)
BUG.(HLT,INTFR9,IPFREE,SOFT,<Internet free space - Bad backup pointer>,,<
Cause: An attempt has been made to return a block to Internet free storage but
the block's back pointer does not point to the start of the block,
indicating that the block has been smashed.
>)
BUG.(HLT,INTGW1,IPIPIP,SOFT,<Internet input packet smashed>,,<
Cause: The receive gateway code has determined that a buffer with a
corrupted local header has been passed from a device driver.
>)
BUG.(HLT,INTGW2,IPIPIP,SOFT,<INTLC0: Internet buffer list corrupted>,,<
Cause: The internet bypass send routine has determined that the internet
buffer list is corrupted.
>)
BUG.(HLT,INTMA0,IPIPIP,SOFT,<INTBEG: Can't create Internet fork>,,<
Cause: During system initialization the monitor was not able to create a fork
for the TCP/IP asynchronous process.
Action: It is possible that there are too many job 0 forks running. If you
are running GALAXY under job 0, then we suggest that you refer to
the KL10 System Manager's Guide on how to move GALAXY into a separate
job for itself.
>)
BUG.(CHK,INTMA1,IPIPIP,SOFT,<Internet fork: unexpected interrupt>,,<
Cause: The TCP/IP asynchronous process has received an unexpected software
interrupt.
>)
BUG.(HLT,INTMS1,IPIPIP,SOFT,<INTLKB: Packet size smashed>,,<
Cause: The internet buffer locking facility was called to lock
a buffer which appears to have a smashed local header.
>)
BUG.(HLT,INTNQ1,IPIPIP,SOFT,<EnQ: Item not dequeued>,,<
Cause: The TCP/IP list enqueuing facility was called for an item
which was already queued on a list.
>)
BUG.(HLT,INTNQ2,IPIPIP,SOFT,<DeQ: Item not queued>,,<
Cause: The TCP/IP list dequeuing facility was called for an item
which was not queued on a list.
>)
BUG.(HLT,INTWA0,IPIPIP,SOFT,<RELBFR: Bit table fouled>,,<
Cause: The TCP/IP release wait bit mechanism was called to release a
wait bit and the wait bit facility was determined to be corrupted.
>)
BUG.(CHK,INTWA1,IPIPIP,SOFT,<SETWTB: Wait bit not assigned>,,<
Cause: The TCP/IP wait bit facility was called to set a wait bit and
the wait bit has not been assigned.
>)
BUG.(CHK,INTWA2,IPIPIP,SOFT,<CLRWTB: Wait bit not assigned>,,<
Cause: The TCP/IP wait bit facility was called to reset a wait bit and
the wait bit has not been assigned.
>)
BUG.(CHK,INVDFN,DTESRV,SOFT,<DTEDSP - Bad function specified>,,<
Cause: The caller of DTEDSP supplied an illegal controller function.
>,RTN)
BUG.(HLT,INVDTE,DTESRV,SOFT,<DTEQ - Invalid DTE specified>,,<
Cause: The DTE request queuer for outgoing messages has been given
an invalid (greater than 3) DTE number.
Action: Look at the dump. The stack should indicate the calling
routine.
>)
BUG.(HLT,IOPGF,APRSRV,SOFT,<IO page fail>,<<Q1,IOP>>,<
Cause: An APR interrupt occurred because an interrupt instruction caused a
page failure. This probably indicates that the interrupt instruction
provided by the monitor referenced a page that was not in memory. The
monitor has already checked for a DTE that made the reference and found
none. (However, it is possible for a software bug to cause a DTE to
generate an I/O page fail that the monitor cannot detect.) The monitor
has printed a description of the problem on the CTY.
Note that it has been demonstrated that if the AN20 is ever powered
down on a running system, there is a high probability of an IOPGF
occurring.
Action: If hardware is not suspected as the cause for this BUGHLT, and this
BUGHLT is reproducible, send in an SPR with the dump and instructions
on reproducing the problem.
Data: IOP - IOP word
>)
BUG.(CHK,IPABTO,IPNIDV,SOFT,<ARP buffer timeout>,,<
Cause: The system failed to get a buffer in order to send an Address Resolution
Protocol request or reply over the Ethernet. Furthermore, the last time a
buffer was assigned or released was more than 5 minutes ago. This could
mean one of the following:
(1) Internet freespace is corrupted and freespace has been lost.
(2) Someone owns most of the freespace and is blocked so the freespace
is not being released.
(3) There is a lot of Ethernet traffic, and buffers taken from Internet
free space are being tied up for long periods. In this case, you may
see an occasional IPABTO.
If the IPABTO occurs every 5 minutes instead of once in a long time,
then chances are good that (1) or (2) describes your problem.
Action: It is possible that some systems on the Ethernet are flooding TOPS-20
with messages. If this is the case, take corrective action against the
offending system.
It is also possible to make more freespace available by making the
monitor's host tables smaller. Digital distributes TOPS-20 with a very
large value for NHOSTS. Since the monitor's host tables and internet
free space share the same fixed portion of address space, lowering
NHOSTS results in more free space. NHOSTS must be a prime number.
Each entry in the host table consists of 9 words. This is done by
assigning a smaller value to NHOSTS (in PARxxx) and rebuilding the
monitor. Location MHOSTS contains the negative number of host table
entries in use on the system.
>)
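; Note: a minimal sketch of choosing a smaller NHOSTS, using only the two
; facts stated above: NHOSTS must be prime and each host table entry is 9
; words. The function names are hypothetical, and the assumption that every
; word removed from the host tables becomes internet free space is not
; confirmed by the text.

```python
WORDS_PER_ENTRY = 9     # from the IPABTO text: each host table entry is 9 words


def is_prime(n: int) -> bool:
    """Trial-division primality test, adequate for table-size values."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def largest_prime_at_most(limit: int) -> int:
    """Return the largest prime that does not exceed limit."""
    for candidate in range(limit, 1, -1):
        if is_prime(candidate):
            return candidate
    raise ValueError("no prime at or below limit")


def words_freed(old_nhosts: int, new_nhosts: int) -> int:
    """Host table words released by lowering NHOSTS."""
    return (old_nhosts - new_nhosts) * WORDS_PER_ENTRY
```

; A site that wants roughly 1000 host table entries could call
; largest_prime_at_most(1000) to get a valid NHOSTS, then compare the result
; against the number of entries in use (the negated contents of MHOSTS)
; before rebuilding the monitor.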
BUG.(CHK,IPACFA,IPCIDV,SOFT,<SCA ACCEPT failed>,,<
Cause: The internet SCA interface attempted to accept a connection and
failed. This indicates a problem with SCA or with the CI.
Action: See if there are other BUGINFs/BUGCHKs indicating a CI problem.
If so, have field service look over your system's CI. If not,
then change this BUGCHK to a BUGHLT and submit an SPR.
>)
BUG.(CHK,IPARPE,IPNIDV,SOFT,<ARP did not initialize>,<<T1,ERROR>>,<
Cause: The ethernet ARP (Address resolution protocol) portal did not
initialize. This is probably a temporary resource allocation problem
and may correct itself.
Action: If this BUGCHK becomes chronic, change it to a BUGHLT and submit
an SPR.
Data: ERROR - Error code returned by ARPINI
>)
BUG.(CHK,IPCFKH,IPCF,HARD,<CHKPDD - Could not find local fork handle>,,<
Cause: The fork number waiting for a PID does not exist in the SYSFK
table for this job.
>)
BUG.(CHK,IPCFRK,IPCF,HARD,<PIDINB - Cannot create forks for IPCF>,,<
Cause: PIDINI could not create a fork for pages in transit.
>)
BUG.(CHK,IPCJB0,IPCF,HARD,<PIDINI - Not in context of job 0>,,<
Cause: PIDINI was called by a job other than job 0.
>)
BUG.(CHK,IPCMCN,IPCF,HARD,<MESREC - Message count went negative>,,<
Cause: MESREC was called to copy an IPCF message into user space, but
GETMES found that there were no messages posted for this user.
>)
BUG.(CHK,IPCSOD,IPCF,HARD,<GETMES - Sender's count overly decremented>,,<
Cause: GETMES has discovered that the count of messages from a sender has
gone negative.
>)
BUG.(CHK,IPDWNS,IPNIDV,SOFT,<Datagram was not sent>,<<T1,ERROR>,<SRV,SERVICE>>,<
Cause: The internet ethernet software passed a buffer to NISRV to be sent and
an error was returned. This usually indicates that the KLNI has
changed state and IPNIDV has not yet been notified. This should be
a temporary condition.
Data: ERROR - Error returned from the KLNI
SERVICE - Service requested of NISRV
>)
BUG.(INF,IPETHA,IPNIDV,SOFT,<Ethernet address change, IPNI shutting down>,,<
Cause: The internet ethernet software has received an address changed
callback from NISRV and the ARP protocol is disabled. The ARP
protocol is needed to handle this situation. Internet ethernet
service has been terminated.
>)
BUG.(CHK,IPFBCV,IPNIDV,SOFT,<IPNI: Illegal callback vector>,<<T1,CODE>>,<
Cause: NISRV has passed an illegal or unknown callback vector to the
internet ethernet software.
Action: This is most likely caused by a software bug in NISRV. If this problem
persists, change this to a BUGHLT and submit an SPR.
Data: CODE - Callback code
>)
BUG.(CHK,IPFNSP,IPNIDV,SOFT,<No free space for UN block for ARP>,,<
Cause: The monitor attempted to assign some free space for the storage
of ARP UN blocks and none was available.
>)
BUG.(INF,IPFRAB,IPNIDV,SOFT,<Fewer than required ARP buffers assigned>,,<
Cause: The monitor has assigned a buffer for use by ARP but further analysis
shows that the buffer is not large enough for the number of messages
ARP wants to allow.
Action: If this problem becomes chronic, change the BUGINF to a BUGHLT
and submit an SPR.
>)
BUG.(INF,IPGCOL,IPFREE,SOFT,<Internet free space - Reclaiming internet free space>,,<
Cause: The internet free space facility is performing a garbage collection to
make some space available.
>,,<DB%NND>)
BUG.(INF,IPGHTF,IPNIDV,SOFT,<ARP information not inserted, GHT is full>,,<
Cause: The internet ethernet ARP software has attempted to add another
internet/ethernet address translation to the GHT and the GHT was
full.
Action: This problem could be avoided by increasing NIMAXH.
>)
BUG.(INF,IPHTNI,IPNIDV,SOFT,<Error while reading GHT>,<<T1,ERROR>>,<
Cause: The monitor detected a problem when attempting to read the
SYSTEM:INTERNET-ETHERNET-MAPPINGS.BIN file during initialization.
Action: Check the file integrity of SYSTEM:INTERNET-ETHERNET-MAPPINGS.BIN.
The error code may help determine the problem with the file.
Data: ERROR - Error code returned from NIHINI
>)
BUG.(CHK,IPIBLP,IPNIDV,SOFT,<IPNI input buffer list problem>,<<T1,CNT>,<T2,BFR>,<T3,BCNT>>,<
Cause: The internet ethernet software has attempted to queue an input
buffer and none were available. This indicates that the
internet asynchronous fork is not making buffers available fast enough.
Action: It is possible that a system on the ethernet is flooding TOPS-20
with incoming messages faster than TOPS-20 can process them.
Data: CNT - Count of internet buffers
BFR - First free buffer
BCNT - Number of times this BUGCHK has occurred
>)
BUG.(CHK,IPNARP,IPNIDV,SOFT,<No buffer space for ARP>,,<
Cause: The monitor attempted to assign a buffer for use by ARP and
none were available.
>)
BUG.(CHK,IPNBFA,IPNIDV,SOFT,<No IPNI input buffers available.>,<<T1,BFRCNT>>,<
Cause: The internet ethernet software has attempted to assign an input
buffer and internet free space is exhausted.
Data: BFRCNT - Count of free internet buffers
>)
BUG.(CHK,IPNFRB,IPCIDV,SOFT,<Failed to recycle buffer>,,<
Cause: The internet SCA interface attempted to return a buffer to SCA and
the buffer was refused. This indicates a problem with SCA.
Action: If this problem becomes chronic, change this to a BUGHLT and submit
an SPR.
>)
BUG.(CHK,IPNSPC,IPNIDV,SOFT,<No free space for portal counters>,<<SRV,SERVICE>>,<
Cause: The internet ethernet software has attempted to read the counters for
a portal and no internet free space was available.
Data: SERVICE - Type of service
>)
BUG.(CHK,IPNUNS,IPNIDV,SOFT,<No space for UN blocks>,,<
Cause: The monitor attempted to assign some internet free space for
UN blocks and none was available.
>)
BUG.(CHK,IPPSTE,IPNIDV,SOFT,<Couldn't post a buffer>,<<T1,ERROR>,<T2,SERVICE>>,<
Cause: The internet ethernet software tried to post a receive buffer to NISRV
and received an error return. This usually indicates that the KLNI has
changed state and IPNIDV has not yet been notified. This should be
a temporary condition.
Data: ERROR - Error code returned from NISRV
SERVICE - Service requested of NISRV
>)
BUG.(INF,IPRANF,IPNIDV,SOFT,<Routing address not found>,<<T2,RTE>,<T3,DEST>>,<
Cause: The internet ethernet software has been asked to forward a message
to an internet host whose ethernet address translation is not known.
This situation is normally handled by the ARP protocol which is not
enabled if this BUGINF occurs.
Action: Put the system in question's ethernet address into the system file
SYSTEM:INTERNET-ETHERNET-MAPPINGS.BIN.
Data: RTE - Routing address
DEST - Destination address
>)
BUG.(CHK,IPSCBV,IPCIDV,SOFT,<SCA passed an illegal callback function>,<<T1,CODE>,<T2,ARG1>,<T3,ARG2>>,<
Cause: SCA gave the internet SCA interface a callback with an unknown function
code. This indicates a problem with SCA.
Action: If this BUGCHK happens frequently, then change it to a BUGHLT and
submit an SPR.
Data: CODE - Function code returned from SCA
ARG1 - Argument passed up from SCA
ARG2 - Second argument given by SCA
>)
BUG.(INF,IPTENC,IPNIDV,SOFT,<Received a trailer encapsulated packet>,<<T1,ETH1>,<T2,ETH2>,<T3,PROTO>>,<
Cause: The monitor has received a trailer encapsulation IP datagram
over the Ethernet.
A system on the Ethernet is using trailer encapsulation
formats for the transmission of IP datagrams. TOPS-20 (and
most other operating systems) does not support trailer
encapsulation. Some Berkeley Unix and VMS TCP/IP
implementations support trailer encapsulation in an effort to
enhance their Internet performance characteristics.
Action: Stop the indicated system from using trailer encapsulation.
Its ethernet address can be obtained from the additional data.
Data: ETH1 - First part of ethernet address of host sending packets
ETH2 - Second part of ethernet address of remote host
PROTO - Protocol type received
>)
BUG.(CHK,IPTRLE,IPNIDV,SOFT,<Trailer detection code did not initialize>,<<T1,ERROR>>,<
Cause: The ethernet trailer encapsulation detection portals did not
initialize. This is probably a temporary resource allocation problem.
Action: If this problem does not rectify itself, change this to a BUGHLT
and submit an SPR.
Data: ERROR - Error code returned by TRLINI
>)
BUG.(CHK,IPUNBP,IPNIDV,SOFT,<Free UN block queue problem>,,<
Cause: The monitor attempted to assign a UN block and none were available.
>)
BUG.(HLT,ITNOJC,SCHED,HARD,<Instruction trap not in JSYS context>,<<LSTERR,LSTERR>,<LSTIPC,ERRPC>,<KIMUPC,MUUOPC>>,<
Cause: The illegal instruction trap handler has been entered, but the process
is not in JSYS context.
Action: This BUGHLT is normally seen with bad hardware. If the hardware checks
out OK, send in an SPR along with a dump and indicate how this problem
can be reproduced.
Data: LSTERR - Last error code
ERRPC - PC at which error was generated
MUUOPC - Last MUUO PC
>)
BUG.(CHK,JB0CSH,MEXEC,SOFT,<Job 0 crash>,<<ITFPC,PC>,<LSTERR,LSTERR>>,<
Cause: An unexpected interrupt has occurred in the job 0 fork which
checks system status. The context is reinitialized, and
the process is restarted.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed. In the dump, the stack may be
examined to determine the situation which caused the error.
Data: PC - PC at which error occurred.
LSTERR - Last error code for this fork.
>)
BUG.(CHK,JB0INX,MEXEC,SOFT,<Unexpected interrupt in job 0 during initialization>,<<ITFPC,PC>,<JB0XFR,NEWPC>,<LSTERR,LSTERR>>,<
Cause: An unexpected error has occurred in Job 0 which results
in control being transferred to the default error handler.
This has happened during job 0 initialization. The error handler
attempts to reset the context and continue at the specified
error address, however some system resources may be hung as a result
of locks not being cleared.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed. In the dump, the stack
can be examined to determine what was in progress when
the error occurred.
Data: PC - PC at which error occurred
NEWPC - Address to which control will be transferred after cleanup
LSTERR - Last error code in this fork
>)
BUG.(CHK,JB0NMF,MEXEC,SOFT,<Job 0 - no more forks to start an ACJ>,<<T1,ERR>>,<
Cause: RUNDIR was called to attempt to start up an ACJ fork within the
monitor and this call failed for some reason.
Action: Find out why this call failed. Look at the error code. It is possible
that the system already has too many forks running under job 0. If this
is the case, move some of the programs that are run under job 0 to
another job.
Data: ERR - Error code from RUNDIR
>,,<DB%NND>)
BUG.(HLT,JSBNIC,PAGUTL,SOFT,<SETPPG - JSB not in core>,,<
Cause: The monitor is establishing the context for running a process by making
its per-job area part of the monitor's map. It is about to copy the
SPT entry for the JSB into a special SPT slot. However, the JSB is not
in core.
>)
BUG.(HLT,JSTERR,FREE,SOFT,<JSB stack error>,,<
Cause: This is a problem with the JSB-stack logic; the count for the stack
indicated that free cells were available, however none could be
found.
>)
BUG.(HLT,JTENQE,SCHED,SOFT,<JTENQ with bad NSKED>,,<
Cause: A process has attempted to lock the JSYS trap lock and found it already
locked. The process will enter a queue and dismiss until the lock
becomes available. The BUGHLT occurs because when the process
decrements its NOSKED counter, the value does not go to 0. This means
that the process is still NOSKED or it was OKSKED when it should have
been NOSKED.
>)
BUG.(CHK,KLIOVF,DTESRV,HARD,<DTESRV - KLINIK data base too large>,<<C,PAKSIZ>>,<
Cause: A TO-10 transfer completion interrupt from the DTE under RSX20F
protocol was received that indicates the -11 is sending KLINIK
data but the size of the data field is out of range.
Action: Contact Field Service if the problem persists.
Data: PAKSIZ - Size of KLINIK data field in packet
>)
BUG.(CHK,KLIPAF,MEXEC,SOFT,<Failed to read in CI20 microcode>,<<T1,ERRCOD>>,<
Cause: At system startup we tried to read in the CI20 ucode. Routine KLPUCD
in module PHYKLP got a JSYS error while attempting the read.
Action: This BUG will only appear if the system has a KLIPA and if the GTJFN%
attempt on the file succeeded. So, the I/O on the file failed (OPENF%,
RIN%, SFPTR%, SIN%, CLOSF%) or the monitor was unable to obtain
enough free space to hold the microcode. First, ensure that the file
is not corrupted. If, after this is done, this BUG still persists,
make it dumpable and submit an SPR with the dump and a copy of
MONITR.EXE. If possible, include any known method for reproducing
the problem and/or the state of the system at the time the BUG was
observed.
Data: ERRCOD - Error code returned
>,,<DB%NND>)
BUG.(HLT,KLPBDS,PHYKLP,SOFT,<PHYKLP - Bad dispatch from PHYSIO>,,<
Cause: PHYKLP was called to perform a function of which it is not capable.
>)
BUG.(CHK,KLPBOP,PHYKLP,SOFT,<PHYKLP - Bad op code on command queue>,<<T3,BOC>>,<
Cause: A packet with an illegal op code was found while purging the command
queue.
Action: If this happens frequently, use the SCAMPI and PHYKLP ring buffers to
try to discover how such a packet is being created. It is possible,
though unlikely, that a CI20 microcode or hardware problem can cause
this to happen.
Data: BOC - the bad code
>)
BUG.(HLT,KLPBPK,PHYKLP,SOFT,<PHYKLP - Bad packet>,,<
Cause: The virtual address of the packet is invalid.
>)
BUG.(INF,KLPBRC,PHYKLP,HARD,<Bad READ-COUNTERS>,,<
Cause: TOPS-20 has removed a READ-COUNTERS packet from the response queue and the
reason code field contains an illegal value.
Action: It is possible, though unlikely, that this is a CI20 microcode problem.
It is more likely that there is a CI20 hardware problem. Field Service
should check out the CI20 hardware.
>)
BUG.(INF,KLPCBN,PHYKLP,HARD,<PHYKLP - CBUS not available>,<<T1,CSR>,<T2,LAR>,<T3,EWORD3>,<T4,EWORD4>>,<
Cause: The port was not able to get the CBUS. It timed out waiting for the
CBUS to become available after asking for it.
Action: It is likely that there is a CI20 hardware problem. Field Service should
check out the CI20 and KL10 hardware, paying particular attention to
devices interfaced to the EBUS and CBUS.
Data: CSR - Result of last CONI
LAR - CRAM's last address read
EWORD3 - PCB error word 3
EWORD4 - PCB error word 4
>,,<DB%NND>)
BUG.(INF,KLPCBS,PHYKLP,HARD,<PHYKLP - CBUS parity error>,<<T1,CSR>,<T2,LAR>,<T3,LWORD1>,<T4,LWORD2>>,<
Cause: The CI20 had a CBUS parity error.
Action: It is likely that there is a CI20 hardware problem. Field Service should
check out the CI20 and KL10 hardware, paying particular attention to
devices interfaced to the CBUS.
Data: CSR - Result of last CONI
LAR - CRAM's last address read
LWORD1 - Channel logout word 1
LWORD2 - Channel logout word 2
>,,<DB%NND>)
BUG.(CHK,KLPCGN,PHYKLP,HARD,<PHYKLP - Can't get CI node number>,,<
Cause: The CI20 driver did a READ-REGISTER command to get the CI node number
from the port; it timed out waiting for the reply.
Action: The CI20 port is sick. Call Field Service to check it out.
>,,<DB%NND>)
BUG.(CHK,KLPCKE,PHYKLP,HARD,<PHYKLP - SET-CIRCUIT command error>,<<T1,STATUS>,<T2,FLAGS>,<P2,OPC>>,<
Cause: A SET-CIRCUIT command has failed. TOPS-20 doesn't retry such commands
because it believes the CI port always executes them properly. The
port is probably in trouble.
Action: It is possible, though unlikely, that this is a CI20 microcode problem.
It is more likely that there is a CI20 hardware problem. Field Service
should check out the CI20 hardware.
Data: STATUS - status field of packet
FLAGS - flags field of packet
OPC - op code field of packet
>,,<DB%NND>)
BUG.(INF,KLPCLB,PHYKLP,HARD,<Close buffer function failed>,<<T1,STATUS>>,<
Cause: The CI20 port driver has received a Close Buffer packet with an
error. Look at the status word to find out what the error was.
Action: It is possible, though unlikely, that this is a CI20 microcode problem.
It is more likely that there is a CI20 hardware problem. Field Service
should check out the CI20 hardware.
Data: STATUS - Status word
>)
BUG.(CHK,KLPCRR,PHYKLP,HARD,<PHYKLP - READ-REGISTER command failed>,,<
Cause: There is a problem with the CI20 port. A read-register command failed.
Action: It is possible, though unlikely, that this is a CI20 microcode problem.
It is more likely that there is a CI20 hardware problem. Field Service
should check out the CI20 hardware.
>,,<DB%NND>)
BUG.(INF,KLPCSR,PHYKLP,HARD,<PHYKLP - Grant CSR error>,<<T1,CSR>,<T2,LAR>,<T3,CRAM1>,<T4,CRAM2>>,<
Cause: The port timed out waiting for Grant CSR.
Action: It is likely that there is a CI20 hardware problem. Field Service should
check out the CI20 and KL10 hardware, paying particular attention to
devices interfaced to the EBUS.
Data: CSR - Result of last CONI
LAR - CRAM's last address read
CRAM1 - contents of first CRAM word
CRAM2 - contents of next CRAM word
>,,<DB%NND>)
BUG.(INF,KLPCVC,PHYKLP,SOFT,<PHYKLP - Closed virtual circuit>,<<Q1,NODE>>,<
Cause: TOPS-20 has closed a virtual circuit to a remote node on the CI.
Action: No action is required. This bug is for information only.
Data: NODE - CI node number
>,,<DB%NND>)
BUG.(CHK,KLPDED,PHYKLP,HARD,<PHYKLP - CI20 is dead, no longer trying to start it>,<<T1,ERROR>>,<
Cause: TOPS-20 tried to restart the CI20 and the procedure failed twice in a
row. The CI20 is being left in its current state.
Action: Look back in the console log to see what kind of problems there have
been with the CI20 port. There probably is a hardware problem with the
CI20 and Field Service should be called.
Data: ERROR - error code for failure
>,,<DB%NND>)
BUG.(HLT,KLPDMP,PHYKLP,SOFT,<PHYKLP - Cluster dump requested>,<<Q1,NODE>>,<
Cause: TOPS-20 has closed a virtual circuit to a remote node on the CI. But,
previous to this, this node's cluster dump listener received a
connection indicating that a cluster dump was in progress. So, the
node is crashing upon receipt of the node offline indication. The node
which requested the cluster dump should have crashed with a CFCLDP
BUGHLT.
Action: No action is required. A cluster dump has been requested by another
node in the CFS cluster.
Data: NODE - CI node number to which we lost our VC
>)
BUG.(INF,KLPDPP,PHYKLP,HARD,<PHYKLP - Data path error>,<<T1,CSR>,<T2,LAR>,<T3,WORD0>,<T4,EWORD1>>,<
Cause: The CI20 port's mover/formatter detected a parity error.
Action: It is likely that there is a CI20 hardware problem. Field Service should
check out the CI20 hardware.
Data: CSR - Result of last CONI
LAR - CRAM's last address read
EWORD0 - PCB error word 0
EWORD1 - PCB error word 1
>,,<DB%NND>)
BUG.(INF,KLPDRQ,PHYKLP,HARD,<PHYKLP - CI ucode dump requested>,,<
Cause: TOPS-20 has decided the CI20 microcode needs to be dumped.
Action: No action required. The CI20 microcode is going to be dumped.
>,,<DB%NND>)
BUG.(INF,KLPDUM,PHYKLP,SOFT,<PHYKLP - CI20 ucode dump in progress>,<<T1,DFORK>>,<
Cause: The CI20 is being dumped.
Action: No action required. This bug is for information only.
Data: DFORK - fork doing the dump
>,,<DB%NND>)
BUG.(INF,KLPEBP,PHYKLP,HARD,<PHYKLP - EBUS parity error>,<<T1,CSR>,<T2,LAR>,<T3,EWORD0>>,<
Cause: The port received a data word with bad parity from the KL. This did
not happen while processing a queue.
Action: It is likely that there is a CI20 hardware problem. Field Service should
check out the CI20 and KL10 hardware, paying particular attention to
devices interfaced to the EBUS.
Data: CSR - Result of last CONI
LAR - CRAM's last address read
EWORD0 - PCB error word 0
>,,<DB%NND>)
BUG.(INF,KLPEBQ,PHYKLP,HARD,<PHYKLP - EBUS parity error>,<<T1,CSR>,<T2,LAR>,<T3,EWORD0>,<T4,EWORD1>>,<
Cause: The port received a data word with bad parity from the KL. This
happened while processing a queue.
Action: It is very likely that there is a CI20 hardware problem. Field Service
should check out the CI20 hardware.
Data: CSR - Result of last CONI
LAR - CRAM's last address read
EWORD0 - PCB error word 0
EWORD1 - PCB error word 1
>,,<DB%NND>)
BUG.(INF,KLPELL,PHYKLP,SOFT,<PHYKLP - Error Log Lost>,,<
Cause: Can't get free space to create ERROR.SYS entry for ERROR LOG MESSAGE.
This may happen if there have been a large number of bugs or hardware
errors before this one.
Action: If this bug is reproducible, change it to a BUGHLT, and send in an SPR
with the dump and how to reproduce it.
>)
BUG.(INF,KLPELT,PHYKLP,SOFT,<PHYKLP - Error Log Truncated>,,<
Cause: An ERROR LOG MESSAGE has been truncated in its ERROR.SYS entry. This
may happen if there have been a large number of bugs or hardware errors
before this one.
Action: If this bug is reproducible, change it to a BUGHLT, and send in an SPR
with the dump and how to reproduce it.
>,,<DB%NND>)
BUG.(CHK,KLPEPB,PHYKLP,SOFT,<PHYKLP - Error logging packet is bad>,<<T2,STATS>,<T3,FLAGS>,<P2,OPC>,<Q1,NODE>>,<
Cause: TOPS-20 received an error-logging packet (PPD byte 5) which had an
error. The packet is returned immediately to the free queue. Whatever
information it carried is lost.
Action: The CI node that sent the packet may be having serious problems and
should be checked out. The node number is the final additional data
word.
Data: STATS - Status field of packet
FLAGS - Flags field of packet
OPC - op code field of packet
NODE - node number
>,,<DB%NND>)
BUG.(INF,KLPERE,PHYKLP,HARD,<PHYKLP - EBUS request error>,<<T1,CSR>,<T2,LAR>>,<
Cause: The CI20 port could not get the EBUS. It timed out waiting for the
EBUS to become available.
Action: It is likely that there is a CI20 hardware problem. Field Service should
check out the CI20 and KL10 hardware, paying particular attention to
devices interfaced to the EBUS and CBUS.
Data: CSR - Result of last CONI
LAR - CRAM's last address read
>,,<DB%NND>)
BUG.(INF,KLPERQ,PHYKLP,HARD,<PHYKLP - Empty response queue>,,<
Cause: The monitor got an interrupt to remove a packet from the response
queue. The queue was empty.
>,,<DB%NND>)
BUG.(INF,KLPERR,PHYKLP,SOFT,<PHYKLP - CI packet error>,<<T2,STATS>,<T3,FLAGS>,<T4,OPC>,<Q1,NODE>>,<
Cause: The CI20 driver received a packet (message or named buffer) with an
error. This causes the virtual circuit to be closed. This is usually
caused by a node on CI shutting down.
Action: No action required. This bug is for information only.
Data: STATS - Status field of packet
FLAGS - Flags field of packet
OPC - op code field of packet
NODE - node number
>,,<DB%NND>)
BUG.(INF,KLPFST,PHYKLP,HARD,<PHYKLP - Self test failed>,<<T1,CSR>,<T2,VER>,<T3,LAR>>,<
Cause: The port had a failure during its self test.
Action: It is possible, though very unlikely, that this is a CI20 microcode
problem. It is more likely that there is a CI20 hardware problem. Field
Service should check out the CI20 hardware.
Data: CSR - Result of last CONI
VER - ucode version
LAR - CRAM's last address read
>,,<DB%NND>)
BUG.(INF,KLPHNG,PHYKLP,HARD,<PHYKLP - CI20 is hung>,,<
Cause: The response bit on a REQUEST-ID command was set, and TOPS-20 timed out
waiting for the command to appear on the response queue.
Action: It is possible, though unlikely, that this is a CI20 microcode problem.
It is more likely that there is a CI20 hardware problem. Field Service
should check out the CI20 hardware.
>,,<DB%NND>)
BUG.(HLT,KLPHOG,PHYKLP,HARD,<PHYKLP - Interlock value on queue is too large>,<<T1,QUEUE>,<T2,COUNT>,<T3,OWNER>,<T4,CONTXT>>,<
Cause: The KLIPA driver timed out the interlock, but the value isn't what is
expected.
Action: This BUGHLT generally indicates bad or flakey CI20 hardware. It can also
be caused by a CI20 microcode bug, but that is unlikely. Field Service
should check out the CI20 hardware and the CBUS and EBUS interfaces.
Data: QUEUE - Address of the queue's interlock word
COUNT - Interlock word value in PCB
OWNER - Interlock word address in PCB
CONTXT - 0 if process context, -1 if at interrupt/scheduler level
>)
BUG.(INF,KLPIBN,PHYKLP,SOFT,<PHYKLP - Invalid buffer name>,<<T2,STATS>,<T3,FLAGS>,<T4,OPC>,<Q1,NODE>>,<
Cause: The CI20 driver received a packet (message or named buffer) with an
Invalid Buffer Name error.
Action: If this bug is reproducible, change it to a BUGHLT, and send in an SPR
with the dump and how to reproduce it.
Data: STATS - Status field of packet
FLAGS - Flags field of packet
OPC - op code field of packet
NODE - node number
>,,<DB%NND>)
BUG.(INF,KLPILP,PHYKLP,HARD,<PHYKLP - Software response bit off in locally-generated packet>,<<Q1,NODE>,<T1,STATUS>>,<
Cause: The response queue contains a packet whose op code indicates that the
packet was queued by this host but the software response bit is not
set, and there was no error.
Action: This may be caused by a CI20 microcode bug or by flakey hardware.
Field Service should check the CI20 hardware.
Data: NODE - Node number
STATUS - Status word
>,,<DB%NND>)
BUG.(INF,KLPINP,PHYKLP,HARD,<PHYKLP - Internal port error>,<<T1,CSR>,<T2,VER>,<T3,LAR>>,<
Cause: The port has found an inconsistency in an operation it was performing.
Action: It is possible that this is a CI20 microcode problem. It is more
likely that there is a CI20 hardware problem. Field Service should check
out the CI20 and KL10 hardware.
Data: CSR - Result of last CONI
VER - ucode version
LAR - CRAM's last address read
>,,<DB%NND>)
BUG.(CHK,KLPIPA,PHYKLP,HARD,<PHYKLP - Invalid packet arrival>,<<T1,STATS>,<T2,FLAGS>,<P2,OPC>,<Q1,NODE>>,<
Cause: The CI20 driver has received an application packet from a node with
which it doesn't think it has ever communicated.
Action: This may be caused by a CI20 microcode bug or by flakey hardware.
Field Service should check the CI20 hardware.
Data: STATS - Status field of packet
FLAGS - Flags field of packet
OPC - op code field of packet
NODE - node number
>,,<DB%NND>)
BUG.(CHK,KLPIRD,PHYKLP,SOFT,<PHYKLP - Invalid remotely-generated data request>,<<T1,STATS>,<T2,FLAGS>,<P2,OPC>,<Q1,NODE>>,<
Cause: The CI20 driver received an error-free, remotely-generated packet with
opcode 10, 11, 12, or 20. This is illegal.
Action: It is unlikely, but possible, that this is caused by a CI20 microcode bug
or bad CI20 hardware. The node specified in the additional data should
be checked out.
Data: STATS - Status field of packet
FLAGS - Flags field of packet
OPC - op code field of packet
NODE - node number
>,,<DB%NND>)
BUG.(INF,KLPIRP,PHYKLP,HARD,<PHYKLP - Software response bit on in remotely-generated packet>,<<Q1,NODE>,<T1,STATUS>>,<
Cause: The response queue contains a packet whose op code indicates that the
packet was queued by a remote host but the software response bit is
set.
Action: This may be caused by a CI20 microcode bug or by flakey hardware.
Field Service should check the CI20 hardware.
Data: NODE - Node number
STATUS - Status word
>,,<DB%NND>)
BUG.(CHK,KLPLBF,PHYKLP,HARD,<PHYKLP - Loopback failed>,<<T2,STATS>,<T3,FLAGS>,<P2,OPC>,<T4,CSR>>,<
Cause: The CI20 driver has tried to send a loopback packet to the CI star
coupler and it had a non-path error.
Action: It is likely that there is a problem with the CI20 hardware, CI cables,
or CI star coupler which should be checked by Field Service.
Data: STATS - Status field of packet
FLAGS - Flags field of packet
OPC - op code field of packet
CSR - result of the last CONI
>,,<DB%NND>)
BUG.(INF,KLPLOA,PHYKLP,SOFT,<PHYKLP - CI20 ucode loaded>,<<T1,EDIT>>,<
Cause: BS:<SYSTEM>IPALOD.EXE was run or the monitor initiated the reload.
Action: No action required. This bug is for information only.
Data: EDIT - edit number of microcode
>,,<DB%NND>)
BUG.(INF,KLPMBS,PHYKLP,HARD,<PHYKLP - MBUS error>,<<T1,CSR>,<T2,LAR>,<T3,CRAM1>,<T4,CRAM2>>,<
Cause: Multiple MBUS drivers simultaneously accessing MBUS.
Action: It is likely that there is a CI20 hardware problem. Field Service should
check out the CI20 and KL10 hardware.
Data: CSR - Result of last CONI
LAR - CRAM's last address read
CRAM1 - contents of first CRAM word
CRAM2 - contents of next CRAM word
>,,<DB%NND>)
BUG.(INF,KLPMCE,PHYKLP,HARD,<Received an MCNF or an MDATREC with an error>,<<T1,NODE>,<T2,STATUS>>,<
Cause: The CI20 port driver has received a maintenance confirm or maintenance
data received packet with an error.
Action: Check the error code in the status word for the type of error. It is
possible, though unlikely, that this is a CI20 microcode problem. It
is more likely that there is a CI20 hardware problem. Field Service
should check out the CI20 hardware.
Data: NODE - The node number of the CI node.
STATUS - The status word of the packet.
>,,<DB%NND>)
BUG.(CHK,KLPMCR,PHYKLP,HARD,<Received an MCNF or an MDATREC from CI20 when not expected>,<<T1,NODE>>,<
Cause: Either a maintenance function timed out, or the CI20 gave us a spurious
maintenance confirm or maintenance data received packet with an error.
Action: It is possible, though unlikely, that this is a CI20 microcode problem.
It is more likely that there is a CI20 hardware problem. Field Service
should check out the CI20 hardware.
Data: NODE - Node number of CI node that sent the MCNF or MDATREC.
>,,<DB%NND>)
BUG.(HLT,KLPMTY,PHYKLP,SOFT,<PHYKLP - Queue is empty>,,<
Cause: We want to trace the pointers on a queue but the queue is empty.
>)
BUG.(CHK,KLPMVW,PHYKLP,SOFT,<CI20 microcode version wrong>,<<T1,MVER>,<T2,UVER>>,<
Cause: The CI20 has returned a microcode version that is different than the
version the monitor thought it loaded.
Action: There is probably a problem with BS:<SYSTEM>IPALOD.EXE. Find a correct
IPALOD, put it in BS:<SYSTEM>, and reload the system.
Data: MVER - version loaded by monitor
UVER - version returned by microcode
>,,<DB%NND>)
BUG.(CHK,KLPNDE,PHYKLP,HARD,<PHYKLP - Packet with bad node number>,<<T2,STATS>,<T3,FLAGS>,<T4,OPC>,<Q1,NODE>>,<
Cause: CI20 driver received a packet with an invalid node number. The packet
has not been returned to a free queue.
Action: This is usually seen with bad or flakey CI20 hardware. It can also be
caused by a CI20 microcode bug, but that is unlikely. Field Service
should thoroughly check out the CI20 hardware.
Data: STATS - Status field of packet
FLAGS - Flags field of packet
OPC - op code field of packet
NODE - node number
>,,<DB%NND>)
BUG.(INF,KLPNDG,PHYKLP,SOFT,<PHYKLP - No datagram buffer>,,<
Cause: TOPS-20 tried to remove a buffer from the datagram free queue but the
queue was empty.
>,,<DB%NND>)
BUG.(HLT,KLPNDM,PHYKLP,SOFT,<PHYKLP - CI20 ucode needs dumping>,,<
Cause: The CI20 port microcode needs to be dumped but there is a timeout
waiting for it to get started.
Action: It could be that there is something blocking job 0. It is more likely
that there is a CI20 hardware problem. Field Service should check out
the CI20 hardware.
>)
BUG.(CHK,KLPNEN,PHYKLP,SOFT,<PHYKLP - CI20 not enabled>,,<
Cause: TOPS-20 believes the CI20 should be enabled but has found otherwise.
Action: It is possible, though unlikely, that this is a CI20 microcode problem.
It is more likely that there is a CI20 hardware problem. Field Service
should check out the CI20 hardware.
>,,<DB%NND>)
BUG.(INF,KLPNMG,PHYKLP,SOFT,<PHYKLP - No message buffer>,,<
Cause: TOPS-20 tried to remove a buffer from the message free queue but the
queue was empty.
>,,<DB%NND>)
BUG.(CHK,KLPNOA,PHYKLP,SOFT,<PHYKLP - Remote port is not answering>,<<Q1,NODE>>,<
Cause: The remote node is ACKing REQUEST-IDs but not sending IDRECs.
Action: The remote system needs to be investigated.
Data: NODE - Remote CI node number
>,,<DB%NND>)
BUG.(HLT,KLPNOD,PHYKLP,SOFT,<PHYKLP - Can't stock datagram free queue>,,<
Cause: The CALL SC.ALD failed. SCA can't handle the request.
>)
BUG.(HLT,KLPNOM,PHYKLP,SOFT,<PHYKLP - Physical address doesn't match>,,<
Cause: The physical address of a packet is stored in the packet. The physical
address of this packet doesn't match what is in the packet.
Action: This BUGHLT generally indicates a software problem. Diagnosis of the
problem is extremely difficult without the SCA ring buffer code
enabled and KLPDBG enabled.
>)
BUG.(HLT,KLPNRL,PHYKLP,HARD,<PHYKLP - CI20 ucode needs reloading>,,<
Cause: The CI20 port microcode needs to be reloaded but there is a time out
waiting for it to get started.
Action: It could be that there is something blocking job 0. It is more likely
that there is a CI20 hardware problem. Field Service should check out
the CI20 hardware.
>)
BUG.(HLT,KLPNSB,PHYKLP,SOFT,<PHYKLP - No system block at OPENVC>,,<
Cause: OPENVC was called with a system block address of 0.
>)
BUG.(HLT,KLPONC,PHYKLP,SOFT,<PHYKLP - Trying to open a VC which isn't closed>,,<
Cause: OPENVC was called when the VC was not closed.
>)
BUG.(CHK,KLPOPC,PHYKLP,SOFT,<PHYKLP - Packet with bad op-code>,<<T2,STATS>,<T3,FLAGS>,<P2,OPC>,<T4,NODE>>,<
Cause: CI20 driver received a packet with an invalid op-code. The packet has
not been returned to a free queue.
Action: This may be caused by a CI20 microcode bug or by flakey hardware.
Field Service should check the CI20 hardware.
Data: STATS - Status field of packet
FLAGS - Flags field of packet
OPC - op code field of packet
NODE - node number
>,,<DB%NND>)
BUG.(INF,KLPOVC,PHYKLP,SOFT,<PHYKLP - Opened virtual circuit>,<<Q1,NODE>>,<
Cause: TOPS-20 has opened a virtual circuit to a remote node on the CI.
Action: No action is required, as this is an information only BUG.
Data: NODE - CI node number we just opened the virtual circuit to
>,,<DB%NND>)
BUG.(HLT,KLPPCB,PHYKLP,SOFT,<PHYKLP - PCB is corrupted>,,<
Cause: During the once-a-second check of the CI20 PCB, either the PCB's own
address in the PCB is incorrect or the message size in the PCB is
incorrect.
Action: It is possible, though unlikely, that this is a CI20 microcode problem.
It is more likely that there is a CI20 hardware problem. Field Service
should check out the CI20 hardware.
>)
BUG.(INF,KLPPIA,PHYKLP,HARD,<PHYKLP - CI20 has lost its PIA>,<<T3,CSR>>,<
Cause: During the once-a-second check, it has been discovered that the CI20 no
longer knows its interrupt assignment.
Action: The monitor resets, reloads, and attempts to restart the CI20. There
is a CI20 hardware problem. Field Service should check out the CI20
hardware.
Data: CSR - the result of the last CONI
>,,<DB%NND>)
BUG.(INF,KLPPLS,PHYKLP,SOFT,<PHYKLP - Packets lost>,,<
Cause: After an unplanned CRAM parity error we can't reliably believe the
queues so we have thrown everything away and started over. This may
cause SCA to do some complaining. This bug should have been preceded
by a KLPUCP BUGCHK.
Action: See the KLPUCP BUGCHK. This bug is for information only.
>,,<DB%NND>)
BUG.(CHK,KLPPPD,PHYKLP,SOFT,<PHYKLP - Packet with bad PPD byte>,<<T2,STATS>,<T4,OPC>,<T1,NODE>,<P4,PPD>>,<
Cause: The CI20 driver received a packet with an invalid PPD byte. The packet
has not been returned to a free queue.
Action: This may be caused by a CI20 microcode bug or by flakey hardware.
Field Service should check the CI20 hardware.
Data: STATS - Status field of packet
OPC - op code field of packet
NODE - node number
PPD - PPD byte
>,,<DB%NND>)
BUG.(INF,KLPPPE,PHYKLP,HARD,<PHYKLP - PLI parity error>,<<T1,CSR>,<T2,LAR>>,<
Cause: The port detected bad parity on a PLI BUS read.
Action: It is likely that there is a CI20 hardware problem. Field Service should
check out the CI20 hardware.
Data: CSR - Result of last CONI
LAR - CRAM's last address read
>,,<DB%NND>)
BUG.(HLT,KLPPRI,PHYKLP,SOFT,<PHYKLP - Invalid priority>,,<
Cause: KLPSND was called with an invalid priority.
>)
BUG.(INF,KLPRAE,PHYKLP,HARD,<PHYKLP - Spurious receive attention error>,<<T1,CSR>,<T2,VER>,<T3,LAR>,<T4,REG>>,<
Cause: The port found ATTENTION up but the packet was not totally stored in
the receive buffers.
Action: It is likely that there is a CI20 hardware problem. Field Service should
check out the CI20 hardware.
Data: CSR - Result of last CONI
VER - ucode version
LAR - CRAM's last address read
>,,<DB%NND>)
BUG.(CHK,KLPRCE,PHYKLP,HARD,<PHYKLP - READ-COUNTERS command failed>,,<
Cause: There is a problem with the CI20 port. The read-counters command failed.
Action: It is possible, though unlikely, that this is a CI20 microcode problem.
It is more likely that there is a CI20 hardware problem. Field Service
should check out the CI20 hardware.
>,,<DB%NND>)
BUG.(INF,KLPRRQ,PHYKLP,HARD,<PHYKLP - CI20 ucode reload requested>,,<
Cause: TOPS-20 has decided the CI20 microcode needs to be reloaded.
Action: No action required. The CI20 microcode is reloaded.
>,,<DB%NND>)
BUG.(INF,KLPRSF,PHYKLP,HARD,<PHYKLP - CI restart failed>,<<T1,ERROR>>,<
Cause: TOPS-20 tried to restart the CI20 and the procedure failed.
Action: Look back in the console log to see what kind of problems there have
been with the CI20 port. There probably is a hardware problem with the
CI20 and Field Service should be called.
Data: ERROR - error code for failure
>,,<DB%NND>)
BUG.(INF,KLPRSH,PHYKLP,SOFT,<PHYKLP - Received shutdown message>,<<Q1,NODE>>,<
Cause: A CI node has notified our node that it is closing our virtual circuit.
The additional data specifies which node has notified us.
Action: No action is required, as this is an information only BUG.
Data: NODE - Node number
>,,<DB%NND>)
BUG.(INF,KLPSCE,PHYKLP,HARD,<PHYKLP - Spurious channel error>,<<T1,CSR>,<T4,VER>,<T2,LAR>,<T3,LWORD1>>,<
Cause: Channel Error was asserted but no channel error information was in
the channel logout word.
Action: It is possible, though very unlikely, that this is a CI20 microcode
problem. It is more likely that there is a CI20 hardware problem. Field
Service should check out the CI20 hardware.
Data: CSR - Result of last CONI
VER - ucode version
LAR - CRAM's last address read
LWORD1 - Channel logout word 1
>,,<DB%NND>)
BUG.(CHK,KLPSCR,PHYKLP,HARD,<PHYKLP - SET-CIRCUIT command received>,<<T1,STATUS>,<T2,FLAGS>,<P2,OPC>>,<
Cause: TOPS-20 has found an error free SET-CIRCUIT command on the response
queue. The CI port has done something wrong because the response bit
is never set so this packet should not be seen.
Action: This may be a CI20 microcode bug or a CI20 hardware problem. Field
Service should check out the CI20 hardware.
Data: STATUS - status field of packet
FLAGS - flags field of packet
OPC - op code field of packet
>,,<DB%NND>)
BUG.(CHK,KLPSDM,PHYKLP,SOFT,<PHYKLP - CI20 ucode still dumping>,,<
Cause: The CI20 port microcode is being dumped and there is a time out waiting
for it to complete.
Action: It could be that there is something blocking job 0. It is more likely
that there is a CI20 hardware problem. Field Service should check out
the CI20 hardware.
>,,<DB%NND>)
BUG.(CHK,KLPSRL,PHYKLP,SOFT,<PHYKLP - CI20 ucode still reloading>,,<
Cause: The CI20 port microcode is being reloaded and there is a time out
waiting for it to complete. A KLPNRL BUGHLT happens if it doesn't
complete soon.
Action: It could be that there is something blocking job 0. It is more likely
that there is a CI20 hardware problem. Field Service should check out
the CI20 hardware.
>,,<DB%NND>)
BUG.(INF,KLPSRM,PHYKLP,SOFT,<PHYKLP - Cannot start remote node>,<<T1,HOST NODE>,<T2,RESET NODE>,<Q1,REMOTE NODE>>,<
Cause: This node wanted to start a remote HSC node, but it is not the node
that did the last RESET REMOTE on the remote HSC. This can happen with
multiple KLs on the same CI as an HSC.
Action: No action is required, as this is an information only BUG.
Data: HOST NODE - The node number of this system
RESET NODE - The node number that last reset the remote node
REMOTE NODE - The remote's node number.
>,,<DB%NND>)
BUG.(INF,KLPSTP,PHYKLP,SOFT,<PHYKLP - CI20 stopped>,,<
Cause: TOPS-20 has stopped the CI20.
Action: No action required. This bug is for information only.
>,,<DB%NND>)
BUG.(INF,KLPSTR,PHYKLP,HARD,<PHYKLP - CI20 started>,,<
Cause: TOPS-20 has restarted the CI20.
Action: No action required. This BUG is for information only.
>,,<DB%NND>)
BUG.(INF,KLPSWC,PHYKLP,HARD,<PHYKLP - Short word count>,<<T1,CSR>,<T2,LAR>,<T3,LWORD1>,<T4,LWORD2>>,<
Cause: The port detected a short word count CBUS channel error.
Action: It is likely that there is a CI20 hardware problem. Field Service should
check out the CI20 and KL10 hardware, paying particular attention to
devices interfaced to the EBUS and CBUS.
Data: CSR - Result of last CONI
LAR - CRAM's last address read
LWORD1 - Channel logout word 1
LWORD2 - Channel logout word 2
>,,<DB%NND>)
BUG.(INF,KLPSWO,PHYKLP,SOFT,<PHYKLP - Received a START when VC was open>,<<Q1,NODE>>,<
Cause: TOPS-20 has closed a virtual circuit because it received a START packet
while the circuit was open. This happens when a CI node
(specified in the additional data) crashes and sends a START to this
system. For example, this is seen when an HSC50 breaks the connection
and then reconnects to this TOPS-20 system. The VC is reopened shortly.
Action: No action is required, as this is an information only BUG.
Data: NODE - node number
>,,<DB%NND>)
BUG.(INF,KLPTAE,PHYKLP,HARD,<PHYKLP - Spurious transmit attention error>,<<T1,CSR>,<T2,VER>,<T3,LAR>,<T4,REG>>,<
Cause: The port found ATTENTION up before the Transmit Packet function
completed.
Action: It is likely that there is a CI20 hardware problem. Field Service should
check out the CI20 hardware.
Data: CSR - Result of last CONI
VER - ucode version
LAR - CRAM's last address read
REG - Transmit status register
>,,<DB%NND>)
BUG.(CHK,KLPTIM,PHYKLP,HARD,<PHYKLP - Timed out waiting for queue interlock>,<<T1,QUEUE>,<T2,COUNT>,<T3,OWNER>,<T4,CONTXT>>,<
Cause: The KLIPA driver timed out trying to get the interlock for a queue.
The KLIPA microcode should never have the lock this long.
Action: If this problem occurs often or can be reproduced, there could be a
problem with the CI20 microcode or hardware. Field Service should
verify that the CI20 is healthy.
Data: QUEUE - Address of the queue's interlock word
COUNT - Interlock word value in PCB
OWNER - Interlock word address in PCB
CONTXT - 0 if process context, -1 if at interrupt/scheduler level
>)
BUG.(INF,KLPTMO,PHYKLP,HARD,<PHYKLP - Transmitter timeout>,<<T1,CSR>,<T2,REG>,<T3,VER>>,<
Cause: Someone is hogging the CI. The LINK module could not transmit over the
CI due to carrier detect being continuously asserted.
Action: It is possible that there is a problem with this system's CI20 link
module. It is more likely that there is some other system on the CI
with broken hardware. Field Service should check out all systems
attached to the CI.
Data: CSR - Result of last CONI
REG - Transmit status register
VER - ucode version
>,,<DB%NND>)
BUG.(INF,KLPTPE,PHYKLP,HARD,<PHYKLP - Transmit buffer parity error>,<<T1,CSR>,<T2,REG>,<T3,VER>>,<
Cause: A bit was dropped or picked up in the TRANSMIT BUFFER or the TRANSMIT
DATA BUS.
Action: It is likely that there is a CI20 hardware problem. Field Service should
check out the CI20 and KL10 hardware.
Data: CSR - Result of last CONI
REG - Transmit status register
VER - ucode version
>,,<DB%NND>)
BUG.(INF,KLPUCP,PHYKLP,HARD,<PHYKLP - Unplanned CRAM parity error>,<<T1,CSR>,<T2,LAR>,<T3,CRAM1>,<T4,CRAM2>>,<
Cause: The port had an unplanned CRAM parity error.
Action: It is possible, though very unlikely, that this is a CI20 microcode
problem. It is more likely that there is a CI20 hardware problem. Field
Service should check out the CI20 hardware.
Data: CSR - Result of last CONI
LAR - CRAM's last address read
CRAM1 - contents of first CRAM word
CRAM2 - contents of next CRAM word
>,,<DB%NND>)
BUG.(CHK,KLPUMV,PHYKLP,SOFT,<Unexpected CI20 microcode version>,<<T1,AVER>,<KLPVWD,EVER>>,<
Cause: The monitor has an assembled-in value of the CI20 ucode which it is
expecting to load. The ucode just loaded has a lower version number.
Action: Find a proper IPALOD.EXE and put it into BS:<SYSTEM>.
Data: AVER - actual version loaded
EVER - expected version
>,,<DB%NND>)
BUG.(INF,KLPUPC,PHYKLP,HARD,<PHYKLP - Undefined planned CRAM parity error>,<<T1,CSR>,<T2,LAR>,<T3,CRAM1>,<T4,CRAM2>>,<
Cause: The port had a planned CRAM parity error but it is not defined.
Action: This problem can be caused by a CI20 microcode problem. It is more
likely that there is a CI20 hardware problem. Field Service should check
out the CI20 and KL10 hardware, paying particular attention to devices
interfaced to the EBUS and CBUS.
Data: CSR - Result of last CONI
LAR - CRAM's last address read
CRAM1 - contents of first CRAM word
CRAM2 - contents of next CRAM word
>)
BUG.(CHK,KLPVIR,PHYKLP,HARD,<PHYKLP - Virtual address in packet is wrong>,<<T1,QUEUE>,<T2,VMA>,<T3,PMA>,<T4,FLINK>>,<
Cause: The virtual address of a packet is incorrect. This indicates some sort
of inconsistency in one of the queues.
Action: The CI20 port is having microcode problems or may be going bad and
should be examined by Field Service.
Data: QUEUE - Address of the queue's interlock word
VMA - Contents of the software word in the packet
PMA - Physical address of the word pointed to
FLINK - FLINK word from PCB
>)
BUG.(INF,KLPWAB,PHYKLP,HARD,<PHYKLP - CI wire A has gone from good to bad>,<<T2,STATS>,<T3,FLAGS>,<P2,OPC>,<T4,CSR>>,<
Cause: A loopback packet which previously succeeded has failed on wire A.
Action: It is likely that there is a problem with the CI20 hardware, CI cables,
or CI star coupler which should be checked by Field Service.
Data: STATS - Status field of packet
FLAGS - Flags field of packet
OPC - op code field of packet
CSR - result of the last CONI
>,,<DB%NND>)
BUG.(INF,KLPWAG,PHYKLP,HARD,<PHYKLP - CI wire A has gone from bad to good>,,<
Cause: A loopback packet which previously failed has successfully returned
on wire A. This BUGINF is usually preceded by a KLPWAB.
Action: It is likely that there is a problem with the CI20 hardware, CI cables,
or CI star coupler which should be checked by Field Service.
>,,<DB%NND>)
BUG.(INF,KLPWBB,PHYKLP,HARD,<PHYKLP - CI wire B has gone from good to bad>,<<T2,STATS>,<T3,FLAGS>,<P2,OPC>,<T4,CSR>>,<
Cause: A loopback packet which previously succeeded has failed on wire B.
Action: It is likely that there is a problem with the CI20 hardware, CI cables,
or CI star coupler which should be checked by Field Service.
Data: STATS - Status field of packet
FLAGS - Flags field of packet
OPC - op code field of packet
CSR - result of the last CONI
>,,<DB%NND>)
BUG.(INF,KLPWBG,PHYKLP,HARD,<PHYKLP - CI wire B has gone from bad to good>,,<
Cause: A loopback packet which previously failed has successfully returned on
wire B. This BUGINF is usually preceded by a KLPWBB.
Action: It is likely that there is a problem with the CI20 hardware, CI cables,
or CI star coupler which should be checked by Field Service.
>,,<DB%NND>)
BUG.(INF,KLPWIR,PHYKLP,HARD,<PHYKLP - Excessive CI wire transitions>,,<
Cause: CI loopback packets are alternately succeeding and failing at a rapid
rate.
Action: Call Field Service. The most likely cause of this is a bad CI link
module. It is possible that there is some other problem with the CI20
hardware, CI cables, or CI star coupler which should be checked by
Field Service.
>,,<DB%NND>)
BUG.(HLT,KNIADE,PHYKNI,SOFT,<PHYKNI - Multicast address disable error>,,<
Cause: In the NIA20 driver, NIDPT got an error from NIDRA when attempting to
disable a multicast address that was supposedly enabled. This is a
NIA20 microcode problem or a monitor software problem.
>)
BUG.(CHK,KNIADR,PHYKNI,SOFT,<Monitor address does not match NIA20 address>,<<T1,KLNHIO>,<T2,KLNLO>,<T3,MONHIO>,<T4,MONLO>>,<
Cause: PHYKNI just read the Ethernet address from the KLNI and found it
different from the shadow copy stored in the monitor. The port is
shut down.
Action: It is possible that there is a monitor software or NIA20 microcode
problem. If this BUGCHK continues and is reproducible, change this
BUGCHK to a BUGHLT and send in an SPR along with the dump and how to
reproduce the problem.
Data: KLNHIO & KLNLO - KLNI's copy of the Ethernet address
MONHIO & MONLO - Monitor's copy of the Ethernet address
>,,<DB%NND>)
BUG.(HLT,KNIBFC,PHYKNI,SOFT,<PHYKNI - Illegal NISRV function code>,<<T1,FUNC>>,<
Cause: NISRV called PHYKNI with a bad function code. The code is in T1.
Data: FUNC - Illegal function code
>)
BUG.(HLT,KNIBLV,PHYKNI,SOFT,<PHYKNI - Buffer length violation>,,<
Cause: The BSD chain contained inconsistent length information for the
transmit or receive command that caused it.
>)
BUG.(HLT,KNIBTB,PHYKNI,SOFT,<PHYKNI - Bad BYTAB entry>,<<P2,ENTRY>,<T1,BYTPTR>>,<
Cause: BYTAB has been corrupted.
Data: ENTRY - The corrupted entry
BYTPTR - The byte pointer used to fetch this entry.
>,RTN)
BUG.(INF,KNICAE,PHYKNI,HARD,<PHYKNI - NIA20 got CBUS available timeout>,<<P1,CSR>,<T1,ADDR>,<T3,LOGOU1>,<T4,LOGOU2>>,<
Cause: The NIA20 was unable to acquire control of the CBUS within 50
microseconds from the start of a CBUS request.
The NIA20 will be dumped and restarted by KNILDR.
Action: Call Field Service and have the NIA20 and KL10 hardware checked.
Data: CSR - CONI KNI,
ADDR - Address of parity error
LOGOU1 - Channel logout word 1
LOGOU2 - Channel logout word 2
>,,<DB%NND>)
BUG.(INF,KNICCF,PHYKNI,HARD,<PHYKNI - Carrier check failed>,<<T1,TDR>>,<
Cause: The NIA20 module did not detect its own carrier while it was
transmitting.
Action: The usual cause is that the Ethernet transceiver cable has come loose.
Make sure that both ends of the transceiver cable are securely
fastened. If this BUGINF is seen on several systems at once, check the
Ethernet cable to see if it is properly terminated. If all of these
check out, the NIA20 hardware should be checked by Field Service.
Data: TDR - TDR value
>,,<DB%NND>)
BUG.(CHK,KNICDF,PHYKNI,HARD,<PHYKNI - Collision detect check failed>,,<
Cause: The H4000 did not assert the collision detect signal shortly after
completion of a transmission. (This signal is also known as the
"Heartbeat" of the H4000).
Action: Check the transceiver cable and make sure both ends are securely
fastened to both the H4000 and NIA20. Field Service should also check
the H4000 or DELNI hardware.
>,,<DB%NND>)
BUG.(CHK,KNICFF,PHYKNI,SOFT,<PHYKNI - Cannot reload the NIA20>,<<T1,ERROR>>,<
Cause: The monitor was unable to find SYSTEM:KNILDR.EXE when it attempted to
reload or dump the port.
Action: Make sure that KNILDR.EXE is installed in SYSTEM:.
Data: ERROR - Error code from RUNDII (Probably a JSYS error).
>,,<DB%NND>)
BUG.(HLT,KNICFP,NISRV,SOFT,<Cannot find portal block during close.>,<<PR,PR>>,<
Cause: NISRV was unable to find a portal block on the portal block list during
a close portal callback.
Data: PR - Portal block address
>)
BUG.(INF,KNICPE,PHYKNI,HARD,<PHYKNI - NIA20 detected CBUS parity error>,<<P1,CSR>,<T1,ADDR>,<T3,LOGOU1>,<T4,LOGOU2>>,<
Cause: The NIA20 detected bad parity for data that was read over the CBUS.
The NIA20 is dumped and restarted by KNILDR.
Action: Call Field Service and have the NIA20 and KL10 hardware checked.
Data: CSR - CONI KNI,
ADDR - Address of parity error
LOGOU1 - Channel logout word 1
LOGOU2 - Channel logout word 2
>,,<DB%NND>)
BUG.(INF,KNIDM1,NISRV,SOFT,<KNIDMD continued>,<<T1,PROTO>>,<
Cause: Additional data for KNIDMD.
Action: See KNIDMD.
Data: PROTO - Protocol type
>,,<DB%NND>)
BUG.(INF,KNIDMD,NISRV,SOFT,<Portal not enabled for this multicast>,<<P1,HIDST>,<P2,LODST>,<T3,HISRC>,<T4,LOSRC>>,<
Cause: A portal received a multicast frame on an address for which it wasn't
enabled. The frame is discarded, and the buffer is re-used.
A KNIDM1 BUGINF follows with the protocol type.
Action: No immediate action is required. However, if this BUGINF persists,
examine the additional data and try to determine by the protocol type
and source and destination addresses what is going on. It is possible,
but very unlikely, that some sort of NIA20 hardware problem is at
fault.
Data: HIDST - High order destination address
LODST - Low order destination address
HISRC - High order source address
LOSRC - Low order source address
>)
BUG.(CHK,KNIDOV,PHYKNI,HARD,<PHYKNI - NIA20 buffer overrun>,,<
Cause: The NIA20 hardware did not have enough free space to store an
incoming datagram.
Action: It is likely that there is a NIA20 hardware problem. Field Service
should check out the NIA20.
>,,<DB%NND>)
BUG.(INF,KNIDPE,PHYKNI,HARD,<PHYKNI - NIA20 data path error>,<<P1,CSR>,<T1,ADDR>,<T3,LOGOU1>,<T4,LOGOU2>>,<
Cause: The threshold (5) for data mover parity errors was exceeded.
The NIA20 is dumped and restarted by KNILDR.
Action: Call Field Service and have the NIA20 hardware checked.
Data: CSR - CONI KNI,
ADDR - Address of parity error
LOGOU1 - Channel logout word 1
LOGOU2 - Channel logout word 2
>,,<DB%NND>)
BUG.(INF,KNIEPE,PHYKNI,HARD,<PHYKNI - NIA20 detected EBUS parity error>,<<P1,CSR>,<T1,ADDR>,<T3,LOGOU1>,<T4,LOGOU2>>,<
Cause: The NIA20 received a word with bad parity from the EBUS.
The NIA20 is dumped and restarted by KNILDR.
Action: Call Field Service and have the NIA20 and KL10 hardware checked.
Data: CSR - CONI KNI,
ADDR - Address of parity error
LOGOU1 - Channel logout word 1
LOGOU2 - Channel logout word 2
>,,<DB%NND>)
BUG.(INF,KNIERE,PHYKNI,HARD,<PHYKNI - NIA20 got EBUS request timeout>,<<P1,CSR>,<T1,ADDR>,<T3,LOGOU1>,<T4,LOGOU2>>,<
Cause: The NIA20 was unable to get control of the EBUS within 20 milliseconds
after making a PI request.
The NIA20 is dumped and restarted by KNILDR.
Action: Call Field Service and have the NIA20 and KL10 hardware checked.
Data: CSR - CONI KNI,
ADDR - Address of parity error
LOGOU1 - Channel logout word 1
LOGOU2 - Channel logout word 2
>,,<DB%NND>)
BUG.(HLT,KNIERP,NISRV,SOFT,<Illegal error return from PHYKNI>,<<T1,ERROR>>,<
Cause: NISRV got an error return from PHYKNI while processing a state
change callback. The error code (one of the UNxyz% errors) is in
T1.
Data: ERROR - Error returned from PHYKNI
>,RTN)
BUG.(INF,KNIFBE,PHYKNI,HARD,<PHYKNI - NIA20 free buffer list parity error>,<<P1,CSR>,<T1,ADDR>,<T3,LOGOU1>,<T4,LOGOU2>>,<
Cause: The NIA20 receive status indicated that there was a free buffer list
parity error.
The NIA20 is dumped and restarted by KNILDR.
Action: Call Field Service and have the NIA20 hardware checked.
Data: CSR - CONI KNI,
ADDR - Address of parity error
LOGOU1 - Channel logout word 1
LOGOU2 - Channel logout word 2
>,,<DB%NND>)
BUG.(INF,KNIFQE,PHYKNI,SOFT,<PHYKNI - Free Queue Error>,,<
Cause: The NIA20 received a packet for a protocol, and there were no free
packets available for that protocol type.
Action: Determine which protocol type ran out of packets, and fix the driver
for that protocol type.
>)
BUG.(INF,KNIFST,PHYKNI,HARD,<PHYKNI - NIA20 failed self test>,<<P1,CSR>,<T1,ADDR>,<T3,LOGOU1>,<T4,LOGOU2>>,<
Cause: When the NIA20 is idle it performs a self test to check out various
pieces of logic (such as the ALU, the microsequencer, and the data
mover/formatter). It also performs a self test when it is first
started. In one of those cases, the self test failed.
The NIA20 is dumped and restarted by KNILDR.
Action: Call Field Service and have the NIA20 checked.
Data: CSR - CONI KNI,
ADDR - Address of parity error
LOGOU1 - Channel logout word 1
LOGOU2 - Channel logout word 2
>,,<DB%NND>)
BUG.(CHK,KNIFTL,PHYKNI,HARD,<PHYKNI - Frame too long>,,<
Cause: The NIA20 port module detected that it was transmitting a frame longer
than 1536. bytes.
Action: Field Service should check the NIA20.
>,,<DB%NND>)
BUG.(HLT,KNIFTS,PHYKNI,SOFT,<PHYKNI - Frame too short>,,<
Cause: The port was told to transmit a frame with less than 46. bytes of user
data and the pad flag (CMPAD) was not set. This should have been
detected by NISND.
>)
BUG.(INF,KNIGCE,PHYKNI,HARD,<PHYKNI - NIA20 got Grant CSR timeout>,<<P1,CSR>,<T1,ADDR>,<T3,LOGOU1>,<T4,LOGOU2>>,<
Cause: The NIA20 was unable to acquire control of the CSR (CONI word) within
10 milliseconds after requesting it.
The NIA20 is dumped and restarted by KNILDR.
Action: Call Field Service and have the NIA20 and KL10 hardware checked.
Data: CSR - CONI KNI,
ADDR - Address of parity error
LOGOU1 - Channel logout word 1
LOGOU2 - Channel logout word 2
>,,<DB%NND>)
BUG.(CHK,KNIHED,PHYKNI,HARD,<PHYKNI - Hard error detected>,<<P1,CONI>,<T1,PC>>,<
Cause: The NIA20 has detected MBUS ERROR or EBUS PARITY ERROR.
Action: This is a NIA20 hardware problem. The address (ADDR) and its contents
(LOCMSB and LOCLSB) are printed out. Call Field Service and have the
NIA20 checked.
Data: CONI - CONI KNI,
PC - PC (Microcode PC at time of problem)
>,,<DB%NND>)
BUG.(CHK,KNIIAM,PHYKNI,SOFT,<PHYKNI - Illegal addressing mode>,<<T2,ADR>>,<
Cause: An illegal addressing mode was specified.
Action: If this BUGCHK persists, change the BUGCHK to a BUGHLT, and submit an
SPR along with the dump and instructions on reproducing the problem.
Data: ADR - The mode specified.
>)
BUG.(HLT,KNIICA,PHYKNI,SOFT,<PHYKNI - Illegal channel block address>,<<PS,PS>,<PR,PR>>,<
Cause: The channel block address for this NIA20 portal is invalid.
Data: PS - Bad channel block address
PR - Bad portal block address
>)
BUG.(HLT,KNIICF,PHYKNI,SOFT,<PHYKNI - Illegal read counters function>,,<
Cause: The read counters callback routine detected an illegal function code
in the field C1FNC of the command block.
>)
BUG.(HLT,KNIIEC,PHYKNI,HARD,<PHYKNI - Illegal port error code>,<<T1,CODE>,<T4,CMD>>,<
Cause: The port generated a response which contained:
a. An unknown error code
b. An inappropriate error code for the command
Action: There may be an NIA20 microcode problem. It is more likely that there
is a NIA20 hardware problem. Field Service should check out the NIA20.
Data: CODE - Error Code
CMD - Command
>)
BUG.(CHK,KNIIFD,PHYKNI,SOFT,<PHYKNI - Illegal function from DLL>,<<T1,PASED>,<T2,BLKADR>,<T3,FNC>>,<
Cause: The NIDLL called the driver with a function we don't handle yet.
Action: If this BUGCHK persists, change the BUGCHK to a BUGHLT, and submit an
SPR along with the dump and instructions on reproducing the problem.
Data: BLKADR - The function block address.
FNC - The function code
>)
BUG.(CHK,KNIINF,PHYKNI,HARD,<PHYKNI - NIA20 initialization timed out>,<<T1,CONI>>,<
Cause: The NIA20 timed out during initialization. Either "disable complete"
didn't set or "enable complete" didn't set (the CONI indicates
which). This is very likely a hardware problem, because the microcode
version number was valid, and there was no specific error indication in
the CONI.
Action: Try reloading the NIA20 microcode. If the problem persists, call
Field Service.
Data: CONI - CONI KNI,
>,,<DB%NND>)
BUG.(INF,KNIIPE,PHYKNI,HARD,<PHYKNI - Internal NIA20 port error>,<<P1,CSR>,<P2,VERSION>,<T1,ADDR>>,<
Cause: The NIA20 detected an inconsistency with an operation it was
performing. The inconsistency can be caused by any number of things,
but the end result is that the function did not occur correctly or was
not logical.
The NIA20 is dumped and restarted by KNILDR.
Action: Call Field Service and have the NIA20 checked. It is unlikely, but
possible, that this can be caused by a NIA20 microcode bug.
Data: CSR - CONI KNI,
VERSION - Version number of the NIA20 microcode
ADDR - Address of parity error
>,,<DB%NND>)
BUG.(HLT,KNIIPF,PHYKNI,SOFT,<PHYKNI - Illegal channel dispatch>,,<
Cause: The NIA20 driver was called to perform a PHYSIO function it is not
capable of doing.
>)
BUG.(HLT,KNIIPT,PHYKNI,SOFT,<PHYKNI - Illegal protocol type on close>,<<T1,PTYPE>>,<
Cause: A protocol type was specified on the close that was NOT enabled.
Data: PTYPE - The specified protocol type.
>)
BUG.(HLT,KNIIRC,PHYKNI,SOFT,<PHYKNI - Illegal status on close>,<<T1,STATUS>>,<
Cause: The NIA20 driver has discovered that the status field contained an
unexpected value upon return from the close (command flush) function.
Data: STATUS - Status
>)
BUG.(HLT,KNINBS,PHYKNI,SOFT,<PHYKNI - Non-BSD datagram sent>,<<CM,BUFFER>>,<
Cause: A NON-BSD style datagram was sent. The driver does not send this style.
Data: BUFFER - Buffer address
>)
BUG.(HLT,KNINIB,PHYKNI,SOFT,<PHYKNI - No control buffer at interrupt level>,,<
Cause: The Port Storage (PS) block was not set up with the address of a UN
block to be used at interrupt level.
>)
BUG.(CHK,KNIPER,PHYKNI,HARD,<PHYKNI - CRAM parity error>,<<P1,CONI>,<P2,ADDR>,<T1,LOCMSB>,<T2,LOCLSB>>,<
Cause: The KLNI has detected an unplanned parity error in its Control RAM.
Action: There is a NIA20 hardware problem. Reload the NIA20 microcode. Field
Service should check out the NIA20.
Data: CONI - CONI
ADDR - Address of parity error
LOCMSB & LOCLSB - Contents of memory location
>,,<DB%NND>)
BUG.(INF,KNIPIE,PHYKNI,HARD,<PHYKNI - NIA20 detected PLI parity error>,,<
Cause: More than five parity errors occurred when reading data over the PLI
interface.
The NIA20 is dumped and restarted by KNILDR.
Action: Call Field Service and have the NIA20 checked.
>,,<DB%NND>)
BUG.(CHK,KNIQUE,PHYKNI,SOFT,<PHYKNI - Queue empty on entry>,<<T1,QUE>>,<
Cause: A queue was empty when the routine REMQUE was called.
Action: This may be a NIA20 microcode problem or a software problem. It is
possible, but unlikely, that there is a NIA20 hardware problem. If
this BUGCHK persists, change the BUGCHK to a BUGHLT, and submit an SPR
along with the dump and instructions on reproducing the problem.
Data: QUE - The queue header address.
>)
BUG.(INF,KNIRFD,PHYKNI,HARD,<PHYKNI - Remote failure to defer>,<<T1,TDR>>,<
Cause: A collision was detected after the NIA20 had "acquired" control of the
Ethernet cable. This is also known as a "late collision".
A collision may only occur during the transmission of the preamble of a
frame. This problem occurs when the collision is detected after the
preamble has been transmitted.
Action: Field Service should check the Ethernet cable. The maximum distance
between any two stations on the cable may not exceed 1500. meters. A
longer cable may result in late collisions of this sort. This problem
may also be caused by a malfunctioning Ethernet station. Check the
other Ethernet stations on the cable to see if they are having similar
problems.
Data: TDR - TDR value
>,,<DB%NND>)
BUG.(CHK,KNIRIT,PHYKNI,SOFT,<PHYKNI - Response queue interlock timed out>,,<
Cause: PHYKNI did not succeed in getting the response queue interlock after
5000. tries.
Action: There may be an NIA20 microcode problem. It is more likely that there
is a NIA20 hardware problem. Field Service should check out the NIA20.
>,RTN,<DB%NND>)
BUG.(CHK,KNIRLF,PHYKNI,SOFT,<PHYKNI - NIA20 Reload Failed>,<<T1,STATE>>,<
Cause: KNILDR ran, but failed to reload the NIA20 for some reason.
Action: Look at any possible KNILDR output. Call Field Service and have the
NIA20 hardware checked.
Data: STATE - State of the KLNI
>,,<DB%NND>)
BUG.(INF,KNISCE,PHYKNI,HARD,<PHYKNI - NIA20 spurious channel error>,<<P1,CSR>,<T1,ADDR>,<T3,LOGOU1>,<T4,LOGOU2>>,<
Cause: A spurious channel error occurs whenever the channel raises the error
signal, but no error bits are present in the channel logout area. This
error occurs after the threshold (5) of spurious channel errors has
been exceeded.
The NIA20 is dumped and restarted by KNILDR.
Action: Call Field Service and have the NIA20 and KL10 hardware checked.
Data: CSR - CONI KNI,
ADDR - Address of parity error
LOGOU1 - Channel logout word 1
LOGOU2 - Channel logout word 2
>,,<DB%NND>)
BUG.(INF,KNISTA,PHYKNI,HARD,<PHYKNI - NIA20 spurious transmit attention>,<<P1,CSR>,<T1,ADDR>,<T3,LOGOU1>,<T4,LOGOU2>>,<
Cause: The NIA20 module set the PLI transmit attention bit, but there was no
transmit pending according to the microcode.
The NIA20 is dumped and restarted by KNILDR.
Action: Call Field Service and have the NIA20 hardware checked.
Data: CSR - CONI KNI,
ADDR - Address of parity error
LOGOU1 - Channel logout word 1
LOGOU2 - Channel logout word 2
>,,<DB%NND>)
BUG.(CHK,KNISTP,PHYKNI,SOFT,<PHYKNI - NIA20 stopped>,<<P1,CONI>,<T1,LAR>>,<
Cause: No response from NIA20 after 5 seconds.
Action: Unless the NIA20 has been halted by hand, call Field Service and have
the NIA20 hardware checked.
Data: CONI - CONI KNI,
LAR - Latched Address Register
>,,<DB%NND>)
BUG.(INF,KNISWC,PHYKNI,HARD,<PHYKNI - NIA20 short word count>,<<P1,CSR>,<T1,ADDR>,<T3,LOGOU1>,<T4,LOGOU2>>,<
Cause: When the NIA20 completed a CBUS transfer, the channel had a short word
count error.
The NIA20 is dumped and restarted by KNILDR.
Action: Call Field Service and have the NIA20 and KL10 hardware checked.
Data: CSR - CONI KNI,
ADDR - Address of parity error
LOGOU1 - Channel logout word 1
LOGOU2 - Channel logout word 2
>,,<DB%NND>)
BUG.(INF,KNIUBE,PHYKNI,HARD,<PHYKNI - NIA20 used buffer list parity error>,<<P1,CSR>,<T1,ADDR>,<T3,LOGOU1>,<T4,LOGOU2>>,<
Cause: The NIA20 port received a PLI parity error while reading the NIA
module's used buffer list. This error is only reported after a
threshold (5) for this type of error has been exceeded.
The NIA20 is dumped and restarted by KNILDR.
Action: Call Field Service and have the NIA20 hardware checked.
Data: CSR - CONI KNI,
ADDR - Address of parity error
LOGOU1 - Channel logout word 1
LOGOU2 - Channel logout word 2
>,,<DB%NND>)
BUG.(HLT,KNIUOP,PHYKNI,HARD,<PHYKNI - Unknown response>,<<T1,RESP>>,<
Cause: The NIA20 port gave us a response we don't know about.
Data: RESP - Response
>)
BUG.(INF,KNIUPE,PHYKNI,HARD,<PHYKNI - NIA20 unknown planned CRAM parity error>,<<P1,CSR>,<T1,ADDR>,<T3,LOGOU1>,<T4,LOGOU2>>,<
Cause: The NIA20 got a CRAM parity error in the range of 7750 to 7777. This
particular error falls into the range of planned CRAM parity errors,
but is not known to TOPS-20.
The NIA20 is dumped and restarted by KNILDR.
Action: Call Field Service and have the NIA20 checked.
Data: CSR - CONI KNI,
ADDR - Address of parity error
LOGOU1 - Channel logout word 1
LOGOU2 - Channel logout word 2
>,,<DB%NND>)
BUG.(CHK,KNIVAR,PHYKNI,SOFT,<Monitor variables do not match NIA20 variables>,<<T1,KLNI>,<T2,MON>>,<
Cause: PHYKNI just read some status variables from the NIA20 and found them
different from the shadow copies stored in the monitor. The port is
shut down.
Action: It is possible that there is a monitor software or NIA20 microcode
problem. If this BUGCHK continues and is reproducible, change this
BUGCHK to a BUGHLT and send in an SPR along with the dump and how to
reproduce the problem.
Data: KLNI - KLNI's version of the variables
MON - Monitor's version of the variables
>)
BUG.(CHK,KNIVER,PHYKNI,SOFT,<Bad NIA20 microcode version>,<<T3,BADEDT>,<<XADDR. UCVEDT>,GODEDT>>,<
Cause: The NIA20 port driver has read the microcode edit number from the
NIA20, and has determined that it is below the minimum revision level
required for proper port/driver operation. The NI is started anyway.
Action: Obtain the proper version of the NIA20 microcode, and install it into
SYSTEM:KNILDR.EXE. The minimum acceptable version is indicated by the
second additional data item.
Data: BADEDT - Edit number read from the KLNI
GODEDT - Edit number we require
>,,<DB%NND>)
BUG.(INF,KNIXPE,PHYKNI,HARD,<PHYKNI - NIA20 transmit buffer parity error>,<<P1,CSR>,<T1,ADDR>,<T3,LOGOU1>,<T4,LOGOU2>>,<
Cause: The NIA20 transmit status indicated a transmit buffer parity error.
This error is not reported until a threshold (5) of this type of error
has been exceeded.
The NIA20 is dumped and restarted by KNILDR.
Action: Call Field Service and have the NIA20 hardware checked.
Data: CSR - CONI KNI,
ADDR - Address of parity error
LOGOU1 - Channel logout word 1
LOGOU2 - Channel logout word 2
>,,<DB%NND>)
BUG.(HLT,KPALVH,APRSRV,SOFT,<Keep alive ceased>,,<
Cause: The immediate cause of this BUGHLT is the execution of location 71.
The front end does this if the monitor has not updated its keep-alive
counter recently. This usually indicates that the monitor is looping
and preventing the scheduler from running. This can be due to a
software bug or hardware that interrupts abnormally frequently. This
BUGHLT can be caused manually by requesting the front end to jump to
location 71.
Action: Look at the CTY output to see which case occurred. Look at the PC to
see where the monitor was running. If the crash was done manually, the
PC contains 72. If this BUGHLT is not caused manually, and this
BUGHLT is reproducible, send in an SPR with the dump and instructions
on reproducing the problem.
>)
BUG.(CHK,LAPRBF,LATSRV,SOFT,<Specify Receive Buffer Failure>,<<T1,DLLERC>>,<
Cause: LATSRV received an error from NISRV while attempting to post a
receive buffer. This indicates a problem in NISRV.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
Data: DLLERC - Error code returned by NISRV
>)
BUG.(CHK,LATICB,LATSRV,SOFT,<LATCBR called from NISRV with illegal callback function code>,<<T1,CODE>>,<
Cause: NISRV has called the LATSRV callback routine with an invalid function
code. This indicates a problem in NISRV.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
Data: CODE - Function code
>)
BUG.(CHK,LATIMT,LATSRV,SOFT,<LAT Illegal Message Type>,<<T1,MSGTYP>>,<
Cause: The LAT virtual circuit message was received with a message type out
of range. This indicates a protocol problem.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
Data: MSGTYP - Message type
>)
BUG.(CHK,LATINE,LATSRV,SOFT,<LATINI failed to initialize>,,<
Cause: Could not obtain sufficient memory for the LAT host databases.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed. Analyze the dump in order to
determine why there is so little resident memory available at system
startup. The values to consider are: HC.LST words for the host node
database, CBMAXI words for CBVECT, NTTLAH words for SBVECT, and
PRMAXI words for PRVECT and PRRAND.
>)
BUG.(CHK,LATIPR,LATSRV,SOFT,<LAT Invalid PR block>,<<P1,PRBLOK>,<T1,SBBLOK>>,<
Cause: A Pending Request block was about to be deleted that still had a
Slot block attached to it.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
Data: PRBLOK - Address of pending request block
SBBLOK - Address of slot block
>)
BUG.(INF,LATIST,LATSRV,SOFT,<LAT Illegal Slot Received>,<<T1,PC>,<T2,SLTID>,<T3,HIADDR>,<T4,LOADDR>>,<
Cause: An illegal LAT slot has been received. Normally, this indicates a
protocol violation by the remote server. We can get here on one of
the following conditions:
A run message ended prematurely in the middle of a slot,
The Slot type is out of range,
The Slot type is undefined,
The Slot type is REJECT,
The source or destination id is zero,
The Slot byte count is out of range,
Something other than interactive service class was requested,
The Local Slot ID is out of range, or
A Slot was received even though no credits had been extended.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
Data: SLTID - Slot ID
>)
BUG.(INF,LATNSC,LATSRV,SOFT,<LAT Host node stopped circuit>,<<T1,CODE>,<T2,PC>,<T3,HIADDR>,<T4,LOADDR>>,<
Cause: LAT Host node stopped the circuit.
Action: Look at the Reason Code in T1 and the PC in T2. This error, if
relatively infrequent, is nothing to be concerned about. If it occurs
frequently, use the CODE and PC to determine further action.
HIADDR and LOADDR specify the Ethernet address of the remote
server whose circuit has been stopped. The reason codes are as
follows:
CE.NSL==1 ;No slots connected on circuit
CE.ILL==2 ;Illegal message or slot format
CE.HLT==3 ;Circuit halted by local system
CE.NPM==4 ;No progress being made
CE.TIM==5 ;Time limit expired
CE.LIM==6 ;Retransmit limit exceeded
CE.RES==7 ;Insufficient resources
CE.STO==10 ;Server circuit timer out of range
CE.SKW==11 ;Protocol version skew
CE.INV==12 ;Invalid Message
If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
Data: CODE - Reason code
PC - PC
HIADDR - High order 32 bits of Ethernet address
LOADDR - Low order 16 bits of Ethernet address
>)
BUG.(INF,LGFAIL,MEXEC,SOFT,<LGOUT or LOGIN JSYS failed>,<<T1,JOBPT>,<T2,LSTERR>>,<
Cause: An attempt to log a job in or out has failed when it should have succeeded.
The most likely cause of this is terminals that have been TTYSTPed
before a LGOUT or LOGIN JSYS was attempted. An attempt is made to detach
the terminal and then log out the job. If either of these fails,
the job is put in a permanent wait state.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
Data: JOBPT - The terminal number
LSTERR - The reason for the failure
>,,<DB%NND>)
BUG.(CHK,LGSBBT,DSKALC,SOFT,<FNDLGS - Bit table is no good on Login Structure>,<<T1,STRNAM>>,<
Cause: Routine FNDLGS was attempting to mount the Login Structure,
but the bit table was found to be damaged.
Action: Examine the structure and repair the problem. Then re-boot the
system.
The Login Structure has not been mounted. The boot structure is
used as PS:.
Data: STRNAM - Sixbit Structure Name
>,,<DB%NND>)
BUG.(CHK,LGSBRD,DSKALC,SOFT,<FNDLGS - Root-Directory is no good on Login Structure>,<<T1,STRNAM>>,<
Cause: Routine FNDLGS was attempting to mount the Login Structure,
but the Root Directory was found to be damaged.
Action: Examine the structure and repair the problem. Then re-boot the
system.
The Login Structure has not been mounted. The boot structure is
used as PS:.
Data: STRNAM - Sixbit Structure Name
>,,<DB%NND>)
BUG.(CHK,LGSCCB,DSKALC,SOFT,<FNDLGS - Couldn't create a backup Root-Directory for Login Structure>,<<T1,STRNAM>>,<
Cause: Routine FNDLGS was attempting to mount the Login Structure,
but the backup copy of the Root Directory is bad and an
attempt to create a new copy failed.
Action: Examine the structure and repair the problem. Then re-boot the
system.
The Login Structure has not been mounted. The boot structure is
used as PS:.
Data: STRNAM - Sixbit Structure Name
>,,<DB%NND>)
BUG.(CHK,LGSCMR,DSKALC,SOFT,<FNDLGS - Couldn't map in the Root-Directory for Login Structure>,<<T1,STRNAM>>,<
Cause: Routine FNDLGS was attempting to mount the Login Structure,
but it could not map in the root directory.
Action: Examine the structure and repair the problem. Then re-boot the
system.
The Login Structure has not been mounted. The boot structure is
used as PS:.
Data: STRNAM - Sixbit Structure Name
>,,<DB%NND>)
BUG.(CHK,LGSDNA,DSKALC,SOFT,<CHKUDB - Disk unit is unavailable for use as Login Structure>,<<P3,UDB>,<T3,UDBSTS>,<T4,UDBST1>>,<
Cause: Routine CHKUDB is being called to determine if a disk unit is
usable as part of the Login Structure. This BUGCHK indicates
that within a 5 second timeout period either the unit did not
come online via MSCP or the home block check did not complete.
These conditions are indicated by bits US.CHB or U1.NOL
being set in words UDBSTS and UDBST1 of the unit's UDB.
Action: Examine the additional data of this BUGCHK to determine why
the unit is not available. Examine the unit for problems when
the system is fully booted.
This unit is not used as part of the Login Structure.
Data: UDB - Address of UDB for unit
UDBSTS - The primary UDB status word
UDBST1 - The secondary UDB status word
>,,<DB%NND>)
BUG.(CHK,LGSHBM,DSKALC,SOFT,<FNDLGS - Login Structure contains unit mismatches>,<<T3,STRNAM>>,<
Cause: Routine FNDLGS was trying to build the Login Structure. CKHOMU
found discrepancies between the home blocks of the units composing
this structure.
Action: Examine the units that are a part of the Login Structure. Fix the
unit mismatches and re-boot the system.
The Login Structure has not been mounted. The boot structure is
used as PS:.
Data: STRNAM - Sixbit Structure Name
>,,<DB%NND>)
BUG.(INF,LGSMIS,DSKALC,SOFT,<FNDLGS - Login Structure missing>,,<
Cause: Routine FNDLGS was called to mount a Login Structure as enabled in
the system configuration file. However, no Login Structure was found.
The system will continue startup without the Login Structure.
Action: Examine the disk configuration to find out why the Login Structure
is not available.
>)
BUG.(CHK,LGSMSU,DSKALC,SOFT,<FNDLGS - Login Structure is missing a physical unit>,<<T3,STRNAM>>,<
Cause: Routine FNDLGS was trying to build the Login Structure. It
discovered that the structure built by CKHOMU was missing a unit.
Action: Replace the missing Login Structure unit and re-boot the system.
The Login Structure has not been mounted. The boot structure is
used as PS:.
Data: STRNAM - Sixbit Structure Name
>,,<DB%NND>)
BUG.(CHK,LGSMTF,DSKALC,SOFT,<FNDLGS - Failed to mount the Login Structure in the cluster>,<<T3,STRNAM>>,<
Cause: Routine FNDLGS was trying to build the Login Structure. It
called routine MNTLGS to register the mount with CFS with shared
access. This call failed, so the structure cannot be used as the
Login Structure. The most likely cause of this BUGCHK is that some
other system in the cluster has this structure mounted with
exclusive access.
Action: Change the access of this structure on other nodes in the cluster
to "shared", then re-boot the system.
The Login Structure has not been mounted. The boot structure is
used as PS:.
Data: STRNAM - Sixbit Structure Name
>,,<DB%NND>)
BUG.(CHK,LGSMUN,DSKALC,SOFT,<FNDLGS - Login Structure contains multiple units>,<<T3,STRNAM>>,<
Cause: Routine FNDLGS was trying to build the Login Structure. It
discovered that the structure built by CKHOMU contained multiple
units with the same logical unit number.
Action: Fix the duplicate units and re-boot the system.
The Login Structure has not been mounted. The boot structure is
used as PS:.
Data: STRNAM - Sixbit Structure Name
>,,<DB%NND>)
BUG.(CHK,LGSMXB,DSKALC,SOFT,<FNDLGS - Login Structure bit table is too large for monitor buffer>,<<T3,STRNAM>>,<
Cause: Routine FNDLGS was trying to build the Login Structure. CKHOMU
found that the bit table for the Login Structure is too large for
the monitor's internal buffers.
Action: Examine the units that are a part of the Login Structure. Fix the
structure and re-boot the system.
The Login Structure has not been mounted. The boot structure is
used as PS:.
Data: STRNAM - Sixbit Structure Name
>,,<DB%NND>)
BUG.(CHK,LGSNBT,DSKALC,SOFT,<FNDLGS - Couldn't get OFN for the bit table on Login Structure>,<<T1,STRNAM>>,<
Cause: Routine FNDLGS was attempting to mount the Login Structure,
but it could not get an OFN for the bit table.
Action: Examine the structure and repair the problem. Then re-boot the
system.
The Login Structure has not been mounted. The boot structure is
used as PS:.
Data: STRNAM - Sixbit Structure Name
>,,<DB%NND>)
BUG.(CHK,LGSNIT,DSKALC,SOFT,<FNDLGS - Couldn't get OFN for INDEX-TABLE.BIN on Login Structure>,<<T1,STRNAM>>,<
Cause: Routine FNDLGS was attempting to mount the Login Structure,
but it could not get an OFN for the INDEX-TABLE.BIN file.
Action: Examine the structure and repair the problem. Then re-boot the
system.
The Login Structure has not been mounted. The boot structure is
used as PS:.
Data: STRNAM - Sixbit Structure Name
>,,<DB%NND>)
BUG.(HLT,LGSNPS,DSKALC,SOFT,<FNDLGS - Failed to redefine "PS:" logical name>,<<T4,STRNAM>>,<
Cause: Routine FNDLGS was attempting to define logical name "PS:" to point
to the physical name of the Login Structure that was just mounted.
This BUGHLT indicates that for some reason, this failed. This
definition must be made in order for the system to run properly
with the Login Structure mounted.
Action: Examine the dump and determine why the CRLNM% JSYS failed.
Data: STRNAM - Sixbit Structure Name
>)
BUG.(CHK,LGSNRD,DSKALC,SOFT,<FNDLGS - Couldn't get OFN for the Root-Directory on the Login Structure>,<<T1,STRNAM>>,<
Cause: Routine FNDLGS was attempting to mount the Login Structure,
but it could not get an OFN for the root-directory.
Action: Examine the structure and repair the problem. Then re-boot the
system.
The Login Structure has not been mounted. The boot structure is
used as PS:.
Data: STRNAM - Sixbit Structure Name
>,,<DB%NND>)
BUG.(CHK,LGSWLK,DSKALC,SOFT,<FNDLGS - Login Structure has a write-locked unit>,<<T3,STRNAM>>,<
Cause: Routine FNDLGS was trying to build the Login Structure. It
discovered that the structure built by CKHOMU has a write-locked unit.
Action: Write-enable the Login Structure unit and re-boot the system.
The Login Structure has not been mounted. The boot structure is
used as PS:.
Data: STRNAM - Sixbit Structure Name
>,,<DB%NND>)
BUG.(CHK,LLIBWK,LLINKS,SOFT,<NSPLCW called without lock while NOSKED or in scheduler>,<<T6,CALLER>>,<
Cause: The DECnet entry point NSPLCW has been called while the NSP interlock
was locked and the process is NOSKED or the scheduler performed the
call. This should never happen.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed. Analyze the dump and inspect the
stack to find out who the offender was.
Data: CALLER - The address of the routine that requested the interlock
>)
BUG.(CHK,LLIDIR,LLINKS,SOFT,<Duplicate Interrupt Message Received>,<<EL,ELPTR>,<ES,ESPTR>,<MB,MBPTR>>,<
Cause: There is a duplicate interrupt message on the unacked interrupt
receive queue. This should not happen because the NSP interlock
should not release with anything on the receive queue.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed. Analyze the dump and consider that
either the interrupt flow control is wrong and more than one data
request was sent or the remote node sent an interrupt message without
a data request.
Data: ELPTR - Pointer to EL block
ESPTR - Pointer to ES block
MBPTR - Pointer to message block
>,CLRSRQ)
BUG.(CHK,LLIFNS,LLINKS,SOFT,<SCTL passed bad NSPpid>,<<EL,ELPTR>>,<
Cause: Session control gave LLINKS a bad ID. This is a coding error
in SCLINK, or a memory manager problem.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed. Analyze the dump and inspect
the stack to find out how the monitor got here. Inspect the ELB
to see if it otherwise looks like an ELB.
Data: ELPTR - Pointer to the bad ELB
>,RTN)
BUG.(CHK,LLIFZM,LLINKS,SOFT,<Tried to free zero message>,,<
Cause: FREMSG was requested to free a message. However, the pointer
to the message block was zero. This is a coding error in LLINKS.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed. Inspect the stack to find out
which routine called FREMSG.
>,RTN)
BUG.(HLT,LLIHTG,LLINKS,SOFT,<INIHSH can't get a hash table>,,<
Cause: The routine that initializes the LLINKS link hash table failed to get
memory for the hash table. If the value for the hash table size is
reasonable, this should never fail.
Action: Check that the contents of NSPHTS is a reasonable value.
>,RTN)
BUG.(HLT,LLIHTS,LLINKS,SOFT,<NSPHTS not set up>,,<
Cause: The monitor has a bad value for the hash table size.
Action: Rebuild or patch the monitor with a positive value in NSPHTS.
This value only resides in LLINKS.MAC so only source sites can
rebuild the monitor.
>,RTN)
BUG.(CHK,LLIIFC,LLINKS,SOFT,<Illegal flow control type>,<<EL,ELPTR>,<ES,ESPTR>,<MB,MBPTR>>,<
Cause: An illegal flow control type was requested on transmit. This
should have been checked by a higher layer.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed. Inspect the stack to find the
path that caused the bad value.
Data: ELPTR - Pointer to EL block
ESPTR - Pointer to ES block
MBPTR - Pointer to message block
>,PROCX1)
BUG.(CHK,LLIORC,LLINKS,SOFT,<ORC should never be negative>,,<
Cause: LLINKS has requested that a message be returned from ROUTER
after transmission. ROUTER just returned such a message to
LLINKS, but the count of outstanding messages was zero.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>)
BUG.(CHK,LLIPIM,LLINKS,SOFT,<RSNMSG found illegal message type>,<<MB,MBPTR>>,<
Cause: A message that was being resent had a bad message type. This means
that the message was overwritten while it was waiting on the resend
queue. The message type was good when the message was sent the first
time.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed. Inspect the stack to find the
Data: MBPTR - Pointer to the message block describing the bad message
>,SNDATA)
BUG.(CHK,LLIQIN,LLINKS,SOFT,<Queued interrupt message illegal>,<<EL,ELPTR>,<ES,ESPTR>,<MB,MBPTR>>,<
Cause: LLINKS was asked to transmit two interrupt messages simultaneously.
A maximum of one is allowed. This is a software problem.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
Data: ELPTR - Address of EL block
ESPTR - Address of ES block
MBPTR - Address of message block
>)
BUG.(CHK,LLIS2S,LLINKS,SOFT,<Illegal flow control at PRCRQS>,<<EL,ELPTR>,<ES,ESPTR>,<MB,MBPTR>>,<
Cause: An illegal flow control type was found at PRCRQS when the receive
queue was processed. If a remote node had sent us a bad flow control
type, it should have been found by the message parsing routines.
Therefore this should never happen.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed. Inspect the stack to find the
Data: ELPTR - Address of EL block
ESPTR - Address of ES block
MBPTR - Address of message block
>,PRCRS1)
BUG.(CHK,LLITNE,LLINKS,SOFT,<Unknown event at NSPEVT>,<<T1,EVC>,<T2,EVT>>,<
Cause: The caller of the NSPEVT routine supplied a bad event class and type.
NSPEVT may be called by SCLINK as well as by LLINKS. The caller's
address is on the stack.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
Data: EVC - Event class
EVT - Event type
>)
BUG.(CHK,LLMCIF,LLMOP,SOFT,<LLMOP Read Channel Info Failed>,<<T1,DLLERC>>,<
Cause: A LLMOP attempt to read the Ethernet channel status
failed when the Data Link Layer was called.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
Data: DLLERC - The error code returned from the DLL
>)
BUG.(INF,LLMIL1,LLMOP,SOFT,<LLMOP Received Invalid Loopback Message>,<<T1,MSGLEN>,<T2,HIORD>,<T3,LOORD>>,<
Cause: LLMOP received a loopback message that was too short or was
improperly formatted. This is a MOP protocol violation by a
remote node.
Action: Using the high and low order bits of the Ethernet address, attempt
to locate the remote node which is sending the illegal message.
Data: MSGLEN - The received message length
HIORD - The Ethernet address (high order bits)
LOORD - The Ethernet address (low order bits)
>,,<DB%NND>)
BUG.(INF,LLMILF,LLMOP,SOFT,<LLMOP Invalid Loopback Function Code>,<<T1,FUNCOD>,<T2,HIORD>,<T3,LOORD>>,<
Cause: LLMOP received a loopback message that was neither a loopback
reply message nor a forward data message. This is a MOP protocol
violation by a remote node.
Action: Using the high and low order bits of the Ethernet address, attempt
to locate the remote node which is sending the illegal message.
Data: FUNCOD - The function code
HIORD - The Ethernet address of the transmitting node (high order)
LOORD - The Ethernet address of the transmitting node (low order)
>,,<DB%NND>)
BUG.(INF,LLMIR1,LLMOP,SOFT,<LLMOP Received Invalid Remote Console Message>,<<T1,MSGLEN>>,<
Cause: LLMOP received a remote console message that was too short, was too
long or was improperly formatted. This is a MOP protocol violation
by a remote node.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
Data: MSGLEN - Received message length
>,,<DB%NND>)
BUG.(INF,LLMLXF,LLMOP,SOFT,<LLMOP Loopback Transmit Failed>,<<T1,DLLERC>,<T2,STATUS>,<T3,CHANNEL>>,<
Cause: LLMOP was unable to transmit a forward data message.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
Data: DLLERC - The error code returned from the DLL
STATUS - The channel status returned from the DLL
CHANNEL - The channel on which the failure occurred
>,,<DB%NND>)
BUG.(CHK,LLMMCF,LLMOP,SOFT,<LLMOP Declare Multicast Address Failed>,<<T1,DLLERC>>,<
Cause: A LLMOP attempt to declare the Assistant Multi-Cast Address
failed when the Data Link Layer was called.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
Data: DLLERC - The error code returned from the DLL
>,,<DB%NND>)
BUG.(CHK,LLMOPF,LLMOP,SOFT,<LLMOP Open Portal Failed>,<<T1,DLLERC>>,<
Cause: LLMOP failed to open an NI portal with the Data Link Layer.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
Data: DLLERC - The error code returned from the DLL
>)
BUG.(CHK,LLMRQC,LLMOP,SOFT,<LLMOP RB Queue Corrupted>,<<T1,RBADDRESS>>,<
Cause: LLMOP attempted to remove an RB queue entry from an empty queue.
It is also possible that the RB was not on the queue.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
Data: RBADDRESS - Address of RB queue entry
>)
BUG.(INF,LLMRRF,LLMOP,SOFT,<LLMOP Response Transmit Failed>,<<T1,DLLERC>,<T2,CHANNEL>>,<
Cause: LLMOP was unable to transmit a MOP request message.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
Data: DLLERC - The error code returned from the DLL
CHANNEL - The channel on which the failure occurred
>,,<DB%NND>)
BUG.(CHK,LLMRXF,LLMOP,SOFT,<LLMOP Resource Failure>,,<
Cause: LLMOP was not able to obtain resources from the memory manager.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>,,<DB%NND>)
BUG.(CHK,LLMSB2,LLMOP,SOFT,<LLMOP Specify Receive Buffer Failure>,<<T1,DLLERC>>,<
Cause: LLMOP could not post a receive buffer to the Data Link Layer.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
Data: DLLERC - The error code returned from the DLL
>)
BUG.(INF,LLMSCA,LLMOP,SOFT,<LLMOP Ethernet Channel Address Change - CHAN,ADDR1,ADDR2>,<<T1,CHANNEL>,<T2,ADDR1>,<T3,ADDR2>>,<
Cause: LLMOP was called by NIDDL on change of state.
Action: No action is required as this is informational only.
Data: CHANNEL - Channel number
>,,<DB%NND>)
BUG.(INF,LLMSTC,LLMOP,SOFT,<LLMOP data link state change>,<<T1,CHANNEL>,<T2,PRTLID>,<T3,STATUS>>,<
Cause: LLMOP was called by NIDDL on change of state. This message is for
information only. No corrective action is required.
Data: CHANNEL - Channel number
PRTLID - Portal ID
STATUS - Status bits
>,,<DB%NND>)
BUG.(CHK,LLPSIF,LLMOP,SOFT,<LLPUTQ - Couldn't get free space to put an entry on the LLMOP PSI queue>,,<
Cause: ASGRES was called to get some free space from the general pool
to queue up a PSI for a fork. Unfortunately, there was no more
space in the free pool to assign. This is a free space problem
somewhere.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>)
BUG.(CHK,LNGDIR,DIRECT,HARD,<Long directory file in directory>,<<T3,DIRNUM>>,<
Cause: The subdirectory has an incorrect superior directory.
Action: Use the EXPUNGE command with subcommand REBUILD to rebuild the index table
of the directory listed in the additional data. If this doesn't cure
the problem, delete the directory and rebuild it.
Data: DIRNUM - Directory number
>,,<DB%NND>)
BUG.(HLT,LNGLNG,DISC,SOFT,<NEWLFP - File going long is already long>,,<
Cause: A file is becoming long for the first time. This BUG indicates that
the file is already long.
>)
BUG.(CHK,LNMILI,LOGNAM,SOFT,<LNMLUK - Illegal value of logical name table index>,,<
Cause: A call was made to LNMLUK to look up a logical name in the logical
name tables but the caller specified neither a job-wide nor
a system-wide logical name.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>)
BUG.(CHK,LOKINT,FUTILI,SOFT,<Lock being locked while OKINT>,<<T1,LOCK>,<T2,CALLER>>,<
Cause: A routine is locking a lock while OKINT. This is dangerous since
allowing interrupts can cause the lock to be held indefinitely or
lock ownership to be lost.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed. The dump shows which routine
is OKINT while attempting to get the lock. Make the routine go
NOINT for the duration of the lock being locked.
Data: LOCK - Lock index and flags
CALLER - Caller's address
>)
BUG.(HLT,LOKODR,FUTILI,SOFT,<Lock requested out of order>,<<T4,LOKREQ>,<T3,LOKOWN>>,<
Cause: There is a priority locking scheme in the monitor. A lock is
being requested that should have been locked previously.
Data: LOKREQ - the requested lock
LOKOWN - the highest lock held thus far
>)
BUG.(HLT,LOKWRG,FUTILI,SOFT,<Wrong fork is releasing lock>,<<T4,LOKREQ>,<T3,FORK>>,<
Cause: A fork is trying to unlock a lock it has never owned, or is unlocking
it too many times.
Data: LOKREQ - the requested lock
FORK - the fork trying to release the lock
>)
BUG.(HLT,LPRIXC,LLMOP,SOFT,<LLMOP Invalid Xmit Complete>,<<T1,RBSTT>,<T2,UNSTA>>,<
Cause: NIDLL called back to LLMOP with a transmit complete event
for an RB which is not in Transmit Initiated state. This is
a software bug.
Data: RBSTT - The current RB state
UNSTA - The status in the UN block
>)
BUG.(INF,LPRLXF,LLMOP,SOFT,<LLMOP Loop Request Transmit Failed>,<<T1,DLLERC>,<T2,STATUS>,<T3,CHANNEL>>,<
Cause: LLMOP was unable to transmit a forward data message.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
Data: DLLERC - The error code returned from the DLL
STATUS - The channel status returned from the DLL
CHANNEL - The channel on which the failure occurred
>,,<DB%NND>)
BUG.(CHK,LPSIFC,LLMOP,SOFT,<LLMOP LPSCBR called with invalid function code>,<<T1,FUNCODE>>,<
Cause: The LLMOP Loopback Protocol Server Call Back Routine was called by
the Data Link Layer with an invalid callback function code. This is
a software bug.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
Data: FUNCODE - Function code
>)
BUG.(HLT,LUUMN0,APRSRV,SOFT,<LUUO in monitor context>,,<
Cause: While running in section 0, the monitor has executed an LUUO. The
flags and PC are stored in LUUBLK and LUUBLK+1, respectively.
>)
BUG.(HLT,LUUMON,APRSRV,SOFT,<Illegal LUUO from monitor context>,,<
Cause: While running in a non-zero section, the monitor executed an LUUO. The
LUUO block is at the 4 locations starting at .LUTRP. Note that the
hardware reference manual incorrectly states that an LUUO in exec mode
becomes an MUUO.
>)
BUG.(INF,MACBTO,DIAG,HARD,<DIAG - Close buffer timed out>,,<
Cause: The DIAG close buffer operation has timed out before completion.
>)
BUG.(HLT,MAP41F,FORK,SOFT,<MAPF41 failed to skip>,,<
Cause: The MAPFKH routine calls itself recursively in order to
find every fork in a specified tree. For each fork found, the
instruction following the call to MAPFKH is executed. MAPFKH
finally skip-returns in order not to fall into that coinstruction
at .+1. The recursive calls skip-return too, merely because they
fall through the same RETSKP instruction.
The MAP41F BUGHLT should never happen, and is merely a placeholder
for the impossible non-skip return from the recursive call to
MAPFKH.
>)
BUG.(HLT,MAPBT1,DSKALC,SOFT,<OFN for bit table is zero>,,<
Cause: There is no OFN for the file structure bit table currently being
mapped.
>)
BUG.(CHK,MAPCLF,SCHED,SOFT,<Failed to clear maps when killing job>,,<
Cause: A call to MSETPT to clear the job map or process map for the top fork
of a job being killed has failed.
Action: If this BUGCHK can be reproduced, set it dumpable and submit an SPR
along with a dump and how to reproduce the problem.
>)
BUG.(HLT,MARK1,PAGUTL,SOFT,<BADCPG - Not an OFN>,<<T2,SPTIDX>,<T1,COREPG>>,<
Cause: An OFN is in error but the SPT index is not pointing to an OFN.
Data: SPTIDX - SPT index
COREPG - Core page number
>)
BUG.(HLT,MDDJFN,LOOKUP,SOFT,<GETFDB - Called for non-MDD device>,,<
Cause: The monitor tried to get an FDB for a device other than a structure.
>)
BUG.(HLT,MNTLNG,DSKALC,SOFT,<MNTBTB - Bit table is a long file>,,<
Cause: While mounting the structure, the monitor discovered that the bit table
for the structure is a long file.
Action: Use CHECKD to rebuild the file structure bit table. If that does not
work, recreate the structure.
>)
BUG.(CHK,MONBKB,MEXEC,SOFT,<Cannot set monitor error interrupt>,<<T1,LSTERR>>,<
Cause: The monitor was attempting to enable interrupts on the monitor error
channels. This BUG. indicates that the AIC failed.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
Data: LSTERR - Last process error
>,R)
BUG.(CHK,MONNEJ,SCHED,SOFT,<Nested JSYS without ERJMP>,<<T1,FLAGS>,<T2,PC>>,<
Cause: An illegal instruction trap has occurred and the previous context is
the monitor but no ERJMP is present following the nested JSYS call.
This violates required coding practice because the previous context may
have locks that need to be released.
Action: If this problem can be reproduced, set this bug dumpable and submit
an SPR along with the dump and instructions on reproducing the problem.
Data: FLAGS - Processor flags
PC - PC at which faulty nested JSYS was done
>)
BUG.(HLT,MONPDL,APRSRV,SOFT,<Stack fault in monitor>,<<FPC,PC>>,<
Cause: The monitor has executed a PUSH instruction that caused a stack
overflow. The central processor detected this condition and reported
it to the monitor.
Data: PC - PC of instruction which caused stack overflow
>)
BUG.(INF,MOPIFC,LLMOP,SOFT,<LLMOP Received an invalid MOP message>,<<T1,FUNCODE>>,<
Cause: The LLMOP Remote Console Protocol Server received a MOP message
with an invalid function code. This is a MOP protocol violation by a
remote node.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
Data: FUNCODE - Function code
>,,<DB%NND>)
BUG.(HLT,MPEUTP,APRSRV,HARD,<PFCDPE - Unknown trap on test reference>,,<
Cause: The monitor was processing an AR or ARX parity error when a second
error occurred. The monitor retries the reference that caused the
original error and is prepared to handle a second error. However, the
BUGHLT indicates that the second error (caused by the retry) was not an
AR or ARX parity error and thus was not expected.
Action: This BUGHLT indicates a hardware problem. Field Service should check
the system.
>)
BUG.(CHK,MPIDXO,DIRECT,SOFT,<MAPIDX - No OFN for Index Table File>,,<
Cause: There is no open file number for the structure index table. The
structure index table file cannot be mapped.
Action: If this BUGCHK can be reproduced, set it dumpable and submit an SPR
with the dump and instructions on reproducing the problem.
>)
BUG.(CHK,MSCAOL,PHYMSC,SOFT,<PHYMSC - Online node event while node already online>,<<T2,NODE>,<T1,CID>,<Q1,SBI>>,<
Cause: SCAMPI told us that this node was coming back on line but we think that
it is already online. We believe SCAMPI and put it online. This is
commonly seen from the HSC.
Action: No action is required. However, if this bug occurs often or is
reproducible, change it to a BUGHLT and submit an SPR along with a dump
and instructions on reproducing it.
Data: NODE - node number
CID - connect ID
SBI - system block index
>,R,<DB%NND>)
BUG.(INF,MSCAVA,PHYMSC,SOFT,<PHYMSC - Available message received>,<<T2,NODE>,<T1,CID>,<T4,UNIT>>,<
Cause: When a disk becomes available we get a message that tells us that it
has returned from a state where we could not use it. We then build
a UDB if needed and start checking the home blocks.
Action: No action is required; this bug is for information only.
Data: NODE - node number
CID - connect ID
UNIT - Unit number
>,,<DB%NND>)
BUG.(HLT,MSCBAD,PHYMSC,SOFT,<PHYMSC - Bad dispatch from PHYSIO>,,<
Cause: PHYSIO called PHYMSC at the MSCDSP controller dispatch vector to
perform a function that is illegal for MSCP devices. This is a
software problem.
>)
BUG.(CHK,MSCBCN,PHYMSC,SOFT,<PHYMSC - Command reference number bad>,<<T2,NODE>,<T1,CID>,<T3,ENDCODE>,<T4,FUNCTION>>,<
Cause: The command reference number is invalid. This is an MSCP protocol
problem.
Action: If this bug occurs often or is reproducible, change it to a BUGHLT and
submit an SPR along with a dump and instructions on reproducing it.
Data: NODE - node number
CID - connect ID
ENDCODE - packet end code
FUNCTION - command request
>)
BUG.(HLT,MSCBHE,PHYMSC,SOFT,<PHYMSC - BHD error bit set>,,<
Cause: The BHD error bit was set. This implies that the BSD had the wrong
length. Something is inconsistent in the state or too much data was
sent.
>)
BUG.(HLT,MSCBID,PHYMSC,SOFT,<PHYMSC - bad connect ID from SCAMPI>,<<T2,CID>>,<
Cause: A connect response available occurred and a negative or zero Connect ID
was returned from SCA. This indicates a SCAMPI problem.
Data: CID - connect ID
>)
BUG.(CHK,MSCBPK,PHYMSC,SOFT,<PHYMSC - QOR bad packet>,<<T2,NODE>,<T1,CID>,<T3,ENDCODE>,<T4,CRN>>,<
Cause: The HSC sent a packet whose command reference number can't be found.
The packet is ignored.
Action: If this bug occurs often or is reproducible, change it to a BUGHLT and
submit an SPR along with a dump and instructions on reproducing it.
Data: NODE - node number
CID - connect ID
ENDCODE - packet end code
CRN - Command reference number
>,,<DB%NND>)
BUG.(INF,MSCCDF,PHYMSC,SOFT,<PHYMSC - Connect to disk failure>,<<Q1,NODE>,<T1,ERRCOD>>,<
Cause: A connect failure to use the disks on an HSC occurred after an
indication that an HSC was present. Connection attempts are not timed
out by SCAMPI. If PHYMSC is in the middle of connecting to a server on
another node, and that node crashes, these BUGCHKs continue to appear
until that node reappears.
Action: Bring up the remote node as soon as possible. No other action is
required.
Data: NODE - node number
ERRCOD - error code
>,CONFLD,<DB%NND>)
BUG.(INF,MSCCRN,PHYMSC,SOFT,<PHYMSC - Connect did not complete in reasonable timeout>,<<T2,NODE>,<T1,CID>,<Q3,INDEX>>,<
Cause: There was a connect request and no response. The remote node probably
is sick or has gone away.
Action: Check the status of the remote node, and reload it if needed.
Data: NODE - node number
CID - connect ID
INDEX - MSCCID table index
>)
BUG.(INF,MSCCTF,PHYMSC,SOFT,<PHYMSC - Connect to tape failure>,<<Q1,NODE>,<T1,ERRCOD>>,<
Cause: A connect failure to use the tapes on an HSC occurred after an
indication that an HSC was present. Connection attempts are not timed
out by SCAMPI. If PHYMSC is in the middle of connecting to a server on
another node, and that node crashes, these BUGCHKs continue to appear
until that node reappears.
Action: Bring up the remote node as soon as possible. No other action is
required.
Data: NODE - node number
ERRCOD - error code
>,TACFAL)
BUG.(INF,MSCCTO,PHYMSC,SOFT,<PHYMSC - Request HSC disconnect - command timeout>,<<T2,NODE>,<T1,CID>>,<
Cause: The HSC has not correctly responded to a Get Command Status request.
Action: Check the remote node for evidence of problems and reload it if needed.
Data: NODE - node number
CID - connect ID
>)
BUG.(INF,MSCCWM,PHYMSC,HARD,<PHYMSC - Controller not in 576 mode>,<<T2,NODE>,<T1,CID>,<T3,UNIT>>,<
Cause: The HSC controller is not in 576 bytes per sector mode. It cannot be
used by TOPS-20 unless it is in 576 bytes per sector mode.
Action: Set the HSC in 576 mode.
Data: NODE - node number
CID - connect ID
UNIT - unit number
>,,<DB%NND>)
BUG.(INF,MSCDIS,PHYMSC,SOFT,<PHYMSC - Request HSC disconnect>,<<T2,NODE>,<T1,CID>,<Q3,INDEX>>,<
Cause: The messages from the HSC indicate a problem. The HSC has probably
crashed. The HSC is disconnected and reconnected.
Action: Check the remote node for evidence of problems.
Data: NODE - node number
CID - connect ID
INDEX - MSCCID table index
>)
BUG.(INF,MSCDSR,PHYMSC,SOFT,<PHYMSC - Disconnect request by remote node>,<<T2,NODE>,<T1,CID>,<T3,REASON>>,<
Cause: The remote node has disconnected. The remote node has probably timed
out on some operation to the MSCP driver. All drives connected to the
node are put offline.
Action: No action is required; this bug is informational only. The remote node
might indicate why it disconnected.
Data: NODE - node number
CID - connect ID
REASON - reason for disconnect
>,,<DB%NND>)
BUG.(INF,MSCDWM,PHYMSC,HARD,<PHYMSC - Disk not in 576 mode>,<<T2,NODE>,<T1,CID>,<T3,UNIT>>,<
Cause: A disk unit is not a 576 bytes per sector disk. The disk unit will not
be used. This bug will be seen when a 16 bit HDA (used on VAX systems)
is connected to an HSC that TOPS-20 is trying to use.
Action: No action is required; this bug is for information only.
Data: NODE - node number
CID - connect ID
UNIT - unit number
>,,<DB%NND>)
BUG.(INF,MSCGON,PHYMSC,SOFT,<PHYMSC - IORB/QOR gone>,<<P5,CID>,<P4,IORB>,<T2,STATUS>>,<
Cause: PHYMSC had a data structure which pointed at an IORB. It cannot find
the IORB on the unit transfer queue. This seems to be a problem with
PHYMSC's handling of the QOR database.
Action: If this bug occurs often or is reproducible, change it to a BUGHLT and
submit an SPR along with a dump and instructions on reproducing it.
Data: CID - Connect ID
IORB - IORB address
STATUS - Status of IORB
>)
BUG.(CHK,MSCIDG,PHYMSC,SOFT,<PHYMSC - Connect ID gone>,<<T2,NODE>,<T1,CID>>,<
Cause: When the MSCP driver was trying to send a request to a server, the
source connect ID disappeared (call to SC.DCI failed). This appears to
be a SCAMPI problem.
Action: If this bug occurs often or is reproducible, change it to a BUGHLT and
submit an SPR along with a dump and instructions on reproducing it.
Data: NODE - destination node number
CID - source connect ID
>)
BUG.(HLT,MSCILD,PHYMSC,SOFT,<PHYMSC - Illegal dispatch from SCAMPI>,<<T1,CODE>>,<
Cause: PHYMSC was called by SCAMPI with an illegal dispatch value (less than
zero or greater than INTRLG). This appears to be a SCAMPI problem.
Data: CODE - Dispatch value
>)
BUG.(CHK,MSCILF,PHYMSC,SOFT,<PHYMSC - Illegal function at start IO>,<<T1,FCN>>,<
Cause: Illegal function at call to start IO on an MSCP device. The caller of
MSCRIO or MSCSIO has specified a function code that is not legal for
MSCP devices.
Action: If this BUGCHK is reproducible, set it dumpable, and send in an SPR
with the dump and how to reproduce the problem.
Data: FCN - The illegal function
>)
BUG.(CHK,MSCIVC,PHYMSC,SOFT,<PHYMSC - Illegal command>,<<P1,CHAN>,<P2,KONT>,<P3,UNIT>,<T3,STS>>,<
Cause: The remote node claimed we sent it an illegal command. This indicates
a MSCP protocol problem with the local or remote node.
Action: If this bug occurs often or is reproducible, change it to a BUGHLT and
submit an SPR along with a dump and instructions on reproducing it.
Data: CHAN - Channel number
KONT - Controller number
UNIT - Unit number
STS - Status returned by remote node
>)
BUG.(CHK,MSCMID,PHYMSC,SOFT,<PHYMSC - Missing connect ID>,,<
Cause: There is a missing or zero connect ID on call to FNDNDX. This has to
be a SCAMPI problem.
Action: If this bug occurs often or is reproducible, change it to a BUGHLT and
submit an SPR along with a dump and instructions on reproducing it.
>)
BUG.(INF,MSCN2S,PHYMSC,HARD,<PHYMSC - More tape drives than table space, excess ignored>,<<P2,KDB>,<P1,CHN>>,<
Cause: The number of tape drives available exceeds the constant value MTAN.
Only MTAN drives can be configured.
Action: Rebuild the monitor after changing MTAN in STG to a value large enough
to accommodate all the tape drives available to the system.
Data: KDB - KDB address
CHN - Channel number
>,,<DB%NND>)
BUG.(HLT,MSCNIR,PHYMSC,SOFT,<PHYMSC - IORB zero>,<<P5,CID>>,<
Cause: PHYMSC found the IORB register zero in a place it did not expect.
Data: CID - Connect ID
>)
BUG.(CHK,MSCNRA,PHYMSC,SOFT,<PHYMSC - Node response available when not requested>,<<T2,NODE>,<T1,CID>>,<
Cause: A connect response available occurred for a node that was not expected
to have one. This could be a SCAMPI or PHYMSC problem.
Action: If this bug occurs often or is reproducible, change it to a BUGHLT and
submit an SPR along with a dump and instructions on reproducing it.
Data: NODE - node number
CID - connect ID
>,R)
BUG.(INF,MSCNUF,PHYMSC,SOFT,<PHYMSC - Get next unit failed>,<<T2,NODE>,<T1,CID>,<T3,ERRCOD>>,<
Cause: PHYMSC was unable to get the next unit from a HSC, probably because
SC.SMG failed. This is seen most often with broken HSC hardware.
Action: If the hardware checks out OK, and if this bug occurs often or is
reproducible, change it to a BUGHLT and submit an SPR along with a dump
and instructions on reproducing it.
Data: NODE - node number
CID - connect ID
ERRCOD - error code
>,,<DB%NND>)
BUG.(INF,MSCNXF,PHYMSC,SOFT,<PHYMSC - Get next unit failed>,<<T2,NODE>,<T1,CID>,<T3,ERRCOD>,<Q3,INDEX>>,<
Cause: A get next unit request failed. Some units on this HSC50 may not be found.
This is seen most often with broken HSC hardware.
Action: If the hardware checks out OK, and if this bug occurs often or is
reproducible, change it to a BUGHLT and submit an SPR along with a dump
and instructions on reproducing it.
Data: NODE - node number
CID - connect ID
ERRCOD - error code
INDEX - MSCCID table index
>)
BUG.(INF,MSCOLE,PHYMSC,HARD,<PHYMSC - Online failed>,<<T3,STATUS>,<T2,NODE>,<T1,UNIT>>,<
Cause: An online request failed. This has been known to happen when duplicate
unit numbers are found, and, in some cases, when the TOPS-20 MSCP
server returns a status of offline.
Action: Check the remote node to see whether it crashed, and look for any other
information on why the online failed.
Data: STATUS - Status code in ONLINE end message
NODE - node number
UNIT - unit number
>,RTNBUF,<DB%NND>)
BUG.(INF,MSCOLF,PHYMSC,SOFT,<PHYMSC - Available online failed>,<<T2,NODE>,<T1,CID>,<T3,ERRCOD>>,<
Cause: An attempt to put an available unit online failed because of a send
failure. The remote node could have crashed during the online attempt.
Action: If this bug occurs often or is reproducible, change it to a BUGHLT and
submit an SPR along with a dump and instructions on reproducing it.
Data: NODE - node number
CID - connect ID
ERRCOD - error code
>,,<DB%NND>)
BUG.(INF,MSCORO,PHYMSC,SOFT,<PHYMSC - Offline return to online when we were told avail>,<<T2,NODE>,<T1,CID>,<T3,CODE>>,<
Cause: A node that indicated an online is not available when the online is
attempted. The remote node could have crashed during the online
attempt.
Action: If this bug occurs often or is reproducible, change it to a BUGHLT and
submit an SPR along with a dump and instructions on reproducing it.
Data: NODE - node number
CID - connect ID
CODE - end packet status code
>,,<DB%NND>)
BUG.(CHK,MSCPEI,PHYMSC,SOFT,<PHYMSC - Packet end code incorrect>,<<T2,NODE>,<T1,CID>,<T3,ENDCODE>,<T4,CRN>>,<
Cause: The HSC sent a packet that had a bad packet end code. There may be a
problem with the HSC or it could be a software problem.
Action: If this bug occurs often or is reproducible, change it to a BUGHLT and
submit an SPR along with a dump and instructions on reproducing it.
Data: NODE - node number
CID - connect ID
ENDCODE - packet end code
CRN - command reference number
>)
BUG.(INF,MSCPTG,PHYMSC,SOFT,<PHYMSC - port went away>,<<T2,NODE>,<T1,CID>>,<
Cause: The remote node has dropped the connection. All drives connected to
the node are put offline.
Action: No action is required. This bug is informational only.
Data: NODE - node number
CID - connect ID
>,,<DB%NND>)
BUG.(CHK,MSCQRC,PHYMSC,SOFT,<PHYMSC - QOR list clobbered>,<<T2,NODE>,<P2,KONT>,<T4,CRN>>,<
Cause: The QOR (the link between MSCP commands and IORBs) list has been
clobbered and has a 0 in it. This indicates a PHYMSC problem.
Action: If this bug occurs often or is reproducible, change it to a BUGHLT and
submit an SPR along with a dump and instructions on reproducing it.
Data: NODE - Node number
KONT - Controller number
CRN - Command reference number
>,RTNBUF)
BUG.(INF,MSCREJ,PHYMSC,SOFT,<PHYMSC - Node connection reject>,<<T2,NODE>,<T1,CID>>,<
Cause: A connection response available was rejected. The node cannot be
reached. The MSCP server on another TOPS-20 system rejects all
connections until that system has joined the CFS cluster.
Action: No action is required. This bug is for information only.
Data: NODE - node number
CID - connect ID
>,R,<DB%NND>)
BUG.(INF,MSCRLD,PHYMSC,SOFT,<PHYMSC - HSC control reload initiated>,<<T2,NODE>,<T1,CID>>,<
Cause: After a disconnect and reconnect to the HSC to clear up problems, the
HSC is still not responding correctly and is reloaded.
Action: Check the remote node for evidence of problems.
Data: NODE - node number
CID - connect ID
>,,<DB%NND>)
BUG.(CHK,MSCRLF,PHYMSC,SOFT,<PHYMSC - Start or reset failed>,<<T2,NODE>,<T1,CID>,<T3,ERRCOD>>,<
Cause: After problems were seen with an HSC, a disconnect and reconnect was
tried. The problems were not cleared up, so the HSC was sent a message to
reload itself. The HSC was unable to restart or reset.
Action: Check the remote node for evidence of problems.
Data: NODE - node number
CID - connect ID
ERRCOD - error code
>)
BUG.(INF,MSCSCF,PHYMSC,SOFT,<PHYMSC - SETCCH failed to set characteristics>,<<T2,NODE>,<T1,CID>,<T3,ERRCOD>,<Q3,INDEX>>,<
Cause: SETCCH failed to set characteristics. This appears to be a hardware
problem with the remote node.
Action: Field Service should check the remote node's hardware.
Data: NODE - node number
CID - connect ID
ERRCOD - error code
INDEX - MSCCID table index
>)
BUG.(CHK,MSCSCW,PHYMSC,SOFT,<PHYMSC - Send found wrong connect state>,<<T2,NODE>,<T1,CID>,<T3,ERRCOD>>,<
Cause: The state of the connection is incorrect for a send.
Previous states should have caught this unless the state changed during
the send. The send should have been done with the channel off. The
send is tried again. This appears to be a SCAMPI problem.
Action: If this bug occurs often or is reproducible, change it to a BUGHLT and
submit an SPR along with a dump and instructions on reproducing it.
Data: NODE - node number
CID - connect ID
ERRCOD - error code
>)
BUG.(CHK,MSCSDF,PHYMSC,SOFT,<PHYMSC - Send failure>,<<T2,NODE>,<T1,CID>,<T3,ERRCOD>>,<
Cause: A message sent to SCAMPI failed for reasons other than no credit or
connection in wrong state. The send request is retried. This
appears to be a SCAMPI problem.
Action: If this bug occurs often or is reproducible, change it to a BUGHLT and
submit an SPR along with a dump and instructions on reproducing it.
Data: NODE - node number
CID - connect ID
ERRCOD - error code
>)
BUG.(CHK,MSCSIF,PHYMSC,SOFT,<PHYMSC - Start IO failed>,<<P3,UDB>,<P2,KDB>,<P1,CHAN>>,<
Cause: A call to MSCRIO failed when it was not expected to in UNQUNT. This
appears to be a PHYMSC problem.
Action: If this bug occurs often or is reproducible, change it to a BUGHLT and
submit an SPR along with a dump and instructions on reproducing it.
Data: UDB - UDB address
KDB - KDB address
CHAN - Channel number
>)
BUG.(HLT,MSCSOA,PHYMSC,SOFT,<PHYMSC - SC.SOA failed>,<<T1,ERRCOD>>,<
Cause: Interrupts were requested and failed. This has to be a SCAMPI problem.
Data: ERRCOD - Error Code
>)
BUG.(INF,MSCSUF,PHYMSC,SOFT,<PHYMSC - Set density failed>,<<T2,NODE>,<T1,CID>,<3,CODE>>,<
Cause: The set unit characteristics command failed for a tape drive.
Action: If this bug occurs often or is reproducible, change it to a BUGHLT and
submit an SPR along with a dump and instructions on reproducing it.
Data: NODE - Node number
CID - Connect ID
CODE - Status code
>)
BUG.(INF,MSCTMU,PHYMSC,HARD,<PHYMSC - Too many units for KDB>,<<P2,KDB>,<P1,CHN>>,<
Cause: There are more than PRTMXU units on a particular HSC, so there
is not enough room in the KDB for UDB entries.
Action: If you want to support more than PRTMXU units on a HSC, a source
rebuild of the monitor is required. Change the symbol PRTMXU in SCAPAR
and rebuild PHYMSC and PHYKLP.
Data: KDB - KDB address
CHN - Channel
>,,<DB%NND>)
BUG.(HLT,MSCUDB,PHYMSC,SOFT,<PHYMSC - UDB missing>,,<
Cause: We have just set up a unit during initialization and now we can't find
it. This indicates a software problem.
>)
BUG.(INF,MSCUKD,PHYMSC,HARD,<PHYMSC - Unknown disk type>,<<T2,NODE>,<T1,CID>>,<
Cause: A device on the HSC is not a device recognized by TOPS-20 and is
not used.
Action: No action is required. This bug is for information only.
Data: NODE - node number
CID - connect ID
>,,<DB%NND>)
BUG.(INF,MSSBCM,PHYMVR,SOFT,<BADCMD - MSCP server bad command>,<<T2,NODE>,<T1,CID>,<T3,OPCODE>,<T4,ERRBIT>>,<
Cause: The MSCP server received a command with an illegal or unsupported
operation specified.
Action: If this bug occurs often or is reproducible, set it dumpable and
submit an SPR along with a dump and instructions on reproducing it.
Data: NODE - node number
CID - connect ID
OPCODE - operation code
ERRBIT - error bits and status of command
>)
BUG.(CHK,MSSCAC,PHYMVR,SOFT,<MSCP server can't accept connection>,<<T2,NODE>,<T1,CID>,<T3,REASON>>,<
Cause: The MSCP server cannot accept a connection.
Action: If this bug occurs often or is reproducible, set it dumpable and
submit an SPR along with a dump and instructions on reproducing it.
Data: NODE - node number
CID - connect ID
REASON - reason for failure
>,R)
BUG.(INF,MSSCGL,PHYMVR,SOFT,<MSCP server can't get listener>,<<T1,ERROR>>,<
Cause: The MSCP server cannot get a listener for connection requests.
The server continues to try to get a listener.
Action: If this bug occurs often or is reproducible, set it dumpable and
submit an SPR along with a dump and instructions on reproducing it.
Data: ERROR - Error code returned by SC.LIS
>,,<DB%NND>)
BUG.(HLT,MSSCID,PHYMVR,SOFT,<Illegal connect ID index>,<<T2,NODE>,<T1,CID>>,<
Cause: The MSCP server cannot locate a SCDB for the given connect ID.
Action: If this bug occurs often or is reproducible, set it dumpable and
submit an SPR along with a dump and instructions on reproducing it.
Data: NODE - node number
CID - connect ID
>)
BUG.(INF,MSSCTO,PHYMVR,SOFT,<PHYMVR - Command timeout>,<<T2,NODE>,<T1,CID>,<T4,STATE>>,<
Cause: Unknown. A command did not complete in the timeout interval.
Action: If this bug occurs often or is reproducible, set it dumpable and
submit an SPR along with a dump and instructions on reproducing it.
Data: NODE - node number
CID - connect ID
STATE - command state
>,,<DB%NND>)
BUG.(HLT,MSSDNQ,PHYMVR,SOFT,<DMADON - DMA done queue entry not found>,<<T2,CID>,<T3,BUFF>>,<
Cause: A DMA complete interrupt occurred and no commands were found which had
a matching buffer name. This indicates a software inconsistency.
Data: CID - connect ID
BUFF - 32 bit buffer name
>)
BUG.(CHK,MSSER0,PHYMVR,SOFT,<IORB done error and error bits 0>,<<T2,IRBERR>>,<
Cause: An IORB completed with bit IS.ERR set indicating an error. The MSCP
server could not find any relevant error.
Action: If this bug occurs often or is reproducible, set it dumpable and
submit an SPR along with a dump and instructions on reproducing it.
Data: IRBERR - IORB status word
>)
BUG.(HLT,MSSLNM,PHYMVR,SOFT,<MSCP server listner does not match>,,<
Cause: The listener index does not match the known index of the listener.
>)
BUG.(HLT,MSSNWO,PHYMVR,SOFT,<OK2SND - OK to send when not waiting>,<<T2,CID>>,<
Cause: The MSCP server received notification of OK to send from a node. The
node in question was not flagged as waiting for an OK to send.
Data: CID - connect ID
>)
BUG.(INF,MSSREJ,PHYMVR,SOFT,<MSCP server rejecting connection>,<<T2,NODE>,<T1,CID>,<T3,ERROR>>,<
Cause: The MSCP server is rejecting a connection because the connector cannot
be identified due to an SCA failure or because the connector is not on
a KL10 processor.
Action: No action is required. This bug is for information only.
Data: NODE - Node number
CID - Connect ID
ERROR - SCA error code
>,,<DB%NND>)
BUG.(INF,MSSSBD,PHYMVR,SOFT,<Send failed>,<<T2,NODE>,<T1,CID>,<T3,ERROR>>,<
Cause: A send of a message failed for an unexpected reason. The connection
is shut down.
Action: If this bug occurs often or is reproducible, set it dumpable and
submit an SPR along with a dump and instructions on reproducing it.
Data: NODE - node number
CID - connect ID
ERROR - SCA error code
>,MSSFSD)
BUG.(HLT,MSSSCA,PHYMVR,SOFT,<MSCP SERVER - Server detected SCA error>,<<T1,SCAFNC>,<T2,ARG1>,<T3,ARG2>,<T4,ARG3>>,<
Cause: The MSCP server detected an illegal response from SCA.
Data: SCAFNC - SCA function code
ARG1 - ARG3 - SCA function arguments
>)
BUG.(INF,MSSSHT,PHYMVR,SOFT,<MSCP server shutdown node>,<<T2,NODE>,<T1,CID>,<T3,STATUS>,<T4,ERROR>>,<
Cause: The MSCP server was forced to shut down a node.
Action: No action is required. This bug is for information only.
Data: NODE - node number
CID - connect ID
STATUS - connection status
ERROR - last SCA error
>,,<DB%NND>)
BUG.(HLT,MSSSTA,PHYMVR,SOFT,<MSCP SERVER - Illegal state>,,<
Cause: The MSCP server detected an illegal command or connection state.
>,R)
BUG.(HLT,MSSTML,PHYMVR,SOFT,<LISTEN - MSCP server too many listners>,,<
Cause: The MSCP server tried to obtain a listener when one already existed.
This indicates an inconsistency in the software.
>)
BUG.(CHK,MSSUMP,PHYMVR,SOFT,<Unmap buffer failed>,<<T1,REASON>>,<
Cause: A routine was called to unmap a buffer and failed when it should not
have.
Action: If this bug occurs often or is reproducible, set it dumpable and
submit an SPR along with a dump and instructions on reproducing it.
Data: REASON - error code
>,CLNUME)
BUG.(CHK,MTANOA,MAGTAP,HARD,<IRBDN2 - IRBDON called for an active IORB>,,<
Cause: IRBDON was called to mark an IORB as done but the IORB was
currently active. This is believed to be a hardware error.
Action: Have Field Service check the tape subsystem. If this BUG
still persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>)
BUG.(CHK,MTANOI,MAGTAP,HARD,<GETUBF - No queued IORB's for input>,,<
Cause: GTUBFA was called to get the next user IORB for input but none
were queued. This is believed to be a hardware error.
Action: Have Field Service check the tape subsystem. If this BUG
still persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>)
BUG.(CHK,MTANOQ,MAGTAP,HARD,<IRBDN1 - IRBDON called for non-queued up IORB>,,<
Cause: IRBDON was called to mark an IORB as done but the IORB was not
queued. This is believed to be a hardware problem.
Action: Have Field Service check the tape subsystem. If this BUG
still persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>)
BUG.(CHK,MTAORN,MAGTAP,SOFT,<MTDIR0 - Magtape IORB overrun>,,<
Cause: MTDIRQ was called to queue an IORB for PHYSIO. The caller provided
an IORB and a function code for the IORB. This BUGCHK indicates
that there was a function code already stored in the IORB. The
function code in the IORB should be zero.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>)
BUG.(HLT,MTARIN,MAGTAP,HARD,<MTAINT - Interrupt received for nonactive IORB>,,<
Cause: A done interrupt occurred for a magtape IORB, and PHYSIO called
one of the routines MTAINT, MTDINT, or MTPINT to handle the done
IORB. However, the indicated IORB was not marked as being active.
>)
BUG.(INF,MTMSG,TAPE,HARD,<Failed to send MT message to MOUNTR>,<<T1,ERRCOD>>,<
Cause: This message is from TAPE. TAPE sends IPCF messages to MOUNTR under
certain conditions, such as volume switch. TAPE was unable to send the
IPCF message. The user program involved receives an error return to
its tape operation.
Action: There are many reasons IPCF refuses to send a message. The IPCF error
code is passed back to the user. If it is a resource problem, try to
improve system resources.
If it seems like a monitor bug, change this BUGINF to a BUGHLT, and
submit an SPR with the dump stating how to reproduce the problem.
Data: ERRCOD - Error code
>)
BUG.(CHK,NAMFUL,MNETDV,SOFT,<Internet host name table full>,,<
Cause: The monitor wanted to add a host to the internet host table but it
could not because the HSTNAM table is full. This could happen when the
file SYSTEM:HOSTS.TXT is loaded, or by the monitor resolving too many
hosts from the DNS nameservers identified in the file
SYSTEM:INTERNET.NAMESERVERS. This BUGCHK appears only once in any
5-minute period to prevent too many of them from flooding the CTY.
Action: Rebuild the monitor with a higher value for NHSTN. NHSTN is set to six
times the size of NHOSTS, so increasing NHOSTS will result in larger
tables. If the host name table has been filled by a large number of
hosts added by the DNS resolver in the monitor, the host table can be
reloaded with just HOSTS.TXT information by using the IPHOST program's
LOAD command.
>)
BUG.(CHK,NEGJRT,SCHED,SOFT,<UCLOCK - Negative JOBRT detected>,<<T2,JOBNO>>,<
Cause: The job runtime (JOBRT) is negative for an existing job. This would
cause the job to appear non-existent to most JSYSes. The monitor will
use a reasonable value for JOBRT (0) and log out the job.
Action: If this BUGCHK can be reproduced, set it dumpable and submit an SPR
along with a dump and how to reproduce the problem.
Data: JOBNO - Job number
>)
BUG.(INF,NETABF,IMPDV,SOFT,<IMPDV: Assign of buffer failed>,,<
Cause: The monitor has tried to assign an 1822 buffer and has failed or
an illegal size for the buffer was requested. This probably indicates
a software problem.
>)
BUG.(INF,NETRBF,IMPDV,SOFT,<IMPDV: Release of 1822 buffer failed>,,<
Cause: The monitor has attempted to release an 1822 buffer and has determined
that the buffer is already released or has been smashed. This
probably indicates a software problem.
>)
BUG.(HLT,NEWBAK,FILINI,HARD,<FILRFS - NEWIB failure for BACKUP ROOT-DIRECTORY>,,<
Cause: This BUGHLT happens when NEWIB fails to assign a backup index block for
the BS: root-directory. This happens if DSKASA fails to assign a
disk address, or if ASROFN fails to assign an OFN.
Action: Select a different disk pack to build the system on. If the trouble
persists, have Field Service check the system.
>)
BUG.(HLT,NEWROT,FILINI,HARD,<FILRFS - NEWIB failure for ROOT-DIRECTORY>,,<
Cause: This is identical to NEWBAK, except it is for the primary
root-directory rather than the backup root-directory.
Action: Select a different disk pack to build the system on. If the trouble
persists, have Field Service check the system.
>)
BUG.(HLT,NIDUNF,NISRV,SOFT,<Unknown Callback code from Port Driver.>,,<
Cause: The port driver has called back with a code in T1 that is either
not understood or not expected for a callback.
>,RTN)
BUG.(HLT,NIJECL,NIUSR,SOFT,<Error closing portal>,<<T1,ERROR>>,<
Cause: NISRV returned an error when we tried to close a portal. The error
code was not UNRES% (Resource error), which is the only one that may
occur.
Data: ERROR - The returned error code
>)
BUG.(HLT,NIJIPB,NIUSR,SOFT,<Illegal Portal Block>,<<T1,JOBPR>>,<
Cause: NIUSR did not find a proper portal block pointer in the job's portal
list.
Data: JOBPR - Job's portal list address
>,RTN)
BUG.(HLT,NIJPMU,NIUSR,SOFT,<Portal List messed up>,<<P1,PRLIST>>,<
Cause: NIUSR was attempting to create a portal and install it in the portal
list. According to PLNUM, there were some free spots in the portal
list. An exhaustive search of the list was not able to find
a free slot. This is inconsistent.
Data: PRLIST - Portal list address
>,CREDIE)
BUG.(CHK,NISEC6,D36COM,SOFT,<Not in section 6>,<<T1,CALADR>>,<
Cause: Code that should be running in section 6 is not.
Action: If the DOB% facility did not produce a dump, change this to
a BUGHLT and submit an SPR with the dump.
Data: CALADR - Address of routine not in section 6
>)
BUG.(CHK,NMXTBG,JNTMAN,SOFT,<NMXTIM table obsolete>,,<
Cause: The table used by NMXTIM is obsolete.
Action: Create a new table.
>,RTN)
BUG.(HLT,NOACB,SCHED,SOFT,<MENTR - No more AC blocks>,<<CX,PC>>,<
Cause: When a JSYS is executed from within the monitor, the AC's of the
current process are stored in a special area in the monitor. This area
consists of several 20-word blocks that are used successively as one
JSYS invokes another. The BUGHLT indicates that a JSYS has been called
but that no 20-word block is left in which to store the contents of the
AC's. This usually means that the counter that the monitor uses to
keep track of these blocks has been clobbered.
Data: PC - PC at which last JSYS was executed
>)
BUG.(CHK,NOACJF,MEXEC,SOFT,<No ACJ file to run>,,<
Cause: CHKR attempted to service a user request to start up an ACJ fork
under job 0. This request failed because CHKR could not find an
ACJ program to run.
Action: One of two things can be done:
1. Define DEFAULT-ACJ: on the system to point to the ACJ program.
2. Ensure that there is an ACJ.EXE on SYSTEM:
>,,<DB%NND>)
BUG.(CHK,NOACTF,JSYSA,SOFT,<No account database - account validation disabled>,,<
Cause: An account is to be validated because account validation is enabled,
but no account database has been loaded. It would appear that account
validation was turned on (possibly with the ENABLE ACCOUNT-VALIDATION
command in SYSTEM:n-CONFIG.CMD) without putting an account database in
BS:<SYSTEM>ACCOUNTS-TABLE.BIN.
Action: Account validation has been disabled. Either turn off account
validation by editing SYSTEM:n-CONFIG.CMD or install an account
database in BS:<SYSTEM>ACCOUNTS-TABLE.BIN with the ACTGEN utility.
>,,<DB%NND>)
BUG.(INF,NOADDR,MNETDV,SOFT,<Failed to find SYSTEM:INTERNET.ADDRESS file>,,<
Cause: The SYSTEM:INTERNET.ADDRESS file was either not found or is corrupted.
Action: Ensure that the file is in SYSTEM: and is valid.
>,,<DB%NND>)
BUG.(HLT,NOADXB,PAGUTL,SOFT,<RELOFN - No disk address for XB>,,<
Cause: A routine has been called to release an OFN. The OFN is the identifier
for the index block of a file that is being closed. This routine
forces the index block into memory. The backup address for the index
block should be on the disk. The BUGHLT indicates that the backup
address is not on the disk.
>)
BUG.(CHK,NOALCM,IPCF,HARD,<ALCMES - Cannot send message to allocator>,<<T1,PID>>,<
Cause: Messages cannot be sent to the device allocator PID.
Action: Check the state of the allocator and see if it is running properly.
Chances are it is not processing the messages queued for it or that
the system is out of IPCF free space for some other reason. One known
cause of these BUGCHKs is when MOUNTR has crashed and has not been
restarted. Also, if MOUNTR is flooded with disk or tape requests
(maybe via a .CMD file in OPR) then these BUGCHKs may appear.
Data: PID - PID
>,,<DB%NND>)
BUG.(INF,NOARCS,IPCF,HARD,<ARCMSG - PID for QUASAR is not valid>,,<
Cause: ARCMSG could not validate the PID for QUASAR.
Action: Check QUASAR and be sure it is processing requests.
>)
BUG.(CHK,NOBAT1,DSKALC,HARD,<Failed to write primary BAT BLOCK>,<<T1,CKUNUM>>,<
Cause: Primary BAT blocks were not written due to one of the following
errors:
1. Invalid unit or channel specified
2. Channel and unit is not a disk device
3. Hardware error
Data: CKUNUM - Channel, controller, and unit numbers (12 bits each)
>,,<DB%NND>)
BUG.(CHK,NOBAT2,DSKALC,HARD,<Failed to write secondary BAT BLOCK>,<<T1,CKUNUM>>,<
Cause: Secondary BAT blocks were not written due to one of the following
errors:
1. Invalid unit or channel specified
2. Channel and unit is not a disk device
3. Hardware error
Data: CKUNUM - Channel, controller, and unit numbers (12 bits each)
>,,<DB%NND>)
BUG.(CHK,NOBTB,FILINI,HARD,<FILINI - Unable to open bit table file>,,<
Cause: During normal system startup, the call to MNTBTB to get an OFN for the
bit table of BS: failed. MNTBTB fails if it cannot get a JFN for the
bit table or if it cannot get an OFN for the index block.
Action: TOPS-20 attempts to initialize a private copy of the bit table
using CRTBTB. If this also fails, a BTBCR1 BUGHLT results.
>,,<DB%NND>)
BUG.(HLT,NOBTBN,FILINI,SOFT,<FILRFS - Unable to get size of BOOTSTRAP.BIN file>,,<
Cause: This BUGHLT should never occur. The routine that must fail for this
BUGHLT to occur should never be called, since BOTSIZ is 0 on a
normal startup, or some non-negative number if the FSIDIA routine asked
the typist for a number.
>)
BUG.(CHK,NOCHKR,SCHED,HARD,<CHKR fork blocked>,<<T1,CHKDUE>>,<
Cause: The CHKR fork has not run in a while. The monitor is getting nervous.
If the CHKR fork continues to not run for a long time, a CHKRNR
BUGHLT will result.
Possible causes for CHKR not running include:
1. A disk failure that prevents fork 0 from updating the disk
2. Removal of a disk that is mounted
3. An HSC or MSCP server disk is hung
4. Logic errors in the monitor.
Action: Check the console output from this system. Try to find out if any disk
problems are blocking job 0. It is unlikely that this is a software
problem.
Data: CHKDUE - Count of times CHKR was found overdue
>,,<DB%NND>)
BUG.(HLT,NOCTY,TTYSRV,SOFT,<Unable to allocate data for CTY>,,<
Cause: During initialization of terminal lines, a call to ASGRES was made to
get resident free space for the CTY's data base. The call got a
failure return - no free space available.
>)
BUG.(CHK,NODDMP,SCHED,HARD,<DDMP fork blocked>,<<T1,DDPDUE>>,<
Cause: The monitor creates a fork in job zero that exists for the life of the
system. This fork runs periodically to move pages from the swapping
space to files on disk. This is an essential system function. The
DDMP fork has not run in a while. The monitor is getting nervous. If
the DDMP fork continues to not run for a long time, a DDMPNR BUGHLT
will result.
Possible causes for DDMP not running include:
1. A disk failure that prevents DDMP from updating the disk
2. Removal of a disk that is mounted causing DDMP to block
3. An HSC or MSCP server disk is hung causing DDMP to block
4. Logic errors in the monitor.
Action: Check the console output from this system. Try to find out if any disk
problems are blocking DDMP. It is unlikely that this is a software
problem.
Data: DDPDUE - Count of times DDMP was found overdue
>,,<DB%NND>)
BUG.(CHK,NODIR1,IPCF,HARD,<SPLMES - DIRST failed on existing directory name>,<<T2,DIRNUM>>,<
Cause: DIRST% failed to translate a directory number into a string for the
currently mapped directory.
Action: Verify the integrity of the directory.
Data: DIRNUM - Directory number
>)
BUG.(CHK,NODMPF,MEXEC,SOFT,<Could not find CI-20 microcode dump program>,,<
Cause: The KLIPA (IPA20) RAM needs to be dumped. The file
BS:<SYSTEM>IPADMP.EXE is supposed to be run to do this. However,
the file does not exist.
Action: Currently, TOPS-20 does not support dumping of the CI20 via the
IPADMP.EXE program. Presently, nothing in TOPS-20 should be
attempting to dump the KLIPA and this BUG should never appear.
>,,<DB%NND>)
BUG.(CHK,NODTEN,DTESRV,SOFT,<DTESRV - NO DTE buffers available in critical case>,,<
Cause: A buffer is needed for a queued protocol message to a front-end
-11 via a DTE. There are no buffers available, and the
caller is not prepared to handle failure.
Action: Assume the message was sent.
>)
BUG.(HLT,NOEQFK,MEXEC,SOFT,<Creation or starting of ENQ fork failed>,,<
Cause: A CFORK and MSFRK were done to create and start a fork for one of
the ENQ forks. For some reason, this has failed.
>)
BUG.(HLT,NOFEFS,FILINI,SOFT,<FILRFS - Unable to get size of front end file system>,,<
Cause: This BUGHLT occurs if GTFESZ fails to get the size of the front end
file system. This only happens if ASGPAG fails.
>)
BUG.(HLT,NOFNDU,DEVICE,SOFT,<FNDUNT - Cannot find device for JFN>,,<
Cause: The block that describes the JFN, or the table used to initialize
the device for the JFN, is clobbered or zero. If this JFN is not
locked, another fork may have closed this JFN or otherwise modified
FILXXX variables, causing this situation.
>)
BUG.(HLT,NOFSEC,PAGUTL,SOFT,<ASGVAS failure at startup>,,<
Cause: ASGVAS was called to get a free section for SCA at startup but failed.
>)
BUG.(INF,NOHSTN,MNETDV,SOFT,<Failed to find host name file SYSTEM:HOSTS.TXT>,,<
Cause: The SYSTEM:HOSTS.TXT file was not found. The file SYSTEM:HOSTS.DEBUG
is used if DBUGSW is 2 or greater.
Action: Ensure that the appropriate file is in SYSTEM: and is valid.
>,,<DB%NND>)
BUG.(HLT,NOIORB,PHYSIO,HARD,<SETIRB - Missing IORB>,,<
Cause: The routine SETIRB was called for an active unit to return the
currently active IORB for the unit, but the position wait queue or
transfer wait queue was empty.
Action: Field Service should check the system. It is unlikely that a software
problem could cause this BUGHLT.
>)
BUG.(CHK,NOIPCI,IPCIDV,SOFT,<IPCI interface selected without CI20 hardware>,,<
Cause: An IPCI device has been specified in the SYSTEM:INTERNET.ADDRESS file;
however, the CI20 hardware is not initialized. Either this system does
not have CI20 hardware installed, or the CI20 hardware is disabled.
Action: One of two actions may be taken. Add CI20 hardware to the system,
or remove the configuration of the IPCI device from the file
SYSTEM:INTERNET.ADDRESS.
>,,<DB%NND>)
BUG.(HLT,NOLEN,DISC,SOFT,<UPDLEN - No length info for OFN>,,<
Cause: The table OFNLEN, which gives the file length for each OFN, has
an invalid entry for the OFN in question.
>)
BUG.(CHK,NOLODF,MEXEC,SOFT,<Could not find CI-20 microcode load program>,,<
Cause: The KLIPA (IPA20) RAM needs to be reloaded. The program
BS:<SYSTEM>IPALOD.EXE is supposed to be run to do this. However,
the program does not exist. TOPS-20 is now ignoring the CI20.
Action: If you wish to use the CI20, you must install the load program, which
also contains the CI20 microcode, as BS:<SYSTEM>IPALOD.EXE and reload
the system.
>,,<DB%NND>)
BUG.(INF,NOOFN,PAGUTL,SOFT,<ASOF4 - Attempt to create new OFN failed - no more OFNs available>,,<
Cause: As a result of an OPENF, an attempt has been made to create a new OFN.
This attempt fails because the system has no more OFNs available for
use. The user receives an OPNX10 error. This BUGINF is
issued once every 30 minutes regardless of how many OPENF
attempts are made during the time the OFN space is exhausted.
Action: If more OFN space is desired, increase the value of NOFN in STG and
rebuild the monitor.
>,,<DB%NND>)
BUG.(HLT,NOPGT0,DISC,SOFT,<OPNLNG - No page table 0 in long file.>,,<
Cause: There is no page table 0 for the long file being opened.
>)
BUG.(CHK,NOPID,IPCF,HARD,<PIDKFL - PID disappeared>,,<
Cause: DELPID rejected a PID that had been returned from GETNPF.
>)
BUG.(HLT,NORMIA,MNETDV,SOFT,<No room in host tables for local address>,,<
Cause: The Internet Multinet code is unable to allocate space in the network
hash table for a local address from the SYSTEM:INTERNET.ADDRESS file.
>)
BUG.(HLT,NORSXF,DTESRV,SOFT,<Failed to get space for master DTE>,,<
Cause: While attempting to initialize RSX20F protocol for the console
front end, the call to ASGRES (assign resident free space) failed.
>)
BUG.(HLT,NOSEB2,APRSRV,SOFT,<No SYSERR buffer available>,,<
Cause: An AR or ARX parity error has occurred, and the monitor is creating a
SYSERR block. The BUGHLT indicates that no free space is available for
the SYSERR block. Therefore, no block is created. UPTPFW
contains the page fail word.
Action: Free space may be congested with other SYSERR blocks. Have Field
Service check the system.
>)
BUG.(CHK,NOSERF,MEXEC,HARD,<Cannot GTJFN error report file>,<<T1,ERRCOD>>,<
Cause: The CHKR fork failed to get a JFN for the ERROR.REPORT file.
Action: Based upon the error code returned from GTJFN%, attempt to diagnose
the problem. If all appears to be in order and the BUG still
persists, make it dumpable and submit an SPR with the dump and a
copy of MONITR.EXE. If possible, include any known method for
reproducing the problem and/or the state of the system at the time
the BUG was observed.
Data: ERRCOD - GTJFN error code
>,,<DB%NND>)
BUG.(HLT,NOSKTR,SCHED,SOFT,<ITRAP from NOSKED or CSKED context>,<<KIMUPC,MUUOPC>,<LSTERR,LSTERR>,<LSTIPC,LSTIPC>>,<
Cause: An illegal instruction trap has occurred while the process was NOSKED
or CSKED. This suggests that important resources may be left locked.
Action: This BUGHLT is normally seen with bad hardware. If the hardware checks
out OK, send in an SPR along with a dump and indicate how this problem
can be reproduced.
Data: MUUOPC - PC of last MUUO
LSTERR - Last error code
LSTIPC - PC from which ITRAP was called
>)
BUG.(CHK,NOSLNM,LOGNAM,SOFT,<SLNINI - Cannot create system logical name>,<<T1,ERRCOD>>,<
Cause: A call to CRLNM% to create the default system-wide logical names at
system startup failed.
Action: This logical name is not defined. System operation may be impaired.
If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
Data: ERRCOD - Error code returned by CRLNM%
>)
BUG.(CHK,NOSPLM,GTJFN,SOFT,<RELJFN - Could not send spool message to QUASAR>,,<
Cause: Could not tell QUASAR of spooled file for output.
Action: See if QUASAR is running and check to see that the system has some
IPCF free space available. If the system appears to be normal and
if this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>,,<DB%NND>)
BUG.(HLT,NOTOFN,PAGUTL,SOFT,<UPDOF0 - Updating file argument not OFN>,,<
Cause: A routine has been called to write an updated index block for a file
onto the disk. However, the identifier that was provided by the caller
is not a valid id for a file. (It is not an OFN.)
>)
BUG.(CHK,NOUTF1,DISC,SOFT,<SPLOPN - NOUT of directory number failed>,,<
Cause: The NOUT JSYS failed in trying to open the spooled disk file.
Action: See what the JSYS error from NOUT% was and try to determine what
may be wrong. It is possible that the destination disk may be
having problems.
>)
BUG.(CHK,NOUTF2,IPCF,HARD,<SPLMES - NOUT of generation number failed>,,<
Cause: SPLMES attempted to NOUT the generation number of a spooled file
and it failed.
Action: If this problem becomes chronic, then change this BUGCHK to a BUGHLT.
Determine why NOUT% failed by looking at the error it has returned.
It is possible that the disk we are trying to write to is having
hardware problems. If this is the case, have Field Service look at
the disk.
>)
BUG.(CHK,NPWQPD,PHYSIO,HARD,<PHYSIO - Null PWQ at position done>,<<T1,CHAN>,<T2,CONTRL>,<T3,UNIT>>,<
Cause: A position-done interrupt occurred, and the routine PHYPDN was called
to move IORBs from the position wait queue into the transfer wait
queue, but the position wait queue was empty.
Action: If the problem persists, Field Service should check out the unit
specified in the additional data.
Data: CHAN - The channel number
CONTRL - The controller number (-1 if no controller)
UNIT - The unit number
>)
BUG.(CHK,NRFTCL,PHYSIO,HARD,<PHYSIO - No requests found for cylinder seeked>,,<
Cause: The routine PHYPDN was called on a position-done interrupt to transfer
any IORBs that were on the position wait queue into the transfer wait
queue; but no IORBs were found which were for this cylinder.
Action: If this BUGCHK is persistent on the system, change it to a BUGHLT and
send in an SPR with the dump and how to reproduce the problem.
>)
BUG.(HLT,NSKDIS,SCHED,SOFT,<Dismiss while nosked or with non-res test address>,,<
Cause: A process has declared its intention to cease running (dismiss) until a
particular event occurs. This is clearly a software problem. The
scheduler will test for the occurrence of the event by calling a
routine that the process has provided. The BUGHLT occurs if one of the
following happens:
1. The process has already declared itself to be NOSKED, thereby
preventing the running of other processes;
2. The test routine is in part of the monitor's swappable code and
could therefore cause an illegal page fault in the scheduler.
>)
BUG.(CHK,NSKDT2,PAGEM,SOFT,<PGRTRP - Bad NSKED or INTDF>,,<
Cause: When a page fault occurred, the running process's interrupt indicator
(INTDF) had an abnormally low value. INTDF should never be less than
-1. This abnormally low value could have let an interruption occur
during a page fault. This indicates a bug in which too many OKINT's
have been executed.
Action: The monitor has zeroed INTDF to prevent interruptions during the page
fault. If this bug can be reproduced, set it dumpable and submit an
SPR with a dump indicating how the bug can be reproduced.
>)
BUG.(CHK,NTBSUP,D36COM,SOFT,<Buffer supplied>,,<
Cause: The routine NTPARM was called to handle a network management parameter.
The routine can only handle returns of a single value, but NTMAN had
supplied a multi-word buffer.
>)
BUG.(CHK,NTBTSM,D36COM,SOFT,<Buffer too small>,,<
Cause: NTMAN requested a show counter operation, but did not supply a
buffer large enough to store all the counters.
>)
BUG.(CHK,NTMBCF,NTMAN,SOFT,<Bad coded field on output>,,<
Cause: Output for a SHOW is being formatted, and there has been a request
to generate a CODED field of more than one byte. This can't be
done.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed. Analyze the dump and look at the
descriptor block pointed to by NT. Check to see if this item is
supposed to be a multiple byte code. If not, fix the item's entry.
If it is correct, you are going to have to write the code to handle
multiple byte codes.
>,NTEMPE)
BUG.(CHK,NTMBCL,NTMAN,SOFT,<Bad counter byte length>,,<
Cause: While generating output for a numeric field, there has been a
request to generate an illegal number of bytes.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>,NTEMPE)
BUG.(CHK,NTMBDL,NTMAN,SOFT,<Bad multiple byte length>,,<
Cause: While generating output for a numeric field, there has been
a request to generate an illegal number of bytes.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>,NTEMPE)
BUG.(CHK,NTMBFP,NTMAN,SOFT,<Bad format type encountered>,,<
Cause: In the process of reading a value from the user string,
descriptor tables have returned an invalid format for this
item. The AC "NT" points to the descriptor for this item, and
field NTSEQ tells which item is being referred to.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed. Fix the entry for this item
to contain a valid format type.
>,NTEMPE)
BUG.(CHK,NTMCBL,NTMAN,SOFT,<Bad Counter Block length>,,<
Cause: A DECnet Layer has returned an invalid length for a
Counter Block.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>,NTEMPE)
BUG.(HLT,NTMCNO,NTMAN,SOFT,<Circuit name overrun>,,<
Cause: More than 16 bytes of data have been returned into a 16 byte field.
The data beyond the buffer has been trashed.
Action: Examine the algorithm above to determine why more bytes than
expected were returned. Fix the above code to check for overrun
while it is producing the bytes, so that this halt does not occur.
>)
BUG.(CHK,NTMDVI,NTMAN,SOFT,<NMXDSP value illegal>,,<
Cause: There is a call to a "layer" to obtain or set a value for
an item. The routine value in the descriptor block pointed
to by NT is illegal.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed. Analyze the dump and examine
the data structure pointed to by NT. In all probability this is
caused by a trashed NT, since the descriptor block generation macros
are supposed to range check this value. A "layer" is any routine
described at NMXDSP.
>,NTEMPE)
BUG.(CHK,NTMEFO,NTMAN,SOFT,<Event function out of range>,,<
Cause: The event function supplied by a DECnet layer to NMXEVT was
out of range.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed. Analyze the dump and make callers
of NMXEVT supply the correct function code.
>,RTN)
BUG.(CHK,NTMEOR,NTMAN,SOFT,<Entity type out of range>,,<
Cause: While double checking the entity ID before dispatching
on it, the value was found to be illegal. Since the
value the user supplies is checked at GETBLK, this means that
field NXENT has been trashed.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>,NTEMPE)
BUG.(CHK,NTMFOR,NTMAN,SOFT,<Format out of range>,,<
Cause: While formatting output for a show, the format block for
this item has been found to have an illegal format type.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>,NTEMPE)
BUG.(CHK,NTMFUR,NTMAN,SOFT,<Function code out of range>,,<
Cause: While dispatching by function code, the function code is found
to be out of range. Since the function code the user supplies is
checked in GETBLK, this means that field NXFNC has been trashed
in the meantime.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>,NTEMPE)
BUG.(CHK,NTMICF,NTMAN,SOFT,<Non-counter function in PRSCOU>,,<
Cause: There is an illegal function in the PRSCOU routine. NXFNC
is wrong.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>,NTEMPE)
BUG.(CHK,NTMILN,NTMAN,SOFT,<Illegal number size>,,<
Cause: When going to read a numeric value from the user's string,
the format descriptor block for this item has specified an illegal
number of bytes to read.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>,NTEMPE)
BUG.(CHK,NTMINT,NTMAN,SOFT,<Invalid numeric type>,,<
Cause: When generating output for a numeric field, something other than
Decimal, Hexadecimal or Octal was requested.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>,NTEMPE)
BUG.(CHK,NTMKOR,NTMAN,SOFT,<Controller out of range in Circuit-id>,,<
Cause: The controller field in a line-id is out of range. The value
LD.MAX defines the number of controllers known by D36PAR, and
thus by NTMAN. The most likely cause of this bug is a trashed
AC.
Note:
A controller is any device driver to which a router interfaces.
It is currently used to define the name of a Circuit/Line,
under the assumption that each Kontroller will control only
a single line type.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>,RTN)
BUG.(CHK,NTMLTR,NTMAN,SOFT,<Line type is out of range>,,<
Cause: To determine entries to return (for function .NTSHO),
it is necessary to know the Line type (CI,NI,DTE,...).
Other entities (Nodes,Modules) should have this field
zero. This field is set by ENTCVT.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>,NTEMPE)
BUG.(CHK,NTMNEC,NTMAN,SOFT,<No error code with error return>,,<
Cause: A routine has returned non-skip, but has not given
an error code by calling NTExxx. A return to the top level found
field NXERR zero.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed. Analyze the dump and
determine which routine is failing, and make the error return
give an error code.
>)
BUG.(CHK,NTMNTR,NTMAN,SOFT,<Node type is out of range>,,<
Cause: It is necessary to know the node type (executor, remote, or loop)
to select entries to return (for function .NTSHO). Other entities
(circuit, lines) should have this field zero. This field is set by
ENTCVT.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>,NTEMPE)
BUG.(CHK,NTMORE,NTMAN,SOFT,<Unrecognized entity type>,,<
Cause: An event was received from a DECnet layer, and the entity type
is not legal.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed. Analyze the dump and find the
routine that logged the event, and change it to a legal entity type.
>,NTEMPE)
BUG.(CHK,NTMSOR,NTMAN,SOFT,<Selection criteria is out of range>,,<
Cause: The selection criterion used to choose which items to return (for
function .NTSHO) is out of range.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed. Analyze the dump and fix the
check in GETBLK or find out who is trashing field NXSEL.
>,NTEUFO)
BUG.(CHK,NTMSQF,NTMAN,SOFT,<Signal queue full>,,<
Cause: The signal queue was full when a new signal was logged.
This might be caused by a malfunctioning NMLT20 that does not
read the signals from the signal queue, or it may be caused by
a DECnet device driver going bad. A signal is used to tell
NMLT20 that a device needs attention/reload.
Action: Restart NMLT20, or turn off the malfunctioning DECnet device.
If necessary, reload any devices by hand. Note that this has
been known to occur at startup. If this is the case, simply
increase the size of the signal queue. You must have sources
in order to do this, however, as NMXSLN resides in D36PAR.MAC.
>,EVSIG2)
BUG.(CHK,NTNBFS,D36COM,SOFT,<No buffer supplied>,,<
Cause: The routine NTPARM was called to handle a network management parameter.
The caller of NTPARM said that it expects the call from NTMAN to
supply a buffer for the parameters to be read from or stored into.
None was supplied.
>)
BUG.(CHK,NTNBUF,D36COM,SOFT,<No buffer supplied>,,<
Cause: NTMAN requested a show counter operation, but did not supply a
buffer to store the counters in.
>)
BUG.(INF,NTOHNG,MNETDV,SOFT,<Network output hung>,<<P1,ENTRY>>,<
Cause: Multinet has declared the output interface for a network hung.
The network interface is reset.
Data: ENTRY - Entry into NCT Vector table of hung interface
>)
BUG.(CHK,NUMFUL,MNETDV,SOFT,<Internet host index table full>,,<
Cause: The monitor wanted to add a host to the internet host table but it
could not because the HOSTN table is full. This could happen when the
file SYSTEM:HOSTS.TXT is loaded, or by the monitor resolving too many
hosts from the DNS nameservers identified in the file
SYSTEM:INTERNET.NAMESERVERS. This BUGCHK appears only once in any
5 minutes to prevent too many of them flooding the CTY.
Action: Rebuild the monitor with a higher value for NHOSTN. NHOSTN is set to
twice the size of NHOSTS, so increasing NHOSTS will result in larger
tables. If the host name table has been filled by a large number of
hosts added by the DNS resolver in the monitor, the host table can be
reloaded with just HOSTS.TXT information by using the IPHOST program's
LOAD command.
>)
BUG.(CHK,NVTILS,NRTSRV,SOFT,<NRT link in unexpected state>,,<
Cause: NRT's host service has been called by DECnet for a link in
an unexpected state.
Action: Determine whether the link indeed is in an unexpected state, or
whether it is in a state with which NRT must deal. If the former,
the problem is probably in DECnet; if the latter, code must
be added to NRT.
If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>)
BUG.(CHK,NVTINP,NRTSRV,SOFT,<NRT Input to DECnet failed>,,<
Cause: An input call to DECnet's SCTNSF entry point failed unexpectedly.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed. Analyze the dump and examine
the DECnet error code in register T1.
>)
BUG.(CHK,NVTNHB,NRTSRV,SOFT,<NRTHBR should never be called>,,<
Cause: DECnet has called NRT's host service at its "hiber" address.
This should never happen, since NRT always uses asynchronous
calls to DECnet.
Action: Find a DECnet call which has the .NSWAIt flag on and turn it
off, being sure that the surrounding code can handle asynch
I/O. If none is found, DECnet must be in error.
If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>)
BUG.(CHK,NVTOUT,NRTSRV,SOFT,<NRT output to DECnet failed>,,<
Cause: An output call to DECnet's SCTNSF entry point failed unexpectedly.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed. Analyze the dump and examine
the DECnet error code in register T1.
>)
BUG.(CHK,NVTPCL,NRTSRV,SOFT,<Partial Configuration Msg Loss>,,<
Cause: NRT's host service failed to send the configuration message in
a single DECnet message segment.
Action: If the message segment size for the link is really less than
ten bytes, it should probably be enlarged; otherwise the code has
to deal with the possibility of segmented configuration messages.
If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>)
BUG.(HLT,NVTSAB,NRTSRV,SOFT,<No memory for NRT's SAB>,,<
Cause: NRT's initialization code was unable to get resident free
space to build its control blocks.
Action: Find out why there is so little resident free space so early
in the system's life.
>)
BUG.(HLT,NVTSJB,NRTSRV,SOFT,<No memory for NRT's SJB>,,<
Cause: NRT's initialization code was unable to get resident free
space to build its control blocks.
Action: Find out why there is so little resident free space so early
in the system's life.
>)
BUG.(CHK,NVTWWC,NRTSRV,SOFT,<Wrong Channel on Connect Wait Wake>,,<
Cause: NRT's host service has been waked for a circuit which is not
the logical link in connect wait state and which has no TTY
line number associated with it.
Action: Either the connect wait link is out of phase and should be corrected
or an active link has lost its TTY line number which should be
in the DECnet PSI mask.
If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>)
BUG.(CHK,NVTWWN,NRTSRV,SOFT,<No NRTCWN Connect Wait Wake>,,<
Cause: NRT's host service has been waked for a circuit which has no TTY
line number associated with it, yet there is no NRB for a
logical link in connect wait state.
Action: Either the NRB pointer in NRTCWN has been stepped on or an active
logical link has lost its TTY line number which should be in the
DECnet PSI mask.
If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>)
BUG.(CHK,NWJTBE,FORK,SOFT,<No free JTB blocks>,,<
Cause: Word JTBFRE in the JSB has bit n on if JSYS trap block n is
available. The NEWJTB routine assigns trap blocks, looking in JTBFRE
for a bit on. If no bit is found to be on in JTBFRE, the NWJTBE BUGCHK
occurs.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>)
BUG.(HLT,OCSPTH,PAGUTL,SOFT,<ASOFN - SPTH values disagree>,,<
Cause: We are assigning an OFN which is cached and notice that the old value
of SPTH does not match what was just written. This is a bug because
the cached OFN should not have changed its index block address.
>)
BUG.(HLT,OFFONX,JSYSF,SOFT,<ARRST - File marked offline has index block pointer>,,<
Cause: In restoring an offline file, it was discovered the file already
has some contents.
>)
BUG.(HLT,OFFSPE,PAGUTL,SOFT,<OFFSPQ - Page not on SPMQ>,,<
Cause: A routine has been called to remove a core page from the special memory
queue. If a page is on the queue, its age should be PSSPQ. The BUGHLT
indicates that the age is incorrect. The entry may or may not actually
be on the queue. The caller is expected to ensure that the page is on
the queue.
>)
BUG.(CHK,OFFSTR,PHYSIO,SOFT,<UDBCHK - Structure has been marked offline>,<<T3,STRNUM>,<P3,UDB>>,<
Cause: PHYSIO has detected that a disk unit has been offline long enough to
mark the structure to which it belongs as offline. This interval has
been preset by the system manager.
Action: The second additional data word of this BUGCHK shows the address of the
UDB of the offending disk drive. Determine why this drive is offline
and repair the condition.
Data: STRNUM - The structure number that has been marked offline
UDB - The address of the UDB of the disk drive that caused this
action.
>,,<DB%NND>)
BUG.(HLT,OFJFBD,DISC,SOFT,<OFNJFN - OFNJFN found bad data>,,<
Cause: An OFN was found whose bits indicated that it was or was not a
secondary index block. SPTO4 was found to disagree.
Action: One already discovered cause of this BUGHLT is the accidental
clearing of the OFN2XB bit by CHKLAC. There may be other spots
in the monitor where this bit is handled incorrectly.
>)
BUG.(HLT,OFNBDB,PAGUTL,SOFT,<OFN bad data base>,,<
Cause: There are multiple causes of this BUGHLT. They all indicate some error
in the monitor's internal OFN data. The cause of the BUGHLT can be
found by examining the dump.
>)
BUG.(HLT,OFNBLC,PAGUTL,SOFT,<OFN has bad lock count>,,<
Cause: DASOFN was called to delete an OFN slot but the page table's lock count
in CST1 is greater than one. This likely indicates that some of this
OFN's pages have not yet been written to disk.
Action: Submit an SPR along with a dump and how the problem can be reproduced.
When looking at the dump, look at the caller to DASOFN. This routine
should either call SCNOFN to write the pages to disk or should not be
calling DASOFN at all.
>)
BUG.(HLT,OKSKBG,SCHED,SOFT,<OKSKD0 - OKSKED when not NOSKED>,<<CX,ADR>>,<
Cause: An OKSKED or OKSKD1 was done when the code was not NOSKED. Clearly
this is a software problem. This is bad as sensitive code may be
getting ruined because of races. A NSKDIS BUGHLT would probably have
resulted when a DISMS was done later on.
Data: ADR - Address of caller
>)
BUG.(CHK,ONSTR,PHYSIO,SOFT,<UDBCHK - Structure has been marked online>,<<T3,STRNUM>>,<
Cause: A structure that had been previously marked offline due to an offline
disk unit is now online again.
Action: No action is required; this BUG is for information only.
Data: STRNUM - The number of the structure that is now online
>,,<DB%NND>)
BUG.(HLT,OPOPAC,SCHED,SOFT,<MRETN - Tried to over-pop AC stack>,,<
Cause: When a JSYS is executed from within the monitor, the AC's of the
current process are stored in a special area in the monitor. This area
consists of several 20-word blocks that are used successively as one
JSYS calls another.
As each nested JSYS returns, the monitor's pointer to this area of
memory is decremented. The BUGHLT indicates that the pointer has been
decremented too far. This indicates either a clobbered pointer, or an
attempt to return from a JSYS without having entered one.
>)
BUG.(HLT,OVFLOW,PAGUTL,SOFT,<ASOFN - Allocation table overflow>,,<
Cause: The monitor maintains information for disk quota enforcement in two
parallel tables called the allocation tables. These contain one entry
for each directory to which at least one OFN is assigned (that has at
least one file open). The size of these tables is the maximum number
of OFNs; therefore even if every OFN were associated with a unique
directory, there should be enough room in the allocation tables. The
BUGHLT indicates that the tables have overflowed.
>)
BUG.(INF,OVRDTA,PHYSIO,HARD,<PHYSIO - Overdue transfer aborted>,<<T1,CHAN>,<T2,CONTRL>,<T3,UNIT>,<T4,FUNC>>,<
Cause: The routine UNICKT checks the status of each unit periodically. During
one such check, some unit had an active IORB which timed out. The I/O
operation had been started, but not completed within 17 seconds. This
BUGINF can be followed by other BUGINFs or BUGCHKs when the device
finally responds (such as PH2DNA).
Action: This BUGINF is usually caused by flaky or broken hardware. Field
Service should examine the problem.
If the involved device is a tape drive controlled by a DX20, a common
cause of the BUGINF is the microcode halting. Reloading the DX20
microcode with DX20LD fixes the problem, and the DX20 should be
monitored by Field Service.
Data: CHAN - The channel number
CONTRL - The controller number (-1 if no controller)
UNIT - The unit number
FUNC - The operation that failed
>,,<DB%NND>)
BUG.(CHK,P2RAE1,PHYH2,HARD,<PHYH2 - RH20 register access error reading register>,<<T1,DATAI>,<T2,CONI>,<T3,CHAN>>,<
Cause: The routine RDREG was called to read a MASSBUS register, but the read
failed due to a register access error from the RH20. This is almost
always due to a hardware malfunction. It can also happen when an RP06
LAP (Logical Address Plug) is removed or inserted.
Action: Call Field Service if this is seen unless RP06 LAPs are being removed
or inserted.
Data: DATAI - The result of a DATAI done after the error was detected
CONI - The CONI which showed the register access error
CHAN - The channel number
>,,<DB%NND>)
BUG.(CHK,P2RAE2,PHYH2,HARD,<PHYH2 - Register access error writing register>,<<T1,DATAI>,<T2,DATA>,<T3,CONI>,<T4,CHAN>>,<
Cause: The routine WTREG was called to write a MASSBUS register, but the write
failed due to a register access error from the RH20. This is almost
always due to a hardware malfunction. It can also happen when an RP06
LAP (Logical Address Plug) is removed or inserted.
Action: Call Field Service if this is seen unless RP06 LAPs are being removed
or inserted.
Data: DATAI - The result of a DATAI done after the error was detected
DATA - The register and data that was attempted to be written
CONI - The CONI which showed the register access error
CHAN - The channel number
>,,<DB%NND>)
BUG.(CHK,P2RAE3,PHYH2,HARD,<PHYH2 - Register access error on done or ATN interrupt>,<<T1,DATAI>,<T2,CONI>,<T3,CHAN>>,<
Cause: The routine PHYINT was called to process an interrupt for the RH20, and
a check showed that a register access error had occurred.
This is almost always due to a hardware malfunction. It can also
happen when an RP06 LAP (Logical Address Plug) is removed or inserted.
Action: Call Field Service if this is seen unless RP06 LAPs are being removed
or inserted.
Data: DATAI - The result of a DATAI done after the error was detected
CONI - The CONI which showed the register access error
CHAN - The channel number
>,,<DB%NND>)
BUG.(HLT,PAGLCK,PAGUTL,SOFT,<DESPT - Page locked>,,<
Cause: The monitor is attempting to deassign a slot in the non-OFN part of the
SPT tables. The caller is expected to ensure that the SPT is no longer
in use. The BUGHLT indicates that the SPT slot is associated with a
page that has been locked into memory even though the SPT share count
is 0. This indicates an inconsistency in the monitor's data base.
Probably the page was used as a page table, and not all its page
pointers were cleared properly. A page table is locked in memory once
for each page in memory to which it points.
>)
BUG.(HLT,PAGNIC,PAGUTL,SOFT,<GETCPP - Page not in core>,,<
Cause: A routine was called to convert a virtual address or page id to its
corresponding core page. However, the page table is not in core.
>)
BUG.(CHK,PCIN0,PAGEM,SOFT,<PAGEM - PC has gone into section 0>,<<T2,PC>,<T1,PFW>>,<
Cause: A reference has been made to RSCOD or NRCOD in section 0. This should
not happen because section 0 code cannot reference data in extended
sections. As an expedient, the page being referenced is mapped to
section 1 with an indirect pointer.
Action: If this bug is reproducible, set it dumpable and send in an SPR along
with the dump and how to reproduce the problem.
Data: PC - PC
PFW - Page fail word
>)
BUG.(CHK,PDBSTA,PHYSIO,SOFT,<PHYSIO - Inconsistent state of UDB status bits>,,<
Cause: CHBDON is called by DONIRB as the exit routine for the home block IORB.
This allows the monitor to process the completed IORB before the poller
can see the completed request. However, CHBDON was called with UDBST1
bits indicating that no homeblock read, PDB read, or PDB write was in
progress.
Action: If this BUGCHK is reproducible, change it to a BUGHLT, and send in an
SPR with the dump and instructions on how to reproduce the problem.
>,,<DB%NND>)
BUG.(HLT,PGNDEL,PAGEM,SOFT,<REMFPB - Page not completely deleted>,,<
Cause: A page has been marked as partially deleted and placed on a queue. The
routine that processes the queue has found that the page still has a
backup on disk. The routine that marked the page should have deleted
all backup pages.
>)
BUG.(HLT,PGRIXM,PAGUTL,SOFT,<PGRINI - Boot overlaps resident tables>,,<
Cause: The values in BUTPHY indicate that BOOT has been left in pages which
are expected to be available for resident code or storage areas. This
could be the result of a bad monitor build or an attempt to run the
monitor with insufficient memory.
>)
BUG.(HLT,PGUNDX,PAGEM,SOFT,<PGUNTD - In nested trap>,,<
Cause: There was an attempt to use a special untrap address while in a nested
trap. This is a software problem.
>)
BUG.(INF,PH2DNA,PHYH2,HARD,<PHYH2 - Done interrupt and channel not active>,<<T2,CHAN>>,<
Cause: The routine RH2INT was called to handle an interrupt on the RH20 and
the CONI said done was up, but no I/O transfer was in progress. If an
OVRDTA had previously occurred, and the device finally responds, this
BUGINF will happen. This usually indicates a hardware failure.
Action: Field Service should check the devices on the channel listed in the
additional data; any channel/controller/unit listed in OVRDTA BUGCHKs
should be suspected.
Data: CHAN - The channel number
>,,<DB%NND>)
BUG.(CHK,PH2IHM,PHYH2,SOFT,<PHYH2 - Illegal hardware mode - word mode assumed>,,<
Cause: The routine RH2CCW was called to generate a channel transfer word, and
one of its arguments is the data mode to use. But the data mode
supplied was illegal.
>,,<DB%NND>)
BUG.(HLT,PH2IUA,PHYH2,HARD,<Wrong and inactive unit interrupted>,,<
Cause: The routine RH2INT was called to handle an interrupt, and it determined
that I/O had finished for a controller but that active controller can't
be found. This bug may also be seen in combination with PH2WUI
BUGHLTs.
Action: Have Field Service verify proper operation of the RH20s and other
MASSBUS device hardware.
>)
BUG.(CHK,PH2PIM,PHYH2,HARD,<PHYH2 - RH20 lost PI assignment>,<<T1,CHAN>,<T3,CONI>,<T2,PIA>>,<
Cause: The routine RH2CHK was called for a periodic check on the status of the
RH20, and it discovered that the channel assignment of the RH20 changed
from what it should be. This usually indicates a hardware malfunction.
This situation is serious as it could cause file structure damage.
Action: Call Field Service and have them check out the channel listed in the
additional data.
Data: CHAN - The channel number
CONI - The results of the CONI on the RH
PIA - The PI assignment we expected to see but did not see
>,,<DB%NND>)
BUG.(CHK,PH2PIX,PHYH2,HARD,<PHYH2 - RH20 returned from the twilight zone>,<<T1,CHAN>,<T3,CONI>,<T4,OLD>,<T2,PIA>>,<
Cause: The routine RH2CHK was called for a periodic check on the status of the
channel, and found that the PI assignment for the channel was not what
was expected. A second check of the channel status found the correct
PI assignment.
Action: Call Field Service and have them check the channel listed as the first
item of additional data.
Data: CHAN - The channel number
CONI - The results of the final CONI on the RH
OLD - The results of the first CONI on the RH
PIA - The PI assignment we expected to see
>)
BUG.(HLT,PH2WUI,PHYH2,HARD,<Wrong unit interrupted>,,<
Cause: The routine RH2INT was called to handle an interrupt, and it determined
that I/O had finished for a unit. The operation was a write operation.
The CONI said that a particular unit completed the I/O, but that was
not the unit to which the I/O was begun. This BUGHLT occurs on the
second such error for a particular IORB. This bug may also be seen in
combination with PH2IUA BUGHLTs.
Action: Have Field Service verify proper operation of the RH20s and other
MASSBUS device hardware.
>)
BUG.(INF,PHYCPI,PHYSIO,HARD,<CI path ignored for Massbus disk>,<<P3,OUDB>,<P5,NUDB>>,<
Cause: TOPS-20 is able to access a disk over the CI (through another system's
MSCP server) but it already has had access to the disk via the MASSBUS.
The system ignores the CI path.
Action: No action is required. The purpose of the BUGINF is to let you know
that the system does not use the CI path to the disk even if the
MASSBUS path is disabled.
If you were to reboot the system after disabling the MASSBUS path,
then TOPS-20 would see the disk through an MSCP server only and
would use that path for access.
Data: OUDB - Old UDB (path)
NUDB - New UDB (path) marked offline
>,,<DB%NND>)
BUG.(INF,PHYDCD,PHYSIO,SOFT,<PHYSIO - Don't-care disk on do-care drive>,<<P1,CHAN>,<P2,CONT>,<P3,UNIT>>,<
Cause: A don't-care disk has been encountered on a standard drive. This
combination is treated as a standard drive.
Action: Either set the disk as a "do-care" or set the drive as "don't-care".
Data: CHAN - The channel number
CONT - The controller number or -1
UNIT - The unit number
>,,<DB%NND>)
BUG.(INF,PHYDCR,PHYSIO,SOFT,<PHYSIO - Disk being treated as DON'T-CARE>,<<P1,CHAN>,<P2,CONT>,<P3,UNIT>>,<
Cause: A don't-care disk was found on a don't-care drive.
Action: No action required; this BUGINF is for information only.
Data: CHAN - The channel number
CONT - The controller number or -1
UNIT - The unit number
>,,<DB%NND>)
BUG.(INF,PHYDCU,PHYSIO,SOFT,<PHYSIO - Do-care disk on don't-care drive>,<<P1,CHAN>,<P2,CONT>,<P3,UNIT>>,<
Cause: A standard disk has been detected on a drive which has been declared
DON'T-CARE. This combination is treated as a standard drive.
Action: Either set the disk as don't-care, or set the drive as do-care.
Data: CHAN - The channel number
CONT - The controller number or -1
UNIT - The unit number
>,,<DB%NND>)
BUG.(HLT,PHYICA,PHYSIO,SOFT,<PHYINI - Illegal argument to core alloc>,,<
Cause: The routine PHYALC was called asking for a negative number of words.
This routine is called to allocate resident storage for data such as
CDBs, KDBs, and UDBs.
>)
BUG.(INF,PHYICE,PHYSIO,SOFT,<PHYINI - Failed to assign resident STG>,,<
Cause: The routine PHYALC was called to allocate storage for data such as a
CDB, KDB, or UDB, but there was not enough free resident storage to
allocate it. The monitor ignores any device for which it cannot
build tables.
Action: The monitor can be rebuilt with a larger units pool by adding to symbol
.RESUQ in STG and rebuilding the monitor.
>)
BUG.(HLT,PHYLTF,PHYSIO,HARD,<PHYSIO - SCHLTM - Unexpected LATOPT failure>,,<
Cause: The routine SCHLTM was called to do disk latency optimization, by
scanning all units for the best IORB. A unit was found to have a
nonnull transfer wait queue, but the lower level code to select the
best IORB for that unit gave the non-skip return, indicating that no
IORBs existed.
Action: Field Service should check the system. It is unlikely that a software
problem could cause this BUGHLT.
>)
BUG.(CHK,PHYNIR,PHYSIO,SOFT,<PHYSIO - Null interrupt routine at operation done>,,<
Cause: The routine DONIRB, when terminating a long IORB, attempted to notify
higher level code about the finished IORB, but the field in the IORB
that contained the address to call was zero.
Action: If this BUGCHK is persistent on the system, change it to a BUGHLT and
send in an SPR with the dump and how to reproduce the problem.
>)
BUG.(CHK,PHYNOS,PHYSIO,HARD,<PHYSIO - No serial number for disk drive>,<<P1,CHAN>,<P2,CONT>,<P3,UNIT>>,<
Cause: The serial number of a disk drive is missing (zero). A non-zero unique
disk drive serial number is required for all disks in a TOPS-20
environment. The drive is marked as offline and is not used.
Action: Field Service must be called to set a non-zero unique serial number for
the disk drive.
Data: CHAN - Channel number
CONT - Controller number
UNIT - Unit number
>,,<DB%NND>)
BUG.(HLT,PHYNUN,PHYSIO,SOFT,<PHYSIO - No unit number found in FNDCKS>,<<T1,CHAN>,<T2,KONT>,<P3,UDB>>,<
Cause: A unit number could not be found for a given CDB, KDB, UDB in FNDCKS.
This indicates a software problem. P3 may not contain a valid UDB.
Data: CHAN - Channel number
KONT - Controller number
UDB - UDB address
>)
BUG.(HLT,PI0ERR,APRSRV,HARD,<Unvectored interrupt on channel 0>,,<
Cause: The monitor has received an unvectored hardware interrupt on PI channel
0. This is not supposed to happen. The cause could be faulty
hardware generating incorrect PI requests.
Action: Have Field Service check the system to make sure that it is functioning
properly.
>)
BUG.(CHK,PI1ERR,APRSRV,HARD,<Unexpected unvectored interrupt on channel 1>,,<
Cause: The monitor has received an unvectored hardware interrupt on PI channel
1. Currently, there is no processing assigned to this channel. This
could possibly indicate faulty hardware that is generating spurious PI
requests.
Action: Have Field Service check the system to make sure that it is functioning
properly.
>,,<DB%NND>)
BUG.(CHK,PI2ERR,APRSRV,HARD,<Unexpected unvectored interrupt on channel 2>,,<
Cause: The monitor has received an unvectored hardware interrupt on PI channel
2. Currently, there is no processing assigned to this channel. This
could possibly indicate faulty hardware that is generating spurious PI
requests.
Action: Have Field Service check the system to make sure that it is functioning
properly.
>,,<DB%NND>)
BUG.(CHK,PI4ERR,APRSRV,HARD,<Unexpected unvectored interrupt on channel 4>,,<
Cause: The monitor has received an unvectored hardware interrupt on PI channel
4. Currently, there is no processing assigned to this channel. This
could possibly indicate faulty hardware that is generating spurious PI
requests.
Action: Have Field Service check the system to make sure that it is functioning
properly.
>,,<DB%NND>)
BUG.(CHK,PIDFLF,IPCF,HARD,<CREPID - Free PID list fouled up>,,<
Cause: An invalid PID number was passed to GETPID by CREPID. The value
passed was retrieved from PIDLST.
Action: If these BUGCHKs persist, change the BUGCHK to a BUGHLT and submit
an SPR. In the dump, look at the PIDLST and try to determine how it
was corrupted.
>)
BUG.(CHK,PIDOD1,IPCF,HARD,<MUTCHO - PID count overly decremented>,,<
Cause: When attempting to transfer ownership of a PID from one job to
another, MUTIL% discovered that the count of PIDs of the original
owner has gone negative.
>)
BUG.(CHK,PIDOD2,IPCF,HARD,<DELPID - Overly decremented pid count>,,<
Cause: When releasing a PID, DELPID discovered that the PID count for the
current job has gone negative.
>)
BUG.(HLT,PIITRP,SCHED,HARD,<Instruction trap while PI in progress or in scheduler>,<<LSTERR,LSTERR>,<LSTIPC,ERRPC>>,<
Cause: An error occurred, resulting in an illegal instruction trap. If a JSYS
was being executed by the monitor, the process would receive an error
return. However, in this case the error occurred while a hardware
interrupt (PI) was being processed, or while the monitor was executing
code that starts the scheduler cycle.
Action: Although it is possible for bad hardware to cause this BUGHLT, it is
usually bad software. If the hardware checks out OK, send in an SPR
along with a dump and indicate how this problem can be reproduced.
Data: LSTERR - Last error code, this may indicate where error was generated.
ERRPC - PC at which error was generated.
>)
BUG.(HLT,PINIC1,APRSRV,SOFT,<MAPIPG - Page table not in core>,,<
Cause: A routine has been called to map a page into a special address slot.
The requested page is not in legal range for physical memory. Look at
the stack and check the offending caller.
>)
BUG.(HLT,PIRACE,APRSRV,SOFT,<MAPIPG - Called with PI on>,,<
Cause: This routine uses a MMAP entry which may be used at PI level. To avoid
races the PI should be off when it is called. This particular caller
did not turn off the PI. Check the stack to find the caller.
>)
BUG.(HLT,PISKED,SCHED,HARD,<Entered scheduler with PI in progress>,,<
Cause: The monitor started to execute the main scheduler routine. The
hardware indicates that a hardware interrupt is being held. Since
hardware interrupts operate at a higher priority than the scheduler,
this should not happen.
Action: Field Service should check out the system carefully. This BUGHLT is
seen with bad hardware. If the hardware checks out OK, send in an SPR
along with a dump and indicate how this problem can be reproduced.
>)
BUG.(HLT,PITRAP,PAGEM,SOFT,<Pager trap while PI in progress>,<<T1,PFW>>,<
Cause: A page fault occurred while a hardware interrupt was in progress. This
can be the result of hardware failure or a software bug.
Action: If the page fail word indicates an AR or ARX parity error, the monitor
has printed an analysis of the problem on the CTY, and a SYSERR entry
is created when the monitor is rebooted. If it wasn't an AR/ARX
parity error, please submit an SPR along with the crash dump and any
other information on reproducing this bug.
Data: PFW - Page fault word.
>)
BUG.(HLT,PLKMOD,PAGEM,SOFT,<Page lock overly decremented>,,<
Cause: The monitor decremented the lock count of a page past zero. This
indicates a software problem.
>)
BUG.(HLT,PLKRPQ,PAGEM,SOFT,<Locked page being put on replaceable queue>,,<
Cause: ONRQ or OFRQ was called to put a page on the replaceable queue but
ONRQ1 has detected that the lock in CST1 for this page is not zero.
>)
BUG.(CHK,PM2SIO,PHYM2,HARD,<PHYM2 - Illegal function at start IO>,,<
Cause: The IORB function code provided to TM2SIO is less than or equal to
zero or the short form (PAGEM) request bit is set in the IORB.
Action: If this BUGCHK is reproducible, set it dumpable, and send in an SPR
with the dump and how to reproduce the problem.
>)
BUG.(HLT,PMVWMC,STG,HARD,<Wrong UCODE - PMOVE/M instructions not present>,,<
Cause: The KL microcode currently running does not have the PMOVE or PMOVEM
instructions.
Action: Install the correct KL10 microcode on the front end and reload the
system. Edit 442 or greater of the KL10 microcode is required. Be
sure to answer "YES" to the "RELOAD MICROCODE" prompt from KLI.
>)
BUG.(HLT,PPGOFN,PAGEM,SOFT,<SPHYPT - Destination is OFN>,<<T4,OFN>>,<
Cause: SPHYPG or SPHYPG has been given a destination argument which is an
OFN. This type of mapping may only be done into non-file page tables.
Data: OFN - The OFN
>)
BUG.(HLT,PRONX2,APRSRV,HARD,<NXM detected by processor>,,<
Cause: A page fault occurred indicating that the processor attempted to access
a memory that did not respond within a preset time. The monitor is
presently running in process context. The interrupt system is on.
Since non-existent memory also produces an APR interrupt, which results
in an APRNX1 BUGHLT, this BUGHLT does not normally occur.
Action: This is usually a hardware problem. See the action for APRNX1. Note,
however, that the occurrence of this BUGHLT instead of APRNX1 may
indicate a failure in the interrupt system.
>)
BUG.(HLT,PSBNIC,PAGUTL,SOFT,<SETPPG - PSB not in core>,,<
Cause: The monitor is establishing the context for running a process by making
its per-process area part of the monitor's map. It is about to copy
the SPT entry for the PSB into a special SPT slot but the PSB is not in
core.
>)
BUG.(CHK,PSINSK,SCHED,SOFT,<PSI From NOSKED or CRSKED context>,,<
Cause: A process is NOSKED or CSKED, but is not NOINT. This indicates a
monitor software error.
Action: If this BUGCHK can be reproduced, set it dumpable and submit an SPR
along with a dump and how to reproduce the problem.
>)
BUG.(HLT,PSISTK,SCHED,SOFT,<PSI Storage stack overflow>,,<
Cause: A software interrupt occurred while a process was running in the
monitor. The monitor is saving information regarding the state of the
process so that it can restore that state when the process dismisses
the interrupt. The BUGHLT indicates that the storage area has
overflowed.
>)
BUG.(HLT,PTAIC,PAGEM,SOFT,<SWPIN - PT page already in core>,,<
Cause: A routine has been called to swap a page into core. The id for the
page indicates that it is a data page. The BUGHLT occurred because the
entry in its page table contains a core address. This is a software
problem.
>)
BUG.(HLT,PTDEL,PAGUTL,SOFT,<DESPT - PT not deleted>,,<
Cause: The monitor is attempting to deassign a slot in the non-OFN part of the
SPT tables. It assumes that the slot was used as a page table. The
BUGHLT occurs because the SPT entry or its backup address is on disk.
The caller probably has used the wrong routine in releasing an OFN.
>)
BUG.(HLT,PTMPE,APRSRV,HARD,<Page table parity error>,,<
Cause: The monitor encountered multiple page table parity errors.
Action: This bug is caused by a hardware problem. Have Field Service
check out the system.
>)
BUG.(HLT,PTNIC1,APRSRV,SOFT,<SWPIN - Page table not in core>,,<
Cause: A routine has been called to map a page table into a special page used
only by the swapping routines. The caller is expected to provide an
identifier for a page that is in memory. When a page is in memory, the
page table that points to it must be in memory. The BUGHLT indicates
that the storage address for the page table is not a valid core page.
This can indicate that the page is not in memory or that its memory
address is larger than the physical memory on the machine. The most
likely cause is corruption of the monitor's data base.
>)
BUG.(HLT,PTNON0,PAGEM,SOFT,<SETPT0 - Previous contents NON-0>,,<
Cause: A routine has been called to change the map for a page of a process.
The caller is expected to have unmapped any previous contents of the
page. The BUGHLT indicates that the page table contains a non-zero
pointer for the page.
>)
BUG.(HLT,PTOVRN,PAGUTL,SOFT,<UPDPGS - Count too large>,,<
Cause: A routine has been called to update pages of a file on the disk to
which a specified index block (OFN) points. The caller provides a
starting page and a count. The BUGHLT occurs because the sum of the
two extends beyond the end of the index block.
>)
BUG.(CHK,PTPTE1,APRSRV,HARD,<Page table parity error>,<<UPTPFW,PFW>>,<
Cause: A page table entry has bad parity. The monitor clears the entry
and tries again. If it fails repeatedly, a PTMPE BUGHLT results.
Action: This bug is caused by a hardware problem. Have Field Service check out
the system.
Data: PFW - Page fail word
>,,<DB%NND>)
BUG.(HLT,PVTRP,APRSRV,HARD,<Proprietary violation trap>,,<
Cause: A page fault occurred indicating a proprietary violation while the
monitor was running in scheduler context. An instruction in a public
page attempted to reference a concealed page. Since TOPS-20 uses only
concealed mode, this BUGHLT should never happen.
Action: This bug is caused by a hardware problem. Have Field Service check out
the system.
>)
BUG.(HLT,PWRFL,APRSRV,HARD,<Fatal power failure>,,<
Cause: The monitor has been started at the power-fail recovery code and is
attempting to recover. However, the loss of power that preceded this
occurred too quickly to allow an orderly shutdown. Therefore the
monitor is reloaded. This BUGHLT is preceded by the messages:
"Attempting automatic restart..."
"PWRDWN .NE. -1, restarting..."
Action: No action required. The system should reload itself.
>)
BUG.(INF,PWRRES,APRSRV,HARD,<Power restart>,,<
Cause: The monitor was started at the power fail recovery code and is
attempting to recover. This BUGINF is preceded by:
"Attempting automatic restart..."
"Attempting to continue system"
This indicates that an orderly shutdown was accomplished before the
power fail so the system continues.
Action: No action required. The system attempts to restart itself.
>,,<DB%NND>)
BUG.(HLT,PYILUN,PHYSIO,HARD,<PHYSIO - Illegal unit number>,,<
Cause: The routine SETUDB was called to find the UDB and KDB pointers given
the CDB and unit number, but the unit number given was out of range.
Action: Field Service should check the system. It is unlikely that a software
problem could cause this BUGHLT.
>)
BUG.(INF,RCS3XF,LLMOP,SOFT,<LLMOP Transmit Failed>,<<T1,DLLERC>,<T2,CHANNEL>>,<
Cause: LLMOP was unable to transmit a forward data message.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
Data: DLLERC - The error code returned from the DLL
CHANNEL - The channel on which the failure occurred
>)
BUG.(CHK,RCSIFC,LLMOP,SOFT,<LLMOP RCSCBR called with invalid function code>,<<T1,FUNCODE>>,<
Cause: The LLMOP Remote Console Protocol Server Call Back Routine was
called by the Data Link Layer with an invalid callback function
code. This is a software bug.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
Data: FUNCODE - Function code
>)
BUG.(INF,RCSPIS,LLMOP,SOFT,<LLMOP Ethernet Periodic Identify-Self>,,<
Cause: This is a temporary debugging BUGINF. It is here to provide an
indication that the periodic Identify-Self transmission
is being performed.
>,,<DB%NND>)
BUG.(CHK,RCVNOE,JSYSA,SOFT,<RCVOK - No entry found in queue>,,<
Cause: The RCVOK JSYS has detected that the list of unprocessed GETOK requests
is empty, but the count of entries in the list is nonzero. This may
have happened if a GIVTMR BUGCHK has already been issued.
Action: Check the health of the system's access control program. If it is
healthy and this BUGCHK is reproducible, set this bug dumpable and
submit an SPR along with the dump and instructions on
reproducing the problem.
>)
BUG.(CHK,RCVTMR,JSYSA,SOFT,<RCVOK TIMEOUT - Ignoring access control job>,,<
Cause: The access control job did not do a RCVOK within the designated time
period. A GETOK request was pending.
Action: The access control job should be examined to see if its receiving
requests can be made faster. It is also possible that the ACJ was hung
processing something due to some other system malfunction (a disk going
offline for instance).
>,,<DB%NND>)
BUG.(HLT,RELBAD,FREE,SOFT,<RELFRE - Bad block being released>,<<CX,CALLER>,<A,POLHDR>,<B,LSTBLK>,<C,NXTBLK>>,<
Cause: This is a free space problem. The block being returned does not fit
into the free space. When blocks are returned to the free space pool,
there is a consistency check performed. The block is merged into
existing blocks that follow it in free space. This block overlaps
into existing free blocks. It cannot be merged.
Action: Looking at the stack shows the caller. It is possible that the
length of the current block is incorrect. It is equally likely that
the block(s) before this block (in free space) have had incorrect
lengths on return. Thus, the caller may not be the culprit.
Data: CALLER - Caller to this BUGHLT
POLHDR - Address of header of this pool
LSTBLK - Address of last block before this one
NXTBLK - Address of first block after this one
>)
BUG.(HLT,RELFRM,FREE,SOFT,<Illegal to deassign 0 free space>,,<
Cause: This is a free space problem. The calling routine is trying to release
a block of storage of zero length. It is illegal to free a block of
zero length.
Action: Look at the dump. Backing up the stack shows which routine made
the call to release the storage.
>)
BUG.(HLT,RELINC,FREE,SOFT,<RELFSP - Bad block being released>,<<Q1,POOLN>,<T4,CALRPC>,<P2,BLKADR>>,<
Cause: This is a free space problem. The block being returned does not fit
into the free space. When blocks are returned to the free space pool,
there is a consistency check performed. The block is merged into
existing blocks that follow it in free space. This block overlaps
into existing free blocks. It cannot be merged.
Action: Looking at the stack shows the caller. It is possible that the
length of the current block is incorrect. It is equally likely that
the block(s) before this block in free space have had incorrect
lengths on return. Thus, the caller may not be the culprit.
Data: POOLN - Pool number
CALRPC - PC of caller of RELFSP
BLKADR - Address of user block
>)
BUG.(CHK,RELINT,FREE,SOFT,<RELFRE called OKINT>,<<A,CALLER>>,<
Cause: This is a free space problem. The calling routine is trying to release
a swappable free space block while it is OKINT. This is dangerous since
it could get interrupted and lose the block. All free space actions
should occur while NOINT.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed. The dump shows the routine
which is calling OKINT. Make it be NOINT when it removes the address
of the block about to be released from the database. The routine
can be made OKINT when control is returned to it.
Data: CALLER - The address of the calling routine.
>)
BUG.(CHK,RELRNG,FREE,SOFT,<RELFRE - Block out of range>,<<B,BLOCK>,<C,POLHDR>,<A,POLLOW>,<D,POLHGH>>,<
Cause: This is a free space problem. The caller to the free space
routines is trying to return a block that was not given
out by the free space manager. The block is outside the
range of free space management.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed. By looking at the stack you
should be able to determine who called for the releasing
of the block.
Data: BLOCK - Address of block being released
POLHDR - Address of free storage header (e.g. ASGRES)
POLHGH - High address of free space pool
POLLOW - Low address of free space pool
>)
BUG.(CHK,RESBAD,FREE,SOFT,<Illegal address passed to RELRES>,<<T1,BADADR>,<T2,CALLER>>,<
Cause: This is a free space problem. The caller is trying to release some
resident free space. The address being specified is not a legal
resident free space address.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed. The dump indicates the
caller which is providing the illegal address. Find where the
caller gets the address and how that location gets modified.
Data: BADADR - the address given to the free space manager
CALLER - the PC when the free space manager was called
>)
BUG.(HLT,RESBAZ,FREE,SOFT,<RELRES - Free block returned more than once>,<<T1,BADADR>,<T2,CALLER>>,<
Cause: This is a free space problem. The caller is returning a block to
resident free space. The block being returned is already a released
block in the resident free space pool. Thus, the caller is either
returning the same block twice or has a completely random address which
is incorrect.
Action: The caller may or may not be the culprit. It is possible that some
other routine is picking up the wrong address and releasing it.
Data: BADADR - the address given to the free space manager
CALLER - the PC when the free space manager was called
>)
BUG.(CHK,RESBND,FREE,SOFT,<RELRES - Releasing space beyond end of resident free pool>,<<T1,BADADR>,<T2,CALLER>>,<
Cause: This is a free space problem. The caller is trying to release resident
free space. The address passed to RELRES is outside the range of the
resident free space pool.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
Data: BADADR - the address given to the free space manager
CALLER - the PC when the free space manager was called
>)
BUG.(HLT,RESCHK,FREE,SOFT,<RELRES - Resident free space was overwritten>,<<T1,BADADR>,<T2,CALLER>>,<
Cause: Resident free space has been overwritten.
Action: Look at the header of the free space segment; it contains the PC of the
assigner of the space. Try to figure out why more space was used
than was requested.
Data: BADADR - Address passed to RELRES
CALLER - PC when RELRES was called
>)
BUG.(CHK,RETSPK,IPIPIP,SOFT,<RETPKT - Internet buffer wrong size>,,<
Cause: The monitor has detected an input buffer that is not of the maximum
size required by this routine. This is probably a software problem.
Action: If this problem can be reproduced, please set this BUG dumpable and
submit an SPR along with the crash dump, an unrun monitor, and
instructions on how to reproduce the problem.
>)
BUG.(INF,REVLEV,PHYM78,SOFT,<TM78 Microcode is outdated>,<<T1,ACTUAL LEVELS>,<T2,ACTUAL LEVELS>,<T3,MINIMUM LEVELS>,<T4,MINIMUM LEVELS>>,<
Cause: The TM78 does not have a microcode version that is needed by this
monitor.
Action: Field Service must install the new microcode.
Data: ACTUAL LEVELS - The actual revision levels in the TM78
MINIMUM LEVELS - The levels that are required by this monitor
>,,<DB%NND>)
BUG.(CHK,RFILPF,APRSRV,HARD,<Refill error page fail>,,<
Cause: A page fault occurred indicating a refill error. This condition is
indicated by a "hard" failure code of 22 in the page fail word and
should occur only under KI-style paging. TOPS-20 does not use this
style of paging. The monitor retries the instruction.
Action: This bug is caused by a hardware problem. Have Field Service check out
the system.
>)
BUG.(HLT,RH2ICF,PHYH2,SOFT,<PHYRH2 - Invalid channel function>,,<
Cause: The routine CHSTRT was called to start I/O on the channel, but the
supplied arguments were illegal. Either no DATAO word was specified,
or the function code was zero.
>)
BUG.(HLT,RH2NXC,PHYH2,HARD,<Interrupt from non-existent RH20>,<<T1,RHCHAN>>,<
Cause: The monitor received a vectored interrupt from an RH20 channel, but
during system startup this RH20 did not exist in the configuration.
This is almost always caused by faulty hardware (i.e., two RH20s may be
interrupting the KL simultaneously).
Action: Contact Field Service and have them find out which RH20 is faulty.
Data: RHCHAN - RH20 that requested the interrupt that does not exist.
>)
BUG.(CHK,ROUATL,ROUTER,SOFT,<A routing message contains a start ID greater than we can handle>,,<
Cause: An adjacent node has sent a routing message with the start ID
that would cause indexing into the per adjacency vector past
the end of the vector.
>,RTRBA9)
BUG.(CHK,ROUAWS,ROUTER,SOFT,<Adjacency block in queue when state is unused>,,<
Cause: An adjacency block has been left in the queue of active adjacencies
but its state is unused.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>,RTREH3)
BUG.(CHK,ROUBCD,ROUTER,SOFT,<Bad Checksum detected when building routing msg>,,<
Cause: Somehow our internal reachability vector has been damaged since the
last rebuilding.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>)
BUG.(CHK,ROUBMB,ROUTER,SOFT,<Bad message block pointer>,,<
Cause: DNADLL has called RTRDLE with a function requiring a message
block, and the pointer supplied (in T3) is either 0 or out of
range.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed. Determine why DNADLL gave a
bogus pointer since it originally should have obtained it from
the monitor.
>,RTRDS9)
BUG.(CHK,ROUBMT,ROUTER,SOFT,<Bad message type received from the DLL>,,<
Cause: The DLL received a bad message from another node or incorrectly copied
a message into the message block.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>,DNFMSG)
BUG.(CHK,ROUBSN,ROUTER,SOFT,<Bad source node in message from NSP>,,<
Cause: We have received a message from NSP to send. However, the source node
address is not that of the local Router.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed. Check in LLINKS or SCLINK to
see how an invalid source node address occurred.
>,FREMSG)
BUG.(CHK,ROUBSZ,ROUTER,SOFT,<Router circuit block size was zero on a running circuit>,,<
Cause: The blocksize for a circuit is defaulted to RTRBSZ and updated with
information from nodes on the circuit to determine a new minimum
blocksize for the circuit. Somehow this ended up as zero.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>,UPDLOP)
BUG.(INF,ROUBTF,ROUTER,SOFT,<Bad Test message format>,,<
Cause: We received a hello message from a P3 node or a P4 endnode
that contained too many bytes of test data.
>,FREMSG)
BUG.(INF,ROUBTM,ROUTER,SOFT,<Bad Hello or Test message>,,<
Cause: We have received bad test data in a hello message.
>,FREMSG)
BUG.(INF,ROUCGV,ROUTER,SOFT,<Couldn't get memory for event arg block>,,<
Cause: DECnet has exhausted its free space.
Action: This BUG is informational and no action is required. However, you may
wish to investigate why there is no more DECnet free space.
>,RTN,<DB%NND>)
BUG.(CHK,ROUEHB,ROUTER,SOFT,<No Message Block for Event data>,,<
Cause: We are attempting to read data from an MB to report in an event but
the caller failed to supply a message address.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed. Check caller and see why it didn't
supply a message block address.
>,RTN)
BUG.(CHK,ROUEHM,ROUTER,SOFT,<No Message Block for Event data>,,<
Cause: We are attempting to read data from an MB to report in an event but
the caller failed to supply a message address.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed. Check caller and see why it didn't
supply a message block address.
>,RTN)
BUG.(CHK,ROUIFS,ROUTER,SOFT,<Router got through the forward routine without picking a route>,,<
Cause: RTRFWD got through its Forward process and either did
not pick up a route or failed to flag a message which was for the
local node or an unreachable message.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed. Analyze the dump and look for
corruption in the routing vector.
>,FREMSG)
BUG.(CHK,ROUILS,ROUTER,SOFT,<Illegal Circuit Specified in NSP msg>,,<
Cause: There was a request to send a message on a particular circuit, but
the circuit has never been initialized by the routing layer.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>,FREMSG)
BUG.(CHK,ROUNAV,ROUTER,SOFT,<An adjacency has no routing vector>,<<AJ,Adjacency block>>,<
Cause: A routing vector is built for each routing adjacency when the adjacency
block is created if the node is a router. The monitor has discovered
that a routing node has supplied a routing update but there is no
routing vector. This can happen if a node comes up as an endnode and
later changes to a router (DECrouter-2000s do this). The monitor
attempted to create a routing vector but none could be created,
probably due to a lack of freespace. The routing update is ignored.
Action: Reloading the system should clear up this problem. It may be that
there is not enough DECnet free space on the system. If this BUG
persists and is reproducible, set this BUG dumpable and submit an SPR
with the dump, a copy of MONITR.EXE, and details about the local DECnet
configuration and DECnet applications. Include any known method for
reproducing the problem and/or loading and state of the system at the
time the BUG was observed.
Data: AJ - Adjacency block
>,RTN)
BUG.(CHK,ROUNLN,ROUTER,SOFT,<Trying to return msg to non-local NSP>,,<
Cause: We have decided to return a message to the local NSP but the local
NSP was not the originator.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>,R2NCLE)
BUG.(CHK,ROUNSO,ROUTER,SOFT,<NSP sent out of range packet>,,<
Cause: There is a request to forward a packet to a node whose address is
outside the range of our routing vector. Either our NSP has given
a packet the monitor cannot forward or the monitor has received one
from the wire.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
If the source is local check to see how NSP could give a packet
whose destination node address is greater than RTRMXN. If the
source is remote then there is something wrong with its routing
database or algorithm.
>,FREMSG)
BUG.(INF,ROURCE,ROUTER,SOFT,<Bad NI Router list message format>,,<
Cause: The monitor has received a router hello message with more than 256
known 2-way adjacencies.
>,FREMSG)
BUG.(INF,ROURFN,ROUTER,SOFT,<Routing message received from non-routing node>,<<T1,ADDRESS>>,<
Cause: The monitor has received a routing message from a node the monitor
believes to be an endnode, so the monitor has no vector to store it in.
Action: Check the address of the node and then see if it thinks it is a
routing/non-routing node.
Data: ADDRESS - Address of node
>,FREMSG)
BUG.(CHK,ROURML,ROUTER,SOFT,<Stored routing message format error in RTRBAV>,<<T1,COUNT>>,<
Cause: The monitor has received a P3 routing message with a negative count
of nodes in it or no checksum or a P4 routing message with a negative
segment count.
Data: COUNT - Count or checksum
>,RTRBA9)
BUG.(INF,ROUTEU,ROUTER,SOFT,<Endnode upgraded to router>,<<T2,AREA>,<T3,NODE>>,<
Cause: A routing vector is built for each routing adjacency when the adjacency
block is created if the node is a router. The monitor has discovered
that a routing node has supplied a routing update but there is no
routing vector. A routing vector has been created for this node.
Data: AREA - DECnet area number of node sending routing update
NODE - DECnet node number of node sending routing update
>)
BUG.(CHK,ROUUER,ROUTER,SOFT,<Unexpected end of routing message>,,<
Cause: The number of bytes in the routing message did not correspond
to the length expected. This may be caused by reading too many
bytes out of the message without decrementing the count of bytes
read or caused by an improper routing message.
>,RTRBA9)
BUG.(CHK,ROUUET,ROUTER,SOFT,<Unknown event type in RTNEVT>,,<
Cause: The monitor supplied us with a bad event code.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed. Look for someone smashing T1 or
a problem with the EVENTS macro.
>,RTN)
BUG.(CHK,ROUUOC,ROUTER,SOFT,<Unable to obtain count of nodes in Phase IV message>,,<
Cause: The monitor has received a routing message that DNLENG indicates has
more bytes than has been read. When another read is attempted, DNLENG
indicates the count is exhausted.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>,RTRBA9)
BUG.(CHK,ROUXNZ,ROUTER,SOFT,<R2NCAL called with MB=0>,,<
Cause: Somehow MB was trashed in the forward process. It is unlikely
to get this far if RTRFWD received a bad MB.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed. Look for faulty code in the
forward process.
>,RTN)
BUG.(CHK,ROUZXT,ROUTER,SOFT,<Tried to free msg with MB=0>,,<
Cause: FREMSG called to free an MB but was given a zero pointer.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed. Check caller and see why no MB
address was supplied.
>,RTN)
BUG.(HLT,RP4FEX,PHYP4,SOFT,<PHYP4 - Illegal function>,,<
Cause: The routine RP4SIO was called to start I/O for a unit, but the function
code supplied in the IORB was out of range.
>)
BUG.(HLT,RP4IF2,PHYP4,SOFT,<PHYP4 - Illegal function at STKIO>,,<
Cause: The routine RP4STK was called to start stacked I/O for a unit, but the
function code supplied in the IORB was out of range.
>)
BUG.(HLT,RP4IFC,PHYP4,SOFT,<PHYP4 - Illegal function at CNV>,,<
Cause: The routine RP4CNV was called to return the cylinder associated with an
IORB. The routine checked the function in the IORB, and it was
illegal.
>)
BUG.(HLT,RP4ILF,PHYP4,HARD,<PHYP4 - Illegal function on interrupt>,,<
Cause: The routine RP4INT was called by the channel routine to handle a
non-attention interrupt. The function code for the IORB that I/O was
done for was either illegal, or else the function was one which did not
transfer data. Functions which do not transfer data should give an
attention interrupt.
Action: Field Service should check the condition of the RH20s and RP0x drives.
>)
BUG.(HLT,RP4LTF,PHYP4,SOFT,<PHYP4 - Failed to find TWQ entry at RP4LTM>,,<
Cause: The routine RP4LTM was called to find the entry on the transfer wait
queue that had the best latency. After searching the queue, no IORB
was found to return. This routine should only be called when the
transfer wait queue is nonempty.
>)
BUG.(HLT,RP4PNF,PHYP4,HARD,<PHYP4 - Disk physical parameters not found>,,<
Cause: The routine RP4INI was called to initialize a UDB for a disk. It
converted the hardware drive type into the internal drive type, and
then looked in the physical parameter table (DSKUTP) for that type, so
that the disk parameters could be obtained. The drive type could not
be found.
Action: There is a hardware problem which should be checked by Field Service.
Either a legal drive is reporting an illegal drive type, or there is an
illegal drive connected to the system.
>)
BUG.(CHK,RP4SSC,PHYP4,HARD,<PHYP4 - Stuck sector counter>,<<T1,CDBADR>,<T2,UDBAKA>>,<
Cause: During initialization of a disk unit in the routine RP4INI, the sector
counter for the disk was examined to see if it was changing as it
should. After the value was examined 100000 times, it had never varied.
Action: Field Service must be called to repair the disk.
Data: CDBADR - Channel address
UDBAKA - Unit address
>)
BUG.(HLT,RP4UNF,PHYP4,HARD,<PHYP4 - Unit type not found>,<<T1,DRVTYP>>,<
Cause: During initialization of a disk in the routine RP4INI, the hardware
drive type of the disk was read, and then the XTYPE table was searched
for the corresponding internal drive type. The search failed,
indicating the disk was of an unknown type.
Action: There is a hardware problem which should be checked by Field Service.
Either a legal drive is reporting an illegal drive type, or there is an
illegal drive connected to the system.
Data: DRVTYP - Drive type
>)
BUG.(HLT,RPGERR,PAGUTL,HARD,<BADCPG - Fatal error in resident page>,,<
Cause: A hardware error (AR/ARX parity error or MB parity error) was detected
when the monitor referenced a page in memory that contained part of the
resident monitor. The monitor has printed an analysis of the error on
the CTY, and a SYSERR entry will be created when the monitor is
rebooted.
Action: Field Service should check the system for a hardware problem.
>)
BUG.(HLT,RSMFAI,PAGUTL,HARD,<RESSMM - Failed to assign swap mon page>,,<
Cause: The monitor is trying to restore the swappable monitor from the
swapping space after a system crash. It is unable to assign a page in
the swapping space to which a monitor page was previously written. This
code is executed only if the monitor is manually started at location
EVLDGO. This is not a recommended procedure.
>)
BUG.(INF,SBSERF,APRSRV,SOFT,<SBSERR - Could not get error block>,,<
Cause: An APR interrupt occurred because a memory controller detected an error
in its own operation or in information received over the SBUS or from a
memory module. The monitor has determined that a MOS controller is
involved. Normally the monitor creates a block and records information
about the error for later retrieval by TGHA. However, no free space is
available so this information is lost.
Action: Some user on the system could be consuming a lot of the general pool
free space. Run SYSDPY and look at the RE display to check on the
general pool free space. Try and determine who is using the free
space. If it appears that there is insufficient free space, then
rebuild the monitor with a bigger general pool.
>,,<DB%NND>)
BUG.(HLT,SBXSE0,SYSERR,SOFT,<SYSERR called from SEC 0 with ext blk>,<<T4,PC>>,<
Cause: SEBCPY, QUESEB, or SEBCPY with unextended function call address was
performed when the SYSERR block was in extended free space. The
inconsistency must be fixed because it indicates that referencing the
data block may fail if performed by unextended instructions.
Data: PC - the PC of the caller to the SYSERR routine.
>)
BUG.(CHK,SCABAL,SCAMPI,SOFT,<SCA - Connection block already linked>,<<T1,NODE>,<T2,CID>,<T3,FLINK>,<T4,BLINK>>,<
Cause: SCA is linking a connection block onto a system block. However, the
connection block's pointers indicate that it is already linked to some
other block.
Action: If this bug is reproducible, change it to a BUGHLT and submit an SPR
with the dump and instructions for reproducing it.
Data: NODE - node number
CID - Connect ID
FLINK - Address of next connection block
BLINK - Address of previous connection block
>)
BUG.(CHK,SCABMT,SCAMPI,SOFT,<SCA - Bad message type from remote node>,<<T1,NODE>,<T2,CID>,<T3,OPCODE>>,<
Cause: A bad message type was found on range checking. This shouldn't happen
if the port and port driver are working correctly. The message is
thrown away.
Data: NODE - node number
CID - Connect ID
OPCODE - SCS op code received
>)
BUG.(CHK,SCABSF,SCAMPI,SOFT,<SCA - Buffer section full>,<>,<
Cause: SCA went to create more buffers and discovered that the section is
full. This is an indication that buffers are not being returned.
Action: If this bug is reproducible, change it to a BUGHLT and submit an SPR
with the dump and instructions for reproducing it.
>)
BUG.(CHK,SCACCD,SCAMPI,SOFT,<SCA - Can't cancel datagram buffer>,<<T1,NODE>,<T2,CID>,<T3,COUNT>>,<
Cause: A SYSAP has done the "cancel receive datagram" function of SCA, and the
port's queue did not contain as many buffers as the system believes it
should contain.
Action: If this bug is reproducible, change it to a BUGHLT and submit an SPR
with the dump and instructions for reproducing it.
Data: NODE - Node number
CID - Connect ID
COUNT - Number of buffers we couldn't get
>)
BUG.(HLT,SCACCI,SCAMPI,SOFT,<SCA - Cannot complete initialization>,,<
Cause: During the init SCA detected an error it could not recover from.
SCADIE is called by the location and the stack points out the faulty
phase of init.
>)
BUG.(HLT,SCACFO,SCAMPI,SOFT,<SCA - SC.CON received failure from SC.OUT>,,<
Cause: SC.CON created a new connection block and then called SC.OUT to check
its state. The call should never fail.
>)
BUG.(CHK,SCACGD,SCAMPI,SOFT,<SCA - Can't get datagram buffer when reaping>,<<T1,NODE>,<T2,CID>,<Q1,COUNT>>,<
Cause: When reaping a connection block, a buffer count indicates that datagram
buffers are queued to the port. However, the port's queue has been
emptied while these buffers were removed.
Action: If this bug is reproducible, change it to a BUGHLT and submit an SPR
with the dump and instructions for reproducing it.
Data: NODE - node number
CID - Connect ID
COUNT - number of buffers remaining to be dequeued
>)
BUG.(CHK,SCACGM,SCAMPI,SOFT,<SCA - Can't get message buffer when reaping>,<<T1,NODE>,<T2,CID>,<Q1,COUNT>>,<
Cause: While reaping a connection block, a receive credit indicates that
message buffers are queued to the port. However, the port's queue has
been emptied while these buffers were removed.
Action: If this bug is reproducible, change it to a BUGHLT and submit an SPR
with the dump and instructions for reproducing it.
Data: NODE - node number
CID - Connect ID
COUNT - number of buffers remaining to be dequeued
>)
BUG.(HLT,SCACLB,SCAMPI,SOFT,<SCA - Incoming connect_request on closed v.c.>,<<T2,NODE>>,<
Cause: SCAMPI received a connect_request and matched it to a listener. But
when SCAMPI tried to queue the connection block to the system block, it
found that the virtual circuit was closed. Since SCAMPI had checked
for that state earlier, and this is happening at interrupt level,
something unexpected has happened.
Data: NODE - Node number
>)
BUG.(CHK,SCACRB,SCAMPI,SOFT,<SCA - Can't reclaim buffers>,<<T1,NODE>,<T2,CID>,<P3,COUNT>>,<
Cause: Based on the return_credit field for this connection, SCAMPI is trying
to reclaim buffers from the port's queue. The queue is empty. This
reflects confusion about credit, since these buffers should have been
queued at some time in the past.
Action: If this bug is reproducible, change it to a BUGHLT and submit an SPR
with the dump and instructions for reproducing it.
Data: NODE - Node number
CID - Connect ID at this node
COUNT - Number of buffers we couldn't get
>)
BUG.(CHK,SCACSC,SCAMPI,SOFT,<SCA - Can't send credit request>,<<T1,NODE>,<T2,CID>,<T3,STATE>>,<
Cause: SCA wants to send a credit request, but the connection block already
has some other message pending. This reflects some sort of
inconsistency, since the state was "open", and the interlock word for
credit requests was 0.
Action: If this bug is reproducible, change it to a BUGHLT and submit an SPR
with the dump and instructions for reproducing it.
Data: NODE - Node number
CID - Connect ID
STATE - Block state
>)
BUG.(CHK,SCADCF,SCAMPI,SOFT,<SCA - Datagram buffer creation failure>,<<T1,ERROR>,<T2,COUNT>>,<
Cause: SCA detected that the level of buffers maintained was below minimum.
The attempt to create more datagram buffers failed. The error code is
in T1. Output is given as additional data.
Action: If this bug is reproducible, change it to a BUGHLT and submit an SPR
with the dump and instructions for reproducing it.
Data: ERROR - error code
COUNT - number of datagram buffers in SCA's pool
>)
BUG.(HLT,SCAEBD,SCAMPI,SOFT,<SCA - Error handling buffer deferral request>,<<T2,NODE>,<T3,CID>,<T1,ERROR>>,<
Cause: SCA was unable to create buffers when running in job 0. This should
never happen since job 0 can create pages as needed.
Data: NODE - Node number
CID - Connect ID
ERROR - error code
>)
BUG.(HLT,SCAFN2,SCAMPI,SOFT,<SCA - Can't complete deferred call to SC.DIS>,,<
Cause: A SYSAP called SCAMPI at SC.DIS when the connection block was locked.
The connection block is being unlocked, and the request is being
processed. SC.OUT has returned failure, indicating that this function
can't be performed for the current state. There is no way to return
that failure to the SYSAP, which believes that the disconnect has
proceeded normally. A system crash determines the cause.
Action: Submit an SPR with the dump and instructions for reproducing it.
It may be impossible to analyze this from the current data. If the SCA
ring buffer was enabled, try to determine the sequence of events. Look
at the current state of the connection block, and try to see why SC.OUT
failed.
>)
BUG.(HLT,SCAFN3,SCAMPI,SOFT,<SCA - Can't complete deferred call to SC.DRQ>,,<
Cause: PHYKLP called SCAMPI at SC.DRQ when the connection block was locked.
At the time, the incoming packet was legal for the current state of the
connection. Now it is not legal. This shouldn't happen, and it is
uncertain how to proceed. It is possible to close the virtual circuit
and continue, but there is a halt in order to analyze the protocol
confusion and fix the bug.
>)
BUG.(HLT,SCALFO,SCAMPI,SOFT,<SCA - SC.LIS received failure from SC.OUT>,,<
Cause: SC.LIS created a new connection block and then called SC.OUT to check
its state. The call should never fail.
>)
BUG.(CHK,SCAMCF,SCAMPI,SOFT,<SCA - Message buffer creation failure>,<<T1,ERROR>,<T2,COUNT>>,<
Cause: SCA detected that the level of buffers maintained was below minimum.
The attempt to create more message buffers failed.
Action: If this bug is reproducible, change it to a BUGHLT and submit an SPR
with the dump and instructions for reproducing it.
Data: ERROR - error code
COUNT - number of message buffers in SCA's pool
>)
BUG.(CHK,SCAMCR,SCAMPI,SOFT,<SCA - Message buffer count was incorrect>,<<T1,COUNT>,<T2,TOPQ>,<T3,BOTQ>,<T4,BUFNUM>>,<
Cause: There are no message buffers when the count indicated there are enough.
Action: If this bug is reproducible, change it to a BUGHLT and submit an SPR
with the dump and instructions for reproducing it.
Data: COUNT - count of buffers we believed we had
TOPQ - pointer to top of message free queue
BOTQ - pointer to bottom of message free queue
BUFNUM - number of buffers requested
>)
BUG.(HLT,SCANBL,SCAMPI,SOFT,<SCA - No buffer for online list>,,<
Cause: SC.ABF was called to get a buffer for the address list to be used to
call SYSAPs when a node comes online. Without this list no user is
ever told when a node comes online and hence cannot run.
>)
BUG.(CHK,SCANLF,SCAMPI,SOFT,<SCA - Notice table full>,,<
Cause: So many SYSAPs have requested notification of nodes that come on and go
off line that the table of notification addresses overflowed.
Action: If this bug is reproducible, change it to a BUGHLT and submit an SPR
with the dump and instructions for reproducing it.
>)
BUG.(CHK,SCANMB,SCAMPI,SOFT,<SCA - Can't return SCS control message buffer>,<<T1,NODE>>,<
Cause: A node went offline. The local node tried to retrieve two message
buffers from the port's queue but found the queue empty.
Action: If this bug is reproducible, change it to a BUGHLT and submit an SPR
with the dump and instructions for reproducing it.
Depending on timing this may happen legitimately. If it persists,
stock the port's message free queue more generously at system startup.
Data: NODE - Node number
>)
BUG.(CHK,SCANOC,SCAMPI,SOFT,<SCA - Received packet and connection block doesn't exist>,<<T1,NODE>,<T2,CID>,<T3,OPCODE>>,<
Cause: An incoming packet's destination CID doesn't match any connection
block. This may reflect disagreement with another node about the state
of a previously existing connection. The virtual circuit is
closed which corrects the problem.
Action: If this bug is reproducible, change it to a BUGHLT and submit an SPR
with the dump and instructions for reproducing it.
From looking at the dump, try to determine the events that led to it.
Use the SCA ring buffer if necessary.
Data: NODE - Node number
CID - Connect ID
OPCODE - Op code
>)
BUG.(HLT,SCANPT,SCAMPI,SOFT,<SCA - No page for CID table>,,<
Cause: SCA called PGRSKD for a page to put its data tables in. The call
failed. Nothing can be done without these tables.
>)
BUG.(HLT,SCANSB,SCAMPI,SOFT,<SCA - System block has gone away>,<<T2,NODE>>,<
Cause: SC.DEF found a system block marked as stuck for buffers, but the
address of the system block is 0.
Data: NODE - Node number
>)
BUG.(CHK,SCANSC,SCAMPI,SOFT,<SCA - Negative system count>,,<
Cause: SCA was notified of a system going offline and decremented the count of
systems currently online. In doing so, the count went negative.
Action: If this bug is reproducible, change it to a BUGHLT and submit an SPR
with the dump and instructions for reproducing it.
>)
BUG.(CHK,SCAOBI,SCAMPI,SOFT,<SCA - Online before initialization done>,,<
Cause: A node came online before the initialization of SCA was completed.
Action: If this bug is reproducible, change it to a BUGHLT and submit an SPR
with the dump and instructions for reproducing it.
>)
BUG.(HLT,SCAODI,SCAMPI,SOFT,<SCA - Overly decremented CI interlock>,,<
Cause: A CION was done when no previous CIOFF had occurred. This leads to an
overly-decremented lock.
Action: Send in an SPR along with instructions on reproducing the problem.
If you are unable to determine how this happened, turn on the ring
buffer tracing of interlocks. (Do this by setting RPITRN in RINGSW.)
It should be possible to pair each CION with a preceding CIOFF. Note
that these calls are invoked from several CI-related modules.
>)
BUG.(CHK,SCAOF2,SCAMPI,SOFT,<SCA - Offline twice for a node>,<<T1,NODE>>,<
Cause: SC.ERR was called when a system block was already flagged as offline.
While this won't cause an immediate problem, it does indicate internal
confusion and should be investigated.
Action: If this bug is reproducible, change it to a BUGHLT and submit an SPR
with the dump and instructions for reproducing it.
Data: NODE - Node number
>)
BUG.(CHK,SCAPER,SCAMPI,SOFT,<SCA - Protocol error>,<<T1,NODE>,<T2,CID>,<T3,OPCODE>,<T4,STATE>>,<
Cause: An incoming message violated the SCS protocol. This message is
illegal. The virtual circuit is closed to eliminate any
confusion.
Action: If this persists, change it to a BUGHLT, and submit an SPR along with
the dump and instructions on reproducing it.
Look at the dump and determine the sequence of events that led to it.
If necessary, use the SCA ring buffer.
Data: NODE - Node number
CID - Connect ID at this node
OPCODE - Op code of incoming packet
STATE - state of connection
>)
BUG.(CHK,SCARTO,SCAMPI,SOFT,<SCA - Reap timed out>,<<T2,NODE>,<T1,CID>,<T3,STATE>,<T4,COUNT>>,<
Cause: A block that is reapable cannot be reaped because either the count of
outstanding packets is non-zero or a debugging check has failed. After
several postponements, these were not corrected. The block is now
being deleted.
Action: If COUNT is non-zero, see if the CI-20 was reloaded recently. Buffers
can be lost legitimately when this happens.
If this bug is reproducible with COUNT zero, change it to a BUGHLT and
submit an SPR with the dump and instructions for reproducing it.
Data: NODE - Node number
CID - Connect ID at this node
STATE - Block state
COUNT - Contents of CBNPO (number of queued messages or datagrams)
>)
BUG.(INF,SCASBN,SCAMPI,SOFT,<SCA - Block state already non-zero>,<<T3,NODE>,<T4,CID>,<T2,OLDSTA>,<T1,NEWSTA>>,<
Cause: While trying to set a connection's block state, it is found to be
already non-zero. This can happen legitimately under some conditions.
Action: If this bug is reproducible, change it to a BUGHLT and submit an SPR
with the dump and instructions for reproducing it.
If the old state is anything except CREDIT_PEND, something is wrong.
Try to trace the events that led to this, using the SCA ring buffer if
necessary.
Data: NODE - Node number
CID - Connect ID
OLDSTA - existing block state
NEWSTA - state we're trying to set
>)
BUG.(HLT,SCASCQ,SCAMPI,SOFT,<SCA - Can't get connection management buffers>,<<T1,ERROR>>,<
Cause: SCA has been notified of a new system coming online. It tried to
allocate two buffers to be used for connection management, and failed.
This indicates that a large number of buffers have been allocated at
interrupt level, and the process that creates more hasn't run recently.
Action: It is possible to recover from this by deferring buffer allocation
to process context. Meanwhile, try to find out why buffers are being
used so rapidly, or why job 0 is not running.
Data: ERROR - error code from allocation routine
>)
BUG.(INF,SCATMO,SCAMPI,SOFT,<SCA - SCA timed out remote node>,<<T2,NODE>,<T1,TIME>>,<
Cause: SCA sent a message to another node, and did not receive a response
within a timeout period.
Action: This happens legitimately if a node crashes. If this timeout is
occurring for nodes that appear to be running, try to determine why
they are not communicating. If this bug is reproducible when the node
is running normally, change it to a BUGHLT and submit an SPR with the
dump and instructions for reproducing it.
Data: NODE - Node number
TIME - time since we sent timed message
>,,<DB%NND>)
BUG.(CHK,SCAUXR,SCAMPI,SOFT,<SCA - Unexpected response>,<<T1,NODE>,<T2,CID>,<T3,OPCODE>,<T4,EXPECT>>,<
Cause: A connection management response arrived for a particular connection,
but the op code is not the expected one.
Action: If this bug is reproducible, change it to a BUGHLT and submit an SPR
with the dump and instructions for reproducing it.
The virtual circuit is closed on the assumption that the other
node violated protocol. This may correct the confusion. If this error
persists, try to determine the events that led to it. If necessary,
use the SCA ring buffer.
Data: NODE - Node number
CID - Connect ID
OPCODE - Op code of incoming packet
EXPECT - Expected op code for this connection
>)
BUG.(CHK,SCBROK,SCAMPI,SOFT,<SC.BRK called while OKINT>,<<T1,PCC>>,<
Cause: This is a coding problem. SC.BRK is attempting to assign SCA
buffers to the caller and being OKINT leaves a possible window where
SCA buffers could be lost.
Action: If this bug is reproducible, change it to a BUGHLT and submit an SPR
with the dump and instructions for reproducing it.
Make the caller NOINT until it has returned the SCA buffers.
Data: PCC - PC of the calling routine
>)
BUG.(HLT,SCDUUO,SCHED,SOFT,<UUO in scheduler>,,<
Cause: An illegal instruction has been executed while in the scheduler's
context. Since the scheduler's PSB is only a prototype PSB and UPT,
allowing this MUUO to behave like others results in bizarre errors that
mask the original problem. This is probably a software problem. This
BUGHLT should be analyzed like an ILLUUO.
>)
BUG.(INF,SCLCBN,SCLINK,SOFT,<Phase-II buffering not implemented>,,<
Cause: Conservative buffering is not yet implemented. We should never have a
logical link open to a phase II node.
Action: If this bug is reproducible, set it dumpable and send in an SPR along
with how to reproduce the problem.
>)
BUG.(CHK,SCLNZE,SCLINK,SOFT,<Passing zero error code to SCMUUO>,,<
Cause: The error code passed to the routine that is supposed to store an error
code for the user is zero. This is an illegal value. To solve this, find
who called SCTNIE with T1/ 0 and correct the caller's behavior.
Action: If this bug is reproducible, set it dumpable and send in an SPR along
with how to reproduce the problem.
>)
BUG.(CHK,SCLRIB,SCLINK,SOFT,<Bad SCTRIB call from LLINKS>,<<T1,ADDR>>,<
Cause: LLINKS has called SCTRIB for permission to send a message to SCLINK and
has passed an invalid SLB address in T1. The data structures for this
logical link are inconsistent. Find out what is in LLINK's ELSCB and
why it's not an SLB pointer.
Action: If this bug is reproducible, set it dumpable and send in an SPR along
with how to reproduce the problem.
Data: ADDR - The bad SLB pointer
>,RTN)
BUG.(CHK,SCLSLB,SCLINK,SOFT,<SLB bad at FRESLB>,<<SL,SLBPTR>>,<
Cause: There is no Session Control Job Block (SJB) for this Session Control
Link Block (SLB). This error could have happened at any time during
the life of the link after it was actively transferring data.
Action: If this bug is reproducible, set it dumpable and send in an SPR along
with how to reproduce the problem.
Data: SLBPTR/ pointer to the SLB that lacked a SJB pointer
>,RTN)
BUG.(CHK,SCLSPF,SCLINK,SOFT,<SLB self pointers messed up in FNDSLB>,<<T1,CHAN>,<T2,SJBPTR>>,<
Cause: The DECnet data structures for this link are inconsistent.
Action: If this bug is reproducible, set it dumpable and send in an SPR along
with how to reproduce the problem.
Data: CHAN - The DECnet channel number
SJBPTR - Pointer to the SJB
>,RTN)
BUG.(CHK,SCLTFJ,SCLINK,SOFT,<Freeing SJB with SLB entries existing>,<<P1,SJBPTR>>,<
Cause: FRESJB was called to free up a SJB. However, there are still active
links in use for this SJB. This should never happen, and there is an
internal inconsistency in the DECnet data structures. Submit an SPR if
this happens more than once.
Action: If this bug is reproducible, set it dumpable and send in an SPR along
with how to reproduce the problem.
Data: SJBPTR - Pointer to the SJB
>)
BUG.(CHK,SCLTFS,SCLINK,SOFT,<Tried to free wrong SLB>,<<SL,SLBPTR>>,<
Cause: The channel table entry didn't point to the correct SLB. There is an
internal inconsistency in the DECnet data structures for this link.
Action: If this bug is reproducible, set it dumpable and send in an SPR along
with how to reproduce the problem.
Data: SLBPTR - Pointer to the bad SLB
>,RTN)
BUG.(CHK,SCLVAS,SCLINK,SOFT,<SCLINK - Couldn't get memory>,,<
Cause: SCLINK called ASGVAS to assign virtual address space for the node
name/address database. Since the requested memory is non-resident,
this should always succeed. However, ASGVAS gave a fail return.
Action: If this bug is reproducible, set it dumpable and send in an SPR along
with how to reproduce the problem.
>,RTN,<DB%NND>)
BUG.(HLT,SCPT01,PAGEM,SOFT,<SCNPT - Entry is not an immediate pointer>,,<
Cause: A routine has been called to release all pages to which a specified
page table points. The caller must ensure that all pointers are
immediate pointers to core with no disk backup. The BUGHLT indicates
that a pointer was not an immediate pointer.
>)
BUG.(HLT,SCPT02,PAGEM,HARD,<SCNPT - Page was not deleted>,,<
Cause: A routine has been called to release all pages to which a specified
page table points. The caller must ensure that all pointers are
immediate pointers to core with no disk backup. The BUGHLT indicates
that a page had backup on disk.
>)
BUG.(HLT,SCSA2M,SCSJSY,SOFT,<SCSJSY - Attempt to map second PSB>,<<MPSFRK,OWNFRK>,<T1,CURFRK>>,<
Cause: Some routine mapped a PSB but did not release it, or did not use the
correct interlock. The net result is that we are trying to map another
PSB while we still have the first one mapped.
Data: OWNFRK - The number of the fork that did the first map
CURFRK - The fork doing the second lock
>)
BUG.(CHK,SCSABF,SCSJSY,SOFT,<SCSJSY - Connection abort failure on fork delete>,<<T1,ERRCOD>>,<
Cause: During the deletion process for a fork we tried to abort the
connections it had open. We failed in the attempt.
Action: If this BUGCHK is reproducible, set it dumpable and send in an SPR
along with how to reproduce the problem.
Data: ERRCOD - Error code returned by SC.DIS
>)
BUG.(INF,SCSACF,SCSJSY,SOFT,<SCSJSY - Can't get resident space from ASGRES>,<<T1,ERRCOD>,<T2,CALLPC>>,<
Cause: A call to ASGRES (by the JSYS) has failed. With the error code and
caller's PC given by the BUGINF, figuring out why it failed should be
easy enough.
Action: If this BUGINF is reproducible, set it dumpable and send in an SPR
along with how to reproduce the problem.
Data: ERRCOD - Error code
CALLPC - PC of caller
>,,<DB%NND>)
BUG.(CHK,SCSBDE,SCSJSY,SOFT,<SCSJSY - Bad entry type found>,<<T2,TYPE>,<T1,BLKADR>>,<
Cause: An attempt was made to return a message buffer of an illegal type.
The buffer is now lost.
Action: If this BUGCHK is reproducible, set it dumpable and send in an SPR
along with how to reproduce the problem.
Data: TYPE - Message buffer type
BLKADR - Free space block address
>,PSINXT)
BUG.(CHK,SCSCDC,SCSJSY,SOFT,<SCSJSY - Cannot delete connect block from fork queue>,<<T1,ERRCOD>>,<
Cause: We tried to remove a connect block from the owning fork's list of
connect blocks. The most likely failure is a +1 return from SCSMPS.
This fails only when we map a PSB but do not unmap it.
Action: If this BUGCHK is reproducible, set it dumpable and send in an SPR
along with how to reproduce the problem.
Data: ERRCOD - Error code
>,R)
BUG.(CHK,SCSFR1,SCSJSY,SOFT,<SCSJSY - SCS% fork removing entries that do not belong to it>,<<T1,FRKNUM>,<T2,CURFRK>,<T3,ADDRESS>>,<
Cause: SCS% fork is removing entries that don't belong to it. It is assumed
that only the owning fork can manipulate SCS% in a CB or in its own
PSB.
Action: If this BUGCHK is reproducible, set it dumpable and send in an SPR
along with how to reproduce the problem.
Data: FRKNUM - Fork number to be checked
CURFRK - Current fork
ADDRESS - Address of calling routine.
>)
BUG.(CHK,SCSNOI,SCSJSY,SOFT,<SCSJSY - SCS% cannot receive node online/offline interrupts>,,<
Cause: SCA has told the JSYS SYSAP that there are too many SYSAPs and the JSYS
is not allowed to see online/offline interrupts. The system can run but
many diagnostics, or anything using the JSYS, fail.
Action: If this BUGCHK is reproducible, set it dumpable and send in an SPR
along with how to reproduce the problem.
>)
BUG.(CHK,SCSPBF,SCSJSY,SOFT,<SCSJSY - PSI block build failure>,<<T1,ERRCOD>>,<
Cause: The routine to build an event block failed. It is very likely that
ASGRES did not have the space available.
Data: ERRCOD - Error code returned by ONTBLD
>,ONTLOP)
BUG.(INF,SCSUBL,SCSJSY,SOFT,<SCSJSY - User buffer lost during error recovery>,<<T1,ERRCOD>,<T2,CURFRK>,<T3,BUFADR>>,<
Cause: Bad access to user memory or a failing routine caused SCS to try to
place the currently owned user buffer back on the buffer list. The
attempt failed and the buffer address has been lost. Note that there
is no memory loss, the monitor has just forgotten one user buffer
address.
Action: If this BUGINF is reproducible, set it dumpable and send in an SPR
along with how to reproduce the problem.
Data: ERRCOD - Error code
CURFRK - Current fork
BUFADR - Buffer address
>)
BUG.(CHK,SCTBWK,SCLINK,SOFT,<SCTNSF call from sched without lock>,,<
Cause: The DECnet entry point SCTNSF has been called from scheduler level when
the Session Control interlock was locked. All scheduler level routines
which call SCTNSF should first check SCTLOK. If SCTLOK is not -1, then
the caller should wait for the next scheduler cycle before calling
SCTNSF. Inspect the stack to find out who the offender is.
Action: If this bug is reproducible, set it dumpable and send in an SPR along
with how to reproduce the problem.
>)
BUG.(CHK,SEBINT,MEXEC,SOFT,<Unexpected interrupt in SYSERR process>,<<ITFPC,ITFPC>,<LSTERR,LSTERR>>,<
Cause: An unexpected error has occurred in the process that handles error
logging. The error handler attempts to reinitialize the context
and resume processing. The stack may be examined for an indication of
where the error occurred.
Action: If this BUGCHK can be reproduced, set it dumpable and submit an SPR
along with instructions on reproducing the problem.
Data: ITFPC - PC when error occurred.
LSTERR - Last error code in fork.
>)
BUG.(CHK,SEBISS,SYSERR,SOFT,<SEBCPY - Insufficient string storage in block>,,<
Cause: There is insufficient room in the SYSERR block for a string type data
item. The string is truncated to fit into the space available.
Action: If this BUGCHK can be reproduced, set it dumpable and submit an SPR
along with instructions on reproducing the problem.
>)
BUG.(CHK,SEBUDT,SYSERR,SOFT,<SEBCPY - Unknown data type>,<<T1,DATTYP>,<T4,EVENT>>,<
Cause: An unknown data type was supplied to SEBCPY to be copied into a SYSERR
block. Legal types are defined in the SBTTB table. This data type
entry is ignored.
Action: If this BUGCHK can be reproduced, set it dumpable and submit an SPR
along with instructions on reproducing the problem.
Data: DATTYP - Data type
EVENT - Event code
>)
BUG.(HLT,SECEX1,PAGEM,SOFT,<SETMPG - Attempt to map non-ex section>,,<
Cause: A routine has been called to modify a process's map for one or more
pages. A virtual address was provided. The caller is expected to
provide a valid address. The BUGHLT indicates that a section that does
not exist in the process's map was specified.
>)
BUG.(CHK,SERFOF,MEXEC,HARD,<Cannot OPENF error report file>,<<T1,ERRCOD>>,<
Cause: The CHKR fork could not open the ERROR.REPORT file.
Action: Based upon the error code returned from OPENF%, attempt to diagnose
the problem. If all appears to be in order and the BUG still
persists, make it dumpable and submit an SPR with the dump and a
copy of MONITR.EXE. If possible, include any known method for
reproducing the problem and/or the state of the system at the time
the BUG was observed.
Data: ERRCOD - OPENF error code
>,,<DB%NND>)
BUG.(HLT,SERFRK,SYSERR,HARD,<SERINI - Cannot create SYSERR fork>,,<
Cause: The cause of this BUGHLT is that Job 0 was unable to create a SYSERR
fork. The specific JSYS that fails is the CFORK JSYS, and the dump
should have the reason for the failure. Look at LSTERR to determine
the reason the CFORK failed.
The action needed to remedy this problem depends on the error returned
by CFORK. Look at that code and try to determine how to undo what it
is complaining about.
>)
BUG.(CHK,SERGOF,SYSERR,HARD,<SETOFI - Cannot GTJFN/OPEN SYSERR file>,,<
Cause: The SYSERR fork failed to open the SERR:ERROR.SYS file for output. The
SYSERR fork first attempts to get a JFN on a currently existing version
of the file. If that fails, it attempts to get a JFN for a new version
of the file. This BUG indicates that both attempts failed.
Action: There is a problem with writing the SERR:ERROR.SYS file. Make sure
that the file structure is in good shape, has enough space available,
and that the directory is in good shape. If there seems to be a lot of
disk problems, have Field Service check out the hardware.
>,,<DB%NND>)
BUG.(HLT,SHRNO0,PAGUTL,SOFT,<DESPT - Share count non-zero>,,<
Cause: The monitor is attempting to deassign a slot in the non-OFN part of the
SPT tables. The caller is expected to have ensured that the SPT slot
is no longer in use. The BUGHLT indicates that the share count for the
SPT slot is non-zero, indicating that some process is using the slot.
>)
BUG.(HLT,SHROFD,PAGUTL,SOFT,<DWNSHR - OFN share count underflow>,,<
Cause: A routine has been called to decrement the share count for an OFN. The
BUGHLT indicates that the count was already 0.
>)
BUG.(HLT,SHROFN,PAGUTL,SOFT,<UPSHR - OFN share count overflow>,,<
Cause: The share count for an OFN has been incremented beyond the maximum
value. It should not be possible for a user program to cause this.
>)
BUG.(HLT,SKDCL1,SCHED,SOFT,<Call to scheduler when already in scheduler>,,<
Cause: Code running in scheduler context has attempted to dismiss, block or
page fault thereby trying to enter scheduler context again. This might
result from an unexpected page fault or faulty logic, for example the
code doing the dismiss was not expected to be run in scheduler context.
This BUGHLT is CALLed from several places; examination of the stack
will indicate where the problem was discovered. This is
clearly a software problem.
>)
BUG.(HLT,SKDFKS,SCHED,SOFT,<Illegal scheduler action while fork context setup>,,<
Cause: The scheduler was about to perform an action that requires that no fork
context be set up. The monitor found that FORKX was non-negative, which
indicates that fork context was set up.
Action: Submit an SPR along with the dump and any information on reproducing
the problem.
To fix the problem, change the monitor to call DISMSJ before calling
the routine or move the call to a more appropriate place. CLK2 always
forces DISMSJ and is usually a good place for periodic actions.
>)
BUG.(HLT,SKDMPE,APRSRV,HARD,<MPE in scheduler or PI context>,,<
Cause: A page fault occurred indicating an AR or ARX parity error while the
monitor was processing an interrupt or running the scheduler. This
BUGHLT occurs regardless of whether the error is repeated when the
reference is retried or not. The monitor has printed a description of
the problem on the CTY. A SYSERR block has been created and is
placed in the SYSERR file when the monitor is rebooted.
Action: Have Field Service look at the memory causing the parity errors.
>)
BUG.(HLT,SKDPF1,APRSRV,SOFT,<Page fail in scheduler context>,<<T1,UPTPFW>,<T2,TRAPPC>,<T3,TRAPFL>>,<
Cause: A page fault occurred while the monitor was running in scheduler
context and the page fail word did not indicate a "hard" failure. This
is probably a software bug because the scheduler executes only resident
code. One cause of this failure is a reference to a piece of swappable
code or data that is not currently in memory.
Data: UPTPFW - Page fail word
TRAPPC - The PC of the instruction that caused the page fault
TRAPFL - The PC flags of the instruction that caused the page fault
>)
BUG.(HLT,SKDTRP,SCHED,SOFT,<Instruction trap while in scheduler>,<<KIMUPC,PC>,<LSTERR,LSTERR>,<LSTIPC,ERRPC>>,<
Cause: An error occurred, resulting in an illegal instruction trap. If a JSYS
is being executed by the monitor, the process normally receives an
error return when this happens. However, in this case the error
occurred in the scheduler, and there is no recovery.
Action: Although it is possible for bad hardware to cause this BUGHLT, it is
usually bad software. If the hardware checks out OK, send in an SPR
along with a dump and indicate how this problem can be reproduced.
Data: PC - PC of last MUUO, this may or may not be relevant
LSTERR - Last error code, this may indicate where error was generated
ERRPC - PC where ITRAP was called
>)
BUG.(CHK,SMGFUL,PAGEM,SOFT,<Can't swap multiple pages (drum is full)>,,<
Cause: The monitor is attempting to swap a group of core pages to the drum.
There is no space available. The general handling of drum assignments
should ensure that there are always a few pages available for
"critical" assignments such as this case. It is possible that some user
program could overtax the normal reserves and cause this failure.
Action: If this problem is seen often and no user program can be found to blame
for running out of swapping space, set this bug dumpable, get a dump
and send in an SPR describing how to reproduce the problem.
>)
BUG.(INF,SNPIC,JSYSA,SOFT,<SNPFN3 - Instruction being replaced has changed>,,<
Cause: The instruction being replaced by a SNOOP% breakpoint via SNOOP%
function .SNPIB is not the same instruction that was at that location
when the SNOOP% breakpoint was defined by function .SNPDB.
Action: No action is required. The new instruction is being replaced.
>)
BUG.(CHK,SNPLKF,JSYSA,SOFT,<SNPFN0 - Cannot lock down page into monitor>,,<
Cause: The .SNPLC function of the SNOOP JSYS was trying to lock pages from the
user address space into the monitor address space. It called the
SETIOP routine in PAGEM to do this, and SETIOP returned +1 indicating
failure.
Action: If this BUGCHK is reproducible, set this bug dumpable and submit an SPR
along with the dump and instructions on reproducing the problem.
>)
BUG.(CHK,SNPODB,JSYSA,SOFT,<SNPF4C - Count of inserted break points overly decremented>,,<
Cause: The .SNPRB function of the SNOOP JSYS was removing breakpoints, and the
number of breakpoints in the linked list was greater than the
breakpoint count.
Action: If this BUGCHK is reproducible, set this bug dumpable and submit an SPR
along with the dump and instructions on reproducing the problem.
>)
BUG.(CHK,SNPUNL,JSYSA,SOFT,<SNPF5A - Cannot unlock SNOOP page>,,<
Cause: The .SNPUL function of the SNOOP JSYS received a failure return from
the SETIOP routine in PAGEM while trying to unlock a page that was
locked with the .SNPLC function.
Action: If this BUGCHK is reproducible, set this bug dumpable and submit an SPR
along with the dump and instructions on reproducing the problem.
>)
BUG.(HLT,SPGNLK,PAGEM,SOFT,<SPHYPG - Page not locked>,<<T2,PAGE>>,<
Cause: SPHYPG or SPHYPT requires a locked physical page to map. The argument
given is either not a physical core page or is not locked. This is
usually a software problem.
Data: PAGE - Offending argument.
>)
BUG.(INF,SPRZR1,SYSERR,SOFT,<SEBCHK - SPRCNT went to zero>,,<
Cause: The SYSERR fork keeps a running count of the number of entries made on
the error file in SPRCNT. This count is continuous over system reloads
and crashes. This BUG indicates that the count has overflowed its one
word value.
Action: No action is required. The count has wrapped around, probably
legitimately, and is reset to zero.
>,,<DB%NND>)
BUG.(INF,SPRZRO,JSYSA,SOFT,<SETSPR - SPRCNT was set to zero>,,<
Cause: SMON% function .SFSPR (Set count of SPEAR entries output) was called
with a value of 0. This indicates that the monitor could not get the
running count of the number of SPEAR entries output from either the
dump file or ERROR.SYS.
SPRCNT is a cell which should contain the running number of SPEAR
entries made in the ERROR.SYS file over the life of the system.
>,,<DB%NND>)
BUG.(HLT,SPSCHF,PAGEM,SOFT,<SPSCH - Destination is file>,<<T1,ID>>,<
Cause: A file page identifier has been passed to SPSCH as the destination
page. The destination must be a memory page locked in core.
Data: ID - OFN.PN of offending identifier
>)
BUG.(HLT,SPTFL1,PAGUTL,SOFT,<SPT completely full>,,<
Cause: The monitor is attempting to assign to a process a slot in the non-OFN
part of the SPT tables. Normally a linked list points to the free
slots. The header is now 0, indicating either that there is confusion
in the list or there is no available slot. The monitor normally
protects against this event by refusing to assign additional SPT slots
when the available number falls below a fixed minimum. This BUGHLT
indicates a failure of this mechanism or corruption of the free list.
>)
BUG.(HLT,SPTFL2,PAGEM,SOFT,<SPT completely full>,,<
Cause: A routine has been called to change the map for a page of a process.
The page is being mapped to a file page that is not already shared.
The code is going to create an entry for the file page in the SPT so
that the destination can have a share pointer. The choice of a share
pointer over an indirect pointer was made because the count of
available SPT slots exceeded a threshold. The BUGHLT occurred because
the head of the queue of free SPT slots contains a zero, indicating
that there are no free slots. This means that there is an
inconsistency in the monitor's data.
>)
BUG.(HLT,SPTPIC,PAGEM,SOFT,<SWPIN - SPT page already in core>,,<
Cause: A routine has been called to swap a page into core. The id for
the page indicates that it is a page table. The BUGHLT occurred
because the SPT entry for that page table already contains a core
address.
>)
BUG.(HLT,SPTSHR,PAGUTL,SOFT,<UPSHR - SPT share count overflow>,,<
Cause: The share count for an SPT slot (not an OFN) has been incremented
beyond the maximum value. This can be caused by a pathological
program.
Action: If a user program cannot be found that is at fault, please send in an
SPR along with a dump and any information on reproducing the problem.
>)
BUG.(CHK,SPWRFL,APRSRV,HARD,<Spurious power fail indication>,,<
Cause: A power-fail indication was given and the monitor has executed its
sequence for an orderly power-down. The machine is still running after
a long delay, so the monitor has declared the power-fail warning to be
a mistake. The system restarts as if power had failed.
Action: No action is required; the system continues.
>,,<DB%NND>)
BUG.(CHK,SRQBAD,SCHED,SOFT,<SCDRQ - Bad call to SCDRQ7>,,<
Cause: SCDRQ7 was called with a function it does not know about. Fix the
caller or fix SCDRQ7 to know about this function.
Action: If this bug can be reproduced, set it dumpable and submit an SPR along
with a dump and how to reproduce the problem.
>)
BUG.(HLT,STRBAD,PAGUTL,SOFT,<ASOFN - Illegal structure number>,,<
Cause: A routine was called to assign an OFN (index block). The caller
provided a structure number that was invalid, either because that
number can never exist or because it does not exist now.
>)
BUG.(CHK,STRNIL,PHYSIO,SOFT,<UDBCHK - Illegal structure number in offline UDB>,<<T3,STRNUM>,<P3,UDB>>,<
Cause: Routine UDBCHK found that bit U1.SOF had been set in the UDB of a
structure that was associated with the PS structure or with no
structure at all. This should not occur.
Action: No action is necessary, but if this BUGCHK occurs repeatedly, the
conditions that led to this situation should be investigated.
Data: STRNUM - The illegal structure number found in the UDB
UDB - The illegal UDB
>)
BUG.(CHK,STROFF,MSTR,SOFT,<OFN on mounted structure but STRTAB entry is zero>,,<
Cause: The SPTH table has the N+1 through NOFN number of files on a particular
structure marked as being on a mounted structure, but the STRTAB entry
for this structure is zero. This BUGCHK can also happen if the
structure index passed to CKSTOF is out of the range 0 to STRN-1.
Action: The table should be corrected in a few seconds. However, there may
be a flurry of BUGCHKs, depending upon how many files are open on the
structure. If the BUGCHKs persist, check for improperly dismounted
or spun down drives.
>,,<DB%NND>)
BUG.(HLT,STRTER,MEXEC,SOFT,<Fatal error while processing previous startup error>,,<
Cause: When a software channel 34 or 35 interrupt happens on fork
0, the monitor transfers control to the routine specified in
MONBK. This address is often the starting address of
JB0INT. JB0INT handles errors in fork 0. While JB0INT is doing its
error recovery, it sets MONBK to J0EMER, so that this STRTER BUGHLT
occurs if another error happens during JB0INT execution.
>)
BUG.(CHK,STSFUL,MNETDV,SOFT,<Internet host status tables full>,,<
Cause: The monitor wanted to add a host to the internet host table but it
could not because the host status tables are full. This could happen
when the file SYSTEM:HOSTS.TXT is loaded, or by the monitor resolving
too many hosts from the DNS nameservers identified in the file
SYSTEM:INTERNET.NAMESERVERS. This BUGCHK appears only once in any
5 minutes to prevent too many of them from flooding the CTY.
Action: Rebuild the monitor with a higher value for NHOSTS. Other internet
host tables in the monitor are related to the value of NHOSTS, so
increasing NHOSTS will result in several larger tables. If the host
name table has been filled by a large number of hosts added by the DNS
resolver in the monitor, the host table can be reloaded with just
HOSTS.TXT information by using the IPHOST program's LOAD command.
>)
BUG.(HLT,STZERO,FILINI,SOFT,<FILINI - STRTAB entry for boot structure is 0>,,<
Cause: This happens if the code that is supposed to set up the STRTAB
entry for BS: was never executed. If this happens, some data has been
corrupted.
>)
BUG.(CHK,SUMNR1,SCHED,SOFT,<AJBALX - SUMBNR incorrect>,<<T3,SUMBNR>,<T4,CHECK>>,<
Cause: The value of SUMBNR has been found to be incorrect by AJBALX. The
correct value of SUMBNR has been computed and stored in SUMBNR. This
problem is difficult to diagnose.
Action: If this BUGCHK can be reproduced, change it to a BUGHLT and submit an
SPR along with a dump and how to reproduce the problem.
Data: SUMBNR - Sum of working sets in balance set
CHECK - Correct computed value of SUMBNR
>)
BUG.(CHK,SUMNR2,SCHED,SOFT,<WSMGR - SUMNR incorrect>,<<T3,SUMNR>,<T4,CHECK>>,<
Cause: The sum of reserve pages as stored in SUMNR was found to be incorrect
by routine WSMGR. The correct value has been computed and stored in
SUMNR. This problem is difficult to diagnose.
Action: If this BUGCHK can be reproduced, change it to a BUGHLT and submit an
SPR along with a dump and how to reproduce the problem.
Data: SUMNR - Current SUMNR value
CHECK - Correct computed SUMNR value
>)
BUG.(CHK,SWOFCT,PAGEM,SOFT,<OFN share count zero but OFN not cached>,,<
Cause: The monitor is attempting to swap an OFN and it has found that the
OFN share count is zero. When this happened, the OFN should have
been cached. However, it is not cached.
Action: If this BUGCHK can be reproduced, set this bug dumpable and send in
an SPR with a dump indicating how the problem can be reproduced.
>)
BUG.(CHK,SWPASF,DSKALC,SOFT,<CHKBAT - Failed to assign bad swapping address>,<<T3,STRNAM>,<CKBDRA,ADDR>>,<
Cause: A swapping address was not assigned due to an illegal address or
an already assigned address.
Data: STRNAM - Sixbit Structure Name
ADDR - Address to be Assigned
>,,<DB%NND>)
BUG.(CHK,SWPDIR,PAGEM,HARD,<Swap error in directory page>,<<T1,STRX>>,<
Cause: The monitor detected an error while swapping in a page with the same
OFN as the currently mapped directory. The directory is marked.
Action: There is a hardware problem developing. Field Service can run SPEAR
and check the SYSERR file to diagnose the problem. The additional data
is the structure number having the problem. The easiest way to
determine the structure name from the structure number is to count down
the structures listed in an INFORMATION AVAILABLE command, skipping
"DSK".
Data: STRX - Structure number
>,,<DB%NND>)
BUG.(CHK,SWPIBE,PAGEM,HARD,<Swap error in index block>,,<
Cause: A hardware error occurred while the monitor was reading or writing an
index block either from the file space or the swapping area. Future
attempts to read this block generate an error. Future attempts to
write it may produce the same BUGCHK. The page is marked in the
BAT blocks.
Action: This problem can continue unless corrective action is taken. Field
Service can determine if there is a drive or media problem by running
SPEAR to examine the SYSERR file. If there is a drive problem the
media (pack) may be ok. If there is a media (pack) problem the media
should be replaced.
>,,<DB%NND>)
BUG.(CHK,SWPJSB,PAGEM,HARD,<Swap error in JSB page>,,<
Cause: A hardware error occurred while the monitor was reading or writing a
page in a process's per-job area in the swapping space. Future
attempts to read this page generate an error. Future attempts to
write it may produce the same BUGCHK. The page is marked in the
BAT blocks.
Action: This problem can continue unless corrective action is taken. The
system reloads and continues to run while the problem gets
worse. Field Service can determine if there is a drive or media
problem by running SPEAR to examine the SYSERR file. If there is a
drive problem the media (pack) may be ok; if there is a media (pack)
problem the boot structure on this system should be replaced with a new
one.
>,,<DB%NND>)
BUG.(HLT,SWPMNE,PAGEM,HARD,<Swap error in swappable monitor>,,<
Cause: A hardware error occurred when the monitor was reading a page of the
swappable monitor from the swapping space. A SYSERR entry is
created when the monitor is rebooted, but the BAT blocks are not
marked.
Action: This problem can continue unless corrective action is taken. Field
Service can determine if there is a drive or media problem by running
SPEAR to examine the SYSERR file. The system reloads and
continues to run while the problem gets worse. If there is a drive
problem the media (pack) may be ok; if there is a media (pack) problem
the boot structure on this system should be replaced with a new one.
>)
BUG.(CHK,SWPPSB,PAGEM,HARD,<Swap error in PSB page>,,<
Cause: A hardware error occurred when the monitor was reading or writing a
page in a process's per-process area to or from the swapping space.
The monitor continues to run in an attempt to update the BAT
blocks, but crashes with a SWPxxx BUGHLT as soon as the disk has
been updated. If the monitor is unable to update the disk (in case
the page having the error is needed to update the BAT blocks), the
system stops with a DDMPNR BUGHLT, and the flag indicating that a
serious swap error exists is set.
Action: This problem can continue unless corrective action is taken. Field
Service can determine if there is a drive or media problem by running
SPEAR to examine the SYSERR file. If there is a drive problem the
media (pack) may be ok; if there is a media (pack) problem the boot
structure on this system should be replaced with a new one.
>,,<DB%NND>)
BUG.(CHK,SWPPT,PAGEM,HARD,<Swap error in unknown PT>,,<
Cause: A hardware error occurred when the monitor was reading or writing a
page table in the swapping space. The monitor is unable to identify
the page table.
The monitor continues to run in an attempt to update the BAT
blocks, but crashes with a SWPxxx BUGHLT as soon as the disk has
been updated. If the monitor is unable to update the disk (in case
the page having the error is needed to update the BAT blocks), the
system stops with a DDMPNR BUGHLT, and the flag indicating that a
serious swap error exists is set.
Action: There is a problem with the hardware. Field Service should run SPEAR
and determine if there is a drive or media (pack) problem. If there is
a drive problem the media may be OK; if there is a pack problem the
boot structure should be replaced.
>,,<DB%NND>)
BUG.(CHK,SWPPTP,PAGEM,HARD,<Swap error in unknown PT page>,,<
Cause: A hardware error occurred when the monitor was reading or writing
a page from the file system or swapping space. The monitor is
unable to identify the owning page table.
The monitor continues to run in an attempt to update the BAT
blocks, but crashes with a SWPxxx BUGHLT as soon as the disk
has been updated. If the monitor is unable to update the disk
(in case the page having the error is needed to update the BAT
blocks), the system stops with a DDMPNR BUGHLT, and the flag
indicating that a serious swap error exists is set.
Action: This problem can continue unless corrective action is taken. Field
Service can determine if there is a drive or media problem by running
SPEAR to examine the SYSERR file. If there is a drive problem the
media (pack) may be OK; if there is a media (pack) problem the boot
structure on this system should be replaced with a new one.
>,,<DB%NND>)
BUG.(CHK,SWPSTL,PAGUTL,SOFT,<Swap space too low at startup>,<<T1,SWPSIZ>,<T2,MEMSIZ>>,<
Cause: Insufficient swap space has been allocated for reasonable operation.
The swapping space should be at least 4 times the size of main
(MOS/core) memory.
Action: Rebuild the boot structure with more swapping space. Supply at least
four times the amount of main memory for swapping.
Data: SWPSIZ - Size of swapping space allocated
MEMSIZ - Total MOS/core size
>,,<DB%NND>)
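; Illustration only, not monitor code: the SWPSTL sizing rule (swapping space
; of at least 4 times main memory) can be sketched as a simple check. The
; function name, the factor parameter, and the sample sizes are hypothetical,
; and both sizes are assumed to be expressed in the same units.

```python
def swap_space_too_low(swpsiz, memsiz, factor=4):
    """Return True when the swapping space is below factor * main memory.

    swpsiz and memsiz mirror the SWPSIZ and MEMSIZ data items of the
    BUGCHK; they must be given in the same units (assumption).
    """
    return swpsiz < factor * memsiz


# Hypothetical sizes: 4096 units of memory need at least 16384 units of swap.
print(swap_space_too_low(10000, 4096))   # below 4 x memory
print(swap_space_too_low(16384, 4096))   # exactly 4 x memory
```

; With these sample values the first call reports too little swapping space
; and the second does not.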
BUG.(CHK,SWPUPT,PAGEM,HARD,<Swap error in UPT or PSB>,,<
Cause: A hardware error occurred when the monitor was reading or writing a
special page (PSB, JSB or user page table) in the swapping space.
The monitor continues to run in an attempt to update the BAT
blocks, but crashes with a SWPxxx BUGHLT as soon as the disk has
been updated.
If the monitor is unable to update the disk (in case the page having
the error is needed to update the BAT blocks), the system stops
with a DDMPNR BUGHLT, and the flag indicating that a serious swap error
exists is set.
Action: There is a problem with the hardware. Field Service should run SPEAR
and determine if there is a drive or media (pack) problem. If there is
a drive problem the media may be OK; if there is a pack problem the
boot structure should be replaced.
>,,<DB%NND>)
BUG.(HLT,SWPXXX,DSKALC,HARD,<Unrecoverable swap error for critical page>,,<
Cause: The monitor had a swap error for a PSB, PT, PTP, or UPT.
At the time of the error, a BUGCHK reported the problem
and allowed the system to continue long enough to record the
error in SYSERR and rewrite the BAT blocks.
Action: This has been known to happen on a drive that is beginning to have
hardware problems.
>)
BUG.(INF,SYENCD,SYSERR,SOFT,<SYSERR - Missing code for error type>,<<A,JOBNO>,<B,JOBPNM>>,<
Cause: The user forgot to supply a code type for the error entry. The entry
is not made since it causes problems in the error file.
Action: If this BUGINF can be reproduced, set it dumpable and submit an SPR
along with instructions on reproducing the problem.
Data: JOBNO - Job number, internal index
JOBPNM - Job program name
>)
BUG.(CHK,SYSERF,MEXEC,SOFT,<LOGSST - No SYSERR storage for restart entry>,,<
Cause: ALCSEB in LOGSST failed to allocate a SYSERR storage block.
Action: As a result, there is no restart reason entered in ERROR.SYS.
This is informational and no action is required.
>)
BUG.(INF,TCPJS1,TCPTCP,SOFT,<RETJCN: JCN out of range>,,<
Cause: RETJCN was called for a JCN that is out of range.
Action: If this BUGINF is reproducible, set it dumpable and submit an SPR with
the dump and instructions on reproducing the problem.
>)
BUG.(CHK,TCPJS3,TCPTCP,SOFT,<CHKJCN: TCB ownership confused>,,<
Cause: CHKJCN was called for a connection not owned by the calling job.
Action: If this BUGCHK is reproducible, set it dumpable and submit an SPR with
the dump and instructions on reproducing the problem.
>)
BUG.(CHK,TCPJS4,TCPTCP,SOFT,<ABTJCN: TCP Conn not owned by aborting job>,,<
Cause: ABTJCN was called for a connection not owned by the calling job.
Action: If this BUGCHK is reproducible, set it dumpable and submit an SPR with
the dump and instructions on reproducing the problem.
>)
BUG.(CHK,TCPMSX,TCPTCP,SOFT,<XFRDAT: Byte size incorrect>,,<
Cause: The TCP byte copying routine was called for other than 8-bit bytes.
Action: If this BUGCHK is reproducible, set it dumpable and submit an SPR with
the dump and instructions on reproducing the problem.
>)
BUG.(HLT,TCSOFN,PHYSIO,SOFT,<Transfer of cached OFN page>,,<
Cause: An attempt has been made to transfer a core page to disk. However,
this page belongs to a cached OFN and this should not happen.
Action: Submit an SPR along with the dump and any information that might be
helpful.
Trace the stack and locate the caller. All callers should be aware of
cached OFN pages and take appropriate action to ensure that these pages
are not transferred.
>)
BUG.(CHK,TM2CCI,PHYM2,HARD,<PHYM2 - TM02 SSC or SLA won't clear>,,<
Cause: 11 (octal) attempts to clear a TM02/3 SSC or SLA have failed.
Action: This is a hardware problem. Field Service should check out the TM02 or
TM03 controller.
>,,<DB%NND>)
BUG.(CHK,TM2HER,PHYM2,HARD,<TM2ERR - IS.HER set on successful retry>,,<
Cause: A retry operation has been completed successfully but bit IS.HER
indicating a hard error was set in the IORB. Error recovery should
not be done for hard errors.
Action: If this BUGCHK is reproducible, set it dumpable, and send in an SPR
with the dump and how to reproduce the problem.
>,,<DB%NND>)
BUG.(CHK,TM2IDM,PHYM2,HARD,<PHYM2 - Illegal data mode at Done interrupt>,<<T3,MODE>>,<
Cause: The TM02/3 IORB data mode was invalid or illegal when a done interrupt
occurred.
Action: There is probably a TM02/3 hardware problem that should be checked by
Field Service.
Data: MODE - TM02/3 data mode at done interrupt
>)
BUG.(INF,TM2IDX,PHYM2,HARD,<PHYM2 - Illegal retry byte pointer>,<<T1,RTYBPT>>,<
Cause: An error occurred during a TM02/3 operation but the retry type for
the function code is illegal.
Action: If this BUGINF is reproducible, set it dumpable, and send in an SPR
with the dump and how to reproduce the problem.
Data: RTYBPT - Retry byte pointer
>)
BUG.(CHK,TM2IF2,PHYM2,HARD,<PHYM2 - Illegal function on command done>,<<Q1,FNC>>,<
Cause: FTLCHK detected an illegal function code either in the IORB or UDBERR
at command done for a TM02/3 based tape drive.
Action: If this BUGCHK is reproducible, set it dumpable, and send in an SPR
with the dump and how to reproduce the problem.
Data: FNC - TM02/3 driver function code
>)
BUG.(INF,TM2IRF,PHYM2,HARD,<PHYM2 - Illegal function during retry>,<<T3,FNC>>,<
Cause: An illegal function code was encountered during a TM02/3 retry
operation.
Action: If this BUGINF is reproducible, set it dumpable, and send in an SPR
with the dump and how to reproduce the problem.
Data: FNC - Retry function code
>)
BUG.(INF,TM2N2S,PHYM2,HARD,<PHYM2 - More drives than table space, excess ignored>,,<
Cause: The number of tape drives on the system exceeds the value of MTAN.
All drives after MTAN are ignored.
Action: To accommodate more tape drives, the monitor must be rebuilt with a
larger value of MTAN.
>,,<DB%NND>)
BUG.(CHK,TM2NUD,PHYM2,HARD,<PHYM2 - Channel done interrupt but no unit active>,<<T1,CDBADR>,<T2,TM2ADR>>,<
Cause: A command done interrupt was issued by an RH20 channel but there was no
unit active on that channel. If an OVRDTA had previously occurred, and
the device finally responds, this BUGCHK happens. This usually
indicates a hardware failure.
Action: Field Service should check the devices on the channel listed in the
additional data; any channel/controller/unit listed in OVRDTA BUGCHKs
should be suspected.
Data: CDBADR - channel number
KDBADR - controller number
>,,<DB%NND>)
BUG.(CHK,TM2RFU,PHYM2,HARD,<PHYM2 - Error recovery confused>,<<T1,UNIT>,<Q1,CONT>,<T3,CHAN>>,<
Cause: The error recovery process has become confused. This could be caused by
a malfunction in the hardware.
Action: Field Service should check out the hardware. If the hardware checks
out, and this BUGCHK is reproducible, set it dumpable, and send in an
SPR with the dump and how to reproduce the problem.
Data: UNIT - Unit number
CONT - Controller number
CHAN - Channel number
>)
BUG.(INF,TM2UNA,PHYM2,HARD,<PHYM2 - Done interrupt and UDB not active>,<<T1,CDBADR>,<P3,UDBADR>>,<
Cause: The TM02/3 driver got a done interrupt from a unit, but did not believe
that the unit was active. If an OVRDTA had previously occurred, and the
device finally responds, this BUGINF will happen. This usually
indicates a hardware failure.
Action: Field Service should check the devices on the channel listed in the
additional data; any channel/controller/unit listed in OVRDTA BUGCHKs
should be suspected.
Data: CDBADR - CDB address
UDBADR - UDB address
>)
BUG.(INF,TM8AEI,PHYM78,HARD,<PHYM78 - Asynchronous error interrupt (TM78 hardware problem)>,<<T1,ICODE>,<T2,CHAN>,<T3,CONT>>,<
Cause: The TM78 gave an asynchronous error interrupt. This happens when the
TM78 thinks it has detected a hardware fault in the TM78 or one of its
drives.
Action: The TM78 has been cleared and restarted. The TM78 or one of its drives
may be getting flakey. Field Service must be called to check the TM78
out.
Data: ICODE - The interrupt code associated with this interrupt from
MASSBUS register 13 of the TM78
CHAN - Channel number of the TM78
CONT - Controller number of the TM78
>,,<DB%NND>)
BUG.(CHK,TM8AFU,PHYM78,SOFT,<PHYM78 - Active found up with no active units>,<<T1,CHAN>,<T2,CONT>>,<
Cause: This BUGCHK happens when the KS.ACT bit is on in the KDB but US.ACT
wasn't set in any UDBs during a periodic check. This is not expected
to happen. KS.ACT is cleared and normal tape operation is restored.
Action: The hardware may be flakey or broken. Field Service should check out
the TM78 hardware. If the hardware checks out OK, change this BUGCHK
to a BUGHLT and submit an SPR along with the dump and how to reproduce
the problem.
Data: CHAN - Channel number
CONT - Controller number
SAVUDB - Contents of SAVUDB
>,,<DB%NND>)
BUG.(CHK,TM8FKR,PHYM78,HARD,<PHYM78 - TM78 failed kontroller reset>,<<T1,STATUS>,<T2,CHAN>,<Q2,CONT>>,<
Cause: This BUGCHK happens when the monitor is attempting to reset a TM78
controller and the controller failed to become ready after a very long
wait. The first word of additional data shows the TM78 status
register. If any bits are set in this word, there is a serious TM78
hardware failure.
Action: The TM78 or other hardware may be flakey or broken. Field Service
should check out the TM78.
Data: STATUS - TM78 status (register 3)
CHAN - Channel number
CONT - Controller number
>,,<DB%NND>)
BUG.(CHK,TM8ISI,PHYM78,HARD,<PHYM78 - Illegal function at start IO>,,<
Cause: This BUGCHK can be caused by trying to write an odd number of words to
a tape drive that is set for the high density mode. If T2 is odd, this
is probably the cause.
Action: If this BUGCHK is reproducible, set it dumpable, and send in an SPR
with the dump and how to reproduce the problem.
>)
BUG.(INF,TM8N2S,PHYM78,HARD,<PHYM78 - More drives than table space, excess ignored>,,<
Cause: The number of tape drives available exceeds the constant value MTAN.
Only MTAN drives are configured.
Action: The monitor should be rebuilt with a value of MTAN large enough to
accommodate all the tape drives available to the system.
>,,<DB%NND>)
BUG.(CHK,TM8NUD,PHYM78,HARD,<PHYM78 - Channel done interrupt but no unit active>,<<T1,CHAN>,<T2,CONT>>,<
Cause: A command done interrupt was issued by an RH20 channel but there was no
unit active on that channel. If an OVRDTA had previously occurred, and
the device finally responds, this BUGCHK happens. This usually
indicates a hardware failure.
Action: Field Service should check the devices on the channel listed in the
additional data; any channel/controller/unit listed in OVRDTA BUGCHKs
should be suspected.
Data: CHAN - Channel number
CONT - Controller number
>,,<DB%NND>)
BUG.(INF,TM8REW,PHYM78,HARD,<PHYM78 - Spurious rewind started interrupt>,<<T2,CHAN>,<T3,CONT>,<T4,UNIT>>,<
Cause: The TM78 gave a spurious rewind started interrupt.
Action: The monitor has dismissed the interrupt. There may be TM78 or other
hardware problems. If this BUGINF is persistent, notify Field Service
and have them check out the TM78 and drive.
Data: CHAN - Channel number
CONT - Controller number
UNIT - Unit number
>,,<DB%NND>)
BUG.(CHK,TM8SNS,PHYM78,HARD,<PHYM78 - Can't sense drive status>,<<T1,CHAN>,<T2,CONT>,<T3,UNIT>>,<
Cause: Repeated attempts to sense a tape drive status have failed.
Action: There is a TM78 or tape drive hardware problem. Field Service should
check out the TM78. If one particular drive number is being called
out, that drive is suspect too.
Data: CHAN - Channel number
CONT - Controller number
UNIT - Unit number
>,,<DB%NND>)
BUG.(CHK,TRPSIE,SCHED,SOFT,<TRAPSI - No monitor for trapped fork>,,<
Cause: A fork executed a JSYS that was marked as trapped, but there is no fork
monitoring this JSYS trap. The JSYS trap will be ignored.
Action: If this BUGCHK can be reproduced, set it dumpable and submit an SPR
along with a dump and how to reproduce the problem.
>)
BUG.(HLT,TTBAD1,JSYSM,SOFT,<Bad device designator for terminal at ATACH2>,,<
Cause: The call to CHKDES failed. This should not happen, since the terminal
number involved comes from Q3, which is either the number of the terminal
controlling the job, or a user-supplied terminal number from the user's
AC4. If a user-supplied number is being used, it was range-checked by
comparing it to NLINES. If it is the number of the controlling
terminal, the job was already verified to be attached somewhere, so
this BUGHLT should not occur.
>)
BUG.(HLT,TTDAS1,SCHED,SOFT,<HLTJB - Unable to deassign controlling terminal>,,<
Cause: The monitor is killing the last (top) fork in a job and is trying to
deassign the job's controlling terminal. The attempt has failed for an
unexpected reason (one that will not be corrected if the fork waits a
while). This indicates inconsistency in the monitor's data base.
>)
BUG.(INF,TTFSMS,MEXEC,SOFT,<Failed to send system message>,<<LSTERR,LSTERR>>,<
Cause: The most likely reason for this failure is that RSX20F cannot
complete the previous TTMSG request. Typically this is caused by a
hung DH11 line, but could be a software bug as well.
Action: Look at the error code which explains the reason for the send
failure. Some error conditions (such as a remote CI node going
down) can cause this BUG to appear and are perfectly legitimate.
If this BUG persists and the last error appears to be something
suspicious, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
Data: LSTERR - Last TOPS-20 error code for this fork
>,,<DB%NND>)
BUG.(HLT,TTICN0,TTYSRV,SOFT,<GTTCI - No buffer pointer but count non-zero>,,<
Cause: At TCI0 (get a character from the line's input buffer) the pointer to
the dynamic data base for the line was 0. This could either be a
coding error or the resident table containing the pointers was
clobbered. Examination of the stack in the dump should indicate which
routine called TCI0 without the pointer.
>)
BUG.(CHK,TTILEC,TTYSRV,HARD,<TTSND - Unrecognized escape code>,<<2,TDB>,<3,TTY>>,<
Cause: An unrecognized function escape character was encountered in a TTY
output stream.
Action: If this BUGCHK is reproducible, set it dumpable and submit an SPR with
the dump and instructions on reproducing the problem.
Data: TDB - Terminal dynamic data block address
TTY - TTY line number
>)
BUG.(HLT,TTLOKB,TTYSRV,SOFT,<TTLCK - Bad TTY lock>,,<
Cause: The monitor tried to lock a TTY line and discovered the lock count was
overdecremented.
>)
BUG.(CHK,TTNAC1,FILMSC,HARD,<Line not active at PTYOPN>,,<
Cause: STADYN was called to get the address of the dynamic data block for
a TTY line that corresponds to PTY. This BUG means STADYN returned
+1, indicating that there is no dynamic data block assigned for
that line. This should never happen.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>)
BUG.(HLT,TTNAC3,DSKALC,HARD,<CTY not active at FSIPBO>,,<
Cause: The monitor tried to write to the CTY, but the CTY was not available to
output a character. The line is not active.
Action: Call Digital Field Service.
>)
BUG.(HLT,TTNAC4,DSKALC,HARD,<CTY not active at FSIPBI>,,<
Cause: The CTY was not available to read in a character.
Action: Call Digital Field Service.
>)
BUG.(HLT,TTNAC5,DSKALC,HARD,<CTY not active at FSIINI>,,<
Cause: While mounting the public structure, the monitor found it
had no CTY on which to output information.
Action: Call Digital Field Service.
>)
BUG.(CHK,TTNAC7,TTYSRV,SOFT,<Deallocating inactive line>,<<T2,TTYLIN>>,<
Cause: TTYDEA was called to deallocate a terminal's dynamic data block but the
line was inactive and had no block.
Action: If this BUGCHK is reproducible, set it dumpable and submit an SPR with
the dump and instructions on reproducing the problem.
Data: TTYLIN - TTY line number
>)
BUG.(HLT,TTNAC8,DEVICE,SOFT,<Cannot assign terminal at DEVINI>,,<
Cause: The monitor could not assign a terminal to a job because
1. It failed to get resident storage.
2. The line is not fully active; it is okay for system messages
and sendalls. Need a CNTRL/C on line.
3. Or a programming error.
>)
BUG.(HLT,TTONOB,TTYSRV,SOFT,<GTOCHR - No buffer but count non-zero>,,<
Cause: At TTSND7 (send a character to a line) the pointer to the line's data
base was 0. This is either a coding error or the resident table
containing the pointers was clobbered. Examination of the stack in the
dump should indicate which routine made the call without a pointer.
>)
BUG.(CHK,TTQADX,TTYSRV,SOFT,<TTYSRV - Unknown function requested>,<<T3,ADR>>,<
Cause: TTQAD has been called with a routine address that is not in its local
table of known routines. To diagnose this problem, look at the stack
to find the call to TTQAD. Then find the name of the routine being
passed and add it to the TQFNT table.
Action: If this BUGCHK is reproducible, set it dumpable and submit an SPR with
the dump and instructions on reproducing the problem.
Data: ADR - Address of bogus routine
>)
BUG.(CHK,TTULKB,TTYSRV,SOFT,<Bad TTY unlock in ULKTT>,,<
Cause: The monitor tried to unlock the TTY and it was already unlocked.
Action: If this BUGCHK is reproducible, set it dumpable and submit an SPR with
the dump and instructions on reproducing the problem.
>)
BUG.(CHK,TTYBBO,TTYSRV,HARD,<TTYSRV - Big buffer overflow>,,<
Cause: The buffer for incoming TTY characters was full. The character was
discarded and the line XOFFed.
Action: Some device connected to the system is filling the buffer for incoming
terminal characters. The device is probably not responding quickly
enough to XON/XOFF commands.
>,,<DB%NND>)
BUG.(CHK,TTYNTB,TTYSRV,SOFT,<Ran out of TTY buffers>,,<
Cause: TTGTBF was called to assign and set up TTY buffers but TTFREC indicates
that there are no buffers available.
Action: If this BUGCHK is reproducible, set it dumpable and submit an SPR with
the dump and instructions on reproducing the problem.
>,,<DB%NND>)
BUG.(INF,TTYSTP,RSXSRV,HARD,<Line has been shut off because of excessive input rate>,<<T2,LINE>>,<
Cause: A terminal line on the RSX20F console front end is generating input at an
excessive rate. It is being shut off for 5 seconds by having its input
speed set to zero. The high input rate can result from a noisy
terminal line which has a high input baud rate. If an EIA line, it may
be too long and so picks up electrical noise.
Action: The terminal line number specified in the additional data should be
checked. This problem can usually be prevented by eliminating the noise
or reducing the input speed. If all connections to the terminal line
are tight and the line speed has not been changed recently, the front
end hardware should be checked by Field Service.
Data: LINE - Line being shut off
>,,<DB%NND>)
BUG.(HLT,TVTNTV,TTYSRV,SOFT,<TVTCHK called with non-TVT>,,<
Cause: TVTCHK was called to determine the status of a TVT line, but the line
number provided by TCP is not a TVT line. TCP should never call TVTCHK
with a non-TVT line.
>)
BUG.(HLT,TWQNUL,PHYSIO,HARD,<PHYSIO - PWQ OR TWQ was null at a seek or transfer completion>,,<
Cause: When I/O completed on a unit, either OFFTWQ or OFFPWQ was called to
remove the current IORB from the position wait queue or the transfer
wait queue. The error occurred because the queue was empty.
Action: Field Service should check the system. It is unlikely that a software
problem could cause this BUGHLT.
>)
BUG.(HLT,UCHMOD,PAGUTL,SOFT,<OFN is modified while uncaching>,,<
Cause: An OFN is being uncached but the system believes that it has been
modified. This is bad because there should be no users of this cached
OFN.
>)
BUG.(CHK,UCLOFN,DISC,SOFT,<DSKREN - Attempt to uncache locked OFN>,,<
Cause: The RNAMF jsys has just renamed a file and then tried to uncache the
OFN. This is to prevent a situation where directories appear to
have the wrong number of pages in use. Somehow the file was locked
by another process. This should not be possible. If this persists,
please change this to a BUGHLT and submit an SPR with a dump.
>)
BUG.(HLT,UCXBNC,PAGUTL,SOFT,<Uncaching OFN not in core>,,<
Cause: An OFN is being uncached but the storage address is not a core address.
It should be since it is cached.
>)
BUG.(HLT,UIONIR,PHYSIO,HARD,<UDSKIO - No IORB for NOSKED fork>,,<
Cause: The routine UDSKIO was called to do special I/O for a fork, and to do
the I/O it uses one of a group of preallocated IORBs. There were no
free IORBS left, and the fork could not block because it was NOSKED.
Action: Field Service should check the system. It is unlikely that a software
problem could cause this BUGHLT.
>)
BUG.(CHK,ULKBAD,TTYSRV,SOFT,<Unlocking TTY when count is zero>,<<T2,TTYLIN>>,<
Cause: A call has been made to ULKTTY to unlock a terminal but the lock count
is already zero. This indicates a coding problem.
Action: If this BUGCHK is reproducible, set it dumpable and submit an SPR with
the dump and instructions on reproducing the problem.
Data: TTYLIN - TTY line number
>)
BUG.(CHK,ULKINT,FUTILI,SOFT,<Lock being unlocked while OKINT>,<<T1,LOCK>,<T2,CALLER>>,<
Cause: A routine is unlocking a lock while OKINT. This is dangerous since
allowing interrupts can cause the lock to be held indefinitely or
lock ownership to be lost. The process should have been NOINT when
it acquired the lock or a LOKINT BUGCHK would have resulted.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed. The dump shows which routine
is OKINT while attempting to get the lock. Make the routine go
NOINT for the duration of the lock being locked.
Data: LOCK - Lock index and flags
CALLER - Caller's address
>)
BUG.(CHK,ULKSTZ,FUTILI,SOFT,<Overly decremented structure lock>,,<
Cause: ULKST1 was called to unlock a structure but the lock count was
already zero.
Action: If this BUG persists, make it dumpable and submit an SPR with the
dump and a copy of MONITR.EXE. If possible, include any known
method for reproducing the problem and/or the state of the system
at the time the BUG was observed.
>)
BUG.(CHK,UNBFNF,SCHED,SOFT,<Fork not found>,,<
Cause: This BUG can be caused in one of three ways:
1. UNBLK1 was called to unblock a specific fork and the fork was
not blocked.
2. UNBLK1 was called to unblock a specific fork and the fork was
not on the wait list it claimed to be on.
3. RECONC was called to transfer a fork from one wait list to
another, but the fork was not on the list it claimed to be on.
Action: If this BUGCHK can be reproduced, set it dumpable and submit an SPR
along with a dump and how to reproduce the problem.
>)
BUG.(HLT,UNFWSS,PHYSIO,SOFT,<Unit not found creating SDB for structure>,,<
Cause: The routine SETSTR was called to build an SDB for a structure. One of
its arguments is the channel, controller, and unit numbers of a unit
which had already been known to exist. But when the routine CHKCKU was
called to find the UDB of the unit, the routine failed to find the
unit.
>)
BUG.(HLT,UNPGF1,APRSRV,HARD,<MEMPAR - Parity error during mem scan>,,<
Cause: A page fault occurred while the monitor was scanning memory looking for
an MB parity error. The monitor expects to cause such a fault when it
references the bad word in memory. However, the PC indicates that the
error occurred somewhere other than in the instruction that is expected
to fail. The monitor has printed a description of the problem on the
CTY. A SYSERR block has been created and will be placed in the SYSERR
file when the monitor is rebooted. If the memory scan has detected any
errors, the monitor has printed a description of them on the CTY, too.
Action: Have Field Service check the memory on this system.
>)
BUG.(HLT,UNPGF2,APRSRV,HARD,<Unknown page failure type>,,<
Cause: A page fault has occurred and the page fail word indicates a "hard"
error. The monitor has read the type of failure from the page fail
word and one of the following is true:
1. The hardware is never supposed to generate the code.
2. The code is valid, but the scheduler is running, and this code
should never be generated from scheduler context.
Action: This bug is usually caused by a hardware problem. Have Field Service
check out the system. If this BUGHLT is reproducible with healthy
hardware, submit an SPR along with the dump and instructions on
reproducing it.
>)
BUG.(CHK,UNPIRX,SCHED,SOFT,<UNPIR - No PSI in progress>,,<
Cause: UNPIR was called to leave PSI context but there was no PSI in progress.
Action: If this BUGCHK can be reproduced, set it dumpable and submit an SPR
along with a dump and how to reproduce the problem.
>)
BUG.(HLT,UNXMPE,APRSRV,HARD,<PFCDPE - Unexpected parity error trap>,<<T1,PFW>,<T2,PADR>>,<
Cause: The monitor was processing an AR or ARX parity error when a second
error occurred. The monitor retries the reference that caused the
original error and is prepared to handle a second error. However, the
BUGHLT indicates that the error occurred during the processing but not
during the retry.
Action: This is caused by a hardware problem. Field Service should check the
system.
Data: PFW - Page fail word
PADR - Address of page fail
>)
BUG.(HLT,UP2LNG,APRSRV,HARD,<The system has been up too long>,,<
Cause: The system has been up for more than 397 days, 16 hours, 22 minutes
and 18 seconds. At that point TODCLK has grown large enough to set
bit 0 (meaning TODCLK is now negative). This has the
unpleasant side effect of knocking all of the system timers out of
whack. Prior to this BUGHLT, the system's behavior will have been
quite strange. Many FLKTIM BUGCHKs may have appeared.
Action: Have Field Service perform a PM on your system at least once a year.
If this is not possible, then you should reload your system at least
once a year.
>)
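; Illustration only, not monitor code: the uptime limit quoted for UP2LNG can
; be reproduced arithmetically. This sketch ASSUMES that TODCLK counts
; milliseconds in a 36-bit word whose bit 0 is the sign bit; the entry does
; not state the unit, but the figure in the Cause text agrees with the result
; under that assumption. The function name is hypothetical.

```python
def todclk_limit():
    """Break 2**35 milliseconds into (days, hours, minutes, seconds).

    2**35 is the first count that sets bit 0 (the sign bit) of a
    36-bit word, assuming a millisecond tick.
    """
    ms = 2 ** 35
    seconds = ms // 1000                  # drop the fractional second
    days, rem = divmod(seconds, 86400)    # 86400 seconds per day
    hours, rem = divmod(rem, 3600)
    minutes, secs = divmod(rem, 60)
    return days, hours, minutes, secs


print(todclk_limit())  # (397, 16, 22, 18)
```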
BUG.(HLT,UPDCSH,PAGUTL,SOFT,<UPDOF0 - Update of cached OFN>,,<
Cause: A routine has been called to write an updated index block for a file
onto the disk. However, the OFN is cached and the index block should
have been updated when it got cached, not once it is cached.
>)
BUG.(INF,USGHOL,JSYSM,HARD,<Lost page(s) in usage file>,,<
Cause: This BUGINF indicates that the first free page in the USAGE file as
reported by FFFFP% is not the last page in the file. This means
that the file has holes in it, or lost pages.
Action: The USAGE file ACCOUNT:SYSTEM-DATA.BIN should be repaired or deleted.
>,,<DB%NND>)
BUG.(HLT,UXXCKP,JSYSM,SOFT,<Couldn't create checkpoint file>,,<
Cause: The file ACCOUNT:CHECKPOINT.BIN.1 could not be referenced for one
reason or another. The code first attempts a GTJFN (GJ%PHY, GJ%OLD)
and an OPENF (OF%RD, OF%WR, OF%RTD), one of which must fail for UXXCKP
to be a possibility.
After the above GTJFN or OPENF has failed, then a GTJFN (GJ%PHY,
GJ%NEW) is attempted. If this succeeds, then UXXCKP occurs if a
failure happens on one of the following: the subsequent OPENF
(OF%WR,OF%RD), the call to ASGSWP to allocate CKPSIZ words, or the SOUT
writing CKPSIZ words to the checkpoint file.
If the GTJFN (GJ%PHY, GJ%NEW) fails, then another GTJFN (GJ%PHY,
GJ%DEL) is attempted, and its failure causes UXXCKP. If this GTJFN
succeeds, however, then a CHFDB (turning off FB%DEL to undelete the
file) is done, and its failure also causes UXXCKP.
Action: Use EDDT to patch the system so that you can bring up the system
without the checkpoint file being referenced. This can be done by
putting a RET at USGINI and bringing the system up. Then get the
checkpoint file into a state such that none of the above failures
occurs.
>)
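; Illustration only, not monitor code: the UXXCKP conditions described above
; can be modeled as a small decision function. Every name below is
; hypothetical, and each argument simply says whether the corresponding step
; in the Cause text succeeded. The text does not say what happens after a
; successful CHFDB, so the sketch reports no UXXCKP in that case.

```python
def uxxckp_occurs(old_ok, new_gtjfn_ok, new_steps_ok, del_gtjfn_ok, chfdb_ok):
    """Return True when the described sequence ends in a UXXCKP BUGHLT.

    old_ok        - GTJFN (GJ%PHY, GJ%OLD) and OPENF both succeeded
    new_gtjfn_ok  - GTJFN (GJ%PHY, GJ%NEW) succeeded
    new_steps_ok  - the following OPENF, ASGSWP, and SOUT all succeeded
    del_gtjfn_ok  - GTJFN (GJ%PHY, GJ%DEL) succeeded
    chfdb_ok      - CHFDB (turning off FB%DEL) succeeded
    """
    if old_ok:
        return False                 # existing file opened; UXXCKP not possible
    if new_gtjfn_ok:
        return not new_steps_ok      # any failure while creating the file
    if not del_gtjfn_ok:
        return True                  # no old, new, or deleted file available
    return not chfdb_ok              # undelete failed


# Existing file missing, new file cannot be created, undelete works:
print(uxxckp_occurs(False, False, True, True, True))
```

; In the sample call the old file cannot be used and a new one cannot be
; created, but the deleted file is found and undeleted, so the sketch reports
; no UXXCKP.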
BUG.(CHK,UXXCL1,JSYSM,HARD,<Unable to create new usage file>,<<T1,ERRCOD>>,<
Cause: The USAGE file ACCOUNT:SYSTEM-DATA.BIN could not be created. This error
occurs if a JFN cannot be obtained on the file or if it cannot be
opened.
Data: ERRCOD - JSYS error code
>,,<DB%NND>)
BUG.(CHK,UXXCL2,JSYSM,HARD,<Unable to open new usage file>,<<T1,ERRCOD>>,<
Cause: This bug indicates that the USAGE file ACCOUNT:SYSTEM-DATA.BIN could
not be opened. This occurs if a JFN cannot be obtained on the file
or if the file cannot be opened.
Data: ERRCOD - JSYS error code
>,,<DB%NND>)
BUG.(CHK,UXXCL3,JSYSM,HARD,<Unable to close usage file>,<<T1,ERRCOD>>,<
Cause: This bug indicates that TOPS-20 could not CLOSF the USAGE file
ACCOUNT:SYSTEM-DATA.BIN. This bug is highly unlikely unless the JFN
has been lost.
Action: Look at the JSYS error code and figure out what could have happened.
Data: ERRCOD - JSYS error code
>,,<DB%NND>)
BUG.(HLT,UXXCRE,JSYSM,SOFT,<Cannot create usage file>,,<
Cause: If the GTJFN (GJ%PHY,GJ%OLD) or the OPENF (OF%RD, OF%WR, OF%RTD) on the
checkpoint file ACCOUNT:CHECKPOINT.BIN.1 fails, then another GTJFN
(GJ%FOU) and OPENF (OF%WR) is attempted in order to create a new
checkpoint file. If the second GTJFN and OPENF attempt fails, the UXXCRE
BUGHLT occurs.
Action: Analyze the error code from the failing JSYS, and use EDDT to
bring the system up without accounting and repair the problem.
This can be done by putting a RET at USGINI.
>)
BUG.(CHK,UXXFAI,JSYSM,HARD,<USAGE JSYS failure>,<<LSTERR,LSTERR>>,<
Cause: The monitor attempted to perform a USAGE% call to log either a
login, logout, or session entry and it failed. There is no
reasonable explanation for the failure of this JSYS call.
Action: Use the DOB% facility to produce a dump and submit an SPR.
Also, if you have a procedure for reproducing this problem
please include it with the SPR.
Data: LSTERR - error code from USAGE JSYS.
>)
BUG.(INF,UXXFIT,JSYSM,HARD,<Checkpoint file not in correct format for this system, rebuilding>,,<
Cause: The ACCOUNT:CHECKPOINT.BIN file is not in the correct format for this
monitor's configuration. This can occur if the value of NJOBS has
changed from the previous monitor or if the size of the checkpoint
records has changed. This BUGINF can be expected if the monitor
version has changed or a monitor with a different configuration has
been loaded.
Action: TOPS-20 will rebuild the checkpoint file; no further action is needed.
>,,<DB%NND>)
BUG.(HLT,UXXILL,JSYSM,SOFT,<USGMES - Illegal function code>,,<
Cause: The USAGE JSYS creates entries in the usage queue. Each
entry has a dispatch offset which is used by USGMES as an index into
a vector for calling the appropriate support routine.
If the dispatch offset is too large, this BUGHLT occurs. Since the
monitor itself is creating the entries in the queue, such a mismatch
should never occur.
>)
BUG.(HLT,UXXMAP,JSYSM,SOFT,<USGMAP - Call to JFNOFN failed>,,<
Cause: USGMAP wants to map a page of a file into FPG0 via SETMPG.
It calls JFNOFN to convert the JFN.PN atom to OFN.PN, which SETMPG
wants. If JFNOFN fails, this BUGHLT occurs. Some reasons that JFNOFN
can fail are: JFN is not associated with a disk file; JFN is not open;
attempt to create a new page table for a file that is not open for
writing; attempt to create a new page table for a directory file;
attempt to create a new page table for which there is no room on disk.
>)
BUG.(HLT,UXXOPN,JSYSM,SOFT,<Unable to open usage file>,,<
Cause: USGINI invoked OPENF (OF%RD, OF%WR, OF%RTD), which failed to open the
USAGE file ACCOUNT:SYSTEM-DATA.BIN.
Action: Use EDDT to bring the system up without accounting and repair the
problem. This can be done by putting a RET at USGINI.
>)
BUG.(CHK,UXXWER,JSYSM,HARD,<Write error in usage file>,<<T1,PAGE>>,<
Cause: A SOUT or UFPGS error occurred while trying to write to the USAGE file
ACCOUNT:SYSTEM-DATA.BIN. This indicates that there is a hard error in
the file.
Action: The USAGE file ACCOUNT:SYSTEM-DATA.BIN must be repaired or deleted.
Data: PAGE - Page number in USAGE file
>,,<DB%NND>)
BUG.(CHK,WRTBT4,DSKALC,SOFT,<ASOFN on bit table file failed>,<<T2,STRCOD>>,<
Cause: Could not assign an OFN for the structure bit table.
Data: STRCOD - Structure Unique Code
>)
BUG.(CHK,WRTCPB,DSKALC,HARD,<WRTBTB - Failed to backup ROOT-DIRECTORY>,<<T1,STRCOD>>,<
Cause: The bit table is being written. The backup root-directory or
symbol table may not have been written, or there may not be
enough free space on the pack.
Data: STRCOD - Structure Unique Code
>,,<DB%NND>)
BUG.(HLT,WRTLNG,DSKALC,SOFT,<WRTBTB - Bit table is a long file>,,<
Cause: The FDB for a file structure bit table has the FB%LNG bit set, which
says the file is a long file.
>)
BUG.(CHK,WSPNEG,PAGEM,HARD,<SOSWSP - WSP negative>,<<FX,FORK>,<T2,FKCSIZ>>,<
Cause: SOSWSP has been called to decrement the working set size of the
current fork by one. If this was done, the working set size would
become negative. This indicates a problem with the monitor's
calculation of the fork's working set size since it should never be
negative. The working set size has not been decremented.
Action: There is a problem in PAGEM or SCHED with working set size management.
If this problem can be reproduced, set this BUG dumpable and get a
dump to send in with an SPR.
Data: FORK - Fork number
FKCSIZ - Current size of working set (RH of FKWSP)
>)
BUG.(HLT,WSSPNA,PAGEM,SOFT,<WSSFKP - Fork special page bad age>,,<
Cause: The monitor is swapping out all pages of a process. It is trying to
swap out one of the special pages (JSB, PSB, etc.). The page should be
in core and locked, but it is not assigned (its age is less than
PSASN).
>)
BUG.(HLT,WSSPNC,PAGEM,SOFT,<WSSFKP - Fork special page not in core>,,<
Cause: The monitor is swapping out all pages of a process. It is trying to
swap out one of the special pages (JSB, PSB, etc.). The page should be
in core and locked, but it is not in core.
>)
BUG.(HLT,XBLTAL,APRSRV,SOFT,<XBLTA asked to copy too much>,<<T1,LENGTH>>,<
Cause: XBLTA was called with a 'length to BLT' of more than one section. It
is unlikely that the caller really intended to copy this much; this
usually indicates a software bug somewhere.
Data: LENGTH - Number of words XBLTA was asked to copy
>)
BUG.(CHK,XBWERR,PAGUTL,HARD,<UPDOFN - Disk write error on XB>,<<T1,STRX>>,<
Cause: UPDOFO was called to scan an index block and write the image to disk.
This BUG indicates that there has been a disk write error on the index
block for a file.
Action: Field Service should run SPEAR to check the SYSERR file for disk
problems. The additional data is the structure number having the
problem. The easiest way to determine the structure name from the
structure number is to count down the structures listed in an
INFORMATION AVAILABLE command, skipping "DSK".
Data: STRX - Structure number
>,,<DB%NND>)
BUG.(HLT,XSCORE,PAGUTL,SOFT,<CST too small for physical core present>,,<
Cause: A routine has been called to map a specified core page to a specified
virtual page. The BUGHLT indicates that the caller provided a page
number of a core page that does not exist. (The number is too large.)
This BUGHLT can occur if a monitor that is built for less than 256K is
booted on a machine whose memory exceeds 256K.
Action: If the monitor was built for less than 256K, and there is more than
256K of memory on the system, rebuild the monitor for the correct
amount of memory.
>)
BUG.(HLT,XTRAPT,DISC,SOFT,<NEWLFT - Extra page table in long file>,,<
Cause: The monitor is attempting to create a new file section in a long
file. This BUGHLT indicates that the page table slot in the super PT
already contains a pointer to a second level PT. This indicates a race
of some kind when a new page table is created.
>)