PDP-10 Archive: 6-1-documentation/tops20.tco from tops20_v6_1_tcpip_distribution_tp

Trailing-Edge - PDP-10 Archives - tops20_v6_1_tcpip_distribution_tp_ft6 - 6-1-documentation/tops20.tco

                               TCO-number:  6.1.1000



Written-by:  MCINTEE                          Creation-date:  24-Feb-83 10:03:57


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR


Problem:  

Diagnosis:  

Solution:  


                               [End of TCO 6.1.1000]

                               TCO-number:  6.1.1003



Written-by:  GUNN                             Creation-date:  20-Jul-83 12:21:32


Edit-checked:         No     Document:          Yes    TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  Yes


Program:  MONITOR
  Routines-affected:   	LLMOP	STG	MONSYM




Problem:  Digital Network Architecture (DNA) Phase IV requires
minimum subset Low Level Maintenance OPeration (LLMOP) support
for Ethernet.


Diagnosis:  Need to add code to TOPS-20 Monitor to implement
part of the LLMOP functions.


Solution:  Add new module LLMOP and code in various other
modules to implement Ethernet Loopback Protocol Server,
Remote Console Protocol Server, and LLMOP% JSYS as interface
to Ethernet Loopback Requestor and Remote Console Requestor.



                               [End of TCO 6.1.1003]

                               TCO-number:  6.1.1004



Written-by:  GLINDELL                         Creation-date:  28-Nov-83 11:15:44


Edit-checked:         No     Document:          Yes    TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	SCJSYS	MONSYM	STG	PROLOG	GLOB




Problem:  
There is a need for a way to set and read link parameters and quotas
for a logical link.

Diagnosis:  
Not needed before.

Solution:  
Add 4 new MTOPR functions: set/read link parameters, and set/read link
quotas.


                               [End of TCO 6.1.1004]

                               TCO-number:  6.1.1007



Written-by:  GLINDELL                         Creation-date:  12-Jun-84 16:37:36


Edit-checked:         No     Document:          Yes    TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	scjsys




Problem:  DECnet X25 object numbers do not have names defined
Diagnosis:  Not needed before
Solution:  Add names/object numbers X25GAT/31, X29SRV/34 and X25HST/36

                               [End of TCO 6.1.1007]

                               TCO-number:  6.1.1008



Written-by:  GLINDELL                         Creation-date:   3-Jul-84 13:46:57


Edit-checked:         No     Document:          Yes    TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	jntman




Problem:  
DECnet startup is slow when there are many nodes to define. Engineering
net currently has more than 2000 nodes defined.

Diagnosis:  

Solution:  
Add a new function to the NODE% jsys that allows a table of node names
and numbers to be inserted into the monitor. The SETNOD program will use
this function.


                               [End of TCO 6.1.1008]

                               TCO-number:  6.1.1009



Written-by:  PAETZOLD                         Creation-date:  18-Jul-84 19:14:10


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  monitor
  Routines-affected:   	PHYSIO	PHYH2	STG	pagutl




Problem:  

Low on address space.  However, we need to support 4 meg of memory.

Diagnosis:  

Need to move more stuff out of PC section.

Solution:  

Move CST5.


                               [End of TCO 6.1.1009]

                               TCO-number:  6.1.1010



Written-by:  PRATT                            Creation-date:  30-Jul-84 11:17:48


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	STG	GLOBS	IPNIDV	ANAUNV	IPIPIP	IMPDV
			IPFREE	MNETDV	MONSYM	PARAMS	TTYSRV	TTANDV
			TTPHDV




Problem:    	Can't transmit TCP/IP over Ethernet

Diagnosis:    	No code

Solution:    	Write the code

	In addition, changes are needed to TTYSRV, and TTANDV
	so that TTANDV can assemble independently of TTYSRV.


                               [End of TCO 6.1.1010]

                               TCO-number:  6.1.1011



Written-by:  GROSSMAN                         Creation-date:  17-Aug-84 20:22:49


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  Yes


Program:  MONITOR
  Routines-affected:   	MONSYM	STG




Problem:  Users can't do Ethernet functions directly.

Diagnosis:  No interface.

Solution:  Add the NI% JSYS.  The functions come later.


                               [End of TCO 6.1.1011]

                               TCO-number:  6.1.1021



Written-by:  GLINDELL                         Creation-date:  10-Oct-84 16:03:40


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	APRSRV	globs	JSYSA	LDINIT	MEXEC	PAGEM
			pagutl	PROLOG	STG




Problem:  Merge 6.1 address space changes.


                               [End of TCO 6.1.1021]

                               TCO-number:  6.1.1022



Written-by:  PRATT                            Creation-date:  12-Oct-84 06:49:46
Edited-by:   PRATT                            Edit-date:      25-Oct-84 13:51:36


Edit-checked:         No     Document:          Yes    TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	JSYSA	TTYSRV	LATSRV	TTANDV	TTPHDV	MONSYM




Problem:      

	There is no jsys which provides a means for finding out
	the originating node for a given job coming into the 20
	from a network.

Diagnosis:      	No code

Solution:      

	Add a new jsys call NTINF% which will be a generic network
	information jsys. The new standard jsys calling sequence 
	will be used for passing arguments.

	Given the terminal number, job #, or -1 (for self), function 
	.NWRRH will return the remote hostname for the job.


                               [End of TCO 6.1.1022]

                               TCO-number:  6.1.1024



Written-by:  PAETZOLD                         Creation-date:  17-Oct-84 20:34:25
Edited-by:   PAETZOLD                         Edit-date:      17-Oct-84 21:31:06


Edit-checked:         No     Document:          Yes    TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  monitor
  Routines-affected:   	MNETDV




Problem:      

No way to return list of Internet addresses for the system.

Diagnosis:      

Oversight.

Solution:      

Add function .GTHLA to the GTHST% JSYS.  Calling sequence (similar to 
conventions for this JSYS) is:

1/  .GTHLA
3/  Destination Address
4/  Count of items to return

A non-skip return indicates an error (ARGX24 is the only possible error);
a skip return indicates success, with T4 containing the count of items returned.
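
The return convention can be modelled in a few lines. This is a Python
sketch, not monitor code: the names gthla and ARGX24Error are hypothetical,
and the condition that produces ARGX24 is not stated in this TCO, so the
sketch assumes it is a non-positive count.

```python
class ARGX24Error(Exception):
    """Stands in for the non-skip (error) return; the name is hypothetical."""


def gthla(system_addresses, count):
    """Model of .GTHLA: return up to `count` addresses and how many were returned.

    Assumption: a non-positive count is what triggers ARGX24. The TCO only
    says that ARGX24 is the one possible error.
    """
    if count <= 0:
        raise ARGX24Error("invalid count")
    returned = list(system_addresses[:count])
    # The second value plays the role of T4 on the skip return.
    return returned, len(returned)
```

A caller that asks for more items than the system has simply gets a smaller
count back, which is why the count is returned separately.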


                               [End of TCO 6.1.1024]

                               TCO-number:  6.1.1026



Written-by:  PRATT                            Creation-date:  20-Oct-84 17:55:06
Edited-by:   PRATT                            Edit-date:      22-Oct-84 12:37:12


Edit-checked:         No     Document:          Yes    TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	COMND


Related-QAR:  	706037



Problem:    

	We have a way of retrieving the last command if it had an error
	but have no way to retrieve it if that command completed successfully.

Diagnosis:  	No code

Solution:    	Add the code which allows ^H to retrieve the last
		command without the confirmation character.


                               [End of TCO 6.1.1026]

                               TCO-number:  6.1.1030



Written-by:  PRATT                            Creation-date:   1-Nov-84 14:14:57


Edit-checked:         No     Document:          Yes    TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	ANAUNV	STG	GLOBS	MNETDV	IPIPIP




Problem:  	No way to run TCP/IP over the CI

Diagnosis:    	No code

Solution:  

	Create a new module called IPCIDV which interfaces 
	Multinet to SCA.

	Change MNETDV so that it accepts an IPCI device

  	Change ANAUNV to build an NCT for IPCI

	Change IPIPIP to call CIPSRV from the Internet fork

	Change STG to define storage for IPCIDV


                               [End of TCO 6.1.1030]

                               TCO-number:  6.1.1032



Written-by:  PRATT                            Creation-date:   5-Nov-84 16:30:19


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PROLOG	STG	GLOBS	TTYSRV	TTPHDV	TTANDV
			TTYDEF




Problem:    

	TTYSRV/TTYDEF can't be compiled in the normal M61: area
	for Arpanet monitors or any other monitor which would turn
	off LAT, or CTERM.

	Also, we still have lots of references to line types which
	don't exist, such as the DZ, DC, and RP line types.

Diagnosis:  

  	TTYDEF.MAC contains conditional assembly for Arpanet, LAT, CTERM
	and NRT. TTYSRV uses TTYDEF.UNV to find out which line types
	to assemble into the TDCALL's and other things. What results is a 
	TTYSRV.REL which may or may not have any one of the particular 
	line types turned on even though the monitor is built with the 
	corresponding device dependent code. BADTTY BUGHLTs usually result. 

	Further complications can occur when assembling the device
	dependent modules with an unknown TTYDEF.UNV.

Solution:    

        Remove traces of the DC, DZ, and RP line types and the 
        KCFLG conditional code. Save the code for historical reasons
        in a module called TTDZKC.
        
	Update the line type values in PROLOG

        Change the names of some local routines within TTYSRV which the 
	NRT and FE code references because they are already globally 
        defined elsewhere in the monitor.
        
        Remove the TDCALL macros from TTYDEF and rewrite them so they
        always assemble the device specific code. Dummy symbols will
        be defined within STG if the device specific code is not loaded.

	Move some storage that was defined within TTYSRV to STG

	Move the FE line code to a module called RSXSRV.MAC

	Move the NRT code to a module called NRTSRV.MAC

	Make TTANDV.MAC become TVTSRV.MAC

        Make a bunch of symbols global.

	Add the LOADMODULES for the new device specific modules

	Turn on the device specific code based on the global
	flags used for each network or device.


                               [End of TCO 6.1.1032]

                               TCO-number:  6.1.1033



Written-by:  GLINDELL                         Creation-date:   6-Nov-84 17:10:38
Edited-by:   GLINDELL                         Edit-date:       6-Nov-84 17:20:17


Edit-checked:         No     Document:          Yes    TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  Yes


Program:  MONITOR
  Routines-affected:   	sclink




Problem:    In a large network, there will be a lot of 'node online'
and 'node offline' messages. What the operator is probably interested
in seeing is if any nodes go offline that people have open links to.

Diagnosis:  
Solution:    Remove the 'node online' and 'node offline' code from GALAXY.
Make SCLINK generate 'link broken' messages.

When SCLINK discovers that a user link is broken, the node in question
will be added to an offline table. Every 10 seconds the table will be
checked, and an operator message will be generated if the table was
non-empty. The message format will be (approximately)

15:22:03	-- Message from monitor --

User links to the following DECnet nodes were broken:
KL2102	GIDNEY	CLOYD	

If there are more than 5 nodes, only the first 5 will be typed out followed
by 'and more'.

The operator will be able to suppress typeout of these messages by
	DISABLE OUTPUT-DISPLAY (OF) DECNET-LINK-MESSAGES
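
The offline table and its periodic check can be sketched as follows. This is
a Python model of the behaviour described in this TCO, not the SCLINK code:
the function names are hypothetical, the exact layout is only as approximate
as the sample message, and emptying the table after each check is an
assumption.

```python
MAX_NODES_SHOWN = 5  # from the TCO: only the first 5 nodes are typed out


def format_broken_links(offline_nodes, timestamp):
    """Build the operator message for a non-empty offline table.

    Returns None when the table is empty, since no message is generated.
    """
    if not offline_nodes:
        return None
    line = "\t".join(offline_nodes[:MAX_NODES_SHOWN])
    if len(offline_nodes) > MAX_NODES_SHOWN:
        line += "\tand more"
    return (
        f"{timestamp}\t-- Message from monitor --\n"
        "\n"
        "User links to the following DECnet nodes were broken:\n"
        f"{line}"
    )


def periodic_check(offline_table, timestamp, enabled=True):
    """One 10-second check: report the table (unless disabled), then empty it."""
    message = format_broken_links(offline_table, timestamp) if enabled else None
    offline_table.clear()  # assumption: the table is drained after each check
    return message
```

The enabled flag stands in for the operator's DISABLE OUTPUT-DISPLAY setting.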


                               [End of TCO 6.1.1033]

                               TCO-number:  6.1.1035



Written-by:  PALMIERI                         Creation-date:   7-Nov-84 15:20:44
Edited-by:   PALMIERI                         Edit-date:       7-Nov-84 16:11:28


Edit-checked:         No     Document:          Yes    TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	ROUTER	DNADLL	D36COM	SCLINK	JNTMAN	D36PAR




Problem:    Can't send buffers larger than 576 bytes on the Ethernet or the CI.

Diagnosis:    No code

Solution:   Add code to select a receive blocksize based on the circuit
type.  Create DECnet buffers that are as large as the largest receive
blocksize. The default size is 576 bytes and a larger or smaller
blocksize may be selected in CONFIG.CMD.  Provide a routine for the
session control to determine the largest blocksize that it can use on
transmit for a given logical link.  Large buffers can only be used to
adjacent nodes which support large blocksizes.  If large buffers are
in use over a circuit and the circuit fails, another path to the
adjacent node may be selected.  If the new circuit has a smaller
blocksize than the previous one, the link will be aborted.
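
The blocksize rules can be sketched as below. This is a Python model under
stated assumptions, not the ROUTER or SCLINK code: all function names are
hypothetical, and the per-circuit sizes in RECEIVE_BLOCKSIZE are made-up
placeholders, since the TCO gives only the 576-byte default.

```python
DEFAULT_BLOCKSIZE = 576  # bytes; the default stated in the TCO

# Hypothetical per-circuit-type receive blocksizes. Real values would come
# from CONFIG.CMD and are not given in the TCO.
RECEIVE_BLOCKSIZE = {"ethernet": 1476, "ci": 1476, "dte": 576}


def receive_blocksize(circuit_type):
    """Select a receive blocksize from the circuit type."""
    return RECEIVE_BLOCKSIZE.get(circuit_type, DEFAULT_BLOCKSIZE)


def buffer_size(circuit_types):
    """DECnet buffers are as large as the largest receive blocksize."""
    return max((receive_blocksize(c) for c in circuit_types),
               default=DEFAULT_BLOCKSIZE)


def transmit_blocksize(circuit_type, adjacent, supports_large):
    """Largest blocksize session control may use on a logical link.

    Large buffers are only used to adjacent nodes that support them.
    """
    if adjacent and supports_large:
        return receive_blocksize(circuit_type)
    return DEFAULT_BLOCKSIZE


def survives_reroute(old_blocksize, new_circuit_type):
    """False means the link is aborted: the new circuit's blocksize is smaller."""
    return receive_blocksize(new_circuit_type) >= old_blocksize
```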


                               [End of TCO 6.1.1035]

                               TCO-number:  6.1.1036



Written-by:  PAETZOLD                         Creation-date:   7-Nov-84 18:33:24


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PROLOG




Problem:    

XNENT and XRENT need to be able to define global symbols.

Diagnosis:    

They do not do this now.

Solution:    

Add an optional argument to the macro.  If it is set to a non null string
("G" is recommended) then make the symbol internal.


                               [End of TCO 6.1.1036]

                               TCO-number:  6.1.1037



Written-by:  PAETZOLD                         Creation-date:   9-Nov-84 11:13:28


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PROLOG




Problem:    

No easy way to call into msec1 from xcdsec.

Diagnosis:  

Solution:    

Add a new macro called callx which is just like xcall except that it does
not do an EA.ENT.


                               [End of TCO 6.1.1037]

                               TCO-number:  6.1.1038



Written-by:  GROSSMAN                         Creation-date:   9-Nov-84 13:02:56


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	STG




Problem:  Too much scheduler overhead

Diagnosis:  LV8CHK does a CALL R 50 times a second.

Solution:  Remove the CALL R in LV8CHK.  Actually, it is really a CALL NISCH.
Since PHYKNI does not need a scheduler level entry point, NISCH was redefined
to be R.


                               [End of TCO 6.1.1038]

                               TCO-number:  6.1.1039



Written-by:  PAETZOLD                         Creation-date:   9-Nov-84 18:17:12


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  monitor
  Routines-affected:   	SCHED




Problem:    

MRETN to monitor context from XCDSEC causes previous caller ACs to not be
restored correctly.

Diagnosis:    

MRETN is running in XCDSEC also (in this case).  Microcode problem causes
the BLT to restore ACs to fail.

Solution:    

Force monitor into section one at MRETN1.


                               [End of TCO 6.1.1039]

                               TCO-number:  6.1.1040



Written-by:  PRATT                            Creation-date:   9-Nov-84 18:30:50


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	CTHSRV




Problem:    

	Setting the pause/unpause characters on a system you are
	CTERM'd to can generate a confusing message from the cterm-server
	and can also mess up the echoing of those characters.

Diagnosis:    

	There were actually a few problems:

	The CTHPPC routine sends a message to tell the host to 
	change its local echoing for the pause/unpause characters.
	After entering the routine, by mistake, the code tries to save the 
	characters twice and unfortunately does it wrong both times.
	The 1st time it is saved in the wrong AC, and the 2nd time
	it picks up the characters out of a smashed AC.

	This produced weird characters which were sent to the 
	cterm-server program on the other system and exercised
	a bug there as well.

Solution:    

	Save the characters in the right AC before they are smashed.


                               [End of TCO 6.1.1040]

                               TCO-number:  6.1.1041



Written-by:  PAETZOLD                         Creation-date:  11-Nov-84 11:39:33


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	IPIPIP	IPFREE	IPNIDV	IPCIDV	TCPTCP	TCPCRC
			TCPBBN	TCPJFN	IMPANX	IMPDV	MNETDV	ANAUNV
			TVTSRV	STG




Problem:    

Address space.

Diagnosis:    

Not enough.

Solution:    

Move ARPANET code to XCDSEC.  Move almost all of it.  Some of the TCOPR%
JSYS code remains in MSEC1, TVTSRV remains in MSEC1 but calls into XCDSEC.
This frees up 37 pages in MSEC1.


                               [End of TCO 6.1.1041]

                               TCO-number:  6.1.1045



Written-by:  GROSSMAN                         Creation-date:  12-Nov-84 17:09:02
Edited-by:   GROSSMAN                         Edit-date:      13-Nov-84 01:27:51


Edit-checked:         No     Document:          Yes    TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	FORK	JSYSA	NIPAR	NISRV	MONSYM	NIUSR
			PHYKNI	




Problem:    No NI% JSYS code.

Diagnosis:    Code not written.

Solution:    Write the code.

Note that a new KNILDR is required.  Also note that the new ERRMES.BIN
should be put up.


                               [End of TCO 6.1.1045]

                               TCO-number:  6.1.1046



Written-by:  GROSSMAN                         Creation-date:  12-Nov-84 23:36:57


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	nisrv




Problem:  Closing of an NISRV portal occasionally results in a KNIBER BUGHLT.

Diagnosis:  Overly paranoid programmer.  KNIBER (KNI Bad Error Return) happens
whenever PHYKNI gives an unexpected error return to NISRV.

Solution:  There is no need for such paranoia; pass the error upwards and
let the caller deal with the problem (the problem is usually a memory
allocation failure).


                               [End of TCO 6.1.1046]

                               TCO-number:  6.1.1047



Written-by:  GROSSMAN                         Creation-date:  13-Nov-84 00:00:42


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	phykni




Problem:  Various and sundry fixes in preparation for NI% JSYS:

1) PHYKNI can now abort commands for a channel that isn't running.  This
   causes the command to be returned (via the callback mechanism) with the
   error code UNCAB%.  This means that a portal can now be closed even
   though the channel is dead.

2) A mousetrap has been put in to help track down the spurious KNISTP's
   that people have been seeing.  This will cause a KNICRS (Can't Read
   Station Info) BUGCHK to print out if PHYKNI is unable to queue up a
   command to the port to see if it's alive.

3) KNISTP was fixed so that it will really stop the KLNI.  This allows
   KNILDR to dump and reload it.  (Unfortunately, KNILDR has a bug which
   currently prevents this from happening).

4) Fix PXCT bug in FIXBSD when dealing with user mode addresses.

5) Make sure that Receive Failure Bit mask is 0 if Receive Failure count
   is 0.

6) Fix race in NISTP.


                               [End of TCO 6.1.1047]

                               TCO-number:  6.1.1048



Written-by:  GROSSMAN                         Creation-date:  13-Nov-84 00:15:06
Edited-by:   GROSSMAN                         Edit-date:      13-Nov-84 00:21:00


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	NIPAR




Problem:    KLNI state symbols in the monitor do not correspond with NI%
state symbols in MONSYM.

Symbols aren't global.

TABENT macro cannot deal with arguments that expand to multiple lines of
macro.

Diagnosis:  
Solution:    Make all UNS.xx symbols in the monitor correspond with .NISxx
symbols in MONSYM.

Make all definitions in NIPAR be global to avoid confusion if values should
happen to change.

Rewrite TABENT and friends.  Now you can generate a table with LOADs and
STORs as arguments.


                               [End of TCO 6.1.1048]

                               TCO-number:  6.1.1051



Written-by:  GROSSMAN                         Creation-date:  13-Nov-84 16:48:19
Edited-by:   GROSSMAN                         Edit-date:      13-Nov-84 16:51:15


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYKNI




Problem:    Programs using NISRV can get hung if KNILDR never completes reload
of the KLNI.

Diagnosis:  
Solution:    Time out KNILDR.  If KNILDR doesn't complete in 15. seconds, then
a KNIRTO BUGCHK will occur.  We will also put the port in the "Can't Reload"
state, and all portals will be informed that the KLNI is now OFF.
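
The timeout behaviour can be sketched as follows. This is a Python model,
not the PHYKNI code: the Port class, its state names, and the callback style
of portal notification are assumptions made for illustration. Only the
15-second limit, the KNIRTO BUGCHK, and the OFF notification come from the
TCO.

```python
RELOAD_TIMEOUT_SECONDS = 15  # from the TCO


class Port:
    """Hypothetical model of a KLNI port waiting on KNILDR."""

    def __init__(self, portals):
        self.state = "reloading"
        self.portals = portals  # callables told about state changes
        self.bugchks = []

    def check_reload(self, started_at, now, reload_done):
        """Poll the reload; time it out once the limit is reached."""
        if self.state != "reloading":
            return self.state  # already resolved; do not BUGCHK twice
        if reload_done:
            self.state = "running"
        elif now - started_at >= RELOAD_TIMEOUT_SECONDS:
            self.bugchks.append("KNIRTO")
            self.state = "cant_reload"
            for notify in self.portals:
                notify("OFF")  # every portal learns the KLNI is OFF
        return self.state
```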


                               [End of TCO 6.1.1051]

                               TCO-number:  6.1.1052



Written-by:  GROSSMAN                         Creation-date:  13-Nov-84 22:55:33


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	APRSRV




Problem:  EA.ENT much too slow.

Diagnosis:  The routine $EAENT was written in the stone age of extended
addressing.

Solution:  Take advantage of new inventions (like XJRST).  This considerably
simplifies switching from section 0 to section 1.


                               [End of TCO 6.1.1052]

                               TCO-number:  6.1.1054



Written-by:  GROSSMAN                         Creation-date:  15-Nov-84 14:06:27


Edit-checked:         No     Document:          Yes    TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	MONSYM	NIUSR




Problem:  NI% JSYS symbols conflict with other symbols in the monitor.

Solution:  Change prefix of Buffer Descriptor blocks from BD to BX.  Change
prefix of NI% JSYS argument block from NI to EI.

Change a spec.  Change 7 programs.


                               [End of TCO 6.1.1054]

                               TCO-number:  6.1.1055



Written-by:  MELOHN                           Creation-date:  16-Nov-84 19:26:05
Edited-by:   MELOHN                           Edit-date:      17-Nov-84 16:24:02


Edit-checked:         No     Document:          Yes    TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	JSYSA	SETSPD	GLOB	LATSRV




Problem:  Need SETSPD command and corresponding SMON% to set default LAT state 
at startup time.

Diagnosis:  No Code.

Solution:  Add new SMON function to set LAT initial startup state. Change SETSPD
to set this state (default is LAT ON). Users can set LAT state off, and then use
LCP commands (LATOP%) to set groups, IDs, etc, before turning LAT on. Most users
will ignore this command, and LAT will come up by default, with LAT group 0
enabled.


                               [End of TCO 6.1.1055]

                               TCO-number:  6.1.1056



Written-by:  PRATT                            Creation-date:  18-Nov-84 13:23:53
Edited-by:   PRATT                            Edit-date:      18-Nov-84 13:30:27


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	COMND


Related-QAR:  	706236



Problem:      

        COMND returns NPXAMB (ambiguous) when given a FDB list of
        .CMSWI followed by any other function if the user types
        "/<ESC>".
                
	COMND should just beep and continue parsing.

Diagnosis:      

        CMAMBT gets called when an escape is seen and there is no data
        in the atom buffer. It then checks for another FDB in the list
        and if it finds one, attempts to parse using the new one. Since a
        "/" was already typed the next FDB in the list will probably not
        be able to parse causing the error return.

Solution:      

	Check to see if we have a prefix character. If so, do not 
	try to parse the next FDB, just beep and continue trying 
	to parse this field.
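
	The corrected decision can be sketched as below. This is a Python
	model of the logic described in this TCO, not the COMND source; the
	function name and return values are hypothetical, and beeping when
	no further FDB exists is an assumption.

```python
def cmambt(prefix_typed, fdb_count, current):
    """Decide what to do when ESC is seen with an empty atom buffer.

    Returns ("beep", index) to keep parsing the same FDB, or
    ("try", index) to attempt the next FDB in the list.
    """
    if prefix_typed:
        # A prefix such as "/" was already typed, so the next FDB would
        # probably fail to parse. Beep and stay on this field.
        return ("beep", current)
    if current + 1 < fdb_count:
        return ("try", current + 1)
    # Assumption: with no further FDB to try, just beep.
    return ("beep", current)
```

	With a .CMSWI FDB followed by one other FDB, "/<ESC>" now gives
	("beep", 0) instead of falling through to the second FDB.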


                               [End of TCO 6.1.1056]

                               TCO-number:  6.1.1059



Written-by:  MELOHN                           Creation-date:  19-Nov-84 15:26:59


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	TTYSRV	CTHSRV




Problem:  Detaching a job from a CTERM terminal leaves the terminal in a 
weird state, where only the escape sequence and control C do anything.

Diagnosis:  The TDCALL for detaching a CTERM terminal does a front end request
which doesn't do much for a non-front end terminal. 

Solution:  Change the TDCALL to work just like NRT and LAT terminals, which 
don't do FE requests.


                               [End of TCO 6.1.1059]

                               TCO-number:  6.1.1061



Written-by:  GROSSMAN                         Creation-date:  20-Nov-84 00:57:44


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYKNI




Problem:  DECnet and LAT do not survive KLNI reloads.

Diagnosis:  State mappings in PHYKNI are incorrect.

Solution:  Rewrite SETSTA.  Make it table driven.


                               [End of TCO 6.1.1061]

                               TCO-number:  6.1.1062



Written-by:  HAUDEL                           Creation-date:  21-Nov-84 08:01:49


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PAGUTL




Problem:  Monitor code moved to extended sections and the "keep" 
bit is not set for machines that have the MCA25.

Diagnosis:  No code to do so.

Solution:  Add code. The "keep" bit will now be set for RSCOD, RSDAT,
RSVAR, XRCOD, and XRVAR. The CSTs will also have the "keep" bit set.



                               [End of TCO 6.1.1062]

                               TCO-number:  6.1.1064



Written-by:  GROSSMAN                         Creation-date:  28-Nov-84 13:44:13


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	NISRV




Problem:  NISRV is dependent upon D36COM.

Diagnosis:  NISRV uses D36COM's memory manager.

Solution:  Use the memory manager in FREE (ie: ASGRES/RELRES).


                               [End of TCO 6.1.1064]

                               TCO-number:  6.1.1067



Written-by:  GROSSMAN                         Creation-date:   3-Dec-84 13:40:02


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	NITEST




Problem:  NITEST uses DNGWDS.

Solution:  Make it use ASGRES.


                               [End of TCO 6.1.1067]

                               TCO-number:  6.1.1072



Written-by:  GROSSMAN                         Creation-date:   5-Dec-84 00:33:22


Edit-checked:         No     Document:          Yes    TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  Yes


Program:  MONITOR
  Routines-affected:   	NISRV	PHYKNI




Problem:  1) Unused code in NISRV.
2) Error responses not handled appropriately.

Solution:  1) Remove unused code.

2) Dispatch on various error types instead of using catchall KNISCE.  We
   now have:

	KNIBLV (Halt) - Buffer length violation
	KNIIEC (Halt) - Illegal error code
	KNICCF (Chk)  - Carrier check failed
	KNICDF (Chk)  - Collision detect check failed
	KNIFTL (Chk)  - Frame too long
	KNIRFD (Chk)  - Remote failure to defer
	KNIFTS (Halt) - Frame too short
	KNIDOV (Chk)  - NIA buffer overrun


Some of these errors are also passed up to the user in the form of an NISRV
error code.  Some errors that used to be reported via KNISCE now just get
passed up to the user (such as Queue length violations).
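
Dispatching on the error type, rather than funnelling everything into one
catchall, can be sketched as a table. This is a Python illustration only:
the names and severities are copied from the list in this TCO, while the
helper functions and the keying by name (the numeric error codes are not
given here) are assumptions.

```python
# (severity, meaning) for each BUGHLT/BUGCHK name listed in the TCO.
KNI_ERRORS = {
    "KNIBLV": ("halt", "Buffer length violation"),
    "KNIIEC": ("halt", "Illegal error code"),
    "KNICCF": ("chk", "Carrier check failed"),
    "KNICDF": ("chk", "Collision detect check failed"),
    "KNIFTL": ("chk", "Frame too long"),
    "KNIRFD": ("chk", "Remote failure to defer"),
    "KNIFTS": ("halt", "Frame too short"),
    "KNIDOV": ("chk", "NIA buffer overrun"),
}


def severity(name):
    """Return "halt" or "chk" for a listed error name."""
    return KNI_ERRORS[name][0]


def names_with(sev):
    """List the error names that share one severity, in sorted order."""
    return sorted(n for n, (s, _) in KNI_ERRORS.items() if s == sev)
```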


                               [End of TCO 6.1.1072]

                               TCO-number:  6.1.1074



Written-by:  GUNN                             Creation-date:   6-Dec-84 14:58:47
Edited-by:   GUNN                             Edit-date:       8-Jan-85 17:01:49


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  Yes    Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	IPCF	FREE


Related-SPR:  	 18886	 20473



Problem:      An @RETRIEVE *.* command on a directory with a large number of files
(300+) may encounter random failures for some files. For example:

	@RETRIEVE *.*
	 F001.DAT.1 [OK]
	 F002.DAT.1 [OK]
	 ...
	 F213.DAT.1 [OK]
	 F214.DAT.1 Archive system request not completed
	 F215.DAT.1 [OK]
	 F216.DAT.1 Archive system request not completed
	 F217.DAT.1 Archive system request not completed
	 F218.DAT.1 [OK]
	 ...
	@

Diagnosis:  	The ARCF% JSYS function .ARRST sends an IPCF
packet to QUASAR for each file to be retrieved. The PID QUASAR uses to
receive its IPCF packets is not quota controlled. Any sender has the
potential to 'flood' QUASAR, especially under conditions where
QUASAR might not be able to receive and process its packets in a
timely fashion. It is possible under these conditions for all of the
IPCF free space to be used up temporarily until QUASAR receives the
packets and releases the space.

	The routine ARCMSG in IPCF is responsible for sending packets
to QUASAR for the archive functions. It calls the common routine
MESTOR to send the packets. MESTOR can fail in two cases which are
potentially recoverable: if the receiver's PID is over quota (IPCFX7),
or the call to ASGIPC fails to get free space for the packet.
Currently, ARCMSG doesn't return error information to its caller.
There is code that attempts to protect against over quota failures by
going OKINT and DISMS'ing until the receiver has gone back under
quota, but this code has the potential of leaving the caller NOINT.

Solution:   Make ARCMSG return an error code in T1 on failure. Have
ARCMSG pass up the error code from ASGIPC or MESTOR failures. Add a
mechanism to RELIPC to flag when free space is again available and
have callers (particularly code at ARRFR in JSYSF) of ARCMSG go OKINT
and DISMS until the recoverable conditions have changed and try again.
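
The retry scheme in the Solution can be sketched as follows. This is a
minimal Python model, not monitor code: the Receiver class, its quota and
free_space counters, and the wait callback (standing in for going OKINT
and DISMS'ing until conditions change) are all assumptions made for
illustration.

```python
from collections import deque


class OverQuota(Exception):
    """Receiver already holds its quota of unread packets (cf. IPCFX7)."""


class NoFreeSpace(Exception):
    """No free space is left for another packet."""


class Receiver:
    """Hypothetical model of a receiver with a quota and a shared free pool."""

    def __init__(self, quota, free_space):
        self.quota = quota
        self.free_space = free_space
        self.queue = deque()

    def send(self, packet):
        # Both failures are recoverable: they clear once packets are received.
        if len(self.queue) >= self.quota:
            raise OverQuota(packet)
        if self.free_space == 0:
            raise NoFreeSpace(packet)
        self.free_space -= 1
        self.queue.append(packet)

    def receive_one(self):
        # Receiving a packet releases its space again.
        if self.queue:
            self.queue.popleft()
            self.free_space += 1


def send_with_retry(receiver, packet, wait, max_tries=100):
    """Return the number of attempts needed; wait() between failed attempts."""
    for attempt in range(1, max_tries + 1):
        try:
            receiver.send(packet)
            return attempt
        except (OverQuota, NoFreeSpace):
            wait()
    raise RuntimeError("request not completed")


quasar = Receiver(quota=2, free_space=3)
attempts = [
    send_with_retry(quasar, "F%03d.DAT.1" % n, wait=quasar.receive_one)
    for n in range(1, 6)
]
```

The point of the design is that the sender sees the error and decides to
wait, instead of the failure being silently dropped.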


                               [End of TCO 6.1.1074]

                               TCO-number:  6.1.1076



Written-by:  MCCOLLUM                         Creation-date:   7-Dec-84 16:38:41


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	COMND




Problem:  
MONNEJ bugchks


Diagnosis:  
Missing ERJMPs after MTOPRs in COMND


Solution:  
Add ERJMPs



                               [End of TCO 6.1.1076]

                               TCO-number:  6.1.1079



Written-by:  PAETZOLD                         Creation-date:  11-Dec-84 13:13:27


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  monitor
  Routines-affected:   	mnetdv




Problem:    

ATNVT% referencing section 1 stuff from 6 without proper care.

Diagnosis:    

EBD.

Solution:    

Use an XJRST when referencing TVTJFN.


                               [End of TCO 6.1.1079]

                               TCO-number:  6.1.1080



Written-by:  GROSSMAN                         Creation-date:  11-Dec-84 14:38:33


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	NIPAR




Problem:  Bit masks returned by Read Channel Counters function of NISRV
are undefined.

Solution:  Define the bits.  They live in the fields CCRFM and CCSFM (receive
and send failure masks).


                               [End of TCO 6.1.1080]

                               TCO-number:  6.1.1081



Written-by:  GROSSMAN                         Creation-date:  11-Dec-84 14:50:32


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	NISRV




Problem:  If the .NICLO function of NISRV fails, the portal that was being
closed can no longer be used in any way (such as re-trying the .NICLO
function).

Diagnosis:  When .NICLO closes a portal, it sets the "closing" flag for that
portal.  This flag prevents people from doing anything with the portal while
it is being closed.  Unfortunately, when an error occurred during a close,
the "closing" flag was not being reset, and therefore nobody could play with
the portal anymore.

The most common error that occurs during a close is a resource error.
Usually, this happens during system startup, or heavy Ethernet traffic.

Solution:  Clear the "closing" flag when giving an error back to the user.
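
The fix can be sketched as follows. This is a minimal Python model under
stated assumptions: the Portal class, the ResourceError exception and the
release_resources callback are hypothetical names, not NISRV symbols.

```python
class ResourceError(Exception):
    """Stand-in for a resource error during a close."""


class Portal:
    def __init__(self):
        self.closing = False   # set while a close is in progress
        self.closed = False

    def close(self, release_resources):
        if self.closing:
            raise RuntimeError("portal is being closed")
        self.closing = True
        try:
            release_resources()
        except ResourceError:
            # The fix: clear the flag before giving the error back,
            # so that the close can be retried later.
            self.closing = False
            raise
        self.closed = True


def no_buffers():
    raise ResourceError("no buffers")


portal = Portal()
try:
    portal.close(no_buffers)      # first attempt fails
except ResourceError:
    pass
portal.close(lambda: None)        # retry is possible because the flag was cleared
```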


                               [End of TCO 6.1.1081]

                               TCO-number:  6.1.1082



Written-by:  GROSSMAN                         Creation-date:  11-Dec-84 15:20:01


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	NIUSR




Problem:  The NI% JSYS:
1) Buffer status is not being returned upon completion of xmits and rcvs.
2) ^C can't get out of a blocking receive.
3) Random user ?Ill mem refs from programs that use the NI% JSYS.

Diagnosis:  1) Oops
2) Wrong check in the receive complete and transmit complete scheduler tests.
3) User was putting a receive buffer on a copy on write page.  The receive
   buffer code locks the page down.  The user then attempts to modify the page,
   which causes the monitor to attempt to give the user his own copy of the
   page.  Unfortunately, the page is locked down (by NIUSR) and PAGEM refuses
   to do the copy on write.  This eventually turns into an ?Illegal write ref...

Solution:  1) Write the code.
2) Check FKPS1 instead of FKPS0 inside the scheduler test.
3) Attempt to write a byte into the user's receive buffer.  If the page is
   copy on write, he will get his own private writeable copy of the page and
   all is well.  If the page is not writeable, he will get an illegal write
   reference trap of some sort.  If the page is writeable, no problem.


                               [End of TCO 6.1.1082]

                               TCO-number:  6.1.1086



Written-by:  GRANT                            Creation-date:  13-Dec-84 07:01:34


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	phyklp	scampi




Problem:  Poorly coordinated BUGxxx.
Diagnosis:  SCA and CI-20 device driver don't handle the closing of a virtual
           circuit in the same manner when it comes to outputting stuff on
           the CTY.
Solution:  Remove SCACVC.  Make SCATMO a BUGINF rather than a BUGCHK.  
          Create KLPCVC (closed virtual circuit).  Change KLPNUP to KLPOVC
          (opened virtual circuit).

          Now, whenever TOPS-20 opens a virtual circuit you will get a KLPOVC
          and whenever TOPS-20 closes a virtual circuit you will get a KLPCVC.

                               [End of TCO 6.1.1086]

                               TCO-number:  6.1.1087



Written-by:  HAUDEL                           Creation-date:  13-Dec-84 10:09:26


Edit-checked:         No     Document:          Yes    TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	MONSYM




Problem:  DIAG% functions for the Reading and Writing of maintenance data
do not have any entries in MONSYM.

Diagnosis:  Entries never added.

Solution:  Add .DGWMD and .DGRMD to Monsym.


                               [End of TCO 6.1.1087]

                               TCO-number:  6.1.1088



Written-by:  MELOHN                           Creation-date:  13-Dec-84 15:24:47


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	latsrv




Problem:  If LAT-STATE is OFF in config file and user attempts to connect
to host before the first multicast message is sent out, system crashes with
SKDPF1.

Diagnosis:  If a start message is received from a LAT server and the LAT
circuit state is off, routine HMSTRT calls LCLHLT to shut down the circuit.
Since no circuit yet exists, we get a SKDPF1. 

Solution:  Re-arrange the checks in HMSTRT so that we don't bother checking
any circuit related parameters if the circuit doesn't exist yet.


                               [End of TCO 6.1.1088]

                               TCO-number:  6.1.1089



Written-by:  MELOHN                           Creation-date:  13-Dec-84 16:02:12


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	CTHSRV


Related-TCO:  	6.1.1058



Problem:    System crashes with ILMNRF before TCO 6.1.1058, and MCLNSK after.

Diagnosis:    
(courtesy of Gunnar Lindell)

1. User is running on a CTERM line and does a jsys that will affect the
   PSI system, for instance ATI or STIW.
2. Since the PSI system will be affected, the fork lock is acquired.
   As a consequence, the job goes CRSKED.
3. TTYSRV is called by the jsys to process the function. TTYSRV gets
   the terminal lock (LCKTTY).  As a consequence, the job goes NOINT.
4. The connection with the remote host is broken some time after that
   ULKTTY was called, and before (6).
5. TTYSRV calls CTHSRV to process the function (CTHSPS for instance).
   The first thing CTHSRV does is to lock the CTERM database (LOKCDB).
6. LOKCDB will check if the link state is RUN.  It won't be, since the
   connection was broken.  So LOKCDB will decide to 'blow the link away'.
   It does this by calling MSGREL (in CTHSRV).  MSGREL queues up a
   'carrier off' PSI.  It won't of course take effect yet, since the job
   is still NOINT.
7. After calling MSGREL, LOKCD0 goes on to call ULKTTY, and this is
   where the roof falls in.  ULKTTY will do an OKINT, this will let
   the carrier off interrupt in.  That will trap to FLOGO1.  There
   a jsys entry is simulated by calling MCENTR.  Since we are still
   CRSKED (from the fork lock) we bughlt MCLNSK!

Solution:    Remove call to ULKTTY at LOKCD1. There should be no reason
to ever have to unlock the TDB on error, since the caller who locked
the TDB in the first place will unlock it as well. This also fixes the
case where if we fail for some reason to get the CDB, an ULKBAD BUGCHK
occurs, since we tried to unlock the TTLOK twice.
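
The ownership rule in the Solution (whoever locks also unlocks) can be
sketched as follows. This is a minimal Python model: TerminalLock,
lock_cdb_old, lock_cdb_fixed and tty_function are hypothetical stand-ins
and only model the double-unlock case, not the PSI interaction described
in the Diagnosis.

```python
class TerminalLock:
    """Stand-in for the terminal lock; unlocking when not held is an error."""

    def __init__(self):
        self.held = False

    def lock(self):
        self.held = True

    def unlock(self):
        if not self.held:
            raise RuntimeError("unlock of a lock that is not held")
        self.held = False


def lock_cdb_old(tty_lock, link_running):
    # Modelled old behaviour: on failure the helper unlocks the caller's lock.
    if not link_running:
        tty_lock.unlock()
        return False
    return True


def lock_cdb_fixed(tty_lock, link_running):
    # Modelled fixed behaviour: report failure, leave the lock to its owner.
    return link_running


def tty_function(lock_cdb, link_running):
    tty_lock = TerminalLock()
    tty_lock.lock()
    try:
        return lock_cdb(tty_lock, link_running)
    finally:
        tty_lock.unlock()   # the caller that locked always unlocks
```

With lock_cdb_old and a broken link, the final unlock in tty_function is
the second unlock of the same lock and raises; with lock_cdb_fixed the
failure is simply returned.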


                               [End of TCO 6.1.1089]

                               TCO-number:  6.1.1090



Written-by:  HAUDEL                           Creation-date:  17-Dec-84 08:40:19


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	SCSJSY	SCHED	MONSYM




Problem:  Possible "race condition" errors in SCSJSY.MAC

Diagnosis:  Code not written to handle such conditions.

Solution:  Change code in SCSJSY, change the way SCSJSY code is
called from SCHED, and change/delete some error codes in MONSYM.


                               [End of TCO 6.1.1090]

                               TCO-number:  6.1.1091



Written-by:  GRANT                            Creation-date:  17-Dec-84 10:19:41


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	phymsc	STG	globs




Problem:  Too many CI-related BUGxxx.
Diagnosis:  Many CI-related BUGxxx were created for debugging purposes but aren't
           necessary during normal operation.
Solution:  Create the cell CIBUGX;  the default for its contents is zero.  If
          CIBUGX is non-zero you will get more CI-related BUGxxx.

                               [End of TCO 6.1.1091]

                               TCO-number:  6.1.1092



Written-by:  TBOYLE                           Creation-date:  17-Dec-84 15:41:52
Edited-by:   TBOYLE                           Edit-date:      18-Dec-84 12:13:14


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	MEXEC	EXECSU	TTYSRV




Problem:    

	Jobs lying around in LOGOUT or Not logged in EXEC jobs. This
happens on LAT, CTM, NRT and also FE lines.

Diagnosis:    

	Along the process of LOGOUT somebody usually does an unconditional
block like a DOBE in the EXEC's case after printing "Autologout" and a
call to TTDOBE in the hangup code.

	We should not do this because the final rundown in LGOUT% JSYS
tries SOBEs for 15 seconds and then gives up so that LOGOUT can proceed.

Solution:    

	Remove DOBE in AUTOL6 and remove CALL TTDOBE in the TTHNGU code.
Move the CFOBF to be after the 15 seconds rundown so that FE lines
don't have the previous guy's logout message hanging around in them.


                               [End of TCO 6.1.1092]

                               TCO-number:  6.1.1095



Written-by:  HAUDEL                           Creation-date:  19-Dec-84 12:58:27
Edited-by:   HAUDEL                           Edit-date:      19-Dec-84 13:04:16


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYSIO


Related-SPR:  	 20000



Problem:      
I/O could fail to restart if the tape "rewind timer" mechanism
set the tape status to indicate BOT and there are IORBs queued
to the device via the UDBTWQ of the UDB.

Diagnosis:      
The "rewind timer" mechanism did not include a way for this
to happen.

Solution:     
Have the "rewind timer" code set US.OIR when it sets US.BOT.


                               [End of TCO 6.1.1095]

                               TCO-number:  6.1.1096



Written-by:  LOMARTIRE                        Creation-date:  21-Dec-84 08:54:26
Edited-by:   LOMARTIRE                        Edit-date:      21-Dec-84 08:55:43


Edit-checked:         No     Document:          Yes    TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYKLP	JSYSA	MONSYM	GLOBS	CFSSRV




Problem:     There is no way to obtain the names of the HSC nodes to which we have 
open connections.

Diagnosis:     No function to do this.

Solution:     Add a function to the CNFIG% JSYS.  This function, .CFHSC, will 
return the node names of any HSCs to which we have an open VC.  The argument 
block returned is identical in format to the one returned by the .CFCND 
function (return all CFS nodes).


                               [End of TCO 6.1.1096]

                               TCO-number:  6.1.1097



Written-by:  GRANT                            Creation-date:  28-Dec-84 09:32:43


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYMVR	SCAMPI	GLOBS	STG




Problem:  It is difficult to debug in a multi-CPU configuration because various
         CI-related timers go off and cause connections to close.
Diagnosis:  There is no nice way of turning off these timers.
Solution:  Create the cell CITIMR and make it non-0 if you are debugging and
          want to stop on breakpoints without having the other node(s)
          time you out.  

                               [End of TCO 6.1.1097]

                               TCO-number:  6.1.1099



Written-by:  GRANT                            Creation-date:  30-Dec-84 06:36:21


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYPAR




Problem:  It is difficult to find the device unit number, given a CDB, KDB,
         and UDB.
Diagnosis:  The data structures are not always interpreted the same and the
           definitions in PHYPAR don't provide much help.
Solution:  Enhance definitions of UDBSLV and CDBUDB.

                               [End of TCO 6.1.1099]

                               TCO-number:  6.1.1100



Written-by:  PAETZOLD                         Creation-date:  31-Dec-84 13:04:31


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  monitor
  Routines-affected:   	ipcidv




Problem:    

Low order octet of the local internet address defined for the internet
CI interface must agree with the CI node number.  It is, however, easy to
get this wrong.

Diagnosis:  

Solution:    

In CIPRST generate a CIPBAD BUGINF if the address is wrong and do not 
initialize the multinet interface.
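
The consistency rule being checked can be sketched as follows. This is a
minimal Python illustration; the function name and the sample addresses
are assumptions, not taken from CIPRST.

```python
import ipaddress


def ci_address_ok(local_address, ci_node_number):
    """True if the low order octet of the address equals the CI node number."""
    low_octet = int(ipaddress.IPv4Address(local_address)) & 0xFF
    return low_octet == ci_node_number


# Hypothetical configuration: node 5 must use an address ending in .5
good = ci_address_ok("10.1.0.5", 5)
bad = ci_address_ok("10.1.0.5", 6)
```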


                               [End of TCO 6.1.1100]

                               TCO-number:  6.1.1101



Written-by:  PAETZOLD                         Creation-date:   1-Jan-85 13:53:09
Edited-by:   PAETZOLD                         Edit-date:       2-Jan-85 11:20:39


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  monitor
  Routines-affected:   	GTJFN




Problem:      

Most JFN informational JSYSes (e.g. DVCHR, JFNS) do not work with DSK*: JFNs.

Diagnosis:      

GTJFN is overzealous in trimming free space blocks and is overtrimming 
the device name block when DSK*: is used.

Solution:      

Make STRDEV fix up FILOPT(JFN) to make sure enough space is reserved for
a full device name.


                               [End of TCO 6.1.1101]

                               TCO-number:  6.1.1108



Written-by:  GRANT                            Creation-date:   4-Jan-85 08:08:58
Edited-by:   GRANT                            Edit-date:       4-Jan-85 08:18:25


Edit-checked:         Yes    Document:          No     TCO-tested:  Yes
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	phyklp	STG	globs


Related-QAR:  	706372



Problem:      If you have a broken CI-20 it may be impossible to boot your
         system because you may always get a KLPNRL BUGHLT.

Diagnosis:      If TOPS-20 decides to reload the port during startup (after its
           initial attempt), a KLPNRL is pretty much guaranteed.  There
           needs to be a clean way to boot the system and have it ignore
           the port so you can run SPEAR and find out what the problem is.

Solution:      Create the cell NOKLIP.  If it contains a non-0 value at system
          startup, the port will be reset and then ignored by TOPS-20.


                               [End of TCO 6.1.1108]

                               TCO-number:  6.1.1109



Written-by:  GROSSMAN                         Creation-date:   4-Jan-85 08:44:28


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	NISRV	PHYKNI




Problem:  1) Disabling multicasts would hose the monitor over.
2) Promiscuous & Unknown modes of operation did not work.
3) NISRV too slow.
4) Unused code.
5) KNIRTO BUGCHKs
6) KLNI variables not being handled properly after a restart.

Solution:  1) Throw out the disable multicast code and start over.

3) Implement a memory cache for all memory required by transmits or receives.
   When blocks on the cache aren't used for a minute or more, return them to
   the resident free space pool.
4) Remove unused code.
5) KNIRTO timer was too small.  Increase it from 15. to 30. seconds.
6) Implement a validity flag for each KLNI variable we maintain.  When the
   KLNI gets restarted, ensure that all valid variables get set in the KLNI.

In addition, add a cell called TOTINT which contains the cumulative run time
NISRV spends at interrupt level.  This time is in the same format as that
returned by RDTIME.
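
The memory cache in item 3 can be sketched as follows. This is a minimal
Python model under stated assumptions: BlockCache, its allocate and
release callbacks (standing in for the resident free space pool) and the
explicit now argument are hypothetical, not NISRV names.

```python
class BlockCache:
    """Keep returned blocks for reuse; give back those idle too long."""

    def __init__(self, allocate, release, idle_limit=60.0):
        self.allocate = allocate      # get a fresh block from the pool
        self.release = release        # return a block to the pool
        self.idle_limit = idle_limit  # seconds a block may sit unused
        self.idle = []                # list of (block, time it was cached)

    def get(self):
        # Prefer a cached block; fall back to the pool only when empty.
        if self.idle:
            block, _ = self.idle.pop()
            return block
        return self.allocate()

    def put(self, block, now):
        self.idle.append((block, now))

    def trim(self, now):
        # Return blocks that have been idle for idle_limit or longer.
        keep = []
        for block, since in self.idle:
            if now - since >= self.idle_limit:
                self.release(block)
            else:
                keep.append((block, since))
        self.idle = keep
```

A periodic call to trim plays the role of the "minute or more" check; get
and put stay cheap because they never touch the pool while the cache has
blocks.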


                               [End of TCO 6.1.1109]

                               TCO-number:  6.1.1110



Written-by:  LEACHE                           Creation-date:   4-Jan-85 13:51:52


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  Monitor
  Routines-affected:   	STG




Problem:    
	(1)  Symbols created in KDDT user mode get lost.
	(2)  Symbol-table growth in user mode causes the table
	     to cross a section boundary, causing section 36 to be
	     mapped.

Diagnosis:    
	(1)  Symbol table pointers not being correctly managed by the
	     pre and post KDDT code in STG.
	(2)  Symbol table being BLT'ed into the low end of section 37
	     instead of the high end.

Solution:    
	(1)  Manage the symbol table pointers correctly.
	(2)  Move the symbol table to the high end of section 37 while in
	     KDDT user mode.


                               [End of TCO 6.1.1110]

                               TCO-number:  6.1.1111



Written-by:  MCCOLLUM                         Creation-date:   4-Jan-85 15:00:39


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	MEXEC




Problem:  
Job 0 JFNs can be released by a fork other than the one that originally got them.


Diagnosis:  


Solution:  
Set bit GJ%ACC in the GTJFN call at USGOPN



                               [End of TCO 6.1.1111]

                               TCO-number:  6.1.1112



Written-by:  MCCOLLUM                         Creation-date:   4-Jan-85 15:39:13


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	MEXEC




Problem:  
GETAB% returns the local job index rather than the global job number
when an entry from the DEVUNT table is requested. Any program which
attempts to use this job number will get unexpected results.


Diagnosis:  
GETAB% returns the value directly from the LH of DEVUNT and does not
convert it to a global job number first.


Solution:  
If the LH of the DEVUNT entry is .GE. zero, call LCL2GL and return
the global job number to the user.
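
The conversion can be sketched as follows. This is a minimal Python model
of a 36-bit word with two 18-bit halves. The lcl2gl argument is a
hypothetical stand-in for LCL2GL, and the assumption that the right half
is passed through unchanged is made for illustration only.

```python
HALF_MASK = 0o777777   # 18 bits
HALF_SIGN = 0o400000   # sign bit of an 18-bit half word


def devunt_for_user(word, lcl2gl):
    """Replace a non-negative LH (local job index) by the global job number."""
    lh = (word >> 18) & HALF_MASK
    rh = word & HALF_MASK
    if lh & HALF_SIGN:
        # Negative LH: not a job index, so return the entry unchanged.
        return word
    return ((lcl2gl(lh) & HALF_MASK) << 18) | rh


# Hypothetical mapping: local job index 3 is global job number 41.
job_table = {3: 41}
entry = (3 << 18) | 2
converted = devunt_for_user(entry, job_table.__getitem__)
```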



                               [End of TCO 6.1.1112]

                               TCO-number:  6.1.1113



Written-by:  GROSSMAN                         Creation-date:   5-Jan-85 15:53:46


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	NISRV	STG	PHYKNI




Problem:  NISRV wastes lots of paper by printing too many BUGxxxs.

Diagnosis:  KNIFQE (Free Queue Empty), and KNIDMD/KNIDM1 are nice to have for
debugging purposes.  However, they are not good for production situations.

Solution:  Create a cell called NIBUGX that will enable the extraneous BUGxxxs
if it contains non-zero.  It will default to 0.

In addition, move LOADMODULE of NITEST into STG, and make its loading dependent
upon the DEBUG conditional.


                               [End of TCO 6.1.1113]

                               TCO-number:  6.1.1114



Written-by:  GRANT                            Creation-date:   6-Jan-85 20:03:51
Edited-by:   GRANT                            Edit-date:       7-Jan-85 08:16:55


Edit-checked:         Yes    Document:          No     TCO-tested:  Yes
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	STG	params


Related-QAR:  	706369



Problem:      SAVTRE facility not readily accessible to non-source site.

Diagnosis:      Must poke with DDT to turn it on.

Solution:      Create symbol SAVTRF and define it as an NDG in PARAMS.  This way
          you can override it with your PARAM0.


                               [End of TCO 6.1.1114]

                               TCO-number:  6.1.1115



Written-by:  HAUDEL                           Creation-date:   7-Jan-85 08:29:18


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	MEXEC


Related-QAR:  	706387



Problem:  MONNEJ's when job logs out.

Diagnosis:  No ERJMP after some JSYSes in the Monitor.

Solution:  Add the ERJMPS.


                               [End of TCO 6.1.1115]

                               TCO-number:  6.1.1116



Written-by:  GRANT                            Creation-date:   7-Jan-85 08:31:03
Edited-by:   GRANT                            Edit-date:       7-Jan-85 09:03:06


Edit-checked:         Yes    Document:          No     TCO-tested:  Yes
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	mstr	PHYSIO


Related-QAR:  	706408



Problem:          OPR>SHOW STATUS DISK  incorrectly claims a disk is dual-ported
to another KL when the other port is actually the front end.

Diagnosis:        MSTR% returns incorrect MS%2PT bit value.  When the front end
started sending us its disk configuration packet which causes the U1.FED
bit to get set in the UDB, the MSTR% support routines were never updated.

Solution:         In PHYSIO's GETSTR routine, add the check for U1.FED and don't
return MS%2PT if it is on.  Also, in MSTR, eliminate the check for the disk
being part of PS:;  it's no longer needed.

Note:  Although not part of this QAR's problem, we need to check for
don't-care disk, too.


                               [End of TCO 6.1.1116]

                               TCO-number:  6.1.1118



Written-by:  MCCOLLUM                         Creation-date:   7-Jan-85 16:22:37
Edited-by:   MCCOLLUM                         Edit-date:      11-Jan-85 15:13:09


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	JSYSF




Problem:  
ARCF% function .ARGST is too slow.

Diagnosis:  
The ARCF% code updates the directory on disk even though .ARGST is a read-only
function.

Solution:  
For ARCF% function .ARGST, don't update the directory. Updating the directory
slows this function down by 100%.


                               [End of TCO 6.1.1118]

                               TCO-number:  6.1.1119



Written-by:  MCCOLLUM                         Creation-date:   8-Jan-85 14:09:39


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	TTYSRV


Related-QAR:  	706399



Problem:  
Failing TTMSG's can leave a process NOINT.

Diagnosis:  
Error returns from routine SALLIN neglect to go OKINT.

Solution:  
Add OKINTs in error returns.


                               [End of TCO 6.1.1119]

                               TCO-number:  6.1.1120



Written-by:  TBOYLE                           Creation-date:   8-Jan-85 14:54:53


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYMSC


Related-QAR:  	706362



Problem:  CHANS displays cylinder and sector as huge negative numbers
on MSCP disks.

Diagnosis:  PHYMSC forgets to turn off the physical bit before
computing UDBPS1 and UDBPS2.

Solution:  Turn off the IRBPAD bit in the area of MSCS6A.


                               [End of TCO 6.1.1120]

                               TCO-number:  6.1.1121



Written-by:  LOMARTIRE                        Creation-date:   9-Jan-85 10:49:16


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	GTJFN


Related-SPR:  	 20453

Related-QAR:  	706323



Problem:   ILMNRF while processing a parse-only JFN in GTJFN.

Diagnosis:   RECDIR is attempting to parse the directory name and finds that it 
cannot find the directory.  DIRLUK will return no match but will not return an 
updated byte pointer.
For parse-only JFNs, RECDIR assumes a no match return from DIRLUK is ambiguous 
and "updates" FILOPT with the new byte pointer.  This destroys FILOPT.
This eventually leads to an ILMNRF.

Solution:   Still treat the match as ambiguous but do not update FILOPT for "no 
match" returns from DIRLUK in RECDIR.


                               [End of TCO 6.1.1121]

                               TCO-number:  6.1.1122



Written-by:  GROSSMAN                         Creation-date:   9-Jan-85 10:58:06


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  Yes


Program:  MONITOR
  Routines-affected:   	NISRV	PHYKNI




Problem:  Various and sundry fixes:

1) Move NISRV to section 6.  This frees up 6 section 0/1 pages from the
   bootable monitor.

2) Fix hung KLNI detector (KNISTP generator) so that resource errors don't
   result in spurious KNISTPs.

3) Fix SBD code in KNIJB0 (runs KNILDR).

4) Re-write GETCOR so that memory is no longer 4 word aligned.  Alignment is
   not necessary on KL10s.



                               [End of TCO 6.1.1122]

                               TCO-number:  6.1.1125



Written-by:  GRANT                            Creation-date:   9-Jan-85 14:41:06
Edited-by:   GRANT                            Edit-date:       9-Jan-85 14:46:31


Edit-checked:         Yes    Document:          No     TCO-tested:  Yes
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYP4	phymsc


Related-QAR:  	706405



Problem:        TOPS-20 has wrong drive serial number.

Diagnosis:        While TOPS-20 remained running, a disk had its HDA replaced.
Once we have a UDB, we never reread the serial number.

Solution:        Whenever we get an online interrupt (Massbus) or online an
MSCP disk, get the DSN again and put it in the UDB.


                               [End of TCO 6.1.1125]

                               TCO-number:  6.1.1126



Written-by:  MOSER                            Creation-date:  10-Jan-85 16:53:41


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYMVR




Problem:   Too many Bugchks.

Diagnosis:  Some are useless.

Solution:  Remove MSSGON and only do MSSSHT when it is interesting.


                               [End of TCO 6.1.1126]

                               TCO-number:  6.1.1127



Written-by:  LOMARTIRE                        Creation-date:  10-Jan-85 16:56:44
Edited-by:   LOMARTIRE                        Edit-date:      20-Mar-85 10:48:07


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	APRSRV


Related-TCO:  	6.1.1279

Related-SPR:  	 20429



Problem:     ILLUUO from a bad argument passed to GTJFN.

Diagnosis:     Code in GTJFN processes the user's byte pointer by placing an XCTBU 
of an ILDB on to the stack and then doing an XCT -2(P).  If the byte pointer is 
bogus, the KIMXLP trace of the XCT goes to -2(P) to try to find the next 
instruction.  However, since the stack has changed by then, this produces 
random results.

Solution:     Once KIMUO4 has determined that an XCT caused the UUO, continue the 
search from the trapping instruction, which is passed in T2.  This will avoid 
extra iterations and the confusion that XCT -2(P) causes.


                               [End of TCO 6.1.1127]

                               TCO-number:  6.1.1128



Written-by:  MOSER                            Creation-date:  10-Jan-85 17:31:07


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYMVR




Problem:   MSSNUT BUGHLT.

Diagnosis:  When a drive is write locked the MSCP driver may ask the server to
online the disk with the UF.WPR bit set. The server will reject this but it
uses the wrong length message to do so. The sanity check catches this and
crashes.

Solution:  Use the right length after SETUID fails.


                               [End of TCO 6.1.1128]

                               TCO-number:  6.1.1129



Written-by:  GRANT                            Creation-date:  11-Jan-85 08:51:07


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	diag




Problem:  DFPTA diagnostic (CI/NI port selector) failures due to DIAG%
returning "Diagnostic owns the channel" error.
Diagnosis:  The diagnostic is trying to assign RH20 channel 0 and the monitor
is checking to see that there are no disks dual-ported to another KL.  But,
it fails to test for offline, which would allow the DIAG% to succeed.
Solution:  In DIAG.MAC's DGUDPT routine, check for disk offline (US.OFS)
before proceeding to the dual-ported type checks.

                               [End of TCO 6.1.1129]

                               TCO-number:  6.1.1132



Written-by:  LOMARTIRE                        Creation-date:  14-Jan-85 11:05:04


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	JSYSF


Related-SPR:  	 20490



Problem:   Edit 2612 to RCUSR% causes LCKDIR BUGHLTs.

Diagnosis:   The edit was done incorrectly and introduced a path which caused the 
LCKDIRs under certain RCUSR% combinations.

Solution:   Rewrite the patch correctly.  If a user specifies RC%STP but the 
user name does not contain wild cards, fail and return RC%NMD in the 
flags.


                               [End of TCO 6.1.1132]

                               TCO-number:  6.1.1133



Written-by:  LEACHE                           Creation-date:  15-Jan-85 08:30:58
Edited-by:   LEACHE                           Edit-date:      15-Jan-85 09:27:47


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  Monitor
  Routines-affected:   	JSYSA	DISC	GLOBS


Related-SPR:  	 13064	 15605	 17743	 17832	 18333



Problem:      
   Disc allocation (as returned by INFO DISK) is often wrong for SYSTEM
directory.

Diagnosis:     
   When an OFN is acquired for the accounting file, the value supplied
in the call to ASGOFN is supposed to be the remaining allocation for the 
directory, but the value 377777,,0 is erroneously used.   This causes the
EXEC to display incorrect (sometimes negative) values in the INFO DISK
command.

Solution:     
   Get the current remaining allocation for directory SYSTEM and use
that value in the call to ASGOFN.


                               [End of TCO 6.1.1133]

                               TCO-number:  6.1.1134



Written-by:  LEACHE                           Creation-date:  15-Jan-85 09:10:25


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  Monitor
  Routines-affected:   	DISC


Related-SPR:  	 15605	 17743	 17832	 18333



Problem:    Disk allocation (as displayed by INFO DISK) is always wrong for
ROOT-DIRECTORY.

Diagnosis:    Assigning an OFN for ROOT-DIRECTORY presents a chicken-and-egg
problem:  the directory allocation is required to get the OFN, but an OFN
is required to get the directory allocation.  The monitor attempts to solve
this by specifying the value 377777,,0 in the call to ASGOFN.  This will
cause the INFO DISK command to report erroneous values.

Solution:    In routine ASROFN, acquire the first OFN for ROOT-DIRECTORY
using a recognizably unique value for the directory allocation.  On the
next OFN assignment for ROOT-DIRECTORY, fetch the true allocation and
call ADJALC to adjust the allocation remaining for the directory.


                               [End of TCO 6.1.1134]

                               TCO-number:  6.1.1136



Written-by:  GROSSMAN                         Creation-date:  15-Jan-85 13:55:49


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	GLOBS	LLMOP	NISRV	PHYKNI	SYSFLG




Problem:  Too many references to KNIN throughout the monitor.  Customers
should be able to control KLNIness by changing the definition of KNIN in
PARAMS.  Therefore, the only references to KNIN should be in STG.

Solution:  Remove many unneeded references to KNIN.


                               [End of TCO 6.1.1136]

                               TCO-number:  6.1.1137



Written-by:  GROSSMAN                         Creation-date:  15-Jan-85 14:23:11


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYKNI




Problem:  NISRV crashes the monitor when it can't find KNILDR.

Diagnosis:  The routine KNIJB0 (which is in section 0/1) was trying to call a
routine in section 6.  Unfortunately, it ended up calling HSYS, in section
0/1 and crashed the monitor.

Solution:  Move most of KNIJB0 to section 6.  Leave the entry point in section
0/1.


                               [End of TCO 6.1.1137]

                               TCO-number:  6.1.1138



Written-by:  GROSSMAN                         Creation-date:  16-Jan-85 15:52:43


Edit-checked:         No     Document:          Yes    TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	NIUSR




Problem:  There is no interlock mechanism to prevent multiple jobs from
playing with the KLNI.

Solution:  Create a new NI% JSYS function called .EIGET, which will acquire
ownership of the KLNI.  Only the "owner" of a KLNI is allowed to alter its
state or set its address.  If there is no owner, anybody will be allowed to
do these functions.
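
The ownership rule can be modelled in a few lines.  This is only a sketch:
the class and method names are hypothetical, and the behaviour of .EIGET
when the KLNI already has a different owner is an assumption (the TCO does
not say).

```python
class KlniOwnership:
    """Toy model of the ownership rule; not the monitor's data structures."""

    def __init__(self):
        self.owner = None  # job number of the current owner, or None

    def acquire(self, job):
        """Model of .EIGET.  Assumption: fails if another job is the owner."""
        if self.owner is None or self.owner == job:
            self.owner = job
            return True
        return False

    def may_alter(self, job):
        """Anybody may alter an unowned KLNI; otherwise only the owner may."""
        return self.owner is None or self.owner == job
```

With no owner, may_alter is true for every job; once a job has acquired
the KLNI, only that job passes the check.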


                               [End of TCO 6.1.1138]

                               TCO-number:  6.1.1139



Written-by:  MOSER                            Creation-date:  16-Jan-85 16:36:03


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	CFSSRV	PHYMVR	PHYSIO




Problem:   SKED too high. Jobs get dismissed and rescheduled too much. Scheduler
thrashes when system is IO bound.

Diagnosis:   Routines flag running the scheduler by setting PSKED which forces
DISMSJ. PSKD1 is more appropriate - it means "there may be a scheduling event but
don't dump the current fork".

Solution:  Set PSKD1 instead of PSKED where appropriate.


                               [End of TCO 6.1.1139]

                               TCO-number:  6.1.1140



Written-by:  GROSSMAN                         Creation-date:  16-Jan-85 16:50:27


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	NIUSR	STG




Problem:  GLFNFs and various other scheduler weirdness...

Diagnosis:  NIUSR was calling the scheduler at interrupt level to request a
PSI (PSIRQ).  It works most of the time, but the scheduler isn't interlocked
with respect to interrupt level, so races can occur.

Solution:  Make a routine that gets called in LV8CHK whenever interrupt level
needs a PSI to be generated.


                               [End of TCO 6.1.1140]

                               TCO-number:  6.1.1141



Written-by:  GROSSMAN                         Creation-date:  16-Jan-85 17:10:26


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	NIUSR




Problem:  NI% JSYS too slow.

Diagnosis:  It calls ASGRES and RELRES for every packet it transmits or receives.

Solution:  Make a cache for transmit and receive memory.  Search the cache first,
and call ASGRES only if no blocks are found.
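
A minimal sketch of the caching idea follows.  The names BlockCache, get,
put and limit are hypothetical; the allocate callable stands in for
ASGRES, and the fixed cache limit is an assumption rather than something
stated in the TCO.

```python
class BlockCache:
    """Free-list cache in front of a slower allocator (illustration only)."""

    def __init__(self, allocate, limit=8):
        self._allocate = allocate  # stands in for ASGRES
        self._free = []            # cached blocks ready for reuse
        self._limit = limit        # assumed upper bound on cached blocks

    def get(self):
        """Search the cache first; call the allocator only on a miss."""
        if self._free:
            return self._free.pop()
        return self._allocate()

    def put(self, block):
        """Keep the block for reuse.  Returns False when the cache is full,
        meaning the caller should release the block (the RELRES case)."""
        if len(self._free) < self._limit:
            self._free.append(block)
            return True
        return False
```

In the steady state a transmit or receive reuses a cached block and the
allocator is not called at all.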


                               [End of TCO 6.1.1141]

                               TCO-number:  6.1.1142



Written-by:  MOSER                            Creation-date:  17-Jan-85 13:38:20


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	SCHED




Problem:   SUMNR1 BUGCHKs when running the "new" scheduler.

Diagnosis:   There is now a path through the code that can change the working
set size of a fork in the balance set. Previously this was impossible. The
path is Balance set wait satisfied calls NEWST which gives a big boost and
calls NEWST which calls NEWWSS which calls ADJWSS.

Solution:  Make ADJWSS fix SUMBNR if FKIB% is on in FKSWP.


                               [End of TCO 6.1.1142]

                               TCO-number:  6.1.1143



Written-by:  LOMARTIRE                        Creation-date:  17-Jan-85 15:52:54


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	GTJFN


Related-SPR:  	 20532



Problem:   If GTJFN% is performed on a filespec whose device field is a 
system-wide logical name, and the first entry in that system-wide logical name 
is a job-wide logical name, the files pointed to by the job-wide logical name 
are not found.  However, a job-wide logical name anywhere except first in the 
system-wide logical name works correctly.

Diagnosis:   If during the logical name evaluation (done in CHKLNM), a 
system-wide logical is found, the SAWSLN bit is set.  Then, until we step to 
the next entry in the logical name definition, we will only search the 
system-wide logical name table.  So, for a system-wide logical of:

	A: => ONE:, TWO:

with job-wide logicals of:

	ONE: => PS:LOGIN.CMD
	TWO: => PS:LOGOUT.CMD

a DIR of A: will not find PS:LOGIN.CMD (since ONE: is the first entry in a 
system-wide logical but it is not a system-wide logical itself).  However,
the file pointed to by TWO: will be found since TWO: is the second entry in 
the system-wide logical name.

Solution:   In CHKLNM, ignore the setting of SAWSLN when deciding whether to 
search job-wide then system-wide logicals or just system-wide logicals.  This 
will make system-wide logicals perform correctly regardless of the placement of 
the individual entries.  Namely, job-wide will be searched first, then 
system-wide.
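
The corrected search order can be sketched as below, using the A:, ONE:
and TWO: example from the diagnosis.  The function names and the
dictionary representation of the logical name tables are hypothetical;
this is not the CHKLNM code, and it does not guard against circular
definitions.

```python
def resolve(name, job_table, system_table):
    """Look up one logical name: job-wide first, then system-wide."""
    if name in job_table:
        return job_table[name]
    return system_table.get(name)


def expand(name, job_table, system_table):
    """Expand a logical name into the file specs it reaches."""
    definition = resolve(name, job_table, system_table)
    if definition is None:
        return [name]  # not a logical name; treat it as a plain file spec
    found = []
    for entry in definition:
        # Every entry gets the same job-then-system search, regardless of
        # its position in the definition.
        found.extend(expand(entry, job_table, system_table))
    return found


system_table = {"A:": ["ONE:", "TWO:"]}
job_table = {"ONE:": ["PS:LOGIN.CMD"], "TWO:": ["PS:LOGOUT.CMD"]}
files = expand("A:", job_table, system_table)
```

Here files contains both PS:LOGIN.CMD and PS:LOGOUT.CMD, since the first
entry is no longer restricted to the system-wide table.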


                               [End of TCO 6.1.1143]

                               TCO-number:  6.1.1145



Written-by:  LOMARTIRE                        Creation-date:  17-Jan-85 16:04:06
Edited-by:   LOMARTIRE                        Edit-date:      17-Jan-85 16:07:30


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	FILMSC


Related-SPR:  	 20459



Problem:       A fork is assigned a controlling terminal and goes into TCITST on
that terminal line.  Now, the fork is frozen, is assigned a new controlling
terminal, and is resumed.  It will still be in TCITST on the old terminal line.

Diagnosis:       TCOs 6.1526 and 6.2031 handle the case where a job controlling 
terminal is changed in the above manner.  However, there is no code to handle 
the case of a fork controlling terminal.

Solution:       Add code in TTYIN2 (after the job controlling terminal check) to 
check if the JFN is for the fork's controlling terminal.  If it is, place the 
line number in the left half of DEV.


                               [End of TCO 6.1.1145]

                               TCO-number:  6.1.1146



Written-by:  PAETZOLD                         Creation-date:  19-Jan-85 16:18:39


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  monitor
  Routines-affected:   	niusr




Problem:    

SKDPF1s from NIUSR.

Diagnosis:    

EBD.

Solution:    

.NIOFF needs to be resident as well as NIJJIF.


                               [End of TCO 6.1.1146]

                               TCO-number:  6.1.1147



Written-by:  MAYO                             Creation-date:  21-Jan-85 10:00:52


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	JSYSF


Related-SPR:  	 19735



Problem:  ARCF% discard function doesn't clear AR%WRN.  REAPER checks
AR%WRN to see if it should delete archived files.  So, if a user
retrieves a file, keeps it a while, gets a warning from REAPER that
the file will soon expire and be deleted, and decides to keep it by
discarding archive information, he will be very surprised in a few weeks
when REAPER expires and deletes the file anyway.
Diagnosis:  ARCF% discard doesn't clear enough FDB bits.
Solution:  Have it clear AR%WRN.

                               [End of TCO 6.1.1147]

                               TCO-number:  6.1.1149



Written-by:  LEACHE                           Creation-date:  22-Jan-85 14:41:54


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	CRYPT


Related-QAR:  	706373



Problem:    CHKPEV always returns failure for customer encryption algorithms.

Diagnosis:    Wrong flavor of test instruction.

Solution:    Simplify the routine and the bug goes away.


                               [End of TCO 6.1.1149]

                               TCO-number:  6.1.1150



Written-by:  MELOHN                           Creation-date:  22-Jan-85 21:08:27


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	CTHSRV




Problem:    The CTERM fork runs every 100ms whether or not it has anything to
do. SYSDPY seems to indicate that it is using .3 seconds of runtime every
minute even when there are no users of CTERM.

Diagnosis:    The CTERM would run MUCH less often if it had a scheduler test
that woke up when there was something for the CTERM fork to do.

Solution:    Add scheduler test CTMTST, and make the CTERM fork MDISMS with it
until it has something to do.


                               [End of TCO 6.1.1150]

                               TCO-number:  6.1.1152



Written-by:  PAETZOLD                         Creation-date:  22-Jan-85 22:03:58


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  monitor
  Routines-affected:   	phykni




Problem:    

KPALHVs  from  NISRV  looping in NIDPT2 when a portal is closed and the
monitor has arpanet code and the ethernet cable has a UNIX based system
on it and the TOPS-20 system in question has had very  little  ethernet
IP traffic in the last several minutes before the portal is closed.

Diagnosis:    

A  UNIX  system  sends  various ethernet multicast messages with the IP
protocol type. TOPS-20 does not enable this  multicast  address  and  a
KNIDMD/KNIDM1  combination will result. After this event the buffer and
its CM block will once again be placed on  the  free  queue  without  a
callback to the client.

At  this point we decide to close the portal. NIDPT goes into a loop at
NIDPT2 to NIDPT1 attempting to release all buffers back to the  client.
These  buffers  have  never  had  IO performed on them (since they were
queued) and may still have the multicast address set in the  CM  block.
At   this  point  MSGAVA  will  once  again  generate  a  KNIDMD/KNIDM1
combination and requeue the block back onto the free queue. We now have
a loop trying to remove this block from the free queue.

After attempting to do this many tens of thousands of times we  finally
crash with a KPALHV.

Solution:    

Make  sure  the  destination ethernet address of a buffer coming out of
the NIDPT2 loop has a clear address without any multicast bits set.

The reason this  problem  was  never  reproducible  on  the  CHIP/ETHER
systems  is  that  there  is no UNIX system on the private ethernet but
there is on the public ethernet.


                               [End of TCO 6.1.1152]

                               TCO-number:  6.1.1153



Written-by:  GRANT                            Creation-date:  23-Jan-85 16:11:36


Edit-checked:         No     Document:          Yes    TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYSIO




Problem:  Unusual events can occur with don't-care disks and no one is notified.
Diagnosis:  When a don't-care mismatch occurs (drive and disk are not the same
status) we treat the disk under standard access rules but say nothing about
what was discovered.
Solution:  Add PHYDCU and PHYDCD BUGCHKs and PHYDCR BUGINF.
PHYDCU is for standard disk on don't-care drive;  PHYDCD is don't-care disk on
standard drive;  PHYDCR means we are treating the disk as don't-care.

                               [End of TCO 6.1.1153]

                               TCO-number:  6.1.1154



Written-by:  HAUDEL                           Creation-date:  23-Jan-85 16:15:52


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	JSYSF


Related-QAR:  	706126



Problem:  MONNEJ bugchk.

Diagnosis:  No ERJMP after a RPACS jsys in JSYSF.

Solution:  Add an ERJMPR.


                               [End of TCO 6.1.1154]

                               TCO-number:  6.1.1155



Written-by:  WAGNER                           Creation-date:  24-Jan-85 14:17:07


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	CFSSRV




Problem:  CFS Hashing algorithm uses non-prime numbers.

Diagnosis:  Theoretically primes are better

Solution:  Make HSHLEN prime (either ^D509 or ^D251, based on CFSSCA)
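
Why a prime length can help is easy to see with a toy modulo hash.  This
is a sketch only: the hash function used by CFSSRV is not given in this
TCO, and the power-of-two length and the strided keys below are
assumptions chosen for illustration.

```python
def is_prime(n):
    """Trial division; adequate for small table lengths."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def buckets_used(keys, table_len):
    """Count distinct buckets hit by a plain modulo hash (assumed hash)."""
    return len({key % table_len for key in keys})


# Hypothetical keys that share a stride of 256.
strided_keys = [k * 256 for k in range(251)]
```

With a table length of 256 every strided key lands in one bucket, while
with the prime length 251 the same keys spread over all 251 buckets.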


                               [End of TCO 6.1.1155]

                               TCO-number:  6.1.1156



Written-by:  HAUDEL                           Creation-date:  24-Jan-85 16:00:17
Edited-by:   HAUDEL                           Edit-date:      19-Feb-85 16:51:33


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	SCHED


Related-QAR:  	706331



Problem:      Cannot use the .SKSCJ (set job class) function of the SKED% for the
calling job with privs disabled.

Diagnosis:      The code sets up the job number in T1 and then uses T2 in testing
job number.

Solution:      Change CAMN T2,JOBNO to CAMN T1,JOBNO at SKDSJC+16 in SCHED.MAC.


                               [End of TCO 6.1.1156]

                               TCO-number:  6.1.1157



Written-by:  GROSSMAN                         Creation-date:  28-Jan-85 10:15:50


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	STG	NIUSR




Problem:  Entry into NIJJIF too slow.

Diagnosis:  XCT is slower than SKIPE.

Solution:  Replace XCT of CALL NIJJIF with SKIPE flag followed by CALL NIJJIF.


                               [End of TCO 6.1.1157]

                               TCO-number:  6.1.1158



Written-by:  PAETZOLD                         Creation-date:  28-Jan-85 15:22:33


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	IMPANX	IPNIDV	IPCIDV




Problem:    

Interrupt context code (eg. device drivers) for internet often request service
by the internet fork by incrementing INTFLG.  If the system is in the null
job the scheduler will not notice this for a while.  This causes extra 
latency on unloaded systems.

Diagnosis:  

Solution:    

AOS PSKD1 as well as INTFLG.


                               [End of TCO 6.1.1158]

                               TCO-number:  6.1.1159



Written-by:  PAETZOLD                         Creation-date:  28-Jan-85 15:52:00


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	TTYSRV	TVTSRV




Problem:    

TVT output performance could be faster and could make better use of networking
resources.

Diagnosis:    

Currently monitor assumes TVTs are slow speed.

Solution:    

Make monitor believe they are fast.


                               [End of TCO 6.1.1159]

                               TCO-number:  6.1.1160



Written-by:  PAETZOLD                         Creation-date:  28-Jan-85 16:37:25


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  monitor
  Routines-affected:   	ipfree




Problem:    

None observed but code is wrong.  DEFSTR for USIZE in IPFREE has a 36 bit
field ending on bit 17.

Diagnosis:  

Solution:    

Make it an 18 bit field.
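
The inconsistency can be checked arithmetically.  PDP-10 bits are
numbered 0 (leftmost) through 35, so a field that ends on bit 17 has at
most 18 bits to its left.  The helper below is a hypothetical sketch, not
the DEFSTR macro itself.

```python
WORD_BITS = 36  # PDP-10 word size; bit 0 is the leftmost bit


def field_mask(width, rightmost_bit):
    """Return the mask for a field of `width` bits ending at `rightmost_bit`."""
    leftmost_bit = rightmost_bit - width + 1
    if width < 1 or leftmost_bit < 0 or rightmost_bit >= WORD_BITS:
        raise ValueError("field does not fit in a 36-bit word")
    shift = WORD_BITS - 1 - rightmost_bit
    return ((1 << width) - 1) << shift
```

An 18 bit field ending on bit 17 is exactly the left half of the word
(mask 777777,,0), whereas a 36 bit field ending on bit 17 would have to
start 18 bits to the left of bit 0 and is rejected by the check.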


                               [End of TCO 6.1.1160]

                               TCO-number:  6.1.1161



Written-by:  PAETZOLD                         Creation-date:  28-Jan-85 17:15:28


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  monitor
  Routines-affected:   	tcpjfn




Problem:    

It is possible for the monitor to leave a TCP: JFN in a locked state if
you  attempt  to  use TCOPR% on a TCP: JFN that is no longer associated
with a TCB. The problem is caused by using  the  RETERR  macro  without
first  making  sure that the JFN is indeed unlocked. The problem occurs
in two places in TCPJFN.MAC.

Diagnosis:    

EBD.

Solution:    

Fix it.


                               [End of TCO 6.1.1161]

                               TCO-number:  6.1.1162



Written-by:  GROSSMAN                         Creation-date:  29-Jan-85 14:27:57


Edit-checked:         No     Document:          Yes    TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYKNI	NIPAR	MONSYM




Problem:  The "SET PORT NI AVAILABLE" command to OPR doesn't restart the KLNI
as advertised.

Diagnosis:  OPR is trying to put the KLNI into the RUN state.  Unfortunately,
the KLNI may contain bad ucode.  The monitor doesn't know this, and just starts
the KLNI anyway.  Usually, a KNIPER results, sometimes death is the result.

Solution:  Create a new state called the "Reload Requested" state (.EISRR).
Setting the KLNI into this state causes KNILDR to run and reload the KLNI.
This state may only be set if the current KLNI state is not RUN.


                               [End of TCO 6.1.1162]

                               TCO-number:  6.1.1163



Written-by:  MCCOLLUM                         Creation-date:  29-Jan-85 15:35:25


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	FORK




Problem:  
PTAIC bughlts


Diagnosis:  
The code around KSEF1 SETOMs the entry in the SYSFK table before routine
CLNZSC is called to delete the user's non-zero sections. CLNZSC does an
SMAP% JSYS to delete any non-zero sections using .FHSLF as a process handle.
Routine FKHPTX attempts to translate .FHSLF by looking in the SYSFK table
and gets the -1 put there by KSEF1. A string of incorrect references based
on this -1 eventually causes SETCPT to try to map in a page table for the
section using a bogus SPT index. A reference to this page table by SECPTR
causes the crash.


Solution:  
Do not clear the entry in SYSFK until after the non-zero sections are
successfully deleted by CLNZSC.



                               [End of TCO 6.1.1163]

                               TCO-number:  6.1.1164



Written-by:  GRANT                            Creation-date:  31-Jan-85 07:48:54
Edited-by:   GRANT                            Edit-date:      31-Jan-85 07:50:45


Edit-checked:         No     Document:          Yes    TCO-tested:  Yes
Maintenance-release:  Yes    Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	phyp2	PHYSIO


Related-QAR:  	706360



Problem:    RP20s don't have drive serial numbers and the current method of faking
them doesn't work very well.

Diagnosis:    Using CHECKD to put a number in the disk's homeblock is a nuisance
for the system manager and also causes some special case RP20 code in the
monitor which seems needless and appears to be bug-prone.

Solution:    When an RP20's UDB is created, the monitor will make up a drive
serial number by adding the unit number to 8000 (decimal) and placing it in
the UDB.  Then whenever a serial number is required, the RP20 is guaranteed
to have one, just like any other disk.

The scheme implies a restriction, namely, all RP20s connected to systems in
a cluster must have unique unit numbers.
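
The scheme and the restriction it implies can be sketched as follows.
The function names are hypothetical; only the base value of 8000
(decimal) comes from the text above.

```python
RP20_SERIAL_BASE = 8000  # decimal, as stated in the solution


def rp20_serial(unit_number):
    """Made-up drive serial number for an RP20 unit."""
    return RP20_SERIAL_BASE + unit_number


def conflicting_units(unit_numbers):
    """Return unit numbers that occur more than once across a cluster.

    Two RP20s with the same unit number would get the same serial number.
    """
    seen = set()
    duplicates = set()
    for unit in unit_numbers:
        if unit in seen:
            duplicates.add(unit)
        seen.add(unit)
    return sorted(duplicates)
```

For example, unit 3 gets serial number 8003, and a cluster containing two
RP20s numbered 1 would be reported as a conflict.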


                               [End of TCO 6.1.1164]

                               TCO-number:  6.1.1165



Written-by:  GRANT                            Creation-date:  31-Jan-85 13:08:46


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	phyp2




Problem:  Homeblocks don't get checked when an RP20 comes online.
Diagnosis:  No code.
Solution:  Before calling PHYONL, set US.CHB.

                               [End of TCO 6.1.1165]

                               TCO-number:  6.1.1166



Written-by:  MOSER                            Creation-date:  31-Jan-85 15:25:53


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	STG	GLOBS	PROLOG	CFSSRV	LINEPR	DISC
			PAGUTL




Problem:   The monitor is too slow. Especially when accessing long files
randomly.


Diagnosis:   Management of OFN resources is not done optimally. Since
many Jsyses use OFNs any improvement in this area can potentially
speed up the system significantly. Long files are especially bad since they
contain many OFNs.

Solution:  Change OFN assignment to use a hash and link algorithm. Change
OFNJFN to do a single compare instead of a costly search for long files.
Change all ASxOFN callers to conform to the new sequence. Most of the
changes are outlined in the OFN-MANAGMENT-PERFORMANCE.MEM spec.
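
One reading of "hash and link" is a hash table whose buckets hold chains
of entries, so a lookup examines one short chain instead of the whole
table.  The sketch below shows that general technique only; the class,
the key format and the bucket count are hypothetical, and the actual
design is in the spec named above.

```python
class OfnTable:
    """Chained hash table mapping a file key to a small integer handle."""

    def __init__(self, nbuckets=7):
        self._buckets = [[] for _ in range(nbuckets)]
        self._next = 1  # next handle to hand out

    def _chain(self, key):
        return self._buckets[hash(key) % len(self._buckets)]

    def assign(self, key):
        """Return the existing handle for key, or link in a new entry."""
        chain = self._chain(key)
        for existing_key, ofn in chain:
            if existing_key == key:
                return ofn
        ofn = self._next
        self._next += 1
        chain.append((key, ofn))
        return ofn
```

Assigning the same key twice returns the same handle, and the search
cost depends on the chain length rather than on the table size.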


                               [End of TCO 6.1.1166]

                               TCO-number:  6.1.1167



Written-by:  TBOYLE                           Creation-date:  31-Jan-85 15:50:45


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYP2




Problem:  RP20's never get bad blocks into the BAT BLOCKS.

Diagnosis:  PHYP2 never properly handled the CLASS 5 ERROR.
CLASS 5 is for DX20 requested command retry. This happens
on all data errors. In IBM land, when a data error occurs,
the drive requests the channel to retry the transfer several
times before reporting an error. i.e. the channels perform
error recovery.

We have always ignored the CLASS 5 ERROR. The microcode as of
edit 17 also gives us this error under some specific conditions
which we will need to check for.

Solution:  Change the CLASS 5 error handler to look for appropriate
data errors and flag them properly for BAT BLOCK processing. Errors
that are not due to data are carefully checked for and left as
device errors.


                               [End of TCO 6.1.1167]

                               TCO-number:  6.1.1169



Written-by:  GROSSMAN                         Creation-date:   2-Feb-85 11:11:25


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	GLOBS	STG	PHYH2	PHYKNI	DIAG




Problem:  Diagnostics DFPTA and DFNIE do not work with the KLNI.

Diagnosis:  They were trying to do DIAG% functions to read and write the channel
logout areas.  These functions would fail because there was no CDB for the
channel the KLNI uses (channel 5).

Solution:  Create a CDB and dispatch table for the KLNI.  Fill it in with the
bare minimum of information needed to support DIAG and keep PHYSIO off my
back.  In addition, move the initialization of the KLNI from PHYH2 to PHYSIO
(by modifying PHYCHT in STG).  Also, put some checks into various DIAG functions
so that programmers cannot attempt to do disk type stuff to the KLNI.


                               [End of TCO 6.1.1169]

                               TCO-number:  6.1.1170



Written-by:  GROSSMAN                         Creation-date:   2-Feb-85 11:37:59


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  Yes


Program:  MONITOR
  Routines-affected:   	PHYKNI




Problem:  The KNISTP BUGCHK does not print the micro PC if the KLNI is still
running.

Solution:  Stop the KLNI just before doing the BUGCHK.  This will help detect
microcode loops.


                               [End of TCO 6.1.1170]

                               TCO-number:  6.1.1171



Written-by:  PAETZOLD                         Creation-date:   3-Feb-85 15:13:15


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  monitor
  Routines-affected:   	PHYH2




Problem:    

ILLGO BUGHLTs on systems with 4096K.

Diagnosis:    

Channels talk to physical memory.  At the end of a transfer the logout
area contains the address+1 of the last word written (or read).  This is 
a modulo 22 bit address.  ie. when writing to page 17777 of memory the
logout area will have a zero.  

The code at CKERR2 fetches the logout address and decrements it.  However
this results in a minus one and not 17777.  The CAME then fails and we get 
an ILLGO.

Solution:    

Insert an ANDX to retain only the desired bits (and simulate modulo 22 bit
addressing).
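
The arithmetic can be illustrated as follows.  This is a sketch of the
masking step only, with hypothetical names; it is not the CKERR2 code.

```python
ADDRESS_BITS = 22
ADDRESS_MASK = (1 << ADDRESS_BITS) - 1  # 17777777 octal
PAGE_SHIFT = 9                          # 512-word pages


def last_word_address(logout_address):
    """Address of the last word transferred, modulo 22 bits."""
    return (logout_address - 1) & ADDRESS_MASK


def page_of(address):
    """Physical page number containing the given address."""
    return address >> PAGE_SHIFT
```

A logout value of zero, seen after a transfer that ends on the last word
of page 17777, now yields 17777777 (page 17777) instead of minus one.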


                               [End of TCO 6.1.1171]

                               TCO-number:  6.1.1172



Written-by:  PAETZOLD                         Creation-date:   3-Feb-85 16:33:01


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  monitor
  Routines-affected:   	DSKALC




Problem:    

assembly errors from dskalc.

Diagnosis:    

hosers.

Solution:    

Change OFNPTT symbol to OFPTT to avoid conflict with new stuff in PROLOG.


                               [End of TCO 6.1.1172]

                               TCO-number:  6.1.1173



Written-by:  GLINDELL                         Creation-date:   5-Feb-85 09:55:22


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	sclink	ntman	router




Problem:  LOOP NODE does not work
Diagnosis:  Never tested
Solution:  Fix it

                               [End of TCO 6.1.1173]

                               TCO-number:  6.1.1174



Written-by:  GLINDELL                         Creation-date:   5-Feb-85 10:18:38
Edited-by:   GLINDELL                         Edit-date:       5-Feb-85 10:20:05


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	FORK


Related-QAR:  	706220



Problem:    
It is common practice to set T3 to -1 in the EPCAP% jsys.  This will
enable all possible capabilities.  When the EPCAP% asks the ACJ for
permission, it passes the unmodified T3 to ACJ.  When the ACJ sees
the -1 it is impossible to make a decision since the actual bits to
be enabled cannot be distinguished.

Diagnosis:  
Solution:    
If the user passes -1 in T3, then calculate the bits the user
is actually trying to enable.
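
A sketch of the idea, under the assumption that "the bits the user is
actually trying to enable" means the capabilities the process is allowed
to enable.  The function and parameter names are hypothetical, and the
example bit values are arbitrary.

```python
WORD_MASK = (1 << 36) - 1  # -1 in a 36-bit word


def bits_to_report(requested, possible):
    """Capability bits to pass to the ACJ for its decision.

    requested: the value the user supplied in T3
    possible:  capabilities the process may enable (assumed input)
    """
    if requested == WORD_MASK:
        # "Enable everything" really means "everything I am allowed".
        return possible
    return requested
```

The ACJ then sees a specific set of bits rather than an
undistinguishable -1.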


                               [End of TCO 6.1.1174]

                               TCO-number:  6.1.1175



Written-by:  TBOYLE                           Creation-date:   5-Feb-85 13:35:17
Edited-by:   TBOYLE                           Edit-date:       5-Feb-85 13:43:13


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYP2




Problem:  SPEAR does not report all of the extended status bytes for
RP20 device/data errors.

Diagnosis:  Although there is room for them in the SYSERR block,
the monitor does not fill them all in.

Solution:  Fix PHYP2 to use all 80 status bytes. Change SNSNUM, and 
fix the calculation for number of words based on SNSNUM. We will
use 20 words.
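
The word count follows from the byte count by a ceiling division.  The
sketch assumes four 8-bit bytes are stored per 36-bit word, which is
consistent with 80 bytes occupying 20 words; the function name is
hypothetical.

```python
def words_needed(num_bytes, bytes_per_word=4):
    """Number of words required to hold num_bytes (rounding up)."""
    return -(-num_bytes // bytes_per_word)
```

For 80 status bytes this gives 20 words; a byte count that is not a
multiple of four rounds up to the next whole word.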


                               [End of TCO 6.1.1175]

                               TCO-number:  6.1.1176



Written-by:  GLINDELL                         Creation-date:   5-Feb-85 15:17:56


Edit-checked:         No     Document:          Yes    TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	JSYSA


Related-QAR:  	706380



Problem:  
Can't delete section zero pages from code in non-zero section.

Diagnosis:  
PMAP% code ignores PM%EPN if it's the "delete process page" function.

Solution:  
At PMAP0 + a few, if it's the delete option, read the user's value of
PM%EPN before calling FKHPTX.

Note: the documentation for PMAP% should be changed.  It currently
states that PM%EPN cannot be used with "delete process pages" (case IV).
As of this TCO, this restriction has been removed.


                               [End of TCO 6.1.1176]

                               TCO-number:  6.1.1177



Written-by:  GLINDELL                         Creation-date:   5-Feb-85 17:11:05


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	SYSERR


Related-QAR:  	706428



Problem:  
The bug description is trashed when looking at bug entries with SPEAR.

Diagnosis:  
18-bit arithmetic for 30-bit values.

Solution:  
Single instruction patch at SEBCP3 to use a 1-word global bytepointer.
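
The TCO does not show the failing code.  The Python sketch below only
illustrates the class of bug: address arithmetic limited to 18 bits
discards the section number of a 30-bit address.  The address value
and all names are hypothetical.

```python
HALFWORD_MASK = (1 << 18) - 1   # in-section (local) address bits
GLOBAL_MASK = (1 << 30) - 1     # section number plus in-section address


def advance(address: int, offset: int, mask: int) -> int:
    """Add offset to address, keeping only the bits in mask."""
    return (address + offset) & mask


start = 0o3777776                               # hypothetical: section 3, near its end
truncated = advance(start, 4, HALFWORD_MASK)    # section number is lost
correct = advance(start, 4, GLOBAL_MASK)        # carries into section 4
```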


                               [End of TCO 6.1.1177]

                               TCO-number:  6.1.1180



Written-by:  GRANT                            Creation-date:   7-Feb-85 08:28:52
Edited-by:   GRANT                            Edit-date:       7-Feb-85 08:34:53


Edit-checked:         Yes    Document:          No     TCO-tested:  Yes
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYSIO




Problem:      Wrong CI wire information in PDB on disk.

Diagnosis:      When a pack gets moved from one drive to another, the PDB is cleared,
the new drive serial number is set along with our node's info.  But, a bad
SKIP instruction was preventing the current CI wire status from getting set.

Solution:      In routine CLRPDB, change SKIPGE to SKIPL so CALL to PTHSTS occurs.


                               [End of TCO 6.1.1180]

                               TCO-number:  6.1.1181



Written-by:  GROSSMAN                         Creation-date:   7-Feb-85 14:31:21


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYKNI




Problem:  

Support new KLNI microcode version number format.  There are now major and
minor version numbers, and an edit number.  NISRV will not start the KLNI
if the major and minor version do not exactly match the expected values.

The expected values are in UCVMAJ for the major version number, and UCVMIN
for the minor version number.  If the microcode version does not match the
values in UCVMAJ and UCVMIN, a KNIVER BUGxxx will result.  The data items
are:

		1) Bad major version
		2) Bad minor version
		3) Expected major version
		4) Expected minor version
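
The check described above can be sketched in Python as follows.  This
is an illustration, not the PHYKNI code; check_ucode and
VersionMismatch are hypothetical names, and the edit number is left
out of the comparison because the text requires only the major and
minor versions to match.

```python
from typing import NamedTuple, Optional


class VersionMismatch(NamedTuple):
    """The four data items, in the order listed above."""
    bad_major: int
    bad_minor: int
    expected_major: int
    expected_minor: int


def check_ucode(major: int, minor: int,
                expected_major: int,
                expected_minor: int) -> Optional[VersionMismatch]:
    """Return None on an exact match, else the data items to report."""
    if (major, minor) == (expected_major, expected_minor):
        return None
    return VersionMismatch(major, minor, expected_major, expected_minor)
```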


                               [End of TCO 6.1.1181]

                               TCO-number:  6.1.1183



Written-by:  GRANT                            Creation-date:   8-Feb-85 17:09:17
Edited-by:   GRANT                            Edit-date:       8-Feb-85 17:12:55


Edit-checked:         Yes    Document:          Yes    TCO-tested:  Yes
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYMVR	MONSYM




Problem:    Booting a system with NOKLIP patched to contain a non-zero value causes
SETSPD to produce the error MSCPX1 (No MSCP server in current monitor) for each
ALLOW command in CONFIG.

Diagnosis:    The error code is being returned from the monitor to SETSPD for each
SMON%.  The error is misleading since it's the same one you get if you build a
monitor without the MSCP server module PHYMVR.MAC.

Solution:    Create a new error code MSCPX4 whose meaning is "MSCP server not
currently running" and have PHYMVR return it instead of MSCPX1 when the server
has not been initialized, which is the state it is in if the system is not
using a CI.


                               [End of TCO 6.1.1183]

                               TCO-number:  6.1.1184



Written-by:  GRANT                            Creation-date:  11-Feb-85 08:37:55
Edited-by:   GRANT                            Edit-date:      11-Feb-85 09:36:58


Edit-checked:         Yes    Document:          No     TCO-tested:  Yes
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYSIO	phyklp




Problem:    TOPS-20 refuses to access a disk when TOPS-10 is run on one of the
KLs in the cluster which once ran TOPS-20.

Diagnosis:    TOPS-20 assumed that 1) systems on the CI don't change their node
numbers, 2) there are no VAXes on the CI, and 3) a KL on the CI must be
running TOPS-20.

Solution:    Add logic to handle the following cases: 1) node x was once a KL
but is now an HSC or VAX, and 2) TOPS-20 and TOPS-10 can be run
interchangeably on a KL in the cluster.


                               [End of TCO 6.1.1184]

                               TCO-number:  6.1.1185



Written-by:  GLINDELL                         Creation-date:  11-Feb-85 11:31:05


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	JSYSA


Related-QAR:  	838001



Problem:  
All accounts that have an /EXPIRATION date set in ACCOUNTS-TABLE.BIN
will get "Account has expired" independent of what the expiration date
was set to.

Diagnosis:  
Someone changed CHKEXP to CALL LGTAD instead of doing a GTAD% jsys.
That was a good idea, but unfortunately LGTAD does not preserve T2/B.

Solution:  
Save T2 over the call to LGTAD.


                               [End of TCO 6.1.1185]

                               TCO-number:  6.1.1187



Written-by:  GRANT                            Creation-date:  12-Feb-85 07:07:25


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYSIO




Problem:  Disk forced offline when it should be accessible.
Diagnosis:  Bring up a system not on a CI and have a MASSBUS disk dual-ported
to another system.  TOPS-20 will refuse access to the disk;  this is the
desired action.  However, single-porting the disk should cause TOPS-20 to
allow access to the disk but it doesn't due to a bug which didn't get the
forced-offline bit cleared.
Solution:  At location UPDBYE, ANDCAM the U1.OFS bit into the status word.

                               [End of TCO 6.1.1187]

                               TCO-number:  6.1.1188



Written-by:  WAGNER                           Creation-date:  12-Feb-85 09:56:16


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	SCHED




Problem:  DDMP AND CHKR CHECKED TOO OFTEN IN SKDLV8

Diagnosis:  NO NEED TO CHECK EVERY 20 MS

Solution:  MOVE THE CHECKING TO CLK2CL, AND CHECK EVERY 10 S.


                               [End of TCO 6.1.1188]

                               TCO-number:  6.1.1189



Written-by:  GRANT                            Creation-date:  12-Feb-85 10:20:51


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYSIO




Problem:  Logic of code too difficult to follow.
Diagnosis:  Routine names are confusing and routines do more than one
logical function.
Solution:  Move some code so that CHKPDB, CLRPDB, and RSTPDB, actually
check, clear, and reset the PDB, respectively.

                               [End of TCO 6.1.1189]

                               TCO-number:  6.1.1190



Written-by:  MELOHN                           Creation-date:  12-Feb-85 14:26:19


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	LATSRV




Problem:    PLUTOs running LAT V1.1 software can occasionally crash at
PC 000002 when a user disconnects from a TOPS-20 host.

Diagnosis:    The PLUTO crashed with an invalid stop message. What
happens is this: when the number of slots on a TOPS-20 host is
decreased to 0, the spec says the circuit should be stopped. Both
TOPS-20 AND the server try to send a stop circuit message to close the
circuit. If the server's message arrives after we have cleared the
circuit database but before the server itself receives the TOPS-20
generated stop message, TOPS-20 generates a second, bogus stop message
which crashes the server.

Solution:    Don't send the stop message to the server when the number
of slots becomes zero. As part of a more general fix, a host timer
will be implemented such that if the server doesn't stop the circuit
within the keep-alive-timer * 2, TOPS-20 will send the stop circuit
message itself.


                               [End of TCO 6.1.1190]

                               TCO-number:  6.1.1191



Written-by:  MELOHN                           Creation-date:  12-Feb-85 14:37:06


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	LATSRV




Problem:    Poseidons (DECserver-100) occasionally get -207- protocol
violation messages, which stop the affected slot. PLUTO-based LAT
users see a "node stopped circuit" message.

Diagnosis:    The slot multiplexor routine can be called to send a "must
reply NOW" message to the server. In the case that there is no slot
data to be sent to the server, the slot formatting routine is
bypassed. This does not correctly zero the number of slots in the
message, and therefore a valid message with an invalid number of slots
is sent to the server.

Solution:    Zero the number of slots in the main loop of the slot
multiplexor routine, not during slot formatting.
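
A minimal Python sketch of why the reset belongs in the main loop.
It assumes the message header is reused from one message to the next;
SlotMux and its fields are hypothetical names, not LATSRV symbols.

```python
class SlotMux:
    """Toy slot multiplexor with a header reused across messages."""

    def __init__(self):
        self.header = {"num_slots": 0}

    def build(self, slot_data, reply_now=False):
        # The fix: zero the count here, on every pass, so a message
        # with no slot data cannot inherit a stale count.
        self.header["num_slots"] = 0
        slots = []
        if slot_data:                      # formatting is bypassed when empty
            slots = self._format(slot_data)
        return {"reply_now": reply_now,
                "num_slots": self.header["num_slots"],
                "slots": slots}

    def _format(self, slot_data):
        self.header["num_slots"] = len(slot_data)
        return list(slot_data)
```

If the reset lived only in _format, the second of two messages, one
with data and one without, would carry the first message's count.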


                               [End of TCO 6.1.1191]

                               TCO-number:  6.1.1192



Written-by:  MELOHN                           Creation-date:  12-Feb-85 14:57:01


Edit-checked:         No     Document:          Yes    TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	LATSRV




Problem:    Clearing all LAT service names with LCP crashes the monitor
with SKDPF1.

Diagnosis:    The multi-cast building routine assumes that there is at
least one service offered in the multi-cast message. If that service
is deleted, the routine attempts to load a byte from a garbage
location.

A LAT host by definition must offer at least one service, so it should
not be possible for the user to clear all offered services.

Solution:    Fix the multi-cast building routine to check to make sure
there is at least one service before rebuilding the multi-cast
message. Change the LATOP% jsys to return LATX07 (Invalid or unknown
LAT service name) if the user attempts to clear the last service name
offered.
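
The guard on the last service can be sketched in Python as follows.
The list-based service table, LatError, and clear_service are
hypothetical; only the LATX07 code comes from the TCO.

```python
class LatError(Exception):
    """Carries a TOPS-20 style error mnemonic."""


def clear_service(services: list, name: str) -> None:
    """Remove a service name, refusing to remove the last one."""
    if name not in services:
        raise LatError("LATX07")      # invalid or unknown service name
    if len(services) == 1:
        raise LatError("LATX07")      # a host must offer at least one service
    services.remove(name)
```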


                               [End of TCO 6.1.1192]

                               TCO-number:  6.1.1193



Written-by:  PALMIERI                         Creation-date:  12-Feb-85 16:18:17


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	GLOB	STG




Problem:  DECnet called unnecessarily from LV8CHK before it is initialized.
LV8CHK calls to DECnet are into section 1 when calls to section 6 would
require less code in the DECnet modules and be somewhat faster.

Diagnosis:  No code


Solution:  Check D36IFG in LV8CHK before calling DECnet.  Make calls to
	  DECnet be of the form:  CALL @[XCDSEC,,ADDRESS]


                               [End of TCO 6.1.1193]

                               TCO-number:  6.1.1194



Written-by:  LOMARTIRE                        Creation-date:  13-Feb-85 12:06:26
Edited-by:   LOMARTIRE                        Edit-date:      13-Feb-85 14:18:04


Edit-checked:         No     Document:          Yes    TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	CFSSRV	MSTR


Related-TCO:  	6.1.1195



Problem:     The current error codes returned on a failing structure mount request 
are not very descriptive and do not imply what the cause of the failure was.

Diagnosis:     CFS will vote in order to gain the requested access to the 
structure.  If the vote fails, there is no way to determine why we were told 
NO.

Solution:     Implement a way for a reason code to be passed back when a CFS node 
decides to say NO to an incoming vote.  Have the voting node interpret this
reason code and transform it into a meaningful TOPS-20 error code.  Currently, 
only the structure resource handling routines will do this.  Below are the 
error codes which will be returned.  Note that some are new and others are just 
new text on an already existing error code.

    MSTX44 - Mount type refused by another CFS processor
    MSTX45 - Structure naming or drive serial number conflict in CFS cluster
    MSTX47 - Shared access denied; already set exclusive in CFS cluster
    MSTX48 - Exclusive access denied; access conflict in CFS cluster
    MSTX49 - Structure naming conflict in CFS cluster
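
The translation step can be sketched in Python as a table lookup.
The numeric reason codes and the fallback to MSTX44 are assumptions
made for illustration; the TCO does not list the reason codes.  Only
the error mnemonics and their text come from the list above.

```python
ERROR_TEXT = {
    "MSTX44": "Mount type refused by another CFS processor",
    "MSTX45": "Structure naming or drive serial number conflict in CFS cluster",
    "MSTX47": "Shared access denied; already set exclusive in CFS cluster",
    "MSTX48": "Exclusive access denied; access conflict in CFS cluster",
    "MSTX49": "Structure naming conflict in CFS cluster",
}

# Hypothetical reason codes; the real values are defined in the monitor.
REASON_TO_ERROR = {
    1: "MSTX45",
    2: "MSTX47",
    3: "MSTX48",
    4: "MSTX49",
}


def mount_error(reason: int) -> str:
    """Translate a NO vote's reason code into an error mnemonic."""
    # Assumption: unknown reasons fall back to the generic refusal.
    return REASON_TO_ERROR.get(reason, "MSTX44")
```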


                               [End of TCO 6.1.1194]

                               TCO-number:  6.1.1196



Written-by:  LOMARTIRE                        Creation-date:  13-Feb-85 12:20:29


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	SCAPAR




Problem:   SC.IDL gets called too often from the scheduler.

Diagnosis:   The constant used to determine the interval between calls is being 
calculated incorrectly.

Solution:   Fix the calculation.  Instead of the timing being once every 3
milliseconds, it will be once every 160 (decimal) milliseconds.
This will result in SC.IDL being called roughly half as often.


                               [End of TCO 6.1.1196]

                               TCO-number:  6.1.1197



Written-by:  WAGNER                           Creation-date:  13-Feb-85 13:45:53


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	SCHED




Problem:  BGND includes time that is actually idle.

Diagnosis:  New code tries some background tasks before running the NUL job.
	   Because of the way the code was written, the time that it spends
	   doing this is charged to BGND.

Solution:  Modify RDSIVL to not accumulate time if flag BKIDLF is set. This
	  way time is charged to whichever idle is appropriate for the time
	  that we spend doing background tasks before running the NUL job.


                               [End of TCO 6.1.1197]

                               TCO-number:  6.1.1198



Written-by:  GRANT                            Creation-date:  13-Feb-85 14:32:53


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	phyklp




Problem:  Overzealous sending of REQUEST-IDs.
Diagnosis:  When a virtual circuit is closed TOPS-20 tries madly to 
re-establish the connection by sending REQUEST-IDs once a second to the
node which went away.  This doesn't seem necessary.
Solution:  In the once-a-second checker, remove the code which checks for
closed virtual circuits.  Communication will be resumed by more standard
methods when the node reappears.


                               [End of TCO 6.1.1198]

                               TCO-number:  6.1.1200



Written-by:  LOMARTIRE                        Creation-date:  14-Feb-85 15:52:01


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	CFSSRV




Problem:   With the new DSA disks, the HDA can be replaced during timesharing and 
the drive serial number of the drive changes.  This causes a lot of problems 
with CFS since now this structure may be known by another "name" since CFS uses
the serial number as root of one of the structure resources.

Diagnosis:   CFS was not coded to handle this case.

Solution:   Whenever PHYSIO detects the case of a serial number change, it will 
update the UDB of the disk and call CFSDSN in order to update the structure 
resource.  The old resource block will be unlinked, updated, and relinked.


                               [End of TCO 6.1.1200]

                               TCO-number:  6.1.1201



Written-by:  GRANT                            Creation-date:  18-Feb-85 11:09:38
Edited-by:   GRANT                            Edit-date:      18-Feb-85 11:17:32


Edit-checked:         Yes    Document:          No     TCO-tested:  Yes
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYSIO




Problem:    The UDBs for all HSC-based disks contain the wrong value for the
high-order word of the drive serial number.

Diagnosis:    Since non-HSC-based disks only have a 1-word DSN, TOPS-20 makes
up the high-order word to fill in the UDB.  It was not correctly making the
distinction between HSC and non-HSC disks and, thus, smashing the high-order
word of the HSC-based disks' UDBs.

Solution:    In routine PHYDUA, fix the index register to be P1 instead of P3
so the CDB status word is correctly obtained.


                               [End of TCO 6.1.1201]

                               TCO-number:  6.1.1202



Written-by:  GROSSMAN                         Creation-date:  18-Feb-85 22:14:03
Edited-by:   GROSSMAN                         Edit-date:      20-Feb-85 09:39:12


Edit-checked:         No     Document:          Yes    TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	NIUSR	MONSYM




Problem:    Implement Read Channel Counters.

Diagnosis:  

Solution:  


                               [End of TCO 6.1.1202]

                               TCO-number:  6.1.1203



Written-by:  GROSSMAN                         Creation-date:  18-Feb-85 22:50:03
Edited-by:   GROSSMAN                         Edit-date:      20-Feb-85 09:48:36


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	NIUSR




Problem:   Programs doing blocking transmits with the NI% JSYS wake up too
frequently.

Diagnosis:   The scheduler test for blocking transmits always succeeds because
an AC was not being set up.

Solution:   Setup the AC before using it.


                               [End of TCO 6.1.1203]

                               TCO-number:  6.1.1204



Written-by:  GROSSMAN                         Creation-date:  18-Feb-85 23:10:53
Edited-by:   GROSSMAN                         Edit-date:      20-Feb-85 09:57:12


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYKNI




Problem:    PHYKNI treats KLNI CRAM parity error 7777 as an unplanned CRAM
parity error, when in reality it is planned (i.e., intentional).

Diagnosis:    PHYKNI treats KLNI CRAM parity errors in the range 7750-7775
(inclusive) as Planned CRAM Parity Errors.  Unfortunately, the microcoders
are now using 7777 as a PCPE.  So, now the rules are: all parity errors
between 7750 and 7777 (inclusive), but excluding 7776, are Planned and
must be treated specially.

Solution:    Follow the rules stated above.
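
The rule can be written as a single predicate.  A Python sketch,
assuming the values quoted above are octal, as is usual for PDP-10
listings; the function name is hypothetical.

```python
def is_planned_cram_parity_error(value: int) -> bool:
    """True for 7750 through 7777 (octal, inclusive), except 7776."""
    return 0o7750 <= value <= 0o7777 and value != 0o7776
```

Under the old rule 7777 fell outside the planned range; here it is
planned and only 7776 is excluded.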


                               [End of TCO 6.1.1204]

                               TCO-number:  6.1.1205



Written-by:  GROSSMAN                         Creation-date:  18-Feb-85 23:34:05
Edited-by:   GROSSMAN                         Edit-date:      20-Feb-85 10:05:49


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYKNI




Problem:    Code wrong at NIDPT + a few.

Diagnosis:    The code was expecting a temp AC to be preserved across a
subroutine call.  In reality, the code worked correctly, but it was just
pure luck.  Don't depend on luck.

Solution:    Use a more permanent AC.


                               [End of TCO 6.1.1205]

                               TCO-number:  6.1.1206



Written-by:  PALMIERI                         Creation-date:  19-Feb-85 10:21:45
Edited-by:   PALMIERI                         Edit-date:      19-Feb-85 16:50:13


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	DATIME




Problem:   IDTIM fails when asked to suppress inputting of date and time

Diagnosis:   IDTNCS subroutine called by IDTIM notices that inputting of date
and time is suppressed and attempts to return the current date and time.
After doing the ODCNV it forgets that time input is suppressed and returns
what it thinks it input for time, which is garbage.  IDTIM then calls IDCNV
to convert the date and time to internal format; IDCNV notices the garbage
time and returns an error in T1 when IDTIM expects it in T2.

Solution:   Return current time when inputting of time is suppressed.
Look for error code in correct AC.


                               [End of TCO 6.1.1206]

                               TCO-number:  6.1.1207



Written-by:  LOMARTIRE                        Creation-date:  19-Feb-85 10:21:58


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	CFSSRV




Problem:   If a structure is mounted by one system but not another,
and it is moved, then no other system will be able to mount the structure
until it is dismounted.

Diagnosis:   PHYSIO calls CFS at CFRDSN in order to allow CFS to update the 
structure tokens with the new drive serial number.  However, it calculates
the new HSHCOD value for the structure name token wrong.  So, this system will 
always refuse any mount requests for the structure because the HSHCOD values 
won't match.

Solution:   Correctly calculate HSHCOD.


                               [End of TCO 6.1.1207]

                               TCO-number:  6.1.1208



Written-by:  HAUDEL                           Creation-date:  19-Feb-85 11:49:50
Edited-by:   HAUDEL                           Edit-date:      19-Feb-85 16:53:10


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
Related-QAR:  	706364



Problem:         CHECKD's "REBUILD" command fails if the DSKBTTBL file does 
not exist.

Diagnosis:       The DSKAS% tries to get a JFN on an existing DSKBTTBL
file and if the file is not found, it does not try to
write a new one. Even if the DSKAS% does find an existing DSKBTTBL file,
it does not seem to do anything with the contents.

Solution:       If the DSKAS% fails to get a JFN because the file does not
exist, JRST to the code that writes a new DSKBTTBL.


                               [End of TCO 6.1.1208]

                               TCO-number:  6.1.1209



Written-by:  GROSSMAN                         Creation-date:  19-Feb-85 11:58:07
Edited-by:   GROSSMAN                         Edit-date:      20-Feb-85 10:11:12


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYKNI




Problem:   The KLNI sometimes doesn't restart after a continuable error.  It
just hangs in the INIT state.

Diagnosis:   Sometimes when the KLNI gets an error, there are items left on the
response queue.  When the KLNI gets restarted, it never gets an interrupt
to tell it to look at the response queue.  This happens because the interrupt
is generated only if a response is put onto an empty response queue.

Solution:   Clean up (empty out) the response queue during error processing.
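
A toy Python model of the hang and the fix.  It assumes only what the
diagnosis states: an interrupt is raised when a response is added to
an empty queue and at no other time.  ResponseQueue and its members
are hypothetical names.

```python
from collections import deque


class ResponseQueue:
    """Queue that interrupts only on the empty to non-empty transition."""

    def __init__(self):
        self.items = deque()
        self.interrupts = 0

    def put(self, item):
        was_empty = not self.items
        self.items.append(item)
        if was_empty:
            self.interrupts += 1

    def drain(self):
        """Empty the queue, as done during error processing."""
        drained = list(self.items)
        self.items.clear()
        return drained
```

If an item is left on the queue across a restart, the next put raises
no interrupt.  Calling drain during error processing restores the
empty state, so the first response after the restart interrupts again.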


                               [End of TCO 6.1.1209]

                               TCO-number:  6.1.1210



Written-by:  GROSSMAN                         Creation-date:  19-Feb-85 12:19:47
Edited-by:   GROSSMAN                         Edit-date:      20-Feb-85 10:12:36


Edit-checked:         No     Document:          Yes    TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	NIUSR	MONSYM




Problem:   Implement Read Portal Counters.

Diagnosis:  

Solution:  


                               [End of TCO 6.1.1210]

                               TCO-number:  6.1.1213



Written-by:  PALMIERI                         Creation-date:  19-Feb-85 17:30:54


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	Ntman




Problem:  No entry in line parameter table for TOPS20 specific parameter
RECEIVE BUFFER SIZE (2500)

Diagnosis:  Never put in

Solution:  Add it


                               [End of TCO 6.1.1213]

                               TCO-number:  6.1.1214



Written-by:  MELOHN                           Creation-date:  19-Feb-85 18:54:58


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	LATSRV




Problem:    TOPS-20 provided host services with "Dynamic" ratings never change.

Diagnosis:    Code to set the rating based on the load average was not
being called regularly. The formula  255-INT(15-minute load average) should be
more flexible to provide a better indication of how loaded a system is.

Solution:    Rewrite the DYNRAT routine to use the formula
255-INT(4*(15-minute load average)), add a cell to the host node
database to contain the host's current rating, and check to see if
this rating needs to be updated each time the multi-cast message is
sent out.
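
The new formula, sketched in Python.  Clamping the result at zero is
an assumption added here so the value stays in the range 0 to 255;
the TCO does not say how very high load averages are handled.  The
function name is hypothetical.

```python
def dynamic_rating(load_avg_15min: float) -> int:
    """255 - INT(4 * load average), clamped at 0 (clamp is assumed)."""
    rating = 255 - int(4 * load_avg_15min)
    return max(0, rating)
```

For a load average of 2.5 this gives 245, where the old formula
255-INT(load average) gave 253.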


                               [End of TCO 6.1.1214]

                               TCO-number:  6.1.1215



Written-by:  PALMIERI                         Creation-date:  20-Feb-85 14:21:22


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  Monitor
  Routines-affected:   	STG	DNADLL	GLOBS




Problem:    Too many instructions executed in LV8CHK if DECnet doesn't have
anything to do.  D36IFG flag checked every time through LV8CHK.

Diagnosis:    LV8CHK calls DNADLL every time to see if there is something to
do.  LV8CHK always checks the DECnet initialized flag.

Solution:    Have LV8CHK check DNADLL's queue and only call it if there is
something to do.  Remove the check of D36IFG since all callees do the right
thing if DECnet is not initialized.


                               [End of TCO 6.1.1215]

                               TCO-number:  6.1.1216



Written-by:  MELOHN                           Creation-date:  22-Feb-85 12:08:43


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	LATSRV




Problem:    Stop circuit messages from the server are not acted upon by
the TOPS-20 Host code.

Diagnosis:    Routine HMSTOP checks incoming stop messages using the
Remote ID field when it should be using the Local ID field.

Solution:    Use the Local circuit ID field instead.


                               [End of TCO 6.1.1216]

                               TCO-number:  6.1.1217



Written-by:  GROSSMAN                         Creation-date:  22-Feb-85 17:09:10


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYKNI




Problem:  Not setting an NIA20's Ethernet address (via the SETSPD command
ETHERNET), results in the system using a 0 address.

Diagnosis:  PHYKNI was getting the Ethernet address from a location that
doesn't get initialized if the SETSPD command is never issued.

Solution:  Initialize the address only if we have a valid address to use.


                               [End of TCO 6.1.1217]

                               TCO-number:  6.1.1218



Written-by:  WAGNER                           Creation-date:  25-Feb-85 08:30:52


Edit-checked:         No     Document:          Yes    TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	JSYSA


Related-QAR:  	706379



Problem:  NIN% only handles radices up to decimal 10

Diagnosis:  Code overly restrictive

Solution:  Make code able to accept radices up to decimal 36. NOUT% can handle
these, so no changes there. Change MONSYM to report IFIXX1 based on new range.
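
The widened range can be illustrated in Python, whose built-in int()
also accepts radices up to 36, using the letters A through Z for the
digit values 10 through 35.  The lower bound of 2 and the function
name are assumptions; this is not the NIN% code.

```python
def parse_number(text: str, radix: int) -> int:
    """Parse text in the given radix, rejecting radices outside 2..36."""
    if not 2 <= radix <= 36:
        raise ValueError("IFIXX1: radix out of range")
    return int(text, radix)
```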


                               [End of TCO 6.1.1218]

                               TCO-number:  6.1.1219



Written-by:  WAGNER                           Creation-date:  25-Feb-85 11:25:28


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	JSYSF


Related-QAR:  	706383



Problem:  User with infinite quota can create inferiors with negative quotas.

Diagnosis:  Poor check made, assumption is that normal return from CKLIQ means
infinite quota, but it can also mean negative.

Solution:  Make the jump conditional upon quota being non-negative.


                               [End of TCO 6.1.1219]

                               TCO-number:  6.1.1222



Written-by:  LOMARTIRE                        Creation-date:  27-Feb-85 10:12:51


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	GTJFN


Related-SPR:  	 17125



Problem:   G1%NLN (no long names) does not work when recognition is used.

Diagnosis:   When the field is recognized, there is no check for the length
of the field which is being returned.

Solution:   Before the field which was recognized is output, check the length 
of the field.  If it is too large, return the appropriate error; either 
GJFX41 or GJFX42.  Note that this is only done for recognition of file names 
and extensions (from routines DEFNAM, DEFEXT, RECNAM, RECEXT).
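
A Python sketch of the added check.  The limits of 6 characters for
a name and 3 for an extension, and the pairing of GJFX41 with names
and GJFX42 with extensions, are assumptions; the TCO names the two
error codes but not which is which.  check_recognized is a
hypothetical name.

```python
MAX_NAME = 6   # assumption: short-name limit for file names
MAX_EXT = 3    # assumption: short-name limit for extensions


def check_recognized(name: str, ext: str):
    """Return an error mnemonic if a recognized field is too long."""
    if len(name) > MAX_NAME:
        return "GJFX41"
    if len(ext) > MAX_EXT:
        return "GJFX42"
    return None
```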


                               [End of TCO 6.1.1222]

                               TCO-number:  6.1.1223



Written-by:  MELOHN                           Creation-date:  27-Feb-85 21:10:30


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	LATSRV




Problem:   If a LAT server crashes, TOPS-20 doesn't find out until the
server comes back and the first person tries to reconnect to the host.
This first person sees the banner followed immediately by "node stopped
circuit", at which time all TTYs on that server are detached and
JOBCOFed. The person has to connect again in order to establish a new
circuit on the host, and the period between the time the server
crashes and the first person attempts to re-connect can be weeks,
during which time all of the users on that server remain connected and
can be charged for TTY connect time.

Diagnosis:   A keep-alive timer needs to exist on the host side of the
LAT circuit to detect those times when the server does not or cannot
send its normal keep-alive message. If a reasonable number of server
keep alive intervals has passed without a message from the server, it
is safe to assume that the server has passed away and the circuit
should be stopped.

Solution:    Implement such a timer, initially set to 6 times the server
keep-alive timer value in seconds. Stop the circuit if no
messages have been received from the server within that interval.
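A minimal sketch of such a host-side timer, in Python rather than
MACRO.  The class and method names are hypothetical; only the factor
of 6 comes from this TCO.  Time is passed in explicitly so the logic
can be followed without a real clock.

```python
KEEPALIVE_MULTIPLIER = 6  # from the TCO: 6 times the server keep-alive timer


class CircuitWatchdog:
    """Host-side keep-alive watchdog for one circuit (hypothetical class)."""

    def __init__(self, server_keepalive_secs, now):
        self.timeout = KEEPALIVE_MULTIPLIER * server_keepalive_secs
        self.last_heard = now
        self.stopped = False

    def message_received(self, now):
        # Any message from the server restarts the interval.
        self.last_heard = now

    def tick(self, now):
        # Stop the circuit once the whole interval has passed in silence.
        if not self.stopped and now - self.last_heard > self.timeout:
            self.stopped = True
        return self.stopped
```

With a 20-second server keep-alive, the circuit is stopped after more
than 120 seconds of silence instead of waiting for the server to
return.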


                               [End of TCO 6.1.1223]

                               TCO-number:  6.1.1225



Written-by:  PAETZOLD                         Creation-date:  28-Feb-85 09:42:47
Edited-by:   PAETZOLD                         Edit-date:       6-Mar-85 16:08:49


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	IPNIDV




Problem:      

ILULK2 BUGHLTS from NISRV and IPIPIP.

Diagnosis:      

This  is  a catch all BUGHLT for problems in the TCP/IP Ethernet buffer
handling stuff.

Caused by a race in the NIPSTO (start output routine).

Solution:      

Use  NTOB  in  the  NCT  as  an  interlock for NIPSTO so that all other
callers will stay away from this routine. This is OK since the  current
possessor  of  the  interlock  will  queue  all linked buffers to NISRV
anyways. Initial code in NIPSTO should be PIOFF instead of NOSKED. Also
reset NBQUE field of buffers in NIPQUE when dequeueing them.

Add IPNDSW debugging code. This code links all send and receive  buffers
given to NISRV and BUGHLTs (IPNMIS and IPNHIT) on any discrepancies.

Other ILULK2s seen during the interim period of these changes were caused
by bugs introduced by the debugging code.  There are no known ILULK2
problems in the TCP/IP code at this time.


                               [End of TCO 6.1.1225]

                               TCO-number:  6.1.1227



Written-by:  LOMARTIRE                        Creation-date:   4-Mar-85 07:57:31
Edited-by:   LOMARTIRE                        Edit-date:      12-Mar-85 10:45:17


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	MEXEC


Related-TCO:  	6.1.1247



Problem:   CFSVFL BUGHLTs

Diagnosis:   The starting of CFS at system startup is very timing dependent
and is sensitive to timing changes elsewhere in the monitor.  
Recently, something has changed in the system startup timing that causes the 
system not to be fully "joined" to the cluster when we try to mount our PS.
This can result in confusion since we may obtain the wrong access.  This will 
eventually result in a CFSVFL BUGHLT.  One reason we were not fully joined is 
that SCA usually is stuck on buffers due to MSCP connections being established 
before we join the cluster.  A temporary fork is around to help solve this 
problem but it is started too late to be of help once we call CFSJYN.

Solution:   TEFORK is the temporary fork used at system startup to ensure that 
the SCA buffers are replenished.  Move it after the call to PPDINX and before 
the call to IPACHK.  In this way, SCA has been initialized and we have a TEFORK 
ready to call SC.ALM whenever needed.


                               [End of TCO 6.1.1227]

                               TCO-number:  6.1.1228



Written-by:  MOSER                            Creation-date:   5-Mar-85 10:31:50


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	TTYSRV




Problem:   Still more ITRLGOs. This one is from TTYSRV when the ACJ refuses to
allow line speed changes.

Solution:   ERJMP .+1 after MTOPR in TTCKSP.


                               [End of TCO 6.1.1228]

                               TCO-number:  6.1.1229



Written-by:  WAGNER                           Creation-date:   5-Mar-85 10:48:46


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	FORK


Related-QAR:  	838045



Problem:  Performance Issues

Diagnosis:  SETJSB doesn't check to see if it needs to do mapping before doing
it. Mapping is expensive.

Solution:  Check whether the mapping is needed before spending the effort.


                               [End of TCO 6.1.1229]

                               TCO-number:  6.1.1231



Written-by:  PALMIERI                         Creation-date:   5-Mar-85 13:48:03


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  Monitor
  Routines-affected:   	DTESRV	GLOBS	STG




Problem:    Too many BUGINFs when MCB DTE is initialized or rebooted.

Diagnosis:  

Solution:    Add DTBUGX which when zero suppresses BUGINFs DTESUI, DTETPR, DN20ST
	for MCB DTEs only.  Default is non-zero.  Place DTEWAT BUGINF under
	FTDEBUG.


                               [End of TCO 6.1.1231]

                               TCO-number:  6.1.1232



Written-by:  HAUDEL                           Creation-date:   5-Mar-85 14:32:35


Edit-checked:         No     Document:          Yes    TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	SCAPAR	MONSYM




Problem:  SCA's connection management symbols are not available to
the users of the SCS%.

Diagnosis:  The connection management symbols are defined in SCAPAR and
not in MONSYM.

Solution:  Define new symbols in MONSYM and have the symbols in 
SCAPAR point to those in MONSYM.


                               [End of TCO 6.1.1232]

                               TCO-number:  6.1.1233



Written-by:  GLINDELL                         Creation-date:   5-Mar-85 15:01:44


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	nrtsrv	cthsrv


Related-QAR:  	838026



Problem:  
NRT and CTERM do not do the right thing when 
^ESET NO LOGINS DECNET-TERMINAL is set.  NRT tests
for 'remote terminals' instead of for 'decnet terminals'
while CTERM doesn't test anything at all.  Also, NRT
returns the wrong reason code 'node shutting down' instead
of 'access not permitted'.

Diagnosis:  

Solution:  
Test SF%MCB for both NRT and CTERM.  Use disconnect reason
code RSNACR (access not permitted).


                               [End of TCO 6.1.1233]

                               TCO-number:  6.1.1234



Written-by:  GLINDELL                         Creation-date:   5-Mar-85 15:37:39


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	JSYSA


Related-QAR:  	838052



Problem:  
Any user can get any password on a system that is not using password
encryption.  This bug is also in all 4.1 and 5.1 systems.  Also, this
can be done in bounded time, proportional to 128 times the number of
characters in the password.

Diagnosis:  
Check password code in JSYSA is not defensive enough.  Monitor uses
one address when reading bytes that match the correct password, another
address when reading bytes that did not match.  Clever use of address
break can exploit this fact.

Solution:  
Use one and the same address whether good or bad bytes are read.
The address break will not reveal anything that way.  Code affected
is at CHKPS3 and CHKPS5.
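The "128 times the number of characters" estimate can be illustrated
with a small simulation.  This is a sketch under stated assumptions:
the oracle below stands in for whatever address break revealed on an
unfixed system and is modeled as the count of leading bytes that
matched.  The function names and the known password length are
hypothetical.

```python
def make_oracle(secret):
    """Model the leak: report how many leading characters matched."""
    probes = 0

    def oracle(guess):
        nonlocal probes
        probes += 1
        matched = 0
        for a, b in zip(secret, guess):
            if a != b:
                break
            matched += 1
        return matched

    def probe_count():
        return probes

    return oracle, probe_count


def recover(oracle, length):
    """Recover a 7-bit ASCII secret one position at a time."""
    known = ""
    for pos in range(length):
        for code in range(128):          # at most 128 tries per character
            candidate = known + chr(code)
            if oracle(candidate) > pos:  # this position matched
                known = candidate
                break
    return known
```

Once the same address is used for matching and non-matching bytes,
the oracle has nothing to report and the search space returns to 128
to the power of the password length.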


                               [End of TCO 6.1.1234]

                               TCO-number:  6.1.1236



Written-by:  PALMIERI                         Creation-date:   6-Mar-85 10:19:10


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  Monitor
  Routines-affected:   	SCJSYS	D36PAR




Problem:    Random BUGHLTs when privileged user continually opens the maximum
	  number of logical links

Diagnosis:    Port database in SCJSYS is built to accommodate maximum number
	    of logical links and is indexed by link number.  However link numbers
	    are assigned by the lower layer (SCLINK) when the logical link
	    is opened and numbers greater than maximum logical link value
	    may be assigned.  This occurs because SCLINK does not immediately
	    release an SLB for a logical link but instead puts it on a
	    queue to be released later.  The user is told that the link is
	    closed and may then open another.  If that happens before the
	    SLB for the previous link is released the link number cannot be
	    reused.  SCLINK expands its database and can then assign a link
	    number that exceeds maximum links.  This number is given to SCJSYS.
	    This causes SCJSYS to index off the end of its port database and
	    trash memory it does not own.

Solution:    Add a table of pointers to the port database indexed by port number.
	   Make this table maximum links times 2 for unprivileged users and
	   maximum links plus 10 for privileged users.  If the link number
	   assigned exceeds the size of the indirect table close the link
	   and return MONX07 error to user.  The indirect table facilitates
	   not having to build the port database until the port is opened.
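The indirect table can be sketched as follows, in Python rather than
MACRO.  The class name, the zero-based link numbers, and the use of a
dictionary as a stand-in for a port database entry are assumptions.
The table sizes and the MONX07 error are as stated above.

```python
MONX07 = "MONX07"  # error name from the TCO, returned here as a plain string


class PortTable:
    """Table of pointers to port database entries (hypothetical class)."""

    def __init__(self, max_links, privileged):
        # Sizes as given in the TCO text.
        size = max_links + 10 if privileged else max_links * 2
        self.slots = [None] * size

    def open_link(self, link_number):
        # Refuse any link number the lower layer assigned past the table.
        if not 0 <= link_number < len(self.slots):
            return MONX07    # the caller is expected to close the link
        # Build the port database entry only when the port is opened.
        if self.slots[link_number] is None:
            self.slots[link_number] = {"link": link_number}
        return self.slots[link_number]
```

The bounds test replaces the unchecked index that ran off the end of
the port database.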


                               [End of TCO 6.1.1236]

                               TCO-number:  6.1.1238



Written-by:  MCCOLLUM                         Creation-date:   6-Mar-85 14:54:28


Edit-checked:         No     Document:          Yes    TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	MANY




Problem:  
There are nearly 400 undocumented BUGxxx's in the monitor as well as a few
hundred improperly documented BUGxxx's.


Diagnosis:  
Same.

Solution:  
Document all the BUGxxx's that are not under DEBUG=1 and fix all the improperly
documented BUGxxx's so the documentation people can distribute BUGS.MAC.


                               [End of TCO 6.1.1238]

                               TCO-number:  6.1.1239



Written-by:  PAETZOLD                         Creation-date:   6-Mar-85 16:11:00


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  monitor
  Routines-affected:   	STG




Problem:    

Should be able to turn off the IPNIN and IPCIN code even though it is not
a supported configuration.  This is possible in 5.4.

Diagnosis:    

Oversight.

Solution:    

Key the IMPANX and IMPDV loadmodules off of ANXN under control of NETN.
Also create a dummy for IMPCHK when ANXN is off and NETN is on.


                               [End of TCO 6.1.1239]

                               TCO-number:  6.1.1241



Written-by:  GROSSMAN                         Creation-date:   7-Mar-85 16:09:39


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	LLMOP




Problem:  COMMMS BUGHLTs when running DFNIS (sometimes called RMTCON).

Diagnosis:  ^C'ing out of DFNIS at the wrong moment leaves some buffers lying
around.  When the fork gets reset, these buffers get returned to the memory
manager.  Unfortunately, NISRV still has pointers to the buffers, and
eventually returns them to LLMOP.  LLMOP then tries to return them to the
memory manager again, resulting in a COMMMS BUGHLT.

Solution:  Create an "ABORT" bit for requests.  When LLMOP tries to clean up
the request buffers, it first sees whether the request has completed.  If it
has, it returns the buffer immediately; otherwise, it just sets the "ABORT" bit.
Eventually, NISRV returns the buffer.  If the abort bit is on, the buffer
is just released with no further ado.
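The two paths can be sketched as follows.  This is a Python
illustration with hypothetical names.  The counter only records how
many times a buffer goes back to the memory manager, which must be
exactly once on either path.

```python
class Request:
    def __init__(self):
        self.completed = False  # set once NISRV has handed the buffer back
        self.aborted = False    # the "ABORT" bit
        self.releases = 0       # times returned to the memory manager


def release(req):
    req.releases += 1


def cleanup(req):
    """Fork reset: LLMOP cleans up the request buffers."""
    if req.completed:
        release(req)            # NISRV is done with it; safe to free now
    else:
        req.aborted = True      # NISRV still holds a pointer; defer


def buffer_returned(req):
    """NISRV hands the buffer back to LLMOP."""
    if req.aborted:
        release(req)            # released with no further ado
    else:
        req.completed = True
```

Either ordering of cleanup and buffer return ends with a single
release, which is what avoids the second return that caused COMMMS.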


                               [End of TCO 6.1.1241]

                               TCO-number:  6.1.1242



Written-by:  PRATT                            Creation-date:   7-Mar-85 19:13:44


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	IPCF




Problem:  

  	1) Must be whopr to use the QUEUE jsys.

	2) IP%MON is not being cleared when it should be, which would
	   allow a user to set the "sent by monitor" bit.

Diagnosis:    

	1) The VALARG routine was checking 3 fields including
	   the field IP%CFC for a non-zero value when the user 
	   wasn't priv'd. The new .IPCCG value (sent on behalf 
	   of the QUEUE jsys) doesn't need privs. All the old values do.

	2) Typo in the code

Solution:    

	1) Check IP%CFC for .IPCCG after checking the status of
	   IP%CFP, and IP%CFM.

	2) Change the T1 to a P1


                               [End of TCO 6.1.1242]

                               TCO-number:  6.1.1244



Written-by:  GLINDELL                         Creation-date:   7-Mar-85 22:19:11
Edited-by:   GLINDELL                         Edit-date:       7-Mar-85 22:22:26


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	POSTLD


Related-QAR:  	838076



Problem:    

Illegal memory read in POSTLD if the monitor PDV is missing.
This should only happen if a bad .CCL file is used to link the
monitor.

Diagnosis:    
POSTLD needs to find the monitor PDV in order to locate the symbol
table.  The .POLOC function of PDVOP% is used.  If there is no PDV
with the requested name, then the PDVOP% will return with a 0 PDVA
and a 0 count of items returned.  However, when I wrote the code
I thought the PDVOP% would generate an error if no matching PDV 
was found.  Since this is not the case, POSTLD will pick up a 0
for the PDVA and after that it's all downhill.

Solution:    
Check for 0 items returned after the PDVOP%, and if no PDV is found,
issue an appropriate error message and abort POSTLD.

Also, add some information on how to debug POSTLD in a comment at the
top of the module.


                               [End of TCO 6.1.1244]

                               TCO-number:  6.1.1245



Written-by:  GRANT                            Creation-date:  11-Mar-85 07:46:45
Edited-by:   GRANT                            Edit-date:      11-Mar-85 10:00:03


Edit-checked:         Yes    Document:          Yes    TCO-tested:  Yes
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	cfssrv	MEXEC	phymvr	globs	PROLOG	STG




Problem:    Jobs get hung when another system in the cluster is shut down for PM.

Diagnosis:    The MSCP server is invisible to the operator on the system going
down and to users on the other systems who have structures mounted through the
MSCP server.  The software provides no help in this situation.

Solution:    When a system is going to cease timesharing, check to see if it has
any disks onlined by other systems through its MSCP server.  If so, warn the
operator that the other systems must be checked for possible structure
dismounting instructions.  On the other systems, check for any structures
mounted through the MSCP server of the system going down.  If there are any,
warn the operator about the other system's pending shutdown and list the
structures that should be dismounted.


                               [End of TCO 6.1.1245]

                               TCO-number:  6.1.1247



Written-by:  LOMARTIRE                        Creation-date:  12-Mar-85 10:45:16


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	CFSSRV	MEXEC


Related-TCO:  	6.1.1227



Problem:   CFSVFL BUGHLTs at system startup.

Diagnosis:   TCO 6.1.1227 tried to solve this problem but only fixed part of
the problem.  There must still be a more deterministic way to ensure that
we have joined with all existing TOPS-20 systems.

Solution:   Add code to routine CFSJYN which will make this routine return 
only once we are sure that we have completely joined the cluster.  We will 
wait until we have a CFS connection to every TOPS-20 system to which we have 
at least one open path.  The extra DISMS in MEXEC after the CALL CFSJYN is 
now no longer needed.
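The completion test can be sketched as a set comparison.  This is a
Python illustration, not the code in CFSJYN.  The function names, the
polling callback, and the poll limit are assumptions; the TCO does
not say whether the wait is bounded.

```python
def fully_joined(open_path_nodes, cfs_connected_nodes):
    """True when every TOPS-20 node with an open path has a CFS connection."""
    return set(open_path_nodes) <= set(cfs_connected_nodes)


def wait_for_join(poll, max_polls):
    """Poll until joined.

    poll() returns (open_path_nodes, cfs_connected_nodes).
    Returns the number of polls taken, or None if max_polls ran out.
    """
    for attempt in range(1, max_polls + 1):
        open_paths, connected = poll()
        if fully_joined(open_paths, connected):
            return attempt
    return None
```

Tying the return to this condition, rather than to a fixed delay, is
what makes the extra DISMS unnecessary.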


                               [End of TCO 6.1.1247]

                               TCO-number:  6.1.1250



Written-by:  MELOHN                           Creation-date:  12-Mar-85 18:46:38


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	LATSRV




Problem:    When bias knob 20 is set, LAT ends up retransmitting up to
40% of all of its messages. This means more work for LATSRV, the
server, and more unnecessarily retransmitted messages on the Ethernet.

	Also, LCP displays the circuit retransmit timer as if it were in
seconds, when it is actually in scheduler cycles. Quite a difference.

Diagnosis:    The LAT host retransmit timer was designed based on the
assumption that a scheduler cycle happens every 20ms. This is not a
valid assumption when the bias knob is set or when the system is very
idle and the LAT scheduler routine is run as part of the idle loop.

Solution:    Change the retransmit timer to be based on TODCLK, like the
keep alive timer. Change the LATOP% jsys to expect the retransmit
timer value in milliseconds, just like the various circuit timers in
the server.


                               [End of TCO 6.1.1250]

                               TCO-number:  6.1.1251



Written-by:  MELOHN                           Creation-date:  12-Mar-85 18:51:56


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	LATSRV




Problem:    Various LATOP% parameters default to bogus values. The Host
number which is supposed to be settable via the NODE command in SETSPD
isn't.

Diagnosis:    The defaults grew out of the values used in standalone. No
code existed to set the host number.

Solution:    Make the default values and ranges more realistic. Read the
host number from RTRADR at LATINI time.


                               [End of TCO 6.1.1251]

                               TCO-number:  6.1.1252



Written-by:  MELOHN                           Creation-date:  12-Mar-85 18:58:57


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	LATSRV




Problem:    Users report being "dropped" by the LATBOX but there is no
evidence that anything was wrong, other than a stopped session at the
user's terminal.

Diagnosis:    When an illegal message type is received, a LATIMT BUGCHK
is recorded and the circuit terminated. If the first part of the
message is readable, but the slot data in the message is garbaged, the
circuit is stopped, but no BUGCHK issued. LATIMT should probably be a
BUGINF anyway.

Solution:    Make the LATIMT a BUGINF. Add a new BUGINF LATIST which
will print out when a user is dropped due to an illegal slot within a
seemingly legal message.


                               [End of TCO 6.1.1252]

                               TCO-number:  6.1.1253



Written-by:  MCCOLLUM                         Creation-date:  13-Mar-85 14:27:38
Edited-by:   MCCOLLUM                         Edit-date:      13-Mar-85 14:41:07


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	DISC


Related-QAR:  	838064



Problem:  

GTFDB3 crashes when renaming files.

Diagnosis:    

When renaming a file to itself, RNAMF% should return a RNMX10 (Source  file
is not closed) error. There  is a coding error  in this path that  prevents
RNAMF% from performing  the check  that determines  if the  file is  indeed
open. It wrongly assumes that  the file is not  open and proceeds to  trash
the FDB of the already  existent file. A last  second sanity check call  to
GETFDB finds this and causes the crash.

Solution:  

Fix the coding error and allow RNAMF% to return RNMX10.


                               [End of TCO 6.1.1253]

                               TCO-number:  6.1.1255



Written-by:  PALMIERI                         Creation-date:  13-Mar-85 15:38:12


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  Monitor
  Routines-affected:   	SCJSYS


Related-QAR:  	838043



Problem:    Previous DECnet monitors allow a program to do an implied connect
	  accept by doing a SOUT/SOUTR after the OPENF on a DECnet link.
	  6.1 does not.

Diagnosis:    No code

Solution:    Add code to wait for connect and accept it if SOUT/SOUTR is done
	   and the link is in connect wait/connect received state.


                               [End of TCO 6.1.1255]

                               TCO-number:  6.1.1256



Written-by:  PAETZOLD                         Creation-date:  13-Mar-85 16:37:40


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  Yes


Program:  monitor
  Routines-affected:   	STG




Problem:    

Channel detected write parity errors from the last quadword of memory.

Diagnosis:    

Hardware design problem in the KL10.

Solution:    

Add conditional assembly in STG to force NMAXPG down by one page if MAXCOR
is set for 4.0 Meg.


                               [End of TCO 6.1.1256]

                               TCO-number:  6.1.1258



Written-by:  PALMIERI                         Creation-date:  13-Mar-85 16:51:31


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  Monitor
  Routines-affected:   	D36COM	SCLINK


Related-QAR:  	838082



Problem:    36 bit byte mode does not work from 6.1 to 5.1 or 6.1 to 6.1.

Diagnosis:    If segment size of message falls somewhere in a full word other
	    than at the end, the rest of the bytes of the word may not be sent.

Solution:    Send a maximum of segment size modulo 9 bytes in a message. Send
	   any remaining bytes in the following message.
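One plausible reading of "segment size modulo 9" is the largest
multiple of 9 bytes that fits in a segment, since two 36-bit words
occupy exactly nine 8-bit bytes.  The TCO does not spell this out, so
the Python sketch below is an assumption, and its function names are
hypothetical.

```python
WORDS_PER_GROUP = 2
BYTES_PER_GROUP = WORDS_PER_GROUP * 36 // 8   # 72 bits = 9 bytes


def max_payload(segment_size):
    """Largest multiple of 9 bytes that fits in one segment."""
    return segment_size - segment_size % BYTES_PER_GROUP


def split(total_bytes, segment_size):
    """Sizes of the messages needed to send total_bytes."""
    chunk = max_payload(segment_size)
    if chunk == 0:
        raise ValueError("segment too small for one 9-byte group")
    sizes = []
    while total_bytes > 0:
        n = min(chunk, total_bytes)
        sizes.append(n)
        total_bytes -= n
    return sizes
```

Under this reading no message ends partway through a word pair, and
the leftover bytes travel in the following message.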


                               [End of TCO 6.1.1258]

                               TCO-number:  6.1.1260



Written-by:  PALMIERI                         Creation-date:  13-Mar-85 17:03:39


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  Monitor
  Routines-affected:   	ROUTER




Problem:    Loop node does not work when ROUTER is an endnode.

Diagnosis:    The routing algorithm chooses the destination as the next hop for
	    the NI but since we are always the destination when 
	    performing loop node we fail to get the message back since
	    the NI cannot receive its own transmitted messages.

Solution:    If message is destined for ourselves send it to the designated
	   router if there is one.  If not send it to ourselves and never get
	   it.
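The endnode rule can be sketched as a small function.  This is a
Python illustration with hypothetical names, not the code in ROUTER.
Node addresses are plain integers here.

```python
def next_hop(dest, self_addr, designated_router):
    """Pick the next hop on the NI for an endnode.

    designated_router is None when no designated router is known.
    """
    if dest == self_addr:
        # Loop node case: the NI cannot hear its own transmissions,
        # so bounce the message off the designated router if possible.
        if designated_router is not None:
            return designated_router
        return self_addr      # sent to ourselves and never received
    return dest               # normal case: the destination is the next hop
```

Without a designated router the message is still lost, as the TCO
notes.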


                               [End of TCO 6.1.1260]

                               TCO-number:  6.1.1263



Written-by:  WAGNER                           Creation-date:  14-Mar-85 13:46:34


Edit-checked:         No     Document:          Yes    TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	MONSYM


Related-SPR:  	 20644



Problem:  DOCUMENTATION FOR .SKRJP FUNCTION OF SKED% IS WRONG, AND MISSING SOME
SYMBOLS

Diagnosis:  SEE ABOVE

Solution:  CORRECT THE DOCUMENTATION, AND IMPLEMENT 2 NEW SYMBOLS: .SACSH, AND
.SACLU


                               [End of TCO 6.1.1263]

                               TCO-number:  6.1.1264



Written-by:  MELOHN                           Creation-date:  14-Mar-85 15:35:39


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	LATSRV




Problem:    The third service offered by a TOPS-20 host to a PLATO does
not appear on the PLATO list of available services.

Diagnosis:    The Multi-cast change flags were not being set when a new
service was added. Apparently VMS doesn't set them either, but the
PLATO allows the second service to be added without setting the change
flags so as not to upset VMS while the third or more service must have
the change flags set. *yuk*

Solution:    Set the change flags whenever we add a new service, up to
the maximum number of services offered.


                               [End of TCO 6.1.1264]

                               TCO-number:  6.1.1265



Written-by:  MELOHN                           Creation-date:  14-Mar-85 15:52:31
Edited-by:   PRATT                            Edit-date:      14-Mar-85 16:33:03


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	STG


Related-TCO:  	6.1.1268

Related-QAR:  	838073



Problem:    Need more customer defined terminal types.

Diagnosis:   If we reserve a new block for DEC as well, customers who
need more terminal types than we can build in can add them
at the end without worrying about DEC coming along some day and using
the slots for our new terminal types.

Solution:    Add more - 10 reserved for customers, 10 for DEC.


                               [End of TCO 6.1.1265]

                               TCO-number:  6.1.1267



Written-by:  MELOHN                           Creation-date:  14-Mar-85 16:31:12


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	TTYSRV	TTYDEF	NRTSRV


Related-QAR:  	838078



Problem:    System halted with COMMMS.  There were 13 SCLSBJ and 2
NVTPCL BUGCHKs queued on SEBQOU.

A user on terminal 31 was set hosted (.MOSNH had been issued).  TTNUS
was set in the terminal's dynamic data but the contents of TTULL was
101 (octal) and not the address of a NRB.  This discrepancy caused the
BUGHLT and BUGCHKs when MCSRV tried to service the line.

Diagnosis:    

After careful scrutiny we noticed that TTLMAX and TTULL share the same
word in the dynamic data.  TTLMAX is the maximum of TTLINE, the line
counter.  TTULL is the address of the NET USER logical link.  The
comment lines in TTYDEF tell us the assumption is TTLMAX will not be
in use when SET HOST is in effect.  Unfortunately there is no code in
TTYSRV to support this assumption.

The value in TTULL at the time of the crash was the same as the value
in TTLINE, indicating that routine INCLIN(TTYSRV) had recently
deposited the value.  INCLIN is called for example by the BOUT% JSYS
(TCO->TCOY->TTCO1 (via CHITAB dispatch)) when a ^J is being output to
the terminal.

Another way for TTLMAX to get set (and TTULL to get zapped) is by the
MTOPR function .MOSLM which directly stores into TTLMAX.

Since the SET HOST (.MOSNH) function does not freeze any processes in
the job, or prevent non job output, it is possible for the value in
TTULL to get lost.

The careful reader would have immediately noticed that under versions
6.0 and 5.1 TTULL was also overlaid on the TTLMAX word.  Correct, but
now the field for the escape character (TTUEC) is 774000,,0 whereas
under previous versions it was 177,,0.  This change has been made to
accommodate a 29 (?) bit address for the field TTULL, which is
allocated out of extended resident free space.

This means that under versions 6.0 and before the test at INCLIN would
invariably prevent the store of a new value in TTLMAX.  However, with
the new position of TTUEC a user supplying an escape character of "@"
or greater causes the value in TTLMAX to be negative!  This negative
value of TTUEC and TTULL then gets overwritten by the code at INCLIN.

Of course our user was an extremist, he supplied tilde (code 176) for
his escape character.
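The sign-bit effect can be checked with a little arithmetic.  The
Python sketch below packs a 7-bit escape character into bits 0-6 and
a 29-bit link address into bits 7-35 of a 36-bit word, following the
6.1 field definitions quoted in this TCO.  The function names are
hypothetical.

```python
WORD_BITS = 36
SIGN_BIT = 1 << (WORD_BITS - 1)     # PDP-10 bit 0


def pack_net_user_word(escape_char_code, link_addr):
    """TTUEC in bits 0-6 (mask 774000,,0), TTULL in bits 7-35 (29 bits)."""
    assert 0 <= escape_char_code < (1 << 7)
    assert 0 <= link_addr < (1 << 29)
    return (escape_char_code << 29) | link_addr


def is_negative(word):
    """True when the word looks negative to a signed 36-bit test."""
    return bool(word & SIGN_BIT)
```

Any escape character with code 100 octal or above ("@" through
tilde) sets the top bit of the 7-bit field, which is bit 0 of the
word, so the shared word appears negative.  With the older 177,,0
position the field sat below the sign bit and this could not happen.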

Solution:

Forget about saving a word in the dynamic data, nobody will notice
because the system will be crashing.  Instead increase the dynamic
data and separate TTLMAX and TTULL.

Change the lines in TTYDEF (changes are in lowercase):

	TTLMAX==34			;MAXIMUM OF TTLINE
	DEFSTR TTULL,TTLMAX,35,29	;NET USER LOGICAL LINK 
					; - WHEN TTLMAX NOT IN USE
	DEFSTR TTUEC,TTLMAX,6,7		;NET USER ESCAPE CHAR
		:
		:
	TTDDLN==37			;DEFAULT DYNAMIC DATA SIZE

to be:

	TTLMAX==34			;MAXIMUM OF TTLINE
		:
		:
	ttlnuw==37			; net user word 
	defstr ttull,ttlnuw,35,29	; net user logical link
	defstr ttuec,ttlnuw,6,7		; net user escape char
		:
		:
	ttddln==40			; default dynamic data size


                               [End of TCO 6.1.1267]

                               TCO-number:  6.1.1271



Written-by:  PRATT                            Creation-date:  14-Mar-85 23:00:19
Edited-by:   PRATT                            Edit-date:      14-Mar-85 23:13:22


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	JSYSA




Problem:      

Any user supplying bad length arguments to the QUEUE% jsys can
cause a ILLUUO bughalt.

Diagnosis:      

The Queue jsys code was trying to verify the user's arguments and
found an error. It took the error return which tried to clean up
by releasing free space used for building an IPCF packet. 
Unfortunately we hadn't gotten the free space yet which happens 
later on in the code. The QUMSG location had garbage in it and
when we tried to release that, we died a horrible death.

Solution:      

In QUVERF, change the two error conditions before the call to ASGPGS
to generate an illegal instruction trap. After the call, use RETBAD
so we can return and clean up the free space used.


                               [End of TCO 6.1.1271]

                               TCO-number:  6.1.1272



Written-by:  GRANT                            Creation-date:  15-Mar-85 08:18:30


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYSIO




Problem:  Impossible to figure out the algorithm for homeblock checking.
Diagnosis:  Routine UPDPDB is too hard to read.
Solution:  Restructure the code.  No logic changes intended.

                               [End of TCO 6.1.1272]

                               TCO-number:  6.1.1273



Written-by:  GRANT                            Creation-date:  18-Mar-85 10:05:38


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	phyklp




Problem:  Occasional problems with CI wires going from good to bad and from
bad to good.
Diagnosis:  There is a problem in the KLIPA (to be ECOed) which is
aggravated by frequent loopback.  TOPS-20 sends 2 loopbacks per second.
Solution:  Change TOPS-20 to send only 1 loopback per second;  this should
not diminish the monitor's ability to detect real problems.

                               [End of TCO 6.1.1273]

                               TCO-number:  6.1.1275



Written-by:  GROSSMAN                         Creation-date:  18-Mar-85 15:12:10


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	LLINKS




Problem:  Invalid DECnet event 3.0 from LLINKS.

Diagnosis:  An NSP Disconnect Confirm message with a bad reason code was
being handled incorrectly.

Solution:  Make LLINKS follow the NSP version 4.0 spec in this regard.

Note that if VMS followed the spec in this regard, this problem would not
have occurred.


                               [End of TCO 6.1.1275]

                               TCO-number:  6.1.1276



Written-by:  PALMIERI                         Creation-date:  18-Mar-85 16:56:29


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  Monitor
  Routines-affected:   	GTJFN




Problem:    Wildcard for nodename not permitted for network parse-only JFN.

Diagnosis:    Code is too restrictive.

Solution:    Remove restriction and allow wildcard in nodename.


                               [End of TCO 6.1.1276]

                               TCO-number:  6.1.1277



Written-by:  MELOHN                           Creation-date:  19-Mar-85 17:30:33


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	TTYSRV




Problem:    Users on VT2xx terminals in VT2xx, 7-bit mode see garbage
when hosting via NRT to a remote 20.

Diagnosis:    Parity checking is enabled for NRT (and all other line
types). A VT200 in 7-bit control mode uses the 8th bit not for parity but
for 8-bit characters.

Solution:    Parity is for farmers, not for NRT, CTERM, or LAT lines.
Remove the TRZ(TRO) to clear/set parity in PARTBL:.
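
Illustration:  A minimal Python sketch, not monitor code, of why forcing
a parity value into the 8th bit garbles 8-bit characters.  Even parity is
an assumption made for the example; the TCO does not say which parity
PARTBL applied.

```python
def force_even_parity(byte):
    """Overwrite bit 8 with even parity computed over the low 7 bits."""
    low7 = byte & 0x7F
    parity_bit = bin(low7).count("1") & 1   # 1 when the low 7 bits hold an odd number of ones
    return low7 | (parity_bit << 7)

# 0xE9 is "e acute" in the DEC Multinational set; its low 7 bits are 0x69 ("i").
mangled = force_even_parity(0xE9)
untouched = 0xE9        # what the line delivers once parity is left alone
```

With parity forced, the accented character arrives as a plain "i"; with
the parity step removed the byte passes through unchanged.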


                               [End of TCO 6.1.1277]

                               TCO-number:  6.1.1278



Written-by:  GROSSMAN                         Creation-date:  20-Mar-85 09:40:01


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	NIUSR




Problem:  The Read Portal Counters function of the NI% JSYS does not convert
global job numbers to local indexes.

Diagnosis:  Oops!

Solution:  Add a call to GL2LCL to the portal locating routine.


                               [End of TCO 6.1.1278]

                               TCO-number:  6.1.1279



Written-by:  LOMARTIRE                        Creation-date:  20-Mar-85 10:48:07


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	APRSRV	GTJFN


Related-TCO:  	6.1.1127



Problem:   ILLUUO BUGHLTs.

Diagnosis:   TCO 6.1.1127 attempted to solve the cause of ILLUUOs from passing a 
bad byte pointer to GTJFN.  The changes were made to KIMXLP under a false 
assumption.

Solution:   Remove the code added to KIMXLP by TCO 6.1.1127.  Instead, rewrite 
GTJFN so that it does an XCT Q1 not an XCT -2(P) since P is not preserved when 
we are in KIMXLP.


                               [End of TCO 6.1.1279]

                               TCO-number:  6.1.1280



Written-by:  MELOHN                           Creation-date:  20-Mar-85 14:55:44


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	LATSRV




Problem:    LCP show server Quack does not display the location string
of server Quack.

Diagnosis:    Recent additions to the LAT circuit database were made at
the end of the CB BEGSTR. Unfortunately the LATOP% jsys BLTs the last
half of the CB to user context as the return for the get server
information function.

Solution:    Move the new cells to a more appropriate place in the CB.
Add the warning to the CB that the data structures in the last half of
the BEGSTR are returned as part of the jsys, and should not be changed
without corresponding changes to the LATOP code.


                               [End of TCO 6.1.1280]

                               TCO-number:  6.1.1281



Written-by:  MOSER                            Creation-date:  21-Mar-85 11:15:03


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	MEXEC




Problem:   When a job is logged out because of a carrier off action the wishes
of the ACJ are ignored. If the ACJ refuses logout the job is logged out anyway!

Diagnosis:   Coded that way. This is inconsistent with regular logout
using the LGOUT Jsys where the ACJ's wish is honored.

Solution:   If the ACJ denies the request to logout, wait 1 minute and ask it
again.
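
Illustration:  A small Python sketch of the retry behavior, under the
assumption that the monitor simply repeats the request until the ACJ
agrees.  The function and parameter names are hypothetical, and the wait
is injected so the example runs instantly.

```python
def logout_after_carrier_off(acj_allows_logout, wait, retry_seconds=60):
    """Ask the ACJ until it agrees; return how many times it was asked."""
    asked = 0
    while True:
        asked += 1
        if acj_allows_logout():
            return asked
        wait(retry_seconds)      # denied: hold the job and try again later

answers = iter([False, False, True])      # a hypothetical ACJ that refuses twice
waits = []
asked = logout_after_carrier_off(lambda: next(answers), waits.append)
```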


                               [End of TCO 6.1.1281]

                               TCO-number:  6.1.1282



Written-by:  LOMARTIRE                        Creation-date:  21-Mar-85 16:13:30
Edited-by:   LOMARTIRE                        Edit-date:      17-Apr-85 14:27:20


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PAGUTL	STG	GLOBS


Related-TCO:  	6.1.1328



Problem:     PITRAP BUGHLTs

Diagnosis:     During the lookup of a resource block in HSHLOK, one of the
pages which is referenced is no longer in core.  Since CFS locks all its pages 
down, and never unlocks them, someone else is unlocking the page.  We need a 
diagnostic patch which will catch the culprit.

Solution:   In the routine MULK1, check to see if the page to be unlocked is 
a page from CFSSEC.  If so, die with an ILULK5 BUGHLT.  This feature is 
controlled by the location CFUNLF.  When zero, this checking will be done.  
When non-zero, it will not.  CFUNLF is zero by default.
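
Illustration:  A Python sketch of the check, mainly to show the sense of
the CFUNLF switch (zero enables the check).  The exception stands in for
the BUGHLT and the section numbers are made up for the example.

```python
class ILULK5(Exception):
    """Stand-in for the BUGHLT; a real BUGHLT stops the monitor."""

def check_unlock(page_section, cfs_section, cfunlf=0):
    # CFUNLF == 0 enables the check; any non-zero value disables it.
    if cfunlf == 0 and page_section == cfs_section:
        raise ILULK5("attempt to unlock a CFSSEC page")
    return True
```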


                               [End of TCO 6.1.1282]

                               TCO-number:  6.1.1283



Written-by:  GROSSMAN                         Creation-date:  21-Mar-85 23:18:52


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYKNI




Problem:  KNISTP BUGCHK's

Diagnosis:  Keep alive timer for KLNI was implemented wrong.  Every five
seconds, KLNI is checked for activity within the last 10 seconds.  If
there has been no activity in the last 10 seconds, NISRV gives the KLNI
a command.  This means that on an idle system, the KLNI is processing
a command every 10 seconds.  If the keep alive routine in NISRV determines
that it hasn't heard from the KLNI in 15 seconds, it generates a KNISTP
BUGCHK, and reloads the KLNI.

Due to variability in the second level clocks from the scheduler, the
KLNI keep alive routine may be called in such a manner that it won't queue up
a command to the KLNI for about 15 seconds.  This results in a KNISTP BUGCHK.

Solution:  Halve the minimum activity interval.  I.e., if the KLNI has done nothing
in the last 5 seconds, give it a command.  This allows a large margin of
error for level 2 clock drift.
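
Illustration:  A Python sketch of the timing argument.  It assumes the
keep-alive routine first tests for 15 seconds of silence and then for the
idle threshold, and that the KLNI answers a queued command at once; the
call times are invented to show late scheduling.

```python
def first_stop(call_times, idle_threshold, dead_after=15.0):
    """Return the time of the first simulated KNISTP, or None.

    call_times are the moments (in seconds) at which the keep-alive
    routine actually gets to run.
    """
    last_activity = 0.0
    for now in call_times:
        idle = now - last_activity
        if idle >= dead_after:
            return now                 # nothing heard for 15 s: KNISTP
        if idle >= idle_threshold:
            last_activity = now        # command queued; port responds in this model
    return None

jittery = [4.9, 9.9, 15.1, 20.0, 25.0]   # nominally every 5 s, but drifting
old = first_stop(jittery, idle_threshold=10.0)
new = first_stop(jittery, idle_threshold=5.0)
```

With the 10 second threshold the call at 9.9 queues nothing and the next
call finds 15.1 seconds of silence; with the 5 second threshold the same
schedule never reaches the limit.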


                               [End of TCO 6.1.1283]

                               TCO-number:  6.1.1284



Written-by:  GRANT                            Creation-date:  22-Mar-85 06:45:40
Edited-by:   GRANT                            Edit-date:      22-Mar-85 07:02:04


Edit-checked:         No     Document:          No     TCO-tested:  Yes
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	MEXEC	scampi	phyklp	PROLOG	globs	STG


Related-QAR:  	838136	838142



Problem:   SCAEBD BUGHLT (Error while doing Buffer Deferral)

Diagnosis:   Two systems are simultaneously running CHECKD at system startup.
Facts to remember:  1) MSCP server doesn't open a listener until after CHECKD
finishes, 2) MSCP continuously tries to connect to another 20's MSCP server
once the virtual circuit is open.  These are both the desired actions.  However,
during the time MSCP's connection attempts fail, many SCA connect blocks are
marked for reaping but the fork which does the reaping doesn't get started
until after CHECKD finishes!  This deadly embrace causes SCA to use up the
entire section's worth of buffers for the connect attempts, thus producing the
SCABSF (Buffer Section Full) BUGCHK and finally the SCAEBD BUGHLT.

Solution:   What was once TEFORK (temporary fork) will now live on through the
life of the system.  It gets started immediately after starting the CI20 and
will handle SCA buffer creation, SCA connect block reaping, and (while we're
at it) loading/dumping of the CI20.  The fork will be called CIFORK and usually
will be sitting in the scheduler test CITEST waiting for any of the bits in the
flag word CIFRKF to get set.  There are bits for each of the 3 actions CIFORK
performs.
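
Illustration:  A Python sketch of a fork driven by a flag word, as
described above.  The bit values and function names are hypothetical;
only CIFORK, CITEST, and CIFRKF come from the TCO.

```python
# Hypothetical bit assignments within the CIFRKF flag word.
NEED_BUFFERS, NEED_REAP, NEED_LOAD_DUMP = 0o1, 0o2, 0o4

def citest(cifrkf):
    """Scheduler test: the fork is runnable once any request bit is set."""
    return cifrkf != 0

def cifork_pass(cifrkf):
    """Service every pending request once; return (new flag word, work done)."""
    work = []
    for bit, action in ((NEED_BUFFERS, "create SCA buffers"),
                        (NEED_REAP, "reap connect blocks"),
                        (NEED_LOAD_DUMP, "load/dump CI20")):
        if cifrkf & bit:
            cifrkf &= ~bit
            work.append(action)
    return cifrkf, work

flags = NEED_REAP | NEED_BUFFERS          # SCA asks for two services
runnable = citest(flags)
flags, work = cifork_pass(flags)
```

Because the requesters only set bits, they never depend on Job 0 or on
CHECKD having finished; the fork picks the work up whenever it next runs.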

Comments: This eliminates SCA's use of DDMP to do SCA buffer creation.  Also,
moving loading/dumping of the CI20 away from Job 0 may eliminate some of the
KLPNRL BUGHLTs you get due to the deadly embraces that occur, but it is not
a 100% cure;  that is planned for Release 7.0.


                               [End of TCO 6.1.1284]

                               TCO-number:  6.1.1285



Written-by:  GROSSMAN                         Creation-date:  25-Mar-85 10:50:56


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	NISRV	PHYKNI




Problem:  Spurious KLNI Reload Timeouts, resulting in KNIRTO BUGCHK's.  Sometimes
KLNI is not started because of this.

Diagnosis:  Reload timeouts being performed wrong.  Timing starts whenever the
reload request is made.  During system startup, the reload request happens
almost immediately, but KNILDR doesn't run until CHKR gets around to it.  If the
startup procedures take too long, KNIRTO's result.  Usually, they are spurious
and have no effect on the KLNI, but sometimes, they cause the KLNI to be
shut down.

Solution:  Get rid of KNIRTO.  Don't time out KNILDR.  When CHKR (via KNIJB0)
runs KNILDR, check the state of the KLNI after KNILDR completes.  If the
KLNI isn't running, shutdown the KLNI and issue a KNIRLF (Reload Failed)
BUGCHK.

This can happen if KNILDR dies while reloading a KLNI, or if someone put a
bogus program that just happens to be called KNILDR up on SYSTEM:.


                               [End of TCO 6.1.1285]

                               TCO-number:  6.1.1286



Written-by:  LOMARTIRE                        Creation-date:  25-Mar-85 11:20:50


Edit-checked:         No     Document:          Yes    TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYSIO	MSTR	MONSYM


Related-QAR:  	706407



Problem:  
     Whenever  a  cluster change occurs, the monitor is going to force all dual
ported disks offline while it goes through a  homeblock  check  on  them.  This
causes  MOUNTR to print out that the "previously mounted structure is no longer
mounted". This can be confusing since this homeblock  check  is  temporary  and
should be transparent to the user.

Diagnosis:    
     If  a  .MSRUS  function  of MSTR% is done during the time that TOPS-20 has
forced the disk offline, MS%OFL will be returned. MOUNTR interprets this  as  a
true  physical  offline  when, in fact, the disk is still online but it is just
temporarily inaccessible.

Solution:  
     The monitor needs a way to return the status of a disk which it has forced
offline in a way other than MS%OFL. Invent a new bit, called MS%IAC, which will
be  returned whenever the disk is U1.OFS (forced offline). This bit is returned
in addition to whatever other bits are appropriate (such as MS%OFL). Now, MOUNTR
can test for  MS%IAC  and  take  more  desirable  actions  under  this  special
situation.
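
Illustration:  A Python sketch of how a caller such as MOUNTR might
interpret the returned status.  The bit values are hypothetical stand-ins
for MS%OFL and MS%IAC; the point is only the order of the tests.

```python
MS_OFL = 0o1   # hypothetical value standing in for MS%OFL
MS_IAC = 0o2   # hypothetical value standing in for MS%IAC

def disk_state(status):
    # Test MS%IAC first: it is returned in addition to MS%OFL.
    if status & MS_IAC:
        return "temporarily inaccessible"
    if status & MS_OFL:
        return "offline"
    return "online"
```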


                               [End of TCO 6.1.1286]

                               TCO-number:  6.1.1287



Written-by:  MCCOLLUM                         Creation-date:  25-Mar-85 15:01:51


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	FORK




Problem:  
CFORK% returns illegal instruction trap for certain resource exhaustion
problems. CFORK% is documented as returning +1 and error code in AC 1
for all error types.

Diagnosis:  
As above.

Solution:  
Change an ITERR to a RETERR.


                               [End of TCO 6.1.1287]

                               TCO-number:  6.1.1288



Written-by:  LOMARTIRE                        Creation-date:  26-Mar-85 08:51:11


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	FORK


Related-QAR:  	706305



Problem:    
     The  RT%DIM  bit  of  the  RTIW% JSYS has no effect. The deferred terminal
interrupt mask is always returned regardless of the setting of the RT%DIM bit.

Diagnosis:    
     There was never any code to check for the RT%DIM bit.

Solution:    
     Add  code  to check to see if the user specified RT%DIM before placing the
mask in T3.


                               [End of TCO 6.1.1288]

                               TCO-number:  6.1.1289



Written-by:  LOMARTIRE                        Creation-date:  26-Mar-85 11:15:10


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	JSYSA


Related-QAR:  	706307



Problem:    
     GTDIR%  returns  illegal  memory  write  error, yet argument block and all
addresses are writeable.

Diagnosis:    
     The range checking code is off by one.

Solution:    
     Decrement  the  value  of  Q2  (the  argument  block  length)  before  the
comparisons so that it reflects the highest value to be returned.
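
Illustration:  A Python sketch of the off-by-one, assuming the check
compares the highest address to be written against the last writable
address.  The addresses and names are made up for the example.

```python
def block_is_writable(base, length, last_writable, decrement_first):
    # With the fix, the comparison uses the highest word actually written.
    highest = base + (length - 1 if decrement_first else length)
    return highest <= last_writable

# An 8-word block occupying 0o770..0o777, the last writable words.
before = block_is_writable(0o770, 8, 0o777, decrement_first=False)
after = block_is_writable(0o770, 8, 0o777, decrement_first=True)
```

Without the decrement, a block that ends exactly on the last writable
word is rejected even though every word in it can be written.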


                               [End of TCO 6.1.1289]

                               TCO-number:  6.1.1292



Written-by:  LOMARTIRE                        Creation-date:  28-Mar-85 09:34:19


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	GTJFN


Related-SPR:  	 19910



Problem:    
     If  there  are files (such as A.A.2, B.A.2, C.A.2) which have been deleted
but not expunged with a higher generation number than those of  the  same  name
(such  as  A.A.1,  B.A.1,  C.A.1) a command like COPY *.A.0 (TO) NUL: will only
copy the first file; A.A.1.

Diagnosis:    
     GNJFN%  always  has  VERLUK look for deleted files even if the GTJFN% call
did not consider them. So, B.A.1 and  C.A.1  will  not  be  found  because  the
higher,  deleted,  versions will be found first. Then, when GNJFN% notices that
the file is deleted, it attempts to find the next one. Of course, there  is  no
next one, so the GNJFN% fails.

Solution:    
     In  GNJFN1,  check  the  flags passed to GNJFN% which reflect the original
GTJFN% call. If GJ%GND (deleted files were not considered) is set, do  not  set
IGDLF  (ignore  the deleted bit). This way, deleted files will be found only if
they were originally requested.
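
Illustration:  A Python sketch of the lookup described above.  The
directory contents mirror the example in the Problem text; the function
is a simplified stand-in for VERLUK, not its actual logic.

```python
# (name, type, generation) -> deleted?
directory = {
    ("A", "A", 1): False, ("A", "A", 2): True,
    ("B", "A", 1): False, ("B", "A", 2): True,
}

def highest_generation(name, ext, ignore_deleted_bit):
    """Generation 0 means 'highest'; optionally look past the deleted bit."""
    gens = [g for (n, e, g), deleted in directory.items()
            if (n, e) == (name, ext) and (ignore_deleted_bit or not deleted)]
    return max(gens) if gens else None

found_before = highest_generation("B", "A", ignore_deleted_bit=True)
found_after = highest_generation("B", "A", ignore_deleted_bit=False)
```

With the deleted bit ignored the search lands on the deleted B.A.2; once
the flag follows the original GTJFN% call it lands on B.A.1.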


                               [End of TCO 6.1.1292]

                               TCO-number:  6.1.1293



Written-by:  GROSSMAN                         Creation-date:  28-Mar-85 09:57:21


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	APRSRV	STG	GLOBS




Problem:  BUGHLT code can lose valuable PI information if no SYSERR blocks are
available.  BUGHLT code also loses state of PI system ON/OFF bit.

Diagnosis:  BUGH0 (in APRSRV) does a PIOFF before saving any PI information.
This loses the state of the PI system ON/OFF bit.

Solution:  When a BUGHLT occurs:

1) Save the current CONI PI, in PISV1.
2) Turn off the PI system (PIOFF).
3) Acquire the bug lock (AOSE BUGLCK).
4) Copy PISV1 to PISAV.

In case a recursive BUGHLT occurs, only PISV1 will be disturbed.  PISAV will
contain the PI status from the original BUGHLT.
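
Illustration:  The four steps can be sketched in Python as follows.  The
PI word encoding and the class are assumptions made for the example; the
lock is modeled on AOSE, which adds one to the word and skips when the
result is zero, so a lock word of -1 means free.

```python
PI_ON = 0o200          # assumed position of the PI system ON bit

class Monitor:
    def __init__(self, pi_word):
        self.pi_word = pi_word      # what CONI PI, would read
        self.buglck = -1            # -1 means the bug lock is free
        self.pisv1 = None
        self.pisav = None

    def bughlt(self):
        self.pisv1 = self.pi_word           # 1) save current PI status
        self.pi_word &= ~PI_ON              # 2) PIOFF
        self.buglck += 1                    # 3) AOSE BUGLCK
        if self.buglck != 0:
            return False                    #    recursive BUGHLT: lock not acquired
        self.pisav = self.pisv1             # 4) keep the original status
        return True

m = Monitor(0o277)
first = m.bughlt()
second = m.bughlt()                         # a BUGHLT inside the BUGHLT code
```

The second entry overwrites PISV1 with the already-off status, but PISAV
still shows that the PI system was on at the original BUGHLT.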


                               [End of TCO 6.1.1293]

                               TCO-number:  6.1.1294



Written-by:  GROSSMAN                         Creation-date:  28-Mar-85 15:06:19


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	LLMOP




Problem:  LLMOP% JSYS does not accept a service password in the Reserve
Console (.RCRSV) function.

Diagnosis:  Code never implemented.

Solution:  Write the code.


                               [End of TCO 6.1.1294]

                               TCO-number:  6.1.1295



Written-by:  LOMARTIRE                        Creation-date:  28-Mar-85 15:25:18
Edited-by:   LOMARTIRE                        Edit-date:      28-Mar-85 15:27:42


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	DIRECT


Related-QAR:  	838071



Problem:        
     TCO  6.2005 was never installed in either 6.0 or 6.1. It attempts to solve
the problem of receiving "?No such directory name" during directory name  field
recognition when a file of the same name exists.

Diagnosis:  

Solution:        
     Install it.


                               [End of TCO 6.1.1295]

                               TCO-number:  6.1.1296



Written-by:  MCCOLLUM                         Creation-date:  29-Mar-85 15:24:51


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	GTJFN




Problem:  
RELBAD BUGHLTs.

Diagnosis:  
When attempting to perform recognition on a file name and the default device
name in the GTJFN block is DSK*:, routine DEFDEV will call STRDVD to translate
DSK*: to the public structure name. STRDVD alters the byte pointer in FILOPT
when it changes the free space block to accommodate longer strings. In this case,
however, other routines depend upon the old value in FILOPT and assume it
will not change over the call to DEFDEV.


Solution:  
Since the new value in FILOPT is only in use during the call to DEFDEV, save
the value of FILOPT before calling DEFDEV. Restore FILOPT to its initial
state after the call to DEFDEV is completed.


                               [End of TCO 6.1.1296]

                               TCO-number:  6.1.1297



Written-by:  TBOYLE                           Creation-date:  29-Mar-85 15:58:02
Edited-by:   TBOYLE                           Edit-date:       1-Apr-85 16:06:05


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PAGEM




Problem:      PTNON0 BUGHLTS. The application was 1022.

Diagnosis:      PMAP to change another fork's address space crashes
because the code in MSETPT does not remain NOSKED between
the removing of page-table entries and the adding of new ones.
In this case, the target fork intervened and faulted a private
page.

Solution:      Be NOSKED during the release and set page-table entry
process. Swap pages in beforehand to prevent NOSKED page-faults.
This is also a significant improvement over going OKSKED in the middle
of critical code to prevent NOSKED page-faults!

We will include this change for 6.1 to solve 1022's invocation
of these PTNON0's. However, we must plan to revamp this code
during the next release because several of these routines have
races and also racy methods of preventing NOSKED page-faults.


                               [End of TCO 6.1.1297]

                               TCO-number:  6.1.1298



Written-by:  GROSSMAN                         Creation-date:  30-Mar-85 09:33:49


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	LLMOP




Problem:  KPALVH's, ILMNRF's and many other BUG.'s sometime after receiving
an LLMOP Request Counters message.

Diagnosis:  When LLMOP receives a Request Counters message, it generates an
internal request block and puts it on a request queue.  It then asks NISRV
for the desired counters.  When NISRV returns the counters sometime later,
LLMOP completes the request, and deallocates the request block.
Unfortunately, LLMOP never removed the request block from the request queue,
and it now has a stale pointer to some memory it used to own.

In any case, somebody else eventually acquires the deallocated memory, and
it's downhill from there.  In one particular case, the memory was picked up
by SCLINK, and LLMOP's request queue ended up pointing at SCLINK's logical
link list.  LLMOP then tried to find something on the request queue, and
got lost, resulting in a KPALVH.

Solution:  Don't queue the request.  There was never any reason to do so, as
LLMOP kept track of the request block by storing its address in UNRID of
the NISRV arg block.
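
Illustration:  A Python sketch of the stale-pointer sequence and of the
fix.  The toy allocator and its reuse policy are assumptions; only the
order of events follows the Diagnosis.

```python
class FreeSpace:
    """Toy allocator that hands the most recently freed block out first."""
    def __init__(self):
        self.free = [0, 1, 2]
        self.owner = {}

    def allocate(self, who):
        block = self.free.pop(0)
        self.owner[block] = who
        return block

    def release(self, block):
        del self.owner[block]
        self.free.insert(0, block)

# Before the fix: the block is queued, then freed without being dequeued.
pool, request_queue = FreeSpace(), []
block = pool.allocate("LLMOP")
request_queue.append(block)
pool.release(block)                      # counters returned; block deallocated
pool.allocate("SCLINK")                  # someone else picks the memory up
stale_owner = pool.owner[request_queue[0]]

# After the fix: no queue; the block is recovered from the arg block alone.
pool = FreeSpace()
arg_block = {"UNRID": pool.allocate("LLMOP")}
pool.release(arg_block["UNRID"])
```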


                               [End of TCO 6.1.1298]

                               TCO-number:  6.1.1299



Written-by:  GROSSMAN                         Creation-date:  30-Mar-85 11:24:15


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	LLMOP




Problem:  Lost memory when using LLMOP functions .RCRSV, .RCREL, and .RCRBT.

Diagnosis:  Each of these functions allocates an LLMOP request block, and
never returns it.

Solution:  Set the 'abort' bit in the request block for each of these functions.
When the transmit complete interrupt happens, this will cause the block to
be deallocated.


                               [End of TCO 6.1.1299]

                               TCO-number:  6.1.1300



Written-by:  PAETZOLD                         Creation-date:  31-Mar-85 13:07:39


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  monitor
  Routines-affected:   	tcptcp




Problem:    

TCP hangs up.  Lots of FLKTIMs.  TCPHLK is locked.

Diagnosis:  

Solution:    

BUFHNT: needs to check for a null buffer.


                               [End of TCO 6.1.1300]

                               TCO-number:  6.1.1301



Written-by:  PAETZOLD                         Creation-date:  31-Mar-85 13:16:34


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  monitor
  Routines-affected:   	tcpjfn




Problem:    

;LOCAL-HOST and ;FOREIGN-HOST GTJFN attributes do not work.

Diagnosis:  

Solution:    

HSTHST is returning results in T1 and not T2 like it is supposed to.


                               [End of TCO 6.1.1301]

                               TCO-number:  6.1.1302



Written-by:  PAETZOLD                         Creation-date:  31-Mar-85 13:27:24


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	IPNIDV




Problem:    

ARPBFL BUGCHKs and eventual ARP shutdown.

Diagnosis:    

IPDWNS BUGINFs caused by, among other things, carrier failures on the NI.
Eventually we run out of ARP buffers and an IPABFL results.

Solution:    

ARPCDS transfers to CDSERR on a send failure but forgets to release the
ARP buffer.  Fix up ARPCDS to do the correct thing and make it display
the real error code instead of the NISRV dispatch address for ARP service.
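
Illustration:  A Python sketch of the buffer leak, assuming a fixed pool
and that a buffer is returned once a send completes.  The pool size and
names are invented for the example.

```python
def buffers_left(pool_size, send_results, release_on_failure):
    """Count free ARP buffers after a series of sends (True = send succeeded)."""
    free = pool_size
    for ok in send_results:
        if free == 0:
            break                        # pool exhausted
        free -= 1                        # take a buffer for this send
        if ok or release_on_failure:
            free += 1                    # buffer goes back to the pool
    return free

failures = [False] * 5                   # e.g. repeated carrier failures
before = buffers_left(3, failures, release_on_failure=False)
after = buffers_left(3, failures, release_on_failure=True)
```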


                               [End of TCO 6.1.1302]

                               TCO-number:  6.1.1303



Written-by:  PAETZOLD                         Creation-date:   2-Apr-85 08:17:31


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  monitor
  Routines-affected:   	ipipip




Problem:    

Resource problems in the TCP/IP code, specifically IPIBLP and KNIFQE
BUGINFs.

Diagnosis:    

At times of high load the internet fork is not running fast enough.

Solution:    

Use jobbit to set the priority of the internet fork up.  The code used to
be this way but was changed a while back as an experiment.  The experiment
failed.


                               [End of TCO 6.1.1303]

                               TCO-number:  6.1.1304



Written-by:  WAGNER                           Creation-date:   2-Apr-85 10:55:22


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	SCHED


Related-SPR:  	 20483



Problem:  SKDIDL overflows into SKDTM0 after 95 hours, 26 minutes, 37 seconds
	 of idle time.

Diagnosis:  SKDIDL is kept in HP time units, only so many can fit. We convert
	 these to mS and put them in SKDTM0. But a conversion of overflowed
	 garbage is still garbage.

Solution:  Check for impending overflow, correct down by a constant, remember
	 that constant. Convert to mS where it is subsequently used, add
	 back in that converted constant. Since SKDIDL is only used to be
	 converted to mS anyway, and only in one place (two if class scheduling)
	 it is more efficient this way (one compare each load average update)
	 than if we changed all the calculations to use double word arithmetic.
	 Besides, the compare only succeeds every 95 hours, etc.
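
Illustration:  A Python sketch of the arithmetic.  It assumes one HP
time unit is 10 microseconds and that SKDIDL holds at most 2**35 - 1;
both are assumptions, chosen because together they reproduce the stated
95 hours, 26 minutes, 37 seconds.  The correction constant is
hypothetical.

```python
WORD_MAX = 2**35 - 1            # assumed: largest positive value of a 36-bit word
UNITS_PER_SECOND = 100_000      # assumed: one HP time unit is 10 microseconds
UNITS_PER_MS = UNITS_PER_SECOND // 1000

# Idle time needed to overflow the counter, as hours, minutes, seconds.
seconds = (WORD_MAX + 1) // UNITS_PER_SECOND
overflow_hms = (seconds // 3600, seconds % 3600 // 60, seconds % 60)

CORRECTION = 10_000_000_000     # hypothetical constant, a whole number of ms

def add_idle(skdidl, carried_ms, delta):
    """Accumulate idle time; on impending overflow, move CORRECTION aside."""
    if skdidl + delta > WORD_MAX:
        skdidl -= CORRECTION
        carried_ms += CORRECTION // UNITS_PER_MS
    return skdidl + delta, carried_ms

def idle_ms(skdidl, carried_ms):
    """Convert to milliseconds at the point of use, adding the constant back."""
    return skdidl // UNITS_PER_MS + carried_ms

skdidl, carried = add_idle(WORD_MAX - 5, 0, 10)
```

The counter stays within one word, and the millisecond total is the same
as if the addition had been done without any word-size limit.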


                               [End of TCO 6.1.1304]

                               TCO-number:  6.1.1308



Written-by:  PAETZOLD                         Creation-date:   5-Apr-85 15:04:37


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  monitor
  Routines-affected:   	ipipip




Problem:    

ILULK2 BUGHLTs from RCVGAT in IPIPIP when a system has multiple network 
interfaces and the system is a gateway and an interface is down and a
gateway client sends packets to be forwarded on the interface which is down.

Diagnosis:    

The  packet  first  comes  into RCVGAT. RCVGAT checks for a full length
buffer and unlocks the buffer  accordingly.  Since  the  buffer  was  a
receive buffer received from a hardware interface it is full length.

The  destination  address is not for the local host so the packet is to
be forwarded. The packet is now given to another interface to  forward.
Unfortunately the target interface is down.

GWYLUK  is  called  to  find  another  interface for the packet. GWYLUK
returns an interface which is currently up. It is determined  that  the
local host is indeed the gateway and the packet is forwarded to SNDLCL.
SNDLCL  locks  down the packet again, but uses the actual length of the
data and not the length of the  buffer.  This  is  usually  OK  because
SNDLCL is not usually sending out receive buffers.

RCVGAT  gets  the packet again and unlocks it. RCVGAT unlocks the whole
buffer but SNDLCL only locked the data portion. If the buffer crosses a
page boundary (probability .5) an ILULK2 will result.

Solution:    

Change SNDLC4 to lock the whole buffer if the buffer is full size. This
appears to be a day one BBN problem.
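
Illustration:  A Python sketch of the lock/unlock mismatch.  A page is
512 words; the buffer address and the buffer and data sizes are invented
so that the buffer crosses a page boundary while the data does not.

```python
from collections import Counter

PAGE_WORDS = 512                 # a TOPS-20 page is 512 words

def pages(addr, n_words):
    """Page numbers touched by n_words starting at addr."""
    return range(addr // PAGE_WORDS, (addr + n_words - 1) // PAGE_WORDS + 1)

def forward_ok(buffer_addr, buffer_words, words_locked_by_sender):
    """Lock as the sender would, then unlock the whole buffer as RCVGAT does."""
    locks = Counter()
    for p in pages(buffer_addr, words_locked_by_sender):
        locks[p] += 1
    for p in pages(buffer_addr, buffer_words):
        if locks[p] == 0:
            return False         # unlocking a page that was never locked
        locks[p] -= 1
    return True

BUFFER_ADDR, BUFFER_WORDS, DATA_WORDS = 500, 100, 10   # hypothetical sizes
before = forward_ok(BUFFER_ADDR, BUFFER_WORDS, DATA_WORDS)
after = forward_ok(BUFFER_ADDR, BUFFER_WORDS, BUFFER_WORDS)
```

Locking only the data touches page 0, but the unlock walks pages 0 and 1;
locking the full buffer makes the two operations symmetric.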


                               [End of TCO 6.1.1308]

                               TCO-number:  6.1.1309



Written-by:  GRANT                            Creation-date:   5-Apr-85 20:09:29
Edited-by:   GRANT                            Edit-date:       6-Apr-85 12:30:59


Edit-checked:         No     Document:          No     TCO-tested:  Yes
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	scampi




Problem:    KPALVH, SCASCQ, and KLPNOM BUGHLTs.  There were probably others
but we've lost track.

Diagnosis:    A packet was appearing on 2 of the CI port's queues at the same
time, leading to massive confusion.  The results were CI-related BUGHLTs
of various flavors.

Solution:    When CIFORK was created SCAMPI was made to set a bit whenever it
needed buffers allocated or connect blocks reaped.  Newly-added code caused
a buffer to get returned when it shouldn't have been.  This was caused by
incorrectly skipping over a RET.


                               [End of TCO 6.1.1309]

                               TCO-number:  6.1.1310



Written-by:  GRANT                            Creation-date:   8-Apr-85 07:42:47


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	phyklp




Problem:  Can't test CI microcode's NO-ANSWER feature which is used by
	diagnostics.
Diagnosis:  No code.
Solution:  Add a routine which does the SET-COUNTER function to set the
NO-ANSWER bit.  This is never called by the standard system;  it is only used
for debugging the microcode.

                               [End of TCO 6.1.1310]

                               TCO-number:  6.1.1312



Written-by:  PALMIERI                         Creation-date:   8-Apr-85 14:08:51


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  Monitor
  Routines-affected:   	SCPAR	SCJSYS




Problem:    COMMMS BUGHLTs

Diagnosis:  
        SCJSYS is releasing SAB blocks twice.  If a link blocks as a result of
        a SOUT/SOUTR the SAB it was using is entered into the "active" slot of
        the SAB indirect table.  If the fork runs at a higher priority level
        before the output completes the monitor will notice the incomplete
        output and attempt to complete it.  If it succeeds, the SAB will be
        returned to the monitor free space pool.  After the higher priority's
        output completes the blocked lower priority will be wakened.  It still
        has a pointer to the now released SAB in its ACs and may attempt to
        release it a second time, resulting in a COMMMS BUGHLT.

Solution:  
        Keep an indirect "active" slot for each PSI level (normal,1,2,3).
        Only attempt to complete blocked output that is at the current PSI
        level.  Do not return a SAB to its "normal" (not active) slot if the
        SAB indirect table pointer (PSBSAB) is zero.


                               [End of TCO 6.1.1312]

                               TCO-number:  6.1.1313



Written-by:  PALMIERI                         Creation-date:   8-Apr-85 16:01:14
Edited-by:   PALMIERI                         Edit-date:       8-Apr-85 16:05:50


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  Monitor
  Routines-affected:   	SCLINK


Related-QAR:  	838176



Problem:    36-bit byte mode does not send all bytes.

Diagnosis:    SCLINK tries to determine if all bytes in user's buffer will fit
	into a message.  AC P1 is used as a flag to indicate more to send.
	If SCLINK thinks all bytes will fit into a segment but the copy
	routine does not, SCLINK does not notice.

Solution:    Update P1 after user's data is copied.


                               [End of TCO 6.1.1313]

                               TCO-number:  6.1.1314



Written-by:  PALMIERI                         Creation-date:   9-Apr-85 11:16:45


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  Monitor
  Routines-affected:   	SCJSYS




Problem:    DECnet free space used up

Diagnosis:    SCJSYS not releasing port indirect tables

Solution:    Change SKIPN to SKIPE in RELSJB


                               [End of TCO 6.1.1314]

                               TCO-number:  6.1.1316



Written-by:  MELOHN                           Creation-date:   9-Apr-85 15:16:07


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	CTHSRV




Problem:    When SET HOSTing from VMS to TOPS-20 several of the initial
characteristics sent out by TOPS-20 are ignored by VMS.

Diagnosis:    VMS does not correctly handle an init and a
characteristics CTERM message in the same foundation common data message.

Solution:   Provide characteristics in smaller spoonfuls for VMS by
putting each CTERM message in its own common data foundation message
type.


                               [End of TCO 6.1.1316]

                               TCO-number:  6.1.1317



Written-by:  MELOHN                           Creation-date:   9-Apr-85 15:44:25


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	CTHSRV




Problem:    System crashes when user CTERMs in and HOSTs out.

Diagnosis:    CTHOOE, the TDCALL which determines out-of-band echoing
for CTERM terminals, is in swappable code. It is called at scheduler
level when the .MOSNH MTOPR% is being used. *BAM*

Solution:    Put CTHOOE in RESCD.


                               [End of TCO 6.1.1317]

                               TCO-number:  6.1.1318



Written-by:  TBOYLE                           Creation-date:  11-Apr-85 15:56:19
Edited-by:   TBOYLE                           Edit-date:      11-Apr-85 15:59:07


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	FORK




Problem:    SPLFK% can cause a job to run out of fork handles if
another fork obtains handles on either of the forks used in the
call that exercises the new SPLFK% with suicide option.

This can occur if you ^C out of LINK when it has two forks and do
an INFORMATION FORK, and it also happens by just using LINK with
Rutgers "WATCH" program running with the "program-watch" option.

Diagnosis:    Part of the code must exchange the Job-wide data on
the two forks so that one becomes the other. The Job-fork-handle
share counts should not, however, be exchanged. This is because
other forks have relative fork handles that will point to the
new forks and they must remain the same so that the Job-fork-handles
can be properly released.

Solution:    Add code to ensure that the share counts (FKHCNT) remain
the same after the splice with suicide option.


                               [End of TCO 6.1.1318]

                               TCO-number:  6.1.1319



Written-by:  PAETZOLD                         Creation-date:  12-Apr-85 11:28:53


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	IPCIDV




Problem:    

TCP/IP on the CI does not initialize.

Diagnosis:    

Job zero startup has changed around and internet is usually not initialized
when IPCIDV tries to initialize.  This is possible and IPCIDV handles the
situation. However, it handles it incorrectly and marks the interface's desired
state to be down.

Solution:    

Change a JRST into a RET at CIPRST+1.


                               [End of TCO 6.1.1319]

                               TCO-number:  6.1.1320



Written-by:  LEACHE                           Creation-date:  14-Apr-85 14:39:22
Edited-by:   LEACHE                           Edit-date:      14-Apr-85 17:00:07


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  Monitor
  Routines-affected:   	BOOT


Related-SPR:  	 19606



Problem:      BOOT does not always reload all DX20's on the system.  Also,
BOOT will often unnecessarily reload some DX20's more than once.

Diagnosis:      Lost in the dawn of history is the reason why BOOT specifically
avoids reloading tape DX20's.  The unnecessary reloadings are an artifact
of the design of pre-DX20 BOOT.

Solution:      Make each invocation of BOOT (whether manual or auto-reload)
load each DX20 on the system exactly once.


                               [End of TCO 6.1.1320]

                               TCO-number:  6.1.1321



Written-by:  LEACHE                           Creation-date:  14-Apr-85 14:47:11
Edited-by:   LEACHE                           Edit-date:      14-Apr-85 17:12:37


Edit-checked:         No     Document:          Yes    TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  Monitor
  Routines-affected:   	BOOT	CHECKD	DSKALC	PROLOG


Related-SPR:  	 19084



Problem:      BOOT will abort an auto-reload if it gets a dump error.

Diagnosis:      This change was made in V5 so that if the dump was important
the dump could be attempted again.  This is acceptable behaviour on a
development system, but not on a production machine where the most important
thing is to get the system back up.

Solution:      Create a home-block cell for storing BOOT parameters and modify
CHECKD to read and write these parameters.  Define a parameter that, when
set, will cause BOOT to halt when dump errors are encountered during an
auto-reload.  The default behaviour has been changed to proceed on dump
errors.


The CHECKD command ENABLE BOOT-PARAMETERS will set and clear the parameters.
SHOW BOOT-PARAMETERS will display the settings.


The first bit in the parameter-word controls whether BOOT will read the
remaining flags or not.  If the first bit is set, then BOOT will read the
remaining flags (only 1 of which is defined: halt-on-dump-errors).  When
BOOT encounters an enabled parameter word it will change its prompt to
[*BOOT ...] to indicate on the console that it is reading parameters that
may change its behaviour.
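
A minimal sketch of how such a parameter word might be decoded, written
in Python for illustration only.  The bit positions are assumptions: the
TCO says only that the first bit enables the rest, so the sketch takes
"first bit" to mean bit 0 (the most significant bit of a 36-bit word)
and places the halt-on-dump-errors flag in the next bit.

```python
WORD_BITS = 36
ENABLE = 1 << (WORD_BITS - 1)               # assumed: bit 0, most significant
HALT_ON_DUMP_ERRORS = 1 << (WORD_BITS - 2)  # assumed position, hypothetical

def boot_behaviour(param_word):
    """Return (halt_on_dump_errors, starred_prompt) for a parameter word."""
    enabled = bool(param_word & ENABLE)
    # The remaining flags are read only when the enable bit is set.
    halt = enabled and bool(param_word & HALT_ON_DUMP_ERRORS)
    # An enabled word is what switches the prompt to the [*BOOT ...] form.
    return halt, enabled
```

A word with the halt flag set but the enable bit clear yields
(False, False), which matches the new default of proceeding on dump
errors.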


                               [End of TCO 6.1.1321]

                               TCO-number:  6.1.1323



Written-by:  GLINDELL                         Creation-date:  16-Apr-85 21:03:45


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	SCLINK	STG


Related-QAR:  	838204



Problem:  
Remember the TCO last week about the number of nodes growing
on the Enet?  Well, you can forget it now.  Instead of having
a static maximum size of the database, make it dynamic.

Diagnosis:  

Solution:  
Instead of allocating a chunk of memory at initialization time,
get a page at a time when needed instead.  Use ASGVAS.


                               [End of TCO 6.1.1323]

                               TCO-number:  6.1.1324



Written-by:  GROSSMAN                         Creation-date:  16-Apr-85 23:10:33


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	LLMOP




Problem:  Remote console programs, and NCP's TRIGGER NODE command eventually stop
working.

Diagnosis:  LLMOP loses track of the number of receive buffers it has posted for
the Remote Console protocol type.  It gets into a state where it believes that
it has two receive buffers posted, when in reality, it has none posted.

This was caused by the use of a bizarre mutation of the INCR/DECR macros.

Solution:  Change all occurrences of the INCRF and DECRF macros into INCR and DECR
macros as appropriate.  Delete definitions of INCRF and DECRF to prevent
future abuse.


                               [End of TCO 6.1.1324]

                               TCO-number:  6.1.1325



Written-by:  GRANT                            Creation-date:  17-Apr-85 13:01:01
Edited-by:   GRANT                            Edit-date:      17-Apr-85 13:09:20


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYSIO




Problem:      If a cluster system gets hung during startup, the console message on
	the other systems is not informative, namely, " %Problem Drive Dual
	Ported to Unknown System ".

Diagnosis:      It doesn't tell you what the problem is.

Solution:      Change the message to, " %Drive forced offline because a running
	system hasn't joined the cluster ".


                               [End of TCO 6.1.1325]

                               TCO-number:  6.1.1326



Written-by:  PALMIERI                         Creation-date:  17-Apr-85 13:24:40


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  Monitor
  Routines-affected:   	SCJSYS


Related-QAR:  	838190



Problem:    No interrupt if connect is rejected.

Diagnosis:    If connect initiate is rejected interrupt is given on the data
	channel rather than the connect channel.

Solution:    Give interrupt on the connect channel.


                               [End of TCO 6.1.1326]

                               TCO-number:  6.1.1327



Written-by:  LOMARTIRE                        Creation-date:  17-Apr-85 13:56:11


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	CFSSRV




Problem:    
     PITRAP BUGHLTs.

Diagnosis:  
     The  routine  CFCARV is not correctly calculating which page to lock down.
If the old page has been completely used and a new page is being acquired, then
page N+2, not N+1 is locked down. Thus, the next page which will be used is now
unlocked.

Solution:  
     First,  ask  Tom  Moser  for  help.  Then,  change an ADDI T1,PGSIZ to ADD
T1,CFNXSZ.


                               [End of TCO 6.1.1327]

                               TCO-number:  6.1.1328



Written-by:  LOMARTIRE                        Creation-date:  17-Apr-85 14:27:20


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PAGUTL	STG	GLOBS


Related-TCO:  	6.1.1282



Problem:    
     TCO 6.1.1282 is no longer needed. The PITRAP problem has been found.

Diagnosis:  

Solution:    
     Remove it.


                               [End of TCO 6.1.1328]

                               TCO-number:  6.1.1329



Written-by:  GLINDELL                         Creation-date:  17-Apr-85 15:56:49


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	SCLINK




Problem:  
It is possible to redefine the executor name after DECnet has
initialized.  This does not cause any problems except perhaps
user confusion.  However, the philosophy and documentation say
that it is not allowed to change either the executor name or
address.

Diagnosis:  
The 'add node' routine SCTAND in SCLINK checked if the user
was changing the executor address, but not if the name was
changed.

Solution:  
When not finding the node name to add in the database, see if
the node address is in the home area.  If so, clear out any
ADRTAB entry that there may be.


                               [End of TCO 6.1.1329]

                               TCO-number:  6.1.1330



Written-by:  GRANT                            Creation-date:  17-Apr-85 17:23:08


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	phymsc




Problem:  MSCP tries to connect to TOPS-10 systems.
Diagnosis:  MSCP tries to connect to HSC50s and KL10s.
Solution:  Change MSCP to attempt connections based on software type, not
	hardware type.  Try to connect to systems who run either "HSC" or
	"T-20", as indicated in their start packets.

                               [End of TCO 6.1.1330]

                               TCO-number:  6.1.1331



Written-by:  GLINDELL                         Creation-date:  18-Apr-85 11:56:49


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	D36COM




Problem:  
ILMNRF's when generating a 3.0 event caused by a bad incoming
'disconnect initiate' DECnet message.

Diagnosis:  
DNCM2B is called to copy the disconnect data from the message block
into a disconnect block.  Doing so, it resets the index pointer in the
MSD from T6 to T3 and leaves it that way.  The poor event code later
tries to read the message again to create the event, and uses the T3
that was left in the MSD as index pointer instead of the correct T6.

Solution:  
Restore the index AC used before saving the updated byte pointer in
DNCM2B.


                               [End of TCO 6.1.1331]

                               TCO-number:  6.1.1332



Written-by:  GLINDELL                         Creation-date:  19-Apr-85 14:26:49
Edited-by:   GLINDELL                         Edit-date:      19-Apr-85 15:24:30


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	CIDLL




Problem:      
ZERO LINE CI-0-0 COUNTERS produced amusing results.

Diagnosis:      
DECnet does not maintain any CI line counters.  CIDLL therefore
returned error code NF.FNS which means "function not supported"
when asked to zero these counters. This caused great consternation
in NMLT20.

Solution:      
Make CIDLL return "NF.NDP" which means 'function succeeded but
no data present to return' when asked to do something to the
CI line counters.  This will be more to NMLT20's taste.

Also fix NMLT20 to handle an error return from the ZERO COUNTER
call.


                               [End of TCO 6.1.1332]

                               TCO-number:  6.1.1333



Written-by:  PAETZOLD                         Creation-date:  22-Apr-85 09:44:58


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	IPIPIP




Problem:    

asniq jsys checks for sc%nwz but does not check for sc%whl.

Diagnosis:  

Solution:    

make wheel and operator good enough as well as sc%nwz.
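
A minimal sketch of the widened check, in Python for illustration.  The
bit values are hypothetical stand-ins for the sc%whl, sc%opr and sc%nwz
capability bits; only the "any one of the three is enough" logic comes
from the TCO.

```python
# Hypothetical bit values; the real capability bits are defined in MONSYM.
SC_WHL = 1 << 0   # wheel
SC_OPR = 1 << 1   # operator
SC_NWZ = 1 << 2   # net wizard

def may_assign_queue(enabled_capabilities):
    """True if any one of the three capabilities is enabled."""
    return bool(enabled_capabilities & (SC_WHL | SC_OPR | SC_NWZ))
```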


                               [End of TCO 6.1.1333]

                               TCO-number:  6.1.1334



Written-by:  LOMARTIRE                        Creation-date:  23-Apr-85 08:34:04


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYMVR




Problem:    
     MSSCFS BUGCHKs.

Diagnosis:    
     During startup, the system is careful to ensure that the system has joined
the  cluster  before allowing it to serve any of its disks with the MSCP server.
However, this is not the case if we are already up, connections are broken, and
then reestablished. In this case, we have already allowed access to the  disks.
If the MSCP connection is established before the CFS one, then the server could
be  handed  a queued up IORB from the other system before CFS has started. This
should not cause any problems since CFS has already arbitrated the I/O request,
but the BUGCHK is not very reassuring.

Solution:    
     First  change MSSCFS into a BUGHLT to try to trap any real problems. Next,
reject any connect requests to the server if we have not  or  are  not  in  the
process  of  establishing  a  CFS  connection.  The MSCP driver will repeat the
attempt later so, eventually, we should get a connection established.
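
A minimal sketch of the accept/reject rule and the driver's retry, in
Python for illustration.  The state names and both function names are
hypothetical; the TCO states only that a connect is rejected unless a
CFS connection exists or is being established, and that the MSCP driver
tries again later.

```python
def accept_server_connect(cfs_state):
    """Server side: accept only if CFS is connected or connecting."""
    return cfs_state in ("connected", "connecting")

def attempts_until_connected(cfs_states):
    """Driver side: retry once per observed CFS state; return the attempt
    number that succeeded, or None if none did."""
    for attempt, state in enumerate(cfs_states, start=1):
        if accept_server_connect(state):
            return attempt
    return None
```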


                               [End of TCO 6.1.1334]

                               TCO-number:  6.1.1335



Written-by:  GRANT                            Creation-date:  23-Apr-85 08:46:53


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYSIO


Related-QAR:  	838025



Problem:  If you have an RP06 with both its ports on the same KL, you can't
	switch from Port A to Port B while work is going on.

Diagnosis:  %Problem on device .......   messages result.  TOPS-20 won't use
	the newly selected port.

Solution:  The code which searches for a second path to the same disk wrongly
	establishes the primary and secondary paths.  In some cases it was
	setting the primary path to be the side which is now offline.  Fix
	the code so it always has an active port as the primary path.
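
A minimal sketch of the corrected path selection, in Python for
illustration.  The dictionary layout and function name are
hypothetical; the point is only that an online port always ends up as
the primary path when one is available.

```python
def order_paths(path_a, path_b):
    """Return (primary, secondary), putting an online path first."""
    if not path_a["online"] and path_b["online"]:
        return path_b, path_a
    return path_a, path_b
```

After a switch from Port A to Port B, A is offline and B is online, so
B becomes the primary path.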


                               [End of TCO 6.1.1335]

                               TCO-number:  6.1.1336



Written-by:  GROSSMAN                         Creation-date:  23-Apr-85 17:01:05
Edited-by:   GROSSMAN                         Edit-date:      23-Apr-85 17:03:47


Edit-checked:         No     Document:          Yes    TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	MONSYM	NIUSR




Problem:    No way for users to find out the Ethernet address of the local
node via the NI% JSYS.

Diagnosis:    OOPS.

Solution:    Return the Physical address (the current address), and the
Hardware address in the .EIRCI (Read Channel Info) function.


                               [End of TCO 6.1.1336]

                               TCO-number:  6.1.1337



Written-by:  GROSSMAN                         Creation-date:  23-Apr-85 17:27:21


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	NIUSR	NISRV




Problem:  SMON% code for setting Ethernet address in wrong module.

Diagnosis:  The code belongs in NIUSR, but NIUSR didn't exist when the code
was written.

Solution:  Move the code from NISRV to NIUSR.


                               [End of TCO 6.1.1337]

                               TCO-number:  6.1.1339



Written-by:  PALMIERI                         Creation-date:  24-Apr-85 10:09:43


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  Monitor
  Routines-affected:   	STG	FREE


Related-QAR:  	838215



Problem:    DECnet swappable free space is available but never used in 6.1.

Diagnosis:    Pre 6.1 needs it but it is always configured regardless of setting
	of FTNSPSRV.

Solution:    Add FTNSPSRV around parameter that makes its size non-zero.


                               [End of TCO 6.1.1339]

                               TCO-number:  6.1.1341



Written-by:  GROSSMAN                         Creation-date:  24-Apr-85 13:03:26


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	NIUSR




Problem:  Ill mon refs if bad byte pointers given to NI% JSYS.

Diagnosis:  No ERJMPx after some PXCT's.

Solution:  Put ERJMPs after all PXCTs that can occur while NOINT.


                               [End of TCO 6.1.1341]

                               TCO-number:  6.1.1342



Written-by:  GROSSMAN                         Creation-date:  24-Apr-85 13:40:04


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	NISRV	NIUSR	NIPAR




Problem:  When the NI% JSYS returns monitor portal ID's to the user, these
ID's are just NISRV's portal block addresses.  There are a number of
problems with this:

	1) User portal ID's are only a halfword, but monitor portal ID's
	   are fullwords.
	2) If the user passes a portal ID to the NI% JSYS, it shouldn't
	   be an address, because he might pass a bad address.
	3) Monitor addresses are ugly looking ID's to pass back to a user.

Solution:  Have NISRV generate a unique 'external' ID for all portals that
are opened.  Now, each portal will have a unique 9 bit code associated
with it.  This code can then be returned to the user as the portal ID.
Also, create a new NISRV function which will translate 'external' portal
ID's to real portal ID's.

The 'external' portal ID will now be returned by the NU.RPI (read portal
info) function of NISRV.
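
A minimal sketch of such an ID table, in Python for illustration.  The
class and method names are hypothetical, and reserving ID 0 as "no
portal" is an assumption; the TCO specifies only a unique 9-bit code per
open portal and a translation function back to the real portal ID.

```python
class PortalTable:
    """Maps small external portal IDs to internal portal identifiers."""
    MAX_IDS = 1 << 9                 # 9-bit external IDs: 0..511

    def __init__(self):
        self._by_external = {}
        self._next = 1               # assumption: 0 is reserved as "no portal"

    def open(self, internal):
        """Assign an unused external ID to an internal portal identifier."""
        for _ in range(self.MAX_IDS - 1):
            candidate = self._next
            self._next = self._next % (self.MAX_IDS - 1) + 1   # cycles 1..511
            if candidate not in self._by_external:
                self._by_external[candidate] = internal
                return candidate
        raise RuntimeError("no free external portal IDs")

    def translate(self, external):
        """Return the internal identifier, or None for an unknown ID."""
        return self._by_external.get(external)

    def close(self, external):
        self._by_external.pop(external, None)
```

Because the user only ever supplies a small code that is looked up in
the table, a bad ID is answered with "unknown" and is never used as an
address.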


                               [End of TCO 6.1.1342]

                               TCO-number:  6.1.1343



Written-by:  PALMIERI                         Creation-date:  25-Apr-85 16:11:15


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  Monitor
  Routines-affected:   	JNTMAN




Problem:    SYSDPY gets wrong information as to whether task is active (DCN) or
	passive (SRV).

Diagnosis:    JNTMAN doesn't understand the new way to access port blocks.

Solution:    Teach JNTMAN about the port indirect table.


                               [End of TCO 6.1.1343]

                               TCO-number:  6.1.1346



Written-by:  GROSSMAN                         Creation-date:  29-Apr-85 14:28:36


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PAGEM




Problem:  IDFOD2 BUGCHK'S, PAGLCK BUGHLT'S sometime later on.  Seem to be
related to doing word searches in MDDT.

Diagnosis:  MDDT gets bad information from the MRPAC% JSYS, and touches
CXBPGA.  This causes the page to be created with the section 6 map as
its owner, instead of the currently running process.  When the process
gets destroyed, it attempts to deassign the page, and a PAGLCK results.

Solution:  Add a check to the MRPAC% JSYS to explicitly check for section
XCDSEC and the special pages, and treat them as process private pages.


                               [End of TCO 6.1.1346]

                               TCO-number:  6.1.1347



Written-by:  GROSSMAN                         Creation-date:  29-Apr-85 14:36:41


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PAGEM




Problem:  MDDT hangs when doing word searches in section 6 (XCDSEC).

Diagnosis:  MDDT touches a page that is not mapped into section 6, but is
mapped into section 0/1.  The test for page ownership in FPTA incorrectly
claimed that the page was owned by section 0/1, and paged in the wrong
page.  PAGEM eventually restarts the faulting instruction, and the fault
occurs again and again ad nauseam.

Solution:  Fix the page ownership test for XCDSEC pages (FTPAXC) to claim
that all pages between NRCOD and NRCODZ (within section 6) are owned by
the XCDSEC map.  All other pages in section 6 are owned by the section 0/1
map.
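
A minimal sketch of the corrected ownership rule, in Python for
illustration.  The function name is hypothetical, and treating the
NRCOD..NRCODZ range as inclusive at both ends is an assumption.

```python
def xcdsec_page_owner(page, nrcod, nrcodz):
    """Say which map owns a page that lies within section 6."""
    if nrcod <= page <= nrcodz:      # assumed inclusive bounds
        return "XCDSEC"
    return "section 0/1"
```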


                               [End of TCO 6.1.1347]

                               TCO-number:  6.1.1348



Written-by:  GROSSMAN                         Creation-date:  29-Apr-85 14:54:05


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	LLMOP




Problem:  LLMOP% remote console loses buffers; eventually, it stops working
altogether.

Diagnosis:  When LLMOP receives a message with an error from NISRV, it neglects
to update its outstanding receive buffer count (SVBPC).  Eventually when it
calls PSTBUF to setup more receive buffers, PSTBUF doesn't queue up any
because SVBPC is too high.  Eventually, all receive buffers are lost and
no more messages come in for the remote console protocol type.

Solution:  Re-arrange the routine LLMRCX (the common receive processing
routine for all LLMP protocol types).  Ensure that SVBPC gets decremented
for all buffers received back from NISRV.  Also, keep separate counts for
the receipt of unsupported messages, and bad messages.
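
A minimal sketch of the bookkeeping, in Python for illustration.  The
class, method and status names are hypothetical; `posted` stands in for
SVBPC.  The count is decremented for every buffer handed back, whatever
its status, so the repost step always sees the true number outstanding.

```python
class ReceiveState:
    def __init__(self, target):
        self.target = target      # buffers we want outstanding
        self.posted = 0           # plays the role of SVBPC
        self.unsupported = 0      # separate count: unsupported messages
        self.bad = 0              # separate count: bad messages

    def post_buffers(self):
        """Top up to `target` outstanding buffers; return how many were added."""
        added = max(0, self.target - self.posted)
        self.posted += added
        return added

    def on_receive(self, status):
        """Common receive path for every buffer returned by the driver."""
        self.posted -= 1          # error or not, the buffer has come back
        if status == "unsupported":
            self.unsupported += 1
        elif status == "bad":
            self.bad += 1
        return self.post_buffers()
```

Skipping the decrement on the error path would leave `posted` at the
target with fewer buffers really outstanding, and `post_buffers` would
then add nothing, which is the failure described in the diagnosis.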


                               [End of TCO 6.1.1348]

                               TCO-number:  6.1.1350



Written-by:  LOMARTIRE                        Creation-date:  30-Apr-85 15:52:12


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	MEXEC


Related-QAR:  	838042



Problem:    
     When  a  system  in  a CFS cluster automatically reloads, the cluster-wide
time may be set backwards by some amount as dictated by the reloading machine.

Diagnosis:    
     The  startup code in MEXEC sets the system time before joining the cluster
by obtaining the time from the front  end.  This  time  may  actually  lag  the
cluster-wide  time  by  a  large  amount.  Depending  on  the  sequence  of CFS
connections at startup, this bogus time may propagate to other machines.

Solution:    
     Do  not set the system time until after we have joined the cluster. Prefer
the cluster time to that supplied by other sources. The  ordering  will  be  as
follows:

	1.  the cluster time
	2.  the front-end
	3.  the person at the console
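
The ordering above can be sketched as a simple preference chain, shown
here in Python for illustration.  The function and parameter names are
hypothetical, and None stands for "this source has no time to offer".

```python
def choose_system_time(cluster_time=None, front_end_time=None, console_time=None):
    """Return (source, time) from the first source that supplies a time."""
    for source, value in (("cluster", cluster_time),
                          ("front-end", front_end_time),
                          ("console", console_time)):
        if value is not None:
            return source, value
    raise ValueError("no time source available")
```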


                               [End of TCO 6.1.1350]

                               TCO-number:  6.1.1351



Written-by:  GLINDELL                         Creation-date:  30-Apr-85 16:11:31


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	NTMAN


Related-QAR:  	838253



Problem:  
SHOW LOOP NODES returns a node number for the loop node names.
Loop node names do not have node numbers.  This causes a funny-looking
display in NCP.

Diagnosis:  
Loopback code in NTMAN calls NMXS2A (convert sixbit to ascii) and
thinks NMXS2A returns +1.  It doesn't, it returns +2 always.

Solution:  
Change NMXS2A to always return +1.  This will fix another case
in NTMAN when a routine didn't expect NMXS2A to return +2.  That
case is more interesting - although I haven't seen any problems
with it - every time NTMAN was asked to convert a node number to
a name (often) it would fall through to the code on the next page.
It didn't seem to do any harm though.


                               [End of TCO 6.1.1351]

                               TCO-number:  6.1.1354



Written-by:  GROSSMAN                         Creation-date:   2-May-85 09:53:16


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	MEXEC




Problem:  Programs doing CRJOB JSYS can hang permanently.

Diagnosis:  The job that is being created by the CRJOB JSYS (hereafter
called the 'object' job) can be logged out before being completely created.
If this happens, the object job never gets to setup CRJANS (the CRJOB answer
cell), and the job that did the CRJOB JSYS hangs forever waiting for CRJANS
to become non-zero.

Although there is a relatively small window between the time of the job's
creation, and its setup of CRJANS, this window can be lengthened quite
a lot by an ACJ that takes a long time when monitoring LOGINs.

Solution:  Have the logout code at FLOGO check to see if this job is the
object of a CRJOB.  If it is, ensure that CRJANS gets set to -1 to wake up
the program doing the CRJOB.
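
A minimal sketch of the wake-up, in Python for illustration.  The class
and function names are hypothetical, and `answer` stands in for CRJANS.
Leaving an already-answered cell alone is an assumption; the TCO says
only that the cell must end up set to -1 when the object job logs out
during creation.

```python
class CrjobRequest:
    """State shared between the job doing CRJOB and the object job."""
    def __init__(self):
        self.answer = 0           # stands in for CRJANS; 0 = no answer yet

    def requester_must_wait(self):
        return self.answer == 0

def logout(pending_request=None):
    """Logout path: answer a still-pending CRJOB so its caller wakes up."""
    if pending_request is not None and pending_request.requester_must_wait():
        pending_request.answer = -1
```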


                               [End of TCO 6.1.1354]

                               TCO-number:  6.1.1355



Written-by:  GROSSMAN                         Creation-date:   2-May-85 17:12:46


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYKNI




Problem:  KNIIPT BUGHLTs.

Diagnosis:  When closing an NISRV portal, a resource failure (from ASGRES)
can occur.  This error gets passed back up to the user, who may then
optionally try closing the portal again.  Unfortunately, PHYKNI was not
cleaning up properly after the failure.  When the user re-tried the close
(somewhat later), a sanity check failed, and a KNIIPT BUGHLT resulted.

Solution:  Re-do some code so that the resource failure cannot occur in the
routine NIDPT.


                               [End of TCO 6.1.1355]

                               TCO-number:  6.1.1359



Written-by:  WAGNER                           Creation-date:   6-May-85 11:22:49


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PAGUTL




Problem:  PMCTL% returns "page available" status, even if page number requested
	 is greater than physical memory.

Diagnosis:  Code was dreaming, comparing against maximum possible, not against
	 what we can actually afford.

Solution:  Make check against reality, in this case NHIPG instead of MAXCOR.


                               [End of TCO 6.1.1359]

                               TCO-number:  6.1.1360



Written-by:  TBOYLE                           Creation-date:   6-May-85 14:30:21
Edited-by:   TBOYLE                           Edit-date:       6-May-85 14:49:02


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	DSKALC




Problem:    Unnecessary pages marked in the batblocks. Extra entries
present that could have been a single entry.

Diagnosis:    Disks that have sectors/page equal to one (HSC, RP20)
allocate BAT entries as if there were 4 sectors per page. Also, a
miscalculation of sector rounding causes a bad page to be entered as
a separate entry even if an entry for the next page exists in the
batblocks. It should alter the existing entry to include the new bad page.

Solution:    Use SECPAG based on the UDB to properly add pages to
existing entries and new entries to the batblocks around the code in DOPAIR.


                               [End of TCO 6.1.1360]

                               TCO-number:  6.1.1363



Written-by:  GROSSMAN                         Creation-date:   6-May-85 15:43:08


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	TTYSRV




Problem:  ILMNRFs, potential core trashing when doing sendalls.

Diagnosis:  The routine ASGSHT was smashing T2 on error returns.  The caller
expected T2 to contain the line number upon return from ASGSHT.  It then used
the bad T2 to zero an entry in TTACTL.  In this case, we were very lucky,
because we tried to zero write protected memory.

Solution:  Make ASGSHT preserve T2 even for error returns.


                               [End of TCO 6.1.1363]

                               TCO-number:  6.1.1364



Written-by:  GROSSMAN                         Creation-date:   6-May-85 15:53:54


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYKNI




Problem:  Systems without KLNI's get messages at startup indicating that the
KLNI is being reloaded.  In addition, the reload fails, and a KNIRTO BUGCHK
results.

Diagnosis:  When I moved the KLNI initialization from PHYH2 to PHYSIO, I
lost the check for KLNIness.  So, no matter what happens to be in RH slot
5, NISRV will pounce on it and do lots of nasty things.  This is generally
not a problem if there is nothing in RH slot 5.  However, if there is an
RH20 in slot 5, chaos would result.

Solution:  Put the check for KLNIness back into NIINI.


                               [End of TCO 6.1.1364]

                               TCO-number:  6.1.1365



Written-by:  GLINDELL                         Creation-date:   6-May-85 16:35:32


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	SCLINK




Problem:  
COM* bug* when doing SHOW KNOWN NODES in NCP.

Diagnosis:  
The buffer to return 'known nodes' was allocated based on a symbol
that is not guaranteed to be kept up to date.  When the latest round
of node name tables came around, node numbers that were higher in value
than the symbol used occurred.

Solution:  
Don't try to be smart - always set up a buffer big enough for the highest
possible node number (1023).  The memory allocated is only 1 bit per node.
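
As a rough illustration of the sizing argument above (a sketch only, not the
SCLINK code): one bit per node number up to 1023, packed into 36-bit words.
The helper names and the bit ordering within a word are assumptions made for
the example.

```python
WORD_BITS = 36   # PDP-10 word size
MAX_NODE = 1023  # highest possible node number, per the Solution above


def bitmap_words(max_node=MAX_NODE, word_bits=WORD_BITS):
    """Number of words needed to hold one bit per node number 0..max_node."""
    return -(-(max_node + 1) // word_bits)  # ceiling division


def new_bitmap():
    """Allocate a zeroed bitmap large enough for every legal node number."""
    return [0] * bitmap_words()


def mark_known(bitmap, node):
    """Set the bit for `node`; bit order within a word is an assumption."""
    if not 0 <= node <= MAX_NODE:
        raise ValueError("node number out of range")
    word, bit = divmod(node, WORD_BITS)
    bitmap[word] |= 1 << bit


def known_nodes(bitmap):
    """Return the node numbers whose bits are set, in ascending order."""
    return [
        w * WORD_BITS + b
        for w, word in enumerate(bitmap)
        for b in range(WORD_BITS)
        if (word >> b) & 1
    ]
```

Under these assumptions the buffer is 29 words no matter which node numbers
actually occur, so it no longer depends on a symbol being kept up to date.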


                               [End of TCO 6.1.1365]

                               TCO-number:  6.1.1366



Written-by:  TBOYLE                           Creation-date:   6-May-85 17:03:51


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYPAR	PHYP2




Problem:  OVRDTA, DXBEWC, Hung Jobs and of course, headaches.

Diagnosis:  There is a class of servo errors that seem to wedge
the DX20. On an unrecoverable error (seemingly a servo track
error), the DX20 remains locked to one drive. A hang occurs
when an attempt is made to transfer to a different drive.
If another transfer to the same drive occurs successfully,
the error is reset. This will usually happen right away because
the monitor will write the errant page to the batblocks so
as to never use the bad page again and the error will
usually be reset. There is a problem window however.

Solution:  Include a new bit in the UDB to watch for an overdue
transfer that occurs twice. When it happens, restart the microcode;
this will reset the world. This is ok, because PHYSIO will retry
pending transfers. It will also recover from this error so that
processing may proceed.

In the event that there are other forms of errors that hang the DX20,
this TCO will keep them from killing the system. Since they
all seem to happen with unrecoverable errors, we can be content
because the affected pages will be marked bad and not used again.
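
A minimal sketch of the "overdue twice" check described in the Solution. The
class and function names are made up for the example, and clearing the bit
on a successful transfer is an assumption, not something the TCO states.

```python
class Unit:
    """Stand-in for a UDB; `overdue_once` plays the role of the new bit."""

    def __init__(self):
        self.overdue_once = False


def transfer_overdue(unit, restart_microcode):
    """Handle an overdue transfer; restart the microcode on the second one.

    Returns True if the microcode was restarted.
    """
    if unit.overdue_once:
        unit.overdue_once = False
        restart_microcode()  # pending transfers are expected to be retried
        return True
    unit.overdue_once = True
    return False


def transfer_completed(unit):
    """Assumption: a successful transfer clears the overdue bit."""
    unit.overdue_once = False
```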


                               [End of TCO 6.1.1366]

                               TCO-number:  6.1.1369



Written-by:  TBOYLE                           Creation-date:   7-May-85 16:32:44
Edited-by:   TBOYLE                           Edit-date:       7-May-85 16:37:38


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYP2




Problem:    Bad pages used over and over again on RP20's.

Diagnosis:    The error handler is not catching all possible media
and HDA weakened errors. The error handler is mostly biased toward
considering errors that are not clean data errors as device errors.
There are, however, numerous nasty errors such as weak spots or defects
on the servo track, bad formatting, etc. There are also media errors
where the information returned is incomplete because these errors
cause the microcode to have head pains. Since all these errors are
flagged as device errors, they never make it into the BAT blocks.

Solution:    Bias the error handler toward media errors. Pick out the
controller, DCU errors, parity errors, etc. and flag them as retriable
device errors. Treat all others as data errors. If they do not succeed
after the many retries, this will ensure that they are put into the
BAT blocks.


                               [End of TCO 6.1.1369]

                               TCO-number:  6.1.1370



Written-by:  MELOHN                           Creation-date:   7-May-85 17:47:30
Edited-by:   MELOHN                           Edit-date:      14-May-85 16:55:51


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	CTHSRV	CTERMD


Related-TCO:  	6.1.1390



Problem:    SKDPF1/SKDCL1 in LOKWAI+a few.

Diagnosis:   Problems with the CTERM interlock scheme between the
CTERM fork, which runs at process level, and TTYSRV TDCalls which want
to mung around in the CDB. 

If someone other than the CTERM fork wants the CTERM lock, their
TDB is unlocked and they wait in scheduler test LOKWAI for the CTERM
lock. With the TDB unlocked, Bad Things can happen to the TDB, and if
the TDB goes away, LOKWAI will blow up. If the DECnet link goes away first,
LOKWAI returns to LOKCDB, which returns to its caller with the TDB
still unlocked. When the caller tries to unlock the unlocked TDB,
ULKBADs can happen. It is really a bad idea for CTERM to unlock
something it didn't lock in the first place.

Solution:    Do NOT try to unlock the TDB while waiting around for the
CDB lock.


                               [End of TCO 6.1.1370]

                               TCO-number:  6.1.1371



Written-by:  MELOHN                           Creation-date:   7-May-85 17:55:24


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	CTHSRV	CTERMD




Problem:    CTERM gets wedged such that it will not accept any new
connections and its current connections are hung.

Diagnosis:    Another problem with the CTERM interlock scheme between the
CTERM fork, which runs at process level, and TTYSRV TDCalls which want
to mung around in the CDB. 

Many error conditions (DECnet link gone, CTERM protocol error, etc)
decide to "blow the link away". As part of this process, the CDB is
deallocated. If however the CTERM line requires service, the CTERM
fork (which is always given the CTERM lock) will hang trying to mung
the deallocated CDB.

Solution:    Add a new state for the CDB - .STDEL. Instead of "blowing
the link away", put the CDB in this state and tell the CTERM fork to
dispose of it as part of its service routine.
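
The deferred-disposal idea above can be sketched roughly as follows. This is
an illustration under assumptions, not the CTHSRV code: the CDB class, the
state values and the function names are invented for the example.

```python
ST_RUN = "run"
ST_DEL = "delete-pending"  # stands in for the new .STDEL state


class CDB:
    def __init__(self, name):
        self.name = name
        self.state = ST_RUN
        self.serviced = 0


def blow_link_away(cdb):
    """Error path: mark the CDB for deletion instead of deallocating it."""
    cdb.state = ST_DEL


def cterm_fork_service(cdbs):
    """One service pass of the CTERM fork.

    Marked CDBs are dropped here, and only here, so the fork never touches
    a CDB that some other path has already released.
    """
    survivors = []
    for cdb in cdbs:
        if cdb.state == ST_DEL:
            continue  # dispose of it as part of the service routine
        cdb.serviced += 1
        survivors.append(cdb)
    return survivors
```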


                               [End of TCO 6.1.1371]

                               TCO-number:  6.1.1372



Written-by:  GROSSMAN                         Creation-date:   9-May-85 09:30:03


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYKNI




Problem:  ILMNRF's, ILLUUO's, and various other assorted problems.

Diagnosis:  A bad KLNI microcode resulted in the generation of an error
response which was unknown.  The error code was used to index into a
dispatch table.  Unfortunately, the entry for the error code had no IFIW,
and subsequently caused PHYKNI (in section 6) to jump somewhere into
section 0, resulting in total chaos.

Solution:  Add IFIW's to all the dispatch words in the error dispatch table.
Now, if we get an unknown error code a KNIIEC BUGHLT will result.


                               [End of TCO 6.1.1372]

                               TCO-number:  6.1.1373



Written-by:  LOMARTIRE                        Creation-date:   9-May-85 15:28:59


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	DSKALC	MSTR




Problem:    
     If  a  structure  which  was previously mounted is re-created, MOUNTR will
refuse to mount it due to an ambiguous ID. The ID for  the  disk  is  found  in
UDBMID and is returned to MOUNTR via an MSTR%.

Diagnosis:    
     Whenever  a homeblock is created, a new media ID is written on it. This ID
is not placed in the UDB (or SDB) and this causes confusion.

Solution:    
     Make CRTHOM and MAKHOM smarter and have them update UDBMID and SDBPUC when
a new media ID is created and written on the homeblocks.


                               [End of TCO 6.1.1373]

                               TCO-number:  6.1.1374



Written-by:  LOMARTIRE                        Creation-date:   9-May-85 15:33:49


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	CFSSRV


Related-QAR:  	838252



Problem:    
     CFRECN BUGHLTs as a result of a large configuration.

Diagnosis:    
     Currently, the delay time set by CFS is not large enough to handle a large
configuration.  These  sites  will experience a CFRECN whenever their port dies
due to the amount of time it takes to reestablish all the required connections.

Solution:    
     Make  the  delay  a function of the configuration size. Wait 10 seconds to
reload the CI microcode (very generous) and 5 seconds per node on the CI.
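
The delay rule above works out as follows (a sketch; the function and
constant names are assumptions, only the 10 and 5 second figures come from
the TCO):

```python
RELOAD_SECONDS = 10    # allowance for reloading the CI microcode
PER_NODE_SECONDS = 5   # allowance for each node on the CI


def cfs_delay_seconds(ci_nodes):
    """Delay before declaring CFRECN, as a function of configuration size."""
    if ci_nodes < 0:
        raise ValueError("node count cannot be negative")
    return RELOAD_SECONDS + PER_NODE_SECONDS * ci_nodes
```

For example, a CI with 16 nodes would get 10 + 5*16 = 90 seconds.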


                               [End of TCO 6.1.1374]

                               TCO-number:  6.1.1375



Written-by:  LOMARTIRE                        Creation-date:   9-May-85 16:33:26


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	CFSSRV




Problem:    
     When  two  systems  are  coming  up  simultaneously, they will always have
different times.

Diagnosis:    
 When both systems are waiting in "Enter date and time:", the system with the 
larger serial number is supposed to broadcast its time to the systems on the
CI with a lower serial number.  However, the routine is missing an index
register in a key place and so we never broadcast to anyone.

Solution:    
     Add  the index register in routine BRDTIM. Note that a bug still exists if
you proceed the higher numbered system first. In this case, the lower one  will
proceed  second  and  it  will  not  broadcast to the higher. Once again a time
mismatch. However, the solution to this one is left until later.


                               [End of TCO 6.1.1375]

                               TCO-number:  6.1.1376



Written-by:  GROSSMAN                         Creation-date:   9-May-85 16:58:49


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	STG	PHYKNI	GLOBS




Problem:  Undefined global symbols when KNIN=0.  The symbols DLLUNI, LLMRSJ,
LLMRSF, NIJKFK, and LLMINI are all undefined at LINK time when STG is
compiled with KNIN=0.

Diagnosis:  Oops.  Several misspellings.  Now, DLLUNI will return an error
code of UNIFC% (invalid function code) if NISRV is not loaded.


                               [End of TCO 6.1.1376]

                               TCO-number:  6.1.1377



Written-by:  GRANT                            Creation-date:  12-May-85 08:34:47
Edited-by:   GRANT                            Edit-date:      12-May-85 08:41:02


Edit-checked:         No     Document:          No     TCO-tested:  Yes
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	MEXEC	phyklp


Related-QAR:  	838252



Problem:      KLPNRL BUGHLTs

Diagnosis:      Any number of deadly embraces can occur when there is heavy CI
	activity and TOPS-20 tries to reload the CI microcode.  These are all
	related to the monitor's having to read in IPALOD and run it.

Solution:      At system startup, read the CI microcode into resident memory.
	Then, whenever the monitor needs to reload the CI, it can simply do the
	DATAOs itself.


                               [End of TCO 6.1.1377]

                               TCO-number:  6.1.1378



Written-by:  GRANT                            Creation-date:  12-May-85 08:52:10
Edited-by:   GRANT                            Edit-date:      12-May-85 08:53:38


Edit-checked:         No     Document:          No     TCO-tested:  Yes
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	phyklp




Problem:    None observed, but code is wrong.

Diagnosis:    At system startup, code currently does a CONO KLP,400000 and then
	does a CONI to verify that the thing in RH slot 7 is really a KLIPA.
	If an RH20 were in slot 7, it just got its PIA zapped.

Solution:    Verify KLIPAness before shooting it.


                               [End of TCO 6.1.1378]

                               TCO-number:  6.1.1379



Written-by:  GRANT                            Creation-date:  12-May-85 09:01:36
Edited-by:   GRANT                            Edit-date:      12-May-85 09:05:46


Edit-checked:         No     Document:          No     TCO-tested:  Yes
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	phyklp




Problem:    Unnecessary BUGCHKs, namely, KLPNMG and KLPNDG.

Diagnosis:    PHYKLP outputs a BUGCHK if it is called to remove a buffer from an
	empty free queue.  This is unnecessary since it gets a buffer from the SCA
	pool and returns that to the caller;  if this should fail, the caller
	will do whatever it feels necessary, including BUGxxxing.

Solution:    Put KLPNMG and KLPNDG under CIBUGX control.


                               [End of TCO 6.1.1379]

                               TCO-number:  6.1.1380



Written-by:  PAETZOLD                         Creation-date:  12-May-85 13:54:19


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  monitor
  Routines-affected:   	mnetdv	STG




Problem:    

Normal NIC supplied host table no longer fits.

Diagnosis:    

This time the file has grown to a point larger than the code driving the
existing data structures is capable of handling.  Most hosts now have
multiple addresses.

Solution:    

Make HOSTN twice as large as NHOSTS.  Fix code in MNETDV that assumes
all tables are of the same length.  Also change the initialization code,
which used to be a loop of setzm's, to use blt's.


                               [End of TCO 6.1.1380]

                               TCO-number:  6.1.1381



Written-by:  GLINDELL                         Creation-date:  13-May-85 11:07:00
Edited-by:   GLINDELL                         Edit-date:      13-May-85 11:20:08


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	SCJSYS	JSYSF


Related-QAR:  	838154



Problem:    
SWJFN% jsys does not work properly with DECnet JFN's.

Programs that do SWJFN% involving DECnet JFN's get spurious IO data
errors (IOX5).  Jobs which run such programs will eventually get DCNX5
(No more logical links available) errors upon trying to create such
links.

Diagnosis:    
Not all cells of the JFN blocks are swapped.  One of the cells it does
not swap is the cell that contains the index of the DECnet-36 channel
represented by a DECnet JFN.

Solution:    
Have SWJFN% swap all cells of a JFN block, removing the list of cells
to be swapped in favour of a simple loop swapping all cells.  Also call
a routine in SCJSYS to reevaluate the channel numbers for the swapped
JFN's.

Thank you Rob Gingell for analysis and suggested fix.

BTW, SWJFN% can never have worked for any devices that use the new IO
byte pointer stuff (Arpanet?).  This will hopefully fix that as well.
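
The "simple loop swapping all cells" from the Solution can be sketched like
this. It is an illustration only: the flat table layout, the offsets and the
`reevaluate` callback (standing in for the SCJSYS routine) are assumptions.

```python
def swap_jfn_blocks(jfn_table, a, b, block_len, reevaluate=None):
    """Swap every cell of two JFN blocks stored in one flat table.

    a and b are the starting offsets of the two blocks.  Swapping all
    cells, rather than a hand-kept list of cells, means a newly added
    cell can never be forgotten.
    """
    if a != b and abs(a - b) < block_len:
        raise ValueError("JFN blocks overlap")
    for i in range(block_len):
        jfn_table[a + i], jfn_table[b + i] = jfn_table[b + i], jfn_table[a + i]
    if reevaluate is not None:
        # Let the network code fix up anything keyed on the JFN itself.
        reevaluate(a)
        reevaluate(b)
```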


                               [End of TCO 6.1.1381]

                               TCO-number:  6.1.1382



Written-by:  PALMIERI                         Creation-date:  13-May-85 11:57:51


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  Monitor
  Routines-affected:   	SCJSYS


Related-QAR:  	838222



Problem:    Can't get JFN on "SRV:" or "SRV:objectid.taskname".

Diagnosis:    Code doesn't parse network JFNs correctly.  Also doesn't
	allow non-privileged user to open generic TASK.

Solution:    Rewrite the SRV parsing code.


                               [End of TCO 6.1.1382]

                               TCO-number:  6.1.1384



Written-by:  GLINDELL                         Creation-date:  13-May-85 17:23:42
Edited-by:   GLINDELL                         Edit-date:      13-May-85 17:30:11


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	LLINKS




Problem:        
No problem, except that we could be a little nicer and not exercise
an RSX DECnet bug.

Diagnosis:        
Currently we send a 'flow off' when our local buffer resources are
completely depleted.  It is likely that we will drop incoming messages
that were already under way on the floor.  This is ok, the other end
should retransmit these.  However, unless we are congested we may as
well keep an extra few messages around.  The "goal" concept in LLINKS
was supposed to address this, but it was never fully implemented.

Solution:        
Implement a static goal, defined by the value of NSGOAL (currently 8).
Unless the system is congested, the following will now happen if messages
come in faster than we can process them:

1) Green zone: we accept messages up to the user's quota and put them on
   the user's queues.

2) If more messages come in than fit in the user's quota, then we send
   a 'flow off' and enter yellow zone.  We continue to accept messages
   up till NSGOAL.

3) If we run out of NSGOAL, we enter red zone and drop all incoming messages.

If the system is already globally congested, then we skip yellow zone and
go directly to red zone.
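
The three zones can be sketched as a small decision function. This is one
reading of the steps above, not the LLINKS code: it assumes NSGOAL counts
messages held beyond the user's quota on one link, and that a 'flow off' is
sent on the first message past the quota even when the system is congested.

```python
NSGOAL = 8  # static goal, value taken from the Solution text


def on_message(held, quota, congested=False, goal=NSGOAL):
    """Decide what to do with a message arriving on one link.

    held  -- messages already held for this link
    quota -- the user's quota for the link
    Returns (zone, accepted, send_flow_off).
    """
    if held < quota:
        return "green", True, False
    # First message beyond the quota: ask the sender to stop (assumed
    # to apply in the congested case as well).
    send_flow_off = held == quota
    if congested or held - quota >= goal:
        return "red", False, send_flow_off
    return "yellow", True, send_flow_off
```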

Also, remove all the old 'goal' stuff and put it under feature switch
FTGOL in LLINKS.  Leave the data fields in the ELB and SLB in case we want
to change again, but these fields should probably be removed before SDC.


                               [End of TCO 6.1.1384]

                               TCO-number:  6.1.1385



Written-by:  GLINDELL                         Creation-date:  13-May-85 17:36:57
Edited-by:                                    Edit-date:      13-May-85 17:37:30


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	LLINKS




Problem:    
None observed, but we could send ACK's faster.

Diagnosis:    
LLINKS used the concept of "buffer-rich" sublinks in order to delay
sending ACK's till clock level, hoping that more messages would come
in and therefore making it possible to ack more than one message.

However, in NSP 4.1, the ACK DELAY concept takes over this function
and removes the need for "buffer-rich" sublinks.  Indeed, when we do
get a message that actually requests an ACK, then we should ack immediately
and not defer it.

Solution:    
Remove "buffer-rich" code and put it under feature switch FTBFR.
Leave the data structures in place in case we want to put it back.  However,
the BFR subfield of the ES structure should probably be removed before
SDC ship.


                               [End of TCO 6.1.1385]

                               TCO-number:  6.1.1387



Written-by:  NICHOLS                          Creation-date:  14-May-85 11:37:34


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	LLINKS




Problem:  When LLINKS receives a DECnet Link Service message from
a remote system using No Flow Control, it rejects the message if the
FCVAL field is non-zero.  The Phase IV+ architecture may make use
of non-zero values here.

Diagnosis:  LLINKS is older than Phase IV+

Solution:  Remove the check.

                               [End of TCO 6.1.1387]

                               TCO-number:  6.1.1390



Written-by:  MELOHN                           Creation-date:  14-May-85 16:55:51


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	CTHSRV


Related-TCO:  	6.1.1370



Problem:    ULKBADs whenever logging in on CTERM TTYs since TCO 6.1.1370 was added.

Diagnosis:    I removed the global case of locking and unlocking the
TDB, however a few cases remained which fool around with TLOCKing.
None of these seems to be necessary since CTERM, LAT, and NRT TDBs are
marked as non-deletable and so the TLOCK count is meaningless.

Solution:    Remove the rest of the TLOCK/ULKTTYs from CTHSRV.


                               [End of TCO 6.1.1390]

                               TCO-number:  6.1.1391



Written-by:  PALMIERI                         Creation-date:  15-May-85 22:07:12


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  Monitor
  Routines-affected:   	ROUTER




Problem:    NMLT20 can hang in SHOW CIRCUIT NI-0-0 COUNTERS

Diagnosis:    NMLT20 is dismissed while NISRV asks port to return counter data.
	If the port dies and cannot be restarted the counters may never be
	returned.  NISRV will not respond to the read counters request until
	the portal is closed.

Solution:    Whenever NISRV reports that the port has died close the DECnet
	portal which will cause NISRV to return an error to the read counters
	request and this will be enough to satisfy the scheduler test and get
	NML running again.


                               [End of TCO 6.1.1391]

                               TCO-number:  6.1.1392



Written-by:  PALMIERI                         Creation-date:  15-May-85 22:19:32


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  Monitor
  Routines-affected:   	ROUTER




Problem:    Protocol initialization to MCB DTE often fails and is never retried.

Diagnosis:    First attempt to restart MCB protocol after reload of KL succeeds
	and Router is told that the line is okay to use.  However, the first
	message that Router tries to queue to DTESRV fails because the MCB
	has terminated protocol. (Don't know why it does this).  Router
	then notifies NMLT20 of the termination and closes that circuit
	waiting for the next pass through the once a second code to reopen it.
	If NMLT20 attempts to restart protocol before the next second, DECnet
	will not act on a protocol up since the circuit is closed.  NMLT20 will
	not try again and the circuit will never come up.

Solution:    Add TOPS20 specific event 96.7 which will notify NMLT20 that Router
	is attempting to use the circuit.  NML will then re-initialize MCB
	protocol.


                               [End of TCO 6.1.1392]

                               TCO-number:  6.1.1393



Written-by:  LOMARTIRE                        Creation-date:  16-May-85 12:06:18


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PAGEM


Related-QAR:  	838336



Problem:    
     Code is wrong in NWRBS. This could cause DDMP to hang in WTFOD.

Diagnosis:    
     A SKIPN is done over a TMNN macro which expands to 2 instructions.

Solution:    
     Add an IFSKP.


                               [End of TCO 6.1.1393]

                               TCO-number:  6.1.1394



Written-by:  LOMARTIRE                        Creation-date:  16-May-85 12:16:27


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	GTJFN


Related-QAR:  	838295



Problem:    
     File  extension  recognition/completion  does  not work when wildcards are
specified in the filespec.

Diagnosis:    
     It appears that routine ENDALZ was rewritten to use IFSKPs and friends. In
the  process  a  semi-colon was mistakenly inserted in the mask field of a TXNN
instruction. This causes the instruction not to skip when it should. So instead
of returning ambiguous, it completes the field with a bogus extension.

Solution:    
     Remove the semi-colon.


                               [End of TCO 6.1.1394]

                               TCO-number:  6.1.1395



Written-by:  LOMARTIRE                        Creation-date:  16-May-85 12:29:32


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PAGUTL


Related-QAR:  	838294



Problem:    
     When  an  OPENF%  is done on a system which has no more OFNs, the user may
receive a garbage error code.

Diagnosis:    
     When failing, the correct error code is not always set up.

Solution:    
     Fix routine BGCTYP to always return OPNX10.


                               [End of TCO 6.1.1395]

                               TCO-number:  6.1.1396



Written-by:  NICHOLS                          Creation-date:  16-May-85 21:18:15
Edited-by:   NICHOLS                          Edit-date:      16-May-85 21:31:43


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	LLINKS




Problem:      Hung DECnet link

Diagnosis:    
This node sends a message to another node which is using No Flow Control.  The
remote node cannot accept the message yet and sends back a Link Service OFF
message.  Then this node's application asks for a synchronous disconnect,
which requires that all outstanding messages be ACKed before the Disconnect
Initiate message is sent.  LLINKS puts the logical link into Disconnect
Initiate state to prevent the application from sending any more messages.  If
the remote node never sends a Link Service ON message, LLINKS will wait forever
to send the Disconnect Initiate message and close the logical link.

Normally such a hung link is detected by sending Link Service messages with
zero data requests and then timing out the ACK.  LLINKS was only doing this
"pinging" for links in the RUN state.

Solution:  

Check for logical link inactivity in DI state as well as RUN state.
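
In outline, the change widens the set of states that get the inactivity
"ping". A sketch, with state names as strings and the timeout parameter
assumed for the example:

```python
# States in which an idle link is probed; DI is the one added by this change.
PINGED_STATES = frozenset({"RUN", "DI"})


def needs_ping(state, idle_seconds, inactivity_timeout):
    """True if an idle logical link in this state should be probed."""
    return state in PINGED_STATES and idle_seconds >= inactivity_timeout
```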


                               [End of TCO 6.1.1396]

                               TCO-number:  6.1.1397



Written-by:  MCCOLLUM                         Creation-date:  17-May-85 15:08:44
Edited-by:   MCCOLLUM                         Edit-date:      17-May-85 15:09:17


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	FORK


Related-QAR:  	838213	838214



Problem:  
CFORK does not clean up on certain resource exhausted errors.

Diagnosis:  
If CFORK gets a resource exhaustion error after it has assigned a system
and job wide fork handle, it returns to the user without killing the
newly created fork.

Solution:  
Kill the newly created fork when resource exhaustion errors are
encountered.
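
The cleanup rule can be sketched as below. This is an illustration under
assumptions, not the FORK module: the exception class and the three callables
are hypothetical stand-ins for the monitor's internal steps.

```python
class ResourceExhausted(Exception):
    """Raised when a later step of fork creation runs out of a resource."""


def create_fork(make_fork, assign_resources, kill_fork):
    """Create a fork, killing it again if a later step fails.

    make_fork        -- creates the fork and assigns its handles
    assign_resources -- later setup that may raise ResourceExhausted
    kill_fork        -- undoes make_fork
    """
    fork = make_fork()
    try:
        assign_resources(fork)
    except ResourceExhausted:
        kill_fork(fork)  # do not leave the half-built fork behind
        raise
    return fork
```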


                               [End of TCO 6.1.1397]

                               TCO-number:  6.1.1398



Written-by:  MELOHN                           Creation-date:  17-May-85 18:03:16


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	TTYSRV




Problem:  LAT lines are considered "local".

Diagnosis:  No LAT line type was set up; consensus was that they should be
grouped with remote lines, not local lines.

Solution:  Make them remote lines; change the remote line message from
?LOGGING IN ON DATASETS IS NOT ALLOWED to
?LOGGING IN ON REMOTE TERMINALS IS NOT ALLOWED.



                               [End of TCO 6.1.1398]

                               TCO-number:  6.1.1399



Written-by:  MOSER                            Creation-date:  20-May-85 13:37:42


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PAGUTL




Problem:   OFNBDB BUGHLTs when running CFS with the OFN performance mods.

Diagnosis:   This is a long file/CFS bug. The problem arises because the
OFN database SPTO4 disagrees with the data provided by a caller of ASNOFN.
Examination of the dump reveals that the caller expects to assign an OFN for
second level index block 0 in a long file. The system data reflects the fact
that the OFN currently exists and is a regular index block in a long file. When
a file goes from short to long the OFN database is updated and so this is
unexpected.

Under CFS the problem can arise when:

 - Users G and C running on systems G and C respectively each open the same
   file when it is short.

 - User G extends the file so it goes long (correctly updating G's local data)

 - User P running on C now opens the same file again (it is long since G
   extended it but C's database does not know)!


Solution:  Expect this to happen and update SPTO4 when it does instead of 
crashing. OFNBDB will remain for other real errors.


                               [End of TCO 6.1.1399]

                               TCO-number:  6.1.1401



Written-by:  MCCOLLUM                         Creation-date:  20-May-85 14:23:38


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	JSYSA


Related-QAR:  	838367



Problem:  
The BREAKI BUGINF provides the structure index of the structure under
"attack". Since the index is a dynamically assigned value, after-the-fact
analysis of this BUGINF can be difficult.

Diagnosis:  
Same.

Solution:  
Change the value provided in the additional data to the sixbit structure
name.


                               [End of TCO 6.1.1401]

                               TCO-number:  6.1.1402



Written-by:  MCCOLLUM                         Creation-date:  20-May-85 14:47:38


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	MONSYM


Related-QAR:  	838366



Problem:  
Customers need reserved error codes in MONSYM to support local additions
to TOPS-20.

Diagnosis:  

Solution:  
Reserve a block of 1000 (octal) error codes from 6000 to 6777 for
customer use.
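
A quick check of the arithmetic and of how a range test might look. The
names are made up for the example, and the TCO does not say whether these
values are offsets from an error-code base, so they are used here exactly as
written.

```python
CUSTOMER_FIRST = 0o6000  # first reserved code, octal, as given above
CUSTOMER_LAST = 0o6777   # last reserved code, octal, as given above


def customer_code_count():
    """Size of the reserved block."""
    return CUSTOMER_LAST - CUSTOMER_FIRST + 1


def is_customer_error_code(code):
    """True if `code` falls in the block reserved for customer use."""
    return CUSTOMER_FIRST <= code <= CUSTOMER_LAST
```

The block size comes out to 1000 octal (512 decimal), matching the text.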


                               [End of TCO 6.1.1402]

                               TCO-number:  6.1.1403



Written-by:  MELOHN                           Creation-date:  23-May-85 15:13:48


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	LATSRV




Problem:    SET TER NO PAU COMMAND resulted in the LAT server clearing
input flow control but not output flow control. Likewise setting TER
PAU COMMAND did not reset output flow control.

Diagnosis:    The XON/XOFF characters in the DATA_B slot were shifted
incorrectly in the slot data information.

Solution:    Flush the shifting in favor of a canned message which
supplies the correct input and output flow control characters.


                               [End of TCO 6.1.1403]

                               TCO-number:  6.1.1405



Written-by:  MCCOLLUM                         Creation-date:  24-May-85 12:13:46


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	scsjsy




Problem:  
SCSFR1 BUGCHKs.


Diagnosis:  
When SCSKIL is called to close the CI connections of a fork, an SCSFR1
BUGCHK will result if the fork number provided is not the current fork.
Currently this can only occur when a CLZFF% JSYS is performed with a
fork handle other than the current fork as its argument.


Solution:  
If the fork number passed to SCSKIL does not match the current fork, 
let the caller get away with it, but don't do any actual work. Note
that this is not the proper solution, but will do for now. The correct
solution would be to make SCSKIL close the CI connections of any fork
that is an inferior of the current fork.



                               [End of TCO 6.1.1405]

                               TCO-number:  6.1.1406



Written-by:  MCCOLLUM                         Creation-date:  24-May-85 13:36:03


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	ALL	MODULES	AND	UTILITIES




Problem:  
The copyrights need updating.

Diagnosis:  

Solution:  
Do it.


                               [End of TCO 6.1.1406]

                               TCO-number:  6.1.1407



Written-by:  PALMIERI                         Creation-date:  24-May-85 15:03:50


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  Monitor
  Routines-affected:   	ROUTER


Related-QAR:  	838359



Problem:    Partial routing update loss event does not show correct highest
	address.

Diagnosis:    No code to search for highest address so garbage is displayed.
	Also the rest of message causing the update loss is not being pro-
	cessed and some routing information may be lost.

Solution:    Process all of message and remember highest address.


                               [End of TCO 6.1.1407]

                               TCO-number:  6.1.1408



Written-by:  PALMIERI                         Creation-date:  24-May-85 15:10:46


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  Monitor
  Routines-affected:   	DNADLL




Problem:    DECnet is in a funny state if it tries to use the Ethernet when the
	physical address is not DECnet's.

Diagnosis:    Code to check for DECnet address is commented out.

Solution:    Enable the checking and add robustness to it.


                               [End of TCO 6.1.1408]

                               TCO-number:  6.1.1409



Written-by:  GRANT                            Creation-date:  28-May-85 09:29:50
Edited-by:   GRANT                            Edit-date:      28-May-85 09:30:45


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	STG


Related-QAR:  	838288



Problem:    PHYICE and MSCTMU BUGCHKs.

Diagnosis:    Not enough section 0/1 resident free space in the Units Pool.

Solution:    Add another page.


                               [End of TCO 6.1.1409]

                               TCO-number:  6.1.1410



Written-by:  NICHOLS                          Creation-date:  28-May-85 15:21:45


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	llinks




Problem:  DECnet links hang if device drivers lose messages

Diagnosis:  LLINKS waits for all output message blocks to be returned from
the lower layers before it will close a link.  It now appears that there
are cases in which the KLIPAs can crash in such a way that the drivers
cannot return all the output messages reliably.

Solution:  Allow links to close even if the out-in-router count is non-zero
and check all output done messages to see that the corresponding link block
is still active.  For debugging purposes, a new switch FTORC can be set
non-zero to make links wait for output completion.


                               [End of TCO 6.1.1410]

                               TCO-number:  6.1.1411



Written-by:  MOSER                            Creation-date:  28-May-85 15:57:47
Edited-by:   MOSER                            Edit-date:      28-May-85 17:18:33


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	FORK	STG	GLOBS




Problem:       FLKTIM does not actually time out the fork lock. It simply 
reports the problem without unlocking. This is desirable when 
debugging but is a problem for customers.

Diagnosis:       We recode this every release and it is a BIG DRAG.

Solution:      Fix this ONCE AND FOR ALL. The following rules now apply:

 -  DEBUG monitor - never unlock

 -  NON-DEBUG monitor and DBUGSW <> 0 - Never unlock

 -  otherwise (no debugging of any kind) unlock when FLKTIM occurs. 

It may be desirable to change the timeout value from the current 2 minutes.
Make this a parameter, FLKTMV, in STG and set it to 2 minutes as the default.
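
The three rules above can be sketched as a small decision function. This
is an illustrative Python model, not monitor code: the function name
should_unlock and its parameters are hypothetical, and only DBUGSW,
FLKTIM, and FLKTMV come from the TCO.

```python
# Default timeout named in the TCO (FLKTMV): 2 minutes, here in seconds.
FLKTMV_SECONDS = 2 * 60


def should_unlock(debug_monitor: bool, dbugsw: int) -> bool:
    """Return True if the fork lock should be broken when FLKTIM occurs.

    debug_monitor: hypothetical flag, True for a DEBUG monitor build.
    dbugsw: value of DBUGSW; non-zero means some debugging is enabled.
    """
    if debug_monitor:
        return False  # DEBUG monitor: never unlock
    if dbugsw != 0:
        return False  # NON-DEBUG monitor with DBUGSW <> 0: never unlock
    return True       # no debugging of any kind: unlock


if __name__ == "__main__":
    for build, sw in [(True, 0), (False, 1), (False, 0)]:
        print(build, sw, should_unlock(build, sw))
```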


                               [End of TCO 6.1.1411]

                               TCO-number:  6.1.1413



Written-by:  GROSSMAN                         Creation-date:  28-May-85 16:57:30


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYKNI




Problem:  NISRV's Read Portal Info function (NU.RPI) does not return list of
multicast addresses correctly for portals that have more than one multicast
enabled.

Diagnosis:  Too complicated to explain, and not worth listening to.

Solution:  An ADDI and a JUMPE in the correct places in NIRPI.


                               [End of TCO 6.1.1413]

                               TCO-number:  6.1.1414



Written-by:  GROSSMAN                         Creation-date:  28-May-85 17:04:16


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	NISRV




Problem:  Return Portal Info (NU.RPI) function of NISRV did not check
portal ID's correctly.  It also was not returning the User ID.

Diagnosis:  Oops.  Forgot to set the FC.POR bit in the NISRV function dispatch.


                               [End of TCO 6.1.1414]

                               TCO-number:  6.1.1415



Written-by:  GROSSMAN                         Creation-date:  28-May-85 17:31:42


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	NIUSR




Problem:  The NI% JSYS didn't deal with global portal ID's correctly.
Also, the function NU.RPI did not return multicast address list in the
correct format.

Diagnosis:  Designer brain damage.  Monitor portal ID's are fullwords, and user
portal ID's are halfwords.  I had to figure out some way to identify monitor
portals to users.  Just returning the address was not good enough.

Solution:  Invent an "external" portal ID for monitor portals.  This ID is
created by NISRV, and is guaranteed to be unique.  This id is also guaranteed
to fit into 18 bits.  This ID is what the NI% JSYS will deal with when talking
about monitor based portals.
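
One way to picture the "external" portal ID is a table that hands out
small unique numbers for fullword monitor portal IDs. The sketch below is
an assumption for illustration only: the TCO does not say how NISRV
generates the ID, and the class, its counter scheme, and its method names
are hypothetical.

```python
HALFWORD_MAX = 0o777777  # largest value that fits in 18 bits


class ExternalPortalIds:
    """Map fullword monitor portal IDs to unique 18-bit external IDs."""

    def __init__(self):
        self._by_external = {}
        self._next = 1  # assumption: IDs are handed out by a simple counter

    def assign(self, monitor_portal_id: int) -> int:
        """Create a new external ID for a monitor portal."""
        if self._next > HALFWORD_MAX:
            raise OverflowError("out of 18-bit external portal IDs")
        external = self._next
        self._next += 1
        self._by_external[external] = monitor_portal_id
        return external

    def lookup(self, external: int) -> int:
        """Translate an external ID back to the monitor portal ID."""
        return self._by_external[external]
```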


                               [End of TCO 6.1.1415]

                               TCO-number:  6.1.1416



Written-by:  PALMIERI                         Creation-date:  29-May-85 15:07:04


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  Monitor
  Routines-affected:   	SCJSYS


Related-QAR:  	838141



Problem:    If a user issues a SOUT/SOUTR before the connect confirm is received
	on a DCN link SCJSYS calls SCLINK without waiting for the connect
	confirm.  Possible race condition if SCJSYS blocks in IMPWAT.

Diagnosis:    No code to wait for connect confirm.  IMPWAT does not revalidate
	JFN before returning to caller.

Solution:    Add routine WCCFRM to await the confirm before calling SCLINK with
	the users buffer.  Have IMPWAT call SCLFNU to revalidate the JFN be-
	fore returning to caller.


                               [End of TCO 6.1.1416]

                               TCO-number:  6.1.1417



Written-by:  GROSSMAN                         Creation-date:  29-May-85 15:34:09
Edited-by:   GROSSMAN                         Edit-date:      30-May-85 14:25:39


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	APRSRV


Related-QAR:  	838362



Problem:  Names of BUGHLTs are messed up in ERROR.SYS if the BUGHLT occurs in
XCDSEC.

Diagnosis:  The routine BUGH0 (BUGHLT processor) was using a HRRZ to fetch
the PC of the BUGHLT when passing it on to the SYSERR block generator.

Solution:  Load the full PC by using the mask EXPCBT when calling BUGSTO.
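
HRRZ keeps only the right half (low 18 bits) of a word, so a PC in a
non-zero section loses its section number. The Python sketch below shows
the effect. The 30-bit mask stands in for EXPCBT and is an assumption, as
are the function names and the example section number.

```python
RIGHT_HALF = 0o777777            # what HRRZ keeps: the low 18 bits
ASSUMED_PC_MASK = (1 << 30) - 1  # assumption: section number plus offset


def pc_via_hrrz(word: int) -> int:
    """Model the old behaviour: only the in-section offset survives."""
    return word & RIGHT_HALF


def pc_via_mask(word: int) -> int:
    """Model the fix: keep the section number as well as the offset."""
    return word & ASSUMED_PC_MASK


# Hypothetical BUGHLT PC: section 6, offset 1234 (octal).
pc = (0o6 << 18) | 0o1234
print(oct(pc_via_hrrz(pc)), oct(pc_via_mask(pc)))
```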


                               [End of TCO 6.1.1417]

                               TCO-number:  6.1.1418



Written-by:  GLINDELL                         Creation-date:  30-May-85 16:27:04


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	ntman




Problem:  
Event block is not deallocated if user requested a signal and the
signal queue was full (very unlikely).

Diagnosis:  

Solution:  
Call DNFWDS to deallocate event block if signal queue is full.
Thank you Bill Davenport.


                               [End of TCO 6.1.1418]

                               TCO-number:  6.1.1419



Written-by:  MOSER                            Creation-date:  31-May-85 15:59:30


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PAGEM




Problem:   SKDPF1 when extended addressing (mapping sections indirect) and
working set preloading.

Diagnosis:  Similar problem as TCO 6.2000. The monitor looks at entries in
the working set cache but may page fault on another forks PSB. What is
desired is to remove that page from the WSC. We crash trying to look at the
PSB that is swapped out.

Solution:  Change PRELW1 to call FPTAXP and expect to get -1,,SPT indicating
SPX is not in core (or would cause a PF). If this is the case then delete the
page from the working set cache.


                               [End of TCO 6.1.1419]

                               TCO-number:  6.1.1420



Written-by:  TBOYLE                           Creation-date:   3-Jun-85 11:31:21
Edited-by:   TBOYLE                           Edit-date:       3-Jun-85 11:49:34


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	DIAG




Problem:      

	ILULK4 crashes when killing the diagnostic monitor D20MON
under certain circumstances.

Diagnosis:      

	When ports CI and NI are set unavailable, they are both
in maintenance mode. This situation confuses the DGFKIL routines
which heretofore did not expect such a situation. The routines
DGEXRL and DGUNLK don't check to see that the lock word is 
no longer in use on the second call to release resources.
Further confusion resulted when making pointers out of -1.

Solution:      

	Add an additional paranoia check to the routines. Have them
check to see if the DIAG lock is no longer in use.


                               [End of TCO 6.1.1420]

                               TCO-number:  6.1.1421



Written-by:  TBOYLE                           Creation-date:   3-Jun-85 11:50:24
Edited-by:   TBOYLE                           Edit-date:       3-Jun-85 12:00:18


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PAGEM	MONSYM




Problem:    

	Monitor hangs.

Diagnosis:    

	This TCO replaces TCO 6.1.1297 which was supposed to fix PTNON0
BUGHLTs. This new scheme was necessary because CFSWUP likes to go
OKSKED at times to get its work done when NOSKED is in its way. The
callers of CFSWUP are usually never aware of this behaviour.

Solution:    

	Work around the behaviour of CFSWUP. Remove the extra NOSKED
and OKSKED in MSETPT. Have SETPT0 inform its caller if the page-slot
set failed. Have caller in MSETPT retry if this happens. Teach
the other callers to SETPT0 to crash as they did before with PTNON0
when the page-slot set fails.


                               [End of TCO 6.1.1421]

                               TCO-number:  6.1.1423



Written-by:  WAGNER                           Creation-date:   4-Jun-85 11:30:35


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	SCHED	phymvr	STG	globs




Problem:  MSCP server gets higher priority than it should. Party line says that
	 served disks are there for customer convenience, hence we should not
	 give them the priority that they get.

Diagnosis:  We call MSSCHK to check for server requests within 20mS of the
	 Server requesting to be scheduled. Currently the server requests
	 this by AOSing SRVSKD, which the scheduler notices at the next
	 short cycle, and therefore will always check the server within
	 20 mS, removing the currently running fork more often than not.

	 Adding insult to injury, the server already has a mechanism to
	 be called when needed, but at the 100mS long cycle through CLK2.
	 This is done by SETZMing MSSTIM, which is noticed in the long
	 cycle, and MSSCHK is called at that time too!

	 We only need to call MSSCHK every long cycle, there is no need
	 to forcibly dismiss the currently running process within 20mS
	 because of a server request.


Solution:  Get rid of the SRVSKD flag, replacing AOSes of it with SETZMs of
	 MSSTIM. This is done in a new routine MSSCZK (MSS Czech...).
	 This will make it check server requests only when other running forks
	 have been forcibly dismissed anyway.

	 Note: This has the pleasant side effect of reducing the overhead that
	 running DUMPER from another system to our served disks has on other
	 jobs on the poor server system. If the server system is standalone
	 no effect in DUMPER performance is seen.


                               [End of TCO 6.1.1423]

                               TCO-number:  6.1.1424



Written-by:  GROSSMAN                         Creation-date:   5-Jun-85 10:23:00
Edited-by:   GROSSMAN                         Edit-date:       5-Jun-85 10:44:20


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYKNI	NIPAR


Related-QAR:  	838380



Problem:    Not enough information is returned when the NIA20 gets a planned CRAM
parity error.

Diagnosis:    The KNIPPE BUGCHK was used for all planned CRAM parity errors.  This
required that someone go look up the code returned by KNIPPE in order to fix the
problem.

Solution:    Create a separate BUGINF for each possible planned CRAM parity
error.  Make the short text and the long text be fairly explicit about what's
going on, and also return the info that CSSE wants to see.


                               [End of TCO 6.1.1424]

                               TCO-number:  6.1.1425



Written-by:  GROSSMAN                         Creation-date:   5-Jun-85 10:38:14


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	phykni




Problem:  KNIIPF (Illegal PHYSIO Function) should be a BUGHLT.  There is not
enough context saved by BUGCHKs to figure out the problem.

Also, start removing useless TOPS-10 conditionals from PHYKNI.


                               [End of TCO 6.1.1425]

                               TCO-number:  6.1.1426



Written-by:  LOMARTIRE                        Creation-date:   5-Jun-85 12:52:18


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	DISC


Related-QAR:  	838337



Problem:    
     Job can hang in CFSRWT when doing a RENAME. This happens when renaming the
dump file which is being examined by FILDDT.

Diagnosis:    
     OPNLNG  calls  ASNOFN  without  setting up the ACs it expects. This causes
ALOC1 to be screwed up which later confuses CFS.

Solution:    
     Call GASOG before calling ASNOFN at OPNLNG.


                               [End of TCO 6.1.1426]

                               TCO-number:  6.1.1428



Written-by:  MELOHN                           Creation-date:   5-Jun-85 15:06:38


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	JSYSA


Related-QAR:  	838321



Problem:    Can't build monitor with LAHFLG=0

Diagnosis:    SMON code to set LAT-STATE was not under LAHFLG conditional.

Solution:    Put it under said conditional


                               [End of TCO 6.1.1428]

                               TCO-number:  6.1.1429



Written-by:  MELOHN                           Creation-date:   5-Jun-85 16:22:09
Edited-by:   MELOHN                           Edit-date:       5-Jun-85 16:24:47


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	TTYSRV	CTHSRV	CTERMD	TTYDEF	GLOBS


Related-QAR:  	838331	838376	838309	838441



Problem:      Wrapping lines, ^U, ^R, all work inconsistently with
CTERM. Programs that do multi-line TEXTI%, (like MS) lose track of the
current character position, and produce text that is garbled garbage.

Diagnosis:      Both TOPS-20 and CTERM have two basic modes of doing
output to TTYs, binary mode and ascii mode. TOPS-20 uses several
different means of doing combinations of binary mode and ascii mode to
the same TTY. (example, the BLANK command in the EXEC, which puts out
the escape sequence to clear the screen in binary and then prints the
EXEC prompt in ascii). TOPS-20 also maintains the wrap count on the
line on the basis of how many ascii (non-binary) characters have been
output to the terminal. It is therefore critical to output characters
in the server in the same mode in which they were generated on the
host. 

Originally CTHSRV set the mode of the message based on the value of
TT%DAM in the JFN mode word at the time when the output was to be sent
to the server. This didn't work because it is possible to output
binary characters to the TTY without changing the JFN mode word by
opening a JFN on the TTY with bytesize of 8. It turns out also that
the CTERM fork which moves the output from the TTY line buffers to the
CTERM message sent to the server runs asynchronously from the user
process which sets and clears the JFN mode word.

Edit 842 to CTHSRV recognized that the above couldn't work, and
therefore always put the message in binary mode. This worked for
normal output in most cases. Unfortunately, the binary output did not
correctly update the current line and character position on the server
terminal, with the result that wrapping the line occurred at seemingly
random times, and control-R, control-U, and DELETE across line
boundaries produced incorrect results. These include the symptoms
described in QARs 838331, 838376, 838309, and 838441. In all of these
cases the line and character counts were only being updated when the
server echoed characters locally (since they were echoed in non-binary
mode). The line and character position was unaffected by the output
from the remote system, and therefore remained wherever the last
character read and echoed from the terminal left off.

Solution:      The only way to make the remote server do the right thing
is send it binary output in binary (transparent) mode, and non-binary
output in ascii mode. Since there are several ways to switch back and
forth between binary and non-binary mode to the TTY, the only
practical place to tell when we are in binary mode and when we are not
is in TCO, the first level output routine in TTYSRV. I propose to put
markers in the output stream for cterm terminals only that signal when
the output mode is switching between ascii and binary, and between
binary and ascii mode. CTHSRV, when it copies characters from TTY
line buffers to CTERM messages will look for these markers, set the
transparent bit in the CTERM message as appropriate, and segment the
message such that it contains only binary or only ascii mode data.

To implement this I have added TTOASC and TTOBIN markers to the output
markers in TTYDEF; TT%BIN to the TDB which tells whether the terminal
is in binary mode or not, and CH%BIN to the CDB which tells whether
the last message sent to the server was in binary mode or not. This
last flag is necessary since the output markers only signal the
transition between modes, so the mode must remain sticky between
different cterm output messages.
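
The marker scheme described above can be sketched as follows. This is a
Python model under stated assumptions, not the CTHSRV code: the marker
values, the function name segment, and the (is_binary, data) message
shape are hypothetical. Only the names TTOASC, TTOBIN, and the idea of a
sticky mode flag (CH%BIN) come from the TCO.

```python
TTOASC = object()  # marker: output after this point is ascii mode
TTOBIN = object()  # marker: output after this point is binary mode


def segment(stream, last_binary=False):
    """Split an output stream into single-mode messages.

    stream: iterable of byte values (ints) and TTOASC/TTOBIN markers.
    last_binary: mode of the last message sent (the sticky flag).
    Returns (messages, mode); messages is a list of (is_binary, bytes).
    """
    messages = []
    binary = last_binary
    current = bytearray()
    for item in stream:
        if item is TTOASC or item is TTOBIN:
            new_mode = item is TTOBIN
            if new_mode != binary and current:
                # Mode is changing: close the message built so far.
                messages.append((binary, bytes(current)))
                current = bytearray()
            binary = new_mode
        else:
            current.append(item)
    if current:
        messages.append((binary, bytes(current)))
    return messages, binary


# A clear-screen escape sequence in binary, then a prompt in ascii.
msgs, mode = segment([TTOBIN, 27, 91, 72, TTOASC, 64])
print(msgs, mode)
```

The returned mode is passed back in as last_binary on the next call, so
a message that contains no marker inherits the mode of the previous one.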


                               [End of TCO 6.1.1429]

                               TCO-number:  6.1.1431



Written-by:  MELOHN                           Creation-date:   5-Jun-85 17:03:31


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	LATSRV




Problem:    If you change the server name of a LAT server to a name
shorter than the original name, you get an extra character from the
old name at the end of the output of the NTINF% jsys.

Diagnosis:    Routine MMVAZO doesn't work with non-ASCIZ strings.

Solution:    Make it work right with strings not terminated with a null
byte. Fix UMVAZO to do the same.
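
The symptom (a stale trailing character from the longer old name) is what
one would expect if a counted string is copied without a terminator being
written afterwards. The Python sketch below illustrates that idea only; it
is not the MMVAZO code, and the function name and arguments are
hypothetical.

```python
def move_to_asciz(src: bytes, length: int, dest: bytearray) -> int:
    """Copy at most `length` bytes of src into dest and null-terminate.

    Copying stops early at a null byte if src happens to contain one.
    dest must have room for length + 1 bytes. Returns the number of
    bytes copied, not counting the terminator.
    """
    copied = 0
    for byte in src[:length]:
        if byte == 0:
            break
        dest[copied] = byte
        copied += 1
    dest[copied] = 0  # always terminate, so old contents cannot leak
    return copied


# Buffer still holds a longer old name; the new name is a counted string.
buf = bytearray(b"LONGNAME\0")
n = move_to_asciz(b"LAT1XX", 4, buf)
print(n, bytes(buf).split(b"\0")[0])
```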


                               [End of TCO 6.1.1431]

                               TCO-number:  6.1.1432



Written-by:  PALMIERI                         Creation-date:   5-Jun-85 17:39:34


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  Monitor
  Routines-affected:   	DNADLL




Problem:    DECnet close of Ethernet portal causes DNDNCE bugchks.

Diagnosis:    DNADLL gives NISRV a portal ID of zero.  This happens because
	the UN block that was used by DNADLL to open the portal is not the
	one it uses to close the portal.  DNADLL expects them to be the same.
	Routine CHKADR was switching UN blocks in the process of opening a
	portal.

Solution:    Have CHKADR use the same UN block that the open portal routines
	are using.


                               [End of TCO 6.1.1432]

                               TCO-number:  6.1.1433



Written-by:  PALMIERI                         Creation-date:   5-Jun-85 21:02:18


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  Monitor
  Routines-affected:   	GTJFN




Problem:    Can't use wild cards in filename in parse only network JFNs.

Diagnosis:    Code does not allow wildcards in filename.

Solution:    Remove restriction for parse only network JFN.


                               [End of TCO 6.1.1433]

                               TCO-number:  6.1.1434



Written-by:  PALMIERI                         Creation-date:   6-Jun-85 10:44:26


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  Monitor
  Routines-affected:   	DNADLL




Problem:    If received data on the Ethernet exceeds the size of the allocated
	buffer a "frame too long" event is generated.  This event does not
	include the source and destination addresses of the oversize message.

Diagnosis:    No code to supply the addresses.

Solution:    Add necessary code.


                               [End of TCO 6.1.1434]

                               TCO-number:  6.1.1435



Written-by:  PALMIERI                         Creation-date:   6-Jun-85 10:56:13


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  Monitor
  Routines-affected:   	D36COM	ROUTER	JNTMAN




Problem:    Big buffers on the Ethernet not as big as they could be.

Diagnosis:    Too many bytes reserved for Routing and who knows what overhead.

Solution:    Make an accurate computation of overhead and subtract it from
	the Ethernet maximum of 1504 not 1498.


                               [End of TCO 6.1.1435]

                               TCO-number:  6.1.1436



Written-by:  MOSER                            Creation-date:   6-Jun-85 20:53:14


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	APRSRV




Problem:  Wrong Bughlt info. Bogus dumps. TRAPPC points to BUGH5+5.

Diagnosis:  Monitor does a LOAD FKJSB but FX is garbage.

Solution:  Have a good FX.


                               [End of TCO 6.1.1436]

                               TCO-number:  6.1.1437



Written-by:  PAETZOLD                         Creation-date:  10-Jun-85 10:04:29


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  monitor
  Routines-affected:   	STG




Problem:    

nimaxh too low.  customers are running bigger ethernets than we thought.

Diagnosis:  

Solution:    

increase ght size from 16. to 128.


                               [End of TCO 6.1.1437]

                               TCO-number:  6.1.1438



Written-by:  LOMARTIRE                        Creation-date:  10-Jun-85 11:12:14


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PAGUTL




Problem:    
     Bad code at BGCTYP.

Diagnosis:    
     The routine is doing a GTAD% instead of calling LGTAD.

Solution:    
     Change it to call LGTAD to get the internal date.


                               [End of TCO 6.1.1438]

                               TCO-number:  6.1.1439



Written-by:  PALMIERI                         Creation-date:  10-Jun-85 11:18:15


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  Monitor
  Routines-affected:   	SCJSYS


Related-QAR:  	838464



Problem:    User gets message "No response from destination process" when ACJ
	on the local system refuses network access.

Diagnosis:    Wrong error code returned by DCN OPENF routine if ACJ refuses
	network access.

Solution:    Return correct error code in DCNOPN.  This error was caused by
	one of two incorrect error codes in the error conversion table.
	Fix both of them.


                               [End of TCO 6.1.1439]

                               TCO-number:  6.1.1440



Written-by:  PALMIERI                         Creation-date:  11-Jun-85 13:44:36


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  Monitor
  Routines-affected:   	D36PAR	D36COM


Related-QAR:  	838296



Problem:    Page faults while in the scheduler.  Can be manifested by OKSKBG
	BUGHLTs.

Diagnosis:    DECnet-36 often backs up byte pointers with an ADJBP. This
	can create byte pointers of the form 0410xx,,-1.  If xx contains
	the first address in a section an effective address will be
	computed that is the last address in the previous section.
	If this section is not mapped...the monitor will die somewhere.

Solution:   In routines that adjust the byte pointer check to see if the pointer
	is of the form 0410xx,,-1.  If so change the byte pointer to
	4410xx,,0.
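
The check can be sketched in Python by treating the byte pointer as
separate 18-bit left and right halves. The function name and constants
are hypothetical; the TCO gives only the two octal forms. The low six
bits of the left half (the xx in the TCO) are carried over unchanged.

```python
LH_FORM_MASK = 0o777700   # the leading four octal digits of the left half
LH_BACKED_UP = 0o041000   # the 0410xx form
LH_NORMALIZED = 0o441000  # the 4410xx form
RH_MINUS_ONE = 0o777777   # right half of -1


def normalize_byte_pointer(left: int, right: int):
    """Rewrite 0410xx,,-1 as 4410xx,,0; leave any other pointer alone."""
    if (left & LH_FORM_MASK) == LH_BACKED_UP and right == RH_MINUS_ONE:
        return LH_NORMALIZED | (left & 0o77), 0
    return left, right


left, right = normalize_byte_pointer(0o041005, 0o777777)
print(oct(left), oct(right))
```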


                               [End of TCO 6.1.1440]

                               TCO-number:  6.1.1442



Written-by:  MCCOLLUM                         Creation-date:  11-Jun-85 15:21:47
Edited-by:   MCCOLLUM                         Edit-date:      11-Jun-85 15:28:59


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	GTJFN


Related-QAR:  	838346	838418	838431



Problem:      

RELBAD BUGHLTs.

Diagnosis:      

When parsing a field of a file spec in GTJFN, the right half of FILTMP(JFN)
is used to store the address of  the block of free space that contains  the
text of the  field currently being  parsed. FILOPT(JFN) is  used to hold  a
byte pointer to the end of this string. If the field being parsed is  after
the device field (e.g.  the directory,  file name or extension fields)  and
the device field is defaulted to DSK*: or a logical name defined as  DSK*:,
STRDVD is called to translate DSK*: to  the name of the first structure  in
STRTAB. In all cases this is  the public structure. STRDVD allocates a  new
block of free space if  the structure name does not  fit into the block  of
free space which  currently holds the  string "DSK*". It  then updates  the
pointer in FILOPT(JFN).   This is  incorrect.  FILOPT(JFN)  should only  be
updated if the field currently being parsed is the device field. The result
is that ENDTMP is eventually called to trim the block of free space pointed
to by the right  half of FILTMP(JFN). It  assumes FILOPT(JFN) contains  the
pointer to the end  of the string  stored in this block,  but it no  longer
does and a RELBAD BUGHLT results.

Solution:      

There is an alternate entry to STRDVD  (STRDEV) that is used when GTJFN  is
currently parsing the device  name field.  Remember  which entry point  was
used and only  update FILOPT(JFN) if  the device field  is currently  being
parsed.


                               [End of TCO 6.1.1442]

                               TCO-number:  6.1.1443



Written-by:  PAETZOLD                         Creation-date:  12-Jun-85 08:31:27


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  monitor
  Routines-affected:   	FREE




Problem:    

ILMNRF from job zero when shrinking a resident free space pool.

Diagnosis:    

a bad byte pointer was being constructed due to an AC being trashed.

Solution:    

restore the ac after destroying it.  this fix thanks to bill schilitt 
and debs.


                               [End of TCO 6.1.1443]

                               TCO-number:  6.1.1445



Written-by:  MELOHN                           Creation-date:  12-Jun-85 14:58:07


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	LATSRV




Problem:    LATIST BUGINFs

Diagnosis:    LAT Slot demultiplexor routine (LSDMUX) did not handle the
case where multiple start slots were received in the same message. It
also didn't always correctly adjust the byte pointer when an invalid
or unexpected slot was received.

Solution:    Fix LSDMUX to parse slots on the basis of the slot size,
rather than assuming (sometimes incorrectly) how much data is left in
the slot and adjusting the byte pointer by that amount.
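
Parsing by declared slot size can be sketched as below. The layout used
here (one type byte, one size byte, then the data) is a simplified,
hypothetical stand-in and is not the LAT wire format; the function name
is also an assumption. The point is that the parser advances by the size
field whether or not it recognizes the slot.

```python
def demux(message: bytes):
    """Split a message into (slot_type, data) pairs using each slot's size.

    Assumed layout per slot: 1 byte type, 1 byte data size, then data.
    """
    slots = []
    pos = 0
    while pos + 2 <= len(message):
        slot_type = message[pos]
        size = message[pos + 1]
        start = pos + 2
        end = start + size
        if end > len(message):
            break  # truncated slot: stop rather than run past the message
        slots.append((slot_type, message[start:end]))
        pos = end  # advance by the declared size, even for unknown types
    return slots


# An unexpected slot (type 9) followed by two slots of hypothetical type 1.
msg = bytes([9, 2, 0xAA, 0xBB, 1, 0, 1, 1, 0x41])
print(demux(msg))
```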


                               [End of TCO 6.1.1445]

                               TCO-number:  6.1.1446



Written-by:  GRANT                            Creation-date:  12-Jun-85 17:24:02


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYKLP




Problem:  Monitor doesn't handle another system running CI diagnostics very
	well.

Diagnosis:  When another system ACKs REQUEST-IDs but doesn't return IDRECs,
	we continue to send STARTs.

Solution:  Notice that REQUEST-IDs are no longer being answered and return
	the state of the v.c. to closed.


                               [End of TCO 6.1.1446]

                               TCO-number:  6.1.1447



Written-by:  MOSER                            Creation-date:  13-Jun-85 11:34:57


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PAGEM




Problem:   Many engineers waste lots of time looking at bogus OKSKBG dumps.

Diagnosis:  The real problem is a PF in the scheduler, SKDPF1, but the
code only detects this when CKSPFL is turned on. The PF handler
goes NOSKED using the macro but goes OKSKED by doing instructions. If
INSKED then NSKED gets erroneously decremented.

Solution:  Always check for SKDPF1 (this takes 1 instruction). Always
use NOSKED and OKSKED macros.


                               [End of TCO 6.1.1447]

                               TCO-number:  6.1.1448



Written-by:  MOSER                            Creation-date:  13-Jun-85 11:45:35


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	FILMSC




Problem:   ILMNRF.

Diagnosis:  User does a BIN using .TTDES+line number. If the terminal
switches and the code rechecks then it can crash as it expects JFN
to contain an index into the JFN blocks not 400nnn.

Solution:  Expect this and do the "right" thing. Note that in the dumps I
examined, the "right" thing is probably not what the user wants, because the
user has bad code, but it is the logically correct action.


                               [End of TCO 6.1.1448]

                               TCO-number:  6.1.1449



Written-by:  WAGNER                           Creation-date:  13-Jun-85 14:51:29


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	cthsrv	STG


Related-QAR:  	838315	838455



Problem:  Doing CTERM host output uses 50-70% of CPU in scheduler, and 20-25%
	 gets charged to user doing the output. This is independent of baud
	 rate, and decreases only slightly on heavily loaded systems.

	 The result is that one user doing a large, non-blocking EXEC type
	 command (or a large SOUT% or its friends) can effectively hog an
	 entire system.

Diagnosis:  The CTERM output processing code does an MDISMS on a flag, CTMATN,
	that gets set whenever there is output to do on a line (other conditions
	set this "attention" flag as well). The real problem is that we
	are a JP%SYS process that is also CRSKED at the time of the MDISMS.
	This causes us to get 200 mS of balance set hold time, negating the 
	effect of the wait for the MDISMS to get checked. The result is that
	when we request CPU, we get it.

Solution:  Implement a new flag, CTMWAG (CTerM Wait And Go), that gets set
	when we want to do the output. Now OR this flag in with a SETZed
	CTMATN at the scheduler short cycle (LV8CHK), so that if we have
	output to do, we will notice it every 20mS at the most. This has
	the additional benefit of treating CTERM output just like RSX20F
	and NRT output (incidentally, the case of NRT output hogging the
	system was fixed in a similar manner).

	The results are quite noticeable: the user only gets charged for
	a more RSX20F and NRT-like 8%-12% CPU while doing the output.
	The scheduler overhead is reduced from 50%-70% down to about 1%
	on a standalone system. The actual throughput is down from roughly
	1000 cps to 950 cps, a reduction of only 5%!


                               [End of TCO 6.1.1449]

                               TCO-number:  6.1.1451



Written-by:  NICHOLS                          Creation-date:  17-Jun-85 10:15:44


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	D36COM




Problem:  JSR BUGHLT in D36COM

Solution:  Replace JSR BUGHLT with a real BUGHLT.

                               [End of TCO 6.1.1451]

                               TCO-number:  6.1.1452



Written-by:  GROSSMAN                         Creation-date:  17-Jun-85 10:57:43


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYKNI




Problem:  Symbols in PHYKNI that are related to SYSERR entries are wrong or
misleading.

Diagnosis:  Wealth of confusion from CSSE error specs and SPEAR documents.

Solution:  Change the symbol names to be more meaningful, and update the
appropriate figures.


                               [End of TCO 6.1.1452]

                               TCO-number:  6.1.1453



Written-by:  GROSSMAN                         Creation-date:  17-Jun-85 11:08:02


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYKNI




Problem:  NIA20 keep-alive routines do not always detect a dead KLNI.

Diagnosis:  It is possible for the NIA20 to get messed up in such a way that
it no longer processes command queue entries.  However, if there is a steady
stream of incoming datagrams, the NIA20 will cause the KL to wake up and
process them.  The incoming datagrams are treated just like regular command
responses.  This fools PHYKNI into believing that the NIA20 is working just
fine, because it sees these responses frequently enough to keep the keep-
alive process happy.  Unfortunately, the KLNI is not responding to commands
during this period, and things like transmits just hang forever.

Solution:  Simplify the keep-alive process.  ALWAYS give the KLNI a Read Station
Info command every five seconds.  If that command is not processed within five
seconds, kill the KLNI and give a KNISTP BUGCHK.
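
A minimal sketch of the simplified keep-alive follows. It is written in
Python for illustration only; the class KeepAlive, its callbacks, and the
tick() driver are hypothetical names, not PHYKNI routines. The point it
shows is that only completion of the probe command counts as proof of
life, and incoming datagrams are ignored.

```python
PROBE_INTERVAL = 5.0  # seconds between probes, and the deadline for each


class KeepAlive:
    def __init__(self, send_probe, kill_port):
        self.send_probe = send_probe    # hypothetical: queue a probe command
        self.kill_port = kill_port      # hypothetical: stop the port, log it
        self.pending_since = None       # time the outstanding probe was sent
        self.next_probe = 0.0

    def datagram_received(self):
        # Deliberately ignored: receive traffic says nothing about
        # whether the command queue is still being processed.
        pass

    def probe_completed(self):
        self.pending_since = None

    def tick(self, now):
        if self.pending_since is not None:
            if now - self.pending_since >= PROBE_INTERVAL:
                self.pending_since = None
                self.kill_port()
                return "dead"
            return "waiting"
        if now >= self.next_probe:
            self.pending_since = now
            self.next_probe = now + PROBE_INTERVAL
            self.send_probe()
            return "probed"
        return "idle"
```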


                               [End of TCO 6.1.1453]

                               TCO-number:  6.1.1454



Written-by:  MOSER                            Creation-date:  17-Jun-85 15:43:46


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	GLOBS	IPCF




Problem:   Global subroutine MTAMES is unused.

Solution:  Remove it.


                               [End of TCO 6.1.1454]

                               TCO-number:  6.1.1455



Written-by:  MELOHN                           Creation-date:  18-Jun-85 12:05:20


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	CTHSRV	FILMSC	GLOBS




Problem:    COMND (and perhaps users) need a way to determine whether a
CTERM terminal supports the full CTERM implementation (like TOPS-20,
TOPS-10, DECnet-DOS, RSX) or just a limited, bug-filled implementation
(like VMS).

Diagnosis:    The .MOCTM MTOPR% should return two different values; 1 if
the terminal in question is a True CTERM terminal, and 2 if the
terminal is a VMS CTERM terminal. Users may find this useful as well,
since many things don't work on VMS CTERM terminals, and will not
until VMS is fixed.

Solution:    Do it.


                               [End of TCO 6.1.1455]

                               TCO-number:  6.1.1456



Written-by:  LEACHE                           Creation-date:  18-Jun-85 13:12:30


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  Monitor
  Routines-affected:   	STG




Problem:    A few sites have insufficient IPCF or ENQ freepool space.

Diagnosis:    Pools not large enough.

Solution:    Increase size of pool allocation.


                               [End of TCO 6.1.1456]

                               TCO-number:  6.1.1457



Written-by:  PALMIERI                         Creation-date:  18-Jun-85 15:34:45


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  Monitor
  Routines-affected:   	STG	DNADLL




Problem:   Endnodes on the Ethernet may have more overhead than is desirable.

Diagnosis:    Non-routing nodes on the Ethernet always enable to receive
	multicast packets sent to the ID "All Routers".  This allows
	DECnet to eavesdrop on routing messages so a database can be main-
	tained for INFO DECnet.

Solution:    Only enable "all routers" multicast address in endnodes on the
	Ethernet when variable EVSDRP is non-zero.  The default value for
	EVSDRP is -1.


                               [End of TCO 6.1.1457]

                               TCO-number:  6.1.1459



Written-by:  MOSER                            Creation-date:  18-Jun-85 16:32:22


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	GTJFN


Related-QAR:  	838488



Problem:  GTJFN no longer returns GJFX39 (logical name loop detected); it
returns GJFX24 (file not found) instead.

Diagnosis:  TCO 6.2261 caused this by always returning GJFX24 when SETDEV
fails.

Solution:  Return GJFX24 if SETDEV returns STRX09; otherwise return the SETDEV
error.
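
The error mapping in the solution amounts to the following. This is a
sketch in Python with the error codes modelled as plain strings; the
function name gtjfn_error is hypothetical and nothing here is taken from
the GTJFN source.

```python
def gtjfn_error(setdev_error: str) -> str:
    """Map a SETDEV failure to the error GTJFN should return.

    Only STRX09 is translated to GJFX24; every other SETDEV error
    (for example GJFX39) is passed through unchanged.
    """
    if setdev_error == "STRX09":
        return "GJFX24"
    return setdev_error
```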


NOTE: *****************************************************************
      * THIS TCO DOES NOT IMPLY ANY "OWNERSHIP" OF GTJFN              *
      * I WILL DISAVOW ANY KNOWLEDGE OF THAT MODULE                   *
      *****************************************************************


                               [End of TCO 6.1.1459]

                               TCO-number:  6.1.1460



Written-by:  MOSER                            Creation-date:  19-Jun-85 09:35:44


Edit-checked:         No     Document:          Yes    TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	DISC




Problem:  JSR BUGHLT in DISC.

Solution:  Make it a "real" BUGHLT, XTRAPT (long file has extra page table).


                               [End of TCO 6.1.1460]

                               TCO-number:  6.1.1461



Written-by:  GROSSMAN                         Creation-date:  19-Jun-85 10:32:12


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	LLMOP	MEXEC	STG	GLOBS




Problem:  Ethernet system ID's generated by LLMOP did not contain correct
system time information.

Diagnosis:  The system date and time was hardwired to be 23-Mar-1984 12:30:30.50.
This was probably done because the system ID message could be generated at
interrupt level.  The date and time conversion routines are not available at
interrupt level so the programmer just fudged up some numbers.

Solution:  Generate system ID's in CHKR.  This way, LLMOP can use the date and
time conversion JSYSs to acquire the necessary values for the system ID message.
Now, when a Request ID message is received at interrupt level, a request gets
queued up to CHKR level, and job 0 gets run immediately.  Also, periodically
(every 10 minutes) CHKR will call LLMOP to generate an unsolicited System ID
message.


                               [End of TCO 6.1.1461]

                               TCO-number:  6.1.1462



Written-by:  GRANT                            Creation-date:  19-Jun-85 15:26:33


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYKLP




Problem:  CIPDFQ BUGINFs after reloading the CI.


Diagnosis:  TOPS-20 is not processing the response queue.  This occurred
	after a planned CRAM parity error;  one action taken by TOPS-20
	in the error processing is that of processing the response queue
	and cleaning all the command queues.  The routine KLPRQA is called
	in this case just as it is during normal operation.  During the
	error processing case, the port has been disabled but by calling
	KLPRQA we incorrectly enable the port for a small amount of time.
	During this time the port is capable of putting packets on the
	response queue.

	Thus, when we finally reload and start the port, it finds the
	response queue non-empty and never generates an interrupt.


Solution:  Create a second entry point KLPRQC to be used when we want to
	process the response queue without enabling the port.



                               [End of TCO 6.1.1462]

                               TCO-number:  6.1.1463



Written-by:  MOSER                            Creation-date:  19-Jun-85 16:21:29


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PAGUTL




Problem:  OFNBDB BUGHLT.

Diagnosis:   The problem occurs when a file is going from short to long.
The following scenario reproduces the bug.

Users Moe and Larry both open STOOGES.DAT as a short file.

User Moe goes out for lunch while Larry extends the file long. When
Larry makes the file go long the following counts are in effect.

Super PT share count = 3 (1 long opener [Larry] and 2 second level PTs)
Share count on second level PT0 = 3 (Moe, Larry and extra count from Larry)

Now Larry closes the file (Moe is still at lunch). The share count on the
Super PT becomes 0 and the OFN is deassigned. The share count on the PT for
file section 0 is still nonzero because of Moe (still at lunch).

Now enter Curly who also opens the file. He assigns an OFN for the
Super PT and gets a different one than Larry had. When Curly tries to
acquire an OFN for file section 0 he finds the one that Moe is holding
but the data in SPTO4 does not agree with the data provided by Curly
in the call. An OFNBDB results.


Solution:  When assigning an OFN for a long file section 0 always use the
caller's data. OFNBDB will still exist for other cases.


                               [End of TCO 6.1.1463]

                               TCO-number:  6.1.1464



Written-by:  MELOHN                           Creation-date:  19-Jun-85 20:31:44


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	COMND	CTERMD	CTHSRV	FILMSC	TTYDEF	TTYSRV




Problem:    CTERM terminals do not "unwrap" successfully; that is, if
you create a long enough line to wrap, and then edit that line to be
only one line long again, the CTERM-SERVER loses the prompt that began
the line.

Diagnosis:    In this case, the server must have a local copy of the
prompt in order to reprint the line successfully.

Solution:    Add a pointer to the prompt string as an argument to the
.MOTXT MTOPR% (AC 4). Make TEXTI% and friends fill in this prompt
string with the user's ^R buffer pointer. Make CTHSRV send this
prompt to the server. Make the server know that the prompt is really
an ^R buffer, and load it into the TEXTI% on the remote system.


                               [End of TCO 6.1.1464]

                               TCO-number:  6.1.1465



Written-by:  MELOHN                           Creation-date:  19-Jun-85 20:38:43


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	CTHSRV	CTERMD




Problem:    DECnet-DOS won't talk CTERM with us.

Diagnosis:    Multiple problems. The LOKCDB routine has some code which
normal speed CTERM connections apparently never tested. The GETIMG
routine, which is supposed to parse a DNA image field, doesn't work and
always uses the defaults, which keep everyone but DOS happy. We treat
DOS, and any non-10-20 system, like VMS and don't trust it to do
editing. DOS can do editing.

Solution:    Fix problems, make CTERM assume that all implementations
handle the entire protocol with the exception of VMS.


                               [End of TCO 6.1.1465]

                               TCO-number:  6.1.1466



Written-by:  PALMIERI                         Creation-date:  20-Jun-85 15:16:34


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  Monitor
  Routines-affected:   	STG




Problem:    No DECnet nodename if not defined in CONFIG.CMD.

Diagnosis:    The default name is nulls.

Solution:    Make the nodename and nodename count RSIs of TOPS20 and 6
	respectively.


                               [End of TCO 6.1.1466]

                               TCO-number:  6.1.1467



Written-by:  GRANT                            Creation-date:  20-Jun-85 18:02:22


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	phyklp




Problem:  Unnecessary CI command when CI microcode is reloaded.


Diagnosis:  Now that CI microcode is loaded directly from the monitor we
	get the version from the load file;  we don't need to do the
	READ-COUNTER command to find out the microcode version anymore.

Solution:  Remove the READ-COUNTER command from routine STRKLP, but have
	routine LODUCD put the edit number in the CDB so the utilities
	can find it.  The entire version (major, minor, edit) is in
	location UCDVER.


                               [End of TCO 6.1.1467]

                               TCO-number:  6.1.1468



Written-by:  GRANT                            Creation-date:  21-Jun-85 12:38:20


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYSIO


Related-QAR:  	838233



Problem:  PHYTPD BUGINF description is misleading.


Solution:  Change the name to PHYCPI (CI path ignored) and make the description
	more informative.


                               [End of TCO 6.1.1468]

                               TCO-number:  6.1.1469



Written-by:  PALMIERI                         Creation-date:  21-Jun-85 15:07:17


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  Monitor
  Routines-affected:   	SCJSYS


Related-QAR:  	838514



Problem:    DECnet DCN write only logical links sometimes hang other end's
	SRV link.

Diagnosis:    A CLOSF% may complete "too soon" if the logical link is write
	only.  When the user issues a close a synchronous disconnect
	call is issued to SCLINK.  This causes a DI to be sent to the other
	end of the link.  If the DI can not go out immediately it is queued
	up for transmission later.  SCJSYS notices that the link is "write
	only" and decides that it can remove the entire port database at this
	time since there should not be any pending input from the network.
	In cleaning up the link it issues a Release for the link to SCLINK.
	If the DI is still on the queue to be transmitted it is discarded.
	The other end of the link never gets the DI and must wait for
	a "no confidence" on the link before cleaning up.

Solution:    Don't check in DNETIN to see if the link is "write only".
	Instead, always call SCLINK to read any pending data (there shouldn't
	be any).  If there is no data SCLINK will block until the DC arrives
	which means the other end of the link has received and processed the
	DI.  The only need of the write only check was for the MTOPR function
	READ LINK STATUS so make the check there.


                               [End of TCO 6.1.1469]

                               TCO-number:  6.1.1470



Written-by:  MELOHN                           Creation-date:  21-Jun-85 17:33:49


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	CTHSRV




Problem:    CTERM loses characters on input

Diagnosis:   CTHSRV sent start read messages requesting more input
than the line buffer could hold.

Solution:   Build start read messages with input length equal to TIMAX
minus TTICT, which is the remaining space in the line buffer.  Defer
additional start reads until the line buffer has more than five bytes
free.
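
The length computation can be sketched as below. This is Python used for
illustration; TIMAX and TTICT are the names from the text, while the
function start_read_length and its threshold parameter are assumptions
made for the example.

```python
def start_read_length(timax: int, ttict: int, threshold: int = 5):
    """Return the input length to request, or None to defer the read.

    timax: line buffer capacity in bytes (TIMAX in the text)
    ttict: bytes currently held in the line buffer (TTICT in the text)
    """
    free = timax - ttict
    if free <= threshold:
        return None  # defer until more than `threshold` bytes are free
    return free      # never ask for more than the buffer can hold
```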


                               [End of TCO 6.1.1470]

                               TCO-number:  6.1.1472



Written-by:  PALMIERI                         Creation-date:  24-Jun-85 14:38:02


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  Monitor
  Routines-affected:   	ROUTER


Related-QAR:  	838475



Problem:    Setting cost on DTE circuits seems to have no effect on cost to
	nodes whose path is over the DTE.

Diagnosis:    Cost parameter for circuits other than the first is ignored when doing
	routing updates.

Solution:    When stepping through the circuits to build the routing vector
	use the cost for that circuit when computing costs to nodes over that
	circuit.
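
The corrected computation can be sketched as follows. This is a Python
illustration under assumed data structures (a list of circuits, each with
its own cost and a table of costs advertised by the adjacent node); the
names are hypothetical and do not come from the ROUTER module.

```python
def build_routing_vector(circuits):
    """Return {node: (total_cost, circuit_name)} with the cheapest path.

    circuits: list of (name, circuit_cost, {node: advertised_cost}).
    """
    best = {}
    for name, circuit_cost, reachable in circuits:
        for node, advertised in reachable.items():
            # Use the cost of *this* circuit, not the first circuit's.
            total = circuit_cost + advertised
            if node not in best or total < best[node][0]:
                best[node] = (total, name)
    return best
```

With a cost of 10 set on the DTE circuit, a node reachable only over the
DTE now shows that cost in its total, and a node reachable both ways
prefers the cheaper circuit.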


                               [End of TCO 6.1.1472]

                               TCO-number:  6.1.1473



Written-by:  GRANT                            Creation-date:  25-Jun-85 13:19:13
Edited-by:   GRANT                            Edit-date:      25-Jun-85 13:22:10


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	phyklp	cfssrv	PHYSIO




Problem:    Current CI diagnostic strategy using NO-ANSWER doesn't work well.

Diagnosis:    The CI microcode doesn't control the ACKing/NAKing of incoming
	packets.  So, the best it can do is not send back an IDREC even
	though the REQUEST-ID has been ACKed.  This isn't good enough because
	the other system sees that the NO-ANSWER system is there because it
	ACKed the packet.  This results in KLPNOA BUGCHKs rather than simply
	ignoring it.

Solution:    Instead of faking non-existence, set the maintenance bit in the
	port state field of the IDREC.  This will alert the other systems.


                               [End of TCO 6.1.1473]

                               TCO-number:  6.1.1474



Written-by:  LEACHE                           Creation-date:  25-Jun-85 13:58:34
Edited-by:   LEACHE                           Edit-date:      25-Jun-85 14:06:59


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	JSYSA


Related-QAR:  	838507	838509



Problem:       Password expiration not working correctly.

Diagnosis:       Leftover development code at LOGI2 is interfering with real
code at CHKPSW.

Solution:        Remove bogus code.


                               [End of TCO 6.1.1474]

                               TCO-number:  6.1.1475



Written-by:  MOSER                            Creation-date:  28-Jun-85 11:19:21


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	GETSAV




Problem:   GET of some execute only programs fails.

Diagnosis:   AC not always setup properly for calls to SREADF and CREADF.

Solution:  Set up T1.


                               [End of TCO 6.1.1475]

                               TCO-number:  6.1.1476



Written-by:  PALMIERI                         Creation-date:  28-Jun-85 15:07:10


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  Monitor
  Routines-affected:   	LATSRV




Problem:    COMMMS BUGHLTs as a result of memory being returned by LATSRV.

Diagnosis:    If an attempt to post a buffer to NISRV fails the failure
	reason is returned in T1.  LATSRV originally had the buffer address
	in T1 and never saved it.  Instead it tries to return whatever T1
	now points to.

Solution:    Save buffer address in a STKVAR and restore it before calling
	DNFWDS.


                               [End of TCO 6.1.1476]

                               TCO-number:  6.1.1477



Written-by:  GRANT                            Creation-date:  29-Jun-85 22:21:24


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	PHYKLP	MEXEC




Problem:  Running IPALOD by hand causes confusion.


Diagnosis:  Now that the CI20 microcode is read into memory at system startup,
	you can't load a different version without rebooting the system.
	However, if you put a new IPALOD up and run it, it comes out and
	says "Loading .......x.y(z)" indicating it loaded the new version
	of the microcode when it actually reloaded the version that was read
	in at system startup.


Solution:  Make 2 entry points for IPALOD.  When a user tries to run IPALOD
	it will not say "Loading.....x.y(z)", but will say "Loading
	microcode that was read in at system startup".



                               [End of TCO 6.1.1477]

                               TCO-number:  6.1.1478



Written-by:  LEACHE                           Creation-date:  10-Jul-85 13:32:01


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  Monitor
  Routines-affected:   	FORK




Problem:    PDVOP% causes freespace damage, RELBAD's, ILMNRF's, etc.

Diagnosis:    PDVOP function .POLOC causes recursive execution of PDVOP
with function .PONAM.  The PDVOP code fails to reinitialize the datablock
size in the argblock, so that the 2'nd through n'th recursion has an
enormously high value (something like 1,,1) stipulated as the size of
a block that is really 8 words long.  This usually causes no problem,
since the 8 word block is more than enough space to hold most program
name strings.  However, bogus executions of PDVOP (such as recently
performed by the EXEC) can create PDV's containing program name strings
that exceed 8 words in length.  The recursive PDVOP will then destroy
information in the freespace block adjacent to the PDVOP data block.

Solution:    Reinitialize the blocksize value before each recursive execution
of PDVOP.


                               [End of TCO 6.1.1478]

                               TCO-number:  6.1.1479



Written-by:  WAGNER                           Creation-date:  10-Jul-85 13:38:14


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	MONSYM




Problem:  MONSYM is very large

Diagnosis:  It is not purging unneeded storage.

Solution:  Have it purge .ERCOD on the second pass since it is used as an
	 internal symbol only.


                               [End of TCO 6.1.1479]

                               TCO-number:  6.1.1480



Written-by:  MELOHN                           Creation-date:  10-Jul-85 16:50:04


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	LATSRV




Problem:    ILMNRFs when PROs or VAXes acting as LAT servers with PO/S
or Oracle attempt to connect to TOPS-20 and any user does an NTINF% of them.

Diagnosis:    These brain-damaged LAT implementations do not set the LAT
server name. In this case we should display the Hex hardware address
of the remote server, but the code to do this jumps to the wrong part
of the routine and the system trips because the appropriate ACs are
not set up.

Solution:    Jump to the right place to correctly display the hardware
address of these "pseudo-lat-servers". Thanks to Peter Donahue for
helping find and exterminate this bug.


                               [End of TCO 6.1.1480]

                               TCO-number:  6.1.1481



Written-by:  PALMIERI                         Creation-date:  15-Jul-85 15:01:04


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  Monitor
  Routines-affected:   	CTHSRV




Problem:    COMMMS BUGHLTs

Diagnosis:    If CTHSRV fails to initialize a connect block when opening a SRV
	connection, it attempts to release the associated buffer, a pointer
	to which is in the CDB.  It uses P3 as a pointer to the CDB instead
	of the correct AC CDB.

Solution:    Change P3 to CDB.


                               [End of TCO 6.1.1481]

                               TCO-number:  6.1.1482



Written-by:  PALMIERI                         Creation-date:  15-Jul-85 15:11:10


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  Monitor
  Routines-affected:   	PHYKNI




Problem:    KNIIPF BUGHLT if system power fails.

Diagnosis:    PHYKNI believes that it should never receive a RESET CHANNEL
	call from PHYSIO and BUGHLTs if it gets one.  However this call
	is executed as a result of a power failure.

Solution:    Add routine KNIRSC to handle the reset channel call and stop and
	request reload for all KLNIs.


                               [End of TCO 6.1.1482]

                               TCO-number:  6.1.1483



Written-by:  MELOHN                           Creation-date:  15-Jul-85 15:39:11


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	latsrv




Problem:  RESCHK BUGHLTs

Diagnosis:  The header on a LAT transmit buffer was smashed. Examination of the
previous buffer revealed that it was a LAT circuit block. The code which clears
the counters at the end of the circuit block XBLTs one too many words and zeros
the first location of the next block of memory. We crash if we try to return
this block, since the check word has been cleared.

Solution:  Fix the routine that clears the circuit counters to zero the correct
number of words.
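
The off-by-one can be illustrated with a small model of memory. This is a
Python sketch, not the LATSRV routine: the block layout, the offset of the
counters, and the check-word value used in the usage note are assumptions
made for the example.

```python
def clear_counters(memory, block_start, block_len, counters_offset):
    """Zero the counters that occupy the tail of a block.

    memory is a list of words; the block covers
    memory[block_start : block_start + block_len].
    """
    first = block_start + counters_offset
    # Words from the first counter to the end of the block. Using
    # count + 1 here would also zero the first word of the next block.
    count = block_len - counters_offset
    for i in range(first, first + count):
        memory[i] = 0
```

For an 8-word block whose counters start at offset 5, three words are
cleared and the check word of the block that follows (word 8) is left
intact.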


                               [End of TCO 6.1.1483]

                               TCO-number:  6.1.1484



Written-by:  MELOHN                           Creation-date:  15-Jul-85 15:42:21


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	latsrv




Problem:  ILMNRF BUGHLT

Diagnosis:  LAT server sends us an invalid Circuit block vector which we blithely
use to index into space.

Solution:  Check all CB vectors sent from the server to see if they make sense.
If they do not, consider it an illegal message.
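
A range check of this kind can be sketched as below. This is Python for
illustration; the table of circuit blocks and the function name
circuit_for_vector are hypothetical, and returning None stands in for
treating the message as illegal.

```python
def circuit_for_vector(circuits, vector):
    """Return the circuit block for a server-supplied index, or None.

    circuits: list of circuit blocks, with None marking unused entries.
    None is returned for an out-of-range index or an unused entry, and
    the caller should then treat the message as illegal.
    """
    if not 0 <= vector < len(circuits):
        return None
    return circuits[vector]
```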


                               [End of TCO 6.1.1484]

                               TCO-number:  6.1.1485



Written-by:  MELOHN                           Creation-date:  16-Jul-85 14:13:56


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	STG


Related-QAR:  	838536



Problem:    BADTTYs can occur if you build a monitor with LAHFLG turned off.

Diagnosis:    NTTLAH should be set to zero if LAHFLG is turned off.

Solution:    Do it.


                               [End of TCO 6.1.1485]

                               TCO-number:  6.1.1486



Written-by:  MELOHN                           Creation-date:  16-Jul-85 14:17:55


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	LATSRV


Related-QAR:  	838534



Problem:    Stop reason code sent to servers is garbage when host gets a LATIST
BUGINF.

Diagnosis:    Reason code is not set up after LATIST call.

Solution:    Set up the code to be invalid slot/format error and return it to the
server.


                               [End of TCO 6.1.1486]

                               TCO-number:  6.1.1487



Written-by:  MOSER                            Creation-date:  16-Jul-85 16:00:09


Edit-checked:         No     Document:          No     TCO-tested:  No 
Maintenance-release:  No     Hardware-related:  No 


Program:  MONITOR
  Routines-affected:   	APRSRV




Problem:   BOGUS dumps. PF in the Bughlt code. TRAPS0 shows PF occurred
trying to reference word 6,,0.

Diagnosis:   The Bughlt code tries to store the previous context ACs using
a STPAC. macro. This does a XCTBMU (data from prev context into monitor) of
a BLT of the ACs. When previous context is monitor and the section is not
0/1 then this BLT references memory not the ACs. This is a bug in the microcode.

Solution:  Do not do a STPAC. but do a PXCT of an XBLT. instead. This might
eventually get fixed in the u-code but for now this will work.


                               [End of TCO 6.1.1487]