Trailing-Edge - PDP-10 Archives - bb-jr93e-bb - 7,6/ap017/mon703.d17
MCO: 13284		Name: DPM/TL		Date: 23-Feb-87:02:33:56


[Symptom]
DECsystem-10 NOT RUNNING when TGHA runs.

[Diagnosis]
DTESER doesn't know that -20F has a protocol pause mode where it will
leave the -10 alone for about 30 seconds while it enters secondary
protocol.  Doing this part is easy; however, a few other buggers were
encountered:

1. Stopcode IME referencing .PDDIA in the PDB.
2. Stopcode IME when stuffing pointers into maps using IDPB instructions.
3. Stopcode IME putting -20F into protocol pause.  GETETD can't run in
   section one like MOSSER does.
4. -20F enters secondary protocol upon exiting protocol pause.
5. KLPSER will shut down KLIPAs that aren't running when memory is set offline.
   It will try to turn on the KLIPAs when memory is set online even if they
   weren't on to begin with.
6. Ditto for KNISER.

[Cure]
1. MOSSER sets up W using a MOVE W,JBTPDB##(J).  Call FNDPDS like it should. 
2. GTPME hasn't returned a byte pointer for quite some time.  Do a MOVEM and
   an AOS instead of an IDPB.
3. Cause DTEAPP to run in section zero with the rest of the DTE code.
4. Clear out the last command -20F saw before entering protocol pause.
5. Respect IPAMSK.
6. Ditto for KNISER.

[Comments]

[Keywords]
TGHA

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
None

[MCO attributes]
None

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	344	COMMON	.CPEPW
		DTESER	DTEAPP,SETPP
		MOSSER	DIAGT1,DIAGVM
		KLPSER	PPDMFL,PPDMON
		KNISER	KNIMOF,KNIMON

703A	


[End of MCO 13284]

MCO: 13286		Name: JJF		Date: 24-Feb-87:11:15:49


[Symptom]
No way for customers to enter their own specially-defined PIDs
in the system PID table.

[Diagnosis]
Make room.

[Cure]
Add two negative entries to the system special PID table in COMMON.

[Comments]
For EWS, ISWS, and NETSPL

[Keywords]
SPECIAL PIDS

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
Retracted

[MCO attributes]
New development MCO

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	345	COMMON	.GTCSD

703A	


[End of MCO 13286]

MCO: 13289		Name: JAD		Date: 26-Feb-87:12:39:36


[Symptom]
Potential software checksum errors on files which span more than
one unit in a structure.

[Diagnosis]
MCO 11937 fixed problems with returning the "DA" resource on the
"wrong" (original) unit when a file switched to a new unit.  However,
DEVUNI contains the original unit rather than the new unit, causing
the checksum to be computed incorrectly.

[Cure]
Call STORU in OUTGRP after calling WRTPTR.

[Comments]
Simultaneous update on a multi-unit structure still has several
holes, primarily due to the fact we own the DA on unit "A", call
TAKCHK/TAKBLK, and believe owning the DA on unit "A" will prevent
the job from looking at unit "B" once we've stored the unit change
pointer.  Tain't necessarily so.

Probably requires a per-file allocation resource (a few bits in the
NMB?) to prevent this type of race condition.  One of the guys at
ADP threatened (promised?) to look at it and get back to me if he
came up with a reasonable solution.

[Keywords]
SIMULTANEOUS UPDATE
MULTI-UNIT STRUCTURES

[Related MCOs]
11937

[Related SPRs]
None

[MCO status]
None

[MCO attributes]
New development MCO

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	345	FILIO	OUTGRP

703A	


[End of MCO 13289]

MCO: 13290		Name: TL		Date: 27-Feb-87:09:10:10


[Symptom]
1. "Modems" don't work on the KS.
2. KS Autobaud code is different from KL/ANF.  
      a. It supports fewer speeds.
      b. It's more susceptible to noisy lines, and defective modems.
3. ANF autobaud sometimes fails at 1200 (^C, even)
4. ANF autobaud at 600 is hopelessly unreliable.

[Diagnosis]
1. They aren't real modems, and they don't obey the EIA spec.
	a. Devices which raise (or drop) RLSD and RI simultaneously lose.
	b. Devices which pulse RI only once, instead of at 20 Hz like
	   the Bell system, lose.
2. Once it did more than the others.  Now, it's older and does less.
3. The "Ignore" next character bit isn't set for ^C even 1200 in
   high-speed mode.  This can cause a low-speed switch, which makes
   it even less likely that we will recognize the ^C next time.
4. The original empirical studies made it look like it might work; 
   analysis shows that we are asking the UART to sample a bit while it
   is in transition.  Since it's the start bit, it's quite hopeless.

[Cure]
1. Violate yet more of the EIA spec.  DEC's way is better.
2. Replace the autobaud code with the algorithm used by ANF-10 and RSX-20F.
   Search at two speeds, and support all the same speeds as the others.
3. Set the bit.
4. Remove 600 baud from the tables in ANF.

[Comments]
Makes reverse LAT connections work.  Simplifies the SPD.  

SPD revisions: The SPD references the original KS behavior (which changed
long ago).  Remove the KS-specific references, and merge with the KL/ANF
description.  Add support for 1800 and 2400 bps.  Footnote 4800 bps
to show that it is detected only with ^C, not CR.

This MCO causes ANF, the KS, and -20F to all behave the same way--years
late.

Thanks to Paul Mead and Brian Lilja of the CSSC/CS for testing this code
and assisting with remote debugging.  KS4097 STILL doesn't have the
facilities for this, though the request has been pending since 14-Apr-86!

[Keywords]
VAXination
KS
Autobaud
LAT
Modem Control

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
None

[MCO attributes]
New development MCO

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
703A		DZINT	LOTS,AND,LOTS
704		DNTTY	LSPTAB,HSPTAB


[End of MCO 13290]

MCO: 13291		Name: RJF		Date: 27-Feb-87:13:33:41


[Symptom]
Checkpointing a file using the FILOP. update-rib function can
cause a user specified blocking-factor to be messed up.

A FILOP. update-rib doesn't work correctly if the current buffer is
exactly full.

[Diagnosis]
The FILOP. code assumes that its implied "close" logic was
required to output the last buffer.  It then reads the last block of
the file back into the current output buffer so the user may append to
it.  A program that has done an OUT before the FILOP. update-rib doesn't
want this to happen.  The OUT was done to advance to the next block of
the file.

The second problem happens if a FILOP. update-rib is done when the
current buffer is exactly full.  A virgin buffer ring header is returned
that can cause the user program all kinds of problems.

[Cure]
For the first problem, only read the last block of a file in if
the FILOP. update-rib function caused it to be written.

For the second problem, always call OUTF to unvirginize the ring header,
even if we don't have to merge any data into the current output buffer.

[Comments]
I think the only thing that does this stuff is COBOL.

[Keywords]
FILOP.
UPDATE-RIB
CHECKPOINTING
 .FOURB

[Related MCOs]
None

[Related SPRs]
35229

[MCO status]
None

[MCO attributes]
New development MCO

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	342	S
703A		UUOCON
		FILUUO


[End of MCO 13291]

MCO: 13292		Name: JAD		Date:  2-Mar-87:12:20:40


[Symptom]
MCO 13277 got lost

[Diagnosis]
Don't ask me, guess that's why we're still seeing IMEs and etc.
when SABs start on a page boundary.

[Cure]
Try again.

[Comments]
I'm concerned.  We seem to have lost a number of MCOs over the past
few weeks.  Is someone getting careless, or is there a bug in the
procedure?

[Keywords]
CORGRS

[Related MCOs]
13277

[Related SPRs]
None

[MCO status]
Restricted distribution

[MCO attributes]
New development MCO

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	345	ONCMOD	CORGRS
703A	


[End of MCO 13292]

MCO: 13299		Name: JJF		Date:  5-Mar-87:13:39:03


[Symptom]
No way to easily patch in a customer system-wide PID.

[Diagnosis]
No negative entries in system PID table.

[Cure]
Add, under FTPATT, two negative words in the system PID table.

[Comments]
13286 gotten right.  This allows the stupid Galaxy 2 QUASAR needed for
NETSPL to work.


[Keywords]
Customer PIDs

[Related MCOs]
13286

[Related SPRs]
None

[MCO status]
None

[MCO attributes]
New development MCO

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	346	COMMON	.GTCSD

703A	


[End of MCO 13299]

MCO: 13300		Name: JAD		Date:  5-Mar-87:14:17:01


[Symptom]
Probable KAF stopcode during illegal UUO (or any call to GIVRES).

[Diagnosis]
GIVRES calls DTXFRE which thinks it is being called with a DDB address
in F.  Wrong.  F contains zero at this point, so a LDB J,PJOBN gets the
contents of DEVJOB (= location 22 = BOOTWD).  Eventually we'll call
SRFREE which calls OWNIP which loops waiting for the CX resource.

Even if J didn't contain garbage, we PUSH the ACs on the stack in the
reverse of the order SRFRDT expects.

[Cure]
Remove the LDB J,PJOBN from DTXFRE, and add a PUSH/POP of T1 so the
stack is phased as SRFRDT expects.

[Comments]
This is the cause of the KAFs we've seen under the A/P monitor.

Did ANYONE test this code?

[Keywords]
DTXFRE
KAF

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
None

[MCO attributes]
New development MCO
KL10 only

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	346	DTASER	DTXFRE

703A	


[End of MCO 13300]

MCO: 13301		Name: JAD		Date:  5-Mar-87:16:13:16


[Symptom]
Stopcode UIL, EUE, etc.

[Diagnosis]
DEFINEing breakpoints and REMOVEing them without INSERTing them
causes the replaced monitor instruction to get zeroed since it was
never really replaced (SNPRMI contains zero).

[Cure]
Call CKBINS at REMBPS and don't try to remove uninserted breakpoints.
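
The failure mode and the guard can be sketched in Python.  This is only a
model under stated assumptions: the class and function names are hypothetical,
`saved` stands in for SNPRMI, and the real check is done by CKBINS in the
monitor, not by code like this.

```python
class Breakpoint:
    """A defined breakpoint; `saved` plays the role of SNPRMI."""

    def __init__(self, address):
        self.address = address
        self.saved = 0          # replaced instruction; zero until inserted
        self.inserted = False


def insert(memory, bp, trap):
    """Save the monitor instruction and put the trap in its place."""
    bp.saved = memory[bp.address]
    memory[bp.address] = trap
    bp.inserted = True


def remove(memory, bp):
    """Restore the saved instruction, but only if one was really saved."""
    if not bp.inserted:
        return                  # the fix: nothing was replaced, so touch nothing
    memory[bp.address] = bp.saved
    bp.inserted = False
```

Without the `inserted` test, a DEFINE followed directly by a REMOVE would
write the zero in `saved` over a live instruction.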

[Comments]
SNOOP. is a good way to crash the system, but this seems a little
excessive.

[Keywords]
SNOOP. UUO

[Related MCOs]
None

[Related SPRs]
35710

[MCO status]
Restricted distribution

[MCO attributes]
New development MCO

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	346	UUOCON	REMBPS

703A	


[End of MCO 13301]

MCO: 13308		Name: JAD		Date: 10-Mar-87:12:10:38


[Symptom]
Trying to do I/O to a file located on a structure in the job's search
list AFTER a structure which has a RIB error for the directory
fails with bizarre error codes.  Specifying the structure name
explicitly or removing the structure with the RIB error for the
directory from the search list gets around the problem.

[Diagnosis]
STRLUP tries to read the UFD RIB to see if the file desired is
contained in that directory.  If the read of the RIB fails, I/O
error bits are left over in S.  Eventually they get stored in
DEVIOS (after the file is found on some subsequent structure)
which will cause the first I/O operation on that file to fail.

[Cure]
Clear IOIMPM/IODTER/IODERR at SCNSTR before branching back to
STRLUP to try the next structure in the search list.
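
A rough Python model of the search loop, assuming a hypothetical `read_ufd`
callback that returns (found, error bits).  `ERROR_BITS` stands in for
IOIMPM/IODTER/IODERR; the actual bit values are not taken from the monitor.

```python
ERROR_BITS = 0b111  # placeholder for IOIMPM/IODTER/IODERR


def lookup(search_list, filename, read_ufd):
    """Scan the search list; return (structure, status) for the first hit."""
    status = 0
    for structure in search_list:
        found, errors = read_ufd(structure, filename)
        status |= errors
        if found:
            return structure, status
        # The fix: drop error bits left by this structure before trying
        # the next one, so they cannot end up in the file's status later.
        status &= ~ERROR_BITS
    return None, status
```

With the clearing step removed, a RIB error on an early structure would be
returned with a file found cleanly on a later one.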

[Comments]
A really weird bug Eklund found while DSKB was falling apart
(I guess the HDA is 3 months old now, eh?  Time for more Elmer's
glue to hold the heads on.).

We actually SELL these disks?

[Keywords]
RIB ERRORS
LOOKUP
DSK

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
None

[MCO attributes]
New development MCO

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	347	FILFND	SCNSTR

703A	


[End of MCO 13308]

MCO: 13309		Name: JAD		Date: 10-Mar-87:12:32:35


[Symptom]
Allocating more blocks to a file doesn't always use the expected
unit (i.e., the unit with the most free blocks) of a multi-unit
structure.

[Diagnosis]
T2 trashed before comparing it with UNITAL.

[Cure]
MOVE T2,(P) before CAMG.

[Comments]

[Keywords]
ALLOCATION
UNITAL

[Related MCOs]
None

[Related SPRs]
35711

[MCO status]
None

[MCO attributes]
New development MCO

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	347	FILUUO	UPDA1A

703A	


[End of MCO 13309]

MCO: 13318		Name: RDH/RCB		Date: 17-Mar-87:02:56:59


[Symptom]
Problems with DECnet and DDPs:

	1) DDPs don't always notice that the -10 has crashed, thus wedging
	   themselves.
	2) DDPs can't be configured to start up in a useful state.
	3) DECnet can't handle DDPs that are in a useful state at system
	   startup time.
	4) DECnet sometimes wedges circuits when processing strange state
	   transitions.

[Diagnosis]
	1) No code to detect dead -10.
	2) No configuration option.
	3) No code.
	4) Not recycling the circuit fully enough.

[Cure]
	1) Add code.
	2) Add LnDDP to set line block 'n' to be a DDP ab initio. (.P11)
	3) Add code.
	4) Cycle the circuit.

[Comments]
SCCed.  This is the result of RDH's visit to CCCC in Sweden.

[Keywords]
DECnet
DDP

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
Checked
Restricted distribution

[MCO attributes]
None

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	347	DNDCMP	DDPSER,DDPSI0,DDPOS0,DDPOS1,DDPIS0
703A		DNLBLK	LB.ST2
		NETDEV	NDDPCI,DDPKD7,DDPKD8
		DNADLL	DDIINC,DDIINE
		ROUTER	RTRCC1,RTRCC2,RTCINI


[End of MCO 13318]

MCO: 13347		Name: RCB		Date:  4-Apr-87:00:39:57


[Symptom]
Problems with DIE:

1)	JOB STOPCDs almost always reload the system.
2)	Potential halts trying to figure out if we want DDT.
3)	PERISH is needlessly inefficient.

[Diagnosis]
1)	Somebody wiped the PI status in S which is used by
	ZAPJOB to see whether we need to reload.
2)	Pushing onto the possibly corrupt stack while we still hold the
	DIE interlock.
3)	Somebody didn't realize that being called by an XPCW to section
	zero means never having to guess if you're addressable.

[Cure]
1)	Obtain the PI status again in ZAPJOB.
2)	Don't use the stack to preserve T1, just fetch it from the
	crash ACs again when we're done with it.
3)	Rearrange the PSECTs around PERISH, and save an instruction.

[Comments]
SCCed.  You'd think somebody would have complained about problem #1,
wouldn't you?

[Keywords]
DIE

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
Checked

[MCO attributes]
None

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	352	ERRCON	PERISH,DIE01,ZAPJOB
703A	


[End of MCO 13347]

MCO: 13348		Name: KBY		Date:  4-Apr-87:11:37:59


[Symptom]
IME deleting a section with PAGE. UUO .PAGSC.

[Diagnosis]
Calling KILSEC with junk in LH(T1).

[Cure]
MOVE-->HRRZS.
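
For readers unfamiliar with the half-word instructions: HRRZS keeps the right
half of a 36-bit word and zeroes the left half.  A small Python illustration
of that effect on a value (the function name is just a label for the sketch,
and the operands of the real instruction are not shown in this MCO):

```python
HALF = 0o777777  # mask for one 18-bit half word


def hrrzs(word):
    """Return the word with its left half cleared and right half unchanged."""
    return word & HALF


def left_half(word):
    """Return the left 18 bits of a 36-bit word."""
    return (word >> 18) & HALF
```

For example, a T1 holding 0o123456000003 has junk 0o123456 in its left half;
after the equivalent of HRRZS only the 3 remains.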

[Comments]
The code in 7.04 is different (and fixed for this bug), partly due to MHS.
This is perhaps not the best fix, but it's the quickest and safest.  IME130.
I don't understand why this doesn't happen every time.

[Keywords]
Huh?

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
Restricted distribution

[MCO attributes]
None

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
703A		VMSER	CHGSC5


[End of MCO 13348]

MCO: 13353		Name: RJF		Date:  7-Apr-87:12:01:50


[Symptom]
Two problems.

1) The FILOP. .FOAPP (Append) function didn't merge in the last block of a
   file into the user's output buffer.

2) The FILOP. .FOAPP function didn't return a virgin ring header when it  
   created a new file.  Some programs expect this.

[Diagnosis]
MCO 13291 caused both problems.  However the second worked before
MCO 13291 only because of a bug.

[Cure]
First change the name of the UP.URO bit to UP.MLB (Merge Last Block).
Then set the bit before falling into FOPN9B from the .FOAPP code.  This fixes
problem 1.

Add a test for a zero length file to FOPN9B so we don't unvirginize the buffer
ring header when writing to new or nul files.

[Comments]
Hope this fixes some of the MAIL problems, but I don't know why MAIL
wouldn't fail more consistently.

[Keywords]
FILOP.
APPEND
 .FOAPP

[Related MCOs]
13291

[Related SPRs]
35229

[MCO status]
None

[MCO attributes]
New development MCO

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	353	S
703A		UUOCON
		FILUUO


[End of MCO 13353]

MCO: 13375		Name: BAH		Date: 30-Apr-87:10:13:25


[Symptom]
GALAXY doesn't build.

[Diagnosis]
Some symbols were left out of the UUOSYM source for autopatch.

[Cure]
Define %LDOCS, .GTNXM, .GTBTX, .STPCP, .DCXSF.

[Comments]
These edits were required for Autopatch #16 and have been given
to Buzz.  I just needed to get the source on the black packs updated.

[Keywords]


[Related MCOs]
None

[Related SPRs]
None

[MCO status]
Restricted distribution

[MCO attributes]
None

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
703A		UUOSYM


[End of MCO 13375]

MCO: 13381		Name: JMF		Date:  5-May-87:10:12:54


[Symptom]
SET NOMESSAGE responds with ?No core assigned.

[Diagnosis]
Bits are wrong in UNQTB2. NOCORE is a right half bit.

[Cure]
Fix bits.

[Comments]
Wonder why this hasn't generated an SPR?

[Keywords]
NOMESS

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
None

[MCO attributes]
New development MCO

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	356	COMMON	SNAMES
703A	


[End of MCO 13381]

MCO: 13386		Name: JAD		Date:  7-May-87:10:45:24


[Symptom]
SYSUNI chain gets loopy (2 of 2).

[Diagnosis]
MCO 13382 didn't go far enough in finding all the holes in the
code to twiddle with the SYSUNI and SYSDET chains.

[Cure]
Take a giant step and remove the half dozen separate chunks of
code which twiddle the chains, and make subroutines to do the
appropriate foolishness.  The subroutines can be made smarter
so we don't loop the chain by trying to attach a unit twice,
for example.
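
Why a double attach loops the chain, and what a "smarter" subroutine might
check, can be sketched in Python.  The names are hypothetical and the chain
is modelled as a simple singly linked list; this is not the FILIO code.

```python
class Unit:
    def __init__(self, name):
        self.name = name
        self.next = None


def on_chain(head, unit):
    """Walk the chain and report whether the unit is already linked."""
    node = head
    while node is not None:
        if node is unit:
            return True
        node = node.next
    return False


def attach(head, unit):
    """Link the unit at the front of the chain; return the new head.

    Attaching a unit that is already on the chain is a no-op.  Without the
    check, the second attach would point the unit's link back into the
    chain ahead of itself and make the chain circular.
    """
    if on_chain(head, unit):
        return head
    unit.next = head
    return unit
```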

[Comments]
FINALLY solves the KAFs on CI disk failover.

Someone send me some hate mail so I'll Autopatch this crud.

[Keywords]
SYSUNI
LOOPS

[Related MCOs]
13382

[Related SPRs]
None

[MCO status]
None

[MCO attributes]
New development MCO

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	357	FILIO	LOTS
703A	


[End of MCO 13386]

MCO: 13401		Name: KDO		Date: 19-May-87:15:27:31


[Symptom]
ANF-10 file transfers don't (2 of 2).

[Diagnosis]
None, but code is wrong.

[Cure]
MCO 13233 removes an AOS (P) which causes double skip returns.
Remove the other AOS (P) also.
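
The skip-return convention behind this bug, modelled in Python as a sketch:
PUSHJ stacks the address after the call, each AOS (P) bumps that stacked
address by one, and POPJ resumes there.  The helper names are made up for
the illustration.

```python
def call(routine, call_site):
    """Model PUSHJ/POPJ: return the address execution resumes at."""
    stack = [call_site + 1]   # PUSHJ pushes the address after the call
    routine(stack)
    return stack.pop()        # POPJ returns to the stacked address


def aos_p(stack):
    """Model AOS (P): add one to the return address on top of the stack."""
    stack[-1] += 1


def non_skip(stack):
    pass                      # plain return to call+1


def single_skip(stack):
    aos_p(stack)              # intended skip return to call+2


def double_skip(stack):
    aos_p(stack)
    aos_p(stack)              # the bug: one AOS too many, returns to call+3
```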

[Comments]

[Keywords]
ANF-10
file transfer

[Related MCOs]
13233

[Related SPRs]
35721

[MCO status]
None

[MCO attributes]
New development MCO

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	360	NETSER	NTDSIB

703A		NETSER	NTDSIB


[End of MCO 13401]

MCO: 13407		Name: DPM		Date: 26-May-87:07:55:16


[Symptom]
A job has a "program to run" set and is not supposed to return to monitor
level.  If the program running does a CTX. UUO to save the current
context and create a new one without running another program, the job
will get stuck in the captive program with no way to return to the old
context.

[Diagnosis]
It's a one way street.  Had the program saved the old context and run
a program in the new context, the implied restore at program exit time
would have cleaned things up.  CTXSER doesn't know that's what is needed,
so things never happen automatically.

[Cure]
Disallow context saves from a captive program when no RUN UUO is being
done on behalf of the job.  New error code CXCCC%==26, cannot create
context from a captive program. 

[Comments]

[Keywords]
CONTEXT

[Related MCOs]
11102

[Related SPRs]
None

[MCO status]
Restricted distribution

[MCO attributes]
New development MCO
Documentation change
UUOSYM change

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	361	CTXSER	LGLCHK
703A		UUOSYM	CXCCC%


[End of MCO 13407]

MCO: 13413		Name: DPM		Date: 29-May-87:05:00:22


[Symptom]
     Stopcode IME when allocating non-zero section core after a series
of allocates and deallocates.

[Diagnosis]
     At GVFW11, a section number is set up based on the section number
of  the  chunk  pointed  to  by AC 'R'.  The instruction which handles
section numbers is under an FTMP conditional when it should  be  under
an  FTXMON  conditional.   FTMP controls the assembly of SMP features.
FTXMON controls extended addressing features.

[Cure]
     Change the FTMP conditional to FTXMON.

[Comments]

[Keywords]
XADDR

[Related MCOs]
None

[Related SPRs]
35729

[MCO status]
None

[MCO attributes]
None

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	362	VMSER	GVFW11

703A	


[End of MCO 13413]

MCO: 13414		Name: DPM		Date: 29-May-87:07:23:06


[Symptom]
     Stopcode NIJ when sending a tape labeler message to PULSAR.

[Diagnosis]
     We understand how the problem happens, but not  why  it  happens.
If  we  get  to LBLSND with J containing an invalid job number, an NIJ
stopcode will result when sending an  IPCF  packet  to  PULSAR.   This
behavior  is  most  infrequent  and  cannot  be  reproduced  at  will.
However, it is probably in the best  interests  of  our  customers  to
prevent  the  system failures rather than allowing them to occur until
the exact cause is known.

[Cure]
     In LBLSND, reset J with the job number of the owner of  the  tape
drive.  Also remove a redundant LDB J,PJOBN## at LBLPOS which would no
longer be needed with this change installed.

[Comments]

[Keywords]
LABELED TAPES

[Related MCOs]
None

[Related SPRs]
35985

[MCO status]
None

[MCO attributes]
None

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	362	TAPUUO	LBLPOS,LBLSND

703A	


[End of MCO 13414]

MCO: 13418		Name: JMF		Date:  1-Jun-87:10:21:42


[Symptom]
IME

[Diagnosis]
MAPHGH sometimes gets called in section 1 and does indexing
with junk in the left half of an AC.

[Cure]
Use another AC which doesn't contain the junk in the left half.

[Comments]
It mystifies me why we haven't seen this around here, but it
can clearly happen.

[Keywords]
IME
LOCK

[Related MCOs]
None

[Related SPRs]
35613

[MCO status]
Restricted distribution

[MCO attributes]
New development MCO

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
703A		CORE1	MAPHGH


[End of MCO 13418]

MCO: 13419		Name: KDO		Date:  2-Jun-87:05:10:24


[Symptom]
Extended error status contains junk.

[Diagnosis]
PDVESE (in DEVESE) should be cleared in more places.

[Cure]
Clear PDVESE for OPEN, INIT, and FILOP. UUO's.

[Comments]
Doesn't solve all of LPTSPL's error recovery problems, 
but it does help.

[Keywords]
Error status
PDVESE
DEVESE

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
None

[MCO attributes]
None

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	362	UUOCON	UINIT0

703A		UUOCON	UINIT0


[End of MCO 13419]

MCO: 13421		Name: JMF		Date:  3-Jun-87:10:08:47


[Symptom]
STOPCD NPJ

[Diagnosis]
CTXSER insists that a job must have a PDB when a swapout
finishes but this "ain't necessarily so" to quote Gershwin whilst
a logout is in progress.

[Cure]
Just ignore the CTX stuff if there's no PDB.

[Comments]
Winkler has been hacking around this since 7.03 went to field test.

[Keywords]
NPJ

[Related MCOs]
11102

[Related SPRs]
None

[MCO status]
None

[MCO attributes]
Multi CPU only

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
703A	363	SCHED1	FINOUT
704		CTXSER	CTXSCD


[End of MCO 13421]

MCO: 13422		Name: JMF		Date:  4-Jun-87:03:55:01


[Symptom]

If doing a LOOKUP of a nonexistent file from a non-zero section causes
a UFD to be read, the result is
?Illegal address in UUO at user PC xxxxxx.

[Diagnosis]
PCS is getting zapped by the monitor I/O to read the UFD.

[Cure]
Call SSPCS.

[Comments]
Maybe someone really is using extended addressing (Read that as Edgecomb).

[Keywords]
Ill add
Non-zero section
Monitor I/O

[Related MCOs]
None

[Related SPRs]
35986

[MCO status]
None

[MCO attributes]
New development MCO

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	363	FILIO	SETBS2
703A	


[End of MCO 13422]

MCO: 13440		Name: JAD		Date: 16-Jun-87:07:56:50


[Symptom]
Potential for KLIPA microcode confusion on channel errors.

[Diagnosis]
I always thought the last of the 3 words transferred into the port
(.PCPCB, .PCPIA, and .PCIVA) was reserved.  Turns out the microcode
expects .PCIVA to contain the address of word 1 in the channel logout
area.  Thus, KLPSER never fills in this address, causing the KLIPA
microcode to fetch channel status from physical addresses 0 and 1
(NOT the ACs, but the first two physical addresses which are not
normally addressable).

[Cure]
Change the name of .PCIVA to reflect its new usage and fill in
that new word (.PCAL1) with the address of word 1 in the channel
logout area.

[Comments]
May improve recovery on channel errors, but I don't expect this
has anything to do with ADP's problems with error recovery.  Then
again, it might.

Unusual what you find when you're reading microcode listings when you're
bored.

[Keywords]
KLIPA
PCB

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
None

[MCO attributes]
New development MCO
KL10 only

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	364	KLPPRM	.PCAL1
		KLPSER	PCBINI

703A	


[End of MCO 13440]

MCO: 13447		Name: KDO		Date: 22-Jun-87:23:53:33


[Symptom]
LATSER doesn't (4 of n).

[Diagnosis]
LATSER slobbers on section 3 space:

1. Slot block index numbers start with one (not zero).  When used as an index
   number into the SBVECT array, the first word of the table is never used, and
   the address of the last SB is written in some other block.

2. The table of circuit block addresses, CBVECT, can do the same thing.

3. NFRSBQ, the table of bits used to allocate slot block indices, is initialized
   to minus one, indicating that all indices are available.  That's fine as long
   as the number of remote terminals is evenly divisible by 36.  If not, SBALOC
   may see a one bit for which there is no space in SBVECT, allocate space for
   a slot block, and save the address in the contents of SBVECT + "n".

[Cure]
Subtract one before saving the addresses in SBVECT and CBVECT.
Build a proper bit mask for the last word in NFRSBQ at initialization time.
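
Both fixes can be sketched in Python.  The assumptions here are 36-bit words
with bit 0 as the most significant bit, and free indices allocated from the
high-order end of each word; the function names are hypothetical and do not
appear in LATSER.

```python
WORD_BITS = 36  # PDP-10 word size


def slot_offset(index):
    """Map a one-based slot block index to a zero-based table offset."""
    if index < 1:
        raise ValueError("slot block indices start at one")
    return index - 1


def build_free_mask(n_slots):
    """Return the words of a free-index bit table covering n_slots indices.

    Full words are all ones.  The last word has ones only for indices
    that really exist, so the allocator never sees a bit with no table
    entry behind it.
    """
    full_words, leftover = divmod(n_slots, WORD_BITS)
    all_ones = (1 << WORD_BITS) - 1
    words = [all_ones] * full_words
    if leftover:
        # Keep the top `leftover` bits and clear the rest of the word.
        words.append(all_ones & ~((1 << (WORD_BITS - leftover)) - 1))
    return words
```

With 40 remote terminals, for instance, the table needs two words and only
the first four bits of the second may be set.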

[Comments]
I can't be overdrawn, I still have checks in my checkbook.

[Keywords]

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
None

[MCO attributes]
New development MCO

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	365	LATSER	LATINI

703A		LATSER	LATINI


[End of MCO 13447]

MCO: 13455		Name: JAD		Date: 25-Jun-87:09:56:33


[Symptom]
The first word of a block of free core immediately following a
label DDB is mysteriously zeroed whenever the label process
does (dump mode) I/O to process tape labels.

[Diagnosis]
TAPUUO defines a prototype label DDB which doesn't exactly
match the regular magtape DDB definition.  Several words are
incorrectly documented, but more important, the prototype DDB
is SHORTER than the regular DDB by several words.  When dump
mode I/O is done, the first thing TAPUUO does is to zero the
word TDVREM in the DDB.  Well, guess what?  The prototype DDB
ends just BEFORE word TDVREM.  If the prototype DDB ends on
the last word of a 4-word block (which it does in 7.03 with
FTMP turned off), the first word of the next 4-word block will
get zeroed.  This may likely be a disk I/O request block (the
first word of which contains a link to the previous and next DRBs).

[Cure]
Get rid of the prototype label DDB.  Use the prototype magtape
DDB and set/clear a few bits in the label DDB copied from the
prototype magtape DDB so the label DDB is "correct".
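
A toy Python model of the overrun, with made-up DDB lengths and offsets (the
real values are not given in this MCO).  Free core is handed out in 4-word
blocks, and the dump-mode setup zeroes word TDVREM relative to the DDB base
whether or not the DDB is long enough to contain it.

```python
CHUNK = 4  # free core is allocated in 4-word blocks


def words_allocated(length):
    """Round a request up to a whole number of 4-word blocks."""
    return -(-length // CHUNK) * CHUNK


def clear_tdvrem(core, ddb_base, tdvrem_offset):
    """Model the dump-mode setup step: zero word TDVREM of the DDB."""
    core[ddb_base + tdvrem_offset] = 0


def demo(ddb_length, tdvrem_offset):
    """Put a DDB at address 0 with another block right behind it.

    Returns the first word of the neighbouring block after TDVREM has
    been cleared; 0o777777 means the neighbour survived.
    """
    size = words_allocated(ddb_length)
    core = [0] * size + [0o777777] + [0] * (CHUNK - 1)
    clear_tdvrem(core, 0, tdvrem_offset)
    return core[size]
```

In this model `demo(8, 8)` (short DDB ending exactly on a block boundary)
wipes the neighbour's first word, `demo(6, 6)` lands harmlessly in padding,
and `demo(12, 8)` (full-length DDB) stays inside the DDB.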

[Comments]
A MAJOR contributor to Pan Am's crashes.

Question of the day:  Why does a regular magtape DDB have DVLNG
set but not the label DDB?

[Keywords]
LABEL DDBS

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
None

[MCO attributes]
New development MCO

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	366	TAPUUO	LPROTO,TPLBGA

703A	


[End of MCO 13455]

MCO: 13456		Name: DPM		Date: 26-Jun-87:05:51:02


[Symptom]
     CTX. UUO function .CTDIR (return  directory  of  contexts)  fails
with an illegal job number error.

[Diagnosis]
     AC 'M' is incorrectly advanced beyond the word  in  the  argument
block  containing  the target job number.  When this is corrected, the
job executing the UUO may hang trying to get the MM resource on an SMP
system.   This  happens  because  the resource is already owned by the
job.  On single CPU systems, the  job  performing  the  UUO  may  call
SCDCHK.   During  the  interval  in  which the job is not running, the
target job may log out, thus deleting  its  PDB  and  context  blocks.
Later references to those data structures may cause IMEs or unexpected
results.

[Cure]
     Do not advance the pointer to the  user's  argument  block.   The
call  to GETEW1 will do that automatically.  Interlock PDB and context
block scanning by using the CX resource, not only in the case  of  the
 .CTDIR function,  but in other places as well.  That was the intended
purpose of the CX resource, however, the code was written  before  the
resource existed and was never updated afterwards.

[Comments]

[Keywords]
CONTEXTS

[Related MCOs]
11102

[Related SPRs]
None

[MCO status]
None

[MCO attributes]
None

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	366	CTXSER	INIQTA,REDQTA,SETQTA,XITQTA,DIRECT,DIRINI,DIRRST,INFORM,INFRST

703A	


[End of MCO 13456]

MCO: 13457		Name: DPM		Date: 26-Jun-87:06:41:00


[Symptom]
     After a series of context commands to switch between  and  delete
at  least  one  context, an NNF, IME or KAF stopcode may result if the
original context had any opened files.

[Diagnosis]
     When an adjacent context is created  (one  without  a  superior),
DDBs for any opened files in the old context are propagated to the new
context.  In the normal case, the pointers to the DDBs are  zeroed  to
prevent  file  read  and  write  counts  from  being skewed.  When one
context executes the CTX. UUO function (or equivalent monitor command)
to   delete   another  context,  the  DDB  pointers  are  not  zeroed.
Subsequent calls to GETMIN as part  of  the  context  block  switching
sequence  cause the DDBs and associated access table information to be
deleted.  Later, when the original context is continued,  FILSER  uses
the  contents  of  DEVACC in the DDB to find access table information.
At this point, the access blocks have been recycled and the link words
changed.   Some flavor of a stopcode will result.  If it happens to be
an NNF, the monitor tries to continue.  This will never  work  because
NMB  scanning  happens  in a very low level routine which has no error
return.  An IME usually follows.

[Cure]
     Zero USRHCU and .USCTA when switching context blocks  for  delete
functions.   Also  make  the  NNF  stopcode  be  a  STOP rather than a
continuable DEBUG stopcode.

[Comments]

[Keywords]
CONTEXTS

[Related MCOs]
11102

[Related SPRs]
35558, 35559

[MCO status]
Restricted distribution

[MCO attributes]
Documentation change

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	366	CTXSER	DELCTX
		FILUUO	S..NNF

703A	


[End of MCO 13457]

MCO: 13459		Name: DPM		Date: 26-Jun-87:07:33:25


[Symptom]
     A disconnect of a device from  an  MPX  channel  fails  with  the
unknown device error code from the CNECT. UUO.

[Diagnosis]
     Two problems exist.   First,  the  user's  job  number,  not  the
job/context  handle  is  stored in the DDB when connecting a device to
the MPX channel.  Disconnects fail because the standard DDB  searching
routines expect a JCH, not just a job number in the DDB.  Second, once
the correct DDB is located, the call to SETCPP to put the job  on  the
CPU which owns the device fails, resulting in the device not available
error code.  Upon inspection, of  the  four  callers  of  SETCPP,  two
expect  the  non-skip  return   to  mean  success, while the other two
expect it to indicate a dead CPU.

[Cure]
     Set up AC 'J' with the job/context handle prior to assigning  the
device.  This will allow the disconnect function to find the DDB.  Also 
change the SETCPP routine to take the  non-skip return if the CPU  which 
owns the device is dead, and change the callers of SETCPP to reflect 
the change.

[Comments]

[Keywords]
CONTEXTS

[Related MCOs]
11102

[Related SPRs]
35728

[MCO status]
None

[MCO attributes]
None

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	366	COMMON	SETCPP
		CPNSER	SETCPP
		MSGSER	CNDEV4
		UUOCON	RELEA1

703A	


[End of MCO 13459]

MCO: 13460		Name: DPM		Date: 26-Jun-87:08:22:02


[Symptom]
     When using contexts in conjunction with eight or fewer assigned
logical names, an IME stopcode or unpredictable behavior in the use or
existence of those logical names may result.

[Diagnosis]
     The PDB contains a pointer to a short assigned logical name table.
This table came into being during mid 5-series monitor development.
It was supposed to facilitate high speed DDB searching.  This table
was searched prior to scanning the DEVLST chain.  Prior to 7-series
monitors, all DDBs were on the DEVLST chain, so a lot of CPU cycles
were saved by using the job's logical name table.  With the advent of
funny space DDBs, the table's usefulness diminished somewhat.  When
contexts came into being, it was stated that funny space DDBs would be
saved and restored with other funny space quantities.  In practice,
this never happened unless the job had more than eight assigned
logical names, as the table could accommodate eight DDBs.  CTXSER
doesn't save and restore the table pointer since this would result in
rebuilding the table on each context block switch.  Consequently,
funny space can get corrupted when assigned names are manipulated in
conjunction with context block switching.

[Cure]
     Remove all references to the job's assigned logical  name  table.
This  includes  PDB  location  .PDDVL.  The DEVLST chain is relatively
short because it contains only unit record DDBs.  A substantial amount
of code existed to handle the eight name table and the overhead
incurred probably outweighs the benefits gained by using the table.
Without  the  table,  all funny space DDBs will get saved and restored
across context block switches and potential IMEs are  eliminated.   In
addition, fix other related bugs which prevented these DDBs from being
deassigned if more than eight existed, and  prevented  any  disk  DDBs
from  being  deassigned  if  no  arguments  were given to the DEASSIGN
command.

[Comments]

[Keywords]
CONTEXTS

[Related MCOs]
11102

[Related SPRs]
None

[MCO status]
Restricted distribution

[MCO attributes]
Documentation change

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	366	COMCON
		COMMON
		ERRCON
		FILUUO
		NETSER
		SCNSER
		TAPUUO
		UUOCON

703A	


[End of MCO 13460]

MCO: 13472		Name: KBY		Date:  4-Jul-87:15:06:08


[Symptom]
TWICE crashes the monitor in strange and wondrous ways.

[Diagnosis]
The real problem is, indeed, trying to use PG.IDC to indirect-map
a section which is already indirect mapped.  All that really needs to be done,
assuming legality checks out OK (no indirect loops, etc.), is to put in
the new pointer.  Unfortunately, the code assumes that if a pointer is there
already, it's an independent section pointer and thus all pages in the
section should be returned and the map killed.  The routine to kill all
pages in the section is smart enough to not do anything with an indirect
section, but the map-killer just picks up the pointer, keeps what it
thinks is the physical page number, and returns that page to the
free list.  This thus returns physical page JOB#+M.CPU to the free list;
a relatively low numbered page and thus in 99.99999% of cases, a random
monitor low seg page.  The crash occurs after the page gets re-used and
the monitor next tries to do something with the virtual page which
maps to that physical page.

[Cure]
Don't call KILSEC or ZAPNZM for an indirect pointer.
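
The guard can be sketched in Python as follows.  This is a simplified
model under stated assumptions: SectionPointer, replace_section_pointer,
and the free list are hypothetical names, and releasing the old map page
stands in for what KILSEC and ZAPNZM do for a direct pointer.

```python
from dataclasses import dataclass

@dataclass
class SectionPointer:
    indirect: bool   # True if the slot refers to another section's map
    value: int       # physical map page if direct; not a page if indirect

def replace_section_pointer(slot, new_ptr, free_list):
    """Install new_ptr; release the old map page only for a direct pointer."""
    old = slot.get("ptr")
    if old is not None and not old.indirect:
        free_list.append(old.value)   # the section owned a real map page
    slot["ptr"] = new_ptr
    return slot

# Re-mapping an already indirect section must free nothing.
free_pages = []
slot = {"ptr": SectionPointer(indirect=True, value=7)}
replace_section_pointer(slot, SectionPointer(indirect=True, value=3), free_pages)

# Replacing a direct pointer still releases its map page.
free_direct = []
slot_direct = {"ptr": SectionPointer(indirect=False, value=42)}
replace_section_pointer(slot_direct, SectionPointer(indirect=True, value=3), free_direct)
```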

[Comments]
Gee, Ned, who wudda thunk?

[Keywords]
new TWICE

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
Restricted distribution

[MCO attributes]
KL10 only
PCO required

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	367	VMSER	CHGS11
703A	


[End of MCO 13472]

MCO: 13474		Name: RCB		Date:  6-Jul-87:18:29:07


[Symptom]
MCO 13460 broke TSKs, RDXs, and DDPs.  Actually, they didn't always
work before; 13460 just made it more obvious.

[Diagnosis]
If there is no short logical name table (as after 13460), or if
the user has too many logical names to fit (before 13460), we don't clear
DD%LOG when searching for devices.  This causes the network devices to have
real hiccups when trying to figure out whether we might have exhausted the
DDB chain.

[Cure]
Make sure that the TSTxxx routines get called in ways that they like.

[Comments]

[Keywords]
Not-PHYONLY

[Related MCOs]
13460

[Related SPRs]
None

[MCO status]
None

[MCO attributes]
PCO required

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	367	UUOCON	DDSRC3,DEVLP4
703A	


[End of MCO 13474]

MCO: 13476		Name: DPM		Date:  7-Jul-87:06:15:29


[Symptom]
     After MCO 13456 (PCO 10-703-092), a job will hang in CX wait when
an illegal job number is supplied to CTX. UUO function .CTDIR.

[Diagnosis]
     Prior to MCO 13456, the MM resource was used to interlock PDB and
context  block  scanning.   Obtaining  the  MM  does not require a job
number so validating the job number could happen  after  the  resource
had  been gotten.  The CX is job specific.  Therefore, when an illegal
job number is specified, the user's job will block indefinitely,
waiting for a resource which will never become available.

[Cure]
     Validate the job  number  argument  prior  to  obtaining  the  CX
resource.
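
     The ordering matters because the resource is per job.  A minimal
Python sketch, assuming a hypothetical table of per-job locks (cx_locks)
in place of the CX resource and a made-up JOB_MAX:

```python
import threading

JOB_MAX = 4  # hypothetical job limit, for illustration only
cx_locks = {job: threading.Lock() for job in range(1, JOB_MAX + 1)}

def ctx_directory(job):
    """Model a directory request: validate first, then take the per-job lock."""
    # An illegal job number has no CX resource, so waiting on one
    # would never complete.  Reject it before trying.
    if job not in cx_locks:
        return "illegal job number"
    with cx_locks[job]:
        return f"directory for job {job}"
```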

[Comments]

[Keywords]
CONTEXTS

[Related MCOs]
13456

[Related SPRs]
None

[MCO status]
Restricted distribution

[MCO attributes]
None

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	367	CTXSER	DIRECT

703A	


[End of MCO 13476]

MCO: 13478		Name: RCB		Date:  7-Jul-87:07:29:03


[Symptom]
Entry vectors still don't work right.

[Diagnosis]
MCO 13269 didn't go far enough.

[Cure]
Finish the job.

[Comments]
Famous last words.

[Keywords]
ENTRY VECTOR

[Related MCOs]
13269

[Related SPRs]
None

[MCO status]
Checked

[MCO attributes]
None

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	367	SEGCON
703A		COMCON
		UUOCON


[End of MCO 13478]

MCO: 13479		Name: JAD		Date:  7-Jul-87:07:46:17


[Symptom]
Possible scheduler confusion running HPQ jobs.

[Diagnosis]
If a job selected to run is pre-empted by an HPQ job, SP.CJn
isn't set for the job originally selected to run.  This can lead
to cache confusion, etc., and odd stopcodes when the user mode
stack has gotten confused.

[Cure]
Set SP.CJn in two cases in CLOCK1.  Add a routine to CPNSER
to do the dirty work.

[Comments]
Part of Mike Bisco's scheduler changes which might help fix
the EUE problems seen at Abbott.  Don says put it in . . .

[Keywords]
SCHEDULER
HPQ JOBS

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
Restricted distribution

[MCO attributes]
New development MCO
Multi CPU only

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	367	CLOCK1	CIP6A
		CPNSER	SETSJ0

703A	


[End of MCO 13479]

MCO: 13502		Name: RCB		Date: 20-Jul-87:22:41:47


[Symptom]
TTY DEFER (command or TRMOP.) doesn't always really cause deferred echo.

[Diagnosis]
Missing code to have RECINT notify XMTECH about the desired behavior.

[Cure]
Add the code.

[Comments]
I don't know for sure why Tony only started seeing this recently, since
the code's been broken since 7.03 shipped.

[Keywords]
TTY DEFER

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
Checked

[MCO attributes]
None

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	371	SCNSER	RECIN3
703A	


[End of MCO 13502]

MCO: 13503		Name: RCB		Date: 21-Jul-87:03:08:03


[Symptom]
Fix various problems with TRMOP. and TTY commands.  Some functions are
mis-defined, and return erroneous results.  Some commands don't work as
expected.  Some commands do not act like their parallel functions.

[Diagnosis]
Yes.

[Cure]
Yes.

[Comments]

[Keywords]
TRMOP.

[Related MCOs]
13502

[Related SPRs]
None

[MCO status]
Checked

[MCO attributes]
None

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	371	COMCON	TTCEDT,TTCREM,TTCDFR
		SCNSER	TOPTB1
		COMDEV	TERMCR

703A		COMCON	TTCREM,TTCDFR
		SCNSER	TOPTB1


[End of MCO 13503]

MCO: 13504		Name: RCB		Date: 21-Jul-87:04:49:36


[Symptom]
EUE after MCOs 13460 & 13474.

[Diagnosis]
Missing code to balance the stack.

[Cure]
Add the code.

[Comments]

[Keywords]
ASSIGN
INIT

[Related MCOs]
13474, 13460

[Related SPRs]
None

[MCO status]
Checked
Restricted distribution

[MCO attributes]
PCO required

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
703A		UUOCON	DDBSRC
		SCNSER	TOPDD1,GETDDB


[End of MCO 13504]

MCO: 13507		Name: JAD		Date: 22-Jul-87:09:38:42


[Symptom]
System runs out of free core.  System error queue is full of entries
(which are consuming the free core) but DAEMON apparently isn't taking
any entries from the queue.  DAEMON only processes the first system
error block from a crash file.

[Diagnosis]
Code which adds entries to the system error queue tries to save a
little space in the ERRPT. block by only inserting one code 13 entry
(system error block available) in the ERRPT. block when a system error
block is queued.  Unfortunately, that single code 13 entry may be lost
or overwritten if errors occur too fast for DAEMON to keep up.  Also,
if DAEMON dies while processing the system error queue, it will not
check the system error queue again when a new DAEMON is started.

[Cure]
Forget about the code 13 entries in the ERRPT. block.  Have DAEMON
always do an SEBLK. UUO when it is woken (after the ERRPT. UUO).
This will guarantee the system error queue is processed in a timely
fashion.  The single extra UUO will not add appreciable overhead to
DAEMON.  Always scan the entire system error queue when processing
a crash file.  Requires edit 1020 to DAEMON.
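
The idea of draining the queue on every wake-up, rather than trusting a
single notification per entry, can be sketched in Python.  The names
on_wake and error_queue are hypothetical, and a deque stands in for the
system error queue.

```python
from collections import deque

def on_wake(error_queue, process):
    """Drain the whole queue on each wake-up; no per-entry notice is needed."""
    handled = 0
    while error_queue:
        process(error_queue.popleft())
        handled += 1
    return handled

queue = deque(["seb1", "seb2", "seb3"])
seen = []
count = on_wake(queue, seen.append)
```

A lost or overwritten notification then costs nothing, because the next
wake-up picks up whatever is queued.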

[Comments]
Bad design from day one.  Mea culpa.

[Keywords]
System error blocks
Lost free core

[Related MCOs]
None

[Related SPRs]
36034

[MCO status]
None

[MCO attributes]
New development MCO

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	372	ERRCON	QUESEB
		S	.ERSEB

703A	


[End of MCO 13507]

MCO: 13513		Name: JAD		Date: 27-Jul-87:12:22:06


[Symptom]
Undeserved checksum errors on multiple RIB files.

[Diagnosis]
Code at NXTBL6 in FILIO believes if the current operation is a
read there is no need to update changed RIB pointers.  This isn't
necessarily true, since a previous write may have caused the
checksum to change.

[Cure]
Call PTRTST instead of PTRCUR.  If not update mode, the call
to PTRTST is essentially a call to PTRCUR.  If update mode, PTRTST
will rewrite the changed pointers if necessary.

[Comments]
Takes an odd combination of record sizes, etc., to cause
this problem, which only occurs in extended RIBs.

[Keywords]
UPDATE MODE
CHECKSUM ERRORS
EXTENDED RIBS

[Related MCOs]
None

[Related SPRs]
36001

[MCO status]
None

[MCO attributes]
New development MCO

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	372	FILIO	NXTBL6

703A	


[End of MCO 13513]

MCO: 13521		Name: JMF		Date: 31-Jul-87:03:42:34


[Symptom]
RLT17J doesn't run.

[Diagnosis]
704 edit didn't happen to 703A even though the code is the same
in both monitors.

[Cure]
JUMPE T1,CPOPJ=>JUMPE T1,CPOPJ1##

[Comments]
This probably wouldn't happen if we patched rather than edited on
Wednesday morning. It also reinforces the need to run SMP AP monitors.

[Keywords]
Off-line
Disks

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
Restricted distribution

[MCO attributes]
New development MCO

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
703A	17000	CPNSER	SETCF2


[End of MCO 13521]

MCO: 13524		Name: JAD		Date:  3-Aug-87:09:58:24


[Symptom]
RTTRP UUO too restrictive with respect to EVM.

[Diagnosis]
The RTTRP UUO requires the job to be locked in Exec Virtual Memory
(EVM), even if no references will be done to the user address space
without using PXCT'd instructions.  Only if an indirectly-addressed
CONSO bit mask is supplied, or if "fast mode" (no AC save/context
switching to be done on a real-time interrupt) is specified, does
the job need to be locked in EVM so the monitor can access the user
address space during real-time interrupt level.

[Cure]
Move the test for "locked in EVM" down to the end of the code
which sets up for a RTTRP UUO.  Only test for locked in EVM if an
indirectly-addressed CONSO bit mask is specified, or if "fast mode"
handling is specified.
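
A rough Python sketch of the relaxed test follows.  The function and
argument names are hypothetical; only the two conditions that require
the lock come from the text above.

```python
def rttrp_setup(locked_in_evm, indirect_conso_mask=False, fast_mode=False):
    """Reject the request only when it needs user memory at interrupt level."""
    # All other argument checks would run first (omitted in this sketch).
    needs_evm = indirect_conso_mask or fast_mode
    if needs_evm and not locked_in_evm:
        return "not locked in EVM"
    return "ok"
```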

[Comments]
Gets around the EVM bind which Rockwell complained about
in the SPR.  Since we're moving things around in the address space
for 7.04, there will be much less of a bind.

"Fools rush in . . . "

[Keywords]
EVM
RTTRP
LOCK

[Related MCOs]
None

[Related SPRs]
35659

[MCO status]
None

[MCO attributes]
New development MCO
KL10 only

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	373	RTTRP	RTTRP,RTTRP5,BLKRET
703A	


[End of MCO 13524]

MCO: 13525		Name: JAD		Date:  3-Aug-87:11:01:38


[Symptom]
Can't build a monitor with DECnet but without LOKCON.

[Diagnosis]
DECnet (SCLINK) calls MOVPAG in LOKCON to move unmapped pages on
a SET MEMORY OFFLINE command/UUO.  If LOKCON wasn't loaded, MOVPAG
won't exist, leading to an undefined global.

[Cure]
Define MOVPAG in COMMON, and add a few FTLOCKs in D36COM and
SCLINK.

[Comments]
Ho hum

[Keywords]
MOVPAG

[Related MCOs]
None

[Related SPRs]
35767

[MCO status]
Restricted distribution

[MCO attributes]
New development MCO
KL10 only

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	373	COMMON	MOVPAG
703A		D36COM	DCNMOV
		SCLINK	SCTMOV


[End of MCO 13525]

MCO: 13526		Name: RCB		Date:  3-Aug-87:20:51:15


[Symptom]
CAL11. function to queue data is broken for IBM FEs in 7.04 and in
7.03 A/P.

[Diagnosis]
CAIE / jrst good/ jrst bad

[Cure]
CAIE/ jrst bad/ fallinto good
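
CAIE skips the next instruction when the AC equals the immediate value,
so the order of the two slots after it decides which path a match
takes.  The toy interpreter below is only a model: run, the program
lists, and the assumption that a match is the success case are
illustrative, not taken from DTESER.

```python
def run(program, ac, value):
    """Run a three-slot program; CAIE skips one slot when ac == value."""
    pc = 0
    while True:
        op = program[pc]
        if op == "CAIE":
            pc += 2 if ac == value else 1
        elif op.startswith("JRST "):
            return op.split()[1]      # jump target
        else:
            return op                 # label reached by falling through

broken = ["CAIE", "JRST good", "JRST bad"]
fixed = ["CAIE", "JRST bad", "good"]

broken_on_match = run(broken, 5, 5)
fixed_on_match = run(fixed, 5, 5)
fixed_on_mismatch = run(fixed, 5, 6)
```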

[Comments]
I have no idea how this got messed up, especially since this is the first
time I've touched any relevant code in 7.03A.

[Keywords]
IBMCOM
DTEs

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
None

[MCO attributes]
PCO required

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	373	DTESER	DTEQU1
703A	


[End of MCO 13526]

MCO: 13528		Name: DPM		Date:  4-Aug-87:03:23:12


[Symptom]
     Stopcode IME updating page maps.

[Diagnosis]
     At EXOPF1+1 in VMSER, a DPB instruction is  used  to  update  the
contents  of  a  map  slot.   The routine GTPME no longer returns byte
pointers.  It returns a full word address.

[Cure]
     Change a DPB instruction to a MOVEM.

[Comments]

[Keywords]
PAGE MAPS

[Related MCOs]
None

[Related SPRs]
36000

[MCO status]
None

[MCO attributes]
None

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	373	VMSER	EXOPF1

703A	


[End of MCO 13528]

MCO: 13530		Name: DPM		Date:  4-Aug-87:05:28:34


[Symptom]
     Stopcode IME processing characters in SCNSER.

[Diagnosis]
     PCO 10-703-056 added a SE1INT at TICAVL.   The  published  answer
included  this change, but the Autopatched version lost the SCNOFF.
Also during control-R processing,   SCNSER  interrupts  are  enabled
without first disabling the interrupts.

[Cure]
     Add a SCNOFF at TICAVL and at the start of control-R processing.

[Comments]

[Keywords]
SCNOFF

[Related MCOs]
None

[Related SPRs]
35758

[MCO status]
None

[MCO attributes]
None

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	373	SCNSER	METRT1

703A		SCNSER	METRT1,TICAVL


[End of MCO 13530]

MCO: 13535		Name: JAD		Date:  6-Aug-87:09:12:47


[Symptom]
UNJ stopcodes and other cruft associated with command processing.

[Diagnosis]
Some rash developer moved FLMCOM from bit 18 (which translates
into the sign bit when in the left half of M) to bit 34.  Turns out
a LOT of code in the monitor expects COMCON to set the sign bit of
M to indicate "command level" so ECOD doesn't attempt to store an
error code at the location addressed by M (which just happens to
be the address of the command processing routine).

[Cure]
Move FLMCOM back to bit 18 (actually, (1B0) to make it obvious
someone expects it to be a sign bit).  Fix up lots of references
to "400000" to be "FLMCOM".  COMMENT THE HECK OUT OF THE DEFINITION
OF FLMCOM SO NO ONE SCREWS THIS UP AGAIN!
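
The bit arithmetic can be checked with a few lines of Python.  This
assumes the usual PDP-10 numbering (36-bit word, bit 0 most
significant); the helper names are made up for the example.

```python
WORD_BITS = 36

def bit(n):
    """Value of PDP-10 bit n, where bit 0 is the most significant of 36."""
    return 1 << (WORD_BITS - 1 - n)

def to_left_half(right_half_value):
    """Move an 18-bit right-half value into the left half of a word."""
    return (right_half_value & 0o777777) << 18

SIGN = bit(0)             # 1B0, octal 400000,,000000
flmcom_bit18 = bit(18)    # octal 400000 as a right-half value
flmcom_bit34 = bit(34)    # octal 2 as a right-half value

sign_when_bit18 = to_left_half(flmcom_bit18) == SIGN
sign_when_bit34 = to_left_half(flmcom_bit34) == SIGN
```

Only the bit 18 definition lands on the sign bit once it is placed in
the left half, which is why the move to bit 34 broke every sign test.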

[Comments]
What happens when you let the air out of Nick's tires?

Dago Wop Wop Wop

[Keywords]
COMMAND LEVEL
UNJ
FLMCOM

[Related MCOs]
None

[Related SPRs]
36030

[MCO status]
None

[MCO attributes]
New development MCO

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	374	CLOCK1
703A		COMCON
		COMMON
		IPCSER
		NETDEV
		QUESER
		TAPUUO
		UUOCON
		VMSER


[End of MCO 13535]

MCO: 13536		Name: JAD		Date:  6-Aug-87:09:24:44


[Symptom]
Looping in ESTOP if LOGOUT (LOGIN) dies with a fatal job error, such
as illegal memory reference, I/O to unassigned channel, etc.

[Diagnosis]
We get to ESTOP at UUO level, which branches off to JOBKL, which
eventually winds back up at ESTOP, which . . .

[Cure]
Test .CPISF before branching off to JOBKL, and if set, go to
STOP1C instead.

[Comments]
I believe this old ADP patch fixes SNETCO's NCM stopcode bug,
but since I'm not 100% sure I'm not going to answer their 7.02 SPR
this way.  They're upgrading to 7.03 and if they SPR the bug there
I'll consider propagating this fix to 7.03.

[Keywords]
JOBKL LOOP

[Related MCOs]
None

[Related SPRs]
35430, 35740

[MCO status]
None

[MCO attributes]
New development MCO

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	374	CLOCK1	ESTOP3

703A	


[End of MCO 13536]

MCO: 13542		Name: KBY		Date:  9-Aug-87:11:37:29


[Symptom]
Stopcode PPQ most common; other variants possible.
All will be related to and follow some page-in operation.

[Diagnosis]
For paging in, PLTSN will figure out how many of
the requested pages are on the various in-core queues and return
the count in .USTMP.  Unfortunately, the pages may have moved around
from one of the IN queues to the IP queue (PPQ stopcode) or from the OUT
queue to some other job by the time PAGIMT actually gets around
to pulling the pages off the queues.  This is particularly true since
MCO 13137 which liberally sprinkles SCDCHK calls around the code to
prevent KAFs; calling SCDCHK requires giving up the MM.  I'm not sure
the window really existed before then in 7.03 (although the one related
incident of this did at one time exist) except in the case of
mixed page-in/page-out lists.

[Cure]
Resurrect what should have been done instead of MCO 11833.  PLTSN
will now pull the pages off the in-core queues as it finds them and
place them on a "private" queue whose header is .USTMP.  The left
half of .USTMP contains minus the number of pages on the queue.
The success exit from the PAGE. UUO will check to be sure
this value is non-negative (must be zero from PAGEB; could
be positive for some other users which utilize .USTMP for
other things) and will issue a PMW stopcode if .USTMP is
negative.  The error exit from the PAGE. UUO will place the pages
back on the appropriate queues from whence they came and will
also issue a PMW stopcode if something doesn't match up correctly
(disk address in map doesn't match MEMTAB).  PAGIMT will pull pages
off the private queue now (eliminates the PPQ stopcode).
Note that when the pages are placed on the private queue, P2.TRN (transient
page) is lit in the PT2TAB entry for the page (turned on in PLTSN; off in PAGIMT
or on error return from the PAGE. UUO when it returns pages).  This is for
the benefit of CPIASN who can then figure out who owns the page by checking
MT.JOB and returning that job number.
This has the side benefit of not having to restart PLTSN if we have to
wait for a page currently on the IP queue discovered there (supersedes
MCO 11833).
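
The claim-then-use pattern can be sketched in Python.  This is a
simplified model: claim_pages, return_pages, and the queue names are
hypothetical, a plain list stands in for the private queue headed by
.USTMP, and the PMW consistency checks are left out.

```python
def claim_pages(wanted, shared_queues):
    """Move each wanted page from the shared queue holding it to a private list."""
    private = []   # entries are (page, name of the queue it came from)
    for page in wanted:
        for name, queue in shared_queues.items():
            if page in queue:
                queue.remove(page)
                private.append((page, name))
                break
    return private

def return_pages(private, shared_queues):
    """Error exit: put every claimed page back on its original queue."""
    while private:
        page, name = private.pop()
        shared_queues[name].append(page)

queues = {"IN": [10, 11], "OUT": [20]}
mine = claim_pages([11, 20, 99], queues)
```

Because the pages leave the shared queues at the moment they are found,
nothing can move them before the later step consumes them.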

[Comments]
I know no one likes Edgecomb's SPRs but his diagnosis did
save a lot of time.

[Keywords]
PPQ

[Related MCOs]
11833

[Related SPRs]
36028, 36036, 36038

[MCO status]
Restricted distribution

[MCO attributes]
Documentation change

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	374	VMSER	PAGIMT,PLTSN,UPAGE4
703A		S
		ERRCON	CPIASN


[End of MCO 13542]

MCO: 13544		Name: KBY		Date:  9-Aug-87:19:23:03


[Symptom]
Inelegant grammar in CORE command output.

[Diagnosis]
Lazy pluralizing.

[Cure]
Check numbers to decide how to output "s"
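
As a small illustration in Python (the function name and the sample
noun are made up, not the actual CORE command text):

```python
def plural(count, noun):
    """Append "s" to the noun unless the count is exactly one."""
    return f"{count} {noun}" if count == 1 else f"{count} {noun}s"

lines = [plural(n, "page") for n in (0, 1, 2)]
```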

[Comments]

[Keywords]
CORE

[Related MCOs]
None

[Related SPRs]
36013

[MCO status]
None

[MCO attributes]
None

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	374	COMCON	COR5
703A	


[End of MCO 13544]

MCO: 13547		Name: DPM		Date: 11-Aug-87:06:02:43


[Symptom]
     If the first record of a magtape has a parity error, and the user
has  disabled DX10 error retry, the monitor will loop endlessly trying
to do error recovery on the tape.

[Diagnosis]
     TX1KON increments TUBREC when the read done interrupt happens and
again  when  unit  exception  (tape  mark  seen) is checked.  The path
through the code assumes that the unit exception  check  can  only  be
made if a short record length error occurs.

[Cure]

[Comments]
One half of the oldest SPR in LCG.  Not bad for 11 hours of debugging.

[Keywords]
DX10
RETRY

[Related MCOs]
None

[Related SPRs]
30357

[MCO status]
None

[MCO attributes]
None

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	374	TX1KON	ERRRD2

703A	


[End of MCO 13547]

MCO: 13553		Name: KBY		Date: 16-Aug-87:08:56:11


[Symptom]
MCO 13542 not quite complete.

[Diagnosis]
Usage of .USTMP is nice and fits in with what used to work, but
it is possible for a job to get swapped fragmented in the middle of
certain page UUOs which will lead to .USTMP getting smashed at interrupt
level.

[Cure]
New (2) temporaries .USTMU (UUO level temp) which may be safely
used by UUO level code only.  Change all references in the PAGE. UUO
to use these locations (any other UUO may also do this).  Actually
PAGE. UUO can run at pseudo-clock level for the CORE UUO, but this is
still essentially UUO level code.

[Comments]
Not the KAFs.
Note that in order to implement this for 7.03A, the UUO PDL will have
to be shortened (by 8. words).  Since the PDL is extendable, this ought
to be OK as a short term solution.

[Keywords]

[Related MCOs]
13542

[Related SPRs]
None

[MCO status]
Restricted distribution

[MCO attributes]
None

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	375	VMSER	LOTS
703A		S


[End of MCO 13553]

MCO: 13555		Name: DPM		Date: 17-Aug-87:06:29:55


[Symptom]
     Stopcode PDLOVF using contexts.

[Diagnosis]
     If a user specifies a TMPCOR argument but no data  buffer  length
and  address  in  the  CTX. UUO argument block, a PDLOVF stopcode will
result.  This happens because the monitor  does  not  adequately  keep
track  of  intermediate  TMPCOR  file storage when switching contexts.
TMPCOR files are appended to the end of  the  user's  data  buffer  in
funny  space.   If  no  data  buffer exists, arithmetic to compute the
start of the TMPCOR file falls  short  of  our  expectations,  usually
yielding an address of zero.

[Cure]
     Correct the faulty bookkeeping with the addition of a new word in
the context block which contains the exec address of the TMPCOR file,
thus eliminating bizarre computations in determining the file's
address.
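
     The difference between deriving the address and recording it can
be sketched in Python.  Everything here is hypothetical: the field
names, the sample address, and in particular the arithmetic, which is
only a guess at the general shape of the old computation.

```python
from dataclasses import dataclass

@dataclass
class Context:
    data_exec: int = 0     # exec address of the data buffer (0 if none)
    data_len: int = 0      # length of the data buffer in words
    tmpcor_exec: int = 0   # the new word: exec address of the TMPCOR file

def tmpcor_by_arithmetic(ctx):
    """Old style: derive the TMPCOR address from the data buffer."""
    return ctx.data_exec + ctx.data_len

def tmpcor_by_field(ctx):
    """New style: read the address recorded when the file was stored."""
    return ctx.tmpcor_exec

# A context with a TMPCOR file but no data buffer.
ctx = Context(data_exec=0, data_len=0, tmpcor_exec=0o340100)
```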

[Comments]

[Keywords]
CONTEXTS

[Related MCOs]
11102

[Related SPRs]
36004

[MCO status]
Restricted distribution

[MCO attributes]
None

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	375	CTXSER	LOTS

703A	


[End of MCO 13555]

MCO: 13556		Name: JAD		Date: 17-Aug-87:08:26:15


[Symptom]
RTTRP lives up to its name (n of 2**18); incorrect arguments to
the UUO can lead to a KAF stopcode.

[Diagnosis]
Specifying a PI channel and device of zero with a non-zero CPU
argument causes the "remove device" routine to loop.  The code
in REMDEV checks the device code against a real-time block, and
if a match is found, obtains the SYSPIF interlock, then re-checks
the device code and CPU number (assuming someone snuck in).  If
the device code OR CPU number don't match, the code starts over
at the beginning of the loop, assuming some other job snuck in
and allocated a real-time block.

Since the CPU number in an un-initialized real-time block is 0,
the code will loop until a KAF stopcode occurs.

[Cure]
The code should be checking device code AND CPU number under
the SYSPIF interlock.  Change it to do so, as well as fixing a
number of other cases where something is tested, an interlock
is obtained, then the same thing is tested.
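
A Python sketch of the corrected shape follows.  It is a model only:
remove_device, the block layout, and the lock standing in for the
SYSPIF interlock are assumptions made for the example.

```python
import threading

syspif = threading.Lock()   # stands in for the SYSPIF interlock

def remove_device(blocks, device, cpu):
    """Clear the block matching both device code and CPU; True if found."""
    for block in blocks:
        with syspif:
            # Both fields are compared while holding the interlock, so a
            # mismatch just moves on to the next block; nothing restarts.
            if block["device"] == device and block["cpu"] == cpu:
                block["device"] = 0
                block["cpu"] = 0
                return True
    return False

# First block is uninitialized (device 0, CPU 0); second is in use.
blocks = [{"device": 0, "cpu": 0}, {"device": 0o120, "cpu": 1}]
```

A request for device 0 on a non-zero CPU now simply finds no match
instead of restarting the scan forever.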

[Comments]
Rat's nest is the only appropriate comment.

[Keywords]
RTTRP
KAF

[Related MCOs]
None

[Related SPRs]
35372

[MCO status]
None

[MCO attributes]
New development MCO
KL10 only

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	375	RTTRP	REMDEV
703A	


[End of MCO 13556]

MCO: 13565		Name: KBY		Date: 22-Aug-87:12:14:11


[Symptom]
703A doesn't

[Diagnosis]
Yes

[Cure]
Yes

[Comments]

[Keywords]

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
Restricted distribution

[MCO attributes]
None

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
703A		VMSER	PAGAC1


[End of MCO 13565]

MCO: 13570		Name: JAD		Date: 24-Aug-87:14:04:22


[Symptom]
Files can be marked nonexistent after a RENAME error even though
the files still exist.  This allows ENTERing another file in the
directory, leading to more than one entry in the directory for a
specific file name.  (Gee, who snuck generations in on us?)

[Diagnosis]
RENAME error 20 clears the YES bit for the file being RENAMEd,
but leaves the KNO bit on.  This will allow another ENTER with
the same file name to succeed and create a second entry in the
directory using the same name.  Also, use counts are wrong after
the error.

[Cure]
Don't branch to ENERR1 in this case, add a new routine which
clears rename in progress, decrements use counts, and branches
to a more appropriate clean-up routine.

[Comments]
Only ADP . . .

Gee, I thought Steve fixed all the use count bugs.

[Keywords]
USE COUNTS
RENAME ERROR

[Related MCOs]
None

[Related SPRs]
35709

[MCO status]
None

[MCO attributes]
New development MCO

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	376	FILUUO	RENA32
703A	


[End of MCO 13570]

MCO: 13573		Name: DPM		Date: 26-Aug-87:07:01:50


[Symptom]
     Stopcode TMDELI or TMDELE in SCNSER during character processing.

[Diagnosis]
     Since the beginning of 7.03 development, we have experienced these
stopcodes  about  once  every  two or three months.  They indicate the
monitor has skewed  its count of characters in the  chunks,  and  that
skew  is  detected  while performing some backspacing operation.  When
this occurs, SCNSER stopcodes, corrects the character count,  and  the
user  job  continues  never  having  known the problem existed and not
being affected in any way.  The crashes never contain any useful data.

[Cure]
     Change the TMDELI and TMDELE stopcodes from type DEBUG  to  INFO.
This  will  eliminate  useless crashes.  If a case arises where one of
these conditions can be reproduced  at  will,  the  stopcodes  may  be
patched  to  a DEBUG so potentially useful information may be captured
in a dump.

[Comments]

[Keywords]
INPUT
ECHO

[Related MCOs]
None

[Related SPRs]
35724

[MCO status]
None

[MCO attributes]
Documentation change

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	360	SCNSER	TICAVL,ECCAVL

703A	


[End of MCO 13573]

MCO: 13575		Name: DPM		Date: 31-Aug-87:12:55:44


[Symptom]
     Stopcode FOP restoring a context which  had  sometime  previously
spawned another context using the TMPCOR feature of the CTX. UUO.

[Diagnosis]
     PCO 10-703-109 separated out TMPCOR handling from the data buffer
processing.   The  routine ZERDAT is supposed to zero out the user and
exec addresses for both the data buffer and TMPCOR when a  context  is
restored.   Instead  the  user  address  is  zeroed twice and the exec
address is ignored.  The next time a new context is created  and  then
restored, CTXSER will try to return the old exec address again causing
a FOP.

[Cure]
     Change a SETZM .CTTCA(P1) to a SETZM .CTTCE(P1).
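
     In Python terms the one-word fix looks like the sketch below.  The
field names are hypothetical, and the reading of .CTTCA as the TMPCOR
user address and .CTTCE as its exec address is an assumption drawn from
the diagnosis above.

```python
from dataclasses import dataclass

@dataclass
class ContextBlock:
    data_user: int
    data_exec: int
    tmpcor_user: int   # modeled on .CTTCA (assumed)
    tmpcor_exec: int   # modeled on .CTTCE (assumed)

def zero_data(ctx):
    """Forget all four addresses once the context has been restored."""
    ctx.data_user = 0
    ctx.data_exec = 0
    ctx.tmpcor_user = 0
    ctx.tmpcor_exec = 0   # the fix: previously tmpcor_user was cleared again here
    return ctx

ctx = zero_data(ContextBlock(0o1000, 0o340000, 0o2000, 0o340100))
```

With the exec address cleared, a later restore has no stale address to
hand back a second time.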

[Comments]

[Keywords]
CONTEXTS

[Related MCOs]
11102

[Related SPRs]
None

[MCO status]
None

[MCO attributes]
None

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	377	CTXSER	ZERDAT

703A	


[End of MCO 13575]

MCO: 13579		Name: KBY		Date:  2-Sep-87:20:03:53


[Symptom]
703A doesn't (again)

[Diagnosis]
I sure hope so.

[Cure]
Delete an instruction turning a NOP into something useful.

[Comments]
It really isn't the same thing as it was last week.

[Keywords]

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
Restricted distribution

[MCO attributes]
None

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
703A		VMSER	PAGAC1


[End of MCO 13579]

MCO: 13584		Name: RCB		Date:  3-Sep-87:16:16:10


[Symptom]
STOPCD ANFSBA (secondary buffer allocated) at unpredictable times.

[Diagnosis]
When calling GETWDS to allocate a PCB, we call CLNPCB, which doesn't clear
enough locations in the PCB.  Since GETWDS doesn't zero core, this has been a
time-bomb waiting to explode for quite some time.

[Cure]
Clear yet more locations in CLNPCB.
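
The hazard of uncleared core can be shown with a small Python model.
The field list, get_words, and clean_pcb are all hypothetical, and the
sketch clears every field as a simplification; the MCO itself only says
that more locations are cleared.

```python
PCB_FIELDS = ("link", "status", "primary_buffer", "secondary_buffer")  # hypothetical

def get_words(n, stale=0o777):
    """Model an allocator that returns uncleared core (stale data in every word)."""
    return [stale] * n

def clean_pcb(words):
    """Clear every word of the PCB before it is used."""
    for i in range(len(words)):
        words[i] = 0
    return dict(zip(PCB_FIELDS, words))

dirty = get_words(len(PCB_FIELDS))
pcb = clean_pcb(get_words(len(PCB_FIELDS)))
```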

[Comments]
Sigh.

[Keywords]
ANFSBA
Dirty core

[Related MCOs]
None

[Related SPRs]
36045

[MCO status]
None

[MCO attributes]
PCO required

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	310	NETSER	CLNPCB
703A	


[End of MCO 13584]