MCO: 14111    Name: JMF    Date: 1-Sep-88:07:17:30
[Symptom] Patches made to virtual user mode programs with FILDDT disappear.
[Diagnosis] If the patch happens to get made to a write locked page, the page doesn't get written to the swapping space the next time the job gets swapped out.
[Cure] If the job and the page are in core and the page is write locked, write enable the page and decrement .USWLP before copying the data from the patcher to the patchee.
[Keywords] JOBPEK
[Related MCOs] None
[Related QARs] None
[MCO status] Deferred
[MCO attributes] New development MCO, QAR answer
[Validity]
Monitor  Load    Module  Tags
-------  ------  ------  ------
705      410     UUOCON  JOBPK3,JPKLW?
                 VMSER   RTNFS0
[End of MCO 14111]

MCO: 14127    Name: JMF    Date: 27-Sep-88:05:23:44
[Symptom] Non-zero section address break doesn't work as expected.
[Diagnosis]
1) Section number gets lost in DATAO APR, in SSEUB.
2) SET BREAK command changing conditions but not break address zaps section number.
[Cure]
1) DATAO APR,.CPAPR
2) DPB rather than various flavors of HLLxy.
[Keywords] extended addressing address break
[Related MCOs] None
[Related SPRs] None
[MCO status] Deferred
[MCO attributes] New development MCO, KL10 only, Extended addressing only
[Validity]
Monitor  Load    Module  Tags
-------  ------  ------  ------
705      410     APRSER  SSEU2
                 COMCON  SETB10
[End of MCO 14127]

MCO: 14131    Name: ERS    Date: 13-Oct-88:09:41:14
[Symptom] Not all known bad areas on a disk are known to the monitor. Possible, but unlikely IME.
[Diagnosis] When we're scanning the BAT blocks we first figure out how many we have to scan. To get this we add the number of bad regions the monitor found to the number of areas the disk started with (bad regions found by the various diagnostic programs). However, the latter we get by indexing off of T3. T3 happens to point to outer space.
[Cure] Set up T3.
[Keywords] Bad regions Swap read errors?
[Related MCOs] None
[Related QARs] None
[MCO status] None
[MCO attributes] PCO required, QAR answer
[Validity]
Monitor  Load    Module  Tags
-------  ------  ------  ------
705      410     REFSTR  SCNBAT
[End of MCO 14131]

MCO: 14132    Name: JAD    Date: 13-Oct-88:10:47:42
[Symptom] Possible inconsistent runtimes on the KL (MCO 13856 revisited).
[Diagnosis] Forgot one case where "Inhibit Update" was set needlessly.
[Cure] Clean it up.
[Keywords] RUNTIME
[Related MCOs] 13856
[Related SPRs] None
[MCO status] None
[MCO attributes] New development MCO, KL10 only
[Validity]
Monitor  Load    Module  Tags
-------  ------  ------  ------
705      410     APRSER  SSEUB
[End of MCO 14132]

MCO: 14133    Name: JAD    Date: 13-Oct-88:10:55:32
[Symptom] Protocol pause doesn't exist under secondary protocol, but DTESER doesn't check before trying to effect protocol pause.
[Diagnosis] Missing test.
[Cure] Test ED.PPC at SETPP before doing anything rash.
[Keywords] PROTOCOL PAUSE
[Related MCOs] None
[Related SPRs] None
[MCO status] None
[MCO attributes] New development MCO, KL10 only
[Validity]
Monitor  Load    Module  Tags
-------  ------  ------  ------
705      410     DTESER  SETPP
[End of MCO 14133]

MCO: 14134    Name: JAD    Date: 14-Oct-88:10:57:48
[Symptom] (Unsupported) feature to print PC during SET WATCH FILE output gets wrong PC during RENAME.
[Diagnosis] PATH UUO done by PTHFIL blows away .USMUO.
[Cure] Use JOBPDO+1 for PC.
[Keywords] WATCH FILE
[Related MCOs] None
[Related SPRs] None
[MCO status] None
[MCO attributes] New development MCO
[Validity]
Monitor  Load    Module  Tags
-------  ------  ------  ------
705      410     UUOCON  WCHPCP
[End of MCO 14134]

MCO: 14135    Name: JAD    Date: 14-Oct-88:11:04:43
[Symptom] Including expensive "want to run" time calculation is an all or nothing proposition.
[Diagnosis] Either you JFCL RQTPAT or you don't. If you do, it happens every tick.
[Cure] Invent a MONGEN-definable symbol M.NRQT which is the number of ticks between the "want to run" time calculation. If zero, the expensive calculation is never done.
Patchable on the fly by twiddling a variable in SCHED1.
[Keywords] WANT TO RUN TIME
[Related MCOs] None
[Related SPRs] None
[MCO status] None
[MCO attributes] New development MCO
[Validity]
Monitor  Load    Module  Tags
-------  ------  ------  ------
705      410     COMMON  M.NRQT
                 SCHED1  RQTPAT
[End of MCO 14135]

MCO: 14136    Name: DPM    Date: 18-Oct-88:03:04:41
[Symptom] Giving up the CX resource for the wrong job.
[Diagnosis] In CTXSER when setting context and saved page quotas, we get the CX resource if the target job is not ourselves. This works just fine because the purpose of the CX is to prevent a context block or PDB from changing out from under us. However, at completion of the UUO function, we only give back the CX if we were changing our own quotas.
[Cure] Only give back the CX if the target is not ourselves.
[Keywords] CONTEXTS
[Related MCOs] 11102
[Related SPRs] None
[MCO status] None
[MCO attributes] None
[Validity]
Monitor  Load    Module  Tags
-------  ------  ------  ------
705      410     CTXSER  XITQTA
704A
703A
[End of MCO 14136]

MCO: 14137    Name: LWS/DPM    Date: 19-Oct-88:18:30:13
[Symptom] Autoconfiguring of -20F devices works by sheer luck or doesn't work at all. When it does work, the devices sometimes work.
[Diagnosis]
1. We send the "request for device status" msg to -20F in the wrong format, i.e. 0 byte,,unit # byte.
2. In DCRSER and DLPSER we use the wrong half of an AC to pick up FE device unit number.
3. We "timeshare" the same word in the device DDB for two different things.
[Cure]
1. Change FNCTAB dispatch of .EMRDS msg to use "line/data" format instead of "line" format. This causes msg to be sent in correct format, i.e. unit # byte,,0 byte.
2. HRRx's --> HLRx's
3. .ORG DEVLSD's --> .ORG DEVLEN's
[Keywords] FE devices RSX20F printers readers
[Related MCOs] None
[Related QARs] None
[MCO status] None
[MCO attributes] KL10 only, QAR answer
[Validity]
Monitor  Load    Module  Tags
-------  ------  ------  ------
705      410     DTESER  FNCTAB
                 DLPSER  DLPDT1,.ORG
                 DCRSER  DCRDT1,.ORG
[End of MCO 14137]

MCO: 14138    Name: LWS    Date: 19-Oct-88:19:06:10
[Symptom] MCO 14126 incomplete
[Diagnosis] In TPDSMM/CMM all tape kontrollers on the same channel are put in maintenance mode, but I forgot about dual ported units. Trying to put the DX20 on 1026 in maintenance mode using MTA0 as the arg to the DIAG. UUO puts the DX10 in maintenance mode.
[Cure] Add code to check UDBKDB and put all kontrollers found in maintenance mode also.
[Keywords] DIAGs
[Related MCOs] 14126
[Related SPRs] None
[MCO status] None
[MCO attributes] New development MCO
[Validity]
Monitor  Load    Module  Tags
-------  ------  ------  ------
705      410     TAPUUO  TPDSMM,TPDCMM
[End of MCO 14138]

MCO: 14140    Name: JEG    Date: 25-Oct-88:04:35:36
[Symptom]
1. SA10 related crashes not as useful as they could be.
2. Missing improvements in disk code.
[Diagnosis]
1. SAXSER would squirrel away interesting data in the KDBs on a crash if only someone would ask it to.
2. I've been busy.
[Cure]
1. Call SAXDMP from COMMON in DVCSTS.
2. Implement improved disk driver.
[Keywords] SA10
[Related MCOs] None
[Related SPRs] None
[MCO status] None
[MCO attributes] New development MCO
[Validity]
Monitor  Load    Module  Tags
-------  ------  ------  ------
705      410     COMMON  DVCST2
                 DSXKON  LOTS
[End of MCO 14140]

MCO: 14141    Name: DPM/JMF    Date: 25-Oct-88:04:55:24
[Symptom] Stopcode KAF trying to start I/O BUS printer.
[Diagnosis] Hard to say, but it looks like LPTINI was never called, although it's not obvious how that could happen. Further inspection reveals that the length of the DDB is wrong. LPTCHF (PI channel flags), value 24, is the first word in the device dependent portion of the DDB. That's also the value of DEVCTR.
If DEVCTR gets zeroed, the PI channel flags get wiped out and the contents put into the RH of the CONSO skip chain test. The next interrupt would not be serviced because the condition bits were all zeroed and a KAF results. Other devices could have other problems depending upon the usage of the words between the starting origin (DEVLSD, DEVLLD, etc.) and DEVLEN.
[Cure] For all incorrectly defined DDBs, origin the device dependent portion at DEVLEN.
[Keywords] DEVLEN
[Related MCOs] None
[Related SPRs] None
[MCO status] None
[MCO attributes] New development MCO
[Validity]
Monitor  Load    Module  Tags
-------  ------  ------  ------
705      410     CD2SER  CD2DDB
                 CDRSER  CDRDDB
                 LP2SER  LP2DDB
                 LPTSER  LPTDDB
                 PLTSER  PLTDDB
                 PTYSER  PTYDDB
[End of MCO 14141]

MCO: 14142    Name: RCB    Date: 25-Oct-88:05:36:08
[Symptom] STRUUO .FSRSL (read search list) is less friendly than the GOBSTR loop that it's supposed to replace.
[Diagnosis] Demanding godly privs or same job to read a search list, when GOBSTR only requires that the invoking job have the same PPN as the target job, or have some flavor of PEEK/SPY privs, or that the job be reading the SSL.
[Cure] Change the STRUUO's priv checking to match that of GOBSTR.
[Keywords] STRUUO .FSRSL GOBSTR consistency
[Related MCOs] 13314
[Related SPRs] None
[MCO status] Checked
[MCO attributes] None
[Validity]
Monitor  Load    Module  Tags
-------  ------  ------  ------
705      410     FILFND  RSLSTR
[End of MCO 14142]

MCO: 14143    Name: RCB    Date: 25-Oct-88:05:51:33
[Symptom] Too hard to tell which Autopatch tape a customer is running when we get the dumps.
[Diagnosis] No way to distinguish between post-7.04 release monitors.
[Cure] Change the way A00SVN and A00DLN are used in building A00VER and AXXDVN. (These are GETTAB items %CNVER and %CNDVN.) This week's monitor will be load 410 of 7.04A as far as the macros in COMMON are concerned. The load numbers will be recycled annually, at the same time as we bump the minor version number (A00SVN).
This way, the version stamp on the dump will narrow down which tape it could have been from, and a check of MONVER will allow us to tell even more precisely. A00MCO should have been good enough, but it seems that some customers like to change it when they install published patches.
[Keywords] Autopatch Revision control
[Related MCOs] None
[Related SPRs] None
[MCO status] Checked
[MCO attributes] New development MCO
[Validity]
Monitor  Load    Module  Tags
-------  ------  ------  ------
704A     410     COMMON  AXXVER
705      410
[End of MCO 14143]

MCO: 14144    Name: RCB    Date: 25-Oct-88:06:41:41
[Symptom] Can't always connect to TSK devices on other nodes when we should be able.
[Diagnosis] NETDEV (called from AUTLNK) updates our NDB with our new configuration (without benefit of interlock) but never tells anyone else in the network about our changes.
[Cure] Change NETDEV to light a flag for NETSCN to recompute our configuration. If it changes, we'll mark everyone else's NDB as needing to hear about it. Later on in NETSCN, we'll try to tell them all about it.
[Keywords] ERTNA%
[Related MCOs] 13924
[Related SPRs] None
[MCO status] Checked
[MCO attributes] None
[Validity]
Monitor  Load    Module  Tags
-------  ------  ------  ------
705      410     NETSER  NETDEV,NCSCNF,NETSCN,ICMRCF
                 NETPRM  NDB.XC
[End of MCO 14144]

MCO: 14145    Name: DPM    Date: 31-Oct-88:03:58:25
[Symptom] New: Add a couple of items that were omitted from 704 because of last minute documentation constraints.
1. Make control-T print the CPU the job last ran on.
2. Make SET WATCH FILES print the PC of the UUO.
[Diagnosis]
[Cure]
[Keywords] CONTROL-T SET WATCH FILES
[Related MCOs] None
[Related SPRs] None
[MCO status] None
[MCO attributes] New development MCO, Documentation change
[Validity]
Monitor  Load    Module  Tags
-------  ------  ------  ------
705      411     COMCON  USECPU
                 UUOCON  WCHPCP
704A
[End of MCO 14145]

MCO: 14147    Name: JMF    Date: 9-Nov-88:07:48:35
[Symptom] MX gets a protection failure when it tries to append to a mail file if it's running virtual.
[Diagnosis] The job can page fault after doing the updating ENTER. If the UUO is then restarted, the combination of FO.PRV and junk in E+3 left over from the LOOKUP/ENTER results in a protection failure.
[Cure] If appending in buffered mode, call OUTF early (before updating ENTER) to eliminate page faults after ENTER has been done.
[Keywords] .FOAPP FO.PRV
[Related MCOs] None
[Related SPRs] None
[MCO status] None
[MCO attributes] New development MCO
[Validity]
Monitor  Load    Module  Tags
-------  ------  ------  ------
705      412     UUOCON  FOPEN2,FOPN9B
[End of MCO 14147]

MCO: 14148    Name: DPM    Date: 14-Nov-88:04:58:54
[Symptom] Stopcode IME while performing magtape I/O.
[Diagnosis] If buffered I/O is being done on a DX10 and if the buffer overhead words (.BFSTS, .BFHDR, and .BFCNT) are split across a page boundary such that .BFCNT resides in the page following .BFHDR, and that page happens to get destroyed, then an IME will result when MAKLST tries to read the user's word count for the buffer. No address checking is done on the word count word in this case.
[Cure] Add a call to IADRCK.
[Keywords] MAKLST
[Related MCOs] None
[Related SPRs] 36173
[MCO status] None
[MCO attributes] None
[Validity]
Monitor  Load    Module  Tags
-------  ------  ------  ------
705      412     TAPUUO  CHNLS2
704A
[End of MCO 14148]

MCO: 14152    Name: RCB    Date: 22-Nov-88:06:01:09
[Symptom] Files created in SYS: no longer get PRVSYS or PRYSYS as appropriate to the extension (non-.SYS or .SYS).
[Diagnosis] Not sure when this broke, but SYSDEV gets cleared in LH(F) and never set again.
[Cure] Fix the places that want to know or that already check to get SYSDEV right. In particular, don't just range check against SYSNDX, since that keeps STD: from lighting SYSDEV. Check the actual PPN of the device instead.
[Keywords] SYSDEV PRVSYS PRYSYS
[Related MCOs] None
[Related SPRs] 36161
[MCO status] Checked
[MCO attributes] None
[Validity]
Monitor  Load    Module  Tags
-------  ------  ------  ------
705      413     FILUUO  SDVTSS,TSTDSK,FOUND0,CREAL5,CURPP1
704A
[End of MCO 14152]

MCO: 14153    Name: DPM    Date: 28-Nov-88:09:24:30
[Symptom] Attempts to log off a job which was stopped in the process of logging out get a "No such device" error.
[Diagnosis] If a job was somehow stopped while logging out, the job may have been partially destroyed. In particular, there may be no remaining context blocks. Subsequent attempts to kill the job fail because the run of the LOGIN program to log the job out will not succeed. This is because DDBSRC fails when no context block is found.
[Cure] The situations surrounding this problem are pretty arcane. Typically, a job gets into this state because an idle job killer incorrectly selects a job which is already logging out. The usual methods of such programs include forcibly HALTing the job in a manner which bypasses JACCT and Control-C trapping. Hence, the resulting problem of a halted and partially destroyed job is caused by a privileged program circumventing privileged protection schemes. There are a couple of different approaches to solving this problem. The simplest is to defend against idle job killers. If the job is logging out, never allow a job to be stopped. This is most easily accomplished by testing PD.LGO in word .PDDFL of the PDB in the routine SIMCHK. PD.LGO is turned on by the LOGOUT UUO.
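The PD.LGO test described in the cure for MCO 14153 can be sketched roughly as follows. This is an illustrative Python model, not the monitor's MACRO-10 code: the bit value assigned to PD.LGO, the dictionary standing in for the PDB, and the function names are all assumptions made for the example. Only the names PD.LGO, .PDDFL, SIMCHK, and the LOGOUT UUO come from the MCO itself.

```python
# Illustrative model only; the real check is MACRO-10 code in CLOCK1.
PD_LGO = 0o400000  # assumed bit value for PD.LGO; the MCO does not give it


def logout_uuo(pdb):
    """Model the LOGOUT UUO turning on PD.LGO in the .PDDFL word of the PDB."""
    pdb["PDDFL"] |= PD_LGO


def simchk(pdb):
    """Return True if the job may be stopped, False if it is logging out."""
    return (pdb["PDDFL"] & PD_LGO) == 0


job = {"PDDFL": 0}              # hypothetical PDB with no flags set
can_stop_before = simchk(job)   # job is running normally
logout_uuo(job)
can_stop_after = simchk(job)    # job is now logging out
```

In this model an idle job killer that asks to stop the job after LOGOUT has started is simply refused, which is the defensive behavior the cure calls for.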
[Keywords] LOGOUT
[Related MCOs] None
[Related SPRs] 35781, 36146
[MCO status] None
[MCO attributes] None
[Validity]
Monitor  Load    Module  Tags
-------  ------  ------  ------
705      414     CLOCK1  SIMCHK
704A
[End of MCO 14153]

MCO: 14154    Name: KDO    Date: 28-Nov-88:19:39:01
[Symptom] Definition of the context block is esthetically unappealing.
[Diagnosis] "symbol" == "previous symbol" + "a bunch"
[Cure] Use .ORG instead.
[Keywords] maintainability cleanliness
[Related MCOs] None
[Related SPRs] None
[MCO status] None
[MCO attributes] None
[Validity]
Monitor  Load    Module  Tags
-------  ------  ------  ------
705      414     CTXSER  .CTFLG
704A
[End of MCO 14154]

MCO: 14155    Name: KDO    Date: 18-Dec-88:18:02:24
[Symptom] Cannot define the default circuit cost for each device type.
[Diagnosis] No code.
[Cure] Add the following symbols to COMDEV:
    %RTCTST    circuit cost for TST device
    %RTCDTE    circuit cost for DTE device
    %RTCKDP    circuit cost for KDP device
    %RTCDDP    circuit cost for DDP device
    %RTCCIP    circuit cost for CI device
    %RTCETH    circuit cost for Ethernet device
    %RTCDMR    circuit cost for DMR device
These symbols are used in the KONCST table of D36COM.
[Keywords]
[Related MCOs] None
[Related SPRs] None
[MCO status] None
[MCO attributes] None
[Validity]
Monitor  Load    Module  Tags
-------  ------  ------  ------
705      415     COMDEV
                 D36COM  KONCST
                 ROUTER
[End of MCO 14155]

MCO: 14156    Name: DPM    Date: 4-Jan-89:06:28:51
[Symptom] The system wide VM counters for IW and NIW page faults are half-word quantities which don't take too long to overflow.
[Diagnosis] Old monitors didn't page fault too often. Now they do.
[Cure] Add two new GETTABs:
    %VMIWS==42,,113    ;SYSTEM COUNT OF "IN WORKING SET" FAULTS
    %VMNIW==43,,113    ;SYSTEM COUNT OF "NOT IN WORKING SET" FAULTS
Also, because SYSTAT and SYSDPY are crufty programs and not easily modified, keep SYSVCT up to date, but mark it and GETTAB %VMSPF as obsolete to entice programs to use the new counters.
If SYSTAT and SYSDPY are ever fixed, the monitor will cease to maintain SYSVCT, so programs shouldn't rely on %VMSPF.
[Keywords] VM COUNTERS
[Related MCOs] 13932, 13137
[Related SPRs] None
[MCO status] Checked
[MCO attributes] New development MCO, Documentation change, UUOSYM change
[Validity]
Monitor  Load    Module  Tags
-------  ------  ------  ------
705      416     COMMON  SYSVCT,SYSIWS,SYSNIW
704A             MONPFH  PFHXCI,PFHXCN
                 UUOSYM  %VMIWS,%VMNIW
                 VMSER   USRFL7
[End of MCO 14156]

MCO: 14157    Name: RCB    Date: 11-Jan-89:20:14:12
[Symptom] Problems with TSK devices:
1) Can't always do an "enter passive" which is restricted to a specific node.
2) The count of TSK devices is decremented more often than it's incremented.
[Diagnosis]
1) The remote doesn't admit to TSKs until someone does an unrestricted "enter passive" there.
2) AUTKIL is checking the next DDB's station number value rather than that of the DDB being removed when deciding whether to decrement the device count.
[Cure]
1) Always claim at least one TSK DDB if TSK service is loaded.
2) Check the right DDB in AUTKIL.
[Keywords] TSK NETCNF DDBCNT
[Related MCOs] 13932, 13137
[Related SPRs] None
[MCO status] Checked
[MCO attributes] None
[Validity]
Monitor  Load    Module  Tags
-------  ------  ------  ------
704A     417     NETSER  NTSC.C
705      417     AUTCON  AUTKI4
[End of MCO 14157]

MCO: 14158    Name: RCB    Date: 11-Jan-89:21:17:51
[Symptom] Jobs using the MIC RESPONCE feature hang sometimes on a terminal, and always on a PTY.
[Diagnosis] Race condition in MICLG3 which can cause us never to notify MIC that it's time to take the response, and a mistaken test in PTYSER (the JOBSTS UUO) that won't let us even try to notify MIC that the time has come.
[Cure] Yes.
[Keywords] MIC RESPONCE MIC UNDER BATCH
[Related MCOs] 13932, 13137
[Related SPRs] 36167
[MCO status] Checked
[MCO attributes] None
[Validity]
Monitor  Load    Module  Tags
-------  ------  ------  ------
704A     417     SCNSER  MICLG3,TOPMCL,TOPMS1,TOPMG1
705      417     PTYSER  UJBST6
[End of MCO 14158]

MCO: 14159    Name: RCB    Date: 20-Jan-89:14:07:59
[Symptom] Fallback presentation of eight-bit characters doesn't work when a free CRLF is required by the character expansion.
[Diagnosis] The code to re-eat a character for echo or output doesn't handle the case of a multi-part character expansion.
[Cure] Keep track of which character (from an expansion or otherwise) caused the line wrap, so we can send the right one when the time comes to re-eat it.
[Keywords] Two-part characters Three-part characters Fallback presentation
[Related MCOs] 13932, 13137
[Related SPRs] None
[MCO status] Checked
[MCO attributes] New development MCO
[Validity]
Monitor  Load    Module  Tags
-------  ------  ------  ------
704A     417     SCNSER  LDBOST,XMTCH1,XMTREO,XMTREE,REEAT
705
[End of MCO 14159]

MCO: 14160    Name: LWS    Date: 29-Jan-89:20:49:51
[Symptom]
1. Problems detecting "data errors" on 20F card readers.
2. 20F card reader ignored after reading a card with a 9-punch in column 1.
[Diagnosis]
1. Part of the problem is 20F itself. In V16-00, when a data error occurs, the bad data is passed to the -10. Then the status msg comes, but since I/O is not in progress we pitch the status msg. V16-01 of RSX20F fixes the problem of passing the bad data instead of just sending a status msg. (and fixes the problem where it always sends a status msg after any data transfer from the reader). The monitor never checks status bits that indicate a data error. The bit it checks is not set by 20F when a read/stack/pick check occurs.
2. Before processing any msg from 20F, DCRSER calls SETRGS to set up ACs and find the DDB etc.
The first thing SETRGS does is test the 1st byte of the msg for the "non-existent" device bit - useful during autoconfiguration when examining a status msg. However, on a data transfer, a 9-punch in Col. 1 happens to be the same bit!
[Cure]
1. Check read/pick/stack check bits in status byte also.
2. Change SETRGS entry point to SETRGX and only call it when a status msg is received. (the ONLY time we care about non-existent devices). Move SETRGS entry down a few instructions where it starts looking for a reader DDB.
[Keywords] card readers RSX20F
[Related MCOs] 13932, 13137
[Related SPRs] None
[MCO status] None
[MCO attributes] New development MCO, KL10 only
[Validity]
Monitor  Load    Module  Tags
-------  ------  ------  ------
705      420     DCRSER  SETRGS,F11DVS
[End of MCO 14160]

MCO: 14161    Name: DPM    Date: 1-Feb-89:08:24:52
[Symptom] In some configurations, an offline alternate port claims to be an RP04.
[Diagnosis] This problem is highly dependent upon timing and configuration, and only affects MASSBUS disks. At system startup time, the disk drives are autoconfigured. Drive type information is gathered and properly stored in the unit data blocks. Later, ONCMOD will build the in-core structure data base and again, attempt to read the drive types. This redundant drive type check exists to guard against the operator swapping LAP plugs, thus changing an RP06 into an RP05. If the drive type register cannot be read, then incorrect data is stored in the drive type byte in the unit data block. It is not clear why the second attempt to read the drive type register fails. The DATAI to read the register returns zeros. Normally, this could happen because the other port is busy or if the last I/O operation on the other port failed to do a dual-port drive release upon completion. Since the drive is offline, no I/O was started. Also, it has been observed that if one or more online drives exist with a higher unit number, then the problem disappears.
This indicates a possible hang in the controller. In addition, if the interval between checking the primary port and the alternate port is sufficiently long, then the DATAI always succeeds.
[Cure] Problems similar to this have existed for at least 3 monitor releases. The only flaw which all monitors have in common is that the failure to read the drive type is ignored and junk overwrites the drive type code in the unit data block. A simple solution is to jump around the code which stores the drive type byte. After all this time, it seems unlikely we will determine the real nature of the DATAI failure, so this workaround must suffice.
[Keywords] RP04
[Related MCOs] 13932, 13137
[Related SPRs] 36230
[MCO status] Checked
[MCO attributes] None
[Validity]
Monitor  Load    Module  Tags
-------  ------  ------  ------
705      420     ONCMOD  TRYUNI
704A
[End of MCO 14161]

MCO: 14162    Name: JEG/DPM    Date: 6-Feb-89:05:34:53
[Symptom] Day one 7-series bug: Stopcode DAU and corrupt user core images.
[Diagnosis] A user job enables for clock interrupts via APRENB. APRSUB proceeds in a normal service-a-clock-tick fashion, but notices that the user has requested a clock-interrupt, and so it exits not with POPJ but with a JRST off to APRUTP. APRUTP may decide to fall thru to APRUT2. If T4 doesn't have UE.PEF/UE.NXM on (and it won't of course) it will continue to fall thru. APRUT2 will decide there is a loop in the trap handler (and there is). At this point APRUT2 loads the double word PC into T3/T4, and saves it off to .CPAPC for the error message. Then it branches off to APRUTW. APRUTW sets up the APRLOP PC, and exits off to APRSTU. APRSTU looks at T4 expecting possibly to find UE.PEF!UE.NXM, but instead it has some PC bits left over from APRUT2. This fools APRSTU into calling DIENLK.
[Cure] At APRUT2, don't clobber T4 with a PC. Use T1/T2 instead.
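The register clash described in MCO 14162 can be modeled in a few lines. The sketch below is Python rather than MACRO-10 and is only an illustration: the bit values for UE.PEF and UE.NXM, the sample PC words, and the function names are assumptions chosen so the overlap is visible. The routine names APRUT2 and APRSTU and the AC names come from the MCO.

```python
# Illustrative model only; the real code is MACRO-10 in CLOCK1.
UE_PEF = 0o1  # assumed bit value; not given in the MCO
UE_NXM = 0o2  # assumed bit value; not given in the MCO


def aprut2_old(acs, pc_flags, pc_addr):
    """Pre-fix behavior: load the double-word PC into T3/T4, clobbering T4."""
    acs["T3"], acs["T4"] = pc_flags, pc_addr


def aprut2_new(acs, pc_flags, pc_addr):
    """Post-fix behavior: load the double-word PC into T1/T2, leaving T4 alone."""
    acs["T1"], acs["T2"] = pc_flags, pc_addr


def aprstu_sees_error(acs):
    """Model APRSTU testing T4 for UE.PEF!UE.NXM."""
    return bool(acs["T4"] & (UE_PEF | UE_NXM))


def run(aprut2):
    acs = {"T1": 0, "T2": 0, "T3": 0, "T4": 0}  # T4 holds no error bits on entry
    aprut2(acs, 0o010000, 0o001003)  # hypothetical PC whose low bits overlap the flags
    return aprstu_sees_error(acs)


old_result = run(aprut2_old)  # leftover PC bits in T4 look like an error
new_result = run(aprut2_new)  # T4 is untouched, so no false error
```

With these assumed values the old path reports a spurious error and the new path does not, which mirrors how leftover PC bits fooled APRSTU into calling DIENLK.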
[Keywords] CLOCK INTERRUPT
[Related MCOs] 13932, 13137
[Related SPRs] None
[MCO status] Checked
[MCO attributes] None
[Validity]
Monitor  Load    Module  Tags
-------  ------  ------  ------
705      421     CLOCK1  APRUT2
704A
[End of MCO 14162]

MCO: 14163    Name: DPM/RCB    Date: 7-Feb-89:07:57:02
[Symptom] A user can PIVOT away from a PPN that the CHGPPN checks will not allow him to return to.
[Diagnosis] Oversight.
[Cure] Always allow CHGPPN to work if returning back to the job's logged-in PPN.
[Keywords] CHGPPN
[Related MCOs] 13932, 13137
[Related SPRs] None
[MCO status] Checked
[MCO attributes] New development MCO
[Validity]
Monitor  Load    Module  Tags
-------  ------  ------  ------
705      421     UUOCON  CHGPPN
704A
[End of MCO 14163]

MCO: 14164    Name: TL    Date: 10-Feb-89:15:34:21
[Symptom] RX2 STOPCDs
[Diagnosis] If the RX20 (RX211) controller is broken such that TR is not returned to STRTIO, it is possible (but unlikely) for the RX20 controller to post an error interrupt. If it does, then the error interrupt service routine will free the controller, or, worse yet, schedule IO for another drive. In either case, we return from the interrupt back into STARTIO, where we now write the drive registers out of sync with what the controller expects. This causes an error interrupt, and since no drive is (probably) active, an RX2 STOPCD.
[Cure] Turn the PI system OFF while in STARTIO. On TR errors, deschedule the controller before turning it back ON. Since it's possible for the KS to accept a vectored interrupt before the deschedule code resets the interrupt enable bit, teach RX2INT to dismiss unexpected interrupts rather than STOPCD.
[Keywords] RX2 STOPCD RX2SER RX20
[Related MCOs] 13932, 13137
[Related SPRs] None
[MCO status] None
[MCO attributes] KS10 only
[Validity]
Monitor  Load    Module  Tags
-------  ------  ------  ------
705      422     RX2SER  RX2INT,STRTIO,SETPAR
704A
[End of MCO 14164]

MCO: 14165    Name: RCB    Date: 14-Feb-89:07:45:35
[Symptom] PAGE. UUO function .PAGAC does not work right for non-existent pages in mapped sections.
[Diagnosis] The code to report non-existent pages and/or independent sections is not sufficiently forgiving of dependent sections.
[Cure] Always return the mapping information for dependent sections.
[Keywords] Page Accessibility
[Related MCOs] 13932, 13137
[Related SPRs] None
[MCO status] Checked
[MCO attributes] New development MCO, Extended addressing only
[Validity]
Monitor  Load    Module  Tags
-------  ------  ------  ------
705      422     VMSER   PAGAC1,PAGAC6,PAGAC7
704A
[End of MCO 14165]

MCO: 14166    Name: DPM    Date: 14-Feb-89:07:59:06
[Symptom] Ill mem ref running SPEAR following magtape error logging. Seen mostly with multi-ported tapes, but theoretically possible on any tape.
[Diagnosis] When two or more kontrollers have access to the same tape drive, the IEP and FEP blocks are timeshared. It is expected that DAEMON will finish servicing one error before the monitor queues up data for the next. In practice, this isn't always the case.
[Cure] Convert TAPSER, TAPUUO, and the drivers to use system error blocks. When an error occurs, the data will be copied into the SEBs and queued up for DAEMON to write into ERROR.SYS. The biggest problem with doing this is the monitor must format the error record itself, as SEBs are merely copied into the error file without modification. No big deal. This will reduce the monitor's dependency on DAEMON.
[Keywords] MAGTAPE ERROR LOGGING
[Related MCOs] 13932, 13137
[Related SPRs] None
[MCO status] Checked
[MCO attributes] None
[Validity]
Monitor  Load    Module  Tags
-------  ------  ------  ------
705 422 AUTCON RDDTN 704A DEVPRM TUB S TAPSER TAPDRV TAPUUO LOTS T78KON TCXKON TD2KON TM2KON TMXKON TS1KON TX1KON
[End of MCO 14166]

MCO: 14168    Name: RCB    Date: 20-Feb-89:12:55:04
[Symptom] Two complaints received regarding ONCMOD:
KLAD pack interface isn't as friendly as it could be.
Bad block typeout is sometimes a little too terse.
[Diagnosis] We special-case the KLAD structure in several places, but we don't treat it specially when gathering units for defining a structure. After all, we know that the KLAD pack is just one pack.
If the user requests that bad blocks be shown for a unit, but the unit has no bad blocks recorded, then we don't type out anything about bad blocks. This leaves the user wondering whether we forgot about the request to show them.
[Cure] Only ask for one spindle when gathering units for structure KLAD.
Add the message "[No bad blocks found on unit ]".
[Keywords] KLAD BAF BAT Bad blocks
[Related MCOs] 13932, 13137
[Related SPRs] None
[MCO status] Checked
[MCO attributes] New development MCO, Documentation change
[Validity]
Monitor  Load    Module  Tags
-------  ------  ------  ------
704A     423     ONCMOD  GETBAT,GETUN6
705
[End of MCO 14168]

MCO: 14169    Name: RCB    Date: 21-Feb-89:08:45:24
[Symptom] Invalid prompt for first logical block for swapping when defining a structure.
[Diagnosis] Re-use of the old value for a unit which is no longer valid after other changes to its swapping parameters.
[Cure] Range check the old value, and don't use it for the default if it's invalid.
[Keywords]
[Related MCOs] 13932, 13137
[Related SPRs] None
[MCO status] Checked
[MCO attributes] New development MCO
[Validity]
Monitor  Load    Module  Tags
-------  ------  ------  ------
704A     423     ONCMOD  GETSW1
705
[End of MCO 14169]

MCO: 14171    Name: LWS    Date: 22-Feb-89:18:41:42
[Symptom]
1. Problems running DFDXC and DFDXD in user mode. Specifically when using the "Specify Channel Program" DIAG. function, .DISCP.
2. We try to load the DX10 on CPU0 from CPU1 when a DX20 diag exits.
[Diagnosis]
1. The diags use the .DIAAU (Assign all units) DIAG. function to keep things nice when it starts init'ing the channel, etc. When the .DISCP DIAG. function is used, the monitor grabs the DDB for the tape drive from the PDB. However, the DDB in the PDB when the .DIAAU function is used is the last tape drive DDB.
Not necessarily the DDB for the tape drive the diag is using. So the CCW list is built for the wrong drive (except when the last drive is used). After subsequent calls using .DISCP, we run out of free core for CCWs.
2. When a DX20 diag puts the controller in maintenance mode, we detect that the DX10 can also access the drives so we put it in maintenance mode also. This keeps TAPSEC happy. The diag sets CPU to CPU1 because that's where the DX20 is located. When the diag releases the DX20, or is ^C'd, TPMCMX is called to free up everything. Since we are running on CPU1, the load of the DX10 fails (because we use the TPKRES and TPKLOD dispatches from TPMCMX).
[Cure]
1. Can of worms. Because of the way the DIAG. functions work with respect to DIAKDU and DIADEV, make the diags do a .DIASU (Assign single unit) DIAG. function so that the proper DDB is placed in the PDB and is found by the next .DISCP DIAG. function. In order for this to work, we have to let TPDASU (and TPDAAU for consistency) do their stuff even if F is nonzero on entry to the routine. They still make sure that the current job is the one executing the UUO if the DDB is already "owned".
2. In TPMCMX, check the KDBCAM mask against .CPBIT before calling TPKRES and TPKLOD routines.
[Keywords] Diagnostics
[Related MCOs] 13932, 13137
[Related SPRs] None
[MCO status] None
[MCO attributes] New development MCO
[Validity]
Monitor  Load    Module  Tags
-------  ------  ------  ------
705      423     TAPUUO  TPDASU,TPDAAU,TPMCMX,TPDHVF
[End of MCO 14171]

MCO: 14172    Name: DPM    Date: 24-Feb-89:04:01:53
[Symptom] A batch login may fail if the number of logged-in jobs minus the number of reserved batch job slots is greater than LOGMAX.
[Diagnosis] The difference between LOGMAX and JOBMAX is the number of jobs reserved for emergency logins. A logging-in timesharing job may be granted access if LOGNUM will not exceed LOGMAX, and providing BATMIN job slots are reserved for batch logins.
However, if the job logging in is running under batch, then BATMIN must not be included in the computation.
[Cure] Don't account for BATMIN job slots when a batch job is logging in. Its inclusion is only meaningful for timesharing logins.
[Keywords] BATMIN
[Related MCOs] 13932, 13137
[Related SPRs] 36246
[MCO status] Checked
[MCO attributes] None
[Validity]
Monitor  Load    Module  Tags
-------  ------  ------  ------
705      424     UUOCON  ACCLOG
704A
[End of MCO 14172]

MCO: 14173    Name: DPM    Date: 24-Feb-89:09:34:47
[Symptom] The old methods of DAEMON error logging leave something to be desired.
[Diagnosis] Currently, most of the monitor expects DAEMON to gather additional data for ERROR.SYS beyond what it's initially given. This exercises race conditions, slows performance because jobs are sometimes stopped until DAEMON is finished, and makes DAEMON dependent upon monitor versions and data structure formats.
[Cure] Start converting the old-style DAEMON calls to use System Error Blocks. SEBs eliminate the race conditions because one is queued up for each error log entry rather than always overwriting the same storage with new error data. Performance is improved by not having to prevent jobs from running while DAEMON is logging the error. This also eliminates the dependency of DAEMON upon the monitor because the monitor will format the entire record. DAEMON merely copies SEBs into ERROR.SYS.
This edit will do: DL10 error records I/O BUS LPT error records Stopcode records Software Events (POKE, RTTRP, SNOOP, and TRPSET) [Keywords] ERROR LOGGING [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 424 CLOCK1 DAEEST 704A COMDEV DL10EL ERRCON DIELOG,XFRSEB LPTSER LPTSYR RTTRP RTRET S EX.SYE,EX.DEL UUOCON POKE2,SNPIBP,TRPSTX [End of MCO 14173] MCO: 14174 Name: DPM Date: 28-Feb-89:05:45:07 [Symptom] New: To accommodate future tape service bug fixes and enhancements, increase the size of the IORB. Do this by defining a set of "common" IORB definitions, to be used initially by tape service, and possibly later by FILSER. Append the tape-specific words to the common portion.

Common words:

        .ORG    0
IRBLNK::!BLOCK  1       ;FORWARD LINK TO NEXT IORB
IRBACC::!BLOCK  1       ;ACTIVE (CURRENT) CHANNEL COMMAND
IRBCCW::!BLOCK          ;ADDRESSES OF CHANNEL COMMANDS
IRBIVA::!BLOCK  1       ;ADDRESS OF INTERRUPT ROUTINE
IRBDDB::!BLOCK  1       ;ADDRESS OF DDB BEING SERVICED
IRBSIZ::!               ;LENGTH OF COMMON IORB
        .ORG

Tape-specific words:

        .ORG    IRBSIZ
TRBFNC::!BLOCK  1       ;FUNCTION DATA
TRBSTS::!BLOCK  1       ;TERMINATION STATUS
TRBRCT::!BLOCK  1       ;BYTE COUNT OF TRANSFER, IF DATA READ
TRBLEN::!               ;LENGTH OF BLOCK
        .ORG

IRBLNK is the old TRBLNK, but a full word quantity. IRBCCW is the merger of TRBXCW and TRBEXL. IRBIVA is the old TRBIVA. TRBFNC is the old LH of TRBLNK and now can grow beyond bit 17. TRBSTS could also be made a full word quantity. [Diagnosis] [Cure] [Keywords] MAGTAPE [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO Documentation change [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 424 DEVPRM 704A SCAPRM TAPSER TAPUUO T78KON TCXKON TD2KON TM2KON TMXKON TS1KON TX1KON [End of MCO 14174] MCO: 14175 Name: JEG/DPM Date: 28-Feb-89:06:06:19 [Symptom] ADP code reading.
Jeff Gunter points out that SCNPIF doesn't include DSKBIT in configurations which have only a single CPU. Why is that, he said? [Diagnosis] Don't know. Looks like an oversight. While this is a common configuration, it could only cause problems when the monitor is in the middle of a SCNOFF and FILSER decides to print "problem on device" at interrupt level. The SCNOFF will not have turned off DSKCHN, thus allowing FILSER to do obscene things at inappropriate times. [Cure] Probably doesn't happen a lot. Remove the conditional assembly and always include DSKBIT in SCNPIF. This is necessary only because FILSER insists on typing out at interrupt level. [Keywords] SCNSER [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 424 COMMON SCNPIF 704A [End of MCO 14175] MCO: 14176 Name: DPM Date: 7-Mar-89:05:38:49 [Symptom] It has been brought to our attention that some customer(s) want to install RM05s on their DEC-10. So be it, however, this is not recommended and will remain UNSUPPORTED. RM05s are interesting devices. They are faster than an RP06 and consume less power, as they require only single phase power. An RM05 has 30 sectors/track (10 more than an RP06), yet they run at the same 3600 RPM. Therefore, the capacity and the transfer rate are about one third greater than an RP06. Don't look a gift horse in the mouth. For starters, the head crash rate is rather high. It seems that RM05s work best when left alone. Despite the fact that they use removable media, frequent disk pack changes greatly increase the chance of a head crash. The heads fly fairly close to an RM05 pack; much closer than in an RP06. Presumably, this is the main cause of head crashes. Also, parts for RM05s are not nearly as plentiful as are those for RP06s. [Diagnosis] Missing table entries in RPXKON. [Cure] Add entries to the tables for blocks per unit, etc.
This is all that's required to make RM05s work. In all other manners, RM05s behave like an RP06. [Keywords] RPXKON [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 425 AUTCON DTRTBL DEVPRM TY.RM5 RPXKON TYPTAB UUOSYM .DCUR5 [End of MCO 14176] MCO: 14177 Name: DPM Date: 6-Mar-89:05:54:25 [Symptom] DAEMON error logging. [Diagnosis] Yes. [Cure] Convert more old-style calls to use System Error Blocks. Changes in this edit include: 1. Channel NXM & parity error logging. 7.04 records written by DAEMON contained mostly junk. 2. DECtape error logging. 3. KS10 memory error logging. Doc change: This adds one word (.CPMFL) to the CPU subtable FOR KS10 memory errors. This word is a flag which indicates the last type of error (0 = soft, 1 = hard). Also, the length of the subtable (.CPMSL) was off by one word and has now been corrected. 4. KS10 card reader & line printer error logging. [Keywords] ERROR LOGGING [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO Documentation change [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 425 APRSER MEMCHK 704A CD2SER CDRSYR COMDEV DTXEST,DTXEFL,DTXEBK COMMON .CPMFL,.CPMSL DTASER ERRS,DTASYR ERRCON CHNCO3 LP2SER LPTSYR [End of MCO 14177] MCO: 14178 Name: RCB Date: 13-Mar-89:15:12:41 [Symptom] STOPCDs AAO and IME, undeserved address checks, and undeserved checksum errors during dump-mode I/O. [Diagnosis] In the old days before 7.03, LRNGE was called to range-check an IOWD. It checked everything we needed to have checked just fine. One of the things which it checks is that the range of addresses does not cross a section boundary. Thus, it was no longer appropriate once .FOFXI/.FOFXO (extended dump I/O) were added to the FILOP. UUO. 
MONPFH does not check that old-style IOWD-based I/O does not cross a section boundary, nor does it check that the I/O is not done to the ACs. This can lead to AAOs. If the user's working set includes swapper-write-locked pages, then MONPFH will call LRNGE, even though it might be doing extended I/O, thus resulting in an undeserved address check error for an I/O doubleword which crosses a section boundary. If FILIO has to perform error recovery and retries during a dump-mode I/O operation which ends at a section boundary, under some circumstances it leaves DEVISN containing bogus information in the DDB. If this was also the first block in a retrieval pointer, we will then proceed to attempt to calculate the checksum based on a user address which we calculate, in part, from this junk in DEVISN(F). This can cause either an IME or an undeserved checksum error. Finally, much of the above is exacerbated by NXCMR in UUOCON, which is the common routine used to fetch and validate the next IOWD in a user's channel command list. It does not validate correctly when MONPFH passes it an IOWD which either starts with or crosses a section boundary. [Cure] Teach NXCMR how to validate all IOWDs which PFDOIO might pass. Correct all incorrect uses of DEVISN(F). Teach PFDOIO to use ZRNGE rather than LRNGE when it wants to fix up swapper-write-locked pages. Teach PFHDMP to give an address check error when an old-style IOWD crosses a section boundary. Teach CHKSUM to use GETEWD rather than GETWRD, so that it always fetches the correct word from the user's buffer. Teach PFDOIO to validate the range of addresses for I/O in order to be sure that I/O is not attempted to the ACs.
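The two range checks named in the cure for MCO 14178 (no old-style IOWD may cross a section boundary, and no I/O may target the ACs) can be modeled outside the monitor. The Python sketch below is an illustration only, not the monitor's code. It assumes the usual PDP-10 IOWD layout (negated word count in the left half, address minus one in the right half), 18-bit in-section addresses, and ACs at locations 0-17 octal. The names make_iowd, decode_iowd, and check_iowd are hypothetical.

```python
HALF_MASK = 0o777777        # 18-bit halfword mask
SECTION_SIZE = 1 << 18      # words per section (assumed 18-bit in-section address)
AC_LIMIT = 0o20             # locations 0-17 octal are the ACs

def make_iowd(count, addr):
    """Build a 36-bit IOWD: -count in the left half, addr-1 in the right."""
    left = (-count) & HALF_MASK
    right = (addr - 1) & HALF_MASK
    return (left << 18) | right

def decode_iowd(word):
    """Split a 36-bit IOWD into (word_count, first_address)."""
    left = (word >> 18) & HALF_MASK
    right = word & HALF_MASK
    count = (-left) & HALF_MASK         # undo the negation
    first = (right + 1) & HALF_MASK     # right half holds address - 1
    return count, first

def check_iowd(word):
    """Return "ok" or an address-check reason for an old-style IOWD."""
    count, first = decode_iowd(word)
    if first < AC_LIMIT:
        return "address check: I/O to the ACs"
    if first + count > SECTION_SIZE:
        return "address check: crosses a section boundary"
    return "ok"

print(check_iowd(make_iowd(0o200, 0o1000)))     # ordinary in-section transfer
print(check_iowd(make_iowd(0o200, 0o777700)))   # runs past the end of the section
```

A transfer whose last word is the last word of the section passes this model; only a transfer that continues past it is rejected.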
[Keywords] DUMP I/O AAO IME ADDRESS CHECK CHECKSUM ERROR IOIMPM IO.IMP [Related MCOs] 13932, 13137 [Related SPRs] 35576, 36064 [MCO status] Checked [MCO attributes] Extended addressing only [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 426 UUOCON FOPN9B,UINITC,RELEA4,NXCHIT 704A MONPFH PFHDM1,DOIO2 FILIO SATADR,MONIOY,SETLS7,POSER2,ECC2,ECC3,NOECC,CHKSUM,CSHC2B,CSHB2C FILUUO DUMPG9 [End of MCO 14178] MCO: 14179 Name: JEG/DPM Date: 21-Mar-89:05:41:31 [Symptom] FILSER doesn't usually continue from a DHD stopcode (Don't Have DA). [Diagnosis] If IOSDA is off in S (but not necessarily in DEVIOS), then a DHD will result. But if the job really does own the DA resource, it will hang, since the DA is never released. [Cure] Let the DHD return .+1. Further checks will prevent the DA from being returned for the wrong job (a RWD is likely). If we manage not to get a RWD, then the DA will be released and the monitor will continue with no problems. [Keywords] STOPCODE DHD [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 427 FILIO DWNDA 704A [End of MCO 14179] MCO: 14180 Name: JEG/DPM Date: 21-Mar-89:05:45:20 [Symptom] Stopcode KLPKAF following parity scans. [Diagnosis] A parity scan requires more than KAFTIM seconds to complete. If PPDSEC doesn't get called soon enough (and it won't because of the scan), it declares the KLIPA dead. [Cure] Increase KAFTIM from 10 to 35 seconds. This allows about 8 seconds per meg plus a few extra for good measure. Increase KNISER's timer (also called KAFTIM) from 30 to 35 seconds too. [Keywords] KLPKAF [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 427 KLPSER KAFTIM 704A KNISER KAFTIM [End of MCO 14180] MCO: 14181 Name: JEG/DPM Date: 21-Mar-89:05:49:40 [Symptom] DI hangs on RA failovers. 
A failover can leave several jobs stuck in "problem on device" mode for the old unit, even after lots of time passes. [Diagnosis] PCLDSK may inadvertently get called with an "old" unit if a failover is happening while another CPU is preparing to start I/O. The "old" unit was OK, but now, KDBCAM contains zero, causing PCLDSK to get called. PCLDSK sees no CPUs (and indeed there aren't any with the old unit) and calls HNGSTP, eventually looping back to PCLDSK again with the "old" unit. [Cure] If there is an online alternate port, use it and bypass HNGSTP. [Keywords] FAILOVER [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 427 FILIO PCLDSK 704A [End of MCO 14181] MCO: 14182 Name: JEG/DPM Date: 21-Mar-89:05:54:13 [Symptom] If a CPU croaks before it can be warm-restarted successfully, and field service is able to fix it "on the fly", sometimes bad things (usually hangs) happen immediately following the J 400. [Diagnosis] This can happen because a CPU restart clears SP.CJn for all jobs, and then CPUZAPs the "running job" for the CPU, leaving a small window when the job can be scheduled to run on another CPU. [Cure] Change SPRINI to call CPUZAP first, and then clear SP.CJn. [Keywords] WARM RESTART [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 427 COMMON SPRLP1,SPRI11 704A [End of MCO 14182] MCO: 14183 Name: JEG/DPM Date: 21-Mar-89:05:59:19 [Symptom] Stopcode KAF in QUESER. [Diagnosis] It is possible for one CPU to be diddling the database at UUO level with the EQ lock, while ENQMIN runs at interrupt level on another CPU. If UUO level CPU removes and releases the free core holding a block that is being scanned by ENQMIN, KAFs or other stopcodes may result.
[Cure] Implement a scheme where UUO level waits for interrupt level and interrupt level punts if UUO level holds the EQ resource. [Keywords] STOPCODE KAF [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 427 QUESER ENQMN2,EQLOCK,LOKINQ 704A [End of MCO 14183] MCO: 14184 Name: JEG/DPM Date: 21-Mar-89:06:03:20 [Symptom] If a CI disk contains HOM blocks which look valid but contain a zero word for the structure name, a failover will cause PULSAR to sniff out the disk and mount a structure with no name. [Diagnosis] Monitor never checks for a zero structure name in DEFSTR. [Cure] Return "illegal structure name" error when no name is given. [Keywords] DEFINE STRUCTURE [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 427 FILFND DEFSTR 704A [End of MCO 14184] MCO: 14186 Name: DPM Date: 24-Mar-89:08:37:20 [Symptom] Stopcode KAF in KNISER. [Diagnosis] On a very busy Ethernet wire, it is possible to spend more than 6 seconds at interrupt level taking packets off the KLNI. RSX-20F has little patience for this sort of nonsense, so it KAFs the -10. [Cure] Put an arbitrary limit on the number of packets that we'll process in a single interrupt. Experimentation has proven that trying to remove 2100 (decimal) or more packets from the queue will result in a KAF. Therefore, set the limit to 2000. Location .PBMPP (maximum packets processed) in the KDB/PCB contains the limit and can easily be patched to a different value. When the limit is exceeded, a KNIKSP (KLNI Service Paused) info stopcode will be typed on the CTY. Then the PIA will be removed for one second to let things settle down.
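The per-interrupt packet limit in the cure for MCO 14186 amounts to a bounded drain loop. The Python sketch below models that idea only. The 2000-packet default comes from the MCO text, while the function name, the use of a deque as the receive queue, and the "paused" flag standing in for the KNIKSP stopcode and the one-second PIA removal are all assumptions made for illustration.

```python
from collections import deque

MAX_PACKETS_PER_INTERRUPT = 2000    # the value the MCO keeps in .PBMPP

def service_interrupt(queue, limit=MAX_PACKETS_PER_INTERRUPT):
    """Remove at most `limit` packets from the queue in one interrupt.

    Returns (packets_processed, paused). `paused` is True when packets
    were still waiting after the limit was reached, which is where this
    model assumes the KNIKSP stopcode and the one-second pause occur.
    """
    processed = 0
    while queue and processed < limit:
        queue.popleft()             # stand-in for handling one packet
        processed += 1
    paused = len(queue) > 0
    return processed, paused

busy = deque(range(2500))
print(service_interrupt(busy))      # stops at the limit, 500 packets remain
quiet = deque(range(10))
print(service_interrupt(quiet))     # drains completely, no pause
```

Keeping the limit in a patchable location, as the MCO does with .PBMPP, corresponds to passing a different `limit` argument here.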
[Keywords] KNISER KAF [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 430 KNISER KNIRQ1,KNIPAU,KNICON 704A [End of MCO 14186] MCO: 14189 Name: JEG/DPM Date: 28-Mar-89:08:22:29 [Symptom] If a program dies with infinite IPCF quotas and freecore is very low or about to expire, the system grinds to a standstill. Some jobs are stuck NApping and others get unexpected error returns. Trying to log off the offending job fails. [Diagnosis] IPCLGO does two things. It sends a logout message to QUASAR and it turns around all unreceived messages; in that order. The send to QUASAR will fail because there is no available freecore, and the logging out job owns a large chunk of it. [Cure] Reverse the order of things. First, empty the send and receive queues, then send the logout message to quasar. [Keywords] IPCF LOGOUT [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 430 IPCSER IPCLGO 704A [End of MCO 14189] MCO: 14190 Name: JEG/DPM Date: 4-Apr-89:05:14:36 [Symptom] Stopcode IME removing a structure. Other problems possible too. When allocation is in progress, or the ACCs and NMBs are in transition, and a structure is being removed, an IME is likely to occur on a busy system. [Diagnosis] TAKBLK and friends rely on DEVUNI(F) to indicate the target unit for a structure is still valid. FILSER normally depends upon TSTGEN checking UNIGEN. The window is sufficiently large to allow the SKIPN DEVUNI to work while REMSTR is removing a structure. [Cure] Change TAKBLK to call TSTGEN. Make BMPGEN get and release the DA around the update of UNIGEN. 
[Keywords] DISMOUNT [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 431 FILFND BMPGN1 704A FILIO TAKBL0,TAKBLJ [End of MCO 14190] MCO: 14191 Name: RCB Date: 5-Apr-89:22:16:13 [Symptom] Hung ANF traffic to a node. Especially common over an Ethernet channel. It may (sometimes) correct itself eventually, especially if it was not an Ethernet channel that was involved. [Diagnosis] After NETWRT queues an output message (PCB) to the FEK, it calls its device driver to perform the output. This can happen several times before the device driver tells the FEK routine that the output has happened. At that point, the FEK routine tells NETSER that the message has been sent. This causes the PCB to be placed on a generic output-done queue for NETSCN to process. Once we get to NETSCN, we move PCBs from this queue to a queue for the NDB for the node to which we were sending the message. The subroutine responsible for this, NTSC.O, is also responsible for keeping the NDBLMS (last message sent) field updated. It does this by noting the message number of each PCB it places into the output-pending queue in NDBLMS. However, the PCB queue from which it is taking these messages is unordered, and this can lead to having a very long list of messages, with NDBLMS reflecting only (for example) the first of them. Once this has happened, CHKNCA (check network-control ACK) will ignore an ACK for any message beyond that in NDBLMS. However, the remote is quite likely to send us an ACK for the actual last message in the ACK-pending queue. This leads to a full output queue and a refusal to transmit any further data messages, at least until the REP/NAK timer causes us to send a REP, which will result in a NAK. Because we ignored the implicit ACK present in the NAK, we will still have a queue of outstanding messages, which the NAK will cause us to retransmit all at once.
Unless the device driver stutters in a friendly manner, this will merely get us into the same mess again with the same set of messages, and no progress will ever be seen. [Cure] In NTSC.O, only change NDBLMS if it's moving in a forward direction. In INCTNK, where we resend the queue in response to a NAK, reset NDBLMS to NDBLAP in order to avoid possible ACK races. [Keywords] ANF Ethernet Hung ANF [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] HOSS attention [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 432 NETSER NTSC.O,INCTNK 704A [End of MCO 14191] MCO: 14192 Name: RCB Date: 5-Apr-89:22:55:57 [Symptom] Terminal characteristics get handled incorrectly during a SET HOST session which is handled by NETVTM. [Diagnosis] Setting the terminal type happens after all the other characteristics get set, and clobbers them. [Cure] Save the other characteristics until after we set the terminal type in VTMCHR. [Keywords] NETVTM SET HOST terminal characteristics [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] HOSS attention [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 432 NETVTM VTMCHR 704A [End of MCO 14192] MCO: 14193 Name: DPM Date: 6-Apr-89:11:36:40 [Symptom] MCO 14190 went a bit too far. [Diagnosis] In trying to close the window where a structure could be removed while other things were being done to the ACC/NMB blocks, BMPGEN was modified to get and give the DA resource. However, one needs a DDB to use the DA and REMSTR doesn't have one to use. Also, BMPGEN expects F to contain a STR DB addr, not a DDB. [Cure] Can't plug the hole that tight. Remove references to the DA in BMPGEN and live with occasional IMEs. There is no structure-wide resource to take care of this situation. Too bad. 
[Keywords] REMSTR [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 432 FILFND BMPGEN 704A [End of MCO 14193] MCO: 14194 Name: KDO Date: 6-Apr-89:12:18:04 [Symptom] Invalid status returned in ETHNT. UUO User Buffer Descriptor (UBD) blocks. [Diagnosis] Missing code. [Cure] Add code. [Keywords] [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] None [MCO attributes] KL10 only [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 432 ETHUUO ENCXDG 704A [End of MCO 14194] MCO: 14195 Name: KDO Date: 10-Apr-89:10:54:04 [Symptom] Adjacency up/down events for DECnet endnodes on multi-area LANs. [Diagnosis] DECnet is choosing a designated router outside its area. [Cure] Ignore Ethernet Router Hello messages from outside our area. [Keywords] [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] None [MCO attributes] KL10 only [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 432 ROUTER RHMASE 705 [End of MCO 14195] MCO: 14197 Name: DPM Date: 11-Apr-89:08:44:04 [Symptom] REFSTR creates files with strange version numbers. [Diagnosis] Sticking REFSTR's version rather than the monitor's is at best, non-standard. But when displayed by DIRECT, it looks like a bug. [Cure] Use CNFDVN instead. [Keywords] REFRESH [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 423 REFSTR RIBST1 [End of MCO 14197] MCO: 14198 Name: LWS Date: 17-Apr-89:10:18:26 [Symptom] 1. Same TM02/3 controller register dumped 9 times in TUB on error. 2. SPEAR doesn't know how to interpret TM02/3 controlled tape drive error entries or the monitor doesn't give SPEAR what it expects, take your pick. [Diagnosis] 1. RDREGS in TM2KON expects T2 to still contain controller register number on return from RDMBR.
RDMBR clears all but register data in T2. 2. SPEAR expects 2 equal length blocks of error status information in the error entry (IEP and FEP data). However, TM2KONs IEP length is 1 and FEP length is 16 (octal). So we only write 1 word of "IEP" information. This causes SPEAR's interpretation of the error to be garbage (1. above doesn't help either). Note: The TUB for a TM02/3 controlled tape drive contains 2 blocks of TM2ELN words each for "IEP" and "FEP" error information. But the IEP word is set for only a length of 1. Why? I don't know. Poking the IEP word on 2476 to be the same as the FEP word causes 2 sets of error information to be dumped and SPEAR correctly interprets the error. So it seems we can change SPEAR to handle unequal length "IEP" and "FEP" error blocks, or have the monitor dump equal length blocks. [Cure] 1. PUSH/POP T2 around call to RDMBR at RDREG1. 2. Change LH of TUBIEP in TM2KON to be -TM2ELN. [Keywords] SPEAR TM02/3 TU77 [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] None [MCO attributes] New development MCO Field service attention PCO required [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 433 TM2KON TUBIEP,RDREG1 [End of MCO 14198] MCO: 14199 Name: LWS Date: 17-Apr-89:11:17:10 [Symptom] Can't assign a network device if a device of the same type doesn't exist on the local host. [Diagnosis] If the device doesn't exist on the local host there will not be an entry in GENTAB for the corresponding device. The call to CHKGEN in DVSTAS will fail and we bomb the user even though the network device does exist and is assignable. [Cure] At the non-skip return after the call to CHKGEN in DVSTAS load F with the start of the DDB chain and fall through into code that will eventually do the right stuff. But! This is not going to work correctly all the time. 
If no local line printers exist and we're trying to find a network printer DDB, we eventually build a DDB for the network printer and try to link it between the 'DSK' DDB and the 'SWAP' DDB - ding ding ding, IME. This happens because LNKDDB in AUTCON likes to keep the DDB chain in sorted order by device name. So 'LPT' falls between 'DSK' and 'SWAP', but 'DSK' DDBs are in the hiseg. In order to avoid the wrath of FILSER, change the name of SWPDDB to 'DSKSWP'. [Keywords] NETWORK DEVICE [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] None [MCO attributes] Beware file entry required New development MCO PCO required [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 433 COMMOD DEVNAM UUOCON DVSTAS [End of MCO 14199] MCO: 14200 Name: LWS/DPM Date: 20-Apr-89:10:52:06 [Symptom] Tape UDBs on KS not filled in with prototype data. [Diagnosis] AUTUDB doesn't compute ending address for BLT. [Cure] ADDI P2,(U) [Keywords] KS SPEAR [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO Field service attention PCO required Single-section monitors only [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 434 AUTCON AUTUD1 [End of MCO 14200] MCO: 14201 Name: RCB Date: 25-Apr-89:07:05:00 [Symptom] FILSER's error reporting leaves too big a window for DAEMON to get stale information for SPEAR to report. Not only that, but DAEMON even has to guess just what kind of error it is supposed to report. [Diagnosis] The ERRPT. UUO just doesn't give us enough to work with. We need to use system error blocks if we're going to get it right. [Cure] Do so. This adds EX.AVL to the bits which can be set in the transfer table header by the SEBTBL macro. If EX.AVL is set, the error entry will be copied to AVAIL.SYS as well as to ERROR.SYS. This also changes the way in which all disks report their errors. 
There is now a kontroller dispatch entry, KONELG, which is used by FILIO to format an error block and queue it up for DAEMON. [Keywords] Disk errors Error logging DAEMON System error blocks SPEAR [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] Beware file entry required New development MCO [BEWARE text] The format of the DSK KDB has changed again, with the addition of the KONELG dispatch entry for error logging. Any local disk device drivers will need to be changed accordingly. See MDEELG in FILIO for an example of how to do this. DAEMON version 23A(1026) or later must be installed before this MCO. [Validity] Monitor Load Module Tags ------- ------ ------ ------ 704A FILIO 705 434 COMMON DPXKON FHXKON FSXKON RAXKON RHXKON RNXKON RPXKON DSXKON COMMOD DEVPRM S ERRCON DTASER COMDEV [End of MCO 14201] MCO: 14202 Name: RCB Date: 25-Apr-89:07:16:53 [Symptom] Jobs get stuck in event wait for system IPCF, and need manual intervention to be restarted. If they were logging out at the time, the job slot is stuck and useless. [Diagnosis] [SYSTEM]GOPHER is completely ignorant of the possibility that a system program like the account daemon might die and get logged out, thus causing its IPCF receive queue to be "returned to sender, address unknown". It just throws the returned messages on the floor, and leaves the user's job waiting for an acknowledgement message which will never come. [Cure] Educate the rodent. Check the returned message field, and validate it against the expected sequence number. If it matches, give the user an error return from SENDSP, so that a QUEUE. UUO (for example) will give the "component not running" error, and FILDAE messages will be handled as though FILDAE had never been running. 
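The cure for MCO 14202 boils down to matching a returned message against the request a job is still waiting on. The Python sketch below models that check under stated assumptions: the table of pending requests keyed by sequence number, the function name, and the job labels are hypothetical, and only the "component not running" error text comes from the MCO.

```python
def handle_returned_message(pending, returned_seq):
    """Wake the job whose request was bounced back by IPCF.

    pending: dict mapping expected sequence number -> waiting job.
    Returns (job, error_text) when the returned message matches an
    outstanding request, or None when it matches nothing and can be
    discarded without leaving any job waiting.
    """
    job = pending.pop(returned_seq, None)
    if job is None:
        return None
    return job, "component not running"

waiting = {41: "job 7", 42: "job 12"}
print(handle_returned_message(waiting, 42))   # job 12 gets an error return
print(handle_returned_message(waiting, 99))   # unknown sequence number
print(waiting)                                # job 7 is still waiting
```

The point of the validation is that a job is only released when the bounced message really is the one it was waiting for.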
[Keywords] EW hang System IPCF wait [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 704A IPCSER 705 434 [End of MCO 14202] MCO: 14203 Name: RCB Date: 25-Apr-89:08:11:52 [Symptom] System error blocks can eat up all of free core if DAEMON isn't running. [Diagnosis] Once they get queued, they are only deleted when some privileged program executes a SEBLK. UUO. [Cure] Add a timer. Once a minute, we will look for any blocks which are older than SEBAGE minutes and delete them. SEBAGE defaults to 10 (decimal), and can be changed with MONGEN. If SEBAGE is set to zero, the error blocks will live forever. [Keywords] System error blocks free core limits [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO Documentation change [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 434 ERRCON 704A COMMON CLOCK1 [End of MCO 14203] MCO: 14204 Name: DPM Date: 27-Apr-89:06:36:33 [Symptom] In some configurations, LINK will report NDBNNM as undefined even though ANF-10 network software is loaded. [Diagnosis] This problem is one of programming style and MACRO's tolerance for conflicting symbol definitions. NDBNNM is defined in NETPRM, which is searched by NETSER. The first several references to this symbol are properly made. However, at NETAS1 the symbol is referenced as external. MACRO should probably flag this as an "E" error. Instead, the original value of the symbol is lost and MACRO generates global fixup requests for all references to NDBNNM. It's not clear why this problem has surfaced now, as the code at NETAS1 has not changed for several monitor releases, but correcting the reference in NETAS1 resolves the undefined global. [Cure] Reference NDBNNM as an internal quantity.
[Keywords] UNDEFINED GLOBAL [Related MCOs] 13932, 13137 [Related SPRs] 36260 [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 435 NETSER NETAS1 704A [End of MCO 14204] MCO: 14205 Name: RCB Date: 27-Apr-89:18:49:36 [Symptom] MCO 14165 didn't go far enough. PAGE. UUO function .PAGAC still isn't always right. Spy pages for sections 3-36 are sometimes reported as being unreadable. [Diagnosis] PAGA93, which finds a page number to return for a spy page in sections 3-36, doesn't preserve T2. Its caller wants T2 to contain the map entry after the call, as well as before. [Cure] Preserve the map entry in T2. [Keywords] PAGE. UUO Page accessibility [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 435 VMSER PAGA93 704A [End of MCO 14205] MCO: 14206 Name: DPM Date: 28-Apr-89:05:51:47 [Symptom] Stopcode IME removing a structure (revisited). [Diagnosis] Previous MCOs didn't plug all the holes, although the window was made much smaller. [Cure] Prevent races by incrementing UNIGEN while holding the DA. Conceptually, this is easy, but BMPGEN is called with F pointing to a STR, not a DDB. Therefore, change UPDA & DWNDA to get the job number from .USJOB rather than from PJOBN. This is OK since the use of DA requires a job to be mapped to reference PJOBN anyway. [Keywords] REMOVE STRUCTURE [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 435 FILFND BMPGN1 704A FILIO UPDA,DWNDA [End of MCO 14206] MCO: 14207 Name: LWS Date: 29-Apr-89:14:48:25 [Symptom] Can't create a SSL larger than .SLMXJ (maximum JSL size) structures using STRUUO. [Diagnosis] Code in SLSTRR and SLCHK always uses .SLMXJ as a maximum without checking to see if it's a JSL or the SSL.
[Cure] Check search list type (RH(F)=0 means SSL) and use appropriate maximum value. [Keywords] SSL [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] None [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 435 FILFND SLSTRR,SLCHK 704A [End of MCO 14207] MCO: 14209 Name: DPM Date: 8-May-89:08:57:19 [Symptom] Pathological names whose first component is NUL do not necessarily behave as the NUL device. A DEVCHR, or DEVTYP of the sixbit name returns disk-only bits. The same is true if you do one of these UUOs on an open channel. However, if a LOOKUP or ENTER is done, then the right thing comes back. Also, DEVNAM never returns NUL and WATCH FILES doesn't expand the filespec correctly. [Diagnosis] The monitor believes a pathological name can only be a disk device and everybody knows that NUL is really a disk even though it claims to be all devices. But FILSER doesn't make that claim often enough. [Cure] Fix SETDDB to test for pathological NUL as well as assigned NUL. Change NULTST to test for DVDSK and DVTTY instead of sixbit NUL. Fix PRTDDB to print NUL instead of a logical device name. Add crock routine LNMNUL to do the grunt work when it's really necessary to know if it's the NUL device. [Keywords] NUL [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 436 COMCON PRTDDB FILUUO NULTST,LNMNUL,SETDDB UUOCON DVCHR,UDVNAM 704A [End of MCO 14209] MCO: 14211 Name: DPM Date: 15-May-89:09:29:52 [Symptom] It's difficult to measure magtape performance on a per-kontroller basis without using any counters. [Diagnosis] Never done before I guess. [Cure] Add two new counters to the KDB: TKBCRD counts characters read and TKBCWR counts characters written.
[Keywords] MAGTAPE PERFORMANCE [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 437 DEVPRM TKBCRD,TKBCWR T78KON TCXKON TD2KON TM2KON TMXKON TS1KON TX1KON 704A [End of MCO 14211] MCO: 14212 Name: LWS Date: 15-May-89:10:21:02 [Symptom] Undeserved ?Illegal memory reference in jobs with a shared hiseg. [Diagnosis] If a sharable hiseg is expanding and there are enough secondary map slots available to map the expansion, RDOMP is not set for any other job using the same hiseg. [Cure] In GTHMAP, if there are enough map slots for the expansion, call HRDOMP via HGHAPP so all other users of the same hiseg will have their maps redone before they run again. [Keywords] Sharable high segments [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] None [MCO attributes] New development MCO PCO required [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 437 SEGCON GTHMP1 704A [End of MCO 14212] MCO: 14214 Name: JAD Date: 25-May-89:07:51:12 [Symptom] Possible SCAFOO stopcodes in a maximally-configured CI network. [Diagnosis] Insufficient path blocks available for the number of CI nodes and CPUs in the CI/system configuration. There is space available for 32 path blocks, but a maximally-configured system could require much more. Problem occurs with definition of C%PBLL (number of path blocks) - it is defined as 2*C%SBLL (number of system blocks). Depending on the number of CI nodes and CPUs, this definition may leave insufficient path blocks. [Cure] Redefine C%PBLL as 6*C%SBLL - this will allow for the largest possible CI and CPU configuration. 
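The sizing change in the cure for MCO 14214 is simple arithmetic, worked through below in Python. One value is inferred rather than stated: the MCO says there is room for 32 path blocks and that C%PBLL was 2*C%SBLL, which implies C%SBLL is 16. The real definition lives in SCAPRM, and the Python names here are hypothetical.

```python
# Assumption: 32 path blocks under the old 2*C%SBLL rule implies C%SBLL = 16.
C_SBLL = 32 // 2

def path_blocks(multiplier, system_blocks=C_SBLL):
    """Number of path blocks when C%PBLL is multiplier * C%SBLL."""
    return multiplier * system_blocks

old_c_pbll = path_blocks(2)     # definition before this MCO
new_c_pbll = path_blocks(6)     # definition after this MCO
print(old_c_pbll, new_c_pbll)
```

Under that inferred value the change triples the number of path blocks available.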
[Keywords] CI SCAFOO [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] None [MCO attributes] New development MCO KL10 only [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 SCAPRM C%PBLL [End of MCO 14214] MCO: 14217 Name: DPM/RJF Date: 30-May-89:06:58:55 [Symptom] Various problems suspending and resuming a system: 1. KLIPAs and KLNIs don't get reloaded. 2. KLNIs will be restarted even if they had first been removed. 3. Stopcode NULFNC during the suspend. [Diagnosis] 1. Code to call PPDINX and KNIINI is under an IFG conditional in COMMON, so it is not included in single CPU configurations. 2. When a KLNI is removed, the bit corresponding to the proper KLNI on a given CPU is set to indicate that the device is to be ignored on subsequent initialization calls. However, IPAMSK is never checked on KLNI restarts. 3. For reasons that escape me, the NULFEK is being called on system sleep/resume when it hadn't before. Apparently this never worked before, but it went unnoticed. The dispatch table does not contain the appropriate entries for these NETSER functions. [Cure] 1. Move the calls to PPDINX and KNIINI outside the IFG conditional. 2. Teach KNIINI to respect IPAMSK on KLNI restarts. 3. Add system sleep/resume entry point to NULFEK's dispatch table. [Keywords] SYSTEM SLEEP [Related MCOs] 13932, 13137 [Related SPRs] 36269 [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 441 COMMON SPRIN5 KNISER KNIINI NULFEK NLFDSP 704A [End of MCO 14217] MCO: 14218 Name: LWS Date: 8-Jun-89:09:08:11 [Symptom] Undeserved memory parity errors on KLs with 4MW of memory. [Diagnosis] RH20s do undetermined things when accessing the last physical (quad)word in 4MW. This is an RH20 problem. This problem was never encountered in previous versions of the monitor and BOOT. The monitor used to put its hiseg at the very top of memory. Then BOOT occupied the top of memory. 
Now, BOOT is still there, but it now frees the pages at the top because they contain tape drivers that are not needed once BOOT is done. So, these pages at the top of memory are free for use by the monitor. When a user gets the last page of memory, it's fair game for I/O by an RH20. [Cure] For lack of something better to do at the moment, if the last page of a 4MW system is free, mark it as nonexistent in NXMTAB and PAGTAB and set MEMSIZ to 17,,777000 instead of 20,,000000. [Keywords] 4 MW parity [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] None [MCO attributes] New development MCO Field service attention KL10 only [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 442 SYSINI MMTIN9 704 [End of MCO 14218] MCO: 14219 Name: DPM Date: 27-Jun-89:06:37:34 [Symptom] There appears to be no upper bound on the number of extended RIBs FILSER is content to create. You can literally fill a disk with extended RIBs for a single file. When you CLOSE the file, you might as well take the rest of the day off, because FILSER has lots of bookkeeping to perform. [Diagnosis] RIBXRA contains an 8-bit field for the extended RIB number. FILSER never checks for field wrap around. The RIB number is only read back when a user specifies a negative USETI, and otherwise serves no real purpose. [Cure] Check for wrap around and impose an additional limit based on the contents of MUSTMX when RIBs are created. Set the maximum number of USETIs to 255 decimal. [Keywords] EXTENDED RIBS [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 443 COMMOD DESRBC,MUSTMX FILIO EXTRB2 704A [End of MCO 14219] MCO: 14220 Name: RCB Date: 30-Jun-89:00:15:53 [Symptom] An OPEN which specifies a logical name or a pathological name can fail or find the wrong device.
[Diagnosis] The DDB search logic does not allow certain names to be found unless they are assigned to disks (i.e., funny-space DDBs). CK2CHR gets called when it should not. For that matter, LP will match a terminal assigned as LPT but not as LP. [Cure] For 2-character device names which CK2CHR changes, do the DDB searching twice. First, try the original name. If that fails or returns DSKDDB, then try again with the expanded name. If the second search fails but the first returned DSKDDB, then return the results from the first DDB search. Eliminate the hacks for CK2CHR and SY: from the search loop. [Keywords] PDP-11 names [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 444 UUOCON DDBSCC 704A [End of MCO 14220] MCO: 14221 Name: RCB Date: 1-Jul-89:01:56:17 [Symptom] KAF at PI level of the NIA20. [Diagnosis] Taking too long to empty the response queue (MCO 14186 revisited). [Cure] Check .CPTMF to try to be sure that too much time won't pass during a single KLNI interrupt. Also, move the check to after the callback so that we don't drop the buffers on the floor. Otherwise, after long enough, the protocols will run out of buffers (especially DECnet). Because .CPTMF is slightly bogus just as the system is coming up, ignore it until .CPUPT is at least 2 (ticks). Note that the counters and limits added by MCO 14186 are still present and in force. [Keywords] KAF NIA20 KNIKSP [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] Field service attention HOSS attention KL10 only [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 444 KNISER KNIRQ1 704A [End of MCO 14221] MCO: 14222 Name: RCB Date: 7-Jul-89:23:34:54 [Symptom] System is annoyingly sluggish at system startup time. [Diagnosis] Trying to run dozens of copies of INITIA on random terminals all at the same time, in dozens of job slots. [Cure] Only sort of. 
Invent a new MONGEN-definable symbol, DSDRIC (dataset devices run INITIA CUSP), to control whether INITIA runs on dataset lines. It will default to one, which means that INITIA will continue to run on datasets at system startup. If set to zero at MONGEN time, TTYINI will not force INITIA commands on the datasets. For the curious, the reason INITIA runs on datasets at startup time is because of the existence of hardware interfaces which need to have parameters set even before a call comes in to the modem. However, most sites probably have more well-behaved interfaces, and will be able to set DSDRIC to zero. [Keywords] sluggish startup INITIA datasets [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO Documentation change [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 444 COMDEV DSCTAB 704A SCNSER TTINI2 [End of MCO 14222] MCO: 14223 Name: DPM Date: 18-Jul-89:05:32:00 [Symptom] DA28s don't work. [Diagnosis] XTCLNK assigns junk names to UDBs. Later calls to build DDBs fail because the target UDBs cannot be found. Also, XTCSER will not assemble with FTMP turned off because of references to SCNLOK and OUCHE. [Cure] Correct logic that builds UDB names. Put IFN FTMP conditionals around the reference to SCNLOK. Make OUCHE available in all KL10 configurations. [Keywords] DA28 [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 445 APRSER OUCHE COMMON OUCHTB XTCSER XTCLN2,CHKTYP,MPIOWD 704A [End of MCO 14223] MCO: 14224 Name: DPM Date: 20-Jul-89:11:58:59 [Symptom] Random job tables (mostly JBTSTS) get clobbered, weird crashes, general mayhem. [Diagnosis] Steve Perkins is running .EXE files created on the -20 again. If the .EXE directory claims to have sharable pages that aren't also marked as high segment pages, GETEXE returns flags indicating the image is sharable, but with no high segment.
Parts of the GET cleanup assume that if the sharable bit is on, then there must be a high segment. This is true for .EXE files created on a -10, but not otherwise. Anyway, making this assumption, SEGCON blindly picks up high seg block addresses (which are usually zero) and indexing off of zero, proceeds to write all over the monitor's low segment. [Cure] While processing .EXE directory entries, turn off the sharable bit if the high segment bit is not turned on. [Keywords] TOPS-20 EXE FILES [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 446 SEGCON WANTIT 704A [End of MCO 14224] MCO: 14225 Name: DPM Date: 25-Jul-89:05:28:01 [Symptom] SA10s don't function in an environment where DF10C-based device drivers exist (TM2KON for one). [Diagnosis] DF10C drivers fail to test for the presence of SA10 devices. Therefore, SA10s look like 18-bit DF10s. [Cure] Test SI.SAX in the CONI word in the appropriate xxxCFG routines. [Keywords] SA10 [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 446 FSXKON FSXCFG RPXKON RPXCFG TM2KON TM2CFG 704A [End of MCO 14225] MCO: 14226 Name: DPM Date: 3-Aug-89:09:18:56 [Symptom] Several annoying problems that prevent SA10-based tape from working well. [Diagnosis] 1. SAXSER & TS1KON bum a bit in the KDBUNI word to indicate a software interrupt was requested. This means that KDBs can't be compared against each other, so AUTCON will build multiple KDBs for a single SA10 kontroller. 2. Tapes ported between a DX10 or a DX20 and an SA10 will have duplicate UDBs and DDBs built. This is because TD2KON and TX1KON do not know how to extract drive serial numbers. Subsequent comparisons between a drive S/N and an existing one don't match, so AUTCON believes it's looking at two different drives. 3.
The code to compare drive serial numbers is not interlocked in AUTCON. Under the right circumstances, two configuring CPUs which have detected the same drive might not notice each other. [Cure] 1. Move the software bit into KDBSTS. It's a better place for such things. 2. Fix TX1KON and TD2KON. 3. SYSPIF/SYSPIN around much of AUTDPU. [Keywords] SA10 [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 447 AUTCON AUTDPU DEVPRM KD.SIR SAXPRM SA.SIR SAXSER SAXINT TD2KON TD2DRV TS1KON TS1DRV TX1KON TX1DRV 704A [End of MCO 14226] MCO: 14227 Name: DPM Date: 3-Aug-89:09:20:23 [Symptom] Possible tape hangs after a CPU restart. [Diagnosis] SPRINI doesn't clear the TAPSER interlock nesting flag. [Cure] Do so. [Keywords] INTERLOCKS [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 447 COMMON SPRI10 704A [End of MCO 14227] MCO: 14228 Name: RCB Date: 3-Aug-89:15:33:21 [Symptom] Problems setting explicit speeds on TTY lines in the ANF front ends. [Diagnosis] Trying to do autobaud even though the speed has been set to something other than the autobaud speed. [Cure] Don't do that. If the speed is set in the config.P11 file, and that speed is not the autobaud speed (currently 2400 baud), override the ABD characteristic for the line. [Keywords] Autobaud Non-autobaud TnXS ANF10 [Related MCOs] 13932, 13137 [Related SPRs] 36270, 36268 [MCO status] None [MCO attributes] Field service attention HOSS attention [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 450 CONFIG P11 704A DNTTY P11 DNLBLK P11 MACROS P11 [End of MCO 14228] MCO: 14229 Name: RCB Date: 8-Aug-89:22:22:51 [Symptom] Monitor too big and slow. Not enough free bits in JBTSTS. [Diagnosis] Lots of places in the monitor test bit JDC from JBTSTS. A few others clear it. Only DAECOM can set it.
It is unreachable code, left over from the old DCORE and DUMP commands and the days when DAEMON handled virtual references for EXAMINE, DEPOSIT, and VERSION commands. The JDC bit is consequently never set, and all the tests for it are redundant. [Cure] Free up the bit in JBTSTS, and eliminate all references to it. [Keywords] PERFORMANCE [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO Documentation change [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 450 COMCON 704A CLOCK1 SCHED1 SCNSER S [End of MCO 14229] MCO: 14230 Name: DPM Date: 15-Aug-89:06:36:14 [Symptom] More error logging stuff ... [Diagnosis] Yes. [Cure] Convert more old-style DAEMON error logging calls to use the System Error Blocks. This edit converts: 1. CPU attached/detached records. 2. Node online/offline records. 3. Date/time change records. Code is also in place to handle system reload (.ERWHY) records, but because of interface problems with DAEMON and AVAIL.SYS, this call will be temporarily neutered. [Keywords] DAEMON ERROR LOGGING [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 450 COMCON SETDAT CPNSER CPUCSC NETSER NODEAM S .ERMVR SYSINI SYSRLD,SYSAVL 704A [End of MCO 14230] MCO: 14231 Name: RCB/DPM Date: 15-Aug-89:07:43:18 [Symptom] Convert DAEMON reporting of KL error chunks from RSX20F to use system error blocks. This eliminates two words in the CDB, .CPETM and .CPEAD. In order to accomplish this cleanly, there is now a new routine in IPCSER, OPRMSG, which allows one to queue up messages for ORION. If ORION is not running, the messages can optionally be sent to OPR: or the CTY. See IPCSER for the calling sequence. The behavior is controlled by bits in T1 on the call, of the form OPM.??, which are defined in S.
[Diagnosis] [Cure] [Keywords] KL error chunks system error blocks system messages [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked Deferred [MCO attributes] New development MCO Documentation change [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 450 DTEPRM 704A DTESER S IPCSER ERRCON COMCON CLOCK1 COMMON [End of MCO 14231] MCO: 14233 Name: RCB Date: 22-Aug-89:09:55:53 [Symptom] Undeserved KNIKSP stopcodes. [Diagnosis] .CPTMF limit is exceeded at system startup time. [Cure] If .CPUPT is lower, then don't KNIKSP. [Keywords] [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] None [MCO attributes] KL10 only [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 451 KNISER KNIRQ1 704A [End of MCO 14233] MCO: 14234 Name: DPM Date: 23-Aug-89:07:34:30 [Symptom] Programs using external tasks (XTCSER) hang following attempts to JAM powered off remote computers. [Diagnosis] If FTMP is turned off, the call to CHKTYP from DWNUNI says to never do typeout. DWNUNI simply returns without clearing any DA28 errors which caused the unit to be declared down. Thus, the DA28 becomes unusable for all other users. A similar situation exists where connect errors are processed. In this case, we forget to force the unit offline. [Cure] Three things. First, fix CHKTYP to work correctly with FTMP turned off. Second, if no typeout is to be done, skip around the message generation code and clear the DA28. Finally, on connect errors, always force the unit offline whether or not we'll type a message. [Keywords] DA28 ERRORS [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 451 XTCSER CHKTYP,DWNUNI,CHKCER 704A [End of MCO 14234] MCO: 14235 Name: DPM Date: 29-Aug-89:07:32:01 [Symptom] More error logging stuff. [Diagnosis] Yes. 
[Cure] Teach the monitor to write the following records as system error blocks: .ERCSC Configuration status change (memory on/off line) .ERKSN KS10 NXM trap .ERKPT KL10/KS10 parity trap .ERCSB CPU status block .ERDSB Device status block [Keywords] DAEMON ERROR LOGGING [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 452 APRSER PRHMF7,DAELOG,MEMELG COMCON MEMONU,MEMON8 COMMON OLDNXM,DIACSB,DIADSB LOKCON MEMOFU,MEMOF2 704A [End of MCO 14235] MCO: 14236 Name: KBY Date: 29-Aug-89:08:27:46 [Symptom] FA resource scheduling leaves something to be desired. The scheduler knows how to wake up just the job that needs it, but everyone wakes up now any time it's given up. [Diagnosis] No code. [Cure] Add code (the remaining routines necessary to do the unwind properly). [Keywords] FA UNWIND [Related MCOs] None [Related SPRs] None [MCO status] Deferred [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 452 FILIO UPFA,DWNFA 704A COMMOD S CLOCK1 SRFREE [End of MCO 14236] MCO: 14237 Name: KBY Date: 29-Aug-89:08:34:37 [Symptom] Job stuck; SYSTAT shows it's locked (even though it really isn't). [Diagnosis] Due to the extra calls to SCDCHK to prevent KAFs in large PAGE. UUOs, we can potentially block in a PAGE. UUO. If pages were allocated to the job by CHGPGS (because they were available at the time), but during the block we decide to swap out the job, we could potentially lose those pages to never-never land since they are not in anyone's map. To prevent this, CHGPGS lights NSHF (but not NSWP) akin to MAPBAK so that the swapper won't touch the job. Unfortunately, if the job has a sharable high segment, someone else using it might call XPANDH (which can happen even without really wanting to expand the high seg as we tend to do this at the drop of a hat) and set JXPN for the job blocked at CHGPGS.
At this point the scheduler will not run the job because of JXPN and the swapper won't clear JXPN (even without swapping the job which may not be necessary) because of NSHF which won't get cleared until the job finishes running through CHGPGS (deadly embrace). [Cure] The scheduler will except jobs owning disk resources from the JXPN check. Do so also with jobs having NSHF on but not NSWP (a state only the monitor can cause in limited situations such as the above). [Keywords] JXPN NSHF [Related MCOs] 13932, 13137 [Related SPRs] 36245 [MCO status] Deferred [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 452 SCHED1 CJFRCX 704A [End of MCO 14237] MCO: 14238 Name: JC Date: 1-Sep-89:13:34:18 [Symptom] TOPS-10 is missing the TRANSLate command. [Diagnosis] No one ever put it in. [Cure] Add one. [Keywords] TRANSL LOGIN commands [Related MCOs] None [Related SPRs] None [MCO status] None [MCO attributes] New development MCO Documentation change [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 453 COMMON COMTAB [End of MCO 14238] MCO: 14240 Name: DPM Date: 5-Sep-89:05:41:06 [Symptom] More error logging stuff. [Diagnosis] Yes. [Cure] 1. Add support for .ERSNX (NXM sweep). 2. Add support for .ERSPR (parity sweep). 3. Turn on .ERWHY/.ERMRV logging. [Keywords] ERROR LOGGING [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] Beware file entry required [BEWARE text] DAEMON version 23A(1027) or later is required. Earlier versions will cause .ERMRV records to be written into ERROR.SYS instead of AVAIL.SYS. When this happens, SPEAR will report an unknown record type in ERROR.SYS. [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 453 ERRCON PARSWP,PARELG,NXMSWP,NXMELG,XFRSE2 S EX.NER SYSINI LLMSTR,AVLTBL 704A [End of MCO 14240] MCO: 14241 Name: DPM Date: 6-Sep-89:07:23:57 [Symptom] Stopcode OVA on a KS10 during SYSINI.
[Diagnosis] EVA pages overflow BOOT address space because the high segment grew a bit. [Cure] Slide the high segment origin down 2 pages. [Keywords] HIGH SEGMENT [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 453 COMMON MONORG 704A [End of MCO 14241] MCO: 14242 Name: KDO Date: 11-Sep-89:14:05:55 [Symptom] Unusable TTY DDBs. [Diagnosis] LATSER creates a TTY DDB for host-initiated connects, but INITIA uses a different one, causing LATSER's to float free. [Cure] If it hurts every time I do this, don't do it anymore. [Keywords] [Related MCOs] None [Related SPRs] None [MCO status] None [MCO attributes] KL10 only [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 454 LATSER GETTDB 704A [End of MCO 14242] MCO: 14243 Name: ERS Date: 12-Sep-89:08:18:55 [Symptom] Various. Lots of monitor too big and slow. Several places where a user-mode section number is lost. And possible working-set confusion if a multi-section program had a PFH. (Probably wouldn't work anyway.) [Diagnosis] GETPC/PUTPC [Cure] Remove uses of GETPC/PUTPC. In some cases we simply put the same code in minus a couple JRSTs. In other places it gets a little more complicated. The DDT command should now include the section number in the one-word old PC in JOBDAT. Assume that an extended user is not in his PFH. (A bit of work would be involved in making an extended PFH work.) Rewrite DOINT. Net result is that we'll store the section number in the old PC portion of the interrupt block. [Keywords] GETPC User-mode Extended-addressing [Related MCOs] None [Related SPRs] None [MCO status] None [MCO attributes] Beware file entry required New development MCO Documentation change [BEWARE text] Some one-word PCs will now contain the section number where they did not in the past. In particular, commands like DDT should preserve the section number in .JBOPC.
Also, the old PC in the interrupt block should now contain the section number. [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 454 COMCON USAVE,SEGRLX ERRCON DOINT VMSER USRFL6,USRFL7,GETDDT,UPAGE4,PAGA1C [End of MCO 14243] MCO: 14244 Name: DPM Date: 18-Sep-89:06:31:17 [Symptom] If a logical name points to NUL, the FILOP returned filespec will not store the correct device name following a LOOKUP or ENTER. [Diagnosis] Oversight. The returned device name is the logical name. [Cure] Call LNMNUL and return NUL if appropriate. [Keywords] NUL [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 455 UUOCON FOPFI0 704A [End of MCO 14244] MCO: 14245 Name: DPM Date: 19-Sep-89:06:27:02 [Symptom] On a very slow system, IPCF sends to jobs which logged in via FRCLIN can get receiver quota exhausted errors. [Diagnosis] The receiver hasn't had the chance to pump up its IPCF quotas. This is most easily seen on a heavily loaded KS10. [Cure] Have LOGREF set the quotas to 511. [Keywords] IPCF QUOTAS [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 455 COMCON LOGRF2 704A [End of MCO 14245] MCO: 14246 Name: ERS Date: 19-Sep-89:07:57:56 [Symptom] Monitor too big and slow. [Diagnosis] Old code for GET.EXE. [Cure] Remove it. [Keywords] [Related MCOs] None [Related SPRs] None [MCO status] None [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 455 COMCON SGSET,UGTSEG [End of MCO 14246] MCO: 14247 Name: ERS Date: 19-Sep-89:08:07:26 [Symptom] GETPC/PUTPC, the second half. [Diagnosis] yes. [Cure] Yes.
[Keywords] GETPC PUTPC GETPCS byebye [Related MCOs] 14243 [Related SPRs] None [MCO status] None [MCO attributes] New development MCO Documentation change [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 455 ERRCON DOINT S GETPC,GETPCS,PUTPC CLOCK1 NOTACL,INCTM4,CIP9,STOP1H,SETPIT,SETPIU,USTART [End of MCO 14247] MCO: 14248 Name: RCB Date: 26-Sep-89:05:52:43 [Symptom] MCO 14231 revisited: Convert DAEMON reporting of KL error chunks from RSX20F to use system error blocks. This eliminates two words in the CDB, .CPETM and .CPEAD. [Diagnosis] yes. [Cure] yes. This also makes DTE. UUO function 20 (.DTERT) obsolete. [Keywords] KL error chunks system error blocks system messages [Related MCOs] 14231 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO Documentation change [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 456 DTEPRM 704A DTESER IPCSER COMMON [End of MCO 14248] MCO: 14249 Name: KDO Date: 26-Sep-89:07:27:50 [Symptom] LAT is slow to start. [Diagnosis] If the multicast message is sent before the Ethernet service routines have set the channel address, LAT servers will use the wrong Ethernet address when trying to connect to TOPS-10. [Cure] Delay the multicast message until after ETHSER does the Set-Channel-Address (NU.SCA) callback. [Keywords] [Related MCOs] None [Related SPRs] 36229 [MCO status] None [MCO attributes] KL10 only [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 456 LATSER CBRDSP,LATLSC,LATSCA 704A [End of MCO 14249] MCO: 14250 Name: DPM Date: 3-Oct-89:07:45:50 [Symptom] No way to cause IPA dumps to be written cleanly. [Diagnosis] DAEMON currently does this by using system error blocks; a method which is at best an ugly crock. [Cure] Invent a way to allow the monitor to run things at UUO level. This amounts to adding a forced .EXEC command which when performed on FRCLIN will create a job slot and run a specified routine at UUO level. 
At completion, the control transfers to JOBKL and the job will be destroyed. This will be used to write IPA dump files. This MCO however, only implements the necessary code to create the job. The actual dump stuff will happen in a later MCO. [Keywords] DAEMON ERROR LOGGING [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO Documentation change [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 457 COMCON LOGREF COMMON COMTAB CLOCK1 CIP2 SCNSER TTFCOM 704A [End of MCO 14250] MCO: 14253 Name: DPM Date: 17-Oct-89:07:24:25 [Symptom] The monitor has an annoying habit of dumping even if the system has been up for less than 5 minutes. This is contrary to previous behavior. [Diagnosis] While it may be a desirable thing to do under some circumstances, it isn't desirable in all cases. [Cure] Make it optional. In cases where the system crashes during the first 5 minutes of uptime, dump only if the symbol ATODMP is non-zero. By default, it will be set to 1. Sites which find this behavior disgusting can set it to 0. [Keywords] DUMP [Related MCOs] 13809 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 461 COMMON ATODMP MONBTS RLDMON 704A [End of MCO 14253] MCO: 14254 Name: DPM Date: 26-Oct-89:07:43:47 [Symptom] Occasionally, structures cannot be mounted after ATTACHing disk drives or after a newly formatted pack has been defined. [Diagnosis] The routine DSKDRV is responsible for setting up a UDB following an ATTACH. If errors occurred reading device registers, the unit status is set appropriately to reflect the error condition. However, if no errors occurred, DSKDRV assumes a pack must be mounted and changes the status to 'pack is mounted'. Later, when the STRUUO is done to define a structure, it will fail because the UDB claims a pack is already mounted. 
In the case of a newly formatted pack, ONCMOD neglects to set the unit state to 'no pack mounted' when the HOM blocks cannot be read. [Cure] Following an ATTACH, do not change the unit status unless there were errors. When HOM blocks cannot be read, set the unit status to 'no pack mounted'. [Keywords] ATTACH DISK DEFINE STRUCTURE [Related MCOs] None [Related SPRs] 36276 [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 462 FILIO DSKDR8 ONCMOD TRYHOM 704A [End of MCO 14254] MCO: 14255 Name: RCB Date: 26-Oct-89:11:38:09 [Symptom] Batch streams can hang forever when they use MIC. [Diagnosis] Scenario: MIC file enables the RESPONSE feature in order to trap error messages into a MIC variable (parameter). Some program it invokes types out an error message. MIC wants to get the entire error message into the response buffer, and not just a part of it, so it waits for the job to go to monitor level or to block in TIOWQ (TTY I/O wait) before it reads the text. To make sure the text is available for MIC to read, SCNSER refuses to allow output to happen until MIC's conditions are satisfied *and* MIC has read the response buffer. Thus, when the program that was invoked types out a reasonably short error message (so that it doesn't block in TO) and then loops in NAPQ waiting for the chunks to empty out before it decides how to type its next prompt, there is a deadlock. The program never satisfies the MIC conditions for getting the response buffer read, and thus output never happens, and thus the program is waiting for MIC waiting for the program waiting for MIC .... [Cure] Since the MIC RESPONSE buffer is only 21 octal words in length, and is ASCIZ, MIC will only ever see a maximum of 84 (decimal) characters of response text. In other words, it only expects to see one line. So, add a bit in the LDB, L1LEEL (end of error line, B6 in LDBBYT). 
This bit is twiddled during the same routine that notifies us of an error character. The code in XMTMIC which checks for whether to tell MIC that the response buffer is available will consider having L1LEEL set to be as good as being in TIOWQ. I.e., if we have gone back to the left margin since seeing the error character, we will tell MIC to do its thing. [Keywords] MIC under BATCH hung PTY [Related MCOs] None [Related SPRs] 36279 [MCO status] Checked [MCO attributes] Documentation change [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 462 SCNSER LDBBYT,MICLG3,MICPS4 704A [End of MCO 14255] MCO: 14258 Name: DPM Date: 14-Nov-89:06:09:54 [Symptom] IPA dump file writing facility appears as a wart on the error logging code. [Diagnosis] Way back when, the only way to get UUO-level work done was to get DAEMON to do some work for you. IPA dump files were processed through the error logging code by prodding DAEMON with a SPEAR record that was suppressed from ERROR.SYS. [Cure] Now that there's a way to make UUO-level things happen, teach the monitor to write the dump files and eliminate the need for DAEMON interaction. [Keywords] IPA DUMP [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 464 AUTCON AUTDMP CLOCK1 EXPJO1 COMDEV IPADMP FILUUO UNQIFL,UNQINI 704A [End of MCO 14258] MCO: 14259 Name: DPM Date: 14-Nov-89:06:50:39 [Symptom] Inaccessible code left over from efforts to clean up error logging code. [Diagnosis] Was just waiting 'til it was all over. [Cure] Remove DAEDIE, DAEDSJ, DAEEIM, DAEERR, DAERPT, and DAESJE. Also remove the interlock word, DAELOK. Shrinks CLOCK1 by 3 blocks.
[Keywords] ERROR LOGGING [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 464 CLOCK1 DAEDIE,DAEDSJ,DAEEIM,DAEERR,DAERPT,DAESJE,DAELOK 704A [End of MCO 14259] MCO: 14261 Name: DPM Date: 21-Nov-89:08:25:53 [Symptom] On KS10s, defining non-standard device parameters doesn't work. COMDEV gets assembly errors. [Diagnosis] The MDKS10 macro has a junk parameter filled in for the MASSBUS unit number. [Cure] Don't put out sixbit gibberish where a number is expected. [Keywords] MDKS10 [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 465 MONGEN MDT3A 704A [End of MCO 14261] MCO: 14262 Name: JEG/DPM Date: 28-Nov-89:04:51:45 [Symptom] Possible DI hangs or KAFs out of PCLDSK. [Diagnosis] When doing queued protocol for disks, if the primary port is offline, a lot of things can go wrong requeuing the I/O to another port. 1. References to UNIKON should be indexed by T1, not U. 2. References to UNIALT are OK only for CI disks. 3. Extra JUMPN to test the results from CPUOK. 4. Merely checking KDBCAM for non-zero value doesn't guarantee the other CPU(s) are running. [Cure] 1. Index UNIKON by T1. 2. Test for a CI disk. If so, use UNIALT. Use UNI2ND for all others. 3. Remove JUMPN. We wouldn't have gotten to PCLOFL if the initial call to CPUOK was successful. 4. Make a second call to CPUOK to test the new accessibility bits from the alternate or detached port.
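The port-selection steps in the cure for MCO 14262 can be sketched as follows. This is a hypothetical Python model, not the FILIO code: the `Unit` class, its fields, and the `cpu_ok` callback merely stand in for the UDB words UNIALT and UNI2ND and for the CPUOK routine, and their exact semantics are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Unit:
    """Hypothetical stand-in for a disk UDB (not the real layout)."""
    is_ci_disk: bool
    unialt: Optional[str]  # models UNIALT: alternate port, valid for CI disks
    uni2nd: Optional[str]  # models UNI2ND: second port for all other disks


def pick_requeue_port(unit: Unit, cpu_ok: Callable[[str], bool]) -> Optional[str]:
    """Choose the port to requeue I/O on once the primary port is offline."""
    # Cure step 2: only CI disks may use UNIALT; everything else uses UNI2ND.
    port = unit.unialt if unit.is_ci_disk else unit.uni2nd
    if port is None:
        return None
    # Cure step 4: check accessibility again, this time for the new port.
    return port if cpu_ok(port) else None
```

A caller would treat a `None` result as "no usable port", which corresponds to the case where the second accessibility check fails.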
[Keywords] DI HANG KAF [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 466 FILIO PCLOFL 704A [End of MCO 14262] MCO: 14263 Name: DPM Date: 28-Nov-89:06:45:40 [Symptom] With DAEMON no longer dependent upon the monitor version, the methods of determining what is the proper DAEMON version and who is a legal DAEMON no longer work. Also, there is still some lurking inaccessible code. [Diagnosis] Time for a change. [Cure] A SETUUO will be provided so DAEMON can set its job number in the monitor. It is function 53 (.STDAE). A corresponding GETTAB (%CNDJN, 212,,11) will read the job number back. The SIXBIT/DAEMON/ name and JACCT bit are no longer required. In fact, DAEMON has been removed from PRVTAB. Also, remove the ERRPT. UUO as the monitor no longer leaves data for DAEMON to scavenge by this method. The UUO, as well as GETTAB table entries %LDERT, %LDPT1, %LDPT2, %LDLTH, and %LDESZ are now obsolete. Stopcode IBI gets deleted along with the code at STOP1 to try to restart DAEMON after it halts. It can never be made to work. DAEMON version 24(1030) or later is required from now on. [Keywords] DAEMON [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] Beware file entry required Documentation change UUOSYM change [BEWARE text] DAEMON version 24(1030) or later is required with monitor load 466. [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 466 CLOCK1 COMCON COMMON COMMOD UUOCON UUOSYM 704A [End of MCO 14263] MCO: 14264 Name: DPM Date: 30-Nov-89:07:10:40 [Symptom] Using the MONGEN option to set non-standard device parameters, defining a printer to be upper case only has no effect. The monitor treats the printer as lower case. Manually turning off DVLPTL in the DEVCHR word of the DDB makes the problem disappear and the printer behave like an upper case only printer.
[Diagnosis] The routine AUTMDT scans MDTs for non-standard device parameters. If the device is specified (to MONGEN) using a device code OR a non-zero CPU number, then everything works as expected. However, if the customer defaults the device code AND the CPU number (or supplies CPU0), then a zero device specifier is inserted into the MDT. A zero word signals the end of the MDT. Therefore, AUTMDT will never scan the entire table and never find the customer specified parameters. Also, it is possible for AUTMDT to exit without returning the MDT data under some circumstances. [Cure] In MONGEN, set a bit in the device specifier word of the MDT entry which indicates the word is valid. Thus, CPU0 with a defaulted device code of zero will no longer look like the table terminator. Also ensure that the MDT data is always returned properly. [Keywords] MDT [Related MCOs] None [Related SPRs] 36282 [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 467 AUTCON AUTMD3,MDTDV1 DEVPRM MD.VAL MONGEN ASKRE5,MDT6,MDTTAB 704A [End of MCO 14264] MCO: 14265 Name: DPM Date: 4-Dec-89:08:22:27 [Symptom] Rewinds and skip file operations time out prematurely on 3600 foot magtapes. [Diagnosis] Hung timers are based on the amount of time needed to perform a given function on a 2400 foot magtape. The values fall short for 3600 foot reels. [Cure] Increase all hung timer values by one half. [Keywords] HUNG TIMERS [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 467 T78KON HNGTBL TCXKON HNGTBL TD2KON HNGTBL TM2KON HNGTBL TMXKON HNGTBL TS1KON HNGTBL TX1KON HNGTBL 704A [End of MCO 14265] MCO: 14266 Name: DPM Date: 5-Dec-89:06:31:35 [Symptom] No way for the old and new DAEMONs to tell which version ought to be run. [Diagnosis] %CNDAE returns 704, but both the old and new DAEMONs run under different flavors of 704. [Cure] Have %CNDAE return 705.
The new DAEMON will require this, but if it sees 704, it will run DAE704. [Keywords] DAEMON [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] Beware file entry required [BEWARE text] DAEMON which has run with earlier versions of 7.04 should be renamed to SYS:DAE704.EXE. DAEMON version 24 should be placed on SYS as DAEMON.EXE. If there is a chance that an earlier 7.04 monitor may occasionally be run, the new DAEMON should also be copied to SYS with the name DAE705.EXE. This will allow for the proper synchronization of DAEMONs with the monitor regardless of which version of 7.04 is run. %CNDAE is a GETTAB which allows DAEMON to synchronize with monitor versions. It is intended for use only by DAEMON. Other programs such as ACTLIB, LOGIN, REACT, and WHO have incorrectly used this GETTAB to return the monitor version where another, more appropriate GETTAB, %CNDVN, should have been used. The Digital programs have been changed to use %CNDVN. Sites should make similar changes to any user-written programs which may have used %CNDAE. [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 467 COMMON CNFDAE 704A [End of MCO 14266] MCO: 14267 Name: LWS Date: 11-Dec-89:17:28:27 [Symptom] Problems assigning/init'ing etc devices in monitors with no ANF support. [Diagnosis] AUTDDB does not make device names of the form DEVNNU when NN is 00. In this case it makes a name of the form DEVU, eg. LPT0 instead of LPT000. DVSTAS depends on the U of NNU being the last sixbit character in DEVNAM (bits 30-35) when searching for a DDB. GALAXY spoolers generate device names of the form DEV00U when ANF is not supported in the monitor. Using DEV00U as a device name in various UUOs fails because DEV00U will never match the device name in the DDB, which is DEVU. [Cure] Have AUTDDB always build device names of the form DEVNNU when DR.NET is lit. 
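The name mismatch described in the diagnosis of MCO 14267 can be illustrated with a small model. This is a Python sketch, not monitor code: `sixbit`, `last_char`, and `device_name` are hypothetical helpers, the `always_nnu` flag stands in for the "DR.NET is lit" case of the cure, and treating NN and U as octal digits is an assumption of the sketch rather than something the MCO states.

```python
def sixbit(name: str) -> int:
    """Pack up to six characters into a 36-bit SIXBIT word, left-justified."""
    if len(name) > 6:
        raise ValueError("SIXBIT names hold at most six characters")
    word = 0
    for ch in name.upper().ljust(6):
        code = ord(ch) - 0o40          # SIXBIT code = ASCII code minus 40 octal
        if not 0 <= code <= 0o77:
            raise ValueError(f"character {ch!r} has no SIXBIT code")
        word = (word << 6) | code
    return word


def last_char(word: int) -> int:
    """Return the sixth character of a SIXBIT word (bits 30-35)."""
    return word & 0o77


def device_name(dev: str, node: int, unit: int, always_nnu: bool) -> str:
    """Build a device name; the old behaviour drops NN when it is zero."""
    if node == 0 and not always_nnu:
        return f"{dev}{unit:o}"            # DEVU, e.g. LPT0
    return f"{dev}{node:02o}{unit:o}"      # DEVNNU, e.g. LPT000


old_name = device_name("LPT", 0, 0, always_nnu=False)   # name held in the DDB
new_name = device_name("LPT", 0, 0, always_nnu=True)    # name a spooler supplies

# With the short form the sixth character is blank, so a search that expects
# the unit digit in bits 30-35 cannot match it against LPT000.
print(old_name, oct(last_char(sixbit(old_name))))
print(new_name, oct(last_char(sixbit(new_name))))
```

In this model the two spellings of the same device pack to different words, which is why a lookup keyed on the last character fails until both sides use the DEVNNU form.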
[Keywords] AUTOCONFIGURE [Related MCOs] None [Related SPRs] None [MCO status] None [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 470 AUTCON AUTDDB 704A [End of MCO 14267] MCO: 14268 Name: DPM Date: 12-Dec-89:07:02:45 [Symptom] More error logging stuff. 1. The monitor doesn't write DECtape records. 2. The definition of record type 75 is wrong. [Diagnosis] 1. SPEAR didn't use to understand DECtape records. Now it does. 2. Record type 75 claims it's only used for IPA20 dumps. Not so. [Cure] 1. Remove references to M.DTAE (introduced during 7.04 development) as normally turned on. This will cause the monitor to write DECtape error records. 2. Redefine record 75 to be a generic device dump record with the name .ESDVD (UUOSYM) and .ERDVD (S). SPEAR also understands this record now. The monitor still doesn't write this record, but it will soon. [Keywords] ERROR LOGGING [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] Documentation change UUOSYM change [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 470 COMDEV M.DTAE DTASER M.DTAE S .ERDVD UUOSYM .ERDVD 704A [End of MCO 14268] MCO: 14269 Name: DPM/RCB Date: 19-Dec-89:07:32:29 [Symptom] On multi-CPU systems, at system startup, one frequently sees varying CPU uptimes and/or undeserved CPUn not running warnings on the CTY. [Diagnosis] When the clocks are turned on, only the policy CPUs uptime counter is more or less accurate. Non-policy CPUs are looping in their AC loop waiting for the system to start. During this time, they take no interrupts and therefore never update their uptime or OK word. When the system starts timesharing, the uptime words are guaranteed to be skewed and sometimes the OK words are positive, causing the warnings on the CTY. [Cure] Prior to turning on the clocks, make all CPU's uptime words agree. Also fix the OK words to be properly negative. 
[Keywords] UPTIME [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 471 SYSINI TIMINI,NODDT 704A [End of MCO 14269] MCO: 14270 Name: DPM Date: 26-Dec-89:07:49:18 [Symptom] After the release of 7.04, SNOOPY fails with the error "? Undefined breakpoint symbol TM0IN1". [Diagnosis] The CPU dependent code for interval timer interrupts was removed as part of 7.04 development. Because of this change, a SNOOP. UUO cannot be used to patch the interval timer code without incurring excessive overhead in the job which is doing the snooping. (It must weed out all calls except those from the target CPU.) [Cure] Add a new SETUUO (.STITP==54) to allow a job to patch the interval timer. The job must have POKE privs, be [1,2], or be running with JACCT set, and must be contiguously locked in EVM. The call is:
        MOVE    AC,[.STITP,,addr]
        SETUUO  AC,
          error return (no privs, bad arguments)
          success return
addr:   CPU mask
        instruction to XCT (relocated)
For this to work, two CDB locations have been added. .CPITP contains the instruction to execute and .CPITJ contains the job number which patched the interval timer code. When interrupts are processed, if .CPITP is non-zero, it will be executed. A suitably privileged job may set .CPITP if .CPITJ is zero or is already owned by the job executing the SETUUO. .CPITP may be cleared by supplying a zero for the instruction to execute. These words will be forcibly cleared when a job exits prematurely (ESTOP), control-C's out (STOP1) or does a RESET UUO. For the curious, two new GETTABs have been added. %CVITP and %CVITJ return .CPITP and .CPITJ respectively, although SNOOPY or any other performance measuring program should have no need to rely on these words.
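The ownership rule for .CPITP and .CPITJ in MCO 14270 can be sketched as a small state model. This is Python, not monitor code, and it covers a single CPU only. The class and method names are hypothetical, the `privileged` flag stands in for the POKE, [1,2], or JACCT check and the EVM locking requirement, and two behaviours are assumptions the MCO does not spell out: that clearing the instruction also releases ownership, and that cleanup only clears a patch owned by the exiting job.

```python
class IntervalTimerPatch:
    """Toy model of one CPU's .CPITP / .CPITJ pair."""

    def __init__(self) -> None:
        self.itp = 0   # models .CPITP: instruction to execute, 0 means none
        self.itj = 0   # models .CPITJ: owning job number, 0 means unowned

    def set_patch(self, job: int, instruction: int, privileged: bool) -> bool:
        """Model the SETUUO; False is the error return, True the success return."""
        if not privileged:
            return False
        if self.itj not in (0, job):
            return False                  # another job already owns the patch
        if instruction == 0:
            self.itp = 0
            self.itj = 0                  # assumption: clearing releases ownership
        else:
            self.itp = instruction
            self.itj = job
        return True

    def job_cleanup(self, job: int) -> None:
        """Model ESTOP, STOP1, or RESET: drop a patch owned by this job."""
        if self.itj == job:               # assumption: only the owner's patch is cleared
            self.itp = 0
            self.itj = 0


cpu = IntervalTimerPatch()
print(cpu.set_patch(job=5, instruction=0o254000001234, privileged=True))
print(cpu.set_patch(job=6, instruction=0o254000004321, privileged=True))
cpu.job_cleanup(5)
print(cpu.set_patch(job=6, instruction=0o254000004321, privileged=True))
```

The second call fails in this model because job 5 still owns the patch; once job 5 is cleaned up, job 6 can claim it.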
[Keywords] SNOOPY [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] Documentation change UUOSYM change [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 472 APRSER TIMINT,SETITP,CLRITP CLOCK1 ESTOP1,STOP1 COMCON SETTBL COMMON .CPITP,.CPITJ UUOCON RESET UUOSYM %CVITP,%CVITJ,.STITP 704A [End of MCO 14270] MCO: 14271 Name: RCB Date: 30-Dec-89:06:36:16 [Symptom] Device errors and uptime statistics are getting lost. [Diagnosis] DAEMON is unreliable about finding crash dumps and reporting errors and AVAIL statistics from them. [Cure] Have the monitor do it. This adds module CRSINI to ERRCON.MAC. This also adds two new STOPCDs (both in CRSINI): CRSIAF, type INFO -- CRSINI allocation failure. CRSINI could not allocate an exec process block in order to run its UUO-level code. OLDMON, type INFO -- OLD monitor found in crash file CRSINI found that the crash file pointed to by BOOT was for an older monitor than it can process. [Keywords] DAEMON SPEAR AVAIL [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO Documentation change [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 472 SYSINI SYSINH,SYSRLD,SYSAVL,BOOTFX 704A COMMON CNFDAE S .ERCIN,.ERHSB,.EXHDR CLOCK1 CIP2,SETDJB ERRCON SEBTIM,XFRSEB,CRSINI [End of MCO 14271] MCO: 14111 Name: JMF Date: 1-Sep-88:07:17:30 [Symptom] Patches made to virtual user mode programs with FILDDT disappear. [Diagnosis] If the patch happens to get made to a write locked page, the page doesn't get written to the swapping space the next time the job gets swapped out. [Cure] If the job and the page are in core and the page is write locked, write enable the page and decrement .USWLP before copying the data from the patcher to the patchee. 
[Keywords] JOBPEK [Related MCOs] None [Related QARs] None [MCO status] Deferred [MCO attributes] New development MCO QAR answer [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 410 UUOCON JOBPK3,JPKLW? VMSER RTNFS0 [End of MCO 14111] MCO: 14127 Name: JMF Date: 27-Sep-88:05:23:44 [Symptom] Non-zero section address break doesn't work as expected. [Diagnosis] 1) Section number gets lost in DATAO APR, in SSEUB. 2) SET BREAK command changing conditions but not break address zaps section number. [Cure] 1) DATAO APR,.CPAPR 2) DPB rather than various flavors of HLLxy. [Keywords] extended addressing address break [Related MCOs] None [Related SPRs] None [MCO status] Deferred [MCO attributes] New development MCO KL10 only Extended addressing only [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 410 APRSER SSEU2 COMCON SETB10 [End of MCO 14127] MCO: 14131 Name: ERS Date: 13-Oct-88:09:41:14 [Symptom] All known bad areas on a disk are not known to the monitor. Possible, but unlikely IME. [Diagnosis] When we're scanning the BAT blocks we first figure out how many we have to scan. To get this we add the number of bad regions the monitor found to the number of areas the disk started with (bad regions found by the various diagnostic programs). However, the latter we get by indexing off of T3. T3 happens to point to outer space. [Cure] Set up T3. [Keywords] Bad regions Swap read errors? [Related MCOs] None [Related QARs] None [MCO status] None [MCO attributes] PCO required QAR answer [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 410 REFSTR SCNBAT [End of MCO 14131] MCO: 14132 Name: JAD Date: 13-Oct-88:10:47:42 [Symptom] Possible inconsistent runtimes on the KL (MCO 13856 revisited). [Diagnosis] Forgot one case where "Inhibit Update" was set needlessly. [Cure] Clean it up.
[Keywords] RUNTIME [Related MCOs] 13856 [Related SPRs] None [MCO status] None [MCO attributes] New development MCO KL10 only [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 410 APRSER SSEUB [End of MCO 14132] MCO: 14133 Name: JAD Date: 13-Oct-88:10:55:32 [Symptom] Protocol pause doesn't exist under secondary protocol, but DTESER doesn't check before trying to effect protocol pause. [Diagnosis] Missing test. [Cure] Test ED.PPC at SETPP before doing anything rash. [Keywords] PROTOCOL PAUSE [Related MCOs] None [Related SPRs] None [MCO status] None [MCO attributes] New development MCO KL10 only [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 410 DTESER SETPP [End of MCO 14133] MCO: 14134 Name: JAD Date: 14-Oct-88:10:57:48 [Symptom] (Unsupported) feature to print PC during SET WATCH FILE output gets wrong PC during RENAME. [Diagnosis] PATH UUO done by PTHFIL blows away .USMUO. [Cure] Use JOBPDO+1 for PC. [Keywords] WATCH FILE [Related MCOs] None [Related SPRs] None [MCO status] None [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 410 UUOCON WCHPCP [End of MCO 14134] MCO: 14135 Name: JAD Date: 14-Oct-88:11:04:43 [Symptom] Including expensive "want to run" time calculation is an all or nothing proposition. [Diagnosis] Either you JFCL RQTPAT or you don't. If you do, it happens every tick. [Cure] Invent a MONGEN-definable symbol M.NRQT which is the number of ticks between the "want to run" time calculation. If zero, the expensive calculation is never done. Patchable on the fly by twiddling a variable in SCHED1. [Keywords] WANT TO RUN TIME [Related MCOs] None [Related SPRs] None [MCO status] None [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 410 COMMON M.NRQT SCHED1 RQTPAT [End of MCO 14135] MCO: 14136 Name: DPM Date: 18-Oct-88:03:04:41 [Symptom] Giving up the CX resource for the wrong job.
[Diagnosis] In CTXSER when setting context and saved page quotas, we get the CX resource if the target job is not ourselves. This works just fine because the purpose of the CX is to prevent a context block or PDB from changing out from under us. However at completion of the UUO function, we only give back the CX if we were changing our own quotas. [Cure] Only give back the CX if the target is not ourselves. [Keywords] CONTEXTS [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 410 CTXSER XITQTA 704A 703A [End of MCO 14136] MCO: 14137 Name: LWS/DPM Date: 19-Oct-88:18:30:13 [Symptom] Autoconfiguring of -20F devices works by sheer luck or doesn't work at all. When it does work, the devices sometimes work. [Diagnosis] 1. We send the "request for device status" msg to -20F in the wrong format, i.e. 0 byte,,unit # byte. 2. In DCRSER and DLPSER we use the wrong half of an AC to pick up FE device unit number. 3. We "timeshare" the same word in the device DDB for two different things. [Cure] 1. Change FNCTAB dispatch of .EMRDS msg to use "line/data" format instead of "line" format. This causes msg to be sent in correct format, i.e. unit # byte,,0 byte. 2. HRRx's --> HLRx's 3. .ORG DEVLSD's --> .ORG DEVLEN's [Keywords] FE devices RSX20F printers readers [Related MCOs] 13932, 13137 [Related QARs] None [MCO status] Checked [MCO attributes] KL10 only QAR answer [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 410 DTESER FNCTAB DLPSER DLPDT1,.ORG DCRSER DCRDT1,.ORG [End of MCO 14137] MCO: 14138 Name: LWS Date: 19-Oct-88:19:06:10 [Symptom] MCO 14126 incomplete [Diagnosis] In TPDSMM/CMM all tape kontrollers on the same channel are put in maintenance mode, but I forgot about dual ported units. Trying to put the DX20 on 1026 in maintenance mode using MTA0 as the arg to the DIAG. UUO puts the DX10 in maintenance mode.
[Cure] Add code to check UDBKDB and put all kontrollers found in maintenance mode also. [Keywords] DIAGs [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] None [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 410 TAPUUO TPDSMM,TPDCMM [End of MCO 14138] MCO: 14140 Name: JEG Date: 25-Oct-88:04:35:36 [Symptom] 1. SA10 related crashes not as useful as they could be. 2. Missing improvements in disk code. [Diagnosis] 1. SAXSER would squirrel away interesting data in the KDBs on a crash if only someone would ask it to. 2. I've been busy. [Cure] 1. Call SAXDMP from COMMON in DVCSTS. 2. Implement improved disk driver. [Keywords] SA10 [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] None [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 410 COMMON DVCST2 DSXKON LOTS [End of MCO 14140] MCO: 14141 Name: DPM/JMF Date: 25-Oct-88:04:55:24 [Symptom] Stopcode KAF trying to start I/O BUS printer. [Diagnosis] Hard to say, but it looks like LPTINI was never called, although it's not obvious how that could happen. Further inspection reveals that the length of the DDB is wrong. LPTCHF (PI channel flags), value 24 is the first word in the device dependent portion of the DDB. That's also the value of DEVCTR. If DEVCTR gets zeroed, the PI channel flags get wiped out and the contents put into the RH of the CONSO skip chain test. The next interrupt would not be serviced because the condition bits were all zeroed and a KAF results. Other devices could have other problems depending upon the usage of the words between the starting origin (DEVLSD, DEVLLD, etc.) and DEVLEN. [Cure] For all incorrectly defined DDBs, origin the device dependent portion at DEVLEN.
[Keywords] DEVLEN [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 410 CD2SER CD2DDB CDRSER CDRDDB LP2SER LP2DDB LPTSER LPTDDB PLTSER PLTDDB PTYSER PTYDDB [End of MCO 14141] MCO: 14142 Name: RCB Date: 25-Oct-88:05:36:08 [Symptom] STRUUO .FSRSL (read search list) is less friendly than the GOBSTR loop that it's supposed to replace. [Diagnosis] Demanding godly privs or same job to read a search list, when GOBSTR only requires that the invoking job have the same PPN as the target job, or have some flavor of PEEK/SPY privs, or that the job be reading the SSL. [Cure] Change the STRUUO's priv checking to match that of GOBSTR. [Keywords] STRUUO .FSRSL GOBSTR consistency [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 410 FILFND RSLSTR [End of MCO 14142] MCO: 14143 Name: RCB Date: 25-Oct-88:05:51:33 [Symptom] Too hard to tell which Autopatch tape a customer is running when we get the dumps. [Diagnosis] No way to distinguish between post-7.04 release monitors. [Cure] Change the way A00SVN and A00DLN are used in building A00VER and AXXDVN. (These are GETTAB items %CNVER and %CNDVN.) This week's monitor will be load 410 of 7.04A as far as the macros in COMMON are concerned. The load numbers will be recycled annually, at the same time as we bump the minor version number (A00SVN). This way, the version stamp on the dump will narrow down which tape it could have been from, and a check of MONVER will allow us to tell even more precisely. A00MCO should have been good enough, but it seems that some customers like to change it when they install published patches. 
[Keywords] Autopatch Revision control [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 704A 410 COMMON AXXVER 705 410 [End of MCO 14143] MCO: 14144 Name: RCB Date: 25-Oct-88:06:41:41 [Symptom] Can't always connect to TSK devices on other nodes when we should be able. [Diagnosis] NETDEV (called from AUTLNK) updates our NDB with our new configuration (without benefit of interlock) but never tells anyone else in the network about our changes. [Cure] Change NETDEV to light a flag for NETSCN to recompute our configuration. If it changes, we'll mark everyone else's NDB as needing to hear about it. Later on in NETSCN, we'll try to tell them all about it. [Keywords] ERTNA% [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 410 NETSER NETDEV,NCSCNF,NETSCN,ICMRCF NETPRM NDB.XC [End of MCO 14144] MCO: 14145 Name: DPM Date: 31-Oct-88:03:58:25 [Symptom] New: Add a couple of items that were omitted from 704 because of last minute documentation constraints. 1. Make control-T print the CPU the job last ran on. 2. Make SET WATCH FILES print the PC of the UUO. [Diagnosis] [Cure] [Keywords] CONTROL-T SET WATCH FILES [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO Documentation change [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 411 COMCON USECPU UUOCON WCHPCP 704A [End of MCO 14145] MCO: 14147 Name: JMF Date: 9-Nov-88:07:48:35 [Symptom] MX gets a protection failure when it tries to append to a mail file if its running virtual. [Diagnosis] Can page fault after doing updating ENTER and if the UUO is restarted, the combination of FO.PRV and junk in E+3 left over from the LOOKUP/ENTER results in a protection failure. 
[Cure] If appending in buffered mode, call OUTF early (before updating ENTER) to eliminate page faults after ENTER has been done. [Keywords] .FOAPP FO.PRV [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] None [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 412 UUOCON FOPEN2,FOPN9B [End of MCO 14147] MCO: 14148 Name: DPM Date: 14-Nov-88:04:58:54 [Symptom] Stopcode IME while performing magtape I/O. [Diagnosis] If buffered I/O is being done on a DX10 and if the buffer overhead words (.BFSTS, .BFHDR, and .BFCNT) are split across a page boundary such that .BFCNT resides in the page following .BFHDR, and that page happens to get destroyed, then an IME will result when MAKLST tries to read the user's word count for the buffer. No address checking is done on the word count word in this case. [Cure] Add a call to IADRCK. [Keywords] MAKLST [Related MCOs] 13932, 13137 [Related SPRs] 36173 [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 412 TAPUUO CHNLS2 704A [End of MCO 14148] MCO: 14152 Name: RCB Date: 22-Nov-88:06:01:09 [Symptom] Files created in SYS: no longer get PRVSYS or PRYSYS as appropriate to the extension (non-.SYS or .SYS). [Diagnosis] Not sure when this broke, but SYSDEV gets cleared in LH(F) and never set again. [Cure] Fix the places that want to know or that already check to get SYSDEV right. In particular, don't just range check against SYSNDX, since that keeps STD: from lighting SYSDEV. Check the actual PPN of the device instead.
[Keywords] SYSDEV PRVSYS PRYSYS [Related MCOs] 13932, 13137 [Related SPRs] 36161 [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 413 FILUUO SDVTSS,TSTDSK,FOUND0,CREAL5,CURPP1 704A [End of MCO 14152] MCO: 14153 Name: DPM Date: 28-Nov-88:09:24:30 [Symptom] Attempts to log off a job which was stopped in the process of logging out get a "No such device" error. [Diagnosis] If a job was somehow stopped while logging out, the job may have been partially destroyed. In particular, there may be no remaining context blocks. Subsequent attempts to kill the job fail because the run of the LOGIN program to log the job out will not succeed. This is because DDBSRC fails when no context block is found. [Cure] The situations surrounding this problem are pretty arcane. Typically, a job gets into this state because an idle job killer incorrectly selects a job which is already logging out. The usual methods of such programs include forcibly HALTing the job in a manner which bypasses JACCT and Control-C trapping. Hence, the resulting problem of a halted and partially destroyed job is caused by a privileged program circumventing privileged protection schemes. There are a couple of different approaches to solving this problem. The simplest is to defend against idle job killers. If the job is logging out, never allow a job to be stopped. This is most easily accomplished by testing PD.LGO in word .PDDFL of the PDB in the routine SIMCHK. PD.LGO is turned on by the LOGOUT UUO. [Keywords] LOGOUT [Related MCOs] 13932, 13137 [Related SPRs] 35781, 36146 [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 414 CLOCK1 SIMCHK 704A [End of MCO 14153] MCO: 14154 Name: KDO Date: 28-Nov-88:19:39:01 [Symptom] Definition of the context block is esthetically unappealing. [Diagnosis] "symbol" == "previous symbol" + "a bunch" [Cure] Use .ORG instead. 
[Keywords] maintainability cleanliness [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] None [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 414 CTXSER .CTFLG 704A [End of MCO 14154] MCO: 14155 Name: KDO Date: 18-Dec-88:18:02:24 [Symptom] Cannot define the default circuit cost for each device type. [Diagnosis] No code. [Cure] Add the following symbols to COMDEV: %RTCTST circuit cost for TST device %RTCDTE circuit cost for DTE device %RTCKDP circuit cost for KDP device %RTCDDP circuit cost for DDP device %RTCCIP circuit cost for CI device %RTCETH circuit cost for Ethernet device %RTCDMR circuit cost for DMR device These symbols are used in the KONCST table of D36COM. [Keywords] [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] None [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 415 COMDEV D36COM KONCST ROUTER [End of MCO 14155] MCO: 14156 Name: DPM Date: 4-Jan-89:06:28:51 [Symptom] The system wide VM counters for IW and NIW page faults are half-word quantities which don't take too long to overflow. [Diagnosis] Old monitors didn't page fault too often. Now they do. [Cure] Add two new GETTABs: %VMIWS==42,,113 ;SYSTEM COUNT OF "IN WORKING SET" FAULTS %VMNIW==43,,113 ;SYSTEM COUNT OF "NOT IN WORKING SET" FAULTS Also, because SYSTAT and SYSDPY are crufty programs and not easily modified keep SYSVCT up to date, but mark it and GETTAB %VMSPF as obsolete to entice programs to use the new counters. If SYSTAT and SYSDPY are ever fixed, the monitor will cease to maintain SYSVCT, so programs shouldn't rely on %VMSPF. 
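The overflow concern in MCO 14156 comes down to simple arithmetic on the PDP-10 word size. The Python sketch below is illustrative only: the fault rate of 100 per second is an invented figure, the helper names are hypothetical, the counters are treated as unsigned, and the assumption that the new %VMIWS and %VMNIW items are full 36-bit words is not stated in the MCO.

```python
HALF_WORD_BITS = 18   # a PDP-10 half word
FULL_WORD_BITS = 36   # a PDP-10 full word (assumed size of the new counters)


def max_count(bits: int) -> int:
    """Largest value an unsigned counter of the given width can hold."""
    return (1 << bits) - 1


def seconds_to_overflow(bits: int, faults_per_second: float) -> float:
    """Time until a counter starting at zero wraps, at a steady fault rate."""
    return (max_count(bits) + 1) / faults_per_second


RATE = 100.0  # hypothetical system-wide page faults per second

half = seconds_to_overflow(HALF_WORD_BITS, RATE)
full = seconds_to_overflow(FULL_WORD_BITS, RATE)

print(f"half word holds up to {max_count(HALF_WORD_BITS)}; "
      f"wraps after about {half / 60:.1f} minutes")
print(f"full word holds up to {max_count(FULL_WORD_BITS)}; "
      f"wraps after about {full / (3600 * 24 * 365):.1f} years")
```

At the assumed rate the half-word counter wraps in well under an hour, which matches the MCO's remark that such counters "don't take too long to overflow".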
[Keywords] VM COUNTERS [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO Documentation change UUOSYM change [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 416 COMMON SYSVCT,SYSIWS,SYSNIW 704A MONPFH PFHXCI,PFHXCN UUOSYM %VMIWS,%VMNIW VMSER USRFL7 [End of MCO 14156] MCO: 14157 Name: RCB Date: 11-Jan-89:20:14:12 [Symptom] Problems with TSK devices: 1) Can't always do an "enter passive" which is restricted to a specific node. 2) The count of TSK devices is decremented more often than it's incremented. [Diagnosis] 1) The remote doesn't admit to TSKs until someone does an unrestricted "enter passive" there. 2) AUTKIL is checking the next DDB's station number value rather than that of the DDB being removed when deciding whether to decrement the device count. [Cure] 1) Always claim at least one TSK DDB if TSK service is loaded. 2) Check the right DDB in AUTKIL. [Keywords] TSK NETCNF DDBCNT [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 704A 417 NETSER NTSC.C 705 417 AUTCON AUTKI4 [End of MCO 14157] MCO: 14158 Name: RCB Date: 11-Jan-89:21:17:51 [Symptom] Jobs using the MIC RESPONCE feature hang sometimes on a terminal, and always on a PTY. [Diagnosis] Race condition in MICLG3 which can cause us never to notify MIC that it's time to take the response, and a mistaken test in PTYSER (the JOBSTS UUO) that won't let us even try to notify MIC that the time has come. [Cure] Yes. 
[Keywords] MIC RESPONCE MIC UNDER BATCH [Related MCOs] 13932, 13137 [Related SPRs] 36167 [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 704A 417 SCNSER MICLG3,TOPMCL,TOPMS1,TOPMG1 705 417 PTYSER UJBST6 [End of MCO 14158] MCO: 14159 Name: RCB Date: 20-Jan-89:14:07:59 [Symptom] Fallback presentation of eight-bit characters doesn't work when a free CRLF is required by the character expansion. [Diagnosis] The code to re-eat a character for echo or output doesn't handle the case of a multi-part character expansion. [Cure] Keep track of which character (from an expansion or otherwise) caused the line wrap, so we can send the right one when the time comes to re-eat it. [Keywords] Two-part characters Three-part characters Fallback presentation [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 704A 417 SCNSER LDBOST,XMTCH1,XMTREO,XMTREE,REEAT 705 [End of MCO 14159] MCO: 14160 Name: LWS Date: 29-Jan-89:20:49:51 [Symptom] 1. Problems detecting "data errors" on 20F card readers. 2. 20F card reader ignored after reading a card with a 9-punch in column 1. [Diagnosis] 1. Part of the problem is 20F itself. In V16-00, when a data error occurs, the bad data is passed to the -10. Then the status msg comes, but since I/O is not in progress we pitch the status msg. V16-01 of RSX20F fixes the problem of passing the bad data instead of just sending a status msg. (and fixes the problem where it always sends a status msg after any data transfer from the reader). The monitor never checks status bits that indicate a data error. The bit it checks is not set by 20F when a read/stack/pick check occurs. 2. Before processing any msg from 20F, DCRSER calls SETRGS to setup ACs and find the DDB etc. 
The first thing SETRGS does is test the 1st byte of the msg for the "non-existent" device bit - useful during autoconfiguration when examining a status msg. However, on a data transfer, a 9-punch in Col. 1 happens to be the same bit! [Cure] 1. Check read/pick/stack check bits in status byte also. 2. Change SETRGS entry point to SETRGX and only call it when a status msg is received. (the ONLY time we care about non-existent devices). Move SETRGS entry down a few instructions where it starts looking for a reader DDB. [Keywords] card readers RSX20F [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] None [MCO attributes] New development MCO KL10 only [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 420 DCRSER SETRGS,F11DVS [End of MCO 14160] MCO: 14161 Name: DPM Date: 1-Feb-89:08:24:52 [Symptom] In some configurations, an offline alternate port claims to be an RP04. [Diagnosis] This problem is highly dependent upon timing and configuration, and only affects MASSBUS disks. At system startup time, the disk drives are autoconfigured. Drive type information is gathered and properly stored in the unit data blocks. Later, ONCMOD will build the in-core structure data base and again, attempt to read the drive types. This redundant drive type check exists to guard against the operator swapping LAP plugs, thus changing an RP06 into an RP05. If the drive type register cannot be read, then incorrect data is stored in the drive type byte in the unit data block. It is not clear why the second attempt to read the drive type register fails. The DATAI to read the register returns zeros. Normally, this could happen because the other port is busy or if the last I/O operation on the other port failed to do a dual-port drive release upon completion. Since the drive is offline, no I/O was started. Also, it has been observed that if one or more online drives exist with a higher unit number, then the problem disappears.
This indicates a possible hang in the controller. In addition, if the interval between checking the primary port and the alternate port is sufficiently long, then the DATAI always succeeds. [Cure] Problems similar to this have existed for at least 3 monitor releases. The only flaw which all monitors have in common is that the failure to read the drive type is ignored and junk overwrites the drive type code in the unit data block. A simple solution is to jump around the code which stores the drive type byte. After all this time, it seems unlikely we will determine the real nature of the DATAI failure, so this workaround must suffice. [Keywords] RP04 [Related MCOs] 13932, 13137 [Related SPRs] 36230 [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 420 ONCMOD TRYUNI 704A [End of MCO 14161] MCO: 14162 Name: JEG/DPM Date: 6-Feb-89:05:34:53 [Symptom] Day one 7-series bug: Stopcode DAU and corrupt user core images. [Diagnosis] A user job enables for clock interrupts via APRENB. APRSUB proceeds in a normal service-a-clock-tick fashion, but notices that the user has requested a clock-interrupt, and so it exits not with POPJ but with a JRST off to APRUTP. APRUTP may decide to fall thru to APRUT2. If T4 doesn't have UE.PEF/UE.NXM on (and it won't of course) it will continue to fall thru. APRUT2 will decide there is a loop in the trap handler (and there is). At this point APRUT2 loads the double word PC into T3/T4, and saves it off to .CPAPC for the error message. Then it branches off to APRUTW. APRUTW sets up the APRLOP PC, and exits off to APRSTU. APRSTU looks at T4 expecting possibly to find UE.PEF!UE.NXM, but instead it has some PC bits left over from APRUT2. This fools APRSTU into calling DIENLK. [Cure] At APRUT2, don't clobber T4 with a PC. Use T1/T2 instead.
[Keywords] CLOCK INTERRUPT [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 421 CLOCK1 APRUT2 704A [End of MCO 14162] MCO: 14163 Name: DPM/RCB Date: 7-Feb-89:07:57:02 [Symptom] A user can PIVOT away from a PPN that the CHGPPN checks will not allow him to return to. [Diagnosis] Oversight. [Cure] Always allow CHGPPN to work if returning back to the job's logged-in PPN. [Keywords] CHGPPN [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 421 UUOCON CHGPPN 704A [End of MCO 14163] MCO: 14164 Name: TL Date: 10-Feb-89:15:34:21 [Symptom] RX2 STOPCDs [Diagnosis] If the RX20 (RX211) controller is broken such that TR is not returned to STRTIO, it is possible (but unlikely) for the RX20 controller to post an error interrupt. If it does, then the error interrupt service routine will free the controller, or, worse yet, schedule IO for another drive. In either case, we return from the interrupt back into STARTIO, where we now write the drive registers out of sync with what the controller expects. This causes an error interrupt, and since no drive is (probably) active, an RX2 STOPCD. [Cure] Turn the PI system OFF while in STARTIO. On TR errors, deschedule the controller before turning it back ON. Since it's possible for the KS to accept a vectored interrupt before the deschedule code resets the interrupt enable bit, teach RX2INT to dismiss unexpected interrupts rather than STOPCD. [Keywords] RX2 STOPCD RX2SER RX20 [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] None [MCO attributes] KS10 only [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 422 RX2SER RX2INT,STRTIO,SETPAR 704A [End of MCO 14164] MCO: 14165 Name: RCB Date: 14-Feb-89:07:45:35 [Symptom] PAGE. 
UUO function .PAGAC does not work right for non-existent pages in mapped sections. [Diagnosis] The code to report non-existent pages and/or independent sections is not sufficiently forgiving of dependent sections. [Cure] Always return the mapping information for dependent sections. [Keywords] Page Accessability [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO Extended addressing only [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 422 VMSER PAGAC1,PAGAC6,PAGAC7 704A [End of MCO 14165] MCO: 14166 Name: DPM Date: 14-Feb-89:07:59:06 [Symptom] Ill mem ref running SPEAR following magtape error logging. Seen mostly with multi-ported tapes, but theoretically possible on any tape. [Diagnosis] When two or more kontrollers have access to the same tape drive, the IEP and FEP blocks are timeshared. It is expected that DAEMON will finish servicing one error before the monitor queues up data for the next. In practice, this isn't always the case. [Cure] Convert TAPSER, TAPUUO, and the drivers to use system error blocks. When an error occurs, the data will be copied into the SEBs and queued up for DAEMON to write into ERROR.SYS. The biggest problem with doing this is the monitor must format the error record itself, as SEBs are merely copied into the error file without modification. No big deal. This will reduce the monitor's dependency on DAEMON. [Keywords] MAGTAPE ERROR LOGGING [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 422 AUTCON RDDTN 704A DEVPRM TUB S TAPSER TAPDRV TAPUUO LOTS T78KON TCXKON TD2KON TM2KON TMXKON TS1KON TX1KON [End of MCO 14166] MCO: 14168 Name: RCB Date: 20-Feb-89:12:55:04 [Symptom] Two complaints received regarding ONCMOD: KLAD pack interface isn't as friendly as it could be. Bad block typeout is sometimes a little too terse. 
[Diagnosis] We special-case the KLAD structure in several places, but we don't treat it specially when gathering units for defining a structure. After all, we know that the KLAD pack is just one pack. If the user requests that bad blocks be shown for a unit, but the unit has no bad blocks recorded, then we don't type out anything about bad blocks. This leaves the user wondering whether we forgot about the request to show them. [Cure] Only ask for one spindle when gathering units for structure KLAD. Add the message "[No bad blocks found on unit ]". [Keywords] KLAD BAF BAT Bad blocks [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO Documentation change [Validity] Monitor Load Module Tags ------- ------ ------ ------ 704A 423 ONCMOD GETBAT,GETUN6 705 [End of MCO 14168] MCO: 14169 Name: RCB Date: 21-Feb-89:08:45:24 [Symptom] Invalid prompt for first logical block for swapping when defining a structure. [Diagnosis] Re-use of the old value for a unit which is no longer valid after other changes to its swapping parameters. [Cure] Range check the old value, and don't use it for the default if it's invalid. [Keywords] [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 704A 423 ONCMOD GETSW1 705 [End of MCO 14169] MCO: 14171 Name: LWS Date: 22-Feb-89:18:41:42 [Symptom] 1. Problems running DFDXC and DFDXD in user mode. Specifically when using the "Specify Channel Program" DIAG. function, .DISCP. 2. We try to load the DX10 on CPU0 from CPU1 when a DX20 diag exits. [Diagnosis] 1. The diags use the .DIAAU (Assign all units) DIAG. function to keep things nice when it starts init'ing the channel, etc. When the .DISCP DIAG. function is used, the monitor grabs the DDB for the tape drive from the PDB. However, the DDB in the PDB when the .DIAAU function is used is the last tape drive DDB.
Not necessarily the DDB for the tape drive the diag is using. So the CCW list is built for the wrong drive (except when the last drive is used). After subsequent calls using .DISCP, we run out of free core for CCWs. 2. When a DX20 diag puts the controller in maintenance mode, we detect that the DX10 can also access the drives so we put it in maintenance mode also. This keeps TAPSEC happy. The diag sets CPU to CPU1 because that's where the DX20 is located. When the diag releases the DX20, or is ^C'd, TPMCMX is called to free up everything. Since we are running on CPU1, the load of the DX10 fails (because we use the TPKRES and TPKLOD dispatches from TPMCMX). [Cure] 1. Can of worms. Because of the way the DIAG. functions work wrt DIAKDU and DIADEV, make the diags do a .DIASU (Assign single unit) DIAG. function so that the proper DDB is placed in the PDB and is found by the next .DISCP DIAG. function. In order for this to work, we have to let TPDASU (and TPDAAU for consistency) do their stuff even if F is nonzero on entry to the routine. They still make sure that the current job is the one executing the UUO if the DDB is already "owned". 2. In TPMCMX, check the KDBCAM mask against .CPBIT before calling TPKRES and TPKLOD routines. [Keywords] Diagnostics [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] None [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 423 TAPUUO TPDASU,TPDAAU,TPMCMX,TPDHVF [End of MCO 14171] MCO: 14172 Name: DPM Date: 24-Feb-89:04:01:53 [Symptom] A batch login may fail if the number of logged-in jobs minus the number of reserved batch job slots is greater than LOGMAX. [Diagnosis] The difference between LOGMAX and JOBMAX is the number of jobs reserved for emergency logins. A logging-in timesharing job may be granted access if LOGNUM will not exceed LOGMAX, and providing BATMIN job slots are reserved for batch logins.
However, if the job logging in is running under batch, then BATMIN must not be included in the computation. [Cure] Don't account for BATMIN job slots when a batch job is logging in. Its inclusion is only meaningful for timesharing logins. [Keywords] BATMIN [Related MCOs] 13932, 13137 [Related SPRs] 36246 [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 424 UUOCON ACCLOG 704A [End of MCO 14172] MCO: 14173 Name: DPM Date: 24-Feb-89:09:34:47 [Symptom] The old methods of DAEMON error logging leave something to be desired. [Diagnosis] Currently, most of the monitor expects DAEMON to gather additional data for ERROR.SYS beyond what it's initially given. This exercises race conditions, slows performance because jobs are sometimes stopped until DAEMON is finished, and makes DAEMON dependent upon monitor versions and data structure formats. [Cure] Start converting the old-style DAEMON calls to use System Error Blocks. SEBs eliminate the race conditions because one is queued up for each error log entry rather than always overwriting the same storage with new error data. Performance is improved by not having to prevent jobs from running while DAEMON is logging the error. This also eliminates the dependency of DAEMON upon the monitor because the monitor will format the entire record. DAEMON merely copies SEBs into ERROR.SYS.
This edit will do:
    DL10 error records
    I/O BUS LPT error records
    Stopcode records
    Software Events (POKE, RTTRP, SNOOP, and TRPSET)
[Keywords] ERROR LOGGING [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 424 CLOCK1 DAEEST 704A COMDEV DL10EL ERRCON DIELOG,XFRSEB LPTSER LPTSYR RTTRP RTRET S EX.SYE,EX.DEL UUOCON POKE2,SNPIBP,TRPSTX [End of MCO 14173] MCO: 14174 Name: DPM Date: 28-Feb-89:05:45:07 [Symptom] New: To accommodate future tape service bug fixes and enhancements, increase the size of the IORB. Do this by defining a set of "common" IORB definitions, to be used initially by tape service, and possibly later by FILSER. Append the tape-specific words to the common portion.
Common words:
    .ORG 0
    IRBLNK::!BLOCK 1 ;FORWARD LINK TO NEXT IORB
    IRBACC::!BLOCK 1 ;ACTIVE (CURRENT) CHANNEL COMMAND
    IRBCCW::!BLOCK ;ADDRESSES OF CHANNEL COMMANDS
    IRBIVA::!BLOCK 1 ;ADDRESS OF INTERRUPT ROUTINE
    IRBDDB::!BLOCK 1 ;ADDRESS OF DDB BEING SERVICED
    IRBSIZ::! ;LENGTH OF COMMON IORB
    .ORG
Tape-specific words:
    .ORG IRBSIZ
    TRBFNC::!BLOCK 1 ;FUNCTION DATA
    TRBSTS::!BLOCK 1 ;TERMINATION STATUS
    TRBRCT::!BLOCK 1 ;BYTE COUNT OF TRANSFER, IF DATA READ
    TRBLEN::! ;LENGTH OF BLOCK
    .ORG
IRBLNK is the old TRBLNK, but a full word quantity. IRBCCW is the merger of TRBXCW and TRBEXL. IRBIVA is the old TRBIVA. TRBFNC is the old LH of TRBLNK and now can grow beyond bit 17. TRBSTS could also be made a full word quantity. [Diagnosis] [Cure] [Keywords] MAGTAPE [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO Documentation change [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 424 DEVPRM 704A SCAPRM TAPSER TAPUUO T78KON TCXKON TD2KON TM2KON TMXKON TS1KON TX1KON [End of MCO 14174] MCO: 14175 Name: JEG/DPM Date: 28-Feb-89:06:06:19 [Symptom] ADP code reading.
Jeff Gunter points out that SCNPIF doesn't include DSKBIT in configurations which have only a single CPU. Why is that, he said? [Diagnosis] Don't know. Looks like an oversight. While this is a common configuration, it could only cause problems when the monitor is in the middle of a SCNOFF and FILSER decides to print "problem on device" at interrupt level. The SCNOFF will not have turned off DSKCHN, thus allowing FILSER to do obscene things at inappropriate times. [Cure] Probably doesn't happen a lot. Remove the conditional assembly and always include DSKBIT in SCNPIF. This is necessary only because FILSER insists on typing out at interrupt level. [Keywords] SCNSER [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 424 COMMON SCNPIF 704A [End of MCO 14175] MCO: 14176 Name: DPM Date: 7-Mar-89:05:38:49 [Symptom] It has been brought to our attention that some customer(s) want to install RM05s on their DEC-10. So be it; however, this is not recommended and will remain UNSUPPORTED. RM05s are interesting devices. They are faster than an RP06 and consume less power, as they require only single phase power. An RM05 has 30 sectors/track (10 more than an RP06), yet they run at the same 3600 RPM. Therefore, the capacity and the transfer rate are about one third greater than an RP06. Don't look a gift horse in the mouth. For starters, the head crash rate is rather high. It seems that RM05s work best when left alone. Despite the fact that they use removable media, frequent disk pack changes greatly increase the chance of a head crash. The heads fly fairly close to an RM05 pack; much closer than in an RP06. Presumably, this is the main cause of head crashes. Also, parts for RM05s are not nearly as plentiful as are those for RP06s. [Diagnosis] Missing table entries in RPXKON. [Cure] Add entries to the tables for blocks per unit, etc.
This is all that's required to make RM05s work. In all other manners, RM05s behave like an RP06. [Keywords] RPXKON [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 425 AUTCON DTRTBL DEVPRM TY.RM5 RPXKON TYPTAB UUOSYM .DCUR5 [End of MCO 14176] MCO: 14177 Name: DPM Date: 6-Mar-89:05:54:25 [Symptom] DAEMON error logging. [Diagnosis] Yes. [Cure] Convert more old-style calls to use System Error Blocks. Changes in this edit include: 1. Channel NXM & parity error logging. 7.04 records written by DAEMON contained mostly junk. 2. DECtape error logging. 3. KS10 memory error logging. Doc change: This adds one word (.CPMFL) to the CPU subtable FOR KS10 memory errors. This word is a flag which indicates the last type of error (0 = soft, 1 = hard). Also, the length of the subtable (.CPMSL) was off by one word and has now been corrected. 4. KS10 card reader & line printer error logging. [Keywords] ERROR LOGGING [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO Documentation change [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 425 APRSER MEMCHK 704A CD2SER CDRSYR COMDEV DTXEST,DTXEFL,DTXEBK COMMON .CPMFL,.CPMSL DTASER ERRS,DTASYR ERRCON CHNCO3 LP2SER LPTSYR [End of MCO 14177] MCO: 14178 Name: RCB Date: 13-Mar-89:15:12:41 [Symptom] STOPCDs AAO and IME, undeserved address checks, and undeserved checksum errors during dump-mode I/O. [Diagnosis] In the old days before 7.03, LRNGE was called to range-check an IOWD. It checked everything we needed to have checked just fine. One of the things which it checks is that the range of addresses does not cross a section boundary. Thus, it was no longer appropriate once .FOFXI/.FOFXO (extended dump I/O) were added to the FILOP. UUO. 
MONPFH does not check that old-style IOWD-based I/O does not cross a section boundary, nor does it check that the I/O is not done to the ACs. This can lead to AAOs. If the user's working set includes swapper-write-locked pages, then MONPFH will call LRNGE, even though it might be doing extended I/O, thus resulting in an undeserved address check error for an I/O doubleword which crosses a section boundary. If FILIO has to perform error recovery and retries during a dump-mode I/O operation which ends at a section boundary, under some circumstances it leaves DEVISN containing bogus information in the DDB. If this was also the first block in a retrieval pointer, we will then proceed to attempt to calculate the checksum based on a user address which we calculate, in part, from this junk in DEVISN(F). This can cause either an IME or an undeserved checksum error. Finally, much of the above is exacerbated by NXCMR in UUOCON, which is the common routine used to fetch and validate the next IOWD in a user's channel command list. It does not validate correctly when MONPFH passes it an IOWD which either starts at or crosses a section boundary. [Cure] Teach NXCMR how to validate all IOWDs which PFDOIO might pass. Correct all incorrect uses of DEVISN(F). Teach PFDOIO to use ZRNGE rather than LRNGE when it wants to fix up swapper-write-locked pages. Teach PFHDMP to give an address check error when an old-style IOWD crosses a section boundary. Teach CHKSUM to use GETEWD rather than GETWRD, so that it always fetches the correct word from the user's buffer. Teach PFDOIO to validate the range of addresses for I/O in order to be sure that I/O is not attempted to the ACs.
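The two range checks named in this cure can be sketched roughly as follows. This is an illustrative Python model, not monitor code: the function name check_transfer and its return strings are hypothetical, a section is taken to be 2**18 words, and only section 0 addresses 0-17 (octal) are treated as the ACs, which is a simplification.

```python
SECTION_BITS = 18          # assumed: one section spans 2**18 words
AC_LIMIT = 0o20            # assumed: addresses 0-17 (octal) in section 0 are the ACs


def check_transfer(start, count, old_style):
    """Return None if the transfer range is acceptable, else a reason string.

    start     -- first word address of the transfer
    count     -- number of words to transfer
    old_style -- True for an old-style IOWD, False for extended dump I/O
    """
    if count <= 0:
        return "empty transfer"
    end = start + count - 1
    # end >= start, so the range touches the ACs exactly when start is below AC_LIMIT.
    if start < AC_LIMIT:
        return "I/O to the ACs"
    crosses = (start >> SECTION_BITS) != (end >> SECTION_BITS)
    # Only old-style IOWDs are refused for crossing; extended I/O may cross.
    if crosses and old_style:
        return "address check"
    return None
```

An old-style transfer of 4 words starting at 777776 (octal) is refused with an address check, while the same range described by extended I/O passes.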
[Keywords] DUMP I/O AAO IME ADDRESS CHECK CHECKSUM ERROR IOIMPM IO.IMP [Related MCOs] 13932, 13137 [Related SPRs] 35576, 36064 [MCO status] Checked [MCO attributes] Extended addressing only [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 426 UUOCON FOPN9B,UINITC,RELEA4,NXCHIT 704A MONPFH PFHDM1,DOIO2 FILIO SATADR,MONIOY,SETLS7,POSER2,ECC2,ECC3,NOECC,CHKSUM,CSHC2B,CSHB2C FILUUO DUMPG9 [End of MCO 14178] MCO: 14179 Name: JEG/DPM Date: 21-Mar-89:05:41:31 [Symptom] FILSER doesn't usually continue from a DHD stopcode (Don't Have DA). [Diagnosis] If IOSDA is off in S (but not necessarily in DEVIOS), then a DHD will result. But if the job really does own the DA resource, it will hang, since the DA is never released. [Cure] Let the DHD return .+1. Further checks will prevent the DA from being returned for the wrong job (a RWD is likely). If we manage not to get a RWD, then the DA will be released and the monitor will continue with no problems. [Keywords] STOPCODE DHD [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 427 FILIO DWNDA 704A [End of MCO 14179] MCO: 14180 Name: JEG/DPM Date: 21-Mar-89:05:45:20 [Symptom] Stopcode KLPKAF following parity scans. [Diagnosis] A parity scan requires more than KAFTIM seconds to complete. If PPDSEC doesn't get called soon enough (and it won't because of the scan), it declares the KLIPA dead. [Cure] Increase KAFTIM from 10 to 35 seconds. This allows about 8 seconds per meg plus a few extra for good measure. Increase KNISER's timer (also called KAFTIM) from 30 to 35 seconds too. [Keywords] KLPKAF [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 427 KLPSER KAFTIM 704A KNISER KAFTIM [End of MCO 14180] MCO: 14181 Name: JEG/DPM Date: 21-Mar-89:05:49:40 [Symptom] DI hangs on RA failovers. 
A failover can leave several jobs stuck in "problem on device" mode for the old unit, even after lots of time passes. [Diagnosis] PCLDSK may inadvertently get called with an "old" unit if a failover is happening while another CPU is preparing to start I/O. The "old" unit was OK, but now, KDBCAM contains zero, causing PCLDSK to get called. PCLDSK sees no CPUs (and indeed there aren't any with the old unit) and calls HNGSTP, eventually looping back to PCLDSK again with the "old" unit. [Cure] If there is an online alternate port, use it and bypass HNGSTP. [Keywords] FAILOVER [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 427 FILIO PCLDSK 704A [End of MCO 14181] MCO: 14182 Name: JEG/DPM Date: 21-Mar-89:05:54:13 [Symptom] If a CPU croaks before it can be warm-restarted successfully, and field service is able to fix it "on the fly", sometimes bad things (usually hangs) happen immediately following the J 400. [Diagnosis] This can happen because a CPU restart clears SP.CJn for all jobs, and then CPUZAPs the "running job" for the CPU, leaving a small window when the job can be scheduled to run on another CPU. [Cure] Change SPRINI to call CPUZAP first, and then clear SP.CJn. [Keywords] WARM RESTART [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 427 COMMON SPRLP1,SPRI11 704A [End of MCO 14182] MCO: 14183 Name: JEG/DPM Date: 21-Mar-89:05:59:19 [Symptom] Stopcode KAF in QUESER. [Diagnosis] It is possible for one CPU to be diddling the database at UUO level with the EQ lock, while ENQMIN runs at interrupt level on another CPU. If the UUO-level CPU removes and releases the free core holding a block that is being scanned by ENQMIN, KAFs or other stopcodes may result.
[Cure] Implement a scheme where UUO level waits for interrupt level and interrupt level punts if UUO level holds the EQ resource. [Keywords] STOPCODE KAF [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 427 QUESER ENQMN2,EQLOCK,LOKINQ 704A [End of MCO 14183] MCO: 14184 Name: JEG/DPM Date: 21-Mar-89:06:03:20 [Symptom] If a CI disk contains HOM blocks which look valid but contain a zero word for the structure name, a failover will cause PULSAR to sniff out the disk and mount a structure with no name. [Diagnosis] Monitor never checks for a zero structure name in DEFSTR. [Cure] Return "illegal structure name" error when no name is given. [Keywords] DEFINE STRUCTURE [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 427 FILFND DEFSTR 704A [End of MCO 14184] MCO: 14186 Name: DPM Date: 24-Mar-89:08:37:20 [Symptom] Stopcode KAF in KNISER. [Diagnosis] On a very busy Ethernet wire, it is possible to spend more than 6 seconds at interrupt level taking packets off the KLNI. RSX-20F has little patience for this sort of nonsense, so it KAFs the -10. [Cure] Put an arbitrary limit on the number of packets that we'll process in a single interrupt. Experimentation has proven that trying to remove 2100 (decimal) or more packets from the queue will result in a KAF. Therefore, set the limit to 2000. Location .PBMPP (maximum packets processed) in the KDB/PCB contains the limit and can easily be patched to a different value. When the limit is exceeded, a KNIKSP (KLNI Service Paused) info stopcode will be typed on the CTY. Then the PIA will be removed for one second to let things settle down.
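The bounded service loop described in this cure might look roughly like the Python sketch below. It is a model only: service_interrupt, handle, and the returned tuple are hypothetical names, and the real code lives in KNISER with the limit held in .PBMPP.

```python
from collections import deque

MAX_PACKETS = 2000  # the limit chosen by this MCO (decimal)


def service_interrupt(queue, handle, limit=MAX_PACKETS):
    """Process at most `limit` queued packets in one interrupt.

    Returns (processed, paused). `paused` is True when packets remain after
    the limit was reached, which is where the monitor would report the pause
    and stop taking interrupts for a while.
    """
    processed = 0
    while queue and processed < limit:
        handle(queue.popleft())
        processed += 1
    paused = bool(queue)
    return processed, paused


# Usage: a backlog of 2500 packets is cut off at 2000, leaving 500 for later.
seen = []
backlog = deque(range(2500))
result = service_interrupt(backlog, seen.append)
```

Keeping the limit as a parameter mirrors the MCO's point that the value can be patched without touching the loop itself.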
[Keywords] KNISER KAF [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 430 KNISER KNIRQ1,KNIPAU,KNICON 704A [End of MCO 14186] MCO: 14189 Name: JEG/DPM Date: 28-Mar-89:08:22:29 [Symptom] If a program dies with infinite IPCF quotas and freecore is very low or about to expire, the system grinds to a standstill. Some jobs are stuck NApping and others get unexpected error returns. Trying to log off the offending job fails. [Diagnosis] IPCLGO does two things. It sends a logout message to QUASAR and it turns around all unreceived messages; in that order. The send to QUASAR will fail because there is no available freecore, and the logging out job owns a large chunk of it. [Cure] Reverse the order of things. First, empty the send and receive queues, then send the logout message to quasar. [Keywords] IPCF LOGOUT [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 430 IPCSER IPCLGO 704A [End of MCO 14189] MCO: 14190 Name: JEG/DPM Date: 4-Apr-89:05:14:36 [Symptom] Stopcode IME removing a structure. Other problems possible too. When allocation is in progress, or the ACCs and NMBs are in transition, and a structure is being removed, an IME is likely to occur on a busy system. [Diagnosis] TAKBLK and friends rely on DEVUNI(F) to indicate the target unit for a structure is still valid. FILSER normally depends upon TSTGEN checking UNIGEN. The window is sufficiently large to allow the SKIPN DEVUNI to work while REMSTR is removing a structure. [Cure] Change TAKBLK to call TSTGEN. Make BMPGEN get and release the DA around the update of UNIGEN. 
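The text does not spell out what TSTGEN compares. Assuming it follows the usual generation-number pattern (the unit carries a counter that is bumped when its structure goes away, and each user remembers the value it saw), the idea can be sketched as below. Unit, Ddb, and still_valid are hypothetical names for illustration only.

```python
class Unit:
    """Stands in for a unit data block with a generation counter."""

    def __init__(self):
        self.generation = 0

    def remove_structure(self):
        # Bumping the counter invalidates every reference taken earlier.
        self.generation += 1


class Ddb:
    """Stands in for a DDB that remembers the generation it was opened under."""

    def __init__(self, unit):
        self.unit = unit
        self.generation = unit.generation

    def still_valid(self):
        # A non-null unit pointer is not enough; the generations must match.
        return self.unit.generation == self.generation


unit = Unit()
ddb = Ddb(unit)
valid_before = ddb.still_valid()
unit.remove_structure()
valid_after = ddb.still_valid()
```

The point of the pattern is that the stale pointer still looks usable after removal, and only the generation comparison reveals that it is not.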
[Keywords] DISMOUNT [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 431 FILFND BMPGN1 704A FILIO TAKBL0,TAKBLJ [End of MCO 14190] MCO: 14191 Name: RCB Date: 5-Apr-89:22:16:13 [Symptom] Hung ANF traffic to a node. Especially common over an Ethernet channel. It may (sometimes) correct itself eventually, especially if it was not an Ethernet channel that was involved. [Diagnosis] After NETWRT queues an output message (PCB) to the FEK, it calls its device driver to perform the output. This can happen several times before the device driver tells the FEK routine that the output has happened. At that point, the FEK routine tells NETSER that the message has been sent. This causes the PCB to be placed on a generic output-done queue for NETSCN to process. Once we get to NETSCN, we move PCBs from this queue to a queue for the NDB for the node to which we were sending the message. The subroutine responsible for this, NTSC.O, is also responsible for keeping the NDBLMS (last message sent) field updated. It does this by noting the message number of each PCB it places into the output-pending queue in NDBLMS. However, the PCB queue from which it is taking these messages is unordered, and this can lead to having a very long list of messages, with NDBLMS reflecting only (for example) the first of them. Once this has happened, CHKNCA (check network-control ACK) will ignore an ACK for any message beyond that in NDBLMS. However, the remote is quite likely to send us an ACK for the actual last message in the ACK-pending queue. This leads to a full output queue and a refusal to transmit any further data messages, at least until the REP/NAK timer causes us to send a REP, which will result in a NAK. Because we ignored the implicit ACK present in the NAK, we will still have a queue of outstanding messages, which the NAK will cause us to retransmit all at once.
Unless the device driver stutters in a friendly manner, this will merely get us into the same mess again with the same set of messages, and no progress will ever be seen. [Cure] In NTSC.O, only change NDBLMS if it's moving in a forward direction. In INCTNK, where we resend the queue in response to a NAK, reset NDBLMS to NDBLAP in order to avoid possible ACK races. [Keywords] ANF Ethernet Hung ANF [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] HOSS attention [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 432 NETSER NTSC.O,INCTNK 704A [End of MCO 14191] MCO: 14192 Name: RCB Date: 5-Apr-89:22:55:57 [Symptom] Terminal characteristics get handled incorrectly during a SET HOST session which is handled by NETVTM. [Diagnosis] Setting the terminal type happens after all the other characteristics get set, and clobbers them. [Cure] Save the other characteristics until after we set the terminal type in VTMCHR. [Keywords] NETVTM SET HOST terminal characteristics [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] HOSS attention [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 432 NETVTM VTMCHR 704A [End of MCO 14192] MCO: 14193 Name: DPM Date: 6-Apr-89:11:36:40 [Symptom] MCO 14190 went a bit too far. [Diagnosis] In trying to close the window where a structure could be removed while other things were being done to the ACC/NMB blocks, BMPGEN was modified to get and give the DA resource. However, one needs a DDB to use the DA and REMSTR doesn't have one to use. Also, BMPGEN expects F to contain a STR DB addr, not a DDB. [Cure] Can't plug the hole that tight. Remove references to the DA in BMPGEN and live with occasional IMEs. There is no structure-wide resource to take care of this situation. Too bad. 
[Keywords] REMSTR [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 432 FILFND BMPGEN 704A [End of MCO 14193] MCO: 14194 Name: KDO Date: 6-Apr-89:12:18:04 [Symptom] Invalid status returned in ETHNT. UUO User Buffer Descriptor (UBD) blocks. [Diagnosis] Missing code. [Cure] Add code. [Keywords] [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] None [MCO attributes] KL10 only [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 432 ETHUUO ENCXDG 704A [End of MCO 14194] MCO: 14195 Name: KDO Date: 10-Apr-89:10:54:04 [Symptom] Adjacency up/down events for DECnet endnodes on multi-area LANs. [Diagnosis] DECnet is choosing a designated router outside its area. [Cure] Ignore Ethernet Router Hello messages from outside our area. [Keywords] [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] None [MCO attributes] KL10 only [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 432 ROUTER RHMASE 705 [End of MCO 14195] MCO: 14197 Name: DPM Date: 11-Apr-89:08:44:04 [Symptom] REFSTR creates files with strange version numbers. [Diagnosis] Sticking REFSTR's version rather than the monitor's is, at best, non-standard. But when displayed by DIRECT, it looks like a bug. [Cure] Use CNFDVN instead. [Keywords] REFRESH [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 423 REFSTR RIBST1 [End of MCO 14197] MCO: 14198 Name: LWS Date: 17-Apr-89:10:18:26 [Symptom] 1. Same TM02/3 controller register dumped 9 times in TUB on error. 2. SPEAR doesn't know how to interpret TM02/3 controlled tape drive error entries or the monitor doesn't give SPEAR what it expects, take your pick. [Diagnosis] 1. RDREGS in TM2KON expects T2 to still contain controller register number on return from RDMBR.
RDMBR clears all but register data in T2. 2. SPEAR expects 2 equal length blocks of error status information in the error entry (IEP and FEP data). However, TM2KONs IEP length is 1 and FEP length is 16 (octal). So we only write 1 word of "IEP" information. This causes SPEAR's interpretation of the error to be garbage (1. above doesn't help either). Note: The TUB for a TM02/3 controlled tape drive contains 2 blocks of TM2ELN words each for "IEP" and "FEP" error information. But the IEP word is set for only a length of 1. Why? I don't know. Poking the IEP word on 2476 to be the same as the FEP word causes 2 sets of error information to be dumped and SPEAR correctly interprets the error. So it seems we can change SPEAR to handle unequal length "IEP" and "FEP" error blocks, or have the monitor dump equal length blocks. [Cure] 1. PUSH/POP T2 around call to RDMBR at RDREG1. 2. Change LH of TUBIEP in TM2KON to be -TM2ELN. [Keywords] SPEAR TM02/3 TU77 [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] None [MCO attributes] New development MCO Field service attention PCO required [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 433 TM2KON TUBIEP,RDREG1 [End of MCO 14198] MCO: 14199 Name: LWS Date: 17-Apr-89:11:17:10 [Symptom] Can't assign a network device if a device of the same type doesn't exist on the local host. [Diagnosis] If the device doesn't exist on the local host there will not be an entry in GENTAB for the corresponding device. The call to CHKGEN in DVSTAS will fail and we bomb the user even though the network device does exist and is assignable. [Cure] At the non-skip return after the call to CHKGEN in DVSTAS load F with the start of the DDB chain and fall through into code that will eventually do the right stuff. But! This is not going to work correctly all the time. 
If no local line printers exist and we're trying to find a network printer DDB, we eventually build a DDB for the network printer and try to link it between the 'DSK' DDB and the 'SWAP' DDB - ding ding ding, IME. This happens because LNKDDB in AUTCON likes to keep the DDB chain in sorted order by device name. So 'LPT' falls between 'DSK' and 'SWAP', but 'DSK' DDBs are in the hiseg. In order to avoid the wrath of FILSER, change the name of SWPDDB to 'DSKSWP'. [Keywords] NETWORK DEVICE [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] None [MCO attributes] Beware file entry required New development MCO PCO required [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 433 COMMOD DEVNAM UUOCON DVSTAS [End of MCO 14199] MCO: 14200 Name: LWS/DPM Date: 20-Apr-89:10:52:06 [Symptom] Tape UDBs on KS not filled in with prototype data. [Diagnosis] AUTUDB doesn't compute ending address for BLT. [Cure] ADDI P2,(U) [Keywords] KS SPEAR [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO Field service attention PCO required Single-section monitors only [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 434 AUTCON AUTUD1 [End of MCO 14200] MCO: 14201 Name: RCB Date: 25-Apr-89:07:05:00 [Symptom] FILSER's error reporting leaves too big a window for DAEMON to get stale information for SPEAR to report. Not only that, but DAEMON even has to guess just what kind of error it is supposed to report. [Diagnosis] The ERRPT. UUO just doesn't give us enough to work with. We need to use system error blocks if we're going to get it right. [Cure] Do so. This adds EX.AVL to the bits which can be set in the transfer table header by the SEBTBL macro. If EX.AVL is set, the error entry will be copied to AVAIL.SYS as well as to ERROR.SYS. This also changes the way in which all disks report their errors. 
There is now a kontroller dispatch entry, KONELG, which is used by FILIO to format an error block and queue it up for DAEMON. [Keywords] Disk errors Error logging DAEMON System error blocks SPEAR [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] Beware file entry required New development MCO [BEWARE text] The format of the DSK KDB has changed again, with the addition of the KONELG dispatch entry for error logging. Any local disk device drivers will need to be changed accordingly. See MDEELG in FILIO for an example of how to do this. DAEMON version 23A(1026) or later must be installed before this MCO. [Validity] Monitor Load Module Tags ------- ------ ------ ------ 704A FILIO 705 434 COMMON DPXKON FHXKON FSXKON RAXKON RHXKON RNXKON RPXKON DSXKON COMMOD DEVPRM S ERRCON DTASER COMDEV [End of MCO 14201] MCO: 14202 Name: RCB Date: 25-Apr-89:07:16:53 [Symptom] Jobs get stuck in event wait for system IPCF, and need manual intervention to be restarted. If they were logging out at the time, the job slot is stuck and useless. [Diagnosis] [SYSTEM]GOPHER is completely ignorant of the possibility that a system program like the account daemon might die and get logged out, thus causing its IPCF receive queue to be "returned to sender, address unknown". It just throws the returned messages on the floor, and leaves the user's job waiting for an acknowledgement message which will never come. [Cure] Educate the rodent. Check the returned message field, and validate it against the expected sequence number. If it matches, give the user an error return from SENDSP, so that a QUEUE. UUO (for example) will give the "component not running" error, and FILDAE messages will be handled as though FILDAE had never been running. 
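The check described in this cure can be sketched roughly as follows. This is an illustration in Python, not the monitor's MACRO-10 code: the names PendingRequest and handle_returned_message are hypothetical, and the real message layout used by IPCSER is not modeled here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PendingRequest:
    """Hypothetical model of a job in event wait for a system IPCF reply."""
    job: int
    expected_seq: int
    waiting: bool = True
    error: Optional[str] = None


def handle_returned_message(pending: PendingRequest, returned_seq: int) -> bool:
    """Handle a message that came back because its receiver went away.

    Returns True if the waiting job was released with an error,
    False if the returned message did not belong to this request.
    """
    if not pending.waiting:
        return False                      # nothing is waiting any more
    if returned_seq != pending.expected_seq:
        return False                      # not ours; leave the job waiting
    pending.waiting = False               # release the job from event wait
    pending.error = "component not running"
    return True
```

The point of the sequence comparison is that only the request whose message actually bounced gets the error return; unrelated returned messages leave the job waiting as before.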
[Keywords] EW hang System IPCF wait [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 704A IPCSER 705 434 [End of MCO 14202] MCO: 14203 Name: RCB Date: 25-Apr-89:08:11:52 [Symptom] System error blocks can eat up all of free core if DAEMON isn't running. [Diagnosis] Once they get queued, they are only deleted when some privileged program executes a SEBLK. UUO. [Cure] Add a timer. Once a minute, we will look for any blocks which are older than SEBAGE minutes and delete them. SEBAGE defaults to 10 (decimal), and can be changed with MONGEN. If SEBAGE is set to zero, the error blocks will live forever. [Keywords] System error blocks free core limits [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO Documentation change [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 434 ERRCON 704A COMMON CLOCK1 [End of MCO 14203] MCO: 14204 Name: DPM Date: 27-Apr-89:06:36:33 [Symptom] In some configurations, LINK will report NDBNNM as undefined even though ANF-10 network software is loaded. [Diagnosis] This problem is one of programming style and MACRO's tolerance for conflicting symbol definitions. NDBNNM is defined in NETPRM, which is searched by NETSER. The first several references to this symbol are properly made. However, at NETAS1 the symbol is referenced as external. MACRO should probably flag this as an "E" error. Instead, the original value of the symbol is lost and MACRO generates global fixup requests for all references to NDBNNM. It's not clear why this problem has surfaced now, as the code at NETAS1 has not changed for several monitor releases, but correcting the reference in NETAS1 resolves the undefined global. [Cure] Reference NDBNNM as an internal quantity.
[Keywords] UNDEFINED GLOBAL [Related MCOs] 13932, 13137 [Related SPRs] 36260 [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 435 NETSER NETAS1 704A [End of MCO 14204] MCO: 14205 Name: RCB Date: 27-Apr-89:18:49:36 [Symptom] MCO 14165 didn't go far enough. PAGE. UUO function .PAGAC still isn't always right. Spy pages for sections 3-36 are sometimes reported as being unreadable. [Diagnosis] PAGA93, which finds a page number to return for a spy page in sections 3-36, doesn't preserve T2. Its caller wants T2 to contain the map entry after the call, as well as before. [Cure] Preserve the map entry in T2. [Keywords] PAGE. UUO Page accessability [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 435 VMSER PAGA93 704A [End of MCO 14205] MCO: 14206 Name: DPM Date: 28-Apr-89:05:51:47 [Symptom] Stopcode IME removing a structure (revisited). [Diagnosis] Previous MCOs didn't plug all the holes, although the window was made much smaller. [Cure] Prevent races by incrementing UNIGEN while holding the DA. Conceptually, this is easy, but BMPGEN is called with F pointing to a STR, not a DDB. Therefore, change UPDA & DWNDA to get the job number from .USJOB rather than from PJOBN. This is OK since the use of DA requires a job to be mapped to reference PJOBN anyway. [Keywords] REMOVE STRUCTURE [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 435 FILFND BMPGN1 704A FILIO UPDA,DWNDA [End of MCO 14206] MCO: 14207 Name: LWS Date: 29-Apr-89:14:48:25 [Symptom] Can't create a SSL larger than .SLMXJ (maximum JSL size) structures using STRUUO. [Diagnosis] Code in SLSTRR and SLCHK always uses .SLMXJ as a maximum without checking to see if it's a JSL or the SSL.
[Cure] Check search list type (RH(F)=0 means SSL) and use appropriate maximum value. [Keywords] SSL [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] None [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 435 FILFND SLSTRR,SLCHK 704A [End of MCO 14207] MCO: 14209 Name: DPM Date: 8-May-89:08:57:19 [Symptom] Pathological names whose first component is NUL do not necessarily behave as the NUL device. A DEVCHR, or DEVTYP of the sixbit name returns disk-only bits. The same is true if you do one of these UUOs on an open channel. However, if a LOOKUP or ENTER is done, then the right thing comes back. Also, DEVNAM never returns NUL and WATCH FILES doesn't expand the filespec correctly. [Diagnosis] The monitor believes a pathological name can only be a disk device and everybody knows that NUL is really a disk even though it claims to be all devices. But FILSER doesn't make that claim often enough. [Cure] Fix SETDDB to test for pathological NUL as well as assigned NUL. Change NULTST to test for DVDSK and DVTTY instead of sixbit NUL. Fix PRTDDB to print NUL instead of a logical device name. Add crock routine LNMNUL to do the grunt work when it's really necessary to know if it's the NUL device. [Keywords] NUL [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 436 COMCON PRTDDB FILUUO NULTST,LNMNUL,SETDDB UUOCON DVCHR,UDVNAM 704A [End of MCO 14209] MCO: 14211 Name: DPM Date: 15-May-89:09:29:52 [Symptom] It's difficult to measure magtape performance on a per-kontroller basis without using any counters. [Diagnosis] Never done before I guess. [Cure] Add two new counters to the KDB: TKBCRD counts characters read and TKBCWR counts characters written.
[Keywords] MAGTAPE PERFORMANCE [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 437 DEVPRM TKBCRD,TKBCWR T78KON TCXKON TD2KON TM2KON TMXKON TS1KON TX1KON 704A [End of MCO 14211] MCO: 14212 Name: LWS Date: 15-May-89:10:21:02 [Symptom] Undeserved ?Illegal memory reference in jobs with a shared hiseg. [Diagnosis] If a sharable hiseg is expanding and there are enough secondary map slots available to map the expansion, RDOMP is not set for any other job using the same hiseg. [Cure] In GTHMAP, if there are enough map slots for the expansion, call HRDOMP via HGHAPP so all other users of the same hiseg will have their maps redone before they run again. [Keywords] Sharable high segments [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] None [MCO attributes] New development MCO PCO required [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 437 SEGCON GTHMP1 704A [End of MCO 14212] MCO: 14214 Name: JAD Date: 25-May-89:07:51:12 [Symptom] Possible SCAFOO stopcodes in a maximally-configured CI network. [Diagnosis] Insufficient path blocks available for the number of CI nodes and CPUs in the CI/system configuration. There is space available for 32 path blocks, but a maximally-configured system could require much more. Problem occurs with definition of C%PBLL (number of path blocks) - it is defined as 2*C%SBLL (number of system blocks). Depending on the number of CI nodes and CPUs, this definition may leave insufficient path blocks. [Cure] Redefine C%PBLL as 6*C%SBLL - this will allow for the largest possible CI and CPU configuration. 
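As a rough check of the arithmetic in this cure: if the 32 available path blocks are taken as a decimal count produced by the old definition 2*C%SBLL, then C%SBLL would be 16, and the new definition would give 96 path blocks. The value of C%SBLL is inferred here, not stated in the MCO, and the variable names in this Python sketch are hypothetical.

```python
# Assumption: "space available for 32 path blocks" is decimal and
# comes from the old rule C%PBLL = 2*C%SBLL.
OLD_FACTOR = 2
NEW_FACTOR = 6
old_path_blocks = 32

system_blocks = old_path_blocks // OLD_FACTOR   # inferred C%SBLL
new_path_blocks = NEW_FACTOR * system_blocks    # C%PBLL after this MCO

print(system_blocks, new_path_blocks)           # prints: 16 96
```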
[Keywords] CI SCAFOO [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] None [MCO attributes] New development MCO KL10 only [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 SCAPRM C%PBLL [End of MCO 14214] MCO: 14217 Name: DPM/RJF Date: 30-May-89:06:58:55 [Symptom] Various problems suspending and resuming a system: 1. KLIPAs and KLNIs don't get reloaded. 2. KLNIs will be restarted even if they had first been removed. 3. Stopcode NULFNC during the suspend. [Diagnosis] 1. Code to call PPDINX and KNIINI is under an IFG conditional in COMMON, so it is not included in single CPU configurations. 2. When a KLNI is removed, the bit corresponding to the proper KLNI on a given CPU is set to indicate that the device is to be ignored on subsequent initialization calls. However, IPAMSK is never checked on KLNI restarts. 3. For reasons that escape me, the NULFEK is being called on system sleep/resume when it hadn't before. Apparently this never worked before, but it went unnoticed. The dispatch table does not contain the appropriate entries for these NETSER functions. [Cure] 1. Move the calls to PPDINX and KNIINI outside the IFG conditional. 2. Teach KNIINI to respect IPAMSK on KLNI restarts. 3. Add system sleep/resume entry point to NULFEK's dispatch table. [Keywords] SYSTEM SLEEP [Related MCOs] 13932, 13137 [Related SPRs] 36269 [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 441 COMMON SPRIN5 KNISER KNIINI NULFEK NLFDSP 704A [End of MCO 14217] MCO: 14218 Name: LWS Date: 8-Jun-89:09:08:11 [Symptom] Undeserved memory parity errors on KLs with 4MW of memory. [Diagnosis] RH20s do undetermined things when accessing the last physical (quad)word in 4MW. This is an RH20 problem. This problem was never encountered in previous versions of the monitor and BOOT. The monitor used to put its hiseg at the very top of memory. Then BOOT occupied the top of memory. 
Now, BOOT is still there, but it now frees the pages at the top because they contain tape drivers that are not needed once BOOT is done. So, these pages at the top of memory are free to use by the monitor. When a user gets the last page of memory, it's fair game for I/O by an RH20. [Cure] For lack of something better to do at the moment, if the last page of a 4MW system is free, mark it as nonexistent in NXMTAB and PAGTAB and set MEMSIZ to 17,,777000 instead of 20,,000000. [Keywords] 4 MW parity [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] None [MCO attributes] New development MCO Field service attention KL10 only [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 442 SYSINI MMTIN9 704 [End of MCO 14218] MCO: 14219 Name: DPM Date: 27-Jun-89:06:37:34 [Symptom] There appears to be no upper bound on the number of extended RIBs FILSER is content to create. You can literally fill a disk with extended RIBs for a single file. When you CLOSE the file, you might as well take the rest of the day off, because FILSER has lots of bookkeeping to perform. [Diagnosis] RIBXRA contains an 8-bit field for the extended RIB number. FILSER never checks for field wrap around. The RIB number is only read back when a user specifies a negative USETI, and otherwise serves no real purpose. [Cure] Check for wrap around and impose an additional limit based on the contents of MUSTMX when RIBs are created. Set the maximum number of USETIs to 255 decimal. [Keywords] EXTENDED RIBS [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 443 COMMOD DESRBC,MUSTMX FILIO EXTRB2 704A [End of MCO 14219] MCO: 14220 Name: RCB Date: 30-Jun-89:00:15:53 [Symptom] An OPEN which specifies a logical name or a pathological name can fail or find the wrong device.
[Diagnosis] The DDB search logic does not allow certain names to be found unless they are assigned to disks (i.e., funny-space DDBs). CK2CHR gets called when it should not. For that matter, LP will match a terminal assigned as LPT but not as LP. [Cure] For 2-character device names which CK2CHR changes, do the DDB searching twice. First, try the original name. If that fails or returns DSKDDB, then try again with the expanded name. If the second search fails but the first returned DSKDDB, then return the results from the first DDB search. Eliminate the hacks for CK2CHR and SY: from the search loop. [Keywords] PDP-11 names [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 444 UUOCON DDBSCC 704A [End of MCO 14220] MCO: 14221 Name: RCB Date: 1-Jul-89:01:56:17 [Symptom] KAF at PI level of the NIA20. [Diagnosis] Taking too long to empty the response queue (MCO 14186 revisited). [Cure] Check .CPTMF to try to be sure that too much time won't pass during a single KLNI interrupt. Also, move the check to after the callback so that we don't drop the buffers on the floor. Otherwise, after long enough, the protocols will run out of buffers (especially DECnet). Because .CPTMF is slightly bogus just as the system is coming up, ignore it until .CPUPT is at least 2 (ticks). Note that the counters and limits added by MCO 14186 are still present and in force. [Keywords] KAF NIA20 KNIKSP [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] Field service attention HOSS attention KL10 only [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 444 KNISER KNIRQ1 704A [End of MCO 14221] MCO: 14222 Name: RCB Date: 7-Jul-89:23:34:54 [Symptom] System is annoyingly sluggish at system startup time. [Diagnosis] Trying to run dozens of copies of INITIA on random terminals all at the same time, in dozens of job slots. [Cure] Only sort of. 
Invent a new MONGEN-definable symbol, DSDRIC (dataset devices run INITIA CUSP), to control whether INITIA runs on dataset lines. It will default to one, which means that INITIA will continue to run on datasets at system startup. If set to zero at MONGEN time, TTYINI will not force INITIA commands on the datasets. For the curious, the reason INITIA runs on datasets at startup time is because of the existence of hardware interfaces which need to have parameters set even before a call comes in to the modem. However, most sites probably have more well-behaved interfaces, and will be able to set DSDRIC to zero. [Keywords] sluggish startup INITIA datasets [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO Documentation change [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 444 COMDEV DSCTAB 704A SCNSER TTINI2 [End of MCO 14222] MCO: 14223 Name: DPM Date: 18-Jul-89:05:32:00 [Symptom] DA28s don't work. [Diagnosis] XTCLNK assigns junk names to UDBs. Later calls to build DDBs fail because the target UDBs cannot be found. Also, XTCSER will not assemble with FTMP turned off because of references to SCNLOK and OUCHE. [Cure] Correct logic that builds UDB names. Put IFN FTMP conditionals around the reference to SCNLOK. Make OUCHE available in all KL10 configurations. [Keywords] DA28 [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 445 APRSER OUCHE COMMON OUCHTB XTCSER XTCLN2,CHKTYP,MPIOWD 704A [End of MCO 14223] MCO: 14224 Name: DPM Date: 20-Jul-89:11:58:59 [Symptom] Random job tables (mostly JBTSTS) get clobbered, weird crashes, general mayhem. [Diagnosis] Steve Perkins is running .EXE files created on the -20 again. If the .EXE directory claims to have sharable pages that aren't also marked as high segment pages, GETEXE returns flags indicating the image is sharable, but with no high segment.
Parts of GET clean up assume that if the sharable bit is on, then there must be a high segment. This is true for .EXE files created on a -10, but not otherwise. Anyway, making this assumption, SEGCON blindly picks up high seg block addresses (which are usually zero) and indexing off of zero, proceeds to write all over the monitor's low segment. [Cure] While processing .EXE directory entries, turn off the sharable bit if the high segment bit is not turned on. [Keywords] TOPS-20 EXE FILES [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 446 SEGCON WANTIT 704A [End of MCO 14224] MCO: 14225 Name: DPM Date: 25-Jul-89:05:28:01 [Symptom] SA10s don't function in an environment where DF10C-based device drivers exist (TM2KON for one). [Diagnosis] DF10C drivers fail to test for the presence of SA10 devices. Therefore, SA10s look like 18-bit DF10s. [Cure] Test SI.SAX in the CONI word in the appropriate xxxCFG routines. [Keywords] SA10 [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 446 FSXKON FSXCFG RPXKON RPXCFG TM2KON TM2CFG 704A [End of MCO 14225] MCO: 14226 Name: DPM Date: 3-Aug-89:09:18:56 [Symptom] Several annoying problems that prevent SA10-based tape from working well. [Diagnosis] 1. SAXSER & TS1KON bum a bit in the KDBUNI word to indicate a software interrupt was requested. This means that KDBs can't be compared against each other, so AUTCON will build multiple KDBs for a single SA10 kontroller. 2. Tapes ported between a DX10 or a DX20 and an SA10 will have duplicate UDBs and DDBs built. This is because TD2KON and TX1KON do not know how to extract drive serial numbers. Subsequent comparisons between a drive S/N and an existing one don't match, so AUTCON believes it's looking at two different drives. 3.
The code to compare drive serial numbers is not interlocked in AUTCON. Under the right circumstances, two configuring CPUs which have detected the same drive might not notice each other. [Cure] 1. Move the software bit into KDBSTS. It's a better place for such things. 2. Fix TX1KON and TD2KON. 3. SYSPIF/SYSPIN around much of AUTDPU. [Keywords] SA10 [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 447 AUTCON AUTDPU DEVPRM KD.SIR SAXPRM SA.SIR SAXSER SAXINT TD2KON TD2DRV TS1KON TS1DRV TX1KON TX1DRV 704A [End of MCO 14226] MCO: 14227 Name: DPM Date: 3-Aug-89:09:20:23 [Symptom] Possible tape hangs after a CPU restart. [Diagnosis] SPRINI doesn't clear the TAPSER interlock nesting flag. [Cure] Do so. [Keywords] INTERLOCKS [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 447 COMMON SPRI10 704A [End of MCO 14227] MCO: 14228 Name: RCB Date: 3-Aug-89:15:33:21 [Symptom] Problems setting explicit speeds on TTY lines in the ANF front ends. [Diagnosis] Trying to do autobaud even though the speed has been set to something other than the autobaud speed. [Cure] Don't do that. If the speed is set in the config.P11 file, and that speed is not the autobaud speed (currently 2400 baud), override the ABD characteristic for the line. [Keywords] Autobaud Non-autobaud TnXS ANF10 [Related MCOs] 13932, 13137 [Related SPRs] 36270, 36268 [MCO status] None [MCO attributes] Field service attention HOSS attention [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 450 CONFIG P11 704A DNTTY P11 DNLBLK P11 MACROS P11 [End of MCO 14228] MCO: 14229 Name: RCB Date: 8-Aug-89:22:22:51 [Symptom] Monitor too big and slow. Not enough free bits in JBTSTS. [Diagnosis] Lots of places in the monitor test bit JDC from JBTSTS. A few others clear it. Only DAECOM can set it.
It is unreachable code, left over from the old DCORE and DUMP commands and the days when DAEMON handled virtual references for EXAMINE, DEPOSIT, and VERSION commands. The JDC bit is consequently never set, and all the tests for it are redundant. [Cure] Free up the bit in JBTSTS, and eliminate all references to it. [Keywords] PERFORMANCE [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO Documentation change [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 450 COMCON 704A CLOCK1 SCHED1 SCNSER S [End of MCO 14229] MCO: 14230 Name: DPM Date: 15-Aug-89:06:36:14 [Symptom] More error logging stuff ... [Diagnosis] Yes. [Cure] Convert more old-style DAEMON error logging calls to use the System Error Blocks. This edit converts: 1. CPU attached/detached records. 2. Node online/offline records. 3. Date/time change records. Code is also in place to handle system reload (.ERWHY) records, but because of interface problems with DAEMON and AVAIL.SYS, this call will be temporarily neutered. [Keywords] DAEMON ERROR LOGGING [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 450 COMCON SETDAT CPNSER CPUCSC NETSER NODEAM S .ERMVR SYSINI SYSRLD,SYSAVL 704A [End of MCO 14230] MCO: 14231 Name: RCB/DPM Date: 15-Aug-89:07:43:18 [Symptom] Convert DAEMON reporting of KL error chunks from RSX20F to use system error blocks. This eliminates two words in the CDB, .CPETM and .CPEAD. In order to accomplish this cleanly, there is now a new routine in IPCSER, OPRMSG, which allows one to queue up messages for ORION. If ORION is not running, the messages can optionally be sent to OPR: or the CTY. See IPCSER for the calling sequence. The behavior is controlled by bits in T1 on the call, of the form OPM.??, which are defined in S.
[Diagnosis] [Cure] [Keywords] KL error chunks system error blocks system messages [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked Deferred [MCO attributes] New development MCO Documentation change [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 450 DTEPRM 704A DTESER S IPCSER ERRCON COMCON CLOCK1 COMMON [End of MCO 14231] MCO: 14233 Name: RCB Date: 22-Aug-89:09:55:53 [Symptom] Undeserved KNIKSP stopcodes. [Diagnosis] .CPTMF limit is exceeded at system startup time. [Cure] If .CPUPT is lower, then don't KNIKSP. [Keywords] [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] None [MCO attributes] KL10 only [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 451 KNISER KNIRQ1 704A [End of MCO 14233] MCO: 14234 Name: DPM Date: 23-Aug-89:07:34:30 [Symptom] Programs using external tasks (XTCSER) hang following attempts to JAM powered off remote computers. [Diagnosis] If FTMP is turned off, the call to CHKTYP from DWNUNI says to never do typeout. DWNUNI simply returns without clearing any DA28 errors which caused the unit to be declared down. Thus, the DA28 becomes unusable for all other users. A similar situation exists where connect errors are processed. In this case, we forget to force the unit offline. [Cure] Three things. First, fix CHKTYP to work correctly with FTMP turned off. Second, if no typeout is to be done, skip around the message generation code and clear the DA28. Finally, on connect errors, always force the unit offline whether or not we'll type a message. [Keywords] DA28 ERRORS [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 451 XTCSER CHKTYP,DWNUNI,CHKCER 704A [End of MCO 14234] MCO: 14235 Name: DPM Date: 29-Aug-89:07:32:01 [Symptom] More error logging stuff. [Diagnosis] Yes. 
[Cure] Teach the monitor to write the following records as system error blocks: .ERCSC Configuration status change (memory on/off line) .ERKSN KS10 NXM trap .ERKPT KL10/KS10 parity trap .ERCSB CPU status block .ERDSB Device status block [Keywords] DAEMON ERROR LOGGING [Related MCOs] 13932, 13137 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 452 APRSER PRHMF7,DAELOG,MEMELG COMCON MEMONU,MEMON8 COMMON OLDNXM,DIACSB,DIADSB LOKCON MEMOFU,MEMOF2 704A [End of MCO 14235] MCO: 14236 Name: KBY Date: 29-Aug-89:08:27:46 [Symptom] FA resource scheduling leaves something to be desired. The scheduler knows how to wake up just the job that needs it, but everyone wakes up now any time it's given up. [Diagnosis] No code. [Cure] Add code (the remaining routines necessary to do the unwind properly). [Keywords] FA UNWIND [Related MCOs] None [Related SPRs] None [MCO status] Deferred [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 452 FILIO UPFA,DWNFA 704A COMMOD S CLOCK1 SRFREE [End of MCO 14236] MCO: 14237 Name: KBY Date: 29-Aug-89:08:34:37 [Symptom] Job stuck; SYSTAT shows it's locked (even though it really isn't). [Diagnosis] Due to the extra calls to SCDCHK to prevent KAFs in large PAGE. UUOs, we can potentially block in a PAGE. UUO. If pages were allocated to the job by CHGPGS (because they were available at the time), but during the block we decide to swap out the job, we could potentially lose those pages to never-never land since they are not in anyone's map. To prevent this, CHGPGS lights NSHF (but not NSWP) akin to MAPBAK so that the swapper won't touch the job. Unfortunately, if the job has a sharable high segment, someone else using it might call XPANDH (which can happen even without really wanting to expand the high seg as we tend to do this at the drop of a hat) and set JXPN for the job blocked at CHGPGS.
At this point the scheduler will not run the job because of JXPN and the swapper won't clear JXPN (even without swapping the job which may not be necessary) because of NSHF which won't get cleared until the job finishes running through CHGPGS (deadly embrace). [Cure] The scheduler will except jobs owning disk resources from the JXPN check. Do so also with jobs having NSHF on but not NSWP (a state only the monitor can cause in limited situations such as the above). [Keywords] JXPN NSHF [Related MCOs] 13932, 13137 [Related SPRs] 36245 [MCO status] Deferred [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 452 SCHED1 CJFRCX 704A [End of MCO 14237] MCO: 14238 Name: JC Date: 1-Sep-89:13:34:18 [Symptom] TOPS-10 is missing the TRANSLate command. [Diagnosis] No one ever put it in. [Cure] Add one. [Keywords] TRANSL LOGIN commands [Related MCOs] None [Related SPRs] None [MCO status] None [MCO attributes] New development MCO Documentation change [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 453 COMMON COMTAB [End of MCO 14238] MCO: 14240 Name: DPM Date: 5-Sep-89:05:41:06 [Symptom] More error logging stuff. [Diagnosis] Yes. [Cure] 1. Add support for .ERSNX (NXM sweep). 2. Add support for .ERSPR (parity sweep). 3. Turn on .ERWHY/.ERMRV logging. [Keywords] ERROR LOGGING [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] Beware file entry required [BEWARE text] DAEMON version 23A(1027) or later is required. Earlier versions will cause .ERMRV records to be written into ERROR.SYS instead of AVAIL.SYS. When this happens, SPEAR will report an unknown record type in ERROR.SYS. [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 453 ERRCON PARSWP,PARELG,NXMSWP,NXMELG,XFRSE2 S EX.NER SYSINI LLMSTR,AVLTBL 704A [End of MCO 14240] MCO: 14241 Name: DPM Date: 6-Sep-89:07:23:57 [Symptom] Stopcode OVA on a KS10 during SYSINI.
[Diagnosis] EVA pages overflow BOOT address space because the high segment grew a bit. [Cure] Slide the high segment origin down 2 pages. [Keywords] HIGH SEGMENT [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 453 COMMON MONORG 704A [End of MCO 14241] MCO: 14242 Name: KDO Date: 11-Sep-89:14:05:55 [Symptom] Unusable TTY DDBs. [Diagnosis] LATSER creates a TTY DDB for host-initiated connects, but INITIA uses a different one, causing LATSER's to float free. [Cure] If it hurts every time I do this, don't do it anymore. [Keywords] [Related MCOs] None [Related SPRs] None [MCO status] None [MCO attributes] KL10 only [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 454 LATSER GETTDB 704A [End of MCO 14242] MCO: 14243 Name: ERS Date: 12-Sep-89:08:18:55 [Symptom] Various. Lots of monitor too big and slow. Several places where a user mode section number is lost. And possible working-set confusion if a multi-section program had a PFH. (Probably wouldn't work anyway.) [Diagnosis] GETPC/PUTPC [Cure] Remove uses of GETPC/PUTPC. In some cases we simply put the same code in minus a couple JRSTs. In other places it gets a little more complicated. The DDT command should now include the section number in the one-word old PC in JOBDAT. Assume that an extended user is not in his PFH. (A bit of work would be involved in making an extended PFH work.) Rewrite DOINT. Net result is that we'll store the section number in the old PC portion of the interrupt block. [Keywords] GETPC User-mode Extended-addressing [Related MCOs] None [Related SPRs] None [MCO status] None [MCO attributes] Beware file entry required New development MCO Documentation change [BEWARE text] Some one-word PCs will now contain the section number where they did not in the past. In particular, commands like DDT should preserve the section number in .JBOPC.
Also, the old PC in the interrupt block should now contain the section number. [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 454 COMCON USAVE,SEGRLX ERRCON DOINT VMSER USRFL6,USRFL7,GETDDT,UPAGE4,PAGA1C [End of MCO 14243] MCO: 14244 Name: DPM Date: 18-Sep-89:06:31:17 [Symptom] If a logical name points to NUL, the FILOP returned filespec will not store the correct device name following a LOOKUP or ENTER. [Diagnosis] Oversight. The returned device name is the logical name. [Cure] Call LNMNUL and return NUL if appropriate. [Keywords] NUL [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 455 UUOCON FOPFI0 704A [End of MCO 14244] MCO: 14245 Name: DPM Date: 19-Sep-89:06:27:02 [Symptom] On a very slow system, IPCF sends to jobs which logged in via FRCLIN can get receiver quota exhausted errors. [Diagnosis] The receiver hasn't had the chance to pump up its IPCF quotas. This is most easily seen on a heavily loaded KS10. [Cure] Have LOGREF set the quotas to 511. [Keywords] IPCF QUOTAS [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 455 COMCON LOGRF2 704A [End of MCO 14245] MCO: 14246 Name: ERS Date: 19-Sep-89:07:57:56 [Symptom] Monitor too big and slow. [Diagnosis] Old code for GET.EXE. [Cure] Remove it. [Keywords] [Related MCOs] None [Related SPRs] None [MCO status] None [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 455 COMCON SGSET,UGTSEG [End of MCO 14246] MCO: 14247 Name: ERS Date: 19-Sep-89:08:07:26 [Symptom] GETPC/PUTPC, the second half. [Diagnosis] yes. [Cure] Yes.
[Keywords] GETPC PUTPC GETPCS byebye [Related MCOs] 14243 [Related SPRs] None [MCO status] None [MCO attributes] New development MCO Documentation change [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 455 ERRCON DOINT S GETPC,GETPCS,PUTPC CLOCK1 NOTACL,INCTM4,CIP9,STOP1H,SETPIT,SETPIU,USTART [End of MCO 14247] MCO: 14248 Name: RCB Date: 26-Sep-89:05:52:43 [Symptom] MCO 14231 revisited: Convert DAEMON reporting of KL error chunks from RSX20F to use system error blocks. This eliminates two words in the CDB, .CPETM and .CPEAD. [Diagnosis] yes. [Cure] yes. This also makes DTE. UUO function 20 (.DTERT) obsolete. [Keywords] KL error chunks system error blocks system messages [Related MCOs] 14231 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO Documentation change [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 456 DTEPRM 704A DTESER IPCSER COMMON [End of MCO 14248] MCO: 14249 Name: KDO Date: 26-Sep-89:07:27:50 [Symptom] LAT is slow to start. [Diagnosis] If the multicast message is sent before the Ethernet service routines have set the channel address, LAT servers will use the wrong Ethernet address when trying to connect to TOPS-10. [Cure] Delay the multicast message until after ETHSER does the Set-Channel-Address (NU.SCA) callback. [Keywords] [Related MCOs] None [Related SPRs] 36229 [MCO status] None [MCO attributes] KL10 only [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 456 LATSER CBRDSP,LATLSC,LATSCA 704A [End of MCO 14249] MCO: 14250 Name: DPM Date: 3-Oct-89:07:45:50 [Symptom] No way to cause IPA dumps to be written cleanly. [Diagnosis] DAEMON currently does this by using system error blocks; a method which is at best an ugly crock. [Cure] Invent a way to allow the monitor to run things at UUO level. This amounts to adding a forced .EXEC command which when performed on FRCLIN will create a job slot and run a specified routine at UUO level. 
At completion, the control transfers to JOBKL and the job will be destroyed. This will be used to write IPA dump files. This MCO however, only implements the necessary code to create the job. The actual dump stuff will happen in a later MCO. [Keywords] DAEMON ERROR LOGGING [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO Documentation change [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 457 COMCON LOGREF COMMON COMTAB CLOCK1 CIP2 SCNSER TTFCOM 704A [End of MCO 14250] MCO: 14253 Name: DPM Date: 17-Oct-89:07:24:25 [Symptom] The monitor has an annoying habit of dumping even if the system has been up for less than 5 minutes. This is contrary to previous behavior. [Diagnosis] While it may be a desirable thing to do under some circumstances, it isn't desirable in all cases. [Cure] Make it optional. In cases where the system crashes during the first 5 minutes of uptime, dump only if the symbol ATODMP is non-zero. By default, it will be set to 1. Sites which find this behavior disgusting can set it to 0. [Keywords] DUMP [Related MCOs] 13809 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 461 COMMON ATODMP MONBTS RLDMON 704A [End of MCO 14253] MCO: 14254 Name: DPM Date: 26-Oct-89:07:43:47 [Symptom] Occasionally, structures cannot be mounted after ATTACHing disk drives or after a newly formatted pack has been defined. [Diagnosis] The routine DSKDRV is responsible for setting up a UDB following an ATTACH. If errors occurred reading device registers, the unit status is set appropriately to reflect the error condition. However, if no errors occurred, DSKDRV assumes a pack must be mounted and changes the status to 'pack is mounted'. Later, when the STRUUO is done to define a structure, it will fail because the UDB claims a pack is already mounted. 
In the case of a newly formatted pack, ONCMOD neglects to set the unit state to 'no pack mounted' when the HOM blocks cannot be read. [Cure] Following an ATTACH, do not change the unit status unless there were errors. When HOM blocks cannot be read, set the unit status to 'no pack mounted'. [Keywords] ATTACH DISK DEFINE STRUCTURE [Related MCOs] None [Related SPRs] 36276 [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 462 FILIO DSKDR8 ONCMOD TRYHOM 704A [End of MCO 14254] MCO: 14255 Name: RCB Date: 26-Oct-89:11:38:09 [Symptom] Batch streams can hang forever when they use MIC. [Diagnosis] Scenario: MIC file enables the RESPONSE feature in order to trap error messages into a MIC variable (parameter). Some program it invokes types out an error message. MIC wants to get the entire error message into the response buffer, and not just a part of it, so it waits for the job to go to monitor level or to block in TIOWQ (TTY I/O wait) before it reads the text. To make sure the text is available for MIC to read, SCNSER refuses to allow output to happen until MIC's conditions are satisfied *and* MIC has read the response buffer. Thus, when the program that was invoked types out a reasonably short error message (so that it doesn't block in TO) and then loops in NAPQ waiting for the chunks to empty out before it decides how to type its next prompt, there is a deadlock. The program never satisfies the MIC conditions for getting the response buffer read, and thus output never happens, and thus the program is waiting for MIC waiting for the program waiting for MIC .... [Cure] Since the MIC RESPONSE buffer is only 21 octal words in length, and is ASCIZ, MIC will only ever see a maximum of 84 (decimal) characters of response text. In other words, it only expects to see one line. So, add a bit in the LDB, L1LEEL (end of error line, B6 in LDBBYT). 
This bit is twiddled during the same routine that notifies us of an error character. The code in XMTMIC which checks for whether to tell MIC that the response buffer is available will consider having L1LEEL set to be as good as being in TIOWQ. I.e., if we have gone back to the left margin since seeing the error character, we will tell MIC to do its thing. [Keywords] MIC under BATCH hung PTY [Related MCOs] None [Related SPRs] 36279 [MCO status] Checked [MCO attributes] Documentation change [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 462 SCNSER LDBBYT,MICLG3,MICPS4 704A [End of MCO 14255] MCO: 14258 Name: DPM Date: 14-Nov-89:06:09:54 [Symptom] IPA dump file writing facility appears as a wart on the error logging code. [Diagnosis] Way back when, the only way to get UUO-level work done was to get DAEMON to do some work for you. IPA dump files were processed through the error logging code by prodding DAEMON with a SPEAR record that was suppressed from ERROR.SYS. [Cure] Now that there's a way to make UUO-level things happen, teach the monitor to write the dump files and eliminate the need for DAEMON interaction. [Keywords] IPA DUMP [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 464 AUTCON AUTDMP CLOCK1 EXPJO1 COMDEV IPADMP FILUUO UNQIFL,UNQINI 704A [End of MCO 14258] MCO: 14259 Name: DPM Date: 14-Nov-89:06:50:39 [Symptom] Inaccessible code left over from efforts to clean up error logging code. [Diagnosis] Was just waiting 'til it was all over. [Cure] Remove DAEDIE, DAEDSJ, DAEEIM, DAEERR, DAERPT, and DAESJE. Also remove the interlock word, DAELOK. Shrinks CLOCK1 by 3 blocks.
[Keywords] ERROR LOGGING [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 464 CLOCK1 DAEDIE,DAEDSJ,DAEEIM,DAEERR,DAERPT,DAESJE,DAELOK 704A [End of MCO 14259] MCO: 14261 Name: DPM Date: 21-Nov-89:08:25:53 [Symptom] On KS10s, defining non-standard device parameters doesn't work. COMDEV gets assembly errors. [Diagnosis] The MDKS10 macro has a junk parameter filled in for the MASSBUS unit number. [Cure] Don't put out sixbit gibberish where a number is expected. [Keywords] MDKS10 [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 465 MONGEN MDT3A 704A [End of MCO 14261] MCO: 14262 Name: JEG/DPM Date: 28-Nov-89:04:51:45 [Symptom] Possible DI hangs or KAFs out of PCLDSK. [Diagnosis] When doing queued protocol for disks, if the primary port is offline, a lot of things can go wrong requeuing the I/O to another port. 1. References to UNIKON should be indexed by T1, not U. 2. References to UNIALT are OK only for CI disks. 3. Extra JUMPN to test the results from CPUOK. 4. Merely checking KDBCAM for non-zero value doesn't guarantee the other CPU(s) are running. [Cure] 1. Index UNIKON by T1. 2. Test for a CI disk. If so, use UNIALT. Use UNI2ND for all others. 3. Remove JUMPN. We wouldn't have gotten to PCLOFL if the initial call to CPUOK was successful. 4. Make a second call to CPUOK to test the new accessibility bits from the alternate or detached port.
[Keywords] DI HANG KAF [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 466 FILIO PCLOFL 704A [End of MCO 14262] MCO: 14263 Name: DPM Date: 28-Nov-89:06:45:40 [Symptom] With DAEMON no longer dependent upon the monitor version, the methods of determining what is the proper DAEMON version and who is a legal DAEMON no longer work. Also, there is still some lurking inaccessible code. [Diagnosis] Time for a change. [Cure] A SETUUO will be provided so DAEMON can set its job number in the monitor. It is function 53 (.STDAE). A corresponding GETTAB (%CNDJN, 212,,11) will read the job number back. The SIXBIT/DAEMON/ name and JACCT bit are no longer required. In fact, DAEMON has been removed from PRVTAB. Also, remove the ERRPT. UUO as the monitor no longer leaves data for DAEMON to scavenge by this method. The UUO, as well as GETTAB table entries %LDERT, %LDPT1, %LDPT2, %LDLTH, and %LDESZ are now obsolete. Stopcode IBI gets deleted along with the code at STOP1 to try to restart DAEMON after it halts. It can never be made to work. DAEMON version 24(1030) or later is required from now on. [Keywords] DAEMON [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] Beware file entry required Documentation change UUOSYM change [BEWARE text] DAEMON version 24(1030) or later is required with monitor load 466. [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 466 CLOCK1 COMCON COMMON COMMOD UUOCON UUOSYM 704A [End of MCO 14263] MCO: 14264 Name: DPM Date: 30-Nov-89:07:10:40 [Symptom] Using the MONGEN option to set non-standard device parameters, defining a printer to be upper case only has no effect. The monitor treats the printer as lower case. Manually turning off DVLPTL in the DEVCHR word of the DDB makes the problem disappear and the printer behave like an upper case only printer.
[Diagnosis] The routine AUTMDT scans MDTs for non-standard device parameters. If the device is specified (to MONGEN) using a device code OR a non-zero CPU number, then everything works as expected. However, if the customer defaults the device code AND the CPU number (or supplies CPU0), then a zero device specifier is inserted into the MDT. A zero word signals the end of the MDT. Therefore, AUTMDT will never scan the entire table and never find the customer specified parameters. Also, it is possible for AUTMDT to exit without returning the MDT data under some circumstances. [Cure] In MONGEN, set a bit in the device specifier word of the MDT entry which indicates the word is valid. Thus, CPU0 with a defaulted device code of zero will no longer look like the table terminator. Also ensure that the MDT data is always returned properly. [Keywords] MDT [Related MCOs] None [Related SPRs] 36282 [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 467 AUTCON AUTMD3,MDTDV1 DEVPRM MD.VAL MONGEN ASKRE5,MDT6,MDTTAB 704A [End of MCO 14264] MCO: 14265 Name: DPM Date: 4-Dec-89:08:22:27 [Symptom] Rewinds and skip file operations time out prematurely on 3600 foot magtapes. [Diagnosis] Hung timers are based on the amount of time needed to perform a given function on a 2400 foot magtape. The values fall short for 3600 foot reels. [Cure] Increase all hung timer values by one half. [Keywords] HUNG TIMERS [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 467 T78KON HNGTBL TCXKON HNGTBL TD2KON HNGTBL TM2KON HNGTBL TMXKON HNGTBL TS1KON HNGTBL TX1KON HNGTBL 704A [End of MCO 14265] MCO: 14266 Name: DPM Date: 5-Dec-89:06:31:35 [Symptom] No way for the old and new DAEMONs to tell which version ought to be run. [Diagnosis] %CNDAE returns 704, but both the old and new DAEMONs run under different flavors of 704. [Cure] Have %CNDAE return 705.
The new DAEMON will require this, but if it sees 704, it will run DAE704. [Keywords] DAEMON [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] Beware file entry required [BEWARE text] DAEMON which has run with earlier versions of 7.04 should be renamed to SYS:DAE704.EXE. DAEMON version 24 should be placed on SYS as DAEMON.EXE. If there is a chance that an earlier 7.04 monitor may occasionally be run, the new DAEMON should also be copied to SYS with the name DAE705.EXE. This will allow for the proper synchronization of DAEMONs with the monitor regardless of which version of 7.04 is run. %CNDAE is a GETTAB which allows DAEMON to synchronize with monitor versions. It is intended for use only by DAEMON. Other programs such as ACTLIB, LOGIN, REACT, and WHO have incorrectly used this GETTAB to return the monitor version where another, more appropriate GETTAB, %CNDVN, should have been used. The Digital programs have been changed to use %CNDVN. Sites should make similar changes to any user-written programs which may have used %CNDAE. [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 467 COMMON CNFDAE 704A [End of MCO 14266] MCO: 14267 Name: LWS Date: 11-Dec-89:17:28:27 [Symptom] Problems assigning/init'ing etc devices in monitors with no ANF support. [Diagnosis] AUTDDB does not make device names of the form DEVNNU when NN is 00. In this case it makes a name of the form DEVU, eg. LPT0 instead of LPT000. DVSTAS depends on the U of NNU being the last sixbit character in DEVNAM (bits 30-35) when searching for a DDB. GALAXY spoolers generate device names of the form DEV00U when ANF is not supported in the monitor. Using DEV00U as a device name in various UUOs fails because DEV00U will never match the device name in the DDB, which is DEVU. [Cure] Have AUTDDB always build device names of the form DEVNNU when DR.NET is lit. 
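The naming rule in the cure for MCO 14267 can be illustrated with a small sketch. This is not the AUTDDB code; the function names are hypothetical, and it assumes the node number NN is written as two octal digits and the unit U as one octal digit, which the MCO implies (LPT0 versus LPT000) but does not state outright.

```python
def old_device_name(generic: str, node: int, unit: int) -> str:
    """Behaviour described in the diagnosis: node 00 is dropped (DEVU)."""
    if node == 0:
        return f"{generic}{unit:o}"
    return f"{generic}{node:02o}{unit:o}"


def new_device_name(generic: str, node: int, unit: int) -> str:
    """Behaviour described in the cure: always DEVNNU, even when NN is 00.

    Assumption: node fits in two octal digits and unit in one.
    """
    if not (0 <= node <= 0o77 and 0 <= unit <= 0o7):
        raise ValueError("node must be 0..77 octal, unit 0..7")
    return f"{generic}{node:02o}{unit:o}"


if __name__ == "__main__":
    print(old_device_name("LPT", 0, 0))
    print(new_device_name("LPT", 0, 0))
```

With the old rule a spooler asking for LPT000 can never match a DDB named LPT0; with the new rule both sides agree on the six-character form.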
[Keywords] AUTOCONFIGURE [Related MCOs] None [Related SPRs] None [MCO status] None [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 470 AUTCON AUTDDB 704A [End of MCO 14267] MCO: 14268 Name: DPM Date: 12-Dec-89:07:02:45 [Symptom] More error logging stuff. 1. The monitor doesn't write DECtape records. 2. The definition of record type 75 is wrong. [Diagnosis] 1. SPEAR didn't use to understand DECtape records. Now it does. 2. Record type 75 claims it's only used for IPA20 dumps. Not so. [Cure] 1. Remove references to M.DTAE (introduced during 7.04 development) as normally turned on. This will cause the monitor to write DECtape error records. 2. Redefine record 75 to be a generic device dump record with the name .ESDVD (UUOSYM) and .ERDVD (S). SPEAR also understands this record now. The monitor still doesn't write this record, but it will soon. [Keywords] ERROR LOGGING [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] Documentation change UUOSYM change [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 470 COMDEV M.DTAE DTASER M.DTAE S .ERDVD UUOSYM .ERDVD 704A [End of MCO 14268] MCO: 14269 Name: DPM/RCB Date: 19-Dec-89:07:32:29 [Symptom] On multi-CPU systems, at system startup, one frequently sees varying CPU uptimes and/or undeserved CPUn not running warnings on the CTY. [Diagnosis] When the clocks are turned on, only the policy CPUs uptime counter is more or less accurate. Non-policy CPUs are looping in their AC loop waiting for the system to start. During this time, they take no interrupts and therefore never update their uptime or OK word. When the system starts timesharing, the uptime words are guaranteed to be skewed and sometimes the OK words are positive, causing the warnings on the CTY. [Cure] Prior to turning on the clocks, make all CPU's uptime words agree. Also fix the OK words to be properly negative. 
[Keywords] UPTIME [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 471 SYSINI TIMINI,NODDT 704A [End of MCO 14269] MCO: 14270 Name: DPM Date: 26-Dec-89:07:49:18 [Symptom] After the release of 7.04, SNOOPY fails with the error "? Undefined breakpoint symbol TM0IN1". [Diagnosis] The CPU dependent code for interval timer interrupts was removed as part of 7.04 development. Because of this change, a SNOOP. UUO cannot be used to patch the interval timer code without incurring excessive overhead in the job which is doing the snooping. (It must weed out all calls except those from the target CPU.) [Cure] Add a new SETUUO (.STITP==54) to allow a job to patch the interval timer. The job must have POKE privs, be [1,2], or running with JACCT set, and contiguously locked in EVM. The call is:
	MOVE	AC,[.STITP,,addr]
	SETUUO	AC,
	  no privs, bad arguments
	  success
  addr:	CPU mask
	instruction to XCT (relocated)
For this to work, two CDB locations have been added. .CPITP contains the instruction to execute and .CPITJ contains the job number which patched the interval timer code. When interrupts are processed, if .CPITP is non-zero, it will be executed. A suitably privileged job may set .CPITP if .CPITJ is zero or is already owned by the job executing the SETUUO. .CPITP may be cleared by supplying a zero for the instruction to execute. These words will be forcibly cleared when a job exits prematurely (ESTOP), control-C's out (STOP1) or does a RESET UUO. For the curious, two new GETTABs have been added. %CVITP and %CVITJ return .CPITP and .CPITJ respectively, although SNOOPY or any other performance measuring program should have no need to rely on these words.
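The ownership rule for .CPITP and .CPITJ described in the cure for MCO 14270 can be modelled as follows. This is a sketch only: the class and method names are hypothetical, the privilege checks are collapsed into one boolean, and it is an assumption (not stated in the MCO) that supplying a zero instruction also releases ownership in .CPITJ.

```python
from dataclasses import dataclass


@dataclass
class TimerPatchSlot:
    """Models the two CDB words for one CPU."""
    cpitp: int = 0  # .CPITP: instruction to execute, 0 means none
    cpitj: int = 0  # .CPITJ: job number that patched the timer, 0 means free

    def set_patch(self, job: int, instruction: int, privileged: bool) -> bool:
        """Return True for the success return, False for the error return."""
        if not privileged:
            return False                      # "no privs"
        if self.cpitj not in (0, job):
            return False                      # owned by another job
        self.cpitp = instruction
        # Assumption: a zero instruction frees the slot for other jobs.
        self.cpitj = job if instruction != 0 else 0
        return True

    def job_gone(self, job: int) -> None:
        """Forced clear on ESTOP, STOP1 (control-C) or RESET by the owner."""
        if self.cpitj == job:
            self.cpitp = 0
            self.cpitj = 0


if __name__ == "__main__":
    slot = TimerPatchSlot()
    print(slot.set_patch(job=5, instruction=0o254000, privileged=True))
    print(slot.set_patch(job=6, instruction=0o254001, privileged=True))
```

A second job is refused while the first still owns the slot, which is the property the two-word scheme exists to provide.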
[Keywords] SNOOPY [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] Documentation change UUOSYM change [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 472 APRSER TIMINT,SETITP,CLRITP CLOCK1 ESTOP1,STOP1 COMCON SETTBL COMMON .CPITP,.CPITJ UUOCON RESET UUOSYM %CVITP,%CVITJ,.STITP 704A [End of MCO 14270] MCO: 14271 Name: RCB Date: 30-Dec-89:06:36:16 [Symptom] Device errors and uptime statistics are getting lost. [Diagnosis] DAEMON is unreliable about finding crash dumps and reporting errors and AVAIL statistics from them. [Cure] Have the monitor do it. This adds module CRSINI to ERRCON.MAC. This also adds two new STOPCDs (both in CRSINI): CRSIAF, type INFO -- CRSINI allocation failure. CRSINI could not allocate an exec process block in order to run its UUO-level code. OLDMON, type INFO -- OLD monitor found in crash file CRSINI found that the crash file pointed to by BOOT was for an older monitor than it can process. [Keywords] DAEMON SPEAR AVAIL [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO Documentation change [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 472 SYSINI SYSINH,SYSRLD,SYSAVL,BOOTFX 704A COMMON CNFDAE S .ERCIN,.ERHSB,.EXHDR CLOCK1 CIP2,SETDJB ERRCON SEBTIM,XFRSEB,CRSINI [End of MCO 14271] MCO: 14272 Name: RCB Date: 9-Jan-90:07:08:37 [Symptom] Confusion results when DAEMON restarts before the monitor crashed and never seems to have been started after the reload. [Diagnosis] DAEMON writes an entry into ERROR.SYS before it reads the entries from the monitor. Since the monitor has read the crash file and has the stopcode information waiting for DAEMON to log, this results in the entries being written in the wrong order. Their timestamps are correct, but it still looks strange, leading to questions about just what's wrong. 
[Cure] Make the .STDAE SETUUO code which interlocks DAEMON startup handle queueing up a DAEMON-restarted entry into the system error blocks. If DAEMON also writes one, then the one which is out of order can safely be ignored. Eventually, DAEMON will no longer write such entries and the DAEMON restarts which SPEAR reports will always be synchronized properly with other reported events. Since this entry is only reported, and not used by any of the COMPUTE functions, it is safe to have extra or missing entries while advancing from one autopatch tape to the next. [Keywords] DAEMON restart SPEAR ERROR.SYS [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 473 CLOCK1 SETDJB 704A S .ERDPE [End of MCO 14272] MCO: 14273 Name: RCB Date: 9-Jan-90:07:16:36 [Symptom] Too hard for programs to call others via the CTX. UUO and tell whether the called programs succeeded after the UUO returns. It is usually necessary to modify the called program to pay attention to the CTX. data buffer interface in order to obtain the desired behavior in the face of errors. [Diagnosis] JOBDAT location .JBERR is not being handled properly. Errors which occur (and are counted) in the inferior context are lost. [Cure] Keep .JBERR updated in the superior context. The inferior will start with zero in the word, and anything in the word at context deletion (POP) time will be added into the superior's count. [Keywords] Contexts .JBERR [Related MCOs] 11102 [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO Documentation change [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 473 CTXSER CSRTAB,CTXPOP 704A [End of MCO 14273] MCO: 14274 Name: RCB Date: 9-Jan-90:07:26:43 [Symptom] It's the first full week of a new year, and we're almost out of load numbers. [Diagnosis] yes. [Cure] Recycle the load numbers but bump the minor version number. 
[Keywords] Version control [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] HOSS attention [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 410 COMMON A00DLN,A00SVN 704B [End of MCO 14274] MCO: 14275 Name: DPM Date: 23-Jan-90:05:59:17 [Symptom] Methods for recording AVAIL statistics are too complex. [Diagnosis] DAEMON maintains AVAIL.SYS which is nearly an image of ERROR.SYS. SPEAR will happily extract AVAIL data from ERROR.SYS if it is told to do so. [Cure] Remove references to the AVAIL bits in the system error block logging code. SPEAR %2(1152) will default to reading ERROR.SYS for AVAIL (compute) data. DAEMON %24(1032) will no longer write AVAIL.SYS files. [Keywords] AVAIL.SYS [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO Documentation change [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 411 APRSER MELTBL COMCON DTCTBL CPNSER CPATBL,CPDTBL ERRCON XFRSE2 FILIO CSCBEG NETSER NOFTBL,NONTBL S EH.AVL,EH.NER,EX.AVL,EX.NER SYSINI AVLTBL 704B [End of MCO 14275] MCO: 14276 Name: DPM/RCB Date: 23-Jan-90:07:14:18 [Symptom] Network ERROR.SYS entries written out of sequence. [Diagnosis] The DAEMON UUO function to append a record to ERROR.SYS is handled asynchronously to the queueing of system error blocks. [Cure] Teach the monitor to intercept this DAEMON function and put the data into a system error block. [Keywords] ERROR.SYS [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 411 UUOCON CALDAE 704B [End of MCO 14276] MCO: 14277 Name: RCB Date: 30-Jan-90:05:32:50 [Symptom] INITIA fails to identify terminals on dataset lines. In particular, it fails on a reverse-LAT connection to a KS-10. 
[Diagnosis] When SCNSER is told by the device driver that a local dataset has raised carrier, it sets the 'blind' flag in the dataset control table (DSCTAB) to allow for possible junk characters coming while the line state settles. This is mostly an artifact of history, since it was done for acoustical couplers. However, it causes any incoming characters in the first 1-2 seconds after carrier is seen to be ignored. If the baud rate is high enough, and the system is otherwise lightly loaded, INITIA will have started its escape sequence handling by then. The first characters of the response will be thrown away, but not all of them, and INITIA will declare the line type to be unknown. [Cure] Refuse to transmit characters to a local dataset until the 'blind' flag (DSCBLI) is clear. After all, if we're worried about acoustical couplers, we shouldn't send any data until the handset is in the cradle so that the characters will appear on the terminal. Anyway, INITIA already waits for the data to leave the TTY chunks before it starts its timers, so this will work for keeping INITIA happy. [Keywords] Reverse-LAT Datasets Terminal type interrogation [Related MCOs] None [Related SPRs] 36235 [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 412 SCNSER XMTCHR 704B DZINT DZQADD [End of MCO 14277] MCO: 14278 Name: DPM/RCB Date: 30-Jan-90:06:45:30 [Symptom] Stopcode KNIKSP at system startup. [Diagnosis] Our efforts to make the system uptime and CPU OK words correct worked well. So well in fact that it caused undeserved KNIKSPs at system startup. The code to maintain the elapsed time during SYSINI (and timesharing for that matter) assumes that even though the PIs get turned off occasionally, we never miss more than one tick. Not so. [Cure] Do a RDTIME and compute the number of ticks which have elapsed since the last interrupt. Use this number to keep APRTIM accurate.
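The arithmetic behind the cure for MCO 14278 can be sketched as below. The function name and the `units_per_tick` parameter are hypothetical; the MCO does not give the RDTIME resolution or the tick length, and carrying the remainder forward in the base is a design choice of this sketch rather than something the MCO specifies.

```python
def elapsed_ticks(rdtime_now: int, rdtime_base: int, units_per_tick: int):
    """Return (whole ticks since the last interrupt, updated base).

    rdtime_now and rdtime_base are time-base readings in the same units.
    The base advances only by whole ticks, so a partial tick is not lost.
    """
    if units_per_tick <= 0:
        raise ValueError("units_per_tick must be positive")
    delta = rdtime_now - rdtime_base
    ticks = delta // units_per_tick
    new_base = rdtime_base + ticks * units_per_tick
    return ticks, new_base


if __name__ == "__main__":
    # Three and a bit ticks passed while interrupts were off.
    ticks, base = elapsed_ticks(1000 + 3 * 16667 + 5, 1000, 16667)
    print(ticks, base)
```

Counting from the time base instead of adding one per interrupt is what removes the "never miss more than one tick" assumption named in the diagnosis.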
[Keywords] KNIKSP APRTIM [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 412 SYSINI ONCINT,ONCIN0,APRCHK 704B [End of MCO 14278] MCO: 14280 Name: DPM Date: 6-Feb-90:06:53:58 [Symptom] 1. Stopcode KNIKSP during system initialization (revisited). 2. Stopcode KAF following a series of KNIKSPs under timesharing. 3. LAT boxes drop links to the -10 following a KNIKSP. [Diagnosis] 1. Like all code which tries to prevent KAFs, KNISER looks at .CPTMF to determine if we're about to KAF. During system initialization, this variable is counted up, but DTEs are never serviced. Hence, it always appears as if a KAF is about to happen and KNISER shuts down the NIA20. One cannot simply service the DTEs during SYSINI because much of the time, the clock channel is turned off. 2. While the monitor keeps on running even though many KNIKSPs have occurred, repeated attempts to type out stopcode text can cause a KAF. This is an aspect of how DIE types on the CTY. 3. LAT boxes expect host systems to keep in nearly constant contact. The NIA20 is shut down for one second after a KNIKSP, and this interval is just over that limit that LAT boxes will allow before breaking contact. There's no way to tell the LAT box the -10 is going away but will be back shortly. [Cure] 1. Have the once-a-tick code service DTEs during system initialization. Most of the time, no work is done except to reset .CPTMF as primary protocol is not started until late in SYSINI. But, this is enough to keep KNISER happy. Also, in the interest of keeping common code between SYSINI and APRSER/CLOCK1, use the same scheme for maintaining elapsed time under timesharing. Rely on the meters to compute the time since the last clock interrupt. This maintains clock accuracy over times when the PI system is shut off. This adds two words to the CDB.
.CPRTM is a double-word quantity that holds the RDTIME base at the last interrupt. 2. Only report the first KNIKSP in the interval set in KNIOVC. By default, this location contains a 5, which means only 1 KNIKSP every 5 seconds will appear on the CTY, but the NIA20 will still be shut down when .CPTMF is nearing a critical threshold. 3. Reduce the time the NIA20 is shut down from one second to a half second. This interval is stored in KNIZTM and can be patched on the fly if a more appropriate value is needed. KNIZTM contains the "sleep" time in ticks. [Keywords] KNIKSP CLOCKS [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO Documentation change [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 413 APRSER APRTIM,CLKDDR,DISKAL,TIMIN2,TIMIN4 CLOCK1 APRSU2,APRSU4,INITIC COMMON .CPRTM,SPRI5A DTESER STAPPC,STXPP1 KNISER .PBOVF,KNIOVC,KNIPAU,KNIRQ1,KNISEC,KNIZTM ONCMOD ONCBN1 SYSINI APRCHK,GETOPT,HAVTM5,ONCIN0,SYSIN1,TACINI 704A [End of MCO 14280] MCO: 14281 Name: DPM Date: 13-Feb-90:05:52:48 [Symptom] Stopcode EUE on a KS10 doing MTAPEs. [Diagnosis] IRBIVA in the IORB is filled in with either a zero or a callback address to be processed at interrupt completion. Under extended addressing monitors, we take care to make sure that the left half of the address contains no junk (flags) from the MTAPE dispatch table. We neglect to do the same for KS10 monitors. [Cure] Clear junk in left half word. [Keywords] MTAPE [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] Single-section monitors only [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 414 TAPUUO MTAPG2 704B [End of MCO 14281] MCO: 14282 Name: RCB/DPM Date: 13-Feb-90:06:36:04 [Symptom] UIL stopcode when an SA10 controller with no attached devices thinks it found a NXM when trying to access its low core logout area. [Diagnosis] Code not defensive for this case. [Cure] Yes. 
[Keywords] SA10 Channel NXM [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 414 SAXSER SAXINT,SAXIN1 704B [End of MCO 14282] MCO: 14283 Name: DPM/RCB Date: 13-Feb-90:06:51:30 [Symptom] Too hard to look at paging crashes. [Diagnosis] No GETTABs available to point to the start of the various queues and tables. [Cure] Add some:
	%VMPTB==44,,113 ;ADDRESS OF PAGTAB
	%VMPT2==45,,113 ;ADDRESS OF PT2TAB
	%VMMTB==46,,113 ;ADDRESS OF MEMTAB
	%VMEVM==47,,113 ;AOBJN POINTER TO EVM BITMAP
	%VMPTR==50,,113 ;POINTER TO FREE PAGES (PAGPTR)
	%VMINQ==51,,113 ;HEADER OF THE "IN" QUEUE
	%VMINC==52,,113 ;COUNT OF PAGES IN THE "IN" QUEUE
	%VMSNQ==53,,113 ;HEADER OF THE SLOW IN QUEUE
	%VMSNC==54,,113 ;COUNT OF PAGES IN THE SLOW "IN" QUEUE
	%VMIPQ==55,,113 ;HEADER OF THE IN-PROGRESS PAGING QUEUE
	%VMIPC==56,,113 ;COUNT OF PAGES IN THE IN-PROGRESS QUEUE
	%VMOUQ==57,,113 ;HEADER OF THE "OUT" PAGING QUEUE
	%VMOUC==60,,113 ;COUNT OF PAGES IN THE "OUT" QUEUE
	%VMLPT==61,,113 ;HEADER OF THE QUEUE OF LOCKING PAGES
	%VMLPC==62,,113 ;NUMBER OF PAGES IN THE LOCK QUEUE
	%VMLCT==63,,113 ;NUMBER OF AVAILABLE PAGES ACCOUNTING FOR %VMLPC
[Keywords] PAGING [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO Documentation change UUOSYM change [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 414 COMMON .GTVM UUOSYM 704B [End of MCO 14283] MCO: 14284 Name: DPM/RCB Date: 13-Feb-90:07:20:41 [Symptom] DECnet stopcode ROUCGV at system start in routing monitors. [Diagnosis] When DECnet initializes, all events are logged by default. NML hasn't started up yet, so there's no way to suppress logging of anything. [Cure] Change the default from logging all events to logging none. This is consistent with the way VMS works. It does require, however, that customers make a change to NCP.CMD to selectively enable logging events.
[Keywords] DECnet events [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] Beware file entry required [BEWARE text] Customers wishing to have DECnet events logged must change their NCP.CMD file to include the appropriate SET LOGGING FILE EVENT command for the event types they are interested in. [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 414 NTMAN NMXFIL 704B [End of MCO 14284] MCO: 14285 Name: DPM Date: 20-Feb-90:08:35:03 [Symptom] Stopcode KNIKSP. KLNI overloaded with incoming packets. [Diagnosis] Networks are getting busier every day. [Cure] Turn on hardware multi-cast filtering in the NIA20. Note that this works only for single CPU monitors. Don't fully understand the problems with SMP but they are definitely in DECnet. [Keywords] MULTI-CAST [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 415 ETHSER KNISER COMMON 704B [End of MCO 14285] MCO: 14287 Name: DPM Date: 27-Feb-90:05:22:15 [Symptom] If a CPU has internal memory but some external channels, the monitor will happily try to use the external channel. [Diagnosis] No one ever bothered to check for this case. [Cure] At system startup, have AUTCON make note of the fact that internal memory is in use. Then, when about to build a channel data block, check for internal memory. If it is in use, don't build the channel data block. Note: Edit 100 to BOOT fixes an identical problem. However, the monitor is not dependent upon the BOOT change or vice versa. 
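The two-step cure for MCO 14287 (record a fact at startup, consult it before building anything) can be sketched as follows. This is a minimal Python model, not monitor code: the class `AutoConfig`, its methods, and the dictionary used for a channel data block are all hypothetical names introduced for illustration.

```python
class AutoConfig:
    """Hypothetical stand-in for the per-CPU autoconfiguration state."""

    def __init__(self):
        self.internal_memory_in_use = False
        self.channel_data_blocks = []

    def note_memory(self, has_internal_memory):
        """Startup step: remember whether internal memory is in use."""
        self.internal_memory_in_use = has_internal_memory

    def build_channel_data_block(self, channel):
        """Build a block for an external channel, unless internal memory is in use."""
        if self.internal_memory_in_use:
            # The check the MCO adds: no block means the channel is never used.
            return None
        block = {"channel": channel}   # placeholder for the real data block
        self.channel_data_blocks.append(block)
        return block
```

With the flag set by `note_memory(True)`, every later call to `build_channel_data_block` returns `None`, so nothing downstream can find an external channel to use.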
[Keywords] EXTERNAL CHANNELS, INTERNAL MEMORY [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 416 AUTCON AUTCHN,AUTCPU,AUTMEM 704B [End of MCO 14287] MCO: 14288 Name: DPM Date: 27-Feb-90:05:34:47 [Symptom] When a continuable stopcode occurs, the time of day is off by upwards of 30 seconds when the system continues. [Diagnosis] Previous to MCO 14xxx, BOOT was called with the PI system turned off. Therefore, there were no timer interrupts to count and time of day accuracy was lost. After said MCO, the monitor relied on the meters to maintain accurate time. But the microcode updates the mega-ticks in the EPT. When BOOT is called, the EPT is switched and the counters updated in the wrong place. So, APRTIM can't maintain time of day accuracy even though it ought to work. [Cure] Pick up the incremental RDTIME values left by BOOT in its vector and adjust .CPRTM appropriately. [Keywords] TIME OF DAY [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 416 MONBTS BTCALL 704B [End of MCO 14288] MCO: 14289 Name: DPM Date: 6-Mar-90:08:34:44 [Symptom] Problems when multiple jobs try to change the policy CPU at the same time. [Diagnosis] Prior to changing the policy CPU, all other CPUs are forced to jump into their ACs. A check is made to be sure this has happened before proceeding and if not, a call to DELAY1 will be done. DELAY1 causes the job to be rescheduled which could allow another job to execute the same code. [Cure] Don't call DELAY1. Instead wait with a JRST .-1. [Keywords] SET POLICY [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 417 CPNSER SETPCP 704B [End of MCO 14289] MCO: 14291 Name: ERS Date: 26-Mar-90:17:54:42 [Symptom] Add/Remove CPU doesn't.
(1 of 3) [Diagnosis] ZAPDSK doesn't do a complete job. It does not check for active DRBs. [Cure] Check and delete DRBs for the job being zapped. [Keywords] ZAPDSK [Related MCOs] None [Related SPRs] None [MCO status] None [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 420 FILIO ZAPDSK [End of MCO 14291] MCO: 14292 Name: KDO Date: 28-Mar-90:12:15:22 [Symptom] 1. UIL stopcodes 2. Undeserved MOPIFC (INF) stopcodes 3. LLMOP UUO only useful for OPERATOR jobs. [Diagnosis] 1. Zeroes in a dispatch table. 2. Too many undefined MOP functions. 3. Call to PRVJ instead of PRVBIT. [Cure] 1. Avoid jumps to location zero. 2. Support more MOP function codes. 3. Call PRVBIT. [Keywords] [Related MCOs] None [Related SPRs] None [MCO status] None [MCO attributes] Documentation change KL10 only [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 421 LLMOP RCSFCD,ULLMOP [End of MCO 14292] MCO: 14293 Name: KDO Date: 28-Mar-90:12:17:18 [Symptom] Monitor too big. [Diagnosis] LLMOP must be loaded. [Cure] Provide an unsupported feature test switch (FTEMOP) to turn off LLMOP. [Keywords] [Related MCOs] None [Related SPRs] None [MCO status] None [MCO attributes] Documentation change KL10 only [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 421 MONGEN COMDEV ULLMOP,LLMINI,LLMMIN,LLMPSI [End of MCO 14293] MCO: 14294 Name: ERS Date: 3-Apr-90:08:59:37 [Symptom] Add/Remove CPU doesn't. (2 of 3) [Diagnosis] ZAPDSK must change its mapping to touch other jobs' DDBs. However, if we start at UUO level we'll come back with the wrong stack. Life is a serious downer after this. [Cure] If we're at UUO level make sure we switch to the NULL job stack before we change the mappings. Then switch back to the proper stack later.
[Keywords] ZAPDSK ADD/REMOVE CPU [Related MCOs] None [Related SPRs] None [MCO status] None [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 421 FILIO ZAPDSK [End of MCO 14294] MCO: 14295 Name: DPM Date: 10-Apr-90:05:16:04 [Symptom] If a DECnet node comes online as a non-router, then switches to a routing node, our Phase IV implementation cannot handle this change and issues a ROUNAV stopcode. ROUNAV means there is no adjacency vector for the node which now claims to be a router. This is a fairly common occurrence these days, as DECnet router boxes behave in this manner. When this sequence of events occurs, all further communications with the offending node are impossible. If the node was the area router, the only recourse is to reload the -10. [Diagnosis] When the adjacency block is built, the vector is not filled in because the node is a non-router and cannot possibly use the vector. Despite this, core is allocated for the vector (in a perverted sort of way), but it is never used. [Cure] Remember the vector address whenever the adjacency block is built. If the node is a router, also fill in the working copy of the vector address. Later, when a node changes its state to a router, if the adjacency vector pointer is zero, pick up the saved copy of the vector and use it. For the paranoid, if the copy is zero, then issue the ROUNAV. [Keywords] ROUNAV [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 422 D36PAR AJRVC ROUTER RTRBAV,RTRMAJ 704B [End of MCO 14295] MCO: 14296 Name: RCB Date: 10-Apr-90:07:43:46 [Symptom] Half-implemented feature of supporting ISO/Latin-1 in parallel with DEC/MCS doesn't really work. [Diagnosis] No code to support the ISO fallback mappings in the CHTRN. UUO. As a result, if someone with an 8-bit username logs in on a terminal which supports the ISO character set, LOGIN will convert the name to junk.
[Cure] Add the code. This adds bit CH.ISO to the possibilities for the flags halfword in the UUO. If set, the fallback mapping will be that of ISO Latin Alphabet number 1 (ISO 8859-1). If clear, DEC/MCS will continue to be used. [Keywords] 8-bit ASCII ISO Latin-1 [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO Documentation change UUOSYM change [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 422 SCNSER 704B UUOSYM CHTRN. [End of MCO 14296] MCO: 14297 Name: RCB/JAD Date: 10-Apr-90:07:51:45 [Symptom] The SNOOP. UUO doesn't work when trying to analyze code in non-zero sections. [Diagnosis] The code is too stupid to handle NZS references, even though the UUO argument block likes it fine. [Cure] Enlighten the code. [Keywords] SNOOP. UUO Multi-section code [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 422 UUOCON 704B [End of MCO 14297] MCO: 14298 Name: RCB Date: 10-Apr-90:07:56:34 [Symptom] Per-CPU GETTABs for non-existent CPUs can appear to succeed when they shouldn't. [Diagnosis] The extra tables are present in the monitor, but point to zeros. [Cure] Fix them to point to NULGTB instead, so that the GETTAB UUO will know better than to return junk. [Keywords] [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 422 COMMON 704B [End of MCO 14298] MCO: 14299 Name: RCB Date: 10-Apr-90:07:59:14 [Symptom] Monitor too slow. Commands (at least) can be delayed longer than they ought to be. [Diagnosis] SIMCHK tries to check for a monitor PC at UUODON, but only gets it right when the UUO's return is in section zero. [Cure] Yes. 
[Keywords] [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] Extended addressing only [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 422 CLOCK1 SIMCHK 704B [End of MCO 14299] MCO: 14300 Name: RCB Date: 10-Apr-90:08:04:29 [Symptom] User-mode diagnostics can flood the monitor with error packets faster than DAEMON can process them. We run out of freecore. [Diagnosis] The diagnostic is trying to be safe by clearing its user-I/O bit whenever it doesn't need it for a while, and then doing a TRPSET to get it set again. Each TRPSET UUO makes an error entry. [Cure] Add bit UP.TUR (TRPSET UUO reported) to .USBTS, and use it to keep track of whether we've reported any recent TRPSET UUOs. If we report a TRPSET, we set this bit. If a TRPSET UUO is processed, and the bit is already set, and the AC is zero, then we'll skip logging that call. A RESET UUO will clear the bit. [Keywords] User-mode diagnostics [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 422 UUOCON TRPSTU 704B S UP.TUR [End of MCO 14300] MCO: 14301 Name: RCB Date: 11-Apr-90:16:47:09 [Symptom] KAF on one CPU in a multi-CPU system can migrate to each of them. [Diagnosis] When BECOM0 is trying to send the message to the new CTY to inform the operator that a new CPU has taken over policy, it violates the rules with respect to the SCNSER interlock. This results in a nested attempt to obtain the interlock during BECOM0, which results in another KAF, this time on the new policy CPU. The problem then repeats until we run out of CPUs. [Cure] Don't do typeout at APR PI level. Make a clock queue entry to send the message at PI 7, in (new) routine BECOM7 in CPNSER.
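The cure for MCO 14301 amounts to deferring work from interrupt level to a queue that is drained later, at a level where taking the interlock is safe. The Python sketch below models only that pattern; it is not monitor code. The names `Interlock`, `clock_queue`, `become_policy_cpu`, and `run_clock_queue`, the message text, and the modelling of the interlock as a non-reentrant flag are all assumptions made for illustration.

```python
from collections import deque


class Interlock:
    """Hypothetical non-reentrant interlock: a nested acquire is an error."""

    def __init__(self):
        self.held = False

    def acquire(self):
        if self.held:
            # A real CPU would spin here until it was declared dead;
            # the model raises so the failure is visible.
            raise RuntimeError("nested interlock attempt")
        self.held = True

    def release(self):
        self.held = False


scn_interlock = Interlock()   # stands in for the terminal-service interlock
clock_queue = deque()         # stands in for queued clock requests
cty_output = []               # what would have been typed on the CTY


def type_on_cty(text):
    """Type a message; needs the interlock, so unsafe at interrupt level."""
    scn_interlock.acquire()
    try:
        cty_output.append(text)
    finally:
        scn_interlock.release()


def become_policy_cpu(cpu):
    """Interrupt-level path: queue the announcement instead of typing it."""
    clock_queue.append(lambda: type_on_cty(f"CPU{cpu} is now the policy CPU"))


def run_clock_queue():
    """Low-priority path: drain queued requests once the interlock is free."""
    while clock_queue:
        request = clock_queue.popleft()
        request()
```

If the interrupted code already holds the interlock when `become_policy_cpu` runs, nothing is typed and no nested acquire occurs; the message appears only when `run_clock_queue` is called after the interlock has been released.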
[Keywords] Simultaneous KAFs [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] Multi CPU only [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 423 CPNSER BECOM0 704B [End of MCO 14301] MCO: 14302 Name: RCB Date: 12-Apr-90:02:29:58 [Symptom] MCO 14956 incomplete. [Diagnosis] Some lurking (old) bugs were made easier to exercise. Bad conversions happen for certain characters under the right(?) combinations of bit selections. [Cure] Always update the character attribute bits we're checking after we've changed our notion of the current character. [Keywords] 8-bit ASCII CHTRN. UUO [Related MCOs] 14296 [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 423 SCNSER CHTRN8,CHTRN2 704B [End of MCO 14302] MCO: 14303 Name: ERS Date: 17-Apr-90:05:58:23 [Symptom] Job hung. [Diagnosis] The job in question is the current job on a CPU that fails. After the CPU fails the job may still be the "current job" on that CPU. [Cure] When a CPU dies gracefully, have it submit a request to have its current job informed of its unfortunate state and left in a more friendly state. [Keywords] Hung job CPU stopcodes [Related MCOs] None [Related SPRs] None [MCO status] None [MCO attributes] Multi CPU only [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 433 ERRCON ZAPZAP [End of MCO 14303] MCO: 14305 Name: RCB Date: 17-Apr-90:08:44:31 [Symptom] Stopcodes PFL and IME seen, others possible. [Diagnosis] When doing a MERGE or GETSEG in a core image where section 0 is full, we find a page in section 0 to 'hide' for a while so that the save/get code can have its directory page. Later, we're supposed to put it back the way we found it. However, the MERGE and GETSEG cases clobber the register in which its location is remembered. This causes us to use a bogus number later as a disk address or as a physical page number.
In any case, if an error is encountered, we will simply lose the page forever. [Cure] Add a word to the UPT, .USSDP, and use it instead of P4 when trying to restore the user's saved page. Clear it when we're done with it, just for the sake of paranoia. Make GTSAVP a little more robust, and then call it from SGRELE if it looks necessary. [Keywords] PFL IME KAF corrupted core image [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] Documentation change [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 423 S .UPPAT 704B COMCON SAVFIN,RELDR6,RELDR1,SAV3,GTSAVP,SEGRLX SEGCON GETFI6 [End of MCO 14305] MCO: 14306 Name: DPM Date: 3-Jul-90:08:33:24 [Symptom] New: Define a new HOM block bit (HOMHWP==1B34) which indicates the structure must be hardware write protected before it can be mounted. This is most useful for archive disks. Add a new question to the CHANGE STRUCTURE command: Always mount structure hardware write protected (NO,YES) The default answer is the current setting, normally NO. PULSAR version 5(544) or later respects this bit. [Diagnosis] [Cure] [Keywords] ARCHIVE,HOMHWP [Related MCOs] None [Related SPRs] None [MCO status] None [MCO attributes] Documentation change [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 424 COMMOD HOMHWP,STRHWP ONCMOD TRYHO4 704B [End of MCO 14306] MCO: 14307 Name: RCB/LWS Date: 10-Jul-90:08:38:57 [Symptom] Jobs using MDA-controlled tapes hang in event wait for the labeller process. Others possible. [Diagnosis] Certain UUOs, especially IPCFM., sometimes call GETWRD or a similar routine with a JCH in J. While the GETWRD series of routines has code to accept this, the case of an error and a subsequent call to MONPFH fail on this case. The end result is that a UUO will fail with an undeserved addressing error. In the particular case reported from the field, PULSAR got an error in an IPCFM. UUO to read the system PID index for a given PID.
This caused PULSAR to ignore the labeller request from [SYSTEM]IPCC since PULSAR believed that the IPCF message was not from a trusted process. Thus, the job hung waiting for PULSAR. [Cure] Fix the GETWRD routines to handle JCHs in J during page faults. Don't invoke the PFH with anything but a job number in J. [Keywords] JCH in GETWRD MONPFH [Related MCOs] None [Related SPRs] 36293 [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 425 DATMAN GETWRY,PUTWRZ 704B MONPFH PFHGWF [End of MCO 14307] MCO: 14308 Name: RCB Date: 10-Jul-90:10:38:28 [Symptom] Stopcode NPJ while migrating swapping space. [Diagnosis] While scanning jobs to try to remove all pages from the unit being taken down, CHKMIG scans from job number 1 up through HGHJOB. If there is a gap in the list of assigned job numbers, and if the last job before that gap was in core (swapped in) but virtual, then PFHMIG will be called for the unassigned job number. [Cure] Don't try to migrate nonexistent jobs. Test JNA in JBTSTS for each job we start to process at CHKMI1. [Keywords] Migration Swapper PFN [Related MCOs] None [Related SPRs] 36298 [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 425 SCHED1 CHKMI1 704B [End of MCO 14308] MCO: 14309 Name: ERS Date: 4-Sep-90:17:15:13 [Symptom] APRENB will sometimes give incorrect results. [Diagnosis] If the user tries to write to the high-seg the PC returned in .JBTPC is incorrect. In this case we have stomped on T1 during our call to USRFLT. [Cure] Restore the double word PC after the call to USRFLT.
[Keywords] APRENB USRFLT Double-word PC [Related MCOs] None [Related SPRs] 36302 [MCO status] None [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 426 APRSER SUILM1 [End of MCO 14309] MCO: 14310 Name: DPM/RCB Date: 28-Sep-90:04:29:56 [Symptom] There is the possibility of data corruption or physical disk damage on RA80s and RA81s when the heads are not moved for some "long" period of time. [Diagnosis] It seems that tiny foreign particles inside the HDA can become magnetized over time. If the heads are stationary long enough those magnetized particles can become attached to the surface of a disk and may actually write the disk. It is not clear if this can result in permanent damage to the HDA or simply destroy the data that was on it. [Cure] Read a random block every 20 minutes. This change utilizes the already existing code to read HOM blocks, except that a random block number is substituted. FILIO will ensure that each read is not too "close" to the previous. If the newly selected random block is within 2% of the size of the disk (or 16704 blocks for an RA81), then another block is picked. The time interval is skewed by 1 second across drives to minimize I/O collisions. The UDB contains 2 new words: UNIRBT contains the initial time interval in the left half word and the running timer in the right half word. The granularity is in seconds. UNIRBN contains the last selected random block number. [Keywords] RA DISKS [Related MCOs] None [Related SPRs] None [MCO status] None [MCO attributes] New development MCO Field service attention [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 426 DEVPRM UNIRBT,UNIRBN FILIO CHKRRB,FINRRB,SETID3 RAXKON RAXSEC,RAXUP6 704B [End of MCO 14310] MCO: 14311 Name: DPM Date: 28-Sep-90:04:52:47 [Symptom] Stopcode IME when some DX20 tape errors are encountered.
[Diagnosis] If a microprocessor error occurs the code to read and log the DX20 registers for error analysis always assumes there must have been some outstanding I/O request. It loads T1 up with a "saved" IORB address to check if the request was for data or positioning. If no IORB existed, then T1 contains a zero and the subsequent tests generate an illegal indirect page fail. If there is a microprocessor error, the likelihood of it happening without an IORB is pretty high. [Cure] If there is not a valid IORB, don't check for a data request. [Keywords] DX20 TAPE ERRORS [Related MCOs] None [Related SPRs] None [MCO status] None [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 426 TD2KON RDIRE1 704B [End of MCO 14311] MCO: 14312 Name: KBY Date: 1-Oct-90:07:34:23 [Symptom] Stopcode KAF before MCO 14252 (still possible but less likely); monitors with lots of big (are there any other kind?) virtual jobs run way too slow. The OUT queue in most monitors is exceptionally long, which in itself isn't all that bad, except most of the pages can never be reclaimed. [Diagnosis] It turns out there are actually two problems. The first is that when a job deletes a page which is on the OUT queue, the page doesn't go to the free core list then; we just zero the MEMTAB entry so the page can't be found again (locally speaking, this is faster, easier, and less complicated than returning the page, particularly when the code was implemented). Although this is OK for small numbers of pages, a large (is there any other kind?) virtual job which CORE 0's can leave a lot of pages stranded there this way. It is true we will reclaim the pages if we need memory, but if we don't (because the customer is just paranoid in setting low physical limits and never believed what we told him about VM), those virtual jobs which remain have to wade through a particularly long OUT queue for any page with a disk address.
The second problem is that if we "page out" a page which was write-locked by the monitor and already on the swapping space, we move the page to the OUT queue and write the disk address in the map, but zero the MEMTAB entry so that we can never find the page on the OUT queue anyway. This is the biggest generator of "useless" pages on the OUT queue on 1026. In a previous dump from Rohm, there were approximately 4400 pages on the OUT queue, only 230 of which were reclaimable (the dump was a "legitimate" KAF wading through the queue a number of times by all appearances). On 1026, it's more like 1300 pages without this MCO vs. ~150 with. [Cure] 1. Return pages to the free list when deleting them from the OUT queue. This was hard before LKPSF, but now the hard work is already done by locating the page on the OUT queue in the first place. 2. Set MT.JOB and P2.VPN (the latter for the courtesy of SPY programs) when "paging out" a write-locked page. [Keywords] KAF CORE 0 [Related MCOs] 14252 [Related SPRs] None [MCO status] None [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 426 VMSER PAGOM4,RMVPG1 704A VMSER PAGOM4,RMVPG1 [End of MCO 14312] MCO: 14313 Name: RCB Date: 1-Oct-90:15:03:18 [Symptom] PSIs never taken for TTYs connected to MPX channels. This used to work. [Diagnosis] Broken as a side effect of the change to make image mode on TTYs under MPX work. Overzealous clearing of old variables clobbered the up pointer from the TTY DDB to the MPX DDB. [Cure] Yes. [Keywords] MPX TTY PSIs [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 427 SCNSER TTYMPX,RECIN6 MPXSER 704B [End of MCO 14313] MCO: 14314 Name: RCB Date: 2-Oct-90:05:58:07 [Symptom] SEGOP. throws away hisegs when it shouldn't and causes IMEs. 
[Diagnosis] Clearing the "what are we doing" bits in SGAEND too soon, and calling PUTWRD with a hiseg number in J while depositing to a hiseg. [Cure] Yes. This makes GETWRD/PUTWRD even more robust with respect to having trash in J than they were before. Now, J only has to be valid as a JCH when the UPMP (.USJOB) hasn't been setup yet. [Keywords] SEGOP. .SGGET Losing hisegs ERFNF% IME [Related MCOs] None [Related SPRs] None [MCO status] Checked [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 427 SEGCON GETH5A,INPSE2,INPMG2 704B COMCON SOPGE6 DATMAN GETWRA,GETWRY,PUTWRA,PUTWRZ [End of MCO 14314] MCO: 14315 Name: DPM Date: 5-Oct-90:06:09:15 [Symptom] Undefined globals attempting to link a monitor without DECnet. [Diagnosis] Oversight. [Cure] Add dummy definitions for .SAVn and DDIINE to COMDEV, and conditionals around references to .CPTPN in COMMON.. [Keywords] DECNET [Related MCOs] None [Related SPRs] None [MCO status] None [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 430 COMDEV .SAVN,DDIINE 704B [End of MCO 14315] MCO: 14316 Name: RCB Date: 9-Oct-90:04:34:47 [Symptom] Crashes not getting copied and DAEMON information not getting logged at system startup. [Diagnosis] EXPJOB fails, blowing away the CRSCPY command. It carefully checks for having no outstanding command on FRCLIN and its MIC interlock being free, but it does not notice that there's a job (INITIA) already running on the line. The forced .EXEC loses with '?Please type ^C first' which clears the type-ahead for the CRSCPY command. [Cure] Add a call to COMQ before deciding to try to force the .EXEC command. 
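The cure for MCO 14316 adds one more precondition to the two checks the diagnosis says EXPJOB already makes. A minimal Python sketch of the combined test follows. It assumes a hypothetical `ForcedLine` record with illustrative field names; the MCO does not say what COMQ examines, so the third check is modelled simply as "a job is already running on the line".

```python
from dataclasses import dataclass


@dataclass
class ForcedLine:
    """Hypothetical model of the state examined before forcing a command."""
    command_pending: bool = False   # an outstanding forced command exists
    mic_interlocked: bool = False   # the MIC interlock is held
    job_running: bool = False       # a job (e.g. INITIA) is on the line


def can_force_exec(line):
    """Return True only when forcing a command cannot clobber type-ahead."""
    if line.command_pending:
        return False
    if line.mic_interlocked:
        return False
    # The check the MCO adds (modelled, not the real COMQ logic): without
    # it the forced command is rejected and the pending type-ahead is lost.
    if line.job_running:
        return False
    return True
```

In this model, a line with `job_running=True` makes the caller wait and retry later instead of forcing a command that would be rejected and would clear the type-ahead.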
[Keywords] CRSCPY AVAIL SPEAR [Related MCOs] None [Related SPRs] None [MCO status] None [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 430 COMMON DAEJOB CLOCK1 STDAEM,EXPJOB [End of MCO 14316] MCO: 14317 Name: KBY Date: 13-Oct-90:12:19:50 [Symptom] Multiple guess: The infamous PQW bug is: (a) A rare species; very difficult to find (b) Of world-wide distribution, but most common in Germany (c) Finally extinct (d) All of the above Other scenarios are possible, but PQW is the most common. Virtual page number spray into the P2.VPN field of certain pages, specifically those whose physical page number lies between JOBMAX+M.CPU and JBTMAX+M.CPU (i. e. page numbers that correspond to high segment numbers expressed as SPT offsets). Potentially causes problems whenever a new high segment is initialized as the spray always happens then, but usually only causes problems when the specified pages are on the IN or SN queues at the time of initialization. [Diagnosis] When initializing the new high seg in the user's address space, NREMAP eventually gets called to move the pointers to the high seg map and convert the read-in pages into indirect pointers to the secondary map, and also to move the pointers to the correct high segment origin. MV1PG is called for the latter, and part of the code it calls (primarily for the usage of the MOVPGS code) checks to see if the page being moved has PM.OIQ on; if so it changes the P2.VPN field to reflect the new virtual page the page is being moved to. The problem is, by this time the map is supposed to contain indirect pointers, where the PM.OIQ bit isn't valid any more (it overlaps into the secondary map offset field in the secondary pointer).
Thus, if the address offset into the secondary map has the PM.OIQ bit on as part of that offset (=40), the physical page corresponding to the high seg number + M.CPU (=the SPT offset) gets sprayed with all, but ending with the last virtual page number of the high seg being initialized that has the PM.OIQ bit on in the offset. This means that low numbered pages are affected and tend to get sprayed with relatively high virtual page numbers. Things go downhill from there. An additional problem is that NREMAP doesn't actually set up the indirect pointers correctly anyway; these presumably get fixed up later by some call to REDOMP before the user ever tries to use them. [Cure] Make NREMAP set the indirect pointers up correctly. Make MVPMT return if the pointer isn't a direct (PM.DCD) type pointer. [Keywords] PQW P2.VPN spray [Related MCOs] None [Related SPRs] 36285 [MCO status] None [MCO attributes] None [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 430 VMSER MVPMT,NREMA6 704A [End of MCO 14317] MCO: 14324 Name: KBY/RCB Date: 7-Nov-90:09:47:05 [Symptom] KAF uncovered on KS (430 doesn't). [Diagnosis] Losing DRP flag in GVFWDS from call to get a large (multi-page) block of funny space. [Cure] Change some SPUSH/SPOP macros to PUSH/POP instructions to preserve the flag even on the KS. [Keywords] KAF KS10 [Related MCOs] None [Related SPRs] None [MCO status] None [MCO attributes] KS10 only Single-section monitors only [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 431 VMSER GVFWD3,GVFWD4 [End of MCO 14324] MCO: 14325 Name: DPM/RCB Date: 7-Nov-90:09:47:05 [Symptom] Time is short. [Diagnosis] The end is near. [Cure] It's time to fade away. The members of the TOPS-10 group, their management, and their close relatives would like to thank you for your support over the past twenty-six years. It has been a pleasure working with you. We hope you have fond memories of your association with us and your DECsystem-10.
Ann Barr Spider Boardman Dave Braithwaite Joann Creely Tony Dziedzic Dave Eklund Linda Feldeisen Jim Flemming John Francini Bob Frohreich Ruth Fong Tim Litt Don Mastrovito Kevin O'Kelley Julie Pratt Christine Quiriy Ned Santee Larry Sendlosky Kimo Yap [Keywords] [Related MCOs] None [Related SPRs] None [MCO status] None [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 431 COMMON A00DLN [End of MCO 14325] MCO: 14327 Name: RCB/ERS Date: 29-Nov-90:11:34:23 [Symptom] STOPCDs IME, IBZ, UIL during or shortly after system startup. Uninterruptible loops at PC 0 also seen. [Diagnosis] Bugs in FLPUDB/DETUDB, and in LLMOP. LLMOP was trying to modify data in section three while running in section zero, and FLPUDB was doing a POPJ with U still pushed on the stack, due to an alternate entry point for ONCMOD for ONCBND's use. The latter has been observed to lead to a UIL, and could account for the PC 0 looping (as an MUUO, if the MUUO trap location got zeroed as a side-effect of this bug). [Cure] SE1ENT and stack re-phasing as appropriate. [Keywords] IME UIL IBZ PC 0 UUO loop System startup [Related MCOs] None [Related SPRs] None [MCO status] None [MCO attributes] New development MCO Field service attention [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 432 ONCMOD ONCBND FILIO DETUDB LLMOP LLMINI [End of MCO 14327] MCO: 14328 Name: RCB Date: 29-Nov-90:17:19:36 [Symptom] FRCLIN INITIA hangs in NA state. System startup never completes. [Diagnosis] TTYDET does not wake up COMCON when it should. [Cure] Add a check for break characters into CNCMOD, and call COMSET if it looks appropriate. [Keywords] Hung startup FRCLIN INITIA [Related MCOs] None [Related SPRs] None [MCO status] None [MCO attributes] New development MCO [Validity] Monitor Load Module Tags ------- ------ ------ ------ 705 432 SCNSER CNCMOD [End of MCO 14328]