PDP-10 Archive: stanford/5-swskit/parity.mem from SRI_NIC_PERM_SRC_3

As part of release 5, TOPS-20's handling of parity errors has been
improved. The following is a concise description of the changes.

The monitor no longer enters secondary protocol upon
detecting an MB parity error or an AR/ARX parity error. Instead, the
analysis of the condition is made while primary protocol
is still in force. If the error is recoverable, then the
job 0 SYSERR fork will produce the CTY report. As a result, there
are a few differences to note:

	1. The information reported has changed: the PC is no longer
	reported for an MB parity error.

	2. Because the typeout is in primary protocol, the
	characters may be interspersed with other CTY output.

	3. The format of the numbers is slightly different.

In general the information presented is the same as before,
although the formatting may differ slightly.

Should the monitor BUGHLT before the SYSERR fork is able to process
the information, the monitor will type out a report of
each parity error in secondary protocol as part of the BUGHLT processing.
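
The two reporting paths described above can be modelled roughly as
follows. This is a minimal sketch in Python, not monitor code: the
Monitor and ParityError classes, the method names, and the report
strings are all hypothetical and exist only to illustrate the ordering
of events.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ParityError:
    kind: str  # e.g. "MB" or "AR/ARX"


@dataclass
class Monitor:
    # Hypothetical model: errors analyzed but not yet reported.
    pending: List[ParityError] = field(default_factory=list)
    # Lines typed on the CTY, in order.
    cty: List[str] = field(default_factory=list)
    protocol: str = "primary"

    def detect(self, err: ParityError) -> None:
        """Analyze the error in primary protocol and queue it for job 0."""
        self.pending.append(err)

    def syserr_fork_run(self) -> None:
        """Job 0 SYSERR fork: report queued errors in primary protocol."""
        while self.pending:
            err = self.pending.pop(0)
            self.cty.append(f"[primary] {err.kind} parity error")

    def bughlt(self) -> None:
        """BUGHLT: enter secondary protocol and report what is still queued."""
        self.protocol = "secondary"
        while self.pending:
            err = self.pending.pop(0)
            self.cty.append(f"[secondary] {err.kind} parity error")
```

In this model an error that the SYSERR fork has already reported is
not typed again by a later BUGHLT; only errors still queued at the
time of the BUGHLT are reported in secondary protocol.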

Other significant changes:

	1. A channel write parity error will not cause
	a memory scan as part of the APR interrupt.
	However, PHYSIO will be informed when
	such an error occurs, and subsequent channel logouts
	will result in a memory scan of the pages written by the
	channel until the parity error is found and corrected.
	Furthermore, for each operation that is found to
	have deposited bad parity in memory, the process will
	be informed of a "device error" and any words
	that were written into memory
	with bad parity will be set to all zero. This last change
	eliminates the risk of spurious read parity errors
	occurring when a process attempts to read one or more of
	the bogus words.

	2. Write parity errors of any nature will not cause pages to be
	placed offline.

	3. MB parity errors not caused by a channel write will
	cause only the page containing the detected bad word to
	be scanned. No error will cause a full memory scan.

	4. In general, MOS data write parity errors will not
	be reported to TGHA. The exception to this occurs
	when multiple data write parity errors occur so close
	together that it is not possible to correlate all of them
	with the ERA latch.

	5. MOS errors will either be reported to TGHA or be
	ignored by unlatching the memory controller.

	6. Parity errors will produce service interruptions
	(secondary protocol) only if the error results directly in a BUGHLT.
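
The deferred scan in item 1 and the single-page scan in item 3 can be
illustrated with a small model. This is a sketch under stated
assumptions, not PHYSIO code: the Word class, scan_page,
channel_logout, and the single "write error pending" flag are
hypothetical names invented for the example. The only figure taken
from the hardware is the 512-word page size.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

PAGE_SIZE = 512  # words per page


@dataclass
class Word:
    value: int
    bad_parity: bool = False


# Hypothetical memory model: page number -> list of words in that page.
Memory = Dict[int, List[Word]]


def scan_page(memory: Memory, page: int) -> List[int]:
    """Scan one page; zero each bad-parity word and return its offset."""
    fixed = []
    for offset, word in enumerate(memory[page]):
        if word.bad_parity:
            word.value = 0          # rewrite the word as all zero
            word.bad_parity = False  # rewriting restores good parity
            fixed.append(offset)
    return fixed


def channel_logout(memory: Memory, pages_written: List[int],
                   write_error_pending: bool) -> Tuple[bool, bool]:
    """Handle one channel logout.

    Returns (device_error, still_pending). Only the pages this
    operation wrote are scanned, and only while a channel write
    parity error is outstanding.
    """
    if not write_error_pending:
        return False, False
    found = False
    for page in pages_written:
        if scan_page(memory, page):
            found = True
    # Keep scanning on later logouts until the bad word is located.
    return found, not found
```

A transfer that deposited no bad words returns (False, True), so the
next logout is scanned as well. The transfer that did deposit them
returns (True, False), which corresponds to informing the process of a
"device error" after the affected words have been zeroed.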

In general, the philosophy applied in making these changes was to
minimize service interruptions due to error analysis and to
pinpoint the error as accurately as possible. In cases where
spurious or secondary errors were known to occur (e.g. channel write
parity errors) that could confuse field service or TGHA, the monitor
was changed to either eliminate the spurious error or to
correctly report its origin.