PDP-10 Archive: kcc-6/kcc/crtsym.doc from SRI_NIC_PERM_FS_1

Trailing-Edge - PDP-10 Archives - SRI_NIC_PERM_FS_1_19910112 - kcc-6/kcc/crtsym.doc

There are 6 other files named crtsym.doc in the archive. Click here to see a list.

		CRTSYM - KCC C RunTime Symbol information

	This file documents the various special symbols that KCC's
C runtime routines use in order to achieve various effects.  The user
program never sees these.
	NOTE: MUCH OF THIS FILE IS STILL AN IMPLEMENTATION SPEC, NOT
	DOCUMENTATION!  --KLH

Symbol conventions:

	The current PDP-10 software allows symbols to have 6 characters
from the set A-Z, 0-9, ., %, $.  KCC maps 0-9 to 0-9, a-z and A-Z to A-Z,
and '_' to '.'.

	KCC supports a non-standard extension to C whereby any characters
enclosed within accent-grave ('`') marks are treated as a valid C identifier.
This allows the user to specify identifiers containing the characters '$'
and '%', as well as any arbitrary character, although KCC will print a
warning if a character not in the PDP-10 set is seen.

	Examples: `$FOO`, `OPENF%`, `$$BP`

KCC internal symbol conventions:

	KCC internal symbols are in two main categories, the C library
and the C runtime.  The C library internals simply use standard C
identifiers, with an underscore '_' prefixed to those names which
should not be seen by users.  The C runtime symbols, however, without
exception must use PDP-10 symbols which cannot conflict with any C
identifier.  These runtime symbols follow these conventions:

	$$$xxx	The name of some major runtime module ($$$CPU, $$$CRT).
	$$xxx	A load time constant.
	$xxx	An address (either code or data).
	$nnn	Reserved for KCC code generation labels.
	%%xxx	Assembler Macro
	%xxx	Assembler Macro "instruction"


TOPS-20 monitor symbol conventions:

	The following comments were extracted from the TOPS-20 monitor's
MONSYM.MAC file and describe the general form of monitor symbols.
This is helpful to know when figuring out what names to use for
user-defined symbols so that they will not conflict with existing
system-defined ones.

		----------------------
	Certain conventions are observed regarding the construction of
symbols as follows: ("x", "g", and "s" represent any alphanumeric)

        xxxxx.	= an opdef or macro defininition
	xxxxx%	= a JSYS instruction (JSYSes existing prior to v4 are also
			defined without the %)
        .ggsss	= a constant value
        gg%sss	= a mask, i.e., a bit or bits which specify a field

Symbols containing multiple periods may be used internally by some macros.
Symbols containing "$" are not used or defined by DEC and are reserved
for customer use.  Note however that certain macros in MACSYM create
symbols containing "$" by prefixing a "$" to a name supplied by the
user.  DEFSTR, MSKSTR, and the stack variable macros all do this.

For JSYS argument values, the symbol is usually divided into a general
part (gg) relating to the JSYS, and a specific part (sss) identifying the
function.  E.g. GJ%OLD, GJ%FNS, and .GJDEF are arguments to GTJFN%.

The symbols .OFxxx, .SZxxx, and .PSxxx are reserved by RMS-20.

	---------------------------------------------

KCC Runtime Support modules:

CPU:	$$$CPU is the dummy entry point for the CPU type selection module.
CRT:	$$$CRT is the dummy entry point for the C Run-Time support module.

KCC contains an internal table of all the symbols available from the
CRT and CPU modules.  If any reference is made to them, the appropriate
assembler request will be generated.

The CRT and CPU modules are ALWAYS requested, by asking for the symbols
$$$CRT and $$$CPU.

KCC Version specification:

	A definition of the symbol $$CVER is inserted in the preamble
of all compiled modules, and contains a value describing the general
version of KCC which compiled the module.  This value is derived from the
variables "cvercode" and "cverlib", defined in CCDATA; those variables can
be patched if necessary.
	During loading, if any module has a definition of $$CVER that
differs from any other, the linking loader will complain with a MDS
(Multiply Defined global Symbol) error:
	%LNKMDS Multiply-defined global symbol $$CVER
	        Detected in module PRINTF from file C:LIBC.REL
	        Defined value = 1000001, this value = 2000001

	Whenever KCC is changed to emit code which is not compatible
with the code of previous versions of KCC, then $$CVER should be changed,
by changing the initializations of "cvercode" and "cverlib" in CCDATA.
This ensures that code produced by two incompatible versions of KCC is
never inadvertently loaded together.
	The exact value of $$CVER is somewhat irrelevant to its purpose,
but presumably it could contain some useful information.  For example,
it is now structured as:
	<KCC code version>,,<LIBC version>

The inclusion of a libc version reflects the fact that the standard .h files
or library routine specifications may change.

Notes on CPU-dependent compilation:

	[Could use the same mechanism for SYS-dependent compilation???]

	The following main types of PDP-10s are described by the DEC
processor reference manual (appendix E, Processor Compatibility):
	   (2)  (3)   (n)  # char names
	1. KA = KA  = KA10
	2. KI =	KI  = KI10
	3. KS = KS  = KS10
	". KS = KLS = KL10S	(Single section KL10A)
	4. KL = KL0 = KL10	(Extended KL10B, section 0)
	5. KX = KLX = KL10X	(Extended KL10B, section n)

	From the user viewpoint, the KS10 and KL10S (single section
KL10) are the same, therefore they are both considered as type (#3 above).

The main difference between type KS and type KL is that the KS does
not support G format floating point whereas the KL does.

CPU differences relevant to KCC:
		    KX	has many differences for extended addressing.
	      KS+KL+KX	do not have KA-style double fp.
	KA+KI+KS	do not have G floating point format instructions.
	KA+KI		do not have:
				ADJBP,ADJSP
				DADD,DSUB,DMUL,DDIV	(double integer ops)
				any of the EXTEND instructions (XBLT, etc)
	KA		does not have:
				DMOVx
				DFAD, DFSB, DFDV, DFMP
				FIX, FIXR, FLTR
	KA+KI		have: KA-style double precision fp

	Note: KA-style software double precision floating point involves
	these instructions: DFN,UFA,FADL,FSBL,FMPL,FDVL

	There are some minor diffs in double-prec fp (DF[AD/SB/DV/MP])
	between KS+KL+KX and KI.

	KCC inserts symbol definitions in the postamble of each
compiled module that describe what CPUs the code will work properly for.
The symbols are of the form
		$$CPxx where xx = KA, KI, KS, KL, KX.
All   POSSIBLE CPUs will have NO symbol defined.
All IMPOSSIBLE CPUs will have their $$CPxx symbol set to 0.

	When loading a program, a specific CPU module defines the
symbols for all POSSIBLE CPUs to be set to 1.  (KCC could do this
itself with LINK switches someday.)  All impossible CPUs will have NO
symbol defined.

	The effect of this is that loading for a specific CPU type (or types)
will succeed if all modules fail to define that symbol (ie it is possible).
However, the load will FAIL if any module does define that symbol, because
its value will be 0 as opposed to the loader's initial definition of 1,
and the loader will complain about a multiply defined global ("MDS" error).

	The only problem with this scheme is that if a new CPU type is
introduced (consequently a new $$CPxx symbol), then all old modules
will appear to be OK since none of them will define that symbol (as 0 or
anything else).  The solution to this is to change KCC's $$CVER symbol
if a new CPU type is ever added, which will force recompilation of
all modules.

	KCC's current default CPU type consists of KS+KL+KX.  That is,
the binary will run on all three types.  KA and KI require recompiling
with different code definitions.

Notes on loading for KX (non-zero section program)

For a program whose modules were all compiled with type $$CPKX
permitted, the symbol $$SECT, defined at load time, determines whether
the program is built to run single-section or multi-section.  This
symbol, as well as several others, are all defined by a specific CPU
module.

Symbol $$SECT is always defined by KCC or CPU to be the section number to
load into; this is always 0 unless the desired CPU type for the program is
$$CPKX.  Actually code is loaded into and started at section 0; the
startup moves it to the appropriate section.  The stack will be at
$$SECT+1, and allocated memory at $$SECT+2.

Symbols $$BPmn are defined as the LH values for byte pointers, where m
= byte size, and n = byte # within word.  Thus symbols exist for the
following values of mn:
	90, 91, 92, 93
	80, 81, 82, 83
	70, 71, 72, 73, 74
	H0, H1	(halfwords, 18-bit)
Whenever $$SECT is 0, these values correspond to local-format byte pointers.
Otherwise, the values specify one-word global-format byte pointers (OWGBPs).

There are certain other symbols such as $$BPPS.  These are all listed in
CPU.C with brief descriptions.

Loading mechanics:

	We want to defer the final decision on $$CPxx until load time.
This decision has to be communicated to the loader somehow, either by
	(1) default,
	(2) EXEC command line
	(3) KCC command line
	(4) LINK commands

LINK command specification:
	Everything ultimately boils down to this.  There are the following
ways of setting the symbols and stuff:
	(a) explicit module specification (certain names, certain orders)
	(b) LINK switches
	(c) LINK commands within the modules (just "main", or all?)

	Can use /REQUIRE:name to make a symbol request (like ld's -u).
	Can use /DEFINE:name:decimal to define a global symbol.
	Can use filename/SEARCH to search a library.

KCC command line specification:
	The -i switch can be interpreted as meaning "load for $$CPKX"
(ie a multi-section KL program).


Problem with using COMPIL-type EXEC commands:
	for COMPILE (no LINK) - how to set up so that a LOAD does the right
		things?
	for LOAD/EXECUTE - if KCC invoked, can mung the LNK tmpcor command
		file.  If not invoked, same problem as above.

Could include LINK commands in whatever module contains "main".
Or could just forget it, and always use KCC itself to invoke LINK,
with its special knowledge.  (Similar to the way UNIX CC invokes LD).


PLAN:
	Set up a new module, called CPUxx where xx is the machine type
to load for.  This module defines $$CPxx, $$SECT, $$BPmn.
	All modules (including that one) will continue to request a
search of LIBC.REL.  This may change someday.
	The default CPUxx module will be the FIRST one in LIBC.REL.
This is not referenced if some other CPUxx module has already been seen
because all the symbol refs have already been satisfied.

	All supported CPU types will be represented as C:CPUxx.REL files.
To specify a non-default type, either KCC or the user (directly) includes
a request to load C:CPUxx.

	To generate the default CPUxx, just compile with the default C-ENV.
	To generate the others, compile with explicit switches.