PDP-10 Archive: c/old/kc/cc.doc from SRI_NIC_PERM_FS_1

Trailing-Edge - PDP-10 Archives - SRI_NIC_PERM_FS_1_19910112 - c/old/kc/cc.doc
There are 8 other files named cc.doc in the archive. Click here to see a list.
CC is a C compiler  subset that runs on  SAIL and SCORE.  Currently,  it
supports most of Version 7 constructs that are described in:

   Kernighan,B.W.,D.M.Ritchie,  "The C Programming Language",
              Prentice-Hall, 1978. 

Floating point operations has not been  implemented yet.

------------------------------------------------------------------------
To run CC, simply say

		.R CC; foo

where "foo" is the C source file to be compiled. If an extension is  not
explicity given, the compiler will first look for the file "foo.c",  and
failing that, it will look for the file "foo".

CC produces a FAIL source that can be assembled and loaded by the  usual
means. CC emits code that will load in the neccessary runtime  functions
from SYS:CLIB.REL.  Included  in  the  package are  most  of  the  STDIO
routines given in the reference above.

------------------------------------------------------------------------
CC is basically recursive descent,  with expressions parsed by  operator
precedence. The C Preprocessor is  part of the compiler. Currently,  the
following preprocessor functions are defined:

		#include
		#macro
		#ifdef
		#endif.

Auto and macro identifiers are unique  to 9 characters. Upper and  lower
cases are distinguished. Extern and  static identifiers are unique  only
to 6 characters and upper and  lower cases are not distinguished  (don't
blame me, FAIL limits me to that).

Short, int and long are all 36-bit words. This may change in the future,
but you can always be sure that int shall always be 36-bits and long  is
at least as large as an int and int is at least as large as a short. Int
pointers are implemented simply as 36-bit machine addresses.

A scalar char is stored as a  36-bit word. Char arrays are packed 4  per
word. The arrays  (and pointers) are  represented as standard  DEC-10/20
9-bit bytes (and byte pointers).  [Don't worry, char pointers to  scalar
char, i.e.   &foo, works  the  way it  should...]  Char  pointers  point
exactly at the byte  in question, and  not at the  previous byte, as  in
TOPS-20 JSYS conventions (beware, those  who want to interface  directly
with TOPS-20).  Standard STDIO file operations converts 7-bit byte files
to the internal 9-bit bytes so you will not have to do anything special.
In addition, a hack was put in  into the runtime to allow files  created
with the E  directory page to  be properly read.  SOS-created files  may
cause problems.

Float and double  are not  yet implemented.  Struct,  union and  typedef
are, so all is not lost. 

The shift operations (<<,  >>, >>= and <<=)  are interpreted as  logical
shifts.

[This paragraph  is  intended for  those  who wish  to  write  assembler
program interfaces.  Those who live in  the 20-th century may skip  this
paragraph.]  Arguments to functions are passed by pushing them onto  the
stack.  The stack pointer is  the usual P (AC017).   AC0 is used as  the
function return value.  Arguments are pushed onto the stack so that  the
first argument is  closest to  the return  address after  a PUSHJ.   The
value of the number of arguments is not pushed.  Variable-argument calls
are possible if everything depends  on the information contained in  the
first argument, such as done in printf().  Upon entering a function, the
first argument is addressed via -1(P),  the second via -2(P) and so  on.
No frame pointer is used, nor  is one neccessary.  The compiler  updates
its internal  representation of  the stack  pointer along  the way.  The
caller is  responsible for  popping the  arguments back  off the  stack,
since it is  the only  one who  has knowledge  of how  many arguments  a
function is called with.  The callee is responsible for allocating stack
space for auto (local) variables  and releasing them before executing  a
return.

Some type  coercions (casting)  are implemented.  These are  (char)  <->
(int), (char *)  <-> (int *)  and (char *)  <-> (struct *).

(int *) -> (char *) and (struct *) -> (char *) make good sense.  But, if
you are  demented enough  to try  (char *)  -> (int  *) or  (char *)  ->
(struct *), it will try its best, which may still not be what you  want.
It simply zeros out the left half of the byte pointer and hands that  to
you as  the int  or struct  pointer.  I.e.,  the int  or struct  pointer
points to the word  that contains the character  pointed to by the  char
pointer.

Int and struct pointers are assured to be zero on the left half-word  so
that you can do meaningful comparisons.

The only claim I am willing to  make at the moment is that the  compiler
will compile itself. Since I write pretty straightforward code, this may
mean little. It has also been used to write two moderate-sized  programs
(one is  a  wire-wrap  compiler  and  the other  is  one  that  takes  a
description of a logic diagram and produces a schematic diagram in PRESS
format, both at SCORE), so the compiler does work in some fashion.   The
code it generates is not super, nor is it rotten. Written in C, it  runs
moderately fast, 15,000 to 30,000 source lines per minute, depending  on
the amount of white-space in your program.

Please report bugs  to KC%SU-AI or  CSL.SP.KCHEN%SU-SCORE, whichever  is
more convienient.  It may take  some hours before I  find time to fix  a
reported bug.  You will  have to code around  the bugs in the  meantime.
Those who are weak of heart are advised to stick to PASCAL.