PDP-10 Archive: c/kcc5/omail.txt from SRI_NIC_PERM_FS_1

Trailing-Edge - PDP-10 Archives - SRI_NIC_PERM_FS_1_19910112 - c/kcc5/omail.txt

 4-Sep-84 21:51:09-PDT,4644;000000000001
Return-Path: <[email protected]>
Received: from SU-SCORE.ARPA by SU-SIERRA.ARPA with TCP; Tue 4 Sep 84 21:51:05-PDT
Date: Tue 4 Sep 84 21:50:57-PDT
From: Len Bosack <[email protected]>
Subject: (Oh no!) I've been looking at KCC's code again
To: [email protected]
cc: [email protected]

I'm not sure how much some of these are worth (or if you/David want to
fix them), but...
All of these are from <KCC.CTEX>C1.C:

 lastbop=(S4) -1;
 doingleaders=false;
 deadcycles=(S4)0;
 halfbuf=1024/(U2)2;
 dvilimit=1024;
	MOVNI	4,1
	MOVEM	4,lastbo	;maybe SETOB? Happens several times...
	SETZB	6,doingl
	SETZB	10,deadcy
	MOVEI	12,2000
	IDIVI	12,2		;the constant folder got confused?
	MOVEM	12,halfbu
	MOVEI	15,2000
	MOVEM	15,dvilim

 trie[(U2)0].rh=(U2)0;
 trie[(U2)0].V.R1.b1=0;
 trie[(U2)0].V.R1.b0=0;
 for(k=1;k<=127;k++)M(trie[k],trie[(U2)0]);
	SETZ	5,
	IMULI	5,3		;ahem. Constant folder/ address folder?
	SETZB	4,trie(5)
	SETZ	11,		;this happens quite a bit.
	IMULI	11,3
	SETZB	10,trie+2(11)
	SETZ	15,
	IMULI	15,3
	SETZB	14,trie+1(15)
	MOVEI	5,1
	MOVEM	5,-1(17)
$115::
	SETZ	7,		;really loses in loop.
	IMULI	7,3
	MOVE	12,trie(7)
	MOVE	14,-1(17)
	IMULI	14,3
	MOVEM	12,trie(14)
	AOS	7,-1(17)
	CAIG	7,177
	JRST	$115

  if(tally<trickcount){
   trickbuf[(U1)(tally%(S4)64)]=s;
   }
$133::
	MOVE	6,tally
	CAML	6,trickc
	JRST	$127
	MOVE	12,-1(17)
	IDIVI	6,100			;if the sign of modulus is always +,
	ADJBP	7,[331100,,trickb]	;shouldn't this be an ANDI 6,77 ?
	DPB	12,7
	JRST	$127
    while(true){
     if(xord[n]<48||xord[n]>57){
      writeS(output,"! TEX.POOL check sum doesn\'t have nine digits.",
        46);
      writeln(output);
      aclose(&poolfile);
      R=false;
      goto L10;
      }
$374::
	MOVE	12,-4(17)
	ADJBP	12,[331100,,xord]
	LDB	14,12
	CAIGE	14,60
	JRST	$376
	MOVE	4,-4(17)		;the answer is already in 14
	ADJBP	4,[331100,,xord]	;happens a lot (>20 times) in C1
	LDB	6,4
	CAIG	6,71
	JRST	$375
$376::
	PUSH	17,[56]
	XMOVEI	11,$377
	IOR	11,$BYTE+4
	PUSH	17,11
	PUSH	17,output
	PUSHJ	17,writeS
	ADJSP	17,-3
	MOVE	3,output
	PUSH	17,0(3)
	PUSH	17,[12]
	PUSHJ	17,putc
	XMOVEI	7,poolfi
	MOVEM	7,-1(17)
	ADJSP	17,-1
	PUSHJ	17,aclose
	ADJSP	17,-1
	SETZB	11,0(17)
	JRST	$347

  if(x>=(S4)0){
   R=x/n;
   remainder=x%n;
   }
  else{
   R= -( -x/n);
   remainder= -( -x%n);
   }
$500::
	SKIPGE	5,-3(17)
	JRST	$501
	IDIV	5,-4(17)	;note remainder still in 6
	MOVEM	5,0(17)
	MOVE	15,-3(17)
	IDIV	15,-4(17)
	MOVEM	16,remain
	JRST	$476
$501::
	MOVN	7,-3(17)
	IDIV	7,-4(17)
	MOVN	7,7		;this case may be too hard for this technology
	MOVEM	7,0(17)
	MOVN	14,-3(17)
	IDIV	14,-4(17)
	MOVN	15,15
	MOVEM	15,remain
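For comparison, a minimal sketch of getting quotient and remainder from a single division in C, the way one IDIV leaves both in adjacent registers. It assumes a C99 library, where ldiv() truncates toward zero; the function names divide_branchy and divide_once are made up for illustration and are not from C1.C.

```c
#include <stdlib.h>

/* Branchy version, following the shape of the C1.C fragment above. */
void divide_branchy(long x, long n, long *q, long *r)
{
    if (x >= 0) {
        *q = x / n;
        *r = x % n;
    } else {
        *q = -(-x / n);
        *r = -(-x % n);
    }
}

/* One division: ldiv() returns quotient and remainder together,
   truncating toward zero, so no sign test is needed. */
void divide_once(long x, long n, long *q, long *r)
{
    ldiv_t d = ldiv(x, n);
    *q = d.quot;
    *r = d.rem;
}
```

With truncating division the two versions agree for either sign of x, e.g. x = -7, n = 2 gives quotient -3 and remainder -1 both ways.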

printword(w)
PARM1 memoryword *w;
{
 /* begin printword */
 printint((*w).pint);
 printchar(32);
 printscaled((*w).pint);
 printchar(32);
 printscaled(roundR((real)65536*(*w).gr));
printw:
	MOVE	4,-1(17)
	PUSH	17,0(4)
	PUSHJ	17,printi
	ADJSP	17,-1		;how about MOVE r,[40] -> Movei r,40
	PUSH	17,[40]		; MOVEM r,(17)
	PUSHJ	17,printc	;happens a lot.
	ADJSP	17,-1
	MOVE	12,-1(17)
	PUSH	17,0(12)
	PUSHJ	17,prints
	ADJSP	17,-1
	PUSH	17,[40]
	PUSHJ	17,printc
	MOVE	5,-2(17)
	MOVE	6,0(5)
	FSC	6,20
	MOVEM	6,0(17)		;got this one due to expr
	PUSHJ	17,roundR
	MOVEM	1,0(17)		;likewise
	PUSHJ	17,prints
	ADJSP	17,-1

Date: 5 Sep 1984  09:52 EDT (Wed)
From: David Eppstein <[email protected]>

Most of those look like they're caused by null type conversions
getting in the way.  One of you should probably change it so that null
conversions never get put explicitly into the parse tree, they just
immediately cause the data type of the top of the tree to change.

Starting today I may actually be busier, so I may not have time to do
any of that.  Tell me what happens.

Date: 5 Sep 1984  09:55 EDT (Wed)
From: David Eppstein <[email protected]>

That IDIVI 6,100 should stay that way.  It has no way of knowing that
6 contains a positive number, and if 6 contained negative then IDIVI
and ANDI come out with different results.
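The difference is easy to check in C. This is a minimal sketch, assuming C99 semantics (the remainder takes the sign of the dividend, as with IDIVI) and two's-complement integers; rem64 and mask64 are names made up for the illustration.

```c
/* Remainder as a divide computes it: the sign follows the dividend. */
long rem64(long x)
{
    return x % 64;
}

/* Low six bits as ANDI 6,77 would compute them: never negative. */
long mask64(long x)
{
    return x & 077;
}
```

For x = 70 both return 6. For x = -1, rem64 returns -1 and mask64 returns 63, so the substitution is only safe when the compiler can prove the operand is non-negative.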

Date: 5 Sep 1984  09:57 EDT (Wed)
From: David Eppstein <[email protected]>

This time re: the MOVE/ADJBP/LDB duplication.
Looks like findcse() should be taught about ADJBP, and foldbyte()
(the ADJBP optimizer) should be taught about findcse().

Date: 5 Sep 1984  10:13 EDT (Wed)
From: David Eppstein <[email protected]>

Note that
	MOVN	7,7
	MOVEM	7,0(17)
should become
	MOVNM	7,0(17)
	MOVN	7,7
and the MOVE flushing in genrelease() should make sure to also flush MOVN.
 5-Sep-84 11:33:20-PDT,609;000000000001
Return-Path: <[email protected]>
Received: from SU-SCORE.ARPA by SU-SIERRA.ARPA with TCP; Wed 5 Sep 84 11:33:17-PDT
Date: Wed 5 Sep 84 11:29:40-PDT
From: David Eppstein <[email protected]>
Subject: KCC improvement
To: [email protected], [email protected], [email protected]

	MOVEI	3,40
	DPB	3,[331100,,xchr+10]
	MOVEI	6,41
	DPB	6,[221100,,xchr+10]
	MOVEI	11,42
	DPB	11,[111100,,xchr+10]
	MOVEI	14,43
	DPB	14,[1100,,xchr+10]

should become

	MOVE	3,[40041,,42043]
	MOVEM	3,xchr+10

(progressing through larger and larger MOVEI+DPB until it covers the full word).
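The folded constant can be checked by hand or with a few lines of C. This sketch assumes 9-bit bytes stored left to right in a 36-bit word, held here in the low 36 bits of a uint64_t; pack_word is a hypothetical helper, not part of KCC.

```c
#include <stdint.h>

/* Pack four 9-bit bytes into one 36-bit word, leftmost byte first.
   The shifts 27, 18, 9, 0 match the P fields 33, 22, 11, 0 (octal)
   of the four byte pointers in the DPB sequence. */
uint64_t pack_word(unsigned b0, unsigned b1, unsigned b2, unsigned b3)
{
    return ((uint64_t)(b0 & 0777) << 27) |
           ((uint64_t)(b1 & 0777) << 18) |
           ((uint64_t)(b2 & 0777) << 9)  |
            (uint64_t)(b3 & 0777);
}
```

pack_word(040, 041, 042, 043) gives 040041042043 octal, that is, left half 40041 and right half 42043.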
-------
 5-Sep-84 19:43:34-PDT,1218;000000000001
Mail-From: SATZ created at  5-Sep-84 10:00:15
Date: Wed 5 Sep 84 10:00:15-PDT
From: Greg Satz <[email protected]>
Subject: cc correspondence
To: "*PS:<KCC.CC>CC.TXT.1"@SU-SIERRA.ARPA
Phone: (415) 497-1004

KRONJ, TTY127, 5-Sep-84 9:41AM
my feeling is the right thing to do is not be able to get that sort of
error.  like for cases, instead of this table use some extra fields in
the case stmts to make a linked list.  or for things like nodes and
strings and symbols and types, keep pools of free ones and allocate them
with sbrk as needed. pseudo-code is ok because the bottom of the ring
can merely be emitted when space gets tight (but it used to not even do that).
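A minimal sketch of the free-pool idea for one kind of object. The struct layout, CHUNK size, and function names are assumptions made for illustration, and malloc() stands in for sbrk() so the sketch runs anywhere; it is not the KCC code.

```c
#include <stdlib.h>

struct node {
    struct node *next;   /* free-list link while the node is unused */
    int op;              /* hypothetical payload */
};

#define CHUNK 64         /* nodes requested from the system at a time */

static struct node *free_nodes = NULL;

/* Take a node from the pool, growing the pool when it is empty. */
struct node *node_alloc(void)
{
    struct node *n;

    if (free_nodes == NULL) {
        /* stand-in for an sbrk() call */
        struct node *chunk = malloc(CHUNK * sizeof *chunk);
        int i;

        if (chunk == NULL)
            return NULL;
        for (i = 0; i < CHUNK; i++) {
            chunk[i].next = free_nodes;
            free_nodes = &chunk[i];
        }
    }
    n = free_nodes;
    free_nodes = n->next;
    n->next = NULL;
    return n;
}

/* Return a node to the pool for reuse. */
void node_free(struct node *n)
{
    n->next = free_nodes;
    free_nodes = n;
}
```

With this shape there is no fixed table to overflow: the only failure is the system refusing more memory.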

KRONJ, TTY127, 5-Sep-84 9:48AM
it's easy enough to do a SETO+ADJBP instead of a MOVE if you want a
ILDB pointer.  Trouble with ILDB is you can't access it without
changing it, e.g. *p is just a LDB now but with ILDB form it would
be MOVE+ILDB.

oh, another change that would need to happen to be dynamic is that
the type hash table would need to go to hash buckets instead of
double hashing.  types never get deallocated so there would be
no need of a free type pool or complicated removal from the hash table.
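A sketch of the hash-bucket scheme, assuming types are looked up by a short name; the key, bucket count, and function names are invented for the example. Entries are only ever added, so no removal code is needed.

```c
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 64

struct type {
    struct type *next;   /* next entry in the same bucket */
    char name[32];       /* hypothetical key, truncated to 31 chars */
};

static struct type *buckets[NBUCKETS];

static unsigned hash(const char *s)
{
    unsigned h = 0;
    size_t i;

    /* hash only the part of the name that is stored */
    for (i = 0; s[i] != '\0' && i < sizeof(((struct type *)0)->name) - 1; i++)
        h = h * 31 + (unsigned char)s[i];
    return h % NBUCKETS;
}

/* Return the entry for name, creating it on first use. */
struct type *type_intern(const char *name)
{
    unsigned h = hash(name);
    struct type *t;

    for (t = buckets[h]; t != NULL; t = t->next)
        if (strncmp(t->name, name, sizeof t->name - 1) == 0)
            return t;

    t = malloc(sizeof *t);
    if (t == NULL)
        return NULL;
    strncpy(t->name, name, sizeof t->name - 1);
    t->name[sizeof t->name - 1] = '\0';
    t->next = buckets[h];        /* chain grows; the table never fills */
    buckets[h] = t;
    return t;
}
```

Unlike double hashing in a fixed array, a bucket chain can keep growing as entries are allocated, which is what makes it fit a dynamically sized compiler.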
-------
29-Aug-84 18:32:17-PDT,4131;000000000001
Return-Path: <@COLUMBIA-20.ARPA:GINGELL@CWR20B>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Wed 29 Aug 84 18:32:01-PDT
Received: from CWR20B by CUCS20 with DECnet; 29 Aug 84 21:31:50 EDT
Date: Wed 29 Aug 84 21:32:04-EDT
From: Rob Gingell <GINGELL@CWR20B>
Subject: Re: runtimes
To: SATZ%SU-SIERRA@CUCS20
In-Reply-To: Message from "Greg Satz <[email protected]>" of Tue 28 Aug 84 20:29:13-EDT

The runtimes go well, but I have not progressed as far as I wanted to
be by now with them, largely because of a death in the family which required
me to travel to the Wash D.C. area for a few days since we last corresponded.

However, I have been back a couple of days and am getting back into the
swing of things again.  I doubt that I'll be done enough to send anything
out by the end of the week, and I'm going to be travelling over the Labor
Day weekend.  However, there should not be any impediment to getting them
finished/documented by the end of the following weekend.  With luck, I 
might be able to get you a copy where the basics work by next Wednesday.

The document was useful, it has about 90% of what I need, though I have 
yet to use it.  I also need to figure out how the routine which calls
"main" works in order to finish setting things up.  In fact, in the
interest of speeding things up, I can put the equivalent routines for
a BLISS-based environment where you can FTP them away, and you all can
get started with them if you don't want to wait for me.  The BLISS-based
stuff should be very similar to what would be needed with C.

By way of background, there are basically two parts to this thing: a
UNIX kernel emulator [PAUNIX] (currently 7th edition), and a set of libraries
(one for each language) that provides language-callable routines that 
set up the argument lists and calls for the emulator itself.  These
latter items are exactly analogous to the C library on a vanilla UNIX
system -- the system calls are usually special instructions (TRAP on
11's, CHMK on VAX's) that are set up by the libraries.  PAUNIX itself
is where virtually all of the guts of this thing is, it responds to
MUUOs (specifically opcode 40) just like the UNIX kernel responds to
TRAP x or CHMK x on 11's and VAX's.  PAUNIX is independent of the 
language the user uses (although it's written in BLISS itself), just
like PA1050 doesn't care.  PAUNIX lives off the PA1050 support in the
monitor (compatibility entry vectors and the like) so the implications
are that you can't mix TOPS-10 and UNIX (shouldn't be a practical problem),
LUUOs are still available, and PAUNIX can work with anything -- MACRO,
BLISS, etc.  

Since the monitor wants to load up PA1050 when you first hit a TOPS-10
MUUO, the routine which starts up a user program has to beat it to the
punch by setting its compatibility entry vector up so that the monitor
will just dispatch directly to PAUNIX, hence the need for the main code.
The main code should do a GET% on PAUNIX, then jump to it so that it
can initialize the UNIX process environment (signal handling, standard
I/O, etc.) and then restart the program.  The BLISS code shows all this.

I will assume that you're interested in at least looking at the BLISS
library code, and will shortly put it somewhere you can FTP it away from.

I have a request for the C compiler (prompted because the document said
it was subject to change), and that is that char pointers always be left
in the ILDB form (at present, a LDB gets what they point to).  PAUNIX
expects its character pointers to be in an ILDB form, and thus either
it would have to be changed or the C-interface library would have to back
up the byte pointers while constructing the argument list.  As they
are, the pointers are at sufficient variance with what everything else
does that it might prove difficult to interface the C code with anything,
even the monitor.
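The two pointer forms can be modelled in a few lines of C. This is a simplified sketch of a one-word local byte pointer with only its P (position), S (size), and Y (address) fields; the struct and function names are made up, and the backing-up shown is one simple way to do it, not necessarily what ADJBP or the interface library would produce.

```c
#include <stddef.h>

/* Simplified byte pointer: p = bits to the right of the byte,
   s = byte size in bits, y = word address. */
struct bptr {
    unsigned p, s;
    size_t y;
};

/* What ILDB does before loading: step to the next byte, moving to
   the next word when the current one is used up. */
void bp_increment(struct bptr *bp)
{
    if (bp->p >= bp->s) {
        bp->p -= bp->s;
    } else {
        bp->y += 1;
        bp->p = 36 - bp->s;
    }
}

/* Turn an LDB-form pointer (addresses the byte itself) into ILDB form
   (addresses the byte before it). A valid LDB-form pointer has
   p + s <= 36, so the result stays in the same word; the first byte
   of a word becomes p = 36. */
void bp_back_up(struct bptr *bp)
{
    bp->p += bp->s;
}
```

Starting from the LDB-form pointer to the first 9-bit byte (p = 27), bp_back_up gives p = 36 and one bp_increment brings it back to p = 27, which is the extra step an interface library would have to undo or redo on every argument.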

Let me know if you have any questions about what I've said here, I'll
do my best to answer them promptly.  I'll also work at getting you even
a minimally functional system ASAP.

	Rob
-------
 9-Sep-84 10:57:19-PDT,502;000000000001
Mail-From: KRONJ created at  9-Sep-84 10:57:18
Date: Sun 9 Sep 84 10:57:18-PDT
From: David Eppstein <[email protected]>
Subject: KCC improvement
To: [email protected], [email protected], [email protected]

I was bored today, so I fixed type coercions that don't produce any code
to also not interfere with constant folding and such things.  C1.C now
produces somewhat better code, and the part of the compiler responsible
for such things is now somewhat more readable.
-------
10-Sep-84 18:19:03-PDT,2286;000000000001
Return-Path: <@seismo.ARPA:[email protected]>
Received: from seismo.ARPA by SU-SIERRA.ARPA with TCP; Mon 10 Sep 84 18:18:56-PDT
Return-Path: <[email protected]>
Received: from rlgvax.UUCP by seismo.ARPA with UUCP; Mon, 10 Sep 84 21:20:37 EDT
Received: by rlgvax.UUCP; Mon, 10 Sep 84 19:26:02 EDT
Date: Mon, 10 Sep 84 19:26:02 EDT
From: Guy Harris <[email protected]>
Message-Id: <[email protected]>
To: [email protected]
Subject: Re: V7 C vs. PCC
In-Reply-To: your article <[email protected]>

> I have a C compiler for the DEC-20. It faithfully parses what is
> specified in Kernighan and Ritchie's book. The problem with this is that
> it can't parse much code that comes from Berkeley....
> 
> From K+R, page 197:
> 	"The names of members and tags may be the same as ordinary
> 	variables. However, names of tags and members must be mutually
> 	distinct.
> 
> 	Two members may share a common initial sequence of members; that
> 	is, the same member may appear in two different structures if it
> 	has the same type in both and if all previous members are the
> 	same in both."
> 
> My question becomes:
> 	Why did PCC diverge from this condition? What other features
> are in PCC that aren't in K+R's V7 compiler?

This change was made in UNIX System III (TM blah blah blah legal bull****).
4.1BSD (and 4.2BSD) uses the same C compiler as VAX-11 S3 (modulo "flexnames"
and a bugfix relating to quantities less than 32 bits long in registers;
the fix was not to put them in registers).  The change was made because
it's a royal pain having structure member names being in a global name space;
that's too big a name space, so they are now local to each structure.

The "void" pseudo-data-type was also added in the S3 compiler.  You can
cast an expression to "void", which tells "lint" (although, it seems, not
the compiler) that the result of the expression is being discarded; this
shuts "lint" up about return values of functions not being used.  You can
also declare a function as returning "void" in which case it is known not
to return a value.

You can now say "unsigned char" and get 8 (or 7 or 9, whichever, in your
case) bits of unsigned data.

	Guy Harris
	{seismo,ihnp4,allegra}!rlgvax!guy
11-Sep-84 09:15:23-PDT,282;000000000001
Mail-From: KRONJ created at 11-Sep-84 09:15:21
Date: Tue 11 Sep 84 09:15:21-PDT
From: David Eppstein <[email protected]>
Subject: common subs with adjbp are now folded
To: [email protected]
cc: [email protected]

...making c1.fai even better than before.
-------
14-Sep-84 08:16:58-PDT,1008;000000000001
Mail-From: KRONJ created at 14-Sep-84 08:16:56
Date: Fri 14 Sep 84 08:16:56-PDT
From: David Eppstein <[email protected]>
Subject: printf()
To: [email protected]

Ok, %o and %x now are unsigned.  My solution was to do them in separate
routines using shifts and ands.  Decimal unsigned printout still doesn't
work - I'm not sure what the best approach to that is.

One thing I had been doing when adding new modules to the library was
keeping them sorted, such that if module foo called routines from
module bar, foo would be before bar in the list.  Presumably this
would make things load quicker because only one pass through the
library would be needed.  You might want to do this for your recent
modules too.
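The ordering rule (caller before callee) is a topological sort of the call graph between modules. A small sketch, assuming the dependencies are known as a matrix and contain no cycles; MAXMOD, the matrix, and the function names are illustrative only.

```c
#define MAXMOD 16

static int visited[MAXMOD];

/* Depth-first walk: place module i only after every module it calls
   has been placed, filling the order array from the back. */
static void visit(int i, int n, int calls[][MAXMOD], int *order, int *pos)
{
    int j;

    if (visited[i])
        return;
    visited[i] = 1;
    for (j = 0; j < n; j++)
        if (calls[i][j])
            visit(j, n, calls, order, pos);
    order[--(*pos)] = i;
}

/* calls[i][j] != 0 means module i calls a routine in module j.
   Fills order[0..n-1] so that every caller precedes its callees. */
void library_order(int n, int calls[][MAXMOD], int *order)
{
    int i, pos = n;

    for (i = 0; i < n; i++)
        visited[i] = 0;
    for (i = 0; i < n; i++)
        visit(i, n, calls, order, &pos);
}
```

If foo calls bar and bar calls baz, the result lists foo, bar, baz, so a single forward pass over the library finds each module after the request for it has been seen.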

Before you next reload the compiler: The assembly output is going to look
pretty ugly with 777777777777(17) where it used to have -1(17) and so on.
Perhaps the routines to output such things (in ccout) should be improved
to get rid of the sign themselves...
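A sketch of what such an output routine might do: treat the value as a 36-bit two's-complement word and print a minus sign plus the magnitude. The word is held in a uint64_t here, and format_offset is a hypothetical name, not the routine in ccout.

```c
#include <stdio.h>
#include <stdint.h>

#define WORD_MASK 0777777777777ULL   /* 36 one bits */
#define SIGN_BIT  0400000000000ULL   /* sign bit of a 36-bit word */

/* Format a 36-bit two's-complement value as signed octal, e.g. "-1".
   Returns what snprintf returns. */
int format_offset(uint64_t word, char *buf, size_t size)
{
    word &= WORD_MASK;
    if (word & SIGN_BIT)
        return snprintf(buf, size, "-%llo",
                        (unsigned long long)((WORD_MASK + 1) - word));
    return snprintf(buf, size, "%llo", (unsigned long long)word);
}
```

Given 777777777777 this produces "-1", so the listing would read -1(17) again.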
-------
14-Sep-84 15:49:52-PDT,1344;000000000011
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Fri 14 Sep 84 15:49:50-PDT
Date: Fri 14 Sep 84 18:49:44-EDT
From: David Eppstein <[email protected]>
Subject: another small program to port
To: [email protected]

would be pwd.  I think it only requires one as-yet unimplemented runtime
(the one that does all the work of course).  My feeling is the format
of the printed directory should be close to that for Unix - some programs
may expect to be able to add slashes after the name or some such feature.
So for instance it might look like "ps:/kcc/unix/bin"... or maybe you could
hack _gtjfn() to allow "/ps/kcc/unix/bin" the way PCC does.
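A rough sketch of the first format, turning a TOPS-20 directory spec such as PS:<KCC.UNIX.BIN> into "ps:/kcc/unix/bin". The function name and interface are invented for the example; it does no validation and is not the runtime call pwd would actually use.

```c
#include <ctype.h>
#include <stddef.h>

/* Convert "PS:<KCC.UNIX.BIN>" to "ps:/kcc/unix/bin".
   Returns 0 on success, -1 if out is too small. */
int dir_to_unix(const char *t20, char *out, size_t size)
{
    size_t n = 0;

    if (size == 0)
        return -1;
    for (; *t20 != '\0'; t20++) {
        char c = *t20;

        if (c == '>')
            continue;                 /* closing bracket has no counterpart */
        if (c == '<' || c == '.')
            c = '/';                  /* directory separators */
        else
            c = (char)tolower((unsigned char)c);
        if (n + 1 >= size)
            return -1;
        out[n++] = c;
    }
    out[n] = '\0';
    return 0;
}
```

The result has no trailing slash, so a program can append "/name" to it the way it would on Unix.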

Date: 16 Sep 1984  12:33 EDT (Sun)
From: David Eppstein <[email protected]>

    Date: Saturday, 15 September 1984  19:23-EDT
    From: Greg Satz <SATZ at SU-SIERRA.ARPA>

    as far as PWD would go: on Unix it gets the path in chunks
    by opening "." and ".." till it gets to the root. It is real ugly
    and slow. Yes, it would be worth doing a unix => tops-20 and
    vica versa filespec converter.

No, look at <UNIX.WORK>PWD.C.  It does one runtime call.
_gtjfn() already does a reasonable job of unix=>t20,
but there are some features (like ~ and structure names)
that it doesn't know about...
15-Sep-84 23:15:56-PDT,6350;000000000001
Date: Sat 15 Sep 84 23:15:56-PDT
From: Greg Satz <[email protected]>
Subject: more KRONJ sends
To: *"PS:<KCC.CC>MAIL.TXT.1"@SU-SIERRA.ARPA
Phone: (415) 497-1004

KRONJ, TTY146, 14-Sep-84 8:24AM
you got yacc to work?

KRONJ, TTY146, 14-Sep-84 8:26AM
it is global, but you forgot the entry stmt for it in time.c

KRONJ, TTY146, 14-Sep-84 8:27AM
(so if time were loaded it would get found but link doesn't know to load time).
  yeah, it is a pain, but needed because fail and kcc are both very one-pass

KRONJ, TTY146, 14-Sep-84 8:28AM
at least it's only necessary for library modules - normal programs can do
without it

KRONJ, TTY146, 14-Sep-84 8:30AM
in clib./mic

KRONJ, TTY146, 14-Sep-84 8:30AM
not alpha order, order by who calls which

KRONJ, TTY146, 14-Sep-84 8:31AM
that might be useful

KRONJ, TTY146, 14-Sep-84 8:35AM
sources for working unix programs should probably get moved
from <kcc.unix.work> to <kcc.unix.src> ...

KRONJ, TTY146, 14-Sep-84 8:44AM
no, since the stack grows up rather than down it's kinda hard.
what sbrk does is steal a piece from the far end of the stack
(or in extended addressing from its own sections).  so it
follows the unix version in returning an appropriate sized
piece of memory, but there is not much guarantee what order the
pieces come in.

KRONJ, TTY146, 14-Sep-84 8:44AM
why should anyone need brk?

KRONJ, TTY146, 14-Sep-84 8:45AM
yeah, it really confuses the unix malloc() - had to write my own.

KRONJ, TTY146, 14-Sep-84 8:47AM
if we always use extended addressing and fix code and stack in
sections one and two (rather than nonextended or letting them
be in any section the way it is now) we could do brk fairly
easily...

KRONJ, TTY146, 14-Sep-84 8:51AM
the main problem is on those machines the stack grows down so memory
can grow up into it.  here the stack grows up so memory grows down.

KRONJ, TTY146, 14-Sep-84 8:51AM
was that supposed to be an empty send?

KRONJ, TTY146, 14-Sep-84 8:54AM
in this implementation the data size before losing is half a section.
i suppose we could make brk steal from above the symbol table or
something.  then all it would interfere with would be ddt.

KRONJ, TTY146, 14-Sep-84 8:55AM
(and for extended addressing, set up stack in sec. 1 and code in sec. 2
(would it work to have stk in 0 and code in 1?))

KRONJ, TTY146, 14-Sep-84 8:58AM
is there any good way to find the top of high seg (for non extended addr)?
i have a hardware manual nearby - maybe it will say.

KRONJ, TTY146, 14-Sep-84 8:59AM
it seems to have a brk but i don't know how it fits in with our memory
structure.  for a good time look at how i set up the stack pointer.

KRONJ, TTY146, 14-Sep-84 9:03AM
global stack ptr's gotta be to a non-zero section.  so if we do brk
that way the code and stack have to be in sections 1 and 2 (either order)
so they don't get in the way later when allocated memory grows.

KRONJ, TTY146, 14-Sep-84 9:08AM
if we stick stack in sec.1 rather than sec.31 then we don't have
to know how many sections there are (i.e. it will work for a
machine that uses full 30-bit addrs with no change).

for non-extended what's wrong with growing data above the top
of code?

KRONJ, TTY146, 14-Sep-84 9:13AM
it already maintains two separate cases.  both of the new cases
would be simpler than both of the old cases (all they have to
watch out for is section overflow, rather than the current
problems of hacking up the stack pointer so the stolen space
is no longer accessible, and finding free sections with
arbitrary sections in use already).  if you fill section 0
you lose.

KRONJ, TTY146, 14-Sep-84 9:13AM
just like if you fill all 31 extended sections you lose

KRONJ, TTY146, 14-Sep-84 9:16AM
code and static data, right.  i don't know how to make fail and link
do any better.

KRONJ, TTY146, 14-Sep-84 9:19AM
and it has the advantage that you don't have to recompile or anything
to change a program from non-extended to extended.

how does that work?  the monitor has to go through all sorts
of hackery to get more than one section, and it still only
gets one section of code.  is there a way to get more than
one with macro?

actually, one section of code isn't particularly limiting.  the
limiting part is  that the static data also has to be in the
same section.

KRONJ, TTY146, 14-Sep-84 9:27AM
actually they only need to use sbrk(), since the space is never
going to be freed.  the other problem is the compiler often
makes assumptions that code is in the same section as static
data.  i suppose if it kept track of how much data it was making
it could know to start doing extended addressing things and
making OWGBPs and EFIWs and all those good things.  currently
it has to assume that the program should run both extended and not,
so everything has to be local (or constructed from tables in
the runtimes that make byte pointers).

KRONJ, TTY146, 14-Sep-84 9:28AM
and the compiler can't always know, because you could put things
in separately compiled modules.  sigh.

KRONJ, TTY146, 14-Sep-84 9:30AM
it would be difficult.  i'm not sure it could be done efficiently at
all if you add the constraint that the program should still run unextended.

KRONJ, TTY146, 14-Sep-84 9:31AM
and if you don't make that constraint, you have to tell the compiler
to use extended addressing for each module you use.  that includes
having separate sets of runtimes for extended and not.

KRONJ, TTY146, 14-Sep-84 9:32AM
on the whole, it's probably easier just to convert those large arrays
to pointers initialized with a call to sbrk.

KRONJ, TTY146, 14-Sep-84 9:35AM
oh, another advantage of the memory allocation scheme we were discussing
(using brk and overflowing sections):  it would allow allocated arrays
to be bigger than a section.  i don't think the current one does that.

KRONJ, TTY146, 14-Sep-84 9:38AM
shouldn't be too hard to change...all you have to do is change
sbrk() and the code that sets up extended addressing in the first
place, both i think in tops20.fai.  you could also flush the
routine to find an empty section, since with the new scheme
you just increment the break and create all sections needed above
the previous high water mark.

KRONJ, TTY146, 14-Sep-84 9:38AM
ok
-------
17-Sep-84 19:07:53-PDT,4297;000000000011
Return-Path: <@COLUMBIA-20.ARPA:GINGELL@CWR20B>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Mon 17 Sep 84 19:07:41-PDT
Received: from CWR20B by CUCS20 with DECnet; 17 Sep 84 22:08:00 EDT
Date: Mon 17 Sep 84 22:07:56-EDT
From: Rob Gingell <GINGELL@CWR20B>
Subject: Re: runtime status
To: SATZ%SU-SIERRA@CUCS20
In-Reply-To: Message from "Greg Satz <[email protected]>" of Mon 17 Sep 84 21:08:05-EDT

Well, the runtimes go well, but not as well as I had hoped (or else
you'd have something to play with).  I am now testing all the I/O
calls from UPM section 2, with the exception of pipes.  I'm trying
to get enough going so that I can give you a version which will do:

	access, alarm, brk, chdir, chmod, close, creat,
	dup (and dup2), exit, getpid, getuid, ioctl (limited in
	form and for tty's only), kill (though it's not terribly
	useful without fork()), lseek, open, pause, read, 
	signal, stat (fstat), sync, time(ftime), times
	(though again of marginal use without fork()), umask,
	unlink, utime, and write.  Also, all of the super-user
	calls are done (whoopee), they just return EPERM errors.

This works out to most everything that is useful without having fork()
and exec?().  The only hangup with doing those has been time -- I decided
to concentrate on I/O first.  I/O should be working well to disk and
TTY by the time you get it.  If you try to do I/O to a directory you'll
(unless I get a lot done) get a panic.  I do expect I/O to directories
to work eventually -- it should thus be possible to port a shell.  If
someone wants to port the Bourne shell, it should not be too hard.  However,
the Bourne shell assumed that all char *'s used for its free space
were even, and used the LSB's to hold status information.  Clearly,
this won't work with the PDP-10.  When I ported the Bourne shell to the
Harris machines, there were spare bits in the byte pointer used by Harris
which I could use for the same purpose, however, no such luck with the
PDP-10 -- all the bits are defined.  (Unless, of course, you always use
OWGBP's and assume section numbers will never get bigger than 37(8)!)

All of the above are coded, and I'm testing now.  Given how well I've
hit dates so far, my confidence in giving you a date I can hold is
draining fast, but I'm shooting for this weekend.

I wasn't doing any of the UPM(3) stuff, just an emulator which responds
to a set of entry points with behavior similar to what the kernel does
with its corresponding entry points.  I'll be happy to try doing them
when I get the rest of the thing finished.

As for memory management, I struggled for a while to figure out how I
should emulate the three segment address space normally provided by
UNIX.  At the moment, I don't try.  brk is implemented, but it is a
no-operation, unless you reduce the known break.  That is, the first
time you do brk(x), it will remember x.  When you later do brk(x'), if
x'-x is negative, it will delete the pages between them.  If x'-x is
non-negative, it does nothing.  In either case, it remembers x'.  One
of the big problems with enforcing the 3 segment model is that the
a.out header of a UNIX executable provides the kernel with the base
information needed for doing it, whereas I have nothing.  The emulator
shouldn't really go poking around in the Job Data Area because it's
not proper to assume that it's been set up right.  What I finally
decided to do was punt.  However, if it is desirable to enforce the
3-segment model, I think the mechanism might work as follows:

     1.	The emulator will support a non-UNIX-defined entry point
	which provides the information one would find in an a.out
	header about the segments.
     2. The routine which calls main(), will invoke this prior
	to calling main() to provide the emulator with the base
	information.
     3. The emulator will enable the non-existent page PSI channel,
	and will manage the segments just like UNIX would do on an
	interrupt.  That is, extend the stack to some limit, or invoke
	a memory fault as a signal (segmentation or bus error, I forget
	which makes sense).
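The current brk behavior described above (remember the break, delete pages only when it moves down) can be sketched as follows. The 512-word page size is TOPS-20's; which pages count as "between" the two breaks is an assumption here (those whose first word lies in the released range), and the function returns a page count instead of actually unmapping anything.

```c
#define PAGE_WORDS 512UL   /* TOPS-20 page size in words */

static unsigned long known_break = 0;
static int have_break = 0;

/* Model of the emulator's brk: returns how many pages the call
   would delete. Growing the break is a no-operation. */
unsigned long emulated_brk(unsigned long x)
{
    unsigned long released = 0;

    if (have_break && x < known_break) {
        /* pages whose first word lies in [x, known_break) */
        unsigned long first = (x + PAGE_WORDS - 1) / PAGE_WORDS;
        unsigned long last  = (known_break + PAGE_WORDS - 1) / PAGE_WORDS;

        released = last - first;
    }
    known_break = x;       /* remembered in either case */
    have_break = 1;
    return released;
}
```

For example, brk(5000) followed by brk(3000) releases four pages in this model, and a later brk(8000) releases none.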

--

In any case, I'm sorry this has taken so long -- I'm hopeful you will
find it very worthwhile in the long run.

	Rob

-------
18-Sep-84 06:57:52-PDT,518;000000000001
Mail-From: KRONJ created at 18-Sep-84 06:57:52
Date: Tue 18 Sep 84 06:57:52-PDT
From: David Eppstein <[email protected]>
Subject: getpid()
To: [email protected]

now that you have a getpid() that does something more reasonable
than merely returning .FHSLF, you can probably use the UNIX mktemp()
rather than the one I rewrote to make up for that deficiency.

question: is anyone going to care that the value returned by
getpid() is different than that returned and used by fork() and wait()?
-------
18-Sep-84 13:30:05-PDT,8618;000000000001
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Tue 18 Sep 84 13:29:37-PDT
Date: 18 Sep 1984  16:29 EDT (Tue)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Bug-KCC@Sierra
Subject: KCC memory management

    Date: Tuesday, 18 September 1984  15:04-EDT
    From: Greg Satz <SATZ at SU-SIERRA.ARPA>

    Was the final idea to generate EXTENDED always and to put the code
    in section 0, the stack in 1 and let memory grow upwards? I have
    the sends.txt in a message in <kcc.cc>mail.txt and I think that
    is what we settled on. YACC is not working because of memory,
    and neither will SORT, so we need to settle this.

I don't see why we can't still keep unextended addressing.
My idea of what we settled on was:

Unextended:

  brk() and sbrk() start allocating memory above the top of code (code
  being in the high seg as it currently is).  when it runs into the
  top of section zero you lose (needs an explicit check unless you
  want to start randomly munging ACs).  if you run into DDT you also
  lose, but we probably don't want to check for that case.

  We can't salvage a program that has run out of memory by jumping
  into extended addressing, because by then there are too many local
  pointers running around that would need to be made global (not to
  mention moving the stack into a different section and figuring out
  which places point into the stack and need to be changed).

  What we do now in the unextended case is allocate memory from the
  far end of the stack; unfortunately this forces allocated memory to
  grow down rather than up.  With the new scheme it would grow up the
  way UNIX programs expect and brk() could be made to work.

Extended:

  Neither the code nor the stack can be in section zero.  If the stack
  pointer points to section zero then it will be treated as local to
  the section the code is in, and if the code is in section zero it
  will not be able to access data or stack in any other sections.  So
  they should both be in sections one and two, and to be consistent
  with most of the rest of the extended addressing world we should
  probably put code in section one and stack in section two.

  brk() and sbrk() will allocate memory in sections three and above,
  creating sections as needed.  when you run into the top of extended
  memory you lose; probably this can be noticed by a jsys error when
  you attempt to create the new section.

  The extended case is essentially what we have already except that we
  are adding a restriction on which sections code and stack go in.
  This also means that we could allocate blocks of memory larger than
  a section; this is not currently the case.

In neither of these cases is the order of data, code, stack, and
allocated memory exactly the way it is on UNIX.  brk() depends on
allocated memory growing up, and I think the UNIX malloc() might
depend on allocated memory being above data.  I hope the above schemes
got the order right for malloc() so we can flush my version (although
then we might have ATT licensing problems), and I also hope nobody
else has any harder to satisfy dependencies.

If we use these schemes brk() will need to know what section it is
running in, and sbrk() should be rewritten in terms of brk().  Neither
of these is particularly difficult.  sbrk() can probably be written in
C and moved to RUNTM.C; it is called in RUNTM so it shouldn't get its
own module.  brk() should probably stay in TOPS20.FAI, since it is in
assembly and should always be included in the loaded runtimes (for the
call to sbrk in RUNTM).
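A sketch of sbrk() written in terms of brk(), with the explicit top-of-memory check. A static array stands in for the space between the top of code and the top of the section, the break is kept as an offset, and failure is reported with NULL where the UNIX sbrk would return -1; all names and sizes are made up for the illustration.

```c
#include <stddef.h>

#define ARENA_SIZE 4096            /* stand-in for the room above code */

static char arena[ARENA_SIZE];
static size_t break_off = 0;       /* current break, as an offset */

/* Set the break; fail rather than run past the top. */
int my_brk(size_t off)
{
    if (off > ARENA_SIZE)
        return -1;
    break_off = off;
    return 0;
}

/* Move the break by incr bytes and return the old break,
   or NULL if the request does not fit. */
char *my_sbrk(long incr)
{
    size_t old = break_off;
    long want = (long)old + incr;

    if (want < 0 || my_brk((size_t)want) != 0)
        return NULL;
    return arena + old;
}
```

Because memory grows upward from a single break, successive my_sbrk() calls return adjacent, ascending pieces, which is the ordering the UNIX malloc() is said above to depend on.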

As far as I can tell the only other change needed would be to make the
extended addressing startup force section one for the code and section
two for the stack.  The routine and data structures currently used to
find a free section can be flushed.

Date: Tue 18 Sep 84 13:53:46-PDT
From: Greg Satz <[email protected]>

I followed that much, but what I didn't get was where to put the
stack in the non-extended case. Should it go at the end of section 0
or for some determined length after the code?

Date: Tue 18 Sep 84 22:03:47-PDT
From: Greg Satz <[email protected]>

	From: David Eppstein <[email protected]>
	Subject: Where to put the stack

	Why not keep it where it is now: between the top of data and the
	bottom of code.

I am not sure what you mean here. The Unix loader creates a binary with
three components: text, data, and bss. The text is the code. Data is
initialized data. And bss is blank data. The kernel at startup reads
offsets into the file for each of these areas and lays them out in
memory depending on whether things are supposed to be shared or not,
etc. On the Vax, it starts the stack at 7fff0000 hex and on the PDP-11
it starts it in the last 8k segment (out of 8 segments) and both work
downward. Unix defines a bunch of constants (end(3)) that point to
various boundaries so things like brk and sbrk can work. For example,
_end always points to the end of the program. _etext points to the end
of the text segment.

For the non-extended case, we have code, initialized data and
non-initialized, all in the beginning of section zero. The stack starts
right after the code now. Where shall I start the growing memory?  How
much stack space should I allocate or if memory space should be after
the code and stack after the memory, how much memory space?  I guess I
should pick some number, but I wondered if there was some way that
wasn't so arbitrary.

Date: Tue 18 Sep 84 22:11:01-PDT
From: Greg Satz <[email protected]>

I should add one more thing just to be complete: all of the arguments
passed into the program via argv and all of the environment variables
are stored in the very end of stack space. Just before the end of the
address space, working upward, an array of pointers to the argvs are
stored, then a zero, then an array of pointers to the environment
strings are stored, then another zero. The strings themselves, null
terminated of course, follow the pointers. The last word of the address
space, following the strings, should be a zero.

I don't think it is necessary that we emulate this behavior.

Date: 19 Sep 1984  09:44 EDT (Wed)
From: David Eppstein <[email protected]>

    "The stack starts right after the code now".

This is blatantly untrue.  The stack starts right after data, and
extends to the beginning of code (usually 400000).  I admit that the
code in TOPS20.FAI to set up the stack is a little opaque, but that's
the result it comes up with.

Thus if you grow allocated memory after code, you won't have any
conflict because the stack is before code.  Or alternately you could
move the stack to after code and allocate memory before code, but why
make unnecessary changes?

If I have made myself unclear, here is a picture:

           ______________________
     0    |                      |
          |   Registers          |
          |______________________|
    20    |                      |
          |   Job Data Area      |
          |______________________|
   140    |                      |
	  |   Initialized and    |
          |   uninitialized data |
          |______________________|
          |                      |
          |   PAT..              |
          |______________________|
          |                      |
          |   Symbol table       |
          |______________________|
          |                      |
          |   Stack              |
          |______________________|
          |                      |
          |   Allocated memory   |
	  |   in current scheme  |
	  |   (taken from stk)   |
          |______________________|
400000    |                      |
          |   Code               |
          |______________________|
          |                      |
          |   Allocated memory   |
          |   in proposed scheme |
          |______________________|
740000    |                      |
	  |   DDT (or more       |
	  |   allocated memory)  |
          |______________________|

Extended addressing is just the same, except that this is now section
one, and the stack and allocated memory are in different sections.
There is some switch you can give LINK to make the code start at some
other address than 400000.

Currently the space for the argc/argv stuff is created with sbrk().
23-Sep-84 16:49:00-PDT,392;000000000001
Mail-From: SATZ created at 23-Sep-84 16:48:57
Date: Sun 23 Sep 84 16:48:57-PDT
From: Greg Satz <[email protected]>
Subject: kcc char arrays and pointers
To: [email protected]
cc: [email protected]
Phone: (415) 497-1004

Why did you make string constants ASCIZ while char arrays are 9-bit?
This foils attempts to deal with strings like:
strncpy(foo, "hello", 6);
-------
23-Sep-84 22:20:35-PDT,4018;000000000001
Return-Path: <[email protected]>
Received: from SU-SCORE.ARPA by SU-SIERRA.ARPA with TCP; Sun 23 Sep 84 22:20:32-PDT
Date: Sun 23 Sep 84 22:20:05-PDT
From: Len Bosack <[email protected]>
Subject: Oh no! I may bash the C compiler...
To: [email protected], [email protected]

Here are some thoughts after looking at how things work. I'm not really
comfortable mucking around yet...

initialize()
{
 xchr[32]=' ';
 xchr[33]='!';
 xchr[34]='\"';
 xchr[35]='#';
initia:
	ADJSP	17,3
	MOVEI	3,40
	DPB	3,[331100,,xchr+10]
	MOVEI	6,41			;make these MOVEI r,040041
	DPB	6,[221100,,xchr+10]	; HRLM r,xchr+10
	MOVEI	11,42
	DPB	11,[111100,,xchr+10]
	MOVEI	14,43			;likewise MOVEI r,042043
	DPB	14,[1100,,xchr+10]	; HRRM r,xchr+10

Can the MOVEI be something else, like SETx?
How to do screw case where P field pattern is 33,00,22,11 or 00,33,11,22?
Note the optimization to turn LDB/DPB r,[xx2200,,addr] into HxRx r,addr
 already exists.
Should the general case be taken care of for all adjacent byte fields? Just
 adjacent same-size fields?
What about OWG pointers?

Then we do:
	MOVEI r,040041
	HRLM r,xchr+10
	MOVEI r,042043	;turn into MOVE r,[040041042043]
	HRRM r,xchr+10	; MOVEM r,xchr+10

The order of the HRLM and HRRM shouldn't matter.
Should also do HRL r,+HRR r,-> MOVE r, in any order.

What to do:
In CODE1(?), when you go to emit a DPB:
 Does the specified reg have a constant in it?
  if no, punt;
  if yes, search backwards for another DPB to an adjacent byte in the same word
   punt if none.
   Does the found DPB have a constant in its reg?
   if no, punt;
   if yes, fixup the constant load for the found DPB to build the wider const,
    then fixup the found DPB pointer to cover both bytes, then delete the
    current constant load and don't emit anything more. Let the peepholer look
    some more.
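
The constant arithmetic behind the fold described above can be sketched in
C.  This is only an illustration of how the merged constants are built; the
names word36, pack_bytes, pack_halves, and same_halfword are hypothetical
and are not KCC routines:

```c
#include <stdint.h>

typedef uint64_t word36;  /* holds one 36-bit PDP-10 word in the low bits */

/* Combine two adjacent 9-bit byte constants into one 18-bit halfword. */
static word36 pack_bytes(word36 hi, word36 lo)
{
    return ((hi & 0777) << 9) | (lo & 0777);
}

/* Combine two 18-bit halfword constants into one 36-bit word. */
static word36 pack_halves(word36 left, word36 right)
{
    return ((left & 0777777) << 18) | (right & 0777777);
}

/* Two size-9 DPBs can merge into a halfword store only if their P fields
   (bits to the right of the byte) name the two bytes of the same
   halfword: 27 and 18, or 9 and 0 (33,22 and 11,00 in octal). */
static int same_halfword(int p1, int p2)
{
    int hi = p1 > p2 ? p1 : p2;
    int lo = p1 > p2 ? p2 : p1;
    return (hi == 27 && lo == 18) || (hi == 9 && lo == 0);
}
```

For the xchr example, pack_bytes(' ', '!') gives 040041 and
pack_bytes('"', '#') gives 042043, matching the MOVEI constants shown
earlier in this message.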

After the above is seen to work (the results may be halfword instructions or
fullword byte operations. I think the P sequence 33,22,11,00 gives halfword
instrs while 22,11,33,00 will end up with a fullword BP.)

Add a check for fullword byte operations. Use MOVE/M as simple replacement.
Let peepholer keep going, as something may now fold with it.

Add cases to check for halfword instrs that can be merged.
(Do the current LDB/DPB -> HxRy folds do it right? Would the results be
merged? Do they need to call CODEx instead of smashing the OP?)

Date: 24 Sep 1984  10:50 EDT (Mon)
From: David Eppstein <[email protected]>

You probably want to make the change either in code2() or localbyte().
All DPBs to byte pointers that are constructed on the spot (as opposed
to being already in some variable) should get optimized into locals
through those two routines.  Those byte pointers can have index fields
and indirect bits, so turning them back into word pointers is not
completely trivial.

The sources have moved to Sierra, so if you do anything on Score be
sure to update things...

While you're playing with halfwords, you might also want to think
about turning
	SETZ	R,
	HRRM	R,X
into
	HLLZS	X

and combining that with the other halfword operations.  You would have
to be careful to keep a zero in R, but make that instruction go away
(probably in genrelease() where it already flushes useless MOVEs).
Any newly introduced opcodes should be taught to CCCSE.  Another thing
to do in that respect is make changes to incompatible parts of memory
not be treated as possible aliases.  For instance, a HRRM can not
possibly affect a HLRZ, and similarly with disjoint byte pointers.
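
A small C model of the halfword operations mentioned above, assuming a
36-bit word held in the low bits of a 64-bit integer.  The function names
mirror the opcodes but are otherwise hypothetical:

```c
#include <stdint.h>

typedef uint64_t word36;

#define RH_MASK ((word36)0777777)
#define LH_MASK ((word36)0777777 << 18)

/* HRRM R,X: right half of R replaces right half of X; left half kept. */
static word36 hrrm(word36 r, word36 x)
{
    return (x & LH_MASK) | (r & RH_MASK);
}

/* HLLZS X: keep the left half of X, zero the right half. */
static word36 hllzs(word36 x)
{
    return x & LH_MASK;
}

/* HLRZ R,X: left half of X into the right half of R, zeros on the left. */
static word36 hlrz(word36 x)
{
    return (x & LH_MASK) >> 18;
}
```

In this model hrrm(0, x) equals hllzs(x) for every x, and hlrz() of a word
is unchanged by any hrrm() on it, which is the non-aliasing property
described for CCCSE.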

Date: 24 Sep 1984  10:53 EDT (Mon)
From: David Eppstein <[email protected]>

The other thing you have to be careful to do is keep the MOVEI of the
second constant around when you do the two-dpb fold, but switch it
with the DPB so that genrelease() will know to get rid of it once it
knows the value is not re-used.  I don't think you have to change
genrelease() to be able to do this.
24-Sep-84 10:12:15-PDT,973;000000000001
Return-Path: <@COLUMBIA-20.ARPA:GINGELL@CWR20B>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Mon 24 Sep 84 10:12:12-PDT
Received: from CWR20B by CUCS20 with DECnet; 24 Sep 84 13:15:10 EDT
Date: Mon 24 Sep 84 13:13:46-EDT
From: Rob Gingell <GINGELL@CWR20B>
Subject: Re: runtime status
To: SATZ%SU-SIERRA@CUCS20
In-Reply-To: Message from "Greg Satz <[email protected]>" of Tue 18 Sep 84 14:47:27-EDT

Just a quick status update.  I am running a few final tests (though today
has been difficult because Monday is often drained away in meetings), and
after they are complete I will set up a distribution somewhere from which
you can FTP them away.  I will also try (though I may send this later)
to provide a "modified" section 2 of UPM Volume I to describe particulars
of the implementation for people trying to use it.

Will advise when all this is ready, just wanted you to know that daylight
is finally becoming visible.

	Rob
-------
24-Sep-84 08:29:19-PDT,467;000000000001
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Mon 24 Sep 84 08:29:13-PDT
Date: Mon 24 Sep 84 11:30:18-EDT
From: David Eppstein <[email protected]>
Subject: what to do with %
To: [email protected]

how about _ for periods, lower case x for %, and lower case s for $?
Then you would get symbols like
	GJxFOU	sGTLCL	FLDDB_ ...
None of these are very likely to get accidentally used...
-------
24-Sep-84 14:41:08-PDT,968;000000000001
Mail-From: SATZ created at 24-Sep-84 14:40:59
Date: Mon 24 Sep 84 14:40:58-PDT
From: Greg Satz <[email protected]>
Subject: Possible compiler bug
To: [email protected]
cc: [email protected]
Phone: (415) 497-1004

I spent most of yesterday trying to figure out why yacc isn't working
and I think I found something strange. In the work.yacc directory,
look in the file y1.c and y1.fai. In the routine closure() there is
a statement "c -= NTBASE;"; I think it corresponds to the label 326::
in the fail file. The variable c is -12 (octal) off the stack. The
code generated after the label is "addb 14,-12(17)". Unfortunately,
register 14 contains zero and not -NTBASE (which is 4096 decimal).

You could probably dig this out faster but I thought I would
let you know in case you have any free time. Also, if you do have
time and want to implement some of Len's suggestions, please do.
I probably won't be able to get to them soon.
-------
24-Sep-84 16:56:16-PDT,1185;000000000001
Mail-From: KRONJ created at 24-Sep-84 16:56:13
Date: Mon 24 Sep 84 16:56:12-PDT
From: David Eppstein <[email protected]>
Subject: Definite KCC bug
To: [email protected]
cc: [email protected]

<KCC.UNIX.WORK.YACC>TEST.C compiles wrong.  Various similar
permutations do the same thing.  If that doesn't narrow the
problem down enough for you to get it, here is my guess:
In code0() when it makes an addition it sometimes likes to
rearrange a previous sequence of moves and adds.  So it takes
	MOVNI	R,const
	ADDB	R,var
	ADDI	R,array
changes the order of the first two (not noticing the B on the ADD):
	ADDB	R,var
	MOVNI	R,const
	ADDI	R,array
and folds the last two to come up with the final result
	ADDB	R,var
	XMOVEI	R,array-const

First, when it does the first switch it should keep the ops
in the same order, so you should be seeing
	MOVEB	R,var
	SUBI	R,const

Second, it shouldn't be doing the switch at all because of the B.
If it were a simple ADD this switch would cause the constants
to get folded together, which is a good thing, but switching an ADDB
obviously changes the program's behavior, which is bad.

Hope this helps.
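
To make the behavior change concrete, here is a hedged C model of the two
instruction orders above.  Addresses are treated as plain integers, and the
struct and function names are hypothetical:

```c
struct state {
    long r;    /* register R */
    long var;  /* memory word "var" */
};

/* ADDB R,var: add var to R and store the sum in both R and var. */
static void addb(struct state *s)
{
    s->r += s->var;
    s->var = s->r;
}

/* Intended order: MOVNI R,const ; ADDB R,var ; ADDI R,array */
static struct state intended(struct state s, long konst, long array)
{
    s.r = -konst;   /* MOVNI */
    addb(&s);
    s.r += array;   /* ADDI */
    return s;
}

/* Miscompiled order: ADDB R,var ; XMOVEI R,array-const */
static struct state miscompiled(struct state s, long konst, long array)
{
    addb(&s);               /* adds whatever was left in R */
    s.r = array - konst;    /* XMOVEI with the folded constant */
    return s;
}
```

If R happens to hold zero on entry, the miscompiled order leaves var
unchanged, which is consistent with the yacc symptom reported earlier.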
-------
24-Sep-84 19:26:34-PDT,491;000000000001
Mail-From: SATZ created at 24-Sep-84 19:26:33
Date: Mon 24 Sep 84 19:26:33-PDT
From: Greg Satz <[email protected]>
Subject: optimization
To: [email protected]
Phone: (415) 497-1004

Is the compiler supposed to work with the -n switch on (i.e. without
optimizing)? Granted the code is pretty ugly and you probably never
want to do it, but for testing and debugging purposes. Yacc gets an
input error on a known good input file when compiled with optimization
off.
-------
25-Sep-84 07:01:49-PDT,755;000000000001
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Tue 25 Sep 84 07:01:46-PDT
Date: 25 Sep 1984  10:04 EDT (Tue)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Greg Satz <[email protected]>
Cc:   [email protected]
Subject: optimization
In-reply-to: Msg of 24 Sep 1984  22:26-EDT from Greg Satz <SATZ at SU-SIERRA.ARPA>

The problem is likely that some optimizations don't check the flag and
optimize anyway.  Then if they depend on other optimizations to work
right you will get bogus results.

It might make debugging easier if you could control individual
optimizations rather than all optimizations as a block...
27-Sep-84 08:24:58-PDT,5584;000000000001
Return-Path: <@COLUMBIA-20.ARPA:GINGELL@CWR20B>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Thu 27 Sep 84 08:24:38-PDT
Received: from CWR20B by CUCS20 with DECnet; 27 Sep 84 11:25:48 EDT
Date: Thu 27 Sep 84 11:25:09-EDT
From: Rob Gingell <GINGELL@CWR20B>
Subject: Emulator is available for FTPing
To: Satz%SU-SIERRA@CUCS20
cc: Lougheed%SU-SIERRA@CUCS20, Gingell@CWR20B

At long last, a pre-pre-release of  PAUNIX is somewhere where you  can
get it.  On COLUMBIA-20, you'll find the following directories  (which
should be accessible to you).

SNARK:<G.GINGELL.PAUNIX.BLISS>		BLISS-36 programming support
SNARK:<G.GINGELL.PAUNIX.PAUNIX>		Sources for PAUNIX
SNARK:<G.GINGELL.PAUNIX.SUBSYS>		PAUNIX.EXE, a test program,
					and skimpy documentation

The documentation file in the  .SUBSYS> directory (PAUNIX.MEM) can  be
considered  an  addendum  to  Section  2  of  Volume  I  of  the  UNIX
Programmer's Manual.  It contains a  call-by-call list of any  gotchas
or important notes of which I am  aware (such as describing a call  as
"not implemented yet").  There is more documentation being prepared, in
particular dealing with interfacing this thing to any given  language.

The release is "pre-pre-" because I  am not making any claims that  it
is complete.  As  promised, there  is no  fork support  at this  time.
Neither is there  support for I/O  to directories or  pipes.  Both  of
these are planned, as well as  features from 4.2bsd and System V  (the
current thing is just supposed to be 7th Edition).  I am interested in
anything and everything you have to  say about what it does,  doesn't,
should, and shouldn't do.

In setting up the distribution on COLUMBIA-20, I find that this  thing
will not work under TOPS-20 5.4.  All of my work and testing has  been
done on V6, and the stuff does work on one of Columbia's systems which
is running V6.   The problem seems  to be that  5.4 does not  properly
support extended  addressing  compatibility  entry  vectors  and  MUUO
dispatching.  I  have not  investigated further,  but I  am under  the
impression that SIERRA is running V6 so this should not be a  problem.
If it is, let me  know and I'll try to  work something out so you  can
use it.

You also need to have BLISS-36 V4 to build the software.  PAUNIX  uses
a  customized  version   of  the  BLISS   "Object-Time  System",   but
customizing the OTS  is something which  DEC expects people  to do  --
they always  ship the  source  to it  and  the BLISS-36  User's  Guide
contains instructions  on how  to do  it.  The  PAUNIX.CTL file  which
builds the software  will build a  customized OTS if  necessary --  it
expects to find the part DEC supplies on SYS:.

The code and  documentation has  a copyright  notice on  it.  You  can
consider this message a "letter of transmittal" which grants  Stanford
University to do anything they like with this release except:

	1. distribute it to any third party; and/or
	2. remove the copyright.

The copyright notice will eventually be changed in one of two ways:

	1. relaxed to allow free access to the software except
	   for re-sale and/or removal of the copyright (essentially
	   what is done with KERMIT); or
	2. the rights will be transferred to some third party.

My personal belief is that "1"  will be what will happen, however  the
University wants  to go  through the  exercise of  seeing whether  its
interest will  be served  by  having someone  else  sell it  with  the
University getting  royalties  (the University  does  not want  to  go
through the licensing of others  itself).  I think the examination  is
just a prudent step on  their part to see  "what's what" with it,  and
the general expectation is that we'll  just end up giving it away  and
retaining credit for  having sponsored/done  it.  However,  as I  have
said to Kirk  in the past,  and reaffirm now,  access to the  software
while we have anything to do with it will continually entitle Stanford
(and the few  other places  to which  we will  give it  early) to  the
PAUNIX they get from us directly  and will not impede your ability  to
get it from us as we continue  to produce new releases.  I don't  know
what your plans for the "C" compiler entail in this regard, I'm  under
the impression that at least between CWRU and Stanford we will work  a
trade on them (?).

A couple of final observations.  I am uncertain about the  performance
of the package -- I haven't done enough with it to decide if there are
any  problems  (though  I  am  sure  there  must  be  at  least  minor
inefficiencies).  My tests indicate that if one does a file copy using
small buffers to read/write, that it is certainly more expensive  than
the EXEC's  COPY command  (for instance).   However, large  read/write
buffers  begin  to   approximate  the  COPY   command  for  CPU   time
performance.  Of course, the fact that small buffers take longer isn't
surprising, the concern is that most UNIX programs I have seen like to
do things using 512 byte buffers, which is "small" in this discussion.
So, in addition to finishing  off the "not-implemented" stuff I'll  be
looking at performance in the coming days.  Finally, it appears that I
will be on the west coast for a few days in mid to late October.   (Of
course, I will also be out for DECUS, and if we go "public" with it, I
will want to dump it into the tape-copy process there.)  Perhaps I can
bring the next thing on tape and we'll get a chance to talk about it.

	Rob
-------
30-Sep-84 15:25:31-PDT,1391;000000000011
Return-Path: <@COLUMBIA-20.ARPA:GINGELL@CWR20B>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Sun 30 Sep 84 15:25:19-PDT
Received: from CWR20B by CUCS20 with DECnet; 30 Sep 84 18:28:08 EDT
Date: Sun 30 Sep 84 18:26:31-EDT
From: Rob Gingell <GINGELL@CWR20B>
Subject: C Compiler
To: Satz%SU-Sierra@CUCS20

   Saw you were on Columbia's machine the other night, hope you got every-
thing ok.

   I was wondering if it was possible for me to get a hold of the C compiler.
What I would like to do is to take some relatively "kernel intensive" C
program (like the Bourne shell) and get it running under the emulator.
That would give the emulator a good going over and provide a good generator
for the performance data I want to gather on it.  Although at some point
it'd be nice to get a full "release" of the compiler, sources, etc., for
the present all I need are sufficient .EXE's/.Rel's whatever to just run
it to compile big programs.  A side effect of this should be that I'll
get the C-callable library for the emulator written quick, although I
can start that without the C compiler with the information you've given
me so far.  If you're in the middle of things with it don't worry about
it 'cause it's just something that would make life a little easier but
not something I just have to have in order to keep going.

   Take care,

	Rob
-------
10-Oct-84 09:48:37-PDT,1129;000000000001
Mail-From: SATZ created at 10-Oct-84 09:48:30
Date: Wed 10 Oct 84 09:48:30-PDT
From: Greg Satz <[email protected]>
Subject: Re: C Compiler
To: GINGELL%[email protected]
cc: [email protected]
In-Reply-To: Message from "Rob Gingell <GINGELL@CWR20B>" of Sun 30 Sep 84 15:25:31-PDT
Phone: (415) 497-1004

Sorry for the length of time for this response. I wanted to reply once
Kirk had booted V6 so I could give you some feedback about PAUNIX;
however, V6 has been nothing but trouble. We had to rebuild the bit
table on our 2-pack rp07 the other day. Kirk, I think, has got us down
to only one or two crashes a day, but he is still complaining about
magic bits being set in the monitor.

Enough complaining and excuses. I would like to give you the compiler,
and you are welcome to it. There are still two known bugs lurking around
which prevent me from feeling terribly good about giving it out yet. I
wanted to get to them this week, but doubt that I actually will.

Expect to hear from me sometime next week with more status. If you
have any questions or still want a copy, let me know.
-------
12-Oct-84 06:34:38-PDT,970;000000000001
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Fri 12 Oct 84 06:34:09-PDT
Date: 12 Oct 1984  09:20 EDT (Fri)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Bill Palmer <[email protected]>
Cc:   bug-kcc@sierra
Subject: calling conventions and pascmd
In-reply-to: Msg of 12 Oct 1984  07:09-EDT from Bill Palmer <whp4 at SU-SIERRA.ARPA>

The documentation for all that should be in CC.DOC, wherever you are
finding CC.EXE.  The calling conventions are really simple.  You could
probably do a PASCMD equivalent purely within C (no assembly) by using
the setjmp and longjmp runtimes for reparse, and the jsys runtime to
call COMND%.  To make it work for UNIX too you would want to just use
setjmp, longjmp, and terminal I/O, but if you want the same code to
work for both UNIX and KCC you will have to make terminal input work for KCC.
13-Oct-84 06:33:40-PDT,342;000000000001
Mail-From: WHP4 created at 13-Oct-84 06:33:37
Date: 13 Oct 1984  06:33 PDT (Sat)
Message-ID: <[email protected]>
From: Bill Palmer <[email protected]>
To:   [email protected]
Subject: printf failure

printf("%d",-34359738368);

prints:

--3435973836(

Said number is 1B0, of course.  

				Bill
13-Oct-84 06:46:30-PDT,690;000000000001
Mail-From: WHP4 created at 13-Oct-84 06:46:27
Date: 13 Oct 1984  06:46 PDT (Sat)
Message-ID: <[email protected]>
From: Bill Palmer <[email protected]>
To:   [email protected]
Subject: printf failures, cont'd

Well, a little single-stepping reveals the trailing '(' problem.

We do IDIVI 10,12  with 10 containing SETZ 0.  11 gets a value of -10
stuck in it, which added to '0' produces '('.  This was pretty easy to
track down once I figured out that these strings had 9-bit bytes, but
my poor tired mind was mighty confused for a while there....

				Bill

p.s. Maybe just declare that to be a pathological case and document it
as a feature?
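
The failure above can be reproduced in a small C model.  This is a sketch
that assumes 36-bit two's complement words simulated in a 64-bit integer
and division that truncates toward zero (as in C99 and later); sext36,
neg36, and naive_last_digit are hypothetical names, not the runtime's:

```c
#include <stdint.h>

#define WORD_MASK ((INT64_C(1) << 36) - 1)   /* 36 one bits */
#define SIGN_BIT  (INT64_C(1) << 35)         /* 1B0 */

/* Sign-extend a 36-bit pattern into an int64_t. */
static int64_t sext36(int64_t w)
{
    w &= WORD_MASK;
    return (w & SIGN_BIT) ? w - (INT64_C(1) << 36) : w;
}

/* Negate the way a 36-bit two's complement machine would, with wraparound. */
static int64_t neg36(int64_t x)
{
    return sext36(-x);
}

/* Last digit character produced by a "negate, then divide by 10" loop. */
static char naive_last_digit(int64_t x)
{
    if (x < 0)
        x = neg36(x);   /* still negative when x is 1B0 */
    return (char)('0' + x % 10);
}
```

For 1B0 (-34359738368) the negation returns the same value, the remainder
is -8 (-10 octal), and '0' plus -8 is '(' in ASCII.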
13-Oct-84 07:53:05-PDT,1431;000000000001
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Sat 13 Oct 84 07:52:14-PDT
Date: 13 Oct 1984  10:54 EDT (Sat)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Bill Palmer <[email protected]>
Cc:   [email protected]
Subject: printf failures, cont'd
In-reply-to: Msg of 13 Oct 1984  09:46-EDT from Bill Palmer <whp4 at SU-SIERRA.ARPA>

Sounds right.  Clearly the problem is that negating 1B0 produces 1B0.
Thus printf sees that it is negative, so negates it and prints a minus sign.
Since the result comes out almost right, here's a scheme for doing it right:

    print number (x)
    {
	quotient = x / 10;
	remainder = x % 10;
	if quotient <> 0 then print number (quotient);
	if remainder < 0 then {
	    if quotient = 0 then print '-';
	    remainder = - remainder;
	}
	print digit for remainder;
    }

I used to do something like this thinking it would work for printing
unsigned numbers. (it doesn't.  KCC needs to be fixed to know about
unsigned arithmetic.)  The disadvantage of this scheme is it
depends on the PDP-10 style of dividing negative numbers, and the C
specification doesn't have any requirements beyond
    (x / a) * a + x % a = x.
So if this code were moved to some other machine it might stop
working.  The other possibility is a special case for 1B0.
21-Oct-84 13:05:34-PDT,1158;000000000001
Mail-From: WHP4 created at 21-Oct-84 13:05:33
Date: Sun 21 Oct 84 13:05:33-PDT
From: Bill Palmer <[email protected]>
Subject: oh well, it wasn't quite that simple...
To: [email protected]

Here's a few sends from kronj on the subject of my "fix"...

KRONJ, TTY151, 21-Oct-84 12:58PM
what that piece of code does is turn ADDB represented internally
as one opcode into ADD+B i.e. the ADD instruction with the B modifier.
This canonicalizes it for later optimizations.  The problem is sometime
later someone is seeing the ADD and not noticing that it is B.  So your
change will appear to fix the problem by not making ADD+B that way;
unfortunately there are other ways of getting ADD+B that could still
get through in more obscure cases.

KRONJ, TTY151, 21-Oct-84 1:01PM
yeah, what it's doing is turning the opcode type into MINDEXED | BOTH,
making it ready for the next case to handle it.

KRONJ, TTY151, 21-Oct-84 1:03PM
i thought it was obvious enough from the lack of blank line between them.
but the more comments the better... sure

So, like I told Herr Eppstein, it's back to the old listing....

					Bill
-------
29-Oct-84 00:25:54-PST,865;000000000001
Mail-From: SATZ created at 29-Oct-84 00:25:51
Date: Mon 29 Oct 84 00:25:51-PST
From: Greg Satz <[email protected]>
Subject: byte ptr bug within structures
To: [email protected]
Phone: (415) 497-1004

Using this structure:

#define SMAX 20

struct user {
	char	u_name[SMAX];		/* username */
	char	u_account[SMAX];	/* account */
	char	u_used;			/* seen in passwd file */
	char	u_indir;		/* seen in current directory */
	struct	user *u_next;		/* ptr to next entry */
};

and this statement of code:

    dirargs[_CDDAC] = up->u_account - 1;

Generates this sequence:

	MOVE	14,-4(17)
	ADDI	14,4
	IOR	14,$BYTE+3
	MOVEM	14,-422(17)

Unfortunately, what is originally moved in AC 14 (331100,,ADDR) already
contains a byte pointer to u_name and thus the IOR of $BYTE+3 (1100,,0)
is a noop leaving the wrong thing in AC 14.
-------
29-Oct-84 08:06:01-PST,1374;000000000001
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Mon 29 Oct 84 08:05:55-PST
Date: 29 Oct 1984  11:03 EST (Mon)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Greg Satz <[email protected]>
Cc:   [email protected]
Subject: byte ptr bug within structures
In-reply-to: Msg of 29 Oct 1984  03:25-EST from Greg Satz <SATZ at SU-SIERRA.ARPA>

I don't understand.  That looks correct to me.  Here is the generated
code you sent, annotated again so you can tell me why I'm confused:
	MOVE	14,-4(17)	;Get &(*up) - up is (struct *) so this
				;generates address of actual structure
	ADDI	14,4		;Offset four words - one less than (SMAX+3)/4
				;to put us at last word of u_name field
	IOR	14,$BYTE+3	;Make byte pointer to one before first byte of
				;u_account (ILDB pointer to start of u_account)
	MOVEM	14,-422(17)	;Store in dirargs[_CDDAC].  We are using this
				;as (char *), so dirargs must be (char *[]).

Before byte pointer optimization, this would have looked something like
	MOVE	14,-4(17)	;Get address of struct
	ADDI	14,5		;Word address of u_account field
	IOR	14,$BYTE	;Turn into byte address (LDB style)
	SETO	15,		;Minus one
	ADJBP	15,14		;To make into ILDB style address
	MOVEM	15,-422(17)	;Save in dirargs[_CDDAC]
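
The word and byte arithmetic in the annotations above can be checked with a
short C sketch.  It assumes four 9-bit bytes per 36-bit word and that
u_name starts at offset zero in the struct; the helper names are
hypothetical:

```c
#define BYTES_PER_WORD 4   /* 9-bit bytes in a 36-bit word */
#define SMAX 20

/* Word offset, within the struct, of the char at the given char offset. */
static int word_of(int char_off)
{
    return char_off / BYTES_PER_WORD;
}

/* Which byte of that word holds the char (0 is the leftmost byte). */
static int byte_of(int char_off)
{
    return char_off % BYTES_PER_WORD;
}

/* P field of a byte pointer to byte k: bits to the right of the byte. */
static int p_field(int k)
{
    return 27 - 9 * k;
}
```

u_account begins at char offset SMAX, so the byte before it is char offset
SMAX-1: word 4, byte 3, with a P field of 0.  That agrees with the ADDI 14,4
and the $BYTE+3 entry (1100,,0) in the optimized code.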
29-Oct-84 10:17:10-PST,1544;000000000001
Mail-From: SATZ created at 29-Oct-84 10:17:07
Date: Mon 29 Oct 84 10:17:07-PST
From: Greg Satz <[email protected]>
Subject: Re: byte ptr bug within structures
To: [email protected]
cc: [email protected]
In-Reply-To: Message from "David Eppstein <[email protected]>" of Mon 29 Oct 84 11:03:00-PST
Phone: (415) 497-1004

yes, the code LOOKS right, but what is originally put in AC 14 is not
just some address of memory, but a byte pointer! Here is a photo
of my ddting around:

[PHOTO:  Recording initiated  Mon 29-Oct-84 10:13AM]

2#get mc:makact.EXE.15 
2#dd
DDT
main 422/   PUSH 17,GETDNU+43   
MAIN+423/   CALL JSYS   ^H
MAIN+422/   PUSH 17,GETDNU+43   ^H
MAIN+421/   ADJSP 17,-1   ^H
MAIN+420/   MOVEM 4,-1(17)   ^H
MAIN+417/   XMOVEI 4,-446(17)   ^H
MAIN+416/   MOVEM 14,-422(17)   ^H
MAIN+415/   IOR 14,$BYTE+3   ^H
MAIN+414/   ADDI 14,4   ^H
MAIN+413/   MOVE 14,-4(17)   ^H
MAIN+412/   SETZB 7,-424(17)   .$b   $g
% Creating f
$1B>>MAIN+412/   SETZB 7,-424(17)   $x
   7/   0   21753/   0
MAIN+413/   MOVE 14,-4(17)   $x
   14/   SKIPL 2,DFLOT0#+351   22373/   SKIPL 2,DFLOT0#+351
MAIN+414/   ADDI 14,4   14/   SKIPL 2,DFLOT0#+351   =331100,,410414   $x
   14/   SKIPL 2,DFLOT0#+355   4
MAIN+415/   IOR 14,$BYTE+3   14/   SKIPL 2,DFLOT0#+355   =331100,,410420   $x
   14/   SKIPL 2,DFLOT0#+355   $BYTE+3/   UHEAD+737,,0
MAIN+416/   MOVEM 14,-422(17)   14/   SKIPL 2,DFLOT0#+355   =331100,,410420   
^Z
2#po

[PHOTO:  Recording terminated Mon 29-Oct-84 10:14AM]
-------
29-Oct-84 11:37:30-PST,930;000000000001
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Mon 29 Oct 84 11:37:17-PST
Date: 29 Oct 1984  14:34 EST (Mon)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Greg Satz <[email protected]>
Cc:   [email protected]
Subject: byte ptr bug within structures
In-reply-to: Msg of 29 Oct 1984  13:17-EST from Greg Satz <SATZ at SU-SIERRA.ARPA>

So obviously one of the following is happening:

 (1) You are passing a byte pointer when you wanted a struct pointer
     The most likely source of this is the fact that all the memory
     allocation routines return byte pointers, and if you don't give
     the right declarations or an explicit coercion they will never
     get converted.

 (2) You are correctly passing a byte pointer but incorrectly trying
     to use it as a struct pointer.
29-Oct-84 20:01:15-PST,972;000000000001
Mail-From: SATZ created at 29-Oct-84 20:01:13
Date: Mon 29 Oct 84 20:01:13-PST
From: Greg Satz <[email protected]>
Subject: Re: byte ptr bug within structures
To: [email protected]
cc: [email protected]
In-Reply-To: Message from "David Eppstein <[email protected]>" of Mon 29 Oct 84 14:34:00-PST
Phone: (415) 497-1004

By default, all C routines are declared to return type int. It turns out
that I forgot to declare calloc() to be char *. I did have the required
coercion in front of the calloc() to make it a pointer to type user.

Correct me if I am wrong, but I think this is what the compiler did:

Since calloc() returned type int, the compiler never knew there was a
byte pointer in the left half to remove, so it didn't. When it went to
IOR in the new BYTE pointer, it became a NOOP because the byte pointer
that wasn't supposed to be there was. I need lint!!!
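
A hedged C model of why the IOR did nothing, using the values from the DDT
photo.  The word36 type and ior_byte3 are hypothetical names; the left-half
constants 331100 and 001100 come from the messages above:

```c
#include <stdint.h>

typedef uint64_t word36;   /* one 36-bit word in the low bits */

#define LH(x) ((word36)(x) << 18)   /* place a halfword in the left half */

/* Model of "IOR AC,$BYTE+3": OR the P/S field 001100 into the left half. */
static word36 ior_byte3(word36 ac)
{
    return ac | LH(0001100);
}
```

Starting from the plain word address 410420 the result is 001100,,410420,
but starting from the stale byte pointer 331100,,410420 the result is
unchanged, because every bit of 001100 is already set in 331100.  Declaring
calloc() properly (today, by including <stdlib.h>) lets the compiler see the
pointer type and avoid this.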

PS. Dan Newell will be taking over the compiler.
-------
 4-Nov-84 11:03:36-PST,2291;000000000011
Mail-From: KRONJ created at  4-Nov-84 08:45:45
Date: Sun 4 Nov 84 08:45:45-PST
From: David Eppstein <[email protected]>
Subject: ADDB bug finally fixed
To: [email protected]

I got tired of the ADDB bug's existence, so I fixed it.
The problem was of course that in foldplus(), where it switches
two operations to free them for other optimizations, it was checking
one but not both of the operations for the BOTH flag.  I made it check
the other, and the problem went away.  Compare the two versions of
CCOPT.C for details.  The new EXE is in <KCC.CC>.

In case you run across a similar problem later, here's how I go about
solving this sort of thing:

(1) Generate a small test input that exhibits the bug.  This is done
    by taking the (usually large) program that you found the problem
    in, and cutting code to make it as small as possible while
    retaining the problem.

(2) Run the compiler with DDT breakpoints at critical places.  You
    will have to GET CC, DDT, set your breaks, then SAVE and run the
    saved file on your input.  Do PUSHJ 17,FLUSHC$X at various
    different places to clear out the peephole buffer before and after
    various optimizations, and compare the results.

(3) You should now have a fairly good idea where the bug is.  Go
    through the listing by hand with the data from your test input,
    looking for the bug or for more key places to set breakpoints for
    the next iteration.

While I'm listing procedures, here's one for building a new compiler
after some change has been made.

(1) DO CC to make a new binary.

(2) Do it again to run the new compiler on itself.

(3) Do DIR,CHECKSUM to make sure that the two new binaries are
    identical.  If they are, this compiler is not guaranteed to be
    bug-free but at least is probably safe to use for further compiler
    development.  If there was some new optimization that made them
    different, DO CC again and make sure that they are the same this time.

(4) If they were not the same, you have introduced a new bug or
    uncovered an old one.  Do DIR,CHECKSUM on all the REL files to see
    which ones changed.  Make FAI files with both the new and old
    compilers to see what the changes are.  Fix the bug...
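
The comparison in step (3) can be sketched in portable C as a byte-for-byte check of two binaries. This is only an illustration of the idea, not the DIR,CHECKSUM command; the function name files_identical() is an assumption.

```c
#include <stdio.h>

/* Returns 1 if both files have identical contents, 0 if they differ,
   and -1 if either file cannot be opened. */
int files_identical(const char *path_a, const char *path_b)
{
    FILE *a = fopen(path_a, "rb");
    FILE *b = fopen(path_b, "rb");
    int ca, cb, same = 1;

    if (a == NULL || b == NULL) {
        if (a) fclose(a);
        if (b) fclose(b);
        return -1;
    }
    do {
        ca = getc(a);
        cb = getc(b);
        if (ca != cb) {          /* differing byte, or one file ended early */
            same = 0;
            break;
        }
    } while (ca != EOF);         /* both hit EOF together: identical */
    fclose(a);
    fclose(b);
    return same;
}
```

Run on the first-generation and second-generation compiler binaries, a result of 1 corresponds to the "identical" case in step (3).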
-------
 4-Nov-84 15:58:11-PST,491;000000000001
Mail-From: SATZ created at  4-Nov-84 15:58:06
Date: Sun 4 Nov 84 15:58:06-PST
From: Greg Satz <[email protected]>
Subject: Re: ADDB bug finally fixed
To: [email protected]
cc: [email protected]
In-Reply-To: Message from "David Eppstein <[email protected]>" of Sun 4 Nov 84 11:03:36-PST
Phone: (415) 497-1004

Thanks a bunch Dave! I tried making a new Yacc and this time it
almost worked. There is still a problem somewhere which I still
need to track down.
-------
10-Nov-84 15:11:31-PST,1103;000000000001
Mail-From: SATZ created at 10-Nov-84 15:11:27
Date: Sat 10 Nov 84 15:11:26-PST
From: Greg Satz <[email protected]>
Subject: another possible bug -- from yacc
To: [email protected]
Phone: (415) 497-1004

this bug looks like bad output from a switch statement in an infinite
for loop. These examples are in <kcc.unix.work.yacc>bug.*. The code that
is generated is prematurely exiting the loop when it should be
continuing. Here is the C source example:

	for (;;) {
	    switch(bar()) {

	    case '\n':    i++;
	    case ',':     continue;

	    case '$':     break;

	    default:      error( "barf" );
	    }
	    break;
	}

and here is the annotated output code:

$2::
	PUSHJ	17,bar		;case on this routine
	CAIE	1,44		;is it a "$", totally done
	CAIN	1,54		;is it ",", totally done (**ERROR**, we
				; should be jrsting back to 2::)
	JRST	$1		;branch to end of program
	CAIE	1,12		;if newline, do i++
	JRST	$4		;otherwise, output error
$3::
	AOS	4,0(17)
	JRST	$2
$4::
	XMOVEI	6,$5
	IOR	6,$BYTE+4
	PUSH	17,6
	PUSHJ	17,error
	ADJSP	17,-1
$1::
-------
11-Nov-84 14:06:09-PST,1884;000000000001
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Sun 11 Nov 84 14:05:56-PST
Date: 11 Nov 1984  17:03 EST (Sun)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Greg Satz <[email protected]>
Cc:   [email protected]
Subject: another possible bug -- from yacc
In-reply-to: Msg of 10 Nov 1984  18:11-EST from Greg Satz <SATZ at SU-SIERRA.ARPA>

The code in countcases() that finds labels for cases was using
brklabel when it saw a continue statement; it should have been
using looplabel.  A silly mistake, easily fixed.
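
As a reference for the intended semantics, here is a self-contained variant of the test case. It is a sketch only: bar() is replaced by reading characters from a string, and error("barf") by a counter, both of which are assumptions made so the control flow can be checked. A continue inside the switch must branch to the loop, while a break inside the switch leaves only the switch.

```c
/* Counts newlines in s up to the first '$'.  *errors counts how often the
   default case was taken.  Control flow mirrors the reported test case. */
int scan(const char *s, int *errors)
{
    int i = 0;

    for (;;) {
        switch (*s++) {
        case '\n':  i++;            /* falls through to the continue */
        case ',':   continue;       /* next iteration of the for loop */
        case '$':   break;          /* leaves the switch only */
        default:    (*errors)++;    /* stands in for error("barf") */
        }
        break;                      /* leaves the for loop */
    }
    return i;
}
```

With correct code generation, scan("\n,\n$", &e) keeps looping past the comma and returns 2; compiling the continue as a break would stop the loop at the first newline.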

The only code that changed in the compiler after this was fixed
was in foldplus(), the part of the peepholer that handles addition
opcode folding, spindling, and mutilating.  Unfortunately the code
there worked better when the continues were incorrectly compiled as
breaks than when they were correctly compiled as continues.  This was
showing up as incorrectly compiled code in some symbol table hashing
routine.  I then fixed foldplus() to work better than either the old
incorrectly compiled code or the old correctly compiled code.

In doing this I noticed that the old code had been incorrectly
compiling a couple of hairy array references in the lexer (something
to do with macro arguments).  The new code compiled them correctly,
but although the generated code was correct it wasn't very efficient.
Much more hacking at foldplus() finally produced good code there too.
Happily this didn't produce any more changes in generated code.

So anyway, your switch statement in yacc should probably work better now.
The new compiler is installed in <KCC.CC> and <KCC.C>.

Dan: you might want to update your sources in <KCC.ATBAT>, whatever
that directory is supposed to be.  I haven't touched anything there.
11-Nov-84 20:21:35-PST,618;000000000001
Mail-From: SATZ created at 11-Nov-84 20:21:32
Date: Sun 11 Nov 84 20:21:32-PST
From: Greg Satz <[email protected]>
Subject: more yacc problems
To: [email protected]
Phone: (415) 497-1004

With Dave's latest changes, yacc generates the correct output
for the test case, however one statistic was wrong. In trying to
determine why, I came across this problem.

Why does this code compile? The 4.2 compiler gives "unknown size"
errors:

struct wset {
	int *pitem;
	int flag;
	struct looksets ws;
	struct bar ofoo;
};

main()
{
	printf("sizeof = %d\n", sizeof(struct wset));
}
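
For contrast, a minimal sketch of what a conforming compiler does accept: a member may be a pointer to a struct that has only been declared, because the pointer's size is known, whereas embedding the undefined struct itself (as above) leaves the size unknown. The names wset_by_pointer and wset_size() are illustrative assumptions.

```c
#include <stddef.h>

struct looksets;                 /* declared but never defined: incomplete */

struct wset_by_pointer {
    int *pitem;
    int flag;
    struct looksets *ws;         /* legal: a pointer's size is known */
};

/* sizeof is well defined here because every member has a complete type. */
size_t wset_size(void)
{
    return sizeof(struct wset_by_pointer);
}
```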
-------
12-Nov-84 16:13:18-PST,817;000000000001
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Mon 12 Nov 84 16:13:10-PST
Date: Mon 12 Nov 84 19:10:50-EST
From: David Eppstein <[email protected]>
Subject: KCC up and running on Columbia-20
To: [email protected]

I imported KCC here today.  Once I sorted out which versions of the sources
I was using, the only problem in compilation was that the MONSYM.FUN here
was dated 1980; I needed to make a new one for the latest monitor (TOPS20.FAI
requires some symbols not found earlier than release 4).  The compiler
compiles itself and the runtimes adequately; I haven't tried any other
program.  Don't know how much use it will get, but at least it will be more
pleasant than PCC if I ever get an urge to write a program in C...
-------
13-Nov-84 19:35:28-PST,1953;000000000001
Mail-From: WHP4 created at 13-Nov-84 19:35:25
Date: Tue 13 Nov 84 19:35:25-PST
From: Bill Palmer <[email protected]>
Subject: sends from kronj on debugger ideas
To: "*PS:<KCC.CC>MAIL.TXT.1"@SU-SIERRA.ARPA
cc: [email protected]

KRONJ, TTY151, 21-Oct-84 2:05PM
gee, you don't think ddt is good enough?
the main thing pasddt lacks is the ability to call functions - that
would be useful.  you would have to make kcc give you some sort of
symbol table, so you could pick out defines and variables instead of $
labels and all that, and be able to make a decent guess at the type
of things.  you also need to make the compiler force out the peephole
buffer between each statement so you can tell them apart.  but i guess
those are all merely implementation issues and not features.
i will certainly think about it.

KRONJ, TTY151, 21-Oct-84 2:18PM
i like kirk's idea.  would be especially useful if you could make
logical combinations of conditions, for instance "you are at this
breakpoint and this expression has a certain value".

this would all be much easier to implement in a lispm like situation
rather than in c where for one thing it is a completely compiled
language and for another the runtime situation is pretty free form
(the compiler does all sorts of nasty things to the stack and such).
but i guess if you were debugging a program you could tell the
compiler to generate more mechanical code, so that's not as much
of a problem.

since the general rule in c seems to be that you can do anything and
change anything, that should probably also be true of a c debugger.
i.e. you should be able to treat any type as any other type and
do whatever you want with it.  for that matter smart breakpoints and
data type interchangeability are just as useful in a ddt level
debugger, the main difference in a symbolic debugger is your typein
and typeout look like c code rather than assembly...
-------
13-Nov-84 22:50:31-PST,578;000000000001
Mail-From: SATZ created at 13-Nov-84 22:50:28
Date: Tue 13 Nov 84 22:50:28-PST
From: Greg Satz <[email protected]>
Subject: more "sizeof" problems
To: [email protected]
Phone: (415) 497-1004

In yacc, in the file y1.c and in the routine closure(), there is a loop
that takes a pointer to a structure wsets. It increments that pointer
every iteration. The code being generated only increments the address by
6 bytes. It should be at least 8 since wsets contains an int and int *
(and another structure). I can point you to the bad spot if necessary.
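
The expected behaviour can be stated as a check: incrementing a struct pointer must advance the address by sizeof the struct. The sketch below uses a hypothetical definition of struct looksets, since the real one from yacc is not shown here.

```c
#include <stddef.h>

struct looksets { int lset[2]; };   /* hypothetical contents */

struct wset {
    int *pitem;
    int flag;
    struct looksets ws;
};

/* Distance in bytes between p and p + 1; must equal sizeof(struct wset). */
size_t wset_stride(struct wset *p)
{
    return (size_t)((char *)(p + 1) - (char *)p);
}
```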
-------
14-Nov-84 10:06:57-PST,402;000000000001
Mail-From: KRONJ created at 14-Nov-84 10:06:54
Date: Wed 14 Nov 84 10:06:54-PST
From: David Eppstein <[email protected]>
Subject: lock
To: [email protected]

lock
I am working on an improvement to skip handling.
I seem to have broken the sources in <KCC.CC>.
I have to go to class.
I will finish my changes later today.
In the meantime don't trust the sources in <KCC.CC>.
-------
14-Nov-84 18:04:12-PST,865;000000000001
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Wed 14 Nov 84 18:04:06-PST
Date: 14 Nov 1984  21:02 EST (Wed)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Bug-KCC@Sierra
Subject: unlock

I am done playing with the compiler; you can go back to real work.
In case you care, what I did was change code of the form

		SKIP1
		JRST lab1
		SKIP2
		JRST lab2
	lab1:

into:
		reverse SKIP1
		reverse SKIP2
		TRNA
		JRST lab2
	lab1:

This takes approximately identical time, but has the advantages that
it makes triple skips, which I am fond of, and also that often lab1:
becomes unreferenced and can be flushed, allowing further optimization.
E.g. see <KCC.CC>TEST.C and .FAI for examples of quadruple and
quintuple skips.
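
A small C model can be used to check that the two forms reach the same label for every combination of skip outcomes. This is a sketch, not compiler code: skip1 and skip2 are nonzero when the original SKIP1 and SKIP2 would skip, and each function returns 1 if control reaches lab2 and 0 if it reaches lab1.

```c
/* Original form: SKIP1 / JRST lab1 / SKIP2 / JRST lab2 / lab1: */
int original_form(int skip1, int skip2)
{
    if (!skip1) return 0;        /* SKIP1 did not skip: JRST lab1 */
    if (!skip2) return 1;        /* SKIP2 did not skip: JRST lab2 */
    return 0;                    /* SKIP2 skipped the JRST: fall into lab1 */
}

/* Rewritten form, stepped through with an explicit program counter:
   0: reverse SKIP1   1: reverse SKIP2   2: TRNA   3: JRST lab2   4: lab1 */
int rewritten_form(int skip1, int skip2)
{
    int pc = 0;

    for (;;) {
        switch (pc) {
        case 0: pc = !skip1 ? 2 : 1; break;   /* reversed test skips when SKIP1 would not */
        case 1: pc = !skip2 ? 3 : 2; break;   /* reversed test skips when SKIP2 would not */
        case 2: pc = 4; break;                /* TRNA always skips the JRST */
        case 3: return 1;                     /* JRST lab2 */
        default: return 0;                    /* lab1 */
        }
    }
}
```

Both functions return 1 only when SKIP1 skips and SKIP2 does not, which is the one case that should reach lab2.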
15-Nov-84 13:03:35-PST,1941;000000000001
Received: from LOTS-A by Sierra with Pup; Thu 15 Nov 84 13:03:14-PST
Received: from Sierra by LOTS-A with Pup; Mon 12 Nov 84 11:05:40-PST
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Mon 12 Nov 84 06:44:17-PST
Date: 12 Nov 1984  09:41 EST (Mon)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Dan Newell <DAGONE@Sierra>
Subject: changes...
ReSent-Date: Thu 15 Nov 84 13:01:01-PST
ReSent-From: Dan Newell  <D.DAGONE@LOTS-A>
ReSent-To: bug-kcc@Sierra

Damn mailer doesn't understand "LOTS-A".  You should put something
replyable in your "From" headers.  Anyway, retrying:

I've only occasionally been working on the compiler.  When I do work
on it, it's mostly hacking up the peephole optimizer, which shouldn't
interfere with any of the other parts you'd be working on.  So there
probably isn't any problem with duplication of effort...

There isn't even a copy of the compiler here; maybe I will bring one
over soon to hack with.  Otherwise I'm stuck with using PCC (gag), or
not using C at all (which is what I have been doing).  When I do work
on the thing, should I be using <KCC.ATBAT> or <KCC.CC>?

Before you fix global scoping of structures, you're going to have to
fix the compiler itself not to depend on that "feature".  That is,
change all data structures that have multiple struct declarations to
be one struct with unions.  Most of these can be found in cc.s or cc.h
(I forget which) but there are also some in cc.g, and of course these
are all used all over the place.  Lots of work.  Sigh.

The type declaration structure already remembers what fields apply to
what structures.  All you would need to do there would be to make
the field offset be stored in that structure rather than in the symbol,
and to find the offset when generating code by looking through the
struct definition for a matching symbol...
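
As an illustration of the "one struct with unions" layout, here is a sketch with made-up names; it is not the actual declaration from cc.s, cc.h, or cc.g. Fields that differ by kind of node live in a union, so no member name has to be shared between separate struct declarations.

```c
/* Hypothetical node layout: a tag plus a union of per-kind fields. */
enum node_kind { N_ICONST, N_BINARY };

struct node {
    enum node_kind kind;
    union {
        int iconst;                               /* used when kind == N_ICONST */
        struct { struct node *left, *right; } binary;  /* when kind == N_BINARY */
    } u;
};

/* Sums the constants in a tree, selecting union members by the tag. */
int eval_sum(const struct node *n)
{
    if (n->kind == N_ICONST)
        return n->u.iconst;
    return eval_sum(n->u.binary.left) + eval_sum(n->u.binary.right);
}
```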
15-Nov-84 13:06:15-PST,314;000000000001
Mail-From: DAGONE created at 15-Nov-84 13:06:10
Date: Thu 15 Nov 84 13:06:10-PST
From: Dan Newell  <[email protected]>
Subject: pardon the duplication...
To: [email protected]

   I was trying to remail one message from Dave to bug-kcc
and managed to grab the whole kit and caboodle.
	Dan
-------
16-Nov-84 18:59:11-PST,692;000000000001
Mail-From: DAGONE created at 16-Nov-84 18:59:05
Date: Fri 16 Nov 84 18:59:05-PST
From: Dan Newell  <[email protected]>
Subject: question about macro and cfork'ing it
To: [email protected]

   Right now, if you specify macro instead of fail for
the assembler you want, you are left hanging inside the
macro assembler. Prarg doesn't seem to be working right
for macro and I can't seem to find any doc to figure
out how to use prarg to get macro squared away and how
to get link.exe in on this as well. Anyone out there know
what I'm trying to find out or where to look? For now
I'm going to chase down Ralph's book and the jsys manual
if I can find a copy.
	Dan
-------
18-Nov-84 03:16:47-PST,592;000000000001
Mail-From: LOUGHEED created at 18-Nov-84 03:16:44
Date: Sun 18 Nov 84 03:16:44-PST
From: Kirk Lougheed <[email protected]>
Subject: yacb (Yet Another Compiler Bug)
To: [email protected]

The following C program:

	int mumble;
	int foo= { 0 };
	int bar;

	main()
	{
	}

generates the following code.  Note the "BLOCK 247566"; it is probably a PC
in the compiler since it always recurs.  I would be very happy if some kind
compiler hacker would remove this bug.

	mumble:	BLOCK	1
	foo:	0
		BLOCK	247566
	bar:	BLOCK	1

		RELOC
	main:
		POPJ	17,

-------
18-Nov-84 08:32:21-PST,947;000000000001
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Sun 18 Nov 84 08:32:17-PST
Date: 18 Nov 1984  11:30 EST (Sun)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Bug-KCC@Sierra
Subject: [DG: compiler thanks]

FYI.  This guy's been trying to port programs to the IBM PC, and is
using KCC to test out ports before moving them to the PC...
            ------------
Date: Saturday, 17 November 1984  23:37-EST
From: David Glaser <DG at COLUMBIA-20.ARPA>
To:   EPPSTEIN at COLUMBIA-20.ARPA
Re:   compiler thanks

david:

got pr to work.  the compiler seems  to be working fine.  If you want,
I'll send you the code.

/dg

ps:  just use  the Berkeley fread in /usr/src/lib/stdio/rdwr.c

pps:  the pc version is not so easy.  apparently the console driver does
horrible things to tabs, line feeds, and form feeds.
19-Nov-84 08:13:39-PST,3309;000000000001
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Mon 19 Nov 84 08:13:28-PST
Date: 19 Nov 1984  11:10 EST (Mon)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Bug-KCC@Sierra
Subject: [HEDRICK: new Pascal for Fortran 10, LINK 6, DDT 43]

Thought this would be of interest...
            ------------
Date: Sunday, 18 November 1984  18:40-EST
From: Charles Hedrick <HEDRICK at RUTGERS.ARPA>
To:   pascal: ; at RUTGERS.ARPA
Re:   new Pascal for Fortran 10, LINK 6, DDT 43

There is a new version of Pascal, designed to work with DEC's next
generation of language software: Fortran 10, LINK 6, and DDT 43. It
should not be different for normal users.  However if you use extended
addressing, it will use new features of LINK version 6 in order to load
your program directly into a non-zero section. The structure of an
extended Pascal program is somewhat different:

  - it uses PSECTS .CODE., .DATA., and .LARG. to hold code, small data,
	and large data areas.  This makes it compatible in program
	structure with Fortran version 10.

  - in order to load directly into extended memory, it uses a number of
	new REL block types.  These are implemented only in LINK version
	6.

  - the extended library has been renamed from PSXLIB.REL to PASXLB.REL.
	(This means you can now find all the pieces of Pascal by looking
	for SYS:PAS*)

  - if your program uses extended addressing, it will not have a job
	data area.  LINK now generates a "program data vector", which
	serves the same function.

  - Pascal will use PDVOP to try to find DDT.  If it has to load DDT, it
	will look for SYS:XDDT.EXE.  This makes it compatible with DDT
	version 43.

Since the software it is designed to work with is not yet widely
distributed, this version is not in the main distribution directory.
Rather, it is on S:<PASCAL.NEW>. When release 6.0 comes out, these files
will be moved to S:<PASCAL>, and thus will become the standard release.
Until then, if you want to use this version, you should overlay
S:<PASCAL> with this directory.  That is, any files in this directory
should replace those in S:<PASCAL>.  The easiest way to do so is
    def dsk: dsk:,<pascal>
You could also simply copy *.* from this directory to <PASCAL> and then
use <PASCAL>.  (That is roughly what we will do when we move to this
version.)

If you simply want to install this version on your system, all of the
binaries are in S:<PASCAL.NEW>, so you don't need to look at <PASCAL>
except for documentation.

Further development work will all occur on this version, unless some
really fatal bug turns up in the old one.  When I make the final release
on tape, I will try to find a way to leave around the old one, for
those sites that do not update to 6.  This should only be a problem
for extended addressing code.  Section 0 code produced by the new
compiler still follows the old conventions.  (Yes, that means that the
compiler produces quite different REL blocks depending upon whether your
code is extended or not.  However most of the runtimes run either way,
depending upon the setting of SN.COD, which is the section number in
which the code is running.)  
20-Nov-84 09:43:46-PST,3535;000000000001
Mail-From: SATZ created at 20-Nov-84 09:43:38
Date: Tue 20 Nov 84 09:43:38-PST
From: Greg Satz <[email protected]>
Subject: [John Bruner <[email protected]>: "tar" and non-8-bit byte machines]
To: [email protected]
Phone: (415) 497-1004

For your enlightenment:
                ---------------

Return-Path: <[email protected]>
Received: from BRL-TGR by SU-SIERRA.ARPA with TCP; Mon 19 Nov 84 22:16:06-PST
From: John Bruner <[email protected]>
Newsgroups: net.unix-wizards
Subject: "tar" and non-8-bit byte machines
Message-ID: <[email protected]>
Date: 19 Nov 84 19:36:55 GMT
Resent-Date:  Mon, 19 Nov 84 17:40:34 EST
Resent-From:  [email protected]
Resent-To:    [email protected]

The S-1 Project at the Lawrence Livermore National Laboratory is
porting UNIX to our own machine, the S-1 Mark IIA.  One problem
that we're currently trying to solve is the implementation of "tar".

The crucial facts are:

1) The S-1 memory is organized into 36-bit words (addressable in
   9-bit quarterwords). **Sigh.**

2) On the S-1, characters are nine bits and are stored one per
   quarterword.

3) UNIX does not distinguish file types (e.g. character vs. binary).

The problem is this: we want to be able to read/write "tar" tapes
containing ASCII text files on both the VAX and the S-1. The
"obvious" mapping is for the S-1 to associate each 8-bit byte
with the low-order 8 bits of a 9-bit quarterword, discarding or
zero-filling the uppermost bit in the quarterword as appropriate.

A different mapping is required for binary files (because the
ninth bit is significant): the S-1 packs 9-bit quarterwords into 8-bit
bytes.  (There is hardware support for this conversion operation.)
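
The two mappings can be sketched in C as follows. The bit order (most significant bit first) and the function names are assumptions chosen for illustration; the actual S-1 packing format is not specified here.

```c
#include <stddef.h>
#include <stdint.h>

/* Text mapping: keep the low-order 8 bits of a 9-bit quarterword. */
uint8_t text_byte(uint16_t quarterword)
{
    return (uint8_t)(quarterword & 0xFF);
}

/* Binary mapping: pack n 9-bit values into 8-bit bytes, MSB first.
   out must hold (n * 9 + 7) / 8 bytes.  Returns the bytes written. */
size_t pack9(const uint16_t *in, size_t n, uint8_t *out)
{
    uint32_t acc = 0;            /* bit accumulator */
    int bits = 0;                /* number of valid bits in acc */
    size_t o = 0, i;

    for (i = 0; i < n; i++) {
        acc = (acc << 9) | (in[i] & 0x1FF);
        bits += 9;
        while (bits >= 8) {
            out[o++] = (uint8_t)(acc >> (bits - 8));
            bits -= 8;
        }
        acc &= (1u << bits) - 1; /* keep only the unwritten bits */
    }
    if (bits > 0)                /* zero-pad the final partial byte */
        out[o++] = (uint8_t)(acc << (8 - bits));
    return o;
}

/* Inverse of pack9: returns the number of 9-bit values recovered. */
size_t unpack9(const uint8_t *in, size_t nbytes, uint16_t *out)
{
    uint32_t acc = 0;
    int bits = 0;
    size_t o = 0, i;

    for (i = 0; i < nbytes; i++) {
        acc = (acc << 8) | in[i];
        bits += 8;
        if (bits >= 9) {
            out[o++] = (uint16_t)((acc >> (bits - 9)) & 0x1FF);
            bits -= 9;
        }
        acc &= (1u << bits) - 1;
    }
    return o;                    /* leftover pad bits are discarded */
}
```

Four quarterwords (one 36-bit word) pack into five bytes, which shows why a packed text file is not directly readable as 8-bit ASCII on the VAX.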

The issue is that, in order for the VAX to read S-1 text files
and vice versa, text files must be stored using a different
representation than binary files.  There is no reliable way to
determine whether a file should be "text" or "binary" when the
tape is written, and no field in the "tar" header for recording
this information even if the writer could reliably figure it out.

If all files on the "tar" tape are stored with 9-bit quarterwords
packed into 8-bit bytes, text files on the "tar" tape are
unusable on the VAX.  (Of course, we have programs which will
pack/unpack them, but this must be done manually and it is a real
hassle.)

I don't want to define an incompatible "tar" format for the S-1.
I have used UNIX systems for M68000's which write tapes with byte
reversal problems so that I could not read them directly on our VAX
(it was necessary to pipe the input through "dd conv=swab"), and I
feel that the intent of "tar" format is to provide a standard
means for information exchange.  At this point, though, I can't
think of any alternatives to this approach.

P.S. Our next machine will have 32-bit words, but it will also have
hardware tags.  An image copy of a file on tape will include both
the 32-bit data and a 4-bit tag (probably stored in a fifth byte).
While the 9/8-bit packing problems will go away, the key problem still
remains: a "tar" text file should contain only characters (not tags),
so binary files and text files must be stored in a different format.
I don't see how to do this with the current "tar" definition.
-- 
  John Bruner (S-1 Project, Lawrence Livermore National Laboratory)
  MILNET: [email protected] [jdb@s1-c]	(415) 422-0758
  UUCP: ...!ucbvax!dual!mordor!jdb 	...!decvax!decwrl!mordor!jdb
-------
21-Nov-84 12:50:25-PST,741;000000000001
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Wed 21 Nov 84 12:50:19-PST
Date: 21 Nov 1984  15:26 EST (Wed)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Bug-KCC@Sierra
Subject: include files

Dave Glaser found a need for a couple more include files, so he copied
them from the nearest Vax and I copied them from here on to Sierra.
They are:
	strings.h	Declarations of functions in string(3)
			(<KCC.C>STRING.C).

	sys/file.h	Various constants for things like file access
			codes and the like.  The relevant runtimes
			(ACCESS.C and friends) should probably use these
			instead of actual numbers.
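
A sketch of what using the named constants looks like, written for a present-day POSIX system, where access() and R_OK come from <unistd.h>. The helper name file_is_readable() is hypothetical and is not one of the KCC runtimes.

```c
#include <unistd.h>   /* access() and R_OK on POSIX systems */

/* Returns 1 if path exists and is readable by the caller, else 0. */
int file_is_readable(const char *path)
{
    return access(path, R_OK) == 0;   /* named constant, not a bare number */
}
```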
24-Nov-84 05:26:44-PST,1575;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Sat 24 Nov 84 05:26:40-PST
Date: Sat 24 Nov 84 05:23:26-PST
From: Ken Harrenstien <[email protected]>
Subject: C compiler bugnote (sort of)
To: [email protected]
cc: [email protected]

Greg Satz just gave me a binary copy and added me to the BUG-KCC list,
in case you're wondering where I came from.  Anyway:

I am sure this is not news, but there seem to be a lot of C library
routines that need to be written.  Specifically, I just noticed that a
hack of mine bombs out because there is no QSORT, SSCANF, or GETS.  Is
there a list of stuff missing from the library?  If I wished to write
a replacement for a missing routine, who should I coordinate it with?
Otherwise, the new version seems to have fixed all the numerous bugs
I encountered in the old; a great improvement.

I am a little disturbed by the last message to BUG-KCC (so far the
only one I've seen, so I could be missing context) in which it is
implied that it is normal to simply copy over real live UNIX source
files to fill in for anything that is missing.  For .H files you can
use whatever is in the publicly published documentation, yes, but
actually admitting that you are copying source files directly from a
legally protected UNIX system could mean that KCC becomes vulnerable
to various types of lawsuits -- in any event it would no longer be
"public" in which case I would no longer be interested either in
helping or in using it.  Can anyone furnish some reassurances?
-------
24-Nov-84 13:20:54-PST,848;000000000001
Mail-From: LOUGHEED created at 24-Nov-84 13:20:52
Date: Sat 24 Nov 84 13:20:52-PST
From: Kirk Lougheed <[email protected]>
Subject: Re: C compiler bugnote (sort of)
To: [email protected]
In-Reply-To: Message from "Ken Harrenstien <[email protected]>" of Sat 24 Nov 84 05:26:43-PST

Yes, there are a large number of runtimes missing from KCC's CLIB.  They
need to be written.  The procedure has been to fault in the runtimes as
needed.  Greg Satz is the person you should contact about setting up new
runtimes.

A reminder to runtime writers: it is an explicit goal of this compiler that
it contain no AT&T copyrighted material.  The same goes for the runtimes.
We want to be able to distribute this compiler without AT&T lawyers (and
Stanford Office of Technology and Licensing) breathing down our necks.

Kirk
-------
27-Nov-84 00:05:13-PST,1462;000000000001
Mail-From: SATZ created at 27-Nov-84 00:05:09
Date: Tue 27 Nov 84 00:05:09-PST
From: Greg Satz <[email protected]>
Subject: Re: C compiler bugnote (sort of)
To: [email protected]
cc: [email protected]
In-Reply-To: Message from "Ken Harrenstien <[email protected]>" of Sat 24 Nov 84 05:26:42-PST
Phone: (415) 497-1004

I must confess; I am the person handling the coordination of the C
runtimes. There is no real list to speak of describing what isn't in the
compiler. It would be much easier to build a list of things supported
but since I am involved in other things and the compiler isn't really
ready for release, I haven't done this yet. I will right after this
note.

If you are interested in writing runtimes, just drop me a note where the
source is and I will incorporate it into the library sources here. Most
of the runtimes have been coded in C, but some of the support routines
are in macro/fail.  So far all of our runtimes have been written from
scratch. The message you saw was the first time (that I know of) where
restricted code has been used. I will probably delete them shortly, but
time hasn't been available for such things yet.

It is my understanding that the Compiler will be free to anyone who
wants to use it provided that they don't resell it. Stanford University
will copyright it for this reason. Otherwise it is our goal to provide a
PDP-10 C compiler without any licensing requirements.
-------
27-Nov-84 19:44:25-PST,2566;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Tue 27 Nov 84 19:44:20-PST
Date: Tue 27 Nov 84 19:42:02-PST
From: Ken Harrenstien <[email protected]>
Subject: [Richard M. Stallman <RMS at MIT-PREP>: Here is the permission notice I use]
To: [email protected]

This should be of interest to KCC people if you haven't already seen it.
Although it is a little long, I suggest you use something like it.
                ---------------

Return-Path: <@MIT-MC:RMS@MIT-PREP>
Received: from MIT-MC by SRI-NIC.ARPA with TCP; Tue 27 Nov 84 19:34:40-PST
Date: Tuesday, 27 November 1984, 22:35-EST
From: Richard M. Stallman <RMS at MIT-PREP>
Subject: Here is the permission notice I use
To: klh at sri-nic

/* Extended regular expression matching and search.
   Copyright (C) 1984 Richard M. Stallman

   Permission is granted to anyone to make or distribute
   verbatim copies of this program
   provided that the copyright notice and this permission notice are preserved;
   and provided that the recipient is not asked to waive or limit his right to
   redistribute copies as permitted by this permission notice;
   and provided that anyone possessing a machine-executable copy
   is granted access to copy the source code, in machine-readable form,
   in some reasonable manner.

   Permission is granted to distribute derived works or enhanced versions of
   this program under the above conditions with the additional conditions
   that the entire derivative or enhanced work
   must be covered by a permission notice identical to this one.

   Anything distributed as part of a package containing portions derived
   from this program, which cannot in current practice perform its function
   usefully in the absence of what was derived directly from this program,
   is to be considered as forming, together with the latter,
   a single work derived from this program,
   which must be entirely covered by a permission notice identical to this one
   in order for distribution of the package to be permitted.

   This software is distributed in the hope that it will be useful,
   but there is no warranty of any sort, and no contributor accepts
   responsibility for the consequences of using this program or for whether
   it serves any purpose in particular.

 In other words, you are welcome to use, share and improve this program.
 You are forbidden to forbid anyone else to use, share and improve
 what you give them.   Help stamp out software-hoarding!  */

-------
27-Nov-84 19:45:07-PST,1109;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Tue 27 Nov 84 19:45:02-PST
Mail-From: KLH created at 27-Nov-84 18:34:01
Date: Tue 27 Nov 84 18:34:01-PST
From: Ken Harrenstien <[email protected]>
Subject: Compiler bug for "if(cond);"
To: [email protected]
ReSent-Date: Tue 27 Nov 84 19:42:44-PST
ReSent-From: Ken Harrenstien <[email protected]>
ReSent-To: [email protected]

I found a compiler bug which is easily demonstrated:

test()
{	if(dothis());
	dothat();
}

If you compile the above program you will see that the "dothis" routine
is never called.  In other words, the compiler appears to think that
because the IF has a null success statement, it is not even necessary to
evaluate the condition!  This is a no-no, as many of us who depend on
condition side-effects can testify.  Granted, the above code is likely
to be a typo by the programmer, but if the compiler is so clever what it
should do is just print a warning that the IF has a null statement, and
compile it anyway, rather than silently ignoring it altogether!
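
A minimal sketch of the behavior being asked for, written as a check
that could be run against any compiler.  The two counters are
hypothetical and exist only so the calls can be observed; the bodies of
dothis and dothat are likewise assumed.

```c
/* Hypothetical counters, added only so that the calls are observable. */
int dothis_calls = 0;
int dothat_calls = 0;

int dothis(void)
{
	dothis_calls++;
	return 1;
}

void dothat(void)
{
	dothat_calls++;
}

/* Same shape as the failing test: the IF has a null statement, but
 * its condition has a side effect and must still be evaluated. */
void test(void)
{
	if (dothis())
		;
	dothat();
}
```

After one call to test(), both counters should read 1.  A compiler
that drops the condition leaves dothis_calls at 0.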
-------
27-Nov-84 19:45:27-PST,1038;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Tue 27 Nov 84 19:45:23-PST
Mail-From: KLH created at 27-Nov-84 19:14:30
Date: Tue 27 Nov 84 19:14:30-PST
From: Ken Harrenstien <[email protected]>
Subject: 2-d char array problem
To: [email protected]
cc: [email protected]
ReSent-Date: Tue 27 Nov 84 19:43:06-PST
ReSent-From: Ken Harrenstien <[email protected]>
ReSent-To: [email protected]

There is a problem with references to 2-d char arrays.  Given:
	char table[10][20];

then if you do
	c = table[1][0];
	c = table[1][19];
	c = table[2][0];

You will find that the 3rd reference points in between the first two!
The computation of the 1st index appears to be dividing the 2nd dimension
by 4 even after it has already been reduced to a word count.  The same
problem appears to afflict constructs of the form
	char *cp1, *cp2;
	cp1 = table[1];
	cp2 = table[2];

You will find that cp2-cp1 gives you a number which is not 20.  Want to guess
what it is?
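
For reference, a small sketch of what the language requires here.  The
helper names row_distance and element_offset are hypothetical; the
array is the one declared above.

```c
#include <stddef.h>

char table[10][20];

/* Distance in chars between the starts of two rows of table. */
ptrdiff_t row_distance(int r1, int r2)
{
	char *cp1 = table[r1];
	char *cp2 = table[r2];

	return cp2 - cp1;
}

/* Offset in chars of table[row][col] from the start of the array. */
ptrdiff_t element_offset(int row, int col)
{
	return (char *)&table[row][col] - (char *)table;
}
```

With a correct compiler, row_distance(1, 2) is 20, and the three
references above land at offsets 20, 39 and 40 in that order.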
-------
27-Nov-84 21:25:47-PST,1337;000000000001
Mail-From: SATZ created at 27-Nov-84 21:25:44
Date: Tue 27 Nov 84 21:25:44-PST
From: Greg Satz <[email protected]>
Subject: available runtimes
To: [email protected]
Phone: (415) 497-1004

here is a listing of the C runtime entry points.

	Listing of Modules and Entry points
Produced by MAKLIB Version 2B(104) on 27-Nov-84 at 21:24:27

	**************************

C:CLIB.REL[4,1230] Created on 16-Nov-84 at 23:06:00

FSEEK	FSEEK	FTELL	REWIND
LSEEK	LSEEK	TELL
SETJMP	SETJMP	LONGJM
SLEEP	SLEEP	PAUSE
UNLINK	UNLINK
SIGSYS	SIGNAL	SIGSYS	KILL
MKTEMP	MKTEMP
STAT	STAT	FSTAT
ACCESS	ACCESS	.RLJFN
TIME	TIME	CTIME	.T2UTI	.U2TTI
CALLOC	CALLOC
MALLOC	MALLOC	FREE	REALLO
PRINTF	PRINTF	SPRINT	FPRINT
ATOI	ATOI	ATOF
SETBUF	SETBUF	.SOBUF
STDIO	.IOB	FGETS	FOPEN	FREOPE	FPUTS	FCLOSE	UOPEN	FFLUSH	GETC	UNGETC	PUTC	.PUTC
PERROR	PERROR
.MAIN	.START	.EXIT	GTJFN.	.DIRST	.OPEN	.CLOSE	.WRITE	.READ	.CPUTM	FORK	VFORK	WAIT	.EXEC	.PIPE	.SPJFN
	BRK	SBRK	.FLOUT	$BYTE	END	ETEXT	EDATA
RUNTM	.RUNTM	.CH	EXIT	.SEXIT	CLOSE	OPEN	BOPEN	IOPEN	.FNAME	CREAT	BCREAT	ICREAT	GETFD	WRITE	READ
	PIPE	DUP	DUP2	.OFILE	.CFILE	.GTJFN	EXECL	EXECLE	EXECV	EXECVE
CTYPE	.CTYPE
STRING	STRCAT	STRNCA	STRCMP	STRNCM	STRCPY	STRNCP	STRLEN	INDEX	RINDEX
BYTE	$ADJBP	$SUBBP	$BPCNT
DFIX	$DFIX	$DFLOT
ABORT	ABORT
GETPID	GETPID
JSYS	JSYS
-------
28-Nov-84 02:44:24-PST,1116;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Wed 28 Nov 84 02:44:19-PST
Date: Wed 28 Nov 84 02:41:59-PST
From: Ken Harrenstien <[email protected]>
Subject: Code that smashes the PDL pointer
To: [email protected]
cc: [email protected]

Enclosed is a subroutine which manages to generate code that smashes
the PDL register (17).  If you look at the FAIL code output, you'll see
that during the evaluation of the stuff pushed on the stack for the
sprintf call, it uses an IDIVI 16,foo -- which naturally leaves the
remainder in 17.  Tsk, tsk!
	----------------------
/* PTIME - prints time.  Allows up to 10 active calls, good for use with
 *	printfs where several time strings must be pushed on stack.
 */
static int _ptidx;
static char _ptstrs[10*8];	/* 10 strings of up to 7 chars */
char *
ptime(secs)
register int secs;
{	register char *cp;
	register int i = _ptidx;

	if(++i >= 10) i = 0;
	cp = &_ptstrs[i*8];
	_ptidx = i;
	sprintf(cp,"%2d:%d%d", secs/60, (secs%60)/10, (secs%60)%10);
	return(cp);
}
	-----------------------
-------
27-Nov-84 17:23:50-PST,754;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Tue 27 Nov 84 17:23:49-PST
Date: Tue 27 Nov 84 17:21:22-PST
From: Ken Harrenstien <[email protected]>
Subject: qsort.c
To: [email protected]
cc: [email protected]

RMS sent me his (public) qsort.c routine. You can find it in <KLH>QSORT.C.
When trying to compile it, the compiler barfs on references to
indirect function calls:

static  int		(*qcmp)();		/* the comparison routine */

Every call on qcmp() produces an error.  Interestingly enough, when I
tried compiling this on the 4.2BSD VAX, there were no errors; but when
I tried it on the 2.9BSD 11/44, I got "call of non-function" errors.
Evidently there are standardization problems.
-------
27-Nov-84 19:39:46-PST,697;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Tue 27 Nov 84 19:39:43-PST
Date: Tue 27 Nov 84 19:37:19-PST
From: Ken Harrenstien <[email protected]>
Subject: more on qsort.c
To: [email protected]
cc: [email protected]

I changed all qcmp references to (*qcmp) which seems to be a more
portable form of the syntax (it is what I use in ELLE, and hasn't
caused any problems) -- it compiled OK then.  Unfortunately at runtime
qsort doesn't work right; things aren't sorted.  I tried it on the VAX and it
worked.  So there is some further KCC bug in there for the finding.
Aren't you happy they are starting to crawl out of the woodwork?
-------
28-Nov-84 02:18:26-PST,1422;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Wed 28 Nov 84 02:18:25-PST
Date: Wed 28 Nov 84 02:16:04-PST
From: Ken Harrenstien <[email protected]>
Subject: False alarm (hurray)
To: [email protected]
cc: [email protected]

Aha, I deduced the problem with QSORT, although I'm still not sure
what was going wrong with the debugging.

The problem is the fundamental screw of PDP-10 C: the non-congruence of
char pointers with any other kind of pointer.  QSORT's first argument
is defined to be a char pointer.  This screws up every caller who is
not careful to cast the first arg to (char *).  I shudder to think of
how many programs there are which pass "general-purpose pointers"
as function arguments.  Currently C provides for a check of the return-value
type, but does not have any way to declare (or check) the argument types.

As you can imagine, this makes QSORT incredibly inefficient since just
about all of its internal workings deal with char pointers.
To KCC's credit, though, once I cast the 1st arg properly in the call,
QSORT worked.  Considering the manipulations involved in that routine,
this is just short of unbelievable!

What the right thing should be is not immediately obvious.  I am going
to think a bit and see whether the approach I started in ELLE's
general-purpose storage management can be extended to cover this.
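
A sketch of the calling convention at issue, assuming a sort routine
whose first argument is a char pointer in the style described above.
This is not RMS's qsort; sort_chars, cmp_int and sort_ints are
hypothetical names, and a simple insertion sort stands in for the real
algorithm.  Note the explicit (char *) cast in the caller and the
(*qcmp)(...) form of the indirect call.

```c
#include <string.h>

/* Insertion sort with the old-style interface: base is a char
 * pointer and width is the element size in chars.
 * Assumes width <= 64; larger elements are left untouched. */
void sort_chars(char *base, int n, int width,
		int (*qcmp)(const char *, const char *))
{
	char tmp[64];
	int i, j;

	if (width <= 0 || width > (int)sizeof tmp)
		return;
	for (i = 1; i < n; i++) {
		memcpy(tmp, base + i * width, width);
		for (j = i; j > 0 && (*qcmp)(base + (j - 1) * width, tmp) > 0; j--)
			memcpy(base + j * width, base + (j - 1) * width, width);
		memcpy(base + j * width, tmp, width);
	}
}

/* Comparison routine: receives char pointers, compares as ints. */
int cmp_int(const char *a, const char *b)
{
	int x, y;

	memcpy(&x, a, sizeof x);
	memcpy(&y, b, sizeof y);
	return (x > y) - (x < y);
}

/* The caller must cast its int pointer to (char *) explicitly. */
void sort_ints(int *v, int n)
{
	sort_chars((char *)v, n, (int)sizeof v[0], cmp_int);
}
```

On a machine where char pointers and word pointers differ in format,
leaving out the cast in sort_ints passes the wrong kind of pointer, and
nothing at the call site checks it.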
-------
28-Nov-84 21:40:11-PST,1744;000000000001
Mail-From: DAGONE created at 28-Nov-84 21:40:06
Date: Wed 28 Nov 84 21:40:06-PST
From: Dan Newell  <[email protected]>
Subject: Fixes to two bugs...
To: [email protected]

  Bug fix #1 - Structures with members that have forward referenced
types must be ptr types now. It used to not complain about these,
now it generates an error.
	struct foo {
	    struct bar arf; /* wrong */
	    struct bar *asdf; /* ok...so long as bar is later defined */
	};
The second member listed will presently compile if bar is never
defined. I will have to fix this to make the cleanup code check
for undefined references.
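
A short sketch of the accepted case, assuming bar is defined later in
the same file.  The member name value and the helper foo_value are
hypothetical.

```c
struct bar;			/* incomplete type at this point */

struct foo {
	struct bar *asdf;	/* ok: only a pointer to the incomplete type */
};

struct bar {			/* the later definition completes the type */
	int value;
};

/* Follows the pointer once struct bar is complete. */
int foo_value(const struct foo *f)
{
	return f->asdf->value;
}
```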

  Bug fix #2 - Idivi using 16, trashing 17. Caused by someone
using '=' instead of '==' in an if statement. Evaluated true
always thereby avoiding the test later that checked that reg+1
was ok to use.
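
To illustrate the kind of test involved, here is a hypothetical sketch.
It is not the actual KCC source; the names REG_SP and ok_for_divide and
the in_use table are assumptions made for the example.

```c
#define REG_SP 017	/* stack (PDL) pointer, AC 17 */

/* IDIV leaves the quotient in AC and the remainder in AC+1, so a
 * divide may only use reg if reg+1 is free and is not the stack
 * pointer.  Returns 1 if reg is safe for a divide, else 0. */
int ok_for_divide(int reg, const int in_use[16])
{
	int next = reg + 1;

	if (reg < 0 || next > 017)
		return 0;
	if (next == REG_SP)	/* note ==; with = this is always true */
		return 0;
	return !in_use[reg] && !in_use[next];
}
```

Writing "next = REG_SP" in that condition would assign instead of
compare, and the value tested would be nonzero every time.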

   I've put the changes back into kcc.cc and will now rebuild
cc.exe. Let me know if there are any problems.
   Also, I am about to change the switches to look more like
real unix. Before I do so, I want to try to get a reasonable
set of them working so that you all won't be wondering what
hit each time I update them. I also want to come up with
a document describing the switches and mapping from old to new.
   Now, the problem is that I want the driver to chain in
link.exe and create the .exe from the .rel. I've wasted
a fair amount of time sifting through the exec trying to
figure out the mysteries of PRARG and what link.exe wants
because someone sent me there to look. Before I waste much
more time, I was hoping someone could tell me, or point me
to a better place to look, how to do this right. Documentation
would be appreciated or names of people to talk to.
   Dan
-------
29-Nov-84 20:12:58-PST,839;000000000001
Mail-From: WHP4 created at 29-Nov-84 20:00:38
Date: Thu 29 Nov 84 20:00:38-PST
From: Bill Palmer <[email protected]>
Subject: running link from kcc
To: [email protected]
cc: [email protected]

I did a little stone-turning in kcc, execcs, and pa1050 which enabled me to 
write a quick hack that will run link on a bunch of rel files and save the
result.  The program is <whp4>runlnk.c - it uses the pfork() call from 
<kcc.cc>ccpfrk and lifts the arg() call from <kcc.cc>ccasmb.c pretty much 
intact except for one magic number changed because it was different when the
exec set things up for link.

runlnk runlnk cc:ccpfrk

will run link to link runlnk itself.  The first filename on the command line
is where link saves the resulting image.

Hope that provides a little enlightenment.

					Bill
-------
29-Nov-84 20:15:37-PST,1064;000000000001
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Thu 29 Nov 84 16:47:33-PST
Date: 29 Nov 1984  19:46 EST (Thu)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Bug-KCC@Sierra
cc:   [email protected]
Subject: A few more bugs

These came up in porting some code produced by lex.

	int foo;
	extern int foo;

produces

	foo:	BLOCK	1
	EXTERN	foo

and FAIL barfs because you're trying to declare it both extern and local.

	extern int bar;
	int * baz = &bar;
	int * barf = { &bar };

The declaration of baz comes out correctly as

	<code to initialize $1>
	baz:	$1::	BLOCK	1

but barf becomes

	<code to initialize $3>
	$5:	$3::	BLOCK	1
	barf:	$5::	BLOCK	1

The extra braces cause weirdness which is detected when FAIL bombs
with a multiply defined symbol.  I think (don't remember for sure)
this came up before in KLH's bug list.  Strange that in all the work I
did in initialization I managed to miss this...
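
For reference, a sketch showing that both constructs are legal C and
what they should mean.  Here bar is defined (rather than only declared
extern) so the fragment links on its own; its value 42 and the helper
same_target are hypothetical.

```c
int foo;		/* tentative definition */
extern int foo;		/* later extern declaration of the same object */

int bar = 42;
int *baz = &bar;
int *barf = { &bar };	/* braces around a scalar initializer are allowed */

/* Both pointers should refer to the same object. */
int same_target(void)
{
	return baz == barf && *barf == 42;
}
```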
30-Nov-84 01:13:48-PST,704;000000000001
Mail-From: SATZ created at 30-Nov-84 01:13:45
Return-Path: <@SU-SCORE.ARPA:bassen@oslo-vax>
Received: from SU-SCORE.ARPA by SU-SIERRA.ARPA with TCP; Fri 30 Nov 84 00:59:19-PST
Received: from oslo-vax.ARPA by SU-SCORE.ARPA with TCP; Fri 30 Nov 84 00:34:35-PST
Received: by oslo-vax.ARPA (4.12/4.7)
	id AA11384; Fri, 30 Nov 84 09:37:08 -0100
Date: 30 Nov 1984 09:35-EST
From: T S Lande <[email protected]>
Subject: C on TOPS-20
To: TOPS-20@su-score
Cc: bassen@oslo-vax
Message-Id: <470651737/bassen@oslo-vax>
ReSent-date: Fri 30 Nov 84 01:13:45-PST
ReSent-From: Greg Satz <[email protected]>
ReSent-To: [email protected]

Anybody know where to obtain a C-compiler for TOPS-20?

30-Nov-84 15:39:49-PST,474;000000000001
Mail-From: WHP4 created at 30-Nov-84 15:39:44
Date: Fri 30 Nov 84 15:39:44-PST
From: Bill Palmer <[email protected]>
Subject: macro processor bug?
To: [email protected]

Howcum this doesn't work?  Works with the 4.2 compiler and PCC...

#define ctrl(letter)	('letter' & 077)

main()
{
    char c;

    c = ctrl(G);
}

KCC gives errors about unclosed character constants, and then about not being
able to find )'s and ;'s.

					Bill
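
The macro above depends on the preprocessor substituting a parameter
inside a character constant.  Some older preprocessors did this; ANSI C
preprocessors do not.  A sketch of a form that avoids the question,
assuming the caller can supply the quotes (the names CTRL and bell are
hypothetical, and the result assumes ASCII):

```c
/* The caller writes the character constant; the macro only masks it. */
#define CTRL(ch)	((ch) & 077)

int bell(void)
{
	return CTRL('G');	/* 0107 & 077 = 07 in ASCII */
}
```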
-------
 1-Dec-84 18:00:07-PST,647;000000000001
Mail-From: WHP4 created at  1-Dec-84 18:00:05
Date: Sat 1 Dec 84 18:00:05-PST
From: Bill Palmer <[email protected]>
Subject: INPUT SYNTAX ERROR
To: [email protected]

It appears that if you give a full filename to kcc, it will hand off bogus
arguments to FAIL which produces the error message above.

I take that back; it doesn't always happen.  However, it happens with the
file ps:<whp4>brkfai.c if you type "cc ps:<whp4>brkfai" and not if you type
"cc <whp4>brkfai".  It appears that in the broken case, fail is getting the
argument string "brkfai,=" instead of "brkfai,=brkfai" or something similar.

						Bill
-------
 3-Dec-84 01:39:29-PST,610;000000000001
Mail-From: WHP4 created at  3-Dec-84 01:39:27
Date: Mon 3 Dec 84 01:39:26-PST
From: Bill Palmer <[email protected]>
Subject: somebody should implement this new instruction...
To: [email protected]

or fix kcc so it doesn't try to produce a "FLTRI" opcode.  Here's an example
of how to do it:

main()
{
    float g;

    g = 0;
}

and what kcc produces:

	TITLE	bug
	TWOSEG
	RELOC	0
	RELOC	400000

	OPDEF	XMOVEI [SETMI]
	OPDEF	ADJBP [IBP]
	DEFINE	IFIW <SETZ >
	.REQUEST C:CLIB
	EXTERN	.START
main:
	FLTRI	3,0
	POPJ	17,

	INTERN	main

	LIT
	END

						Bill
-------
 3-Dec-84 08:03:57-PST,840;000000000001
Mail-From: SATZ created at  3-Dec-84 08:03:54
Date: Mon 3 Dec 84 08:03:54-PST
From: Greg Satz <[email protected]>
Subject: Re: seems like a bug, sort of
To: [email protected]
cc: [email protected]
In-Reply-To: Message from "Bill Palmer <[email protected]>" of Sun 2 Dec 84 18:15:07-PST
Phone: (415) 497-1004

	That 4.2 scanf does this:

	sscanf("0x0A","%x",&c);
	printf("c = %d\n",c);

	prints c = 0, because it stops reading at the 'x'.  Or does no
	one ever use that notation except in c code?

This seems like the correct behavior. The leading zero and leading 0x
notation is really only supported by the C compiler for constants. Some
other C programs have borrowed it for consistency, but the runtimes
don't really support it internally. Besides, any number input could
really be in any base.
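
Where a program does want the compiler's notation on input, one option
is the standard strtol with a base of 0, which accepts a leading 0x for
hex and a leading 0 for octal.  This is a sketch using standard C, not
something taken from the KCC runtimes; parse_c_number is a hypothetical
name.

```c
#include <stdlib.h>

/* Convert a number written the way a C constant is written:
 * "0x..." is hex, a leading "0" is octal, anything else is decimal. */
long parse_c_number(const char *s)
{
	return strtol(s, NULL, 0);
}
```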
-------
 1-Dec-84 06:16:51-PST,7332;000000000011
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Sat 1 Dec 84 06:16:40-PST
Date: Sat 1 Dec 84 06:14:02-PST
From: Ken Harrenstien <[email protected]>
Subject: Public LIBC plan
To: [email protected], [email protected]
cc: [email protected]

I am going to see if I can get some official time to work on fleshing out
a public LIBC or at least coordinating it.

In the meantime, I have come up with a first cut at a procedure, which
is in the file <KLH>LIBC.PLAN and which I am including in this
message.  It is sketchy and needs your comments and opinions before
proceeding further.  I think something of the sort is needed because
I would like to see both GNU and KCC be able to share the bulk of such
a library.

A first cut at the LIBC.MASTER file also exists.  It lists all
external symbols defined both in SRI-TSCB's 2.9BSD 11/44 libc.a, and
KCC's CLIB.REL.  I got a little carried away in describing the
LIBC.MASTER format here since I think it may want to be machine
parseable for various types of processing.  At the very least, this
ensures it can readily be converted to some other (more desirable?)
format.

------------------------ LIBC.PLAN --------------------------
Overall plan for putting together a public LIBC, to go with a public C
compiler:

(1) Scan a V7 libc.a to get a complete list of symbols: LIBC.V7
	Use the SYMS program.
	Categorize the symbols; identify them.
	Cross-check against the V7 UPM to verify.
(2) Do likewise for existing GNU routines: LIBC.GNU
(3) Do likewise for existing KCC routines: LIBC.KCC
(4) Merge the three lists; this becomes the "libc master list": LIBC.MASTER
	For each V7 symbol show its current status.
	Format is described below.
(5) Select a location to hold canonical LIBC sources.
	Copy all versions there.
	Where?  A UNIX system would provide a "homey" environment
	and permit cross-checking of V7 routines with GNU versions.
	On the other hand, stuff that works on a TOPS-20 system
	is almost certain to be portable elsewhere, and there may be
	less suspicion concerning UNIX copyright violations.
	Note: SRI-NIC can provide huge amounts of disk storage and
	would be a good distribution location, but probably cannot
	support active development - this can be seen as bad (more
	work to update canonical package from development systems) or
	good (clear formal separation between canonical distribution and
	testing-in-progress versions).  I'm neutral.  What is the general
	status of SU-SIERRA or MIT-PREP?  Comments?

(6) ?? Copy the version info (from libc master) at start of each routine.
	Whenever this is changed, the info must be changed both
	in the heading and in libc master.
	Hack up a version-info extractor so this part of libc
	master can be generated automatically from the files.
	Create place-holder files for functions not yet written?
(7) Assign priorities for implementation (which routines to write first).
	Make these priority notations part of LIBC.MASTER.

Format of symbol entries in LIBC.MASTER:

At least one line per symbol, possibly more:

<class> <symbol+type>	<version>: <module> (Need: <needed-versions>)
			<version>: <module> <other-info>

	Class:	Classification of symbol - similar to UPM sections.
		One-letter type indicators, okay to OR them by using together.
			S = System calls
			U = UNIX-environment dependent (e.g. GETPWENT)
			I = I/O
			H = Defined in .h file, not library.
			C = General C support - not I/O, System, or
				Unix-dependent.  Functions that "extend"
				C.  E.g. string routines, qsort, simple
				math stuff.
			L = Library support - needed by other library routines
				but not normally used by or known to 
				external user.  Usually written in C.
			R = C internal runtime support, invoked by
				compiler to handle various parts of the
				language.  Not invoked as C functions.
				Normally very system specific
				and written in assembler.
		The UPM classification can be appended with a dash:
			-2 = Unix system call
			-3 = General C library
			-3S = STDIO "library", normally integral with libc.a
			-3M = Math library, libm.a
			-3X = Other esoteric library
	Symbol+Type: The symbol, as a C declaration to indicate type.
		Always ended with a semicolon.
		.H file definitions are indicated thusly:
			switch foo;		Conditional compile switch
			constant FOO;		Manifest constant
			structure foo;		Structure declaration
			macro FOO;		Macro definition (general)
			macro int foo();	Macro "function" definition
	Module: Name of module (minus .C or .A extension).  One word.
		If an .H file, the .H is retained.
	Version: V7, GNU, KCC.  V7 indicates Western-Electric UNIX and
		should always appear on the first line if it exists.
		There must be one line for each version, indented if
		the class and symbol declaration are the same.
		If version is NOT written in C, specify language with
		suffixed dash, e.g. KCC-FAIL.  There is no V7-AS because
		the source is unavailable and the distinction is irrelevant.
		Note that if a GNU version exists, it is assumed to be
		usable by all other versions.  Thus a separate KCC version
		is not needed.  Exceptions can be flagged with the
		"Needed" keyword (see below).
		
	Info: General keyworded info.  Each item enclosed in parens or
		brackets; the first word is a keyword identifier (unique
		prefix is sufficient) with optional colon.  If the item is
		"unbalanced" with respect to parens/brackets, the offending
		char can be quoted with \.  This is unlikely to ever happen.
		In general, to indicate a "null-specified" item (as
		opposed to the default, which is assumed if item is not
		specified at all), use the value "-".

		The first two keywords (Done, Needed) only appear on the first
		line.  "Done" allows for compaction in cases where the
		default information is okay.  "Needed" marks those places
		which need work.  Keeping all on one line makes it easier
		to scan for things (using GREP or M-X Keep Lines$).
		Done: Which versions have been written.
			Default: -
			If a version does not appear here, but appears on a
			succeeding line anyway, that is OK.  The line wins.
			If a version appears here but does NOT appear on
			a succeeding line, everything for that
			version is defaulted; class, symbol, module name
			are as for first line.  Keyword items are all
			defaulted.
		Needed: Which versions still need to be written.
			Default: -
			If a version appears here, but also appears on a
			succeeding line anyway, the line wins.

		Maintainer: Who is responsible for routine.
			Default: Whoever is generally responsible for version.
				V7: -
				KCC: Satz@SU-SIERRA
				GNU: RMS@MIT-MC
		Source: location of source file (e.g. "module")
			Default: 
				V7: -
				KCC: [SU-SIERRA]PS:<KCC.C>module.C    ???
				GNU: [MIT-PREP]/u/rms/gnulib/module.c ???
		Doc: name of doc file
			Default:
				V7: UPM.  "man" directory.
				KCC: ? UPM? module.DOC?
				GNU: ? UPM? module.man?
		Sys: Systems tested on
			Default: assumed portable to all systems.
		Comments: Obvious.

Example:

HI-3S	macro FILE;	V7: stdio.h (Done: KCC) (Need: GNU)
I-3S	FILE *fopen();	V7: fopen (Need: GNU)
			KCC: stdio (Sys: TOPS-20)
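
Since the format is meant to be machine parseable, here is a minimal
sketch of reading the class field alone (e.g. "HI-3S").  The function
parse_class and the CL_ constants are hypothetical; nothing here is
part of the plan itself.

```c
#include <string.h>

/* One bit per class letter, in the order S U I H C L R. */
enum { CL_S = 1, CL_U = 2, CL_I = 4, CL_H = 8, CL_C = 16, CL_L = 32, CL_R = 64 };

/* Parse a class field such as "HI-3S".  Returns the OR of the class
 * letters, or -1 on an unknown letter.  The UPM classification after
 * the dash (if any) is copied into section, which holds secsize chars. */
int parse_class(const char *field, char *section, size_t secsize)
{
	static const char letters[] = "SUIHCLR";
	int mask = 0;
	const char *p;

	if (secsize == 0)
		return -1;
	section[0] = '\0';
	for (p = field; *p != '\0' && *p != '-'; p++) {
		const char *hit = strchr(letters, *p);

		if (hit == NULL)
			return -1;
		mask |= 1 << (int)(hit - letters);
	}
	if (*p == '-') {
		strncpy(section, p + 1, secsize - 1);
		section[secsize - 1] = '\0';
	}
	return mask;
}
```

For the first example line, parse_class("HI-3S", ...) yields
CL_H | CL_I with section "3S".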

-------
 3-Dec-84 22:29:25-PST,1028;000000000001
Return-Path: <P.PRUFROCK%LOTS-B.#Pup@[36.48.0.1]>
Received: from [36.48.0.1] by SU-SIERRA.ARPA with TCP; Mon 3 Dec 84 22:29:19-PST
Received: from LOTS-B by LOTS-A with Pup; Mon 3 Dec 84 21:25:36-PST
Date: Mon 3 Dec 84 21:25:06-PST
From: Douglas Lee <P.PRUFROCK@LOTS-B>
Subject: Problems compiling a C program
To: bug-c@LOTS-B



I downloaded a program from SUMEX written for the MIT C compiler. I've been
trying to compile it using the C compiler under the directory sys:<cc>, but
I haven't been successful. It seems to compile fine, but when I try to load it
to create an .exe file, the linker gives me an error ps:clib.rel not found.
Am I doing something wrong or is this a problem with the compiler? The program
is an implementation of MACGET for the Mac. It is a file transfer program.
Since I am not familiar with C I am not sure if I am doing things right. Do you
know of a fully implemented C compiler on any of the 20's at Stanford?

Thanks in advance,

Douglas Lee <p.prufrock@lots-b>


-------
 3-Dec-84 22:57:08-PST,1206;000000000001
Mail-From: SATZ created at  3-Dec-84 22:57:02
Date: Mon 3 Dec 84 22:57:02-PST
From: Greg Satz <[email protected]>
Subject: two bug reports
To: [email protected]
Phone: (415) 497-1004

While helping Bill Palmer track down a test case for one bug, I stumbled
across another. Since the second bug is easier to explain, it will be
first.

1) if the printf in the following program is omitted, the compiler
generates an "unexpected eof error". The 4.2bsd compiler generates a
syntax error which is more correct.

2) it looks like the increment of the pointer p is getting optimized out
if the first if condition is true.

main()
{
	char c, *p = "abcd";
	int i;

	if ((c = *p++) == 'a')
		goto done;
	if (c == '*') {
		c = *p++;
		i = 0;
	} else
		i = 1;
done:
	printf("done %d\n", *p);
}

main:
	ADJSP	17,3
	XMOVEI	3,$2
	IOR	3,$BYTE+4
	MOVEM	3,-1(17)
	LDB	11,3
	MOVEM	11,-2(17)
	CAIN	11,141
	JRST	$1
	CAIE	11,52
	JRST	$3
	ILDB	10,-1(17)
	IBP	-1(17)
	MOVEM	10,-2(17)
	SETZB	12,0(17)
	JRST	$1
$3::
	MOVEI	14,1
	MOVEM	14,0(17)
$1::
	LDB	5,-1(17)
	PUSH	17,5
	XMOVEI	6,$4
	IOR	6,$BYTE+4
	PUSH	17,6
	PUSHJ	17,printf
	ADJSP	17,-5
	POPJ	17,
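
The intended behavior for case 2 can be sketched as a function that
returns the char p points at on reaching the label.  The control flow
is the same as in the program above; the name char_at_done and the
parameter s are hypothetical.

```c
/* Returns the char that p points at when control reaches "done". */
int char_at_done(const char *s)
{
	const char *p = s;
	char c;
	int i;

	if ((c = *p++) == 'a')
		goto done;	/* p has already been advanced here */
	if (c == '*') {
		c = *p++;
		i = 0;
	} else
		i = 1;
	(void)i;		/* i is unused, as in the original */
done:
	(void)c;
	return *p;
}
```

With the string "abcd" the increment happens before the goto, so the
result should be 'b'.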
-------
 4-Dec-84 10:38:58-PST,826;000000000001
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Tue 4 Dec 84 10:38:47-PST
Date: 4 Dec 1984  13:37 EST (Tue)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   [email protected]
Subject: two bug reports
In-reply-to: Msg of 4 Dec 1984  01:57-EST from Greg Satz <SATZ at SU-SIERRA.ARPA>

The increment problem sounds like the code that pulls IBPs into DPBs
and LDBs is looking back too far (i.e. past the jump).  Should be easy
enough to fix.

The PS:CLIB.REL problem on LOTS-B sounds like the library was copied
from Sierra rather than recompiled on LOTS.  FAIL puts absolute PPNs
into REL files, making them less than portable.  It would be good if
some way were found to stop it doing so.
 4-Dec-84 11:49:34-PST,662;000000000001
Date: Tue 4 Dec 84 11:49:34-PST
From: Greg Satz <[email protected]>
Subject: Re: [Richard M. Stallman <RMS%[email protected]>: Re: C compiler?]
To: [email protected]
cc: *"PS:<KCC.CC>MAIL.TXT.1"@SU-SIERRA.ARPA
In-Reply-To: Message from "Ken Harrenstien <[email protected]>" of Sat 24 Nov 84 14:52:40-PST
Phone: (415) 497-1004

I don't think the KCC compiler will be useful to him at all. He is
welcome to copy the sources as long as he doesn't remove the Stanford
copyright and he doesn't try to sell it (the last thing I expect from
RMS). They are kept on SU-Sierra in <kcc.cc> and <kcc.lib>. Please keep
me posted on his interest in it.
-------
 4-Dec-84 13:31:42-PST,2268;000000000001
Return-Path: <RMS@MIT-MC>
Received: from MIT-MC by SU-SIERRA.ARPA with TCP; Tue 4 Dec 84 13:31:32-PST
Date: 4 December 1984 16:16-EST
From: Richard M. Stallman <RMS @ MIT-MC>
Subject: Re: Public LIBC plan
To: SATZ @ SU-SIERRA

I can use code with a Stanford copyright only if the code contains an
explicit permission notice that permits the sort of distribution I
will be doing with GNU.  This will include (I hope) companies giving
away copies of GNU to people who buy computers.  I would also like
the notice not to forbid charging for the service of distribution.

The right definition of "free" for software is "free to be
redistributed."  What I prohibit is for anyone to restrict anyone
else's right to redistribute.  While people can still charge for
distributing the program, they cannot charge very much, because
they no longer have the ability to secure any sort of monopoly.

Here is the permission notice that I use.  I recommend that you
use it too.

   Copyright (C) 1984 Richard M. Stallman

   Permission is granted to anyone to make or distribute
   verbatim copies of this program
   provided that the copyright notice and this permission notice are preserved;
   and provided that the recipient is not asked to waive or limit his right to
   redistribute copies as permitted by this permission notice;
   and provided that anyone possessing a machine-executable copy
   is granted access to copy the source code, in machine-readable form,
   in some reasonable manner.

   Permission is granted to distribute derived works or enhanced versions of
   this program under the above conditions with the additional condition
   that the entire derivative or enhanced work
   must be covered by a permission notice identical to this one.

   Anything distributed as part of a package containing portions derived
   from this program, which cannot in current practice perform its function
   usefully in the absence of what was derived directly from this program,
   is to be considered as forming, together with the latter,
   a single work derived from this program,
   which must be entirely covered by a permission notice identical to this one
   in order for distribution of the package to be permitted.

 4-Dec-84 13:48:09-PST,1324;000000000001
Return-Path: <RMS@MIT-MC>
Received: from MIT-MC by SU-SIERRA.ARPA with TCP; Tue 4 Dec 84 13:48:03-PST
Date: 4 December 1984 16:28-EST
From: Richard M. Stallman <RMS @ MIT-MC>
Subject: Public LIBC plan
To: KLH @ SRI-NIC
cc: satz @ SU-SIERRA

I'm not sure that the formality of the LIBC.MASTER data base
is really needed.  I think that the task of coordinating the
effort to get all the necessary things written can be done
with much less work informally.  Just putting the library sources
in a standard place so that people can see what exists
will accomplish a large part of the job.

Does LIBC.MASTER serve some other useful purpose?
If you feel it's useful and want to set it up, I would not
mind making the small updates needed when I write a library.
But I think you could probably accomplish more by making a simple
file that just records, for each Unix man page, whether it has
been written or not or who is working on it, and using the rest
of the time to write some libraries.

MIT-PREP is not ideal for a storage site since no system backups are
done.  SRI-NIC might be better for that.  I'm not sure that SRI-NIC
would let me have an account since I wouldn't keep the password
secret.  Perhaps if the library directory protection were set to all
7's I could write in it as anonymous.

 5-Dec-84 03:44:59-PST,1100;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Wed 5 Dec 84 03:44:55-PST
Mail-From: KLH created at  5-Dec-84 03:42:06
Date: Wed 5 Dec 84 03:42:06-PST
From: Ken Harrenstien <[email protected]>
Subject: PRINTF bug
To: [email protected]
ReSent-Date: Wed 5 Dec 84 03:42:29-PST
ReSent-From: Ken Harrenstien <[email protected]>
ReSent-To: [email protected]

PRINTF screws up and can get into a semi-infinite loop if you try to
print anything bigger than 132 chars.  In particular, trying to do
	printf("%s", buffer);
will make a horrible mess if the string in buffer is a little too long.

It could be argued that one should use FPUTS instead.  However, the UPM
does not mention this limitation, and PRINTF itself certainly doesn't help
the user figure out what is going on.  The code which crapped out on me
is known to work on both 11/44 2.9BSD and 4.2BSD systems, although possibly
only because their versions have bigger buffers.  I do think 132 is far
too small, whether or not the printf code is fixed to detect overflow.
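
As a sketch of what "detect overflow" could look like, here is a
bounded format into a fixed buffer that reports truncation instead of
running past the end.  It uses snprintf, which is a later standard C
routine and an assumption here, not something in the KCC library;
format_checked is a hypothetical name.

```c
#include <stdio.h>

/* Format src into dst, which holds dstsize chars.
 * Returns 0 if the whole string fit, -1 if it was truncated
 * or if formatting failed. */
int format_checked(char *dst, size_t dstsize, const char *src)
{
	int n = snprintf(dst, dstsize, "%s", src);

	return (n >= 0 && (size_t)n < dstsize) ? 0 : -1;
}
```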
-------
 5-Dec-84 10:36:29-PST,726;000000000001
Return-Path: <@SRI-NIC.ARPA:[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Wed 5 Dec 84 10:36:19-PST
Received: from SU-SIERRA.ARPA by SRI-NIC.ARPA with TCP; Wed 5 Dec 84 10:33:47-PST
Date: Wed 5 Dec 84 10:35:10-PST
From: Greg Satz <[email protected]>
Subject: Re: PRINTF bug
To: [email protected]
cc: [email protected]
In-Reply-To: Message from "Ken Harrenstien <[email protected]>" of Wed 5 Dec 84 03:44:57-PST
Phone: (415) 497-1004

printf shouldn't use any internal buffers at all. The 4.2 stdio
passes all output to flsbuf who actually does the write(). KCC
should be changed to do the same thing. For the interim, I increased
the buffer sizes up to BUFSIZ (1024).
-------
 8-Dec-84 10:18:57-PST,590;000000000001
Mail-From: SATZ created at  8-Dec-84 10:18:54
Date: Sat 8 Dec 84 10:18:54-PST
From: Greg Satz <[email protected]>
Subject: kcc <-> gnu
To: [email protected]
cc: [email protected]
Phone: (415) 497-1004

RMS is interested in the work we have done for the C runtimes. He is
not interested in the compiler. Can anyone come up with any reasons why
we should not deal with RMS?  Please take note that we have gotten a
qsort() from him and will probably get a few more things. I will
make everything in <KCC.LIB> on Sierra available to him unless I hear
otherwise.
-------
13-Dec-84 17:47:53-PST,1165;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Thu 13 Dec 84 17:47:48-PST
Date: Thu 13 Dec 84 17:50:47-PST
From: Ken Harrenstien <[email protected]>
Subject: Suggestion
To: [email protected]

KCC is something of a pain to run because it insists on printing out
the name of every function it sees.  I strongly suggest that it NOT
do this by default (make it switch-optional), so as to minimize the
amount of resulting trash.  KCC should then remember internally
the name of the last function it saw, and print out this name whenever
an error message is generated.  If you want to be fancy you can also
remember the line number of the last defined function, so that you can
provide the user with an offset from there as well as the absolute
line number.  This is often more useful when editing with EMACS (rather
than ED!)

The only other reason I can think of for the current printout is to give
the user the impression that KCC is running very fast.  All I can say is
that FAIL could print out every symbol it found, and it would look even
faster and busier, and be equally meaningless...
-------
18-Dec-84 19:00:00-PST,578;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Tue 18 Dec 84 18:59:56-PST
Date: Tue 18 Dec 84 19:02:51-PST
From: Ken Harrenstien <[email protected]>
Subject: Ugh!  Bug with local gotos
To: [email protected]

The following test program

main()
{	int a,b;
	a = 123;
	if(a > 0) goto foo;
	b = 456;
}

will compile without error, but FAIL will complain about a meaningless
undefined symbol.  If your program is any more complicated than the
above, it may take you some time to figure out what the problem is!
-------
18-Dec-84 19:12:25-PST,1246;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Tue 18 Dec 84 19:12:22-PST
Date: Tue 18 Dec 84 19:15:15-PST
From: Ken Harrenstien <[email protected]>
Subject: Suggestion for FAIL output
To: [email protected]

I would recommend that when KCC generates EXTERN statements for FAIL, it
avoid putting more than one symbol on each EXTERN line.  The reason is that
when FAIL complains about an already defined symbol, it does not always
report what symbol it is upset at, and if (as is usually the case) there
are several symbols on the EXTERN line reported, it is not easy to figure
out which one is causing the problem.

This problem often happens when a program tries to use long symbols which
are not unique in the first 6 characters.  I think it is also a bug that
KCC does not try to help find this sort of problem, by warning you of
things that are identical in the first 6 chars.  After all, it certainly
knows it is going to output to FAIL, and it knows that FAIL won't grok
such things.  Make this limit a variable, so that if and when an
assembler/loader/debugger ever arrives which can handle long symbols,
KCC will not have to change anything other than one value.
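
A rough sketch of such a check, assuming a plain case-sensitive
comparison (how FAIL itself folds or maps characters is not covered
here).  The names sym_limit, symbols_clash and first_clash are
hypothetical, not KCC's.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Number of significant characters in the back end's symbols: 6 for
   FAIL, per the message above.  Kept in a variable so that it is the
   only thing that has to change. */
size_t sym_limit = 6;

/* Return nonzero if a and b are different names that would collapse
   to the same symbol once truncated to sym_limit characters. */
int symbols_clash(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strcmp(a, b) != 0 && strncmp(a, b, sym_limit) == 0;
}

/* Return the index of the first earlier name that clashes with
   names[i], or -1 if there is none. */
int first_clash(const char *const names[], size_t i)
{
    size_t j;

    for (j = 0; j < i; j++)
        if (symbols_clash(names[j], names[i]))
            return (int)j;
    return -1;
}
```

A compiler could call something like first_clash() each time a global
is entered in its symbol table and warn when the result is not -1.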
-------
20-Dec-84 11:58:59-PST,1354;000000000001
Mail-From: SATZ created at 20-Dec-84 11:58:55
Date: Thu 20 Dec 84 11:58:55-PST
From: Greg Satz <[email protected]>
To: [email protected]
cc: [email protected]
Subject: List of runtimes
In-Reply-To: Message from "Richard M. Stallman <RMS @ MIT-MC>" of Sat 8 Dec 84 01:29:00-PST
Phone: (415) 497-1004

Here is a quick list of the runtimes that have been written. I checked
each file for portability and made a note next to its name.

 ABORT.FAI.1	not portable
 ACCESS.C.6	not portable
 ATOI.C.3	portable
 BYTE.FAI.10	not portable
 CALLOC.C.6	portable
 CTYPE.C.2	may be portable
 CTYPE.H.3	may be portable
 DFIX.FAI.3	not portable
 FSEEK.C.2	portable
 GETENV.C.1	not portable
 GETPID.C.8	not portable
 JSYS.FAI.1	not portable
 LSEEK.C.3	not portable
 MALLOC.C.5	portable
 MKTEMP.C.6	partially portable
 PERROR.FAI.2	not portable
 PRINTF.C.19	may be portable
 QSORT.C.2	your version
 RUNTM.C.66	partially portable
 RUNTM.T.38	partially portable
 SETBUF.C.3	portable
 SETJMP.FAI.1	not portable
 SETJMP.H.2	portable
 SIGNAL.FAI.6	not portable
 SIGNAL.H.2	portable
 SLEEP.C.3	not portable
 STAT.C.27	not portable
 STDIO.C.28	portable
 STDIO.H.4	portable
 STRING.C.3	portable
 TIME.C.26	not portable
 TOPS20.FAI.106	not portable
 UNLINK.C.3	not portable
 WAITS.FAI.3	not portable
 YACCPAR..1	portable
-------
 7-Dec-84 17:30:41-PST,4781;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Fri 7 Dec 84 17:30:26-PST
Date: Fri 7 Dec 84 17:27:16-PST
From: Ken Harrenstien <[email protected]>
Subject: Public LIBC plan
To: [email protected], [email protected]
cc: [email protected]

I talked with Jake and it looks like we can justify spending some of
my real time on coordinating LIBC (and other stuff perhaps), which is
a big help.  I probably won't have much time for massive code writing,
but as Greg mentioned, that can be fielded out to students (hey, how
about assigning libc routines as class projects?  Slave labor dept...)

I agree that LIBC.MASTER looks like too much formal overhead.
However, if I am the only one updating it, then you don't need to
worry about that part of it.  I think something like it is necessary,
because my problem is I'm trying to keep track of a bunch of different
efforts none of which are happening on my own machine.  If KCC was the
only thing going on, then Greg could take care of it.  In fact I
wonder whether it might not be better just to let Greg handle all the
LIBC stuff, period.  Greg, perhaps I should explain that RMS suggested
I do this as a way of helping GNU; I wasn't sure at first I could, but
I think I can now, if it doesn't mess up any of your plans.
	Anyway, the complexity of LIBC.MASTER comes from trying to
keep it machine-parseable (why?  why not?) and still describe
routines written for different systems which may or may not be
equivalent.  Besides, there are different versions of the UPM so it is
hard to do it by identifying UPM pages (which are good specs but don't always
have everything that is needed, as RMS has pointed out).

As it is, I don't think I have any problems dealing with modules that
are intended for use with 4.2-compatible systems, or any other.  The
idea from the start was that different versions of LIBC can co-exist;
which collection of routines you will copy from the master software
repository for your own system will depend on what kind of system you
have.  For example, we can have a "core" set of routines, all written
in C, which are as portable as possible.  GNU would be a prime source
and user of these.  Then there can exist alternate versions of the
routines which are highly optimized for specific systems and
compilers; the string handling routines, especially on the PDP-10,
really want to be in assembler.  But the existence of these
specialized versions never implies non-portability since the generic C
version will always be available for bootstrapping with new systems or
if there are any problems.

The thing I am wondering about at the moment is where the
canonical-version location should be.  Either SU-SIERRA or SRI-NIC has
sufficient disk space, and is willing to do it.  The main argument for
SIERRA is that a lot (most?) of the development is happening there.
The main argument for SRI-NIC is that if I am doing the coordination
stuff, it would naturally be easier to do it on my own machine.
Likewise if Greg were doing this, SIERRA would be the obvious choice.

A separate location does have the advantage that it helps to guarantee
that LIBC.MASTER corresponds exactly to the source files available, no
matter where they come from.  It does require more work in copying the
files.  Of course this can be achieved either by using a different
machine (SRI-NIC) or a different directory tree on the same machine
(SU-SIERRA).  One other thing; RMS is right about access to SRI-NIC
(no accounts with public passwords) but SU-SIERRA is not in as visible
a position so things might be different?  In either case, anonymous FTP
writes are possible if the directory access is set to permit it.

How about this suggestion: Initially, set up a canonical source
directory on SRI-NIC, just on the grounds that I am more likely to
spend time on it and stay on top of things that way.  I, or a
designated person here, would be responsible for maintaining both
LIBC.MASTER and the contents of this directory (by FTPing files over
when the authors proclaim them to be ready).  If I decide I am wasting
too much effort in shuffling things over the net, or if I am unable to
keep up and have to punt, then we can move it all to a directory on
SIERRA (or alternatively some UNIX system).  Incidentally, copyright
notices should be specific to each file, so it doesn't matter where
they actually live or how they are intermixed.  I hope Stanford does
not insist on copyrighting everything "Stanford" whether or not it was
written at Stanford.

More comments?

P.S. I have a TOPS-20/TENEX program that groks TAR format files.  With
some spiff-up, this could be of use for moving stuff around.
-------
29-Dec-84 18:41:09-PST,2922;000000000001
Mail-From: KRONJ created at 29-Dec-84 18:41:05
Date: 29 Dec 1984  18:41 PST (Sat)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   [email protected]
Subject: Many improvements and a mystery

I was a little bored today, so I decided to attack the backlog of bug
reports for the C compiler.  Here's a list of improvements:

(1) It now works to have macro arguments in the middle of strings, e.g.
	#define ctrl(letter) ('letter' & 037)
    The problem was that the code to scan arguments when parsing a
    macro call was adding a spurious space at the end of each argument
    expansion, so that you would get what appeared to the compiler
    to be unclosed character constants.

(2) FLTRI no longer appears.  This was only happening when the
    immediate constant was zero.  Now a coercion of zero to float
    results in a SETZ.  I also took the opportunity to move the code
    that converts the other FLTRIs into MOVSIs into a more appropriate place.

(3) Common subexpression optimization on index registers was happening
    even when the -n flag was set.  Fixed.

(4) IBPs are no longer pulled to the wrong side of conditional jumps.

(5) Function names are no longer typed out when they are first encountered.
    Instead, they are remembered along with the location of the start
    of the definition so that error messages can type the name of the
    function in which the error occurred and the line in that function.

(6) Errors discovered after the last text in the input file has been
    read will be reported correctly, rather than causing an unexpected
    end of file message.

(7) Goto labels that are never defined are now an error (rather than
    causing FAIL to complain about some unidentifiable symbol not existing).

(8) An extern declaration of an already defined variable no longer
    causes the INTERN declaration for that variable in the FAIL file
    to become an EXTERN declaration.

(9) If statements with no bodies will now generate code for the
    condition, in case it has side effects.  I decided not to bother
    typing a warning in this case - let lint deal with it.

(10) Multiple-dimensional character array indexing should now work.

I didn't look at the runtimes; also there is still much to do on the
compiler itself, including fixing initializations (pointers in
brackets lose, and I don't think integers are turned into floats)
and understanding the EXEC COMPILE command's PRARG% conventions.

And now the mystery:  It seems that whenever I change the storage
declarations in KCC (and sometimes when I don't), FAIL starts
producing different REL files for the same FAI files.  This doesn't
seem to break anything but it is irritating when I'm trying to find
out if any real changes happened.  Anyone have any ideas why this
might be happening?

David
30-Dec-84 00:26:10-PST,971;000000000001
Mail-From: KRONJ created at 30-Dec-84 00:26:07
Date: Sun 30 Dec 84 00:26:07-PST
From: David Eppstein <[email protected]>
Subject: yacb fixed
To: [email protected]

Kirk's problem, where
	int mumble = { 0 };
would produce a very large BLOCK after the variable, has been fixed.
I was using a stack variable without ever setting it...no wonder it lost.

In general initialization still sucks.  I think it should be largely
rewritten, but I am still too confused about it to start...  Consider
	int *foo = { &fah };		/* init to addr of int variable */
	int *foo = { NULL };		/* init to null pointer */
	int *foo = { 1, 2, 3 };		/* init to addr of unnamed vector */
	int *foo = { 0 };		/* init to addr of shorter vector */
Obviously not all of the above can coexist, as two lines are identical
syntactically but with different meanings.  The problem is that all
the meanings are plausible, and I don't know which are correct.  Help?

David
-------
31-Dec-84 00:46:26-PST,543;000000000001
Mail-From: KRONJ created at 31-Dec-84 00:46:23
Date: Mon 31 Dec 84 00:46:23-PST
From: David Eppstein <[email protected]>
Subject: warning - sources broken
To: [email protected]

I'm in the middle of making initializers work right, and currently they
don't work at all.  So in the unlikely event of someone else hacking
up the C compiler at this point, remember to recompile the old ccgen
rather than the new broken one.  I may get things back into shape
tomorrow... right now I am no longer able to think straight.
-------
31-Dec-84 12:16:52-PST,893;000000000001
Mail-From: KRONJ created at 31-Dec-84 12:16:49
Date: Mon 31 Dec 84 12:16:49-PST
From: David Eppstein <[email protected]>
Subject: Initializers finally work right
To: [email protected]

As promised, I rewrote the initializer code.  It is now driven by the
data type of the variable it is initializing, rather than the
structure of the initializer.  This means that the code is simpler
(well, I understand it better anyway) and it can get a lot more cases
right.  I checked my pointer question against the 4.2 compiler, and it
didn't accept the constructs I thought would make pointers to unnamed
vectors.  So now I don't do that either - a pointer initializer is
always just a word whether an address or an int.  Nested initializers
are now accepted - that section of the parser was a bit confused.  The
new code can even initialize bit fields in structures...
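
For illustration, a few initializers of the kinds described, written
as standard C rather than as KCC test cases.  The variable and type
names are made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>

int fah = 7;

/* A scalar initializer may be wrapped in one pair of braces; either
   way a pointer is initialized from a single value. */
int *p_addr = { &fah };   /* address of an int variable */
int *p_null = { NULL };   /* null pointer */
int *p_zero = { 0 };      /* also a null pointer, not a one-element vector */

/* Something like  int *p = { 1, 2, 3 };  is deliberately left out:
   a scalar takes a single initializer expression. */

/* Bit fields are initialized in declaration order like other members. */
struct flags {
    unsigned ready : 1;
    unsigned mode  : 3;
    int count;
};
struct flags f = { 1, 5, 42 };

/* Nested braces follow the shape of the type being initialized. */
int grid[2][3] = { { 1, 2, 3 }, { 4, 5, 6 } };

/* Check that every object above got the value its type implies. */
void demo_initializers(void)
{
    assert(p_addr == &fah && *p_addr == 7);
    assert(p_null == NULL && p_zero == NULL);
    assert(f.ready == 1 && f.mode == 5 && f.count == 42);
    assert(grid[0][0] == 1 && grid[1][2] == 6);
}
```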
-------
 1-Jan-85 15:49:45-PST,573;000000000001
Mail-From: SATZ created at  1-Jan-85 15:49:42
Date: Tue 1 Jan 85 15:49:42-PST
From: Greg Satz <[email protected]>
Subject: latest binary
To: [email protected]
Phone: (415) 497-1004

It seems that either a new problem has arisen or an incomplete
bug fix has appeared. In either case, the problem causes a serious
incompatibility with the Unix compilers. YACC no longer compiles.

The compiler is saying that a structure is being used before
it is being defined which is not the case.

Also, has the sizeof calculation screwup been fixed yet?
-------
 1-Jan-85 23:40:27-PST,465;000000000001
Mail-From: KRONJ created at  1-Jan-85 23:40:25
Date: Tue 1 Jan 85 23:40:25-PST
From: David Eppstein <[email protected]>
Subject: structure sizeof problem fixed
To: [email protected]

Dan's previous fix to structure sizeof was to report an error whenever you
used a structure within a structure, even if the reference was not forward.
Now it reports an error only when you attempt to calculate the size of a
forward structure reference...
-------
 2-Jan-85 12:24:32-PST,1023;000000000001
Mail-From: SATZ created at  2-Jan-85 12:24:31
Date: Wed 2 Jan 85 12:24:31-PST
From: Greg Satz <[email protected]>
Subject: messages from kronj
To: "*PS:<KCC.CC>MAIL.TXT.1"@SU-SIERRA.ARPA
Phone: (415) 497-1004


KRONJ, TTY1, 2-Jan-85 12:06PM
i'll be hacking on deglobalization of struct member names, so <KCC.CC>
is not going to be stable.  but <KCC.C>CC.EXE should be ok...
also, i think yesterday's work might have broken unions.
so don't be surprised if you see some problems in that respect...

KRONJ, TTY1, 2-Jan-85 12:07PM
any idea what is not happening correctly?

KRONJ, TTY1, 2-Jan-85 12:17PM
one thing i've noticed is if you subtract two structs,
you get the word difference between them, rather than the
difference divided by sizeof the struct.

KRONJ, TTY1, 2-Jan-85 12:19PM
right

KRONJ, TTY1, 2-Jan-85 12:21PM
maybe i'll get to it today after doing local struct members.  it also makes the
 dumped type table with cc -s look funny.

KRONJ, TTY1, 2-Jan-85 12:21PM
sure
-------
 2-Jan-85 14:27:12-PST,940;000000000001
Mail-From: KRONJ created at  2-Jan-85 14:27:09
Date: Wed 2 Jan 85 14:27:08-PST
From: David Eppstein <[email protected]>
Subject: global thermonuclear structure members
To: [email protected]

It is now permissible to have structure members with the same names but
different offsets.  When a reference is made to such a member, the real
offset is looked up from the type of the structure that the member is
being taken from.  If a member has only one possible offset, the old
behavior still occurs: any struct object can refer to that member.
This saves compile time not doing a linear search through the list for
most structure references, and it also saves compiler hacker time not
fixing all the sloppiness in the compiler.

Also, subtraction of pointers to objects of size greater than one word
will now work better.  Previously it was not bothering to divide the
difference by the size of the objects.
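
Both points can be seen in a short standard C sketch.  The struct and
function names are invented for the example and have nothing to do
with the compiler's internals.

```c
#include <assert.h>
#include <stddef.h>

/* Two structures that both have a member called x, at different
   offsets.  Each reference is resolved through the struct type. */
struct point { int x; int y; };
struct label { char tag[8]; int x; };

int point_x(const struct point *p) { return p->x; }
int label_x(const struct label *l) { return l->x; }

struct triple { int a, b, c; };

/* Pointer subtraction yields a count of objects ... */
ptrdiff_t elements_between(const struct triple *lo, const struct triple *hi)
{
    return hi - lo;
}

/* ... which is the raw address difference divided by the object size. */
size_t bytes_between(const struct triple *lo, const struct triple *hi)
{
    return (size_t)((const char *)hi - (const char *)lo);
}

void demo_members_and_pointers(void)
{
    struct point p = { 3, 4 };
    struct label l = { "hi", 9 };
    struct triple v[4];

    assert(offsetof(struct point, x) != offsetof(struct label, x));
    assert(point_x(&p) == 3 && label_x(&l) == 9);
    assert(elements_between(&v[0], &v[3]) == 3);
    assert(bytes_between(&v[0], &v[3]) == 3 * sizeof(struct triple));
}
```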
-------
 3-Jan-85 00:26:33-PST,556;000000000001
Mail-From: SATZ created at  3-Jan-85 00:26:30
Date: Thu 3 Jan 85 00:26:30-PST
From: Greg Satz <[email protected]>
Subject: comment parsing bug fixed
To: [email protected]
Phone: (415) 497-1004

The following line(s) would cause KCC indigestion:

#define foo 300 /* comment
		   that breaks across many
		   lines */ ,500

This is actually handled by the 4.2 compiler and now by KCC.

The fix was to add an intermediate parsing routine, nextcc() between
nextc() and _nextc() and make the #define handling code use nextcc().
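
A small check of the expected behavior, as a sketch in standard C: a
comment counts as white space even when it spans lines, so the
definition picks up the text that follows it.  The array pair and the
helper functions are made up for the example.

```c
#include <assert.h>

#define foo 300 /* comment
                   that breaks across many
                   lines */ ,500

/* After expansion this reads  { 300 ,500 }  and so has two elements. */
int pair[] = { foo };

int pair_count(void)
{
    return (int)(sizeof pair / sizeof pair[0]);
}

void demo_pair(void)
{
    assert(pair_count() == 2);
    assert(pair[0] == 300 && pair[1] == 500);
}
```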
-------
 3-Jan-85 11:01:06-PST,1088;000000000001
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Thu 3 Jan 85 11:01:02-PST
Date: 3 Jan 1985  13:59 EST (Thu)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Greg Satz <[email protected]>
Subject: label overload
In-reply-to: Msg of 3 Jan 1985  12:37-EST from Greg Satz <SATZ at SU-SIERRA.ARPA>

    Date: Thursday, 3 January 1985  12:37-EST
    From: Greg Satz <SATZ at SU-SIERRA.ARPA>

    it seems that the bsd compiler allows labels and variables with the same
    names. egrep overloads "out" as a label and a character array. I presume
    this could be done by prefixing all labels with some funky character?

Yeah.  Look at the way struct tags and struct members are done.
The place to change would be plabel() or somewhere around there.
The only not completely obvious things to do are to remember to add one
to the s->sname in the "label never defined" error so the char doesn't
show, and to include the char in the list of such in ccdump.
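
For reference, a small standard C sketch of the kind of overloading
described: "out" is both a local character array and a goto label.
The function first_line is invented for the example and is not taken
from egrep.

```c
#include <assert.h>
#include <string.h>

/* Copy the first line of src (up to, not including, '\n') into dst,
   which holds cap bytes.  Returns the number of characters stored.
   Labels live in their own name space, so the label "out" and the
   array "out" do not conflict. */
size_t first_line(const char *src, char *dst, size_t cap)
{
    char out[64];
    size_t n = 0;

    assert(src != NULL && dst != NULL && cap > 0);
    while (src[n] != '\0' && n < sizeof out - 1) {
        if (src[n] == '\n')
            goto out;
        out[n] = src[n];
        n++;
    }
out:
    out[n] = '\0';
    if (n >= cap)
        n = cap - 1;
    memcpy(dst, out, n);
    dst[n] = '\0';
    return n;
}
```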
 3-Jan-85 11:46:59-PST,2714;000000000001
Return-Path: <root@mojave>
Received: from Mojave by SU-SIERRA.ARPA with TCP; Thu 3 Jan 85 11:46:52-PST
Received: by Mojave with TCP; Thu, 3 Jan 85 11:46:47 pst
Date: Thu, 3 Jan 85 11:46:47 pst
From: Super User <root@Mojave>
Subject: cpp errata
To: bug-kcc@sierra

Here is some documentation on the Unix cpp. Greg.

Documentation clarifications:
	Symbols defined on the command line by "-Dfoo" are defined as "1",
		i.e., as if they had been defined by "#define foo 1" or "-Dfoo=1".
	The directory search order for #include files is
		1) the directory of the file which contains the #include request
		   (e.g. #include is relative to the file being scanned when
		   the request is made)
		2) the directories specified by -I, in left-to-right order
		3) the standard directory(s) (which for UNIX is /usr/include)
	An unescaped linefeed (the single character "\n") terminates a
		character constant or quoted string.
	An escaped linefeed (the two-character sequence "\\\n") may be
		used in the body of a '#define' statement to continue
		the definition onto the next line.  The escaped linefeed is
		not included in the macro body.
	Comments are uniformly removed (except if the argument -C is specified).
		They are also ignored, except that a comment terminates a token.
		Thus "foo/* la di da */bar" may expand 'foo' and 'bar' but
		will never expand 'foobar'.  If neither 'foo' nor 'bar' is a
		macro then the output is "foobar", even if 'foobar'
		is defined as something else.  The file
			#define foo(a,b)b/**/a
			foo(1,2)
		produces "21" because the comment causes a break which enables
		the recognition of 'b' and 'a' as formals in the string "b/**/a".
	Macro formal parameters are recognized in '#define' bodies even inside
		character constants and quoted strings.  The output from
			#define foo(a) '\a'
			foo(bar)
		is the seven characters " '\\bar'".  Macro names are not recognized
		inside character constants or quoted strings during the regular scan.
		Thus
			#define foo bar
			printf("foo");
		does not expand 'foo' in the second line, because it is inside
		a quoted string which is not part of a '#define' macro definition.
	Macros are not expanded while processing a '#define' or '#undef'.
		Thus
			#define foo bletch
			#define bar foo
			#undef foo
			bar
		produces "foo".  The token appearing immediately after a
		'#ifdef' or '#ifndef' is not expanded (of course!).
	Macros are not expanded during the scan which determines the actual
		parameters to another macro call.  Thus
			#define foo(a,b)b a
			#define bar hi
			foo(bar,
			#define bar bye
			)
		produces " bye" (and warns about the redefinition of 'bar').

 3-Jan-85 22:42:02-PST,413;000000000001
Mail-From: KRONJ created at  3-Jan-85 22:41:59
Date: Thu 3 Jan 85 22:41:59-PST
From: David Eppstein <[email protected]>
Subject: struct assignment, structs as fn args, structs as ret vals now work
To: [email protected]

This change also added a new module to the runtimes: SPUSH, containing
routines $SPUSH and $SPOP, to push and pop multiple consecutive locations
to or from the stack.
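
A short standard C sketch of the three operations named in the
subject line.  The type vec3 and the functions are invented for the
example; how KCC lowers them onto $SPUSH and $SPOP is not shown.

```c
#include <assert.h>

struct vec3 { int x, y, z; };

/* Takes both structs by value and returns a struct by value. */
struct vec3 vec3_add(struct vec3 a, struct vec3 b)
{
    struct vec3 sum;

    sum.x = a.x + b.x;
    sum.y = a.y + b.y;
    sum.z = a.z + b.z;
    a.x = 0;            /* changes only the callee's copy */
    return sum;
}

void demo_struct_values(void)
{
    struct vec3 p = { 1, 2, 3 };
    struct vec3 q = { 10, 20, 30 };
    struct vec3 r, copy;

    r = vec3_add(p, q);         /* struct return value, then assignment */
    copy = r;                   /* whole-struct assignment */
    copy.x = 0;

    assert(r.x == 11 && r.y == 22 && r.z == 33);
    assert(p.x == 1);           /* caller's argument is untouched */
    assert(copy.x == 0 && copy.z == 33);
}
```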
-------
 5-Jan-85 00:03:24-PST,1226;000000000005
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Sat 5 Jan 85 00:03:16-PST
Date: 5 Jan 1985  03:01 EST (Sat)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Len Bosack <[email protected]>
cc:   Bug-KCC@Sierra
Subject: Did you ever look at the merged byte pointers?
In-reply-to: Msg of 4 Jan 1985  21:58-EST from Len Bosack <BOSACK at SU-SCORE.ARPA>

    Date: Friday, 4 January 1985  21:58-EST
    From: Len Bosack <BOSACK at SU-SCORE.ARPA>

    When we last tried to compile <KCC.CTEX> we discovered various cases
    where successive byte operations could be merged. Did you ever do
    anything about it? I (as might be expected) never did get to do anything.

    I think there are still some comments left around in the directory....

No, that's been on the list for a long time but I don't remember it
ever getting done.  Another thing to think about is turning
comparisons against zero into TDNx, setting to zero to ANDCAB, and
setting to one to be ORCB.  I think LDBs of halfwords do currently get folded
into HRRZ or HLRZ, although again more could be done with HLLZS etc.
 6-Jan-85 15:29:55-PST,472;000000000001
Return-Path: <[email protected]>
Received: from SRI-AI.ARPA by SU-SIERRA.ARPA with TCP; Sun 6 Jan 85 15:29:51-PST
Date: Sun 6 Jan 85 15:28:12-PST
From: William "Chops" Westfield <[email protected]>
Subject: bugs in KCC
To: [email protected]

REWIND in the runtime library doesn't seem to work.  Apparently the
file pointer is updated properly, but the internal data structures
associated with the file (or perhaps the file status) are not.

BillW
-------
 6-Jan-85 22:11:24-PST,284;000000000001
Mail-From: SATZ created at  6-Jan-85 22:11:20
Date: Sun 6 Jan 85 22:11:20-PST
From: Greg Satz <[email protected]>
Subject: rewind
To: [email protected]
cc: [email protected]
Phone: (415) 497-1004

rewind should work now. Let me know if I broke anything else.
-------
 6-Jan-85 22:44:00-PST,351;000000000001
Mail-From: SATZ created at  6-Jan-85 22:43:52
Date: Sun 6 Jan 85 22:43:52-PST
From: Greg Satz <[email protected]>
Subject: save command
To: [email protected]
Phone: (415) 497-1004


I modified the compiler to permit the save command to obtain
the true name of the file being compiled/saved instead of always
using CLIB.EXE.
-------
 8-Jan-85 00:19:45-PST,410;000000000001
Mail-From: SATZ created at  8-Jan-85 00:19:41
Date: Tue 8 Jan 85 00:19:41-PST
From: Greg Satz <[email protected]>
Subject: weird error message
To: [email protected]
Phone: (415) 497-1004

while trying to compile the game hangman, in particular the
module prdata.c, I got the error message:

Attempt to get address for unknown op -- 16.

Any ideas? prdata is a fairly small module.
-------
 8-Jan-85 00:27:20-PST,404;000000000001
Mail-From: SATZ created at  8-Jan-85 00:27:17
Date: Tue 8 Jan 85 00:27:17-PST
From: Greg Satz <[email protected]>
Subject: new runtimes
To: [email protected]
Phone: (415) 497-1004

I installed scanf, fscanf, sscanf, fread, and fwrite.
Bill Palmer did most of the *scanf stuff. I only tested scanf,
fscanf, and fread. The others still need to be checked
to make sure they work.
-------
 8-Jan-85 01:01:29-PST,656;000000000001
Mail-From: LOUGHEED created at  8-Jan-85 01:01:25
Date: Tue 8 Jan 85 01:01:25-PST
From: Kirk Lougheed <[email protected]>
Subject: compiler looping
To: [email protected]

The following incorrect trivial programs will cause KCC to loop endlessly,
printing error messages:

"main()"	/* User forgot to include braces for the body */

"main() {"	/* User forgot closing brace for body */

The trivial program "main() {}" compiles as expected.  I was looking for
a better error message than cc68 was giving me for another problem when
I found this (very minor) problem.  KCC error messages are excellent, by
the way.

Kirk
-------
 8-Jan-85 10:21:06-PST,1164;000000000011
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Tue 8 Jan 85 10:20:52-PST
Date: 8 Jan 1985  13:19 EST (Tue)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Greg Satz <[email protected]>
cc:   Bug-KCC@Sierra
Subject: weird error message
In-reply-to: Msg of 8 Jan 1985  03:19-EST from Greg Satz <SATZ at SU-SIERRA.ARPA>

    Date: Tuesday, 8 January 1985  03:19-EST
    From: Greg Satz <SATZ at SU-SIERRA.ARPA>

    while trying to compile the game hangman, in particular the
    module prdata.c, I got the error message:

    Attempt to get address for unknown op -- 16.

    Any ideas? prdata is a fairly small module.

I think I broke this in the recent addition of structure assignments.
Look for an assignment of some expression to a double variable.
Then clean up my code in gassign() to do a genstmt() instead of a
gaddress()+DMOVE if the ->ntype->ttype is DOUBLE (note that for
STRUCT, the genstmt()+DMOVE there now is correct).  You could also
release the register pair used - I think I also forgot to do that.
 8-Jan-85 11:50:17-PST,1059;000000000001
Mail-From: SATZ created at  8-Jan-85 11:50:12
Date: Tue 8 Jan 85 11:50:12-PST
From: Greg Satz <[email protected]>
Subject: Re: weird error message
To: [email protected]
cc: [email protected]
In-Reply-To: Message from "David Eppstein <[email protected]>" of Tue 8 Jan 85 13:19:00-PST
Phone: (415) 497-1004

I can't seem to find any double variables in the compiler let alone an
assignment to one. Any ideas where that may be?

As far as freeing the register, does the following code fragment
look right (from ccgen1.c in gassign()):

    if (siz == 2) {
	r1 = genstmt(n->right);
	if (n->right->ntype->ttype == STRUCT && n->right->nop != ASGN) {
	    r2 = getpair();		/* 2 word struct, have address */
	    code4(DMOVE, r2, r1);	/* so make into doubleword */
->	    release(r1);		/* XXX FREE REGISTER */
	    r1 = r2;			/* remember where code expects it */
	}
	code4(DMOVEM, r1, gaddress(nod));
	return r1;

I assume that the test for the DOUBLE should be in here too, but I am
still looking at that.
-------
 8-Jan-85 12:21:06-PST,1051;000000000001
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Tue 8 Jan 85 12:20:57-PST
Date: 8 Jan 1985  15:19 EST (Tue)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Greg Satz <[email protected]>
Cc:   [email protected]
Subject: weird error message
In-reply-to: Msg of 8 Jan 1985  14:50-EST from Greg Satz <SATZ at SU-SIERRA.ARPA>

Gee, that code looks better than I thought it did.  The arrowed
release() is not necessary, code4() will do it for you.  Someday
register allocation should be rewritten to be more clean.
I meant double assignment in prdata, but I guess that isn't it.
Anyway, the error means someone is calling gaddress() with something
that doesn't have an address.  The magic number is the ->nop of the
node, which you can look up in CC.S.  Needless to say this indicates a
bug in the compiler, either in the code calling gaddress() or in the
parser letting a non-lvalue get into the code generator.
 8-Jan-85 17:42:45-PST,791;000000000001
Mail-From: SATZ created at  8-Jan-85 17:42:41
Date: Tue 8 Jan 85 17:42:41-PST
From: Greg Satz <[email protected]>
Subject: still more weird problems -- spush
To: [email protected]
Phone: (415) 497-1004

Dave, this looks even weirder. The problem occurs only if one of the
variables is double and it isn't assigned anywhere. It looks like
something is confused with the new structure copying code. Any ideas
that could save me some time or point out something?

Here is the test program:

int Wordnum;
double Average;

foo()
{
	printf("Current Average: %.3f\n", Average / Wordnum);
}

Wordnu:	BLOCK	1
Averag:	BLOCK	2

prdata:
	MOVEI	3,2
	JSP	16,$SPUSH
	IFIW	3,0
	XMOVEI	4,$1
	IOR	4,$BYTE+4
	PUSH	17,4
	PUSHJ	17,printf
	ADJSP	17,-3
	POPJ	17,
-------
 9-Jan-85 12:11:01-PST,1284;000000000001
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Wed 9 Jan 85 12:10:52-PST
Date: 9 Jan 1985  15:09 EST (Wed)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Greg Satz <[email protected]>
Cc:   [email protected]
Subject: still more weird problems -- spush
In-reply-to: Msg of 8 Jan 1985  20:42-EST from Greg Satz <SATZ at SU-SIERRA.ARPA>

Ok, here's what's happening: the result of Average/Wordnum is a
double.  The code to push function args sees that tsize(n->ntype)
(where n is the arg node) is more than one word, and tries to call
$SPUSH with the address of the argument.  Since it's a double and not
a struct, and since it has a doubleword value but no address, this
obviously loses.  Of course what you want is if it's a double (or an
assignment op on a two-word struct), to simply push the two registers
that it is in.  If it's a two-word struct but the outer op is not an
assignment, you will have an address, so maybe you can do a DMOVE and
the two-reg push instead of a call to $SPUSH.

Even after you fix this you will have to either cast the result to
(float), or fix printf to handle doublewords, to make your example work.
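
For comparison, here is the same expression in a small standard C
sketch, where the quotient is a double and is passed to the formatter
as one.  This shows only what the language calls for, not KCC's code
generation; format_average and the initial values are made up.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

int Wordnum = 4;
double Average = 10.0;

/* Format the expression from the test program into buf (cap bytes).
   Returns what snprintf returns: the length of the full text. */
int format_average(char *buf, size_t cap)
{
    assert(buf != NULL && cap > 0);
    /* int / double: the int operand is converted, the result is double */
    return snprintf(buf, cap, "Current Average: %.3f\n", Average / Wordnum);
}
```

With the values above the text written is "Current Average: 2.500"
followed by a newline.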
 9-Jan-85 12:09:15-PST,2056;000000000001
Mail-From: WHP4 created at  9-Jan-85 12:09:14
Date: Wed 9 Jan 85 12:09:14-PST
From: Bill Palmer <[email protected]>
Subject: [Charles Hedrick <[email protected]>: Re: compiler and link lore]
To: [email protected], [email protected]

Here's Hedrick's reply to my questions about link, etc..
                ---------------

Return-Path: <[email protected]>
Received: from RUTGERS.ARPA by SU-SIERRA.ARPA with TCP; Wed 9 Jan 85 12:06:56-PST
Date: 9 Jan 85 15:05:32 EST
From: Charles Hedrick <[email protected]>
Subject: Re: compiler and link lore
To: [email protected]
In-Reply-To: Message from "Bill Palmer <[email protected]>" of 9 Jan 85 14:50:00 EST

Yes, there are new .REL blocks that allow long symbol names.  However
at the moment I think they aren't implemented.  That is, they work,
but names are truncated to 6 characters.  It still might make sense
to use them, so that when LINK is fixed, your C will work.  You
should wait until you can get v 6 LINK, as those blocks won't really
be reliable until then.  I think the descriptions in the LINK
manual will be correct.  I have SPR'ed the problems I found with
the manual.  Rel. 5 LINK has the blocks documented, so you can look
at the current LINK manual and start planning.  You may even be
able to get LINK 5 to work, but there are problems with the way
PSECT numbers are handled.  They will probably be fatal.  If you
can't get hold of LINK 6 and need to start your work immediately,
I think I have a patched version of LINK 5 that fixes some of the
PSECT problems.  But Stanford should have enough clout with DEC
to get the newest version.  You are welcome to look at the code in
the Pascal compiler that generates the .REL blocks.  It is all in
one part of the compiler, and (unlike the rest of the compiler) is
well documented.  Take a look at s:<pascal.new>pascmp.pas.  I use
the new blocks there, but I don't try for long symbols.  So there
will be some cases where you will use slightly different block types.
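
A minimal sketch of what the 6-character truncation means for C identifiers, assuming plain prefix truncation (the helper names are hypothetical, and the character-set and case folding done by the real symbol format are ignored here):

```c
#include <string.h>

#define REL_SYMBOL_CHARS 6  /* significant characters, per the note above */

/* Copy at most six characters of name into out; out must hold 7 bytes. */
static void truncate_symbol(const char *name, char *out)
{
    strncpy(out, name, REL_SYMBOL_CHARS);
    out[REL_SYMBOL_CHARS] = '\0';
}

/* Nonzero when two distinct C names map to the same truncated symbol. */
static int symbols_collide(const char *a, const char *b)
{
    char ta[REL_SYMBOL_CHARS + 1], tb[REL_SYMBOL_CHARS + 1];

    truncate_symbol(a, ta);
    truncate_symbol(b, tb);
    return strcmp(a, b) != 0 && strcmp(ta, tb) == 0;
}
```

Under this assumption, two globals such as "trickcount" and "trickcolor" would both become "trickc" and clash at link time, which is the case long-symbol blocks are meant to remove.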
-------
-------
10-Jan-85 17:20:23-PST,322;000000000005
Mail-From: WHP4 created at 10-Jan-85 17:20:15
Date: Thu 10 Jan 85 17:20:15-PST
From: Bill Palmer <[email protected]>
Subject: clib/compiler misfeature
To: [email protected]

If you have a function called the same thing as something in clib, you get
multiply-defined global symbol errors from LINK.
-------
 9-Jan-85 17:32:27-PST,3521;000000000011
Return-Path: <@COLUMBIA-20.ARPA:GINGELL@CWR20B>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Wed 9 Jan 85 17:32:15-PST
Received: from CWR20B by CUCS20 with DECnet; 9 Jan 85 20:30:34 EST
Date: Wed 9 Jan 85 20:30:44-EST
From: Rob Gingell <GINGELL@CWR20B>
Subject: Re: paunix
To: SATZ%SU-SIERRA@CUCS20
In-Reply-To: Message from "Greg Satz <[email protected]>" of Thu 3 Jan 85 12:24:48-EST

Sorry to take so long to get back to you, I'm only now digging out of the
pile of work that stacked up during DECUS.

I have done only minor things to PAUNIX since I set the stuff up on Columbia's
machine.  Fixed some minor bugs and did some work on instrumentation and
performance.  My plan was to attempt to get a shell converted and use it to
drive my development work -- I'd rather use already working programs that
are reasonable test cases rather than keep fabricating simple ones.  I was
going to resume that process this weekend, and in fact my first thing was
going to be to build a BLILIB.MAC equivalent for KCC.  

So, where's that leave us?  Well, it's no problem for me to update the stuff
on Columbia's machine to be exactly what I've got right now.  If you want to
do the KCC interface library, the BLILIB.MAC you now have is (I believe)
unchanged since I put it out and should serve as a good model for what needs
to be done.  On the other hand, if the compiler is really flying well, perhaps
I could do that and send it back to you folks so you can spend time playing 
with it and finding problems.  My next major work was going to be to put in
the fork() and exec() stuff along with some performance work and I was planning
to work on those most of this weekend and during evenings and weekends until
6.1 gets here.  Since, as you probably know, I'm considering leaving here, I'd
like to get it as done as possible before I take off since it's conceivable I
might not have access to a 20 at wherever I end up.  Presumably, that event is
a couple of months away yet, but there's still some work to be done and I wanted
to add the Berkeley enhancements to the thing.

So, if you're willing, I'd like to try getting the Eppstein fixes (which
I guess must be on Columbia-20?) and doing the KCC library this weekend or
before and then passing the pile back to you.

If you'd rather just get the updates and go on yourselves with the KCC library
that'd be ok too -- I haven't been exactly reliable in getting this stuff out
promptly, one of the problems with doing this in one's spare time. I would
do one thing before passing those off to you, and that would be to build a
version without all the debugging stuff that's in it, which turns out to be
one of the major sources of slowness in the thing.

One thing that perhaps you all could do that I can't do anymore would be
to fix 5.x so PAUNIX can run under it.  As far as I can tell, the only
reason PAUNIX does not run under 5.x is that the code which traps TOPS-10
MUUOs and stores the UUO block for the compatibility package stores an
un-extended MUUO block instead of the extended one.  I would think that
this would be a relatively simple thing to fix, but without a 5.x system
of my own (all ours have been running 6.0 for a while) I can't really
go debug it seriously.  Having that fixed would allow a number of places
like Columbia and your other machines to be used to play with the thing.

In any case, let me know which of these scenarios you prefer.  Take care,

	Rob

-------
12-Jan-85 11:44:01-PST,4131;000000000001
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Sat 12 Jan 85 11:43:43-PST
Date: 12 Jan 1985  14:41 EST (Sat)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Greg Satz <[email protected]>
Cc:   Bug-KCC@Sierra
Subject: generating code in KCC
In-reply-to: Msg of 11 Jan 1985  18:52-EST from Greg Satz <SATZ at SU-SIERRA.ARPA>

    Date: Friday, 11 January 1985  18:52-EST
    From: Greg Satz <SATZ at SU-SIERRA.ARPA>

    I am not sure how to generate code to push two words
    on the stack for the function call. A short explanation
    on how genstmt works with code0 - code8 would be real helpful.
    Thanks!

what you want is
    code0(FNARG, SP, r)
    code0(FNARG, SP, realreg(r)+1)
where r is the register pair holding your two words.

The realreg(r)+1 stuff is because some registers are "virtual".  They
initially point to register one (the function return value) but if one
is saved and restored over a subroutine call it obviously shouldn't
overwrite the new return value so it has to get restored elsewhere.
The virtual register system doesn't mix very well with the use of
register pairs, and if you try hard enough you can probably get that
part of the code generator to produce bad code (i.e. lose registers or
use the wrong one or use the same one for two different things).

The various routines to emit instructions in various addressing modes are:

code0(op, r, s)
    generates the opcode  op r,s  where r and s are both registers.
    This means appending it to the end of the peephole buffer and
    performing various optimizations; it will only actually get sent
    to the output file when flushcode() is called (or when the
    buffer overflows and some of the stuff at the start gets emitted).
    Register s is released for future reassignment (although later
    peephole optimizations might notice that it hasn't been changed yet
    and re-use whatever value it contained).

code1(op,r,s)
    generates  opI r,s  where s is a number (ptype IMMED).

code2(op, r, b, i, p, o)
    generates  op r,[bbbbii,,p+o]  that is, generates a local byte
    pointer op with p/s fields in b, indexed by register i, symbol p,
    offset by integer o.  The way virtual registers are handled here
    is questionable - it works only because virtual registers don't
    change except over subroutine calls, and very little optimization
    occurs across a subroutine call.

code3(op, r, s)
    generates  opI r,s  with s a symbol (ptype IINDEXED).  If left unoptimized
    this will be generated as  XMOVEI 16,s  /  op r,16.

code4(op, r, s)
    generates  op r,(s)  i.e. indexes off register S.  This is used
    after a gaddress() to get the contents of a variable, and also for
    assignment ops (like ASGN = MOVEM).  Register s is released.

code5(op, r)
    generates  op r,  for ops like SETZ.

code6(op, r, s)
    generates  op r,$s.  This is mostly used for jumps.

code7(op, preg, pptr, poffset, pindex)
    generates  op preg,pptr+poffset(pindex)  i.e. generates a complicated
    addressing op of type MINDEXED.  this is used to duplicate the addressing
    of some op that has already been generated.

code8(op, r, s)
    like code1(), but op doesn't get I appended to it, making type RCONST.
    E.g. code8(ADJSP, SP, n) where n is some stack offset.  Also used
    to generate comparisons with zero in boolean expressions.

code9(op, r, mantissa, exponent)
    generates  op r,[mantissa E exponent]  i.e. floating point literal.

code13(op, r, s)
    generates  opI r,s(17)  i.e. an XMOVEI of a stack location.  As far
    as I know the only op used is XMOVEI.

code15(op, lab, off, r)
    generates  op lab+off(r).  Used with op=JRST for switch jump tables.

code16(op, r, lab, q)
    generates  op r,$lab(q).  Used for checking switch hash tables.

code17(value)
    generates a literal value.  Used for generating switch hash tables.

There is no code10, code11, code12, or code14.
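
To make the operand shapes concrete, here is a toy sketch that only renders the assembly text for four of the forms listed above. The show_codeN() names are hypothetical and do none of the real work (no peephole buffer, no register release); registers and constants are printed in octal, as in KCC's assembly listings.

```c
#include <stdio.h>

/* code0 shape:  op r,s  with both operands registers. */
static int show_code0(char *buf, size_t n, const char *op, unsigned r, unsigned s)
{
    return snprintf(buf, n, "%s %o,%o", op, r, s);
}

/* code1 shape:  opI r,s  with s an immediate number. */
static int show_code1(char *buf, size_t n, const char *op, unsigned r, unsigned s)
{
    return snprintf(buf, n, "%sI %o,%o", op, r, s);
}

/* code4 shape:  op r,(s)  i.e. indexed off register s. */
static int show_code4(char *buf, size_t n, const char *op, unsigned r, unsigned s)
{
    return snprintf(buf, n, "%s %o,(%o)", op, r, s);
}

/* code5 shape:  op r,  with no second operand. */
static int show_code5(char *buf, size_t n, const char *op, unsigned r)
{
    return snprintf(buf, n, "%s %o,", op, r);
}
```

With these, show_code1(buf, sizeof buf, "MOVE", 012, 1024) writes "MOVEI 12,2000".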
13-Jan-85 20:43:48-PST,1037;000000000001
Mail-From: SATZ created at 13-Jan-85 20:43:44
Date: Sun 13 Jan 85 20:43:44-PST
From: Greg Satz <[email protected]>
Subject: doubles as procedure arguments
To: [email protected]
Phone: (415) 497-1004

The bug that prevented doubles being passed as function/procedure
arguments has been fixed. Here is the fix. Comments appreciated.

In ccgen1.c in fnarg():

    if (s > 2) {			/* bigger */
	r = getreg();			/* another register */
	code1(IDENT, r, s);		/* get size */
	if (n->ntype->ttype == STRUCT) code4(SPUSH, r, genstmt(n));
	else /* code4(SPUSH, r, gaddress(n)); /* push the struct or whatever */
	    emsg(ECGEN);
	spushes++;			/* used a SPUSH */
	release(r);
    } else {
	r = genstmt(n);
	if (s == 2) {			/* handle doubles or two word struct */
	    r1 = getpair();
	    code0(DMOVE, r1, r);
	    code0(FNARG, SP, r1);
	    code0(FNARG, SP, realreg(r1)+1);
	} else				/* single word arguments */
	    code0(FNARG, SP, r);
	if (optimize) code8(ADJSP, SP, 0); /* hack up stack */
    }
-------
14-Jan-85 10:12:07-PST,801;000000000011
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Mon 14 Jan 85 10:11:58-PST
Date: 14 Jan 1985  13:09 EST (Mon)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Greg Satz <[email protected]>
Cc:   [email protected]
Subject: doubles as procedure arguments
In-reply-to: Msg of 13 Jan 1985  23:43-EST from Greg Satz <SATZ at SU-SIERRA.ARPA>

I don't understand the bit about
    code0(DMOVE, r1, r).
In the case of a struct which is not an assignment statement (and
therefore is still an address rather than a register pair) you want
    code4(DMOVE, r1, r)
to put the doubleword into registers.  Otherwise, what's wrong with
simply pushing  r  and  realreg(r)+1 ?
14-Jan-85 11:35:49-PST,973;000000000001
Mail-From: SATZ created at 14-Jan-85 11:35:45
Date: Mon 14 Jan 85 11:35:45-PST
From: Greg Satz <[email protected]>
Subject: Re: doubles as procedure arguments
To: [email protected]
cc: [email protected]
In-Reply-To: Message from "David Eppstein <[email protected]>" of Mon 14 Jan 85 13:09:00-PST
Phone: (415) 497-1004

Here is my latest revision. It seems to generate correct code for
doubles and two word structs. The compiler doesn't know how to utilize
two word structs as arguments (taking them off of the stack) but does
handle doubles. I am trying to track that code down.

    } else {
	r = genstmt(n);
	if (s == 2) {			/* handle doubles or two word struct */
	    if (n->ntype->ttype == STRUCT) {	/* structure address */
		r1 = getpair();
		code4(DMOVE, r1, r);
		r = r1;
	    }
	    code0(FNARG, SP, r);
	    code0(FNARG, SP, realreg(r)+1);
	} else				/* single word arguments */
	    code0(FNARG, SP, r);
-------
14-Jan-85 20:19:32-PST,5361;000000000015
Return-Path: <@COLUMBIA-20.ARPA:GINGELL@CWR20B>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Mon 14 Jan 85 20:19:18-PST
Received: from CWR20B by CUCS20 with DECnet; 14 Jan 85 23:17:09 EST
Date: Mon 14 Jan 85 23:12:05-EST
From: Rob Gingell <GINGELL@CWR20B>
Subject: Re: paunix
To: SATZ%SU-SIERRA@CUCS20
cc: Gingell@CWR20B
In-Reply-To: Message from "Greg Satz <[email protected]>" of Fri 11 Jan 85 14:15:50-EST

Late last night (fighting a broken KL notwithstanding), I got the world's
simplest C program, i.e.

	main()
	{
		printf("Hello world\n");
	}

to produce the message running under PAUNIX.  In fact, it ran the first
shot, amazingly enough.  Most of the effort went into turning BLILIB.MAC
into something KCC code could use and in building a CLIB.REL which eliminated
the conflicting entry points.  

While I was amazed that it worked at first, it occurred to me that perhaps
it shouldn't have.  At the moment, PAUNIX just dumps strings to the terminal.
The string "Hello world\n" ends with a LF, and I expected a problem since
real UNIX would have replaced the LF with the CRLF pair.  Well, was I surprised
to find the printf code in the library was doing that substitution already.

So, now I have my first KCC dilemma to raise, namely -- where should the
I/O canonicalization take place?  If I were to do it in PAUNIX, I would
turn LF to CRLF when writing to:

	cooked TTY's
	files opened in 7-bits

but not

	raw TTY's
	files opened for any byte size other than 7

on the theory that 7-bit files are intended for coexistence in the UNIX
and TOPS-20 environments and that raw output or files PAUNIX was told to
open for non-TOPS-20 text (i.e., 8 or 9 or 36 bit I/O) would only be
meaningful to other programs expecting binary I/O or PAUNIX or real UNIX
itself.
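
The policy above might be sketched as follows. This is not PAUNIX code; wants_crlf() and lf_to_crlf() are hypothetical names, and the three flags are assumed to be whatever the open/ioctl state provides.

```c
#include <stddef.h>

/* Decide whether LF should be expanded to CRLF on output:
   cooked TTYs and 7-bit files yes, raw TTYs and other byte sizes no. */
static int wants_crlf(int is_tty, int raw_mode, int byte_size)
{
    if (is_tty)
        return !raw_mode;
    return byte_size == 7;
}

/* Expand each LF in src[0..len) to CR LF.  Returns the number of bytes
   written to dst, or (size_t)-1 if dst (capacity cap) is too small. */
static size_t lf_to_crlf(const char *src, size_t len, char *dst, size_t cap)
{
    size_t out = 0;

    for (size_t i = 0; i < len; i++) {
        if (src[i] == '\n') {
            if (out + 2 > cap)
                return (size_t)-1;
            dst[out++] = '\r';
            dst[out++] = '\n';
        } else {
            if (out + 1 > cap)
                return (size_t)-1;
            dst[out++] = src[i];
        }
    }
    return out;
}
```

Doing this once in the write path, rather than in each printf-family routine, is what would let the translation come out of CLIB.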

Whatcha think?  Of course, this means that the translation support should
probably get yanked out of the printf family in CLIB at some point.

-----

In any case, here's what I did to get things working with KCC.  First, it
should be noted that PAUNIX's functionality hasn't been extended much beyond
where it was earlier this past fall -- i.e., the fork(), exec(), etc. code
has not been added.  It's basically still an I/O library.  All I did was
make it possible for the interface library to work with KCC-generated code
and make the appropriate alterations to CLIB so that the setup code for
running a C main() knew to detect and/or fetch PAUNIX into the address
space as necessary.

1. BLILIB.MAC has been turned into UNXINT.MAC.  It has conditional assembly
   parameters which indicate whether it's being compiled for KCC or for
   BLISS-36.  The default is for KCC.  I did this to try to have as much
   commonality between the language libraries as possible to make support
   a little easier.  The major differences are:

	- understanding KCC's "reverse-order" parameter passing;
	- dealing with "off-by-one" ILDB/IDPB byte pointers properly,
	  adjusting them before the actual call to PAUNIX;
	- handling signals properly for KCC -- i.e., having the
          dispatch handler in the language interface library 
	  save all registers to deal with the fact that KCC assumes
	  that callers save everything they need -- something not
	  possible for the case of signals.

   I have not tested these extensively at this point, but several
   programs using various printf and the exit code all ran without
   a hitch from section 0.  There was a problem running the code
   from an extended section, which caused an abort and a CORE.EXE; I'll have to
   check into that.

2. I modified TOPS20.FAI so that it was no longer the "MAIN" program
   from a TOPS-20 standpoint, i.e., the entry vector to a C program
   points into UNXINT.  UNXINT when built for KCC will jump into the
   code in TOPS20.FAI almost unaltered from the original.  I have all
   this under UNIX conditionals in TOPS20.FAI.  The other changes to
   TOPS20.FAI cause it to use PAUNIX's brk and sbrk, as well as PAUNIX's
   _exit.

3. RUNTM.C and RUNTM.T were altered so that entry points which collide
   with PAUNIX-defined entry points (like creat, open, read, write),
   and even those entry points which do not yet work in PAUNIX (i.e.,
   execxx), are not compiled when the symbol PAUNIX is #define'd.
   The unfortunate side effect of this is that it caused functionality
   which the current runtimes implement to be lost.  We might find it
   advisable to eliminate the unimplemented entry points from UNXINT
   so that people can use the best of both until PAUNIX really does all
   the UNIX stuff -- I didn't do that because it was easier for me to
   just get things working this way.

4. CLIB.REL was rebuilt less the other runtime modules (i.e., ACCESS.C,
   GETPID.C, UNLINK.C and the like) which also conflicted with real
   UNIX entry points, again for the same reasons as described above.

So, that's about it -- I'm checking into a couple of more tests now and
figuring out why the extended section execution causes a CORE, just wanted
to let you know where it's at right now.  

----

It appears that KCC doesn't really assemble the .FAI file that gets
produced during a compilation.  Is it possible that KCC depends on having
PIP: available for this purpose?  

-------
15-Jan-85 13:16:34-PST,1721;000000000001
Mail-From: SATZ created at 15-Jan-85 13:16:33
Date: Tue 15 Jan 85 13:16:33-PST
From: Greg Satz <[email protected]>
Subject: Re: paunix
To: GINGELL%[email protected]
cc: [email protected]
In-Reply-To: Message from "Rob Gingell <GINGELL@CWR20B>" of Mon 14 Jan 85 20:19:32-PST
Phone: (415) 497-1004

Checking TTY(4) of my 4.2bsd UPM, I find the CRMOD flag which causes
Unix to map CR or LF to CRLF. This should be done in PAUNIX and not
printf based on this flag. RAW mode turns this bit off. I am not sure
how easy this would be. I just checked and it seems that you don't use
CRMOD. The problem is that output to terminals is different from output
to other types of files (disk, tape, etc.). I think your assessment of 7
vs. other byte sizes is correct for non-tty files.

The library routines need to be cleaned up. I was waiting for PAUNIX
before undertaking that job. The big question is whether to undertake
writing our own version of UPM 3 or stealing Berkeley's. I will probably
do a little of both in the beginning.

I made a modification to KCC here so that the SAVE command will use the
name of the KCC main routine instead of CLIB. It required that KCC
generate an entry vector in the main() routine. I can do this to the
PAUNIX version too.

I would be willing to organize the section 3 library routines as well as
the routines we have written that aren't in PAUNIX yet. This will
maximize functionality while still using PAUNIX so we can give you some
feedback.

What version of FAIL do you have? KCC uses a PRARG% call to pass stuff
to FAIL (it consequently doesn't work correctly with MACRO). You can get
a newer version of FAIL from Sierra::SRA:<FAIL>.
-------
19-Jan-85 21:14:21-PST,4178;000000000011
Return-Path: <@COLUMBIA-20.ARPA:GINGELL@CWR20B>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Sat 19 Jan 85 21:14:09-PST
Received: from CWR20B by CUCS20 with DECnet; 20 Jan 85 00:12:03 EST
Date: Sat 19 Jan 85 18:19:09-EST
From: Rob Gingell <GINGELL@CWR20B>
Subject: Re: paunix
To: SATZ%SU-SIERRA@CUCS20
In-Reply-To: Message from "Greg Satz <[email protected]>" of Tue 15 Jan 85 16:44:50-EST

I agree that the mapping of \n to the appropriate imaging sequence should
be done in PAUNIX and will take steps to put it in.  I thought out the 
rest of the problems with doing the translation for non-terminal (i.e.,
disk) files and have arrived at the following conclusions -- let me know
what you think of this approach.

	Basically, we will do what I said the last time -- i.e., 7-bit
	I/O to TTY's ("cooked") and to disk files opened for 7-bit I/O
	will have LF translated to CRLF on the way out, and CRLF to
	LF on the way in.  Other files will simply get the raw character
	stream as delivered to or from read/write.  A potential problem I
	thought of was that of programs which "lseek'ed" such files.

	lseek is currently implemented much like the TOPS-20 SFPTR%, and
	is thus both efficient and simple.  However, if a program lseek'ed
	a file which was altering the data stream what would lseek mean?
	Would the number provided to lseek mean the byte number with CR
	in the stream or not, and how would adjustments be made to account
	for it?  While solutions are possible, they involve either massive
	bookkeeping or CPU time, or possibly both.
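
To illustrate the cost: if the offset given to lseek counts bytes in the LF-only view, finding the matching position in a CRLF file means scanning from a known point. A minimal sketch, with a hypothetical function name and the file contents assumed to be in memory:

```c
/* Map an offset in the LF-only (UNIX) view to an offset in the on-disk
   CRLF data.  Returns -1 if the logical offset lies beyond the data.
   Cost is linear in the offset, which is the bookkeeping/CPU problem. */
static long logical_to_physical(const char *disk, long disk_len, long logical)
{
    long phys = 0;
    long seen = 0;

    while (seen < logical) {
        if (phys >= disk_len)
            return -1;
        /* A CR immediately before LF is invisible in the logical view. */
        if (disk[phys] == '\r' && phys + 1 < disk_len && disk[phys + 1] == '\n')
            phys++;
        phys++;
        seen++;
    }
    return phys;
}
```

For the on-disk bytes "ab\r\ncd\r\n", logical offset 3 (the 'c') maps to physical offset 4.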

	The solution path I have chosen is that files which are opened for
	7 bit I/O and have the translations take place will appear to
	programs running under PAUNIX as character-special files.  lseek's
	on such files are defined to be NOP's, and thus the implementation
	problem goes away.

	However, this choice could conceivably create problems for some
	programs.  Do things like editors and compilers lseek source files?
	I know 'ed' lseeks its temporary files, but this would not be a 
	problem since that file would be opened for (say) 9-bit bytes
	and hence the data would not get screwed up.  But I wonder if
	things like 'tail' would handle such a thing right?

	Seems like a big test of how many UNIX programs *really* treat
	files as simple byte streams.

----

For UPM(3), I vote for trying to steal Berkeley's stuff where possible,
to enhance commonality.  However, this might cause some people licensing
problems.  Would the GNU stuff be useful here?

For my FAIL problems, I did indeed have an ancient version of FAIL and
I snarfed the one from SRA:<FAIL> and all is now well -- thanks.

-----

I have found that the problem I was having with extended KCC code producing
a CORE.EXE is that the core layout chosen by the KCC startup routines 
was at odds with some of my assumptions in the implementation of brk.  
Further, my brk implementation was too stupid about some things, like
allocating sections when the break is extended across a section boundary.
I'm going to do some thinking about what to do about this.  In the
meantime, I've prepared a new set of PAUNIX sources for you on
Columbia-20.  These should allow you to run non-extended KCC programs
exercising the current limits of PAUNIX functionality.  They are in

SNARK:<G.GINGELL.PAUNIX.BLISS>		Bliss interface to PAUNIX support
SNARK:<G.GINGELL.PAUNIX.C>		KCC interface to PAUNIX support
					Includes *simple* C demonstration
					program
SNARK:<G.GINGELL.PAUNIX.PAUNIX>		Sources to PAUNIX as well as
					support tools
SNARK:<G.GINGELL.PAUNIX.SUBSYS>		SYS: stuff for using PAUNIX,
					including CLIB.REL

----

Where am I going from here?  Well, I will do the following and then
update the Columbia-20 copy again:

	- fix the extended addressing problems;
	- add the CR/LF mapping stuff for disk/TTY/?;
	- add the "byte-size" table hinted at in PAUNIX.MEM;
	- initial PAUNIX.INIT file handling;
	- (maybe) simple "mini-shell" support;
	- try to fix anything you bring up.

After freezing those fixes, I'll go after fork() and exec().

-------
21-Jan-85 17:42:07-PST,2205;000000000001
Mail-From: SATZ created at 21-Jan-85 17:42:06
Date: Mon 21 Jan 85 17:42:05-PST
From: Greg Satz <[email protected]>
Subject: Re: paunix
To: GINGELL%[email protected]
cc: [email protected]
In-Reply-To: Message from "Rob Gingell <GINGELL@CWR20B>" of Sat 19 Jan 85 21:14:21-PST
Phone: (415) 497-1004

I see the problem you mention with 7 bit file conversion and lseek.
However, there is the potential incompatibility if we aren't careful. I
would like "tail foo" where foo is a non-7 bit file and "tail bar" where
bar is a 7-bit file to work the same way. When I lseek in either, I
don't care if I am passing crlf or just lf. The output should look the
same (since stdout is to a tty: conversion should be done).

In other words, you shouldn't make lseek on 7-bit disk files a nop.
Rather, it is up to the programmer to determine what it means. I would
still stick with the NOP for character device (TTY:) files. Tar does an
lseek on /dev/mt (MTA:?); that should probably work too. 7-bit files are
there for compatibility and not for general use unless explicitly
required. For true compatibility you probably want to open files in
8-bit rather than 9-bit so (theoretically) they could be transported to
a real Unix.

----
I have been in contact with RMS with respect to GNU. He doesn't
have that much of it in place to be useful to us. He is extremely
interested in our work since he is hoping that we will produce something
he can use. Also, I am not sure I want to spend time (re)writing public
domain Unix utilities for TOPS-20. We can provide a public domain
compiler and UPM(2). If people want UPM(3) they will have to get a
license. Under the current circumstances, I don't have a better
solution.

----
I copied the latest files from CUCS20. I will begin working with them
when I return from USENIX in Dallas. I want to make it the default
version and give it away to at least one site. I will ask Bill
Westfield, new Score programmer, to look into the extended compatibility
problem so we can get it up there too.

Your list of things to do looks real good. I am getting excited about
this now since it is finally taking some real form.
-------
21-Jan-85 17:52:45-PST,1972;000000000001
Mail-From: SATZ created at 21-Jan-85 17:52:43
Date: Mon 21 Jan 85 17:52:43-PST
From: Greg Satz <[email protected]>
Subject: [Rob Gingell <GINGELL@CWR20B>: Re: paunix]
To: [email protected]
cc: [email protected], [email protected]
Phone: (415) 497-1004

Bill, I figured that you are fairly busy, but I thought I would ask
anyway.  Rob Gingell at Case Western is developing a compatibility
package, much like PA1050, called PAUNIX. I plan on using it for most of
the Unix Programmers Manual, Section 2. However, there is a major
problem in that it only works on V6 since he loads PAUNIX into section
037 and the extended compatibility entry vector stuff in the V5.[13]
monitor is broken.

It would be nice to install the latest KCC with PAUNIX on Score. Since
you aren't planning on converting to V6 in the near future, it would be
useful to fix this problem.

Below is part of a note from Rob detailing the problem. You can find a
PAUNIX.EXE in <KCC.PAUNIX.SUBSYS> and a test program that uses PAUNIX
as PATEST in the same directory. All of these reside on Sierra.

Let me know if you have any questions.
                ---------------

Date: Wed 9 Jan 85 20:30:44-EST
From: Rob Gingell <GINGELL@CWR20B>
Subject: Re: paunix
To: SATZ%SU-SIERRA@CUCS20

.
.
.
One thing that perhaps you all could do that I can't do anymore would be
to fix 5.x so PAUNIX can run under it.  As far as I can tell, the only
reason PAUNIX does not run under 5.x is that the code which traps TOPS-10
MUUOs and stores the UUO block for the compatibility package stores an
un-extended MUUO block instead of the extended one.  I would think that
this would be a relatively simple thing to fix, but without a 5.x system
of my own (all ours have been running 6.0 for a while) I can't really
go debug it seriously.  Having that fixed would allow a number of places
like Columbia and your other machines to be used to play with the thing.
-------
21-Jan-85 19:03:45-PST,710;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Mon 21 Jan 85 19:03:41-PST
Date: Mon 21 Jan 85 19:01:00-PST
From: Ken Harrenstien <[email protected]>
Subject: Odd declaration bug
To: [email protected]
cc: [email protected]

The following program craps out during compilation:

struct entry {
	int a;
};
main() { printf("hello bug"); }

KCC seems to think that "entry" is some kind of reserved keyword, but
its error message is rather confusing.  I discovered this while
attempting to compile a program that I got from someone on a VAX.  I
got around it for the time being by replacing "entry" with a different
word, but what's the story?
-------
21-Jan-85 19:07:41-PST,675;000000000001
Mail-From: WHP4 created at 21-Jan-85 19:07:39
Date: Mon 21 Jan 85 19:07:39-PST
From: Bill Palmer <[email protected]>
Subject: Re: Odd declaration bug
To: [email protected]
cc: [email protected]
In-Reply-To: Message from "Ken Harrenstien <[email protected]>" of Mon 21 Jan 85 19:03:45-PST

entry is indeed a reserved keyword as things stand right now; it's used for
declaring entry points in modules such as the runtimes.  I think there are
a few old messages in the kcc mail archive about this.

David claims this is fixable by checking the context or syntax in which
it is seen; fixable or not, it obviously hasn't been done yet...

						Bill
-------
21-Jan-85 19:20:33-PST,457;000000000001
Mail-From: SATZ created at 21-Jan-85 19:20:29
Date: Mon 21 Jan 85 19:20:29-PST
From: Greg Satz <[email protected]>
Subject: Re: Odd declaration bug
To: [email protected]
cc: [email protected]
In-Reply-To: Message from "Ken Harrenstien <[email protected]>" of Mon 21 Jan 85 19:03:44-PST
Phone: (415) 497-1004

This has burned me a few times too. It is on the todo list, but I had
forgotten about it. It should be easy enough to fix.
-------
22-Jan-85 13:17:43-PST,723;000000000001
Mail-From: SATZ created at 22-Jan-85 13:17:37
Date: Tue 22 Jan 85 13:17:37-PST
From: Greg Satz <[email protected]>
Subject: [Kirk Lougheed <[email protected]>: Re: fail bugs]
To: [email protected]
Phone: (415) 497-1004

In trying to port a C program with KCC, I ran across a strange FAIL
error. I was trying to use a variable named ifl. Here is an explanation
and fix which I will attend to upon my return.
                ---------------

Date: Tue 22 Jan 85 12:28:35-PST
From: Kirk Lougheed <[email protected]>
Subject: Re: fail bugs

The problem is that IFL is a FAIL pseudo-op.  It might be worthwhile to
have FAIL undefine its pseudo-ops at the start of each file.

Kirk
-------
22-Jan-85 17:38:59-PST,944;000000000001
Return-Path: <White%[email protected]>
Received: from udel-relay by SU-SIERRA.ARPA with TCP; Tue 22 Jan 85 17:38:51-PST
Received: From csnet-pdn-gw.ARPA by udel-relay.ARPA id a029083
          ;22 Jan 85 20:36 EST
Received: from hplabs by csnet-relay.csnet id bb28060; 22 Jan 85 18:01 EST
Received: by HP-VENUS id AA16629; Tue, 22 Jan 85 12:17:08 pst
Message-Id: <8501222017.AA16629@HP-VENUS>
Date: 22 Jan 1985 1216-PST
From: Vic White <White%[email protected]>
Subject: Re: Odd declaration bug [in KCC]
To: [email protected]
Cc: bug-kcc%[email protected]
Source-Info:  From (or Sender) name not authenticated.

	I'd be willing to lay odds that, since KCC compiles first
into FAIL, and ENTRY is a FAIL pseudo-op, KCC gets heartburn.
Does it make sense for KCC to generate symbols in cases like these,
or have a simpler convention of appending or prepending (e.g., '$entry')?
-------

23-Jan-85 15:27:52-PST,2284;000000000001
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Wed 23 Jan 85 15:27:40-PST
Date: 23 Jan 1985  18:26 EST (Wed)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Vic White <White%[email protected]>
cc:   Bug-KCC@Sierra
Subject: Odd declaration bug [in KCC]
In-reply-to: Msg of 23 Jan 1985  13:36-EST from Vic White <White%hplabs.csnet at csnet-relay.arpa>

    Date: Wednesday, 23 January 1985  13:36-EST
    From: Vic White <White%hplabs.csnet at csnet-relay.arpa>

        Thanks for the info.  There still seems to be a potential problem
    with name conflicts with the assembler, though.  Which of these would
    you think is a better solution:

    	1) Fix source programs to avoid these conflicts

The fewer changes needed to port a program, the better...

    	2) Use Kirk's suggestion of removing FAIL pseudo-ops as
    	   symbols at the beginning of the FAIL modules generated.

This is better.  You also have to do the MACRO pseudos -- KCC can emit
MACRO as well as FAIL, if you ask it to.  With a bit more work it
could emit files that would run through either.

    	3) Have a convention for naming classes of symbols (external
    	   routines with 'NAME.', for example) that avoids conflicts.

I wouldn't want to do anything to reduce the number of significant
letters in externals.  Renaming only the conflicting symbols would be
ok, if they had a $ or % or something like that that can't appear in
another symbol.

    	4) Generate a REL file directly from KCC ?

This is of course the best solution.  But it's also the hardest to
implement.  In any case it would still be useful to have assembly
output available, and it would be best if it always assembled...

    Also, I'm curious.  Why have a language construct to pass entry
    declarations?   All level 0 routines are global ... is there an
    advantage to this entry qualifier?

ENTRY blocks are needed for library routines so that LINK can find
them, and they must go at the top of REL files.  FAIL and KCC are both
one-pass, and so need to know about entries before they emit any code.
Normal programs don't need to use entry.
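
The renaming idea from option 3 above (mark only the conflicting
symbols with a character such as $) might look roughly like the sketch
below.  It is an illustration only: the pseudo-op list is a small
hypothetical sample rather than the real FAIL or MACRO reserved set,
and asm_name() is an invented helper, not a KCC routine.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sample of assembler pseudo-op names; the real FAIL and
   MACRO lists are longer and are not reproduced here. */
static const char *pseudo_ops[] = { "entry", "extern", "intern", "title", "end" };

/* Write the assembler-level name for C identifier `id` into `out`.
   A name that collides with a pseudo-op gets a '$' prefix.  Since '$'
   is not part of a standard C identifier, the result cannot clash
   with another C symbol.  Returns 1 if the name was changed, else 0. */
static int asm_name(const char *id, char *out, size_t outlen)
{
    size_t i;

    assert(id != NULL && out != NULL && outlen > 0);
    for (i = 0; i < sizeof pseudo_ops / sizeof pseudo_ops[0]; i++) {
        if (strcmp(id, pseudo_ops[i]) == 0) {
            snprintf(out, outlen, "$%s", id);
            return 1;
        }
    }
    snprintf(out, outlen, "%s", id);
    return 0;
}
```

Because only colliding names are rewritten, every other external
keeps its full number of significant letters.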
22-Jan-85 18:22:03-PST,3480;000000000001
Return-Path: <@COLUMBIA-20.ARPA:GINGELL@CWR20B>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Tue 22 Jan 85 18:21:53-PST
Received: from CWR20B by CUCS20 with DECnet; 22 Jan 85 21:19:03 EST
Date: Tue 22 Jan 85 21:18:11-EST
From: Rob Gingell <GINGELL@CWR20B>
Subject: Re: paunix
To: SATZ%SU-SIERRA@CUCS20
In-Reply-To: Message from "Greg Satz <[email protected]>" of Mon 21 Jan 85 23:32:02-EST

A brief update on status.  I have now gotten KCC extended addressing
code running under this thing.  Of course, these are the simplest of
programs.

Here's what I did to get this to work -- it involved some rework of
the emulated brk call.  "Normal" UNIX has the notion of text,
data, and stack segments which together account for all of the address
space.  I made the decision early on that PAUNIX wouldn't bother to
enforce that, but that it would simulate much of the behavior of "reducing
the break" by deleting whole pages from the address space which fell in
the range of deleted addresses.  Increasing the break involved remembering
the new limit, and checking the range of pages in the "new" area to make
sure they were either writable or nonexistent.  It was left to the
program to reference them by touching them.

Well, while this basic philosophy is still intact, the implementation had
numerous bogus assumptions in its algorithms which I think have been ironed
out.  One of the assumptions was that I could get away without initializing
a break value within PAUNIX (which was assumed to be 0).  While this worked
for my BLISS programs (which allocated new data from the low segment), the
KCC extended addressing code shot this all to hell, since what followed 0
was an empty section 0, followed by code, followed by stack, THEN followed
by the data segment, each in their own sections.  Further, the brk
implementation didn't allow you to grow the break over a section boundary --
since sections aren't demand created brk clearly had to be smarter.

The problem was solved by adding a PAUNIX pseudo-call called A.OUT.  Its
purpose is to allow the bootstrap code for a particular program to inform
the emulator of data which would on a real UNIX system have been obtained
by the kernel in reading the a.out header of an exec'ed file, but which is
not really available in a TOPS-20 program without making assumptions on
the contents of the job data area -- something I didn't want the emulator
to do because I don't know what BLISS, SAIL, MACRO, PASCAL, etc. programmers
all do to make use of that.  So, before calling main(), the bootstrap code
now "sets" PAUNIX's notion of the initial break value.  Further, PAUNIX
will allocate and delete sections as appropriate during a brk call.  And
even further, it protects the brk call from attempting to delete or
overwrite PAUNIX itself -- although it allows the program to delete itself
-- just like real UNIX.

I know, this is all probably really dull to listen to, but at least you
know I'm thinking and doing something about it!

In any case, I'm now going to run all my programs extended -- fun fun fun.

I haven't updated the Columbia-20 copies yet -- I gather you're going to
be gone all next week.  I'll be gone most of it too, be up at DEC for the
6.1 festivities, but expect to get a full weekend in before that and also
the evenings of this week.  I'm going off to work on the TTY I/O stuff 
now, will keep you up to date.

	Rob
-------
26-Jan-85 13:55:00-PST,1707;000000000001
Return-Path: <@COLUMBIA-20.ARPA:GINGELL@CWR20B>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Sat 26 Jan 85 13:54:55-PST
Received: from CWR20B by CUCS20 with DECnet; 26 Jan 85 16:53:35 EST
Date: Sat 26 Jan 85 16:53:37-EST
From: Rob Gingell <GINGELL@CWR20B>
Subject: Re: paunix
To: SATZ%SU-SIERRA@CUCS20
In-Reply-To: Message from "Greg Satz <[email protected]>" of Sat 26 Jan 85 16:43:16-EST

We'll try to have a good time up at DEC -- will let you know if we find
out anything particularly dramatic.

Another brief update -- started working on the TTY handling code, and
noticed that as with the brk call a number of simplifying assumptions
were too simple.  So at the moment, we're very close to having almost
all the functionality of the Berkeley terminal driver implemented.  Now,
since the current objective was to get PAUNIX to 7th edition status and
then start inserting the 4.x stuff, some of this stuff won't get used
right away (mainly job control), but the hooks will be there.  And you
can change your erase, kill, etc. characters to your heart's content.
It should also be at least as efficient as the current calls to
TEXTI%.  And -- it takes LF to CR/LF on output and CR/LF to LF on the
way in when the CRMOD flag's set.

I'm going to try to get as much of my last-published plan done this
weekend so I can go after fork() after we get back from DEC.

Take care, hope you had (will have?) fun in the land of Unix users'
groups.  (Say, are they as fun as DECUS?  The last Usenix conference
I went to was in Toronto in 79(?) -- was a pretty good time, but only
a couple of hundred people.  I imagine there are quite a few more now...)
-------
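
The CRMOD mapping mentioned above, LF to CR/LF on output and CR/LF to
LF on input, is small enough to sketch.  The function names and the
string-to-string interface are assumptions made for the example; the
PAUNIX terminal code itself is not shown in these messages.

```c
#include <assert.h>
#include <stddef.h>

/* Output side: copy `in` to `out`, expanding each LF to CR LF.
   `out` needs room for twice the input length plus one byte.
   Returns the number of bytes written, not counting the NUL. */
static size_t crmod_out(const char *in, char *out)
{
    size_t n = 0;

    assert(in != NULL && out != NULL);
    for (; *in != '\0'; in++) {
        if (*in == '\n')
            out[n++] = '\r';
        out[n++] = *in;
    }
    out[n] = '\0';
    return n;
}

/* Input side: copy `in` to `out`, collapsing each CR LF pair to LF.
   A CR that is not followed by LF is passed through unchanged. */
static size_t crmod_in(const char *in, char *out)
{
    size_t n = 0;

    assert(in != NULL && out != NULL);
    for (; *in != '\0'; in++) {
        if (in[0] == '\r' && in[1] == '\n')
            continue;            /* drop the CR; the LF is copied next */
        out[n++] = *in;
    }
    out[n] = '\0';
    return n;
}
```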
27-Jan-85 13:08:33-PST,412;000000000001
Mail-From: WHP4 created at 27-Jan-85 13:08:27
Date: Sun 27 Jan 85 13:08:27-PST
From: Bill Palmer <[email protected]>
Subject: how to throw kcc into a loop
To: [email protected]

Try compiling something like this little fragment (problem: no closing " on
format descriptor) and kcc will print the same errors at you for a long time.

main ()
{
    printf("%s\n,"foo");
}

					Bill
-------
27-Jan-85 16:20:35-PST,365;000000000001
Mail-From: SATZ created at 27-Jan-85 16:20:26
Date: Sun 27 Jan 85 16:20:26-PST
From: Greg Satz <[email protected]>
Subject: pathological bugs fixed
To: [email protected]
Phone: (415) 497-1004

Programs with missing double quotes should now terminate. I also
fixed the problems with the compiler going crazy with

main()

and

main(){
-------
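
One common way to keep a lexer from looping on a missing double quote
is to end the string at the end of the line and always consume at
least one character.  The sketch below shows that idea only; it is not
the change that was actually made to KCC, and scan_string() is a
hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Scan a string literal that starts at src[0] == '"'.
   Returns the number of characters consumed (always at least 1, so the
   caller makes progress).  *ok is set to 1 if a closing quote was
   found, or to 0 if the line or the input ended first. */
static size_t scan_string(const char *src, int *ok)
{
    size_t i = 1;

    assert(src != NULL && ok != NULL && src[0] == '"');
    while (src[i] != '\0' && src[i] != '\n') {
        if (src[i] == '\\' && src[i + 1] != '\0') {
            i += 2;              /* skip the backslash and the escaped char */
            continue;
        }
        if (src[i] == '"') {
            *ok = 1;
            return i + 1;        /* include the closing quote */
        }
        i++;
    }
    *ok = 0;                     /* report once; scanning resumes after this */
    return i;
}
```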
27-Jan-85 17:56:56-PST,454;000000000001
Mail-From: SATZ created at 27-Jan-85 17:56:43
Date: Sun 27 Jan 85 17:56:43-PST
From: Greg Satz <[email protected]>
Subject: entry and entry
To: [email protected]
Phone: (415) 497-1004

I modified KCC to accept entry as an identifier as well as
a statement. Entry can only be used as a statement if it
is the first token in a program. This should reduce the number
of things you have to modify when porting a program from Unix.
-------
29-Jan-85 11:47:39-PST,792;000000000001
Mail-From: SATZ created at 29-Jan-85 11:47:35
Date: Tue 29 Jan 85 11:47:34-PST
From: Greg Satz <[email protected]>
Subject: KCC bugs
To: [email protected]
cc: [email protected]
Phone: (415) 497-1004

The following programs came from a suite that Dan Newell managed to get.

This one loops forever in gencode(). I want to write a debug routine
that outputs info from the parse tree passed to gencode() so I will
understand it better. In the meantime, any hints?

main()
{
   int a,b,c,d;
   a=1; b=2; c=3;
   d = a + b + c;
}

KCC complains that the variable i is being used twice. This is sort of
scary since I am not sure how much symbol table hacking will be
required.

main()
   {
   int i = 0;
      {
      int i;
      i++;
      }
   }
-------
29-Jan-85 13:17:03-PST,3195;000000000011
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Tue 29 Jan 85 13:16:53-PST
Date: 29 Jan 1985  16:14 EST (Tue)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Greg Satz <[email protected]>
Subject: KCC bugs

Infinite loops that don't print error messages once each cycle are
almost always in the peephole optimizer.  In this case, you have a
pair of instructions like
  q:	ADD	R,S
  p:	ADD	R,T
where R is the reg that got 1 into a, S is b=2, and T is c=3.
Of course there were MOVEs around them but they got folded out by the
common subexpression eliminator.

Now foldplus() is called on p.  First it does an if that isn't true,
so we skip past the body (which is a large switch statement).  Next
there is a while loop, which is where we are getting stuck.  It
switches on q->pop (PLUS), which in turn switches on p->ptype (REGIS),
which then passes by an if statement designed for some other special
case than the one we have, and does
		p->pop = q->pop;	/* set opcode */
		q->pop = PLUS;		/* to swap them */
		swappseudo(p,q);	/* opcodes hacked, swap ops */
		foldplus(q);		/* try optimizing again */
		q = p;			/* start at top again with this */
		continue;		/* try loop again */
Now the problem is that after the continue, we have
  q:	ADD	R,T
  p:	ADD	R,S
which looks a lot like what we had before.

One way to fix this would be to change the
		foldplus(q);
		q = p;
into
		p = q;
This makes the next iteration of the loop do most of what was
previously done in the recursive call to foldplus(), and then skips
any attempts to fold what was q before the swap back into the
resulting code.  I'm not convinced the code produced will be any
worse, and it's certainly a lot safer.  Try it anyway, and see what it
does to the emitted code.  If it makes things much worse, you could
instead put
		if (q->ptype == REGIS) break;
at the start of the case; this would make it behave as now, except
that it would not do anything in the currently losing situation. You
are of course welcome to break foldplus() up from its current disorder
into small, easy to understand, routines.


I have often wished for a better way of debugging the optimizer.
Code to print out parse trees is not likely to help, but might be
useful in other ways... maybe you could make it another option like
the current symbol table dump stuff.


For the other problem, I am not very familiar with the symbol tables,
so I'm not sure how much I can help.  The block structure of symbols
is reflected in a stack structure in the table, but I don't know
whether the way lookup works will let you have different local entries
for the same name.  If not, you might have to set up some sort of
stack hanging off each symbol describing the next outer scope, and use
that instead of the code setting the scope to SSCOPED when a block is
exited (and of course pushing instead of giving an error for duplicate
local symbols).
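
A minimal sketch of the "stack hanging off each symbol" idea follows.
It assumes a plain linked list of names and integer block levels; the
structures and function names are hypothetical and do not correspond
to KCC's real symbol table or to SSCOPED.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct decl {
    int level;                   /* block nesting depth of the declaration */
    struct decl *outer;          /* shadowed declaration, or NULL */
};

struct sym {
    const char *name;            /* caller's string; it must outlive the table */
    struct decl *inner;          /* innermost visible declaration, or NULL */
    struct sym *next;
};

static struct sym *symtab;

static struct sym *find(const char *name)
{
    struct sym *s;

    for (s = symtab; s != NULL; s = s->next)
        if (strcmp(s->name, name) == 0)
            return s;
    return NULL;
}

/* Declare `name` at `level`.  An outer declaration is pushed down, not
   rejected.  Returns 0 on success, or -1 for a duplicate in the same
   block or when out of memory. */
static int declare(const char *name, int level)
{
    struct sym *s;
    struct decl *d;

    assert(name != NULL && level >= 0);
    s = find(name);
    if (s == NULL) {
        s = malloc(sizeof *s);
        if (s == NULL)
            return -1;
        s->name = name;
        s->inner = NULL;
        s->next = symtab;
        symtab = s;
    }
    if (s->inner != NULL && s->inner->level == level)
        return -1;
    d = malloc(sizeof *d);
    if (d == NULL)
        return -1;
    d->level = level;
    d->outer = s->inner;
    s->inner = d;
    return 0;
}

/* Level of the visible declaration of `name`, or -1 if there is none. */
static int visible_level(const char *name)
{
    struct sym *s = find(name);

    return (s != NULL && s->inner != NULL) ? s->inner->level : -1;
}

/* On leaving a block, pop every declaration made at `level` or deeper,
   which makes the shadowed outer declarations visible again. */
static void leave_block(int level)
{
    struct sym *s;

    for (s = symtab; s != NULL; s = s->next)
        while (s->inner != NULL && s->inner->level >= level) {
            struct decl *d = s->inner;
            s->inner = d->outer;
            free(d);
        }
}
```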


Is there a lint for the 20 yet?  One project that might be useful
(and a lot of work) would be to make KCC pass it...
29-Jan-85 18:15:07-PST,631;000000000001
Mail-From: SATZ created at 29-Jan-85 18:15:06
Date: Tue 29 Jan 85 18:15:06-PST
From: Greg Satz <[email protected]>
Subject: Re: KCC bugs
To: [email protected]
cc: [email protected]
In-Reply-To: Message from "David Eppstein <[email protected]>" of Tue 29 Jan 85 16:14:00-PST
Phone: (415) 497-1004

unfortunately lint has a construct like

#if defined(vax) || defined(pdp11)

which kcc just doesn't plain like. I looked into implementing
this but it seems that the pconst() code would have to be
broken. I may just remove it and ignore it but it will come
back to haunt me later... sigh.
-------
29-Jan-85 20:54:43-PST,537;000000000001
Mail-From: SATZ created at 29-Jan-85 20:54:35
Date: Tue 29 Jan 85 20:54:35-PST
From: Greg Satz <[email protected]>
Subject: bugs
To: [email protected]
Phone: (415) 497-1004

I fixed the following bugs:

#undef foo used to complain if foo wasn't defined. make it keep quiet.

#if expression was eating the next line. make it back up over the \n
so the next line will be seen.

typedef enum {a,c,b} BAR; wasn't working. fixed.

Now, the next problem is trying to implement #if defined(BAR) || defined(FOO)
-------
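
For reference, an expression such as #if defined(BAR) || defined(FOO)
can be evaluated with a very small recursive-descent parser.  The
sketch below is independent of pconst() and of KCC's preprocessor; the
macro table, the function names and the supported subset (defined, !,
&&, || and parentheses) are all assumptions made for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical set of currently defined macros. */
static const char *defined_macros[] = { "pdp10", "TOPS20" };

static int is_defined(const char *name, size_t len)
{
    size_t i;

    for (i = 0; i < sizeof defined_macros / sizeof defined_macros[0]; i++)
        if (strlen(defined_macros[i]) == len &&
            strncmp(defined_macros[i], name, len) == 0)
            return 1;
    return 0;
}

static const char *cursor;       /* current position in the expression */

static void skip_blanks(void)
{
    while (*cursor == ' ' || *cursor == '\t')
        cursor++;
}

static int parse_or(void);

/* primary := defined(NAME) | defined NAME | !primary | (or-expr) */
static int parse_primary(void)
{
    int v = 0;

    skip_blanks();
    if (*cursor == '!') {
        cursor++;
        return !parse_primary();
    }
    if (*cursor == '(') {
        cursor++;
        v = parse_or();
        skip_blanks();
        if (*cursor == ')')
            cursor++;
        return v;
    }
    if (strncmp(cursor, "defined", 7) == 0 &&
        !isalnum((unsigned char)cursor[7]) && cursor[7] != '_') {
        const char *start;
        int paren;

        cursor += 7;
        skip_blanks();
        paren = (*cursor == '(');
        if (paren) {
            cursor++;
            skip_blanks();
        }
        start = cursor;
        while (isalnum((unsigned char)*cursor) || *cursor == '_')
            cursor++;
        v = is_defined(start, (size_t)(cursor - start));
        skip_blanks();
        if (paren && *cursor == ')')
            cursor++;
    }
    return v;                    /* anything unrecognized counts as 0 */
}

static int parse_and(void)
{
    int v = parse_primary();

    for (;;) {
        int rhs;

        skip_blanks();
        if (cursor[0] != '&' || cursor[1] != '&')
            return v;
        cursor += 2;
        rhs = parse_primary();   /* always parsed, so the cursor advances */
        v = v && rhs;
    }
}

static int parse_or(void)
{
    int v = parse_and();

    for (;;) {
        int rhs;

        skip_blanks();
        if (cursor[0] != '|' || cursor[1] != '|')
            return v;
        cursor += 2;
        rhs = parse_and();
        v = v || rhs;
    }
}

/* Evaluate the text that follows "#if". */
static int eval_if(const char *expr)
{
    assert(expr != NULL);
    cursor = expr;
    return parse_or();
}
```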
30-Jan-85 09:42:16-PST,2154;000000000001
Mail-From: SATZ created at 30-Jan-85 09:42:10
Date: Wed 30 Jan 85 09:42:10-PST
From: Greg Satz <[email protected]>
Subject: structure member confusion
To: [email protected]
Phone: (415) 497-1004

Dave, it seems that there is a problem with non-unique structure member
names. I traced through the code and found out that when there are two
members with the same name but in different structures, you remember
this by setting the offset to AMBIGMEM. However, the type for the member
is set to the first member with that name. Consequently, if the second
name is a structure, and the first is an integer, you will lose as the
following example indicates.

I see two solutions to this problem:

1) Easier solution: when checking member types, check to see if value is
AMBIGMEM, if so, check the type field in the SMEM list you build. This
type field needs to be added into the definition and recorded when this
list is built. Currently, the type info from the call to typespec() is
being ignored if findsym() does find a symbol (two members with the same
name and different offsets in different structures).

2) Make a new constant called AMBIGTYP which is set when AMBIGMEM is
set. This requires more work when checking types but it may be a cleaner
implementation.

Comments?

Here is the program that loses:

---------------
union ndu {
	struct {
		int op;
		short type;
		char *ptr;
		} in;
	struct {
		int op;
		short type;
		char *ptr;
		int tst;
		} on;
};

struct atype {
	short aty;
	short extra;
};

struct foo {
	int flag;
	int *name;
	short args;
	struct atype type;
};

union u {
	struct foo l;
	struct {
		short flag;
		char *fn;
	} y;
};

union u rc;

main()
{

	rc.l.flag = 15;
	rc.l.type.aty = 1;
	rc.l.type.extra = 0;
	bar();
}

2Sierra#cc -s -g foo.c
KCC:    foo

Error at main+4, line 41 of foo.c:
        rc.l.type.aty 
Structure name required on the left of the (.) operator -- aty.

Error at main+5, line 42 of foo.c:
        rc.l.type.extra 
Structure name required on the left of the (.) operator -- extra.
?2 errors detected
-------
30-Jan-85 10:00:29-PST,1266;000000000001
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Wed 30 Jan 85 10:00:20-PST
Date: 30 Jan 1985  12:57 EST (Wed)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Greg Satz <[email protected]>
Cc:   [email protected]
Subject: structure member confusion
In-reply-to: Msg of 30 Jan 1985  12:42-EST from Greg Satz <SATZ at SU-SIERRA.ARPA>

Uh, right, looks like you need to remember types in the SMEM structure.
I can think of two possible ways to resolve the case when the offset
is unambiguous but there is more than one type given for that offset:

(1) Set the offset to AMBIGMEM on type conflicts.  This means that
    unambiguous offsets don't need to search through SMEM at all.  It
    is clean, but more restrictive - now if you give an offset two
    different types you will have to use it only from structs in which
    it is defined.

(2) Leave the offset alone, and always look through SMEM for a type.
    If no match is found, return say the one defined first or last or
    something like that.  This doesn't restrict the language accepted
    any further than it is now, but the semantics become less clean.
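
Either way the per-structure records have to carry the member type.  A
sketch of such a lookup is given below, assuming one record per
(structure, member) pair.  The field names, the type codes and
find_member() are hypothetical; the real SMEM layout is not shown in
these messages.

```c
#include <assert.h>
#include <string.h>

enum { T_INT, T_SHORT, T_STRUCT };   /* toy type codes */

/* One SMEM-style record per (structure, member) pair. */
struct smem {
    const char *name;            /* member name */
    int owner;                   /* id of the structure declaring it */
    int offset;                  /* offset inside that structure */
    int type;                    /* member type, recorded per structure */
};

/* Find member `name` as declared in structure `owner`.  Returns NULL
   if that structure has no such member, so the caller can fall back to
   whatever default rule is chosen. */
static const struct smem *find_member(const struct smem *list, size_t n,
                                      const char *name, int owner)
{
    size_t i;

    assert(list != NULL && name != NULL);
    for (i = 0; i < n; i++)
        if (list[i].owner == owner && strcmp(list[i].name, name) == 0)
            return &list[i];
    return NULL;
}
```

With records like these, "type" looked up through struct foo yields a
structure type even though an earlier structure declared a short
member with the same name.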
 1-Feb-85 17:10:52-PST,1190;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Fri 1 Feb 85 17:10:49-PST
Date: Fri 1 Feb 85 17:09:29-PST
From: Ken Harrenstien <[email protected]>
Subject: Bug in MALLOC
To: [email protected]
cc: [email protected]

I copied <KCC.LIB> over here and have been updating LIBC.MASTER and
scanning the code.  I found one potentially serious bug with MALLOC;
when it calls SBRK to get more memory, it never checks the return value
for an error.  If SBRK returns -1 or 0 you are going to get shafted.
Now, I realize the PDP-10 has a lot of memory, but...

I'll probably run into a couple of other irregularities as I proceed, and
will let you know about them.  I share your dubiousness about the 
portability of CTYPE.C/H, by the way; that sort of offset hack optimization
should be handled by the compiler, if anything.  Simplest fix is to
normalize the table & macros and conditionalize ctype.c so the table declaration
is "int", not "char", when being compiled by KCC.  (Personally I like the
idea of having more room explicitly available for user flags, but portability
requires the user not know about _ctype_ anyway...)
-------
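
The missing check is short.  The sketch below wraps an SBRK-like
routine and treats both -1 and 0 as failure, as the message suggests.
The routine is passed in as a function pointer so that the failure
path can be exercised; get_more_core() and the two stand-in routines
are invented for the example and are not part of <KCC.LIB>.

```c
#include <assert.h>
#include <stddef.h>

/* An SBRK-like routine: returns the start of the new memory, or
   (char *)-1 on failure, as UNIX sbrk() does. */
typedef char *(*sbrk_fn)(long nbytes);

/* Ask `more` for nbytes of memory.  Returns NULL on failure so that a
   malloc built on top of it can return NULL instead of using a bogus
   address. */
static char *get_more_core(sbrk_fn more, long nbytes)
{
    char *p;

    assert(more != NULL && nbytes > 0);
    p = more(nbytes);
    if (p == (char *)-1 || p == NULL)
        return NULL;
    return p;
}

/* Two stand-ins for SBRK, used only to exercise the check. */
static char arena[64];
static char *sbrk_ok(long nbytes)   { (void)nbytes; return arena; }
static char *sbrk_fail(long nbytes) { (void)nbytes; return (char *)-1; }
```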
 3-Feb-85 22:21:07-PST,578;000000000001
Mail-From: SATZ created at  2-Feb-85 16:41:05
Date: Sat 2 Feb 85 16:41:05-PST
From: Greg Satz <[email protected]>
Subject: bug fixes
To: [email protected]
Phone: (415) 497-1004

Here is a status report on KCC:

1) added the __LINE__ and __FILE__ macros

2) fixed the problem with a.b.c.d where c was declared as a structure
   and something else.

3) Fixed the foldplus() looping bug with a patch from Dave.

Yet to do:

{ int i; { int i; } } bombs miserably. local symbols aren't stacked correctly

#if defined(FOO) yet another hack to do.
-------
 2-Feb-85 12:26:34-PST,1835;000000000001
Return-Path: <@COLUMBIA-20.ARPA:GINGELL@CWR20B>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Sat 2 Feb 85 12:26:27-PST
Received: from CWR20B by CUCS20 with DECnet; 2 Feb 85 15:25:19 EST
Date: Sat 2 Feb 85 15:24:52-EST
From: Rob Gingell <GINGELL@CWR20B>
Subject: Re: paunix comments
To: SATZ%SU-SIERRA@CUCS20
In-Reply-To: Message from "Greg Satz <[email protected]>" of Tue 29 Jan 85 16:22:27-EST

| Thanks for the notes, here's some comments/questions.

do you plan on implementing an _exit() so processes can exit without
closing parents files?

| I'm  having   trouble  parsing   this  one.    The  current   PAUNIX
| implementation defines only  the _exit() entry  point (although  the
| manual does not reflect that so I'll fix that).  My understanding is
| that in real UNIX, exit() is  really a stdio entry point and  exists
| to clean up stdio data structures before doing _exit() which is  the
| UNIX equivalent of a  HALTF% with status, except  that all fd's  are
| closed.  So, my question back is --  if I change the manual page  to
| reflect the  fact  that PAUNIX  just  implements _exit(),  and  that
| exit() is really in stdio for C programs (and nonexistent for other
| languages unless  they implement something  similar)  does that  meet
| your concerns here?

in the setuid manual page, you put in getgid instead of setgid.

| Thanks, that's fixed now.

That's it for now.

| That's it for  me too.  V6.1  school was a  good time, learned  some
| things about DECnet  (which probably  don't concern  you folks)  and
| TCP/IP (which I didn't know much about because we don't use it now).
| More on that later, I'm trying to  get back into the flow of  PAUNIX
| right now so I'm trying to not think about all the work I've got  to
| do to bring 6.1 up.
-------
 4-Feb-85 02:01:33-PST,505;000000000001
Mail-From: WHP4 created at  4-Feb-85 02:01:28
Date: Mon 4 Feb 85 02:01:28-PST
From: Bill Palmer <[email protected]>
Subject: prarg spoken here
To: [email protected]

I just made some changes to KCC and Sierra's exec so that one can use the
load and compile commands with KCC if desired.  I believe the changes are
limited to:  <6-EXEC>EXECCS.MAC, <KCC.LIB>PFORK.FAI, <KCC.CC>CC.C, 
<KCC.CC>CC.S, and <KCC.CC>CCASMB.C.  Let me know if you have any problems
with it.

					Bill
-------
 4-Feb-85 04:31:33-PST,2247;000000000001
Return-Path: <@COLUMBIA-20.ARPA:GINGELL@CWR20B>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Mon 4 Feb 85 04:31:27-PST
Received: from CWR20B by CUCS20 with DECnet; 4 Feb 85 07:30:28 EST
Date: Mon 4 Feb 85 07:29:57-EST
From: Rob Gingell <GINGELL@CWR20B>
Subject: Re: paunix comments
To: SATZ%SU-SIERRA@CUCS20
In-Reply-To: Message from "Greg Satz <[email protected]>" of Mon 4 Feb 85 01:35:14-EST

You're welcome.

Just as some more background on the file closing question, PAUNIX has to
go to some pains to deal with the closing of files.  First, the "I/O name
space" available to a process is the traditional 20 file descriptors.  
Internally, these are mapped to JFN's which may be shared between several
processes.  The sharing occurs as a result of fork() and dup().  Two
distinct open's of a file will get separate JFN's (and separate file pointers).
The structure of the tables very nearly approximates UNIX's (as described
in Thompson's "UNIX Implementation"), although some of the tables are not
directly implemented in PAUNIX as it relies on corresponding structures 
implemented within TOPS-20.

When a process exits, its FD's are closed which will decrement reference
counts in the JFN table (PAUNIX data structure oft).  When the reference
count on an oft cell goes to 0, the JFN will actually be closed.  However,
TOPS-20 has the notion of JFN's "belonging" to a fork, such that when the
fork is killed all the JFN's it has brought into the world are CLOSF%'ed
with CZ%ABT via the internal CLZFF% done by a KFORK%.  Thus, the criteria
for PAUNIX KFORK%'ing a TOPS-20 fork are that it has _exit()'ed, that its
parent has obtained its status, and that it has no entries with non-zero
reference counts in the oft.  To support this, there's a process state called
JFN-zombie, which is like the zombie state except it is the state a process
enters after its status has been read and while it still has entries in the
oft.

So in this case, if a program does something under UNIX in a multi-process
environment -- it will work exactly the same under PAUNIX, the TOPS-20'isms
regarding JFN ownership and CLZFF% are accounted for so as to not interfere
with emulating UNIX.
-------
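
The reference counting described above can be sketched in a few lines.
The table size, the field names and the jfns_closed counter are
assumptions made for illustration; only the name "oft" and the rule
"close the JFN when the count reaches 0" come from the message.

```c
#include <assert.h>

#define NOFT 8                   /* toy table size */

struct oft_entry {
    int jfn;                     /* TOPS-20 JFN behind this entry */
    int refs;                    /* number of file descriptors sharing it */
};

static struct oft_entry oft[NOFT];
static int jfns_closed;          /* counts real closes, for demonstration */

/* fork() and dup() share an existing entry instead of opening again. */
static void oft_share(int slot)
{
    assert(slot >= 0 && slot < NOFT && oft[slot].refs > 0);
    oft[slot].refs++;
}

/* Closing a descriptor drops one reference; the JFN itself is closed
   only when the last reference goes away. */
static void oft_release(int slot)
{
    assert(slot >= 0 && slot < NOFT && oft[slot].refs > 0);
    if (--oft[slot].refs == 0) {
        /* the emulator would issue CLOSF% on oft[slot].jfn here */
        oft[slot].jfn = 0;
        jfns_closed++;
    }
}
```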
 4-Feb-85 16:30:54-PST,519;000000000001
Mail-From: WHP4 created at  4-Feb-85 16:30:46
Date: Mon 4 Feb 85 16:30:46-PST
From: Bill Palmer <[email protected]>
Subject: macro interface now functional
To: [email protected]

I fixed CCASMB.C so that it can converse with MACRO (PA1050, actually) 
successfully.  The problem was that PA1050 expects sixbit 'MAC' in a
certain word in the PRARG block, and KCC was passing sixbit 'FAI' in all
cases.  Since the magic number wasn't commented, it wasn't clear what
the problem was.

				Bill
-------
 5-Feb-85 06:04:31-PST,1307;000000000011
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Tue 5 Feb 85 06:04:26-PST
Date: Tue 5 Feb 85 06:03:19-PST
From: Ken Harrenstien <[email protected]>
Subject: Bizarre "preprocessor" bug
To: [email protected]
cc: [email protected]

I encountered a mysterious error while compiling a large program, and I had
a very hard time figuring out how my changes could have broken it.  After a
long process of whittling I arrived at the following piece of code which
demonstrates the problem.  Evidently the first blank line in a conditional
that is being ignored will act as if an #endif was seen, IF there was a
previous unmatched endif.  All I can say is that KCC should really complain
about unmatched endifs, and in any case should not start treating
conditionalized code so rashly.  Greg may have fixed one aspect of this bug
already, I'm not sure; the version of KCC I have seems to have last been
written on 1-Jan-85.  I'll have to get a new one soon.  Anyway here's the
magic cookie:
	---------------------
#endif	/* Take this line out and the code will compile */

main() { }

#ifdef COMMENT
	Ignored=-(%0*#$!=^*<x> This line will be ignored

	Seen=-(%0*#$!=^*<x> This line will be seen!
#endif COMMENT

	---------------------
-------
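
Complaining about an unmatched #endif only needs a nesting counter.
The sketch below checks an array of lines; it is a simplified stand-in
(directives must start in column one, and #else is ignored), and
check_conditionals() is an invented name, not a KCC routine.

```c
#include <assert.h>
#include <string.h>

/* Check #if/#endif nesting.  Returns 0 if balanced, the 1-based number
   of the first line holding an unmatched #endif, or -1 if a
   conditional is still open at the end of the input. */
static int check_conditionals(const char *const *lines, int nlines)
{
    int depth = 0;
    int i;

    assert(lines != NULL && nlines >= 0);
    for (i = 0; i < nlines; i++) {
        const char *l = lines[i];

        if (strncmp(l, "#if", 3) == 0) {            /* #if, #ifdef, #ifndef */
            depth++;
        } else if (strncmp(l, "#endif", 6) == 0) {
            if (depth == 0)
                return i + 1;    /* stray #endif: report it, do not guess */
            depth--;
        }
    }
    return depth == 0 ? 0 : -1;
}
```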
 4-Feb-85 18:43:03-PST,936;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Mon 4 Feb 85 18:43:01-PST
Date: Mon 4 Feb 85 18:41:52-PST
From: Ken Harrenstien <[email protected]>
Subject: Belay that ctypo
To: [email protected]
cc: [email protected]

I just realized that the reason for the +1 offset in the ctype table
is simply so that the char value EOF (-1) is an acceptable test
argument, and in fact (as I suspected) there is no efficiency
difference in the resulting code.  Of course, using an integer table
would still be more efficient (a MOVE instead of ADJBP and LDB) but
that is a different issue.  I guess since the code will work equally
faithfully either way, whatever the machine, and the 10 tends to have
plenty of memory, I would favor using an integer table.  String
parsing is one place that needs help to compensate for the overhead
introduced by suboptimal byte pointer handling.
-------
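
The two points in the message above, the +1 offset that makes EOF a
legal argument and the use of an int table, can be combined as in the
sketch below.  The names are prefixed with my_ to make clear that this
is not the real CTYPE.C/H, and only two flag bits are filled in.

```c
#include <assert.h>
#include <stdio.h>               /* for EOF */

#define MY_DIGIT 01
#define MY_SPACE 02

/* An int table, one word per entry, with one extra slot in front so
   that index -1 (EOF) is valid.  Static storage starts out as zero. */
static int my_ctype[1 + 128];

#define my_isdigit(c) ((my_ctype + 1)[c] & MY_DIGIT)
#define my_isspace(c) ((my_ctype + 1)[c] & MY_SPACE)

static void init_my_ctype(void)
{
    int c;

    assert(EOF == -1);           /* the offset trick relies on this */
    for (c = '0'; c <= '9'; c++)
        my_ctype[c + 1] |= MY_DIGIT;
    my_ctype[' ' + 1]  |= MY_SPACE;
    my_ctype['\t' + 1] |= MY_SPACE;
    my_ctype['\n' + 1] |= MY_SPACE;
}
```

After init_my_ctype(), my_isdigit(EOF) reads slot 0 of the table and
is simply false; no special case is needed in the macro.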
 6-Feb-85 23:46:44-PST,1395;000000000001
Mail-From: DAGONE created at  6-Feb-85 23:46:39
Date: Wed 6 Feb 85 23:46:38-PST
From: Dan Newell  <[email protected]>
Subject: Passing floating constants
To: [email protected]

   Shouldn't they be passed as doubles rather than just floats?
For instance, foo(1.0) presently generates
	IFIW 1,[1]
	PUSH 17,1
	PUSHJ 17,foo
whereas foo(i) if i is double is done
	DMOVE	6,-7(17)
	PUSH	17,6
	PUSH	17,7
The callee doesn't know if it should pop a float or a
double off the stack unless some kind of convention is
involved.

   As the callee is allowed to declare a parameter as
a float or double, then we can do one of the following:
   - Pass everything as floats.  This defeats the purpose of
	having doubles; besides, I think K&R says that floating
	arithmetic is all done in double to avoid losing
	precision, and double to float would lose some.
   - Force the caller to cast his constants to double
	Clutters code with casts. More work for programmer.
	Makes all sorts of problems when passing floats
	to routines expecting doubles, and vice versa if
	casts not correct.
   - Pass all floats/constants as doubles. No precision
	is lost, usercode is clean, caller knows what to do
	and callee knows what he's getting and can do
	what he bloody well pleases with the parameter
	once he's got it. The only penalty is the implicit
	cast if passing a float.

	Dan
-------
 7-Feb-85 00:25:42-PST,540;000000000001
Mail-From: DAGONE created at  7-Feb-85 00:25:28
Date: Thu 7 Feb 85 00:25:27-PST
From: Dan Newell  <[email protected]>
Subject: Floating round-up/down
To: [email protected]

    In the test suite I have, it seems they expect that
1/2 rounds down to 0.0 instead of to 1.0. Is this just
typical of VAX's/8086/68K machines and the 20 is different?
I checked the VAX and i=1.0/2.0 is 0.
    This is a pretty minor point but I am just trying
to keep tabs on all the messages I'm getting from the test
suites.
	Dan
-------
 7-Feb-85 06:57:11-PST,920;000000000011
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Thu 7 Feb 85 06:57:09-PST
Date: 7 Feb 1985  09:55 EST (Thu)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Bill Palmer <[email protected]>
Subject: is realcode() injurious to your health?
In-reply-to: Msg of 6 Feb 1985  19:47-EST from Bill Palmer <whp4 at SU-SIERRA.ARPA>

No, realcode() isn't very careful about keeping the data structures intact.
It assumes you're done with that pseudo-op, and hacks it up in certain
cases so that later parts of the routine will use the hacked op
instead of what was there before.  Also there are some optimizations
built into realcode() such as turning IMULI by power of two into ASH
that may or may not be reflected in the buffer when it returns.

Is there some reason for wanting it otherwise?
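
The IMULI-to-ASH case mentioned above depends on recognizing a power of
two and finding the shift count.  A sketch of that test is shown below;
shift_for_multiplier() is a hypothetical helper and says nothing about
how realcode() itself is written.

```c
#include <assert.h>

/* If n is a positive power of two, return the shift count k such that
   multiplying by n equals an arithmetic left shift by k (overflow
   aside).  Otherwise return -1, meaning "keep the multiply". */
static int shift_for_multiplier(long n)
{
    int shift = 0;
    long m = n;

    if (n <= 0 || (n & (n - 1)) != 0)
        return -1;               /* zero, negative, or more than one bit set */
    while ((m >>= 1) != 0)
        shift++;
    assert((1L << shift) == n);
    return shift;
}
```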
 7-Feb-85 19:00:14-PST,336;000000000001
Mail-From: DAGONE created at  7-Feb-85 18:59:54
Date: Thu 7 Feb 85 18:59:54-PST
From: Dan Newell  <[email protected]>
Subject: Vertical tabs not handled right
To: [email protected]

   ^K, or vertical tab, is a legal C white space and is
not presently being handled correctly. See <kcc.test.bell>ctrlk.c
	Dan
-------
 7-Feb-85 20:50:13-PST,916;000000000001
Mail-From: DAGONE created at  7-Feb-85 20:50:07
Date: Thu 7 Feb 85 20:50:07-PST
From: Dan Newell  <[email protected]>
Subject: Scoping problems with kcc
To: [email protected]

main()
    {
    {
    int i;
    i++;
    }

    {
    i++;
    }
    }
====
   Complains for second use of i as being in incorrect scope.
   It should be flushing known instance of int i; in first
scope from symbol table and complaining about second use
as an undefined.
   This bug though is semi-tolerable but indicative of careless
handling of the symbol table.
   The next program though shows where this leads...
====
main()
    {
    int i;

	{
	int i;
	i++;
	}
    i++;
}
====
   This program bombs with invalid use of auto variable outside
its scope.
   What's happening is the second int i; is wiping out knowledge
of the outer scope's int i;
   This is a no no.
	Dan
-------
 8-Feb-85 09:00:53-PST,900;000000000001
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Fri 8 Feb 85 09:00:43-PST
Date: 8 Feb 1985  11:59 EST (Fri)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   dagone@sierra
cc:   bug-kcc@sierra
Subject: float arguments

The problem is that according to K&R, floating point constants are
double not float, but I thought that was stupid and didn't do it.  All
floating point calculations are supposed to be done in double
precision too, but luckily the new ANSI standard will not require
that.  Perhaps a solution would be to coerce floating point constant
arguments to double, but then of course we would lose when someone
wanted to pass a constant as a float argument.  The current goal seems
to be compatibility with 4.2 -- how does its compiler handle this problem?
 8-Feb-85 13:40:11-PST,510;000000000001
Received: from LOTS-A by Sierra with Pup; Fri 8 Feb 85 13:39:41-PST
Date: Fri 8 Feb 85 13:36:09-PST
From: Dan Newell  <D.DAGONE@LOTS-A>
Subject: float parameters
To: bug-kcc@Sierra

To: bug-kcc@LOTS-A.#Pup

   I checked the Vax compiler and it seems that all
floating point parameters are passed as doubles and
the callee ignores the precision he doesn't need if
he only wants a float. This avoids the problems of
what parameter has been passed at the expense of
some space/speed.
	Dan
-------
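
The convention described above, where a float argument always arrives
as a double, is what C's default argument promotions give for
variadic (and old-style, unprototyped) calls.  The sketch below
demonstrates it with a variadic function; sum_floats() is an invented
example and not part of KCC or its library.

```c
#include <assert.h>
#include <stdarg.h>

/* Add up n floating-point arguments.  Each one is fetched as double:
   a float passed through "..." has already been promoted by the
   caller, so the callee never has to guess which width it was given. */
static double sum_floats(int n, ...)
{
    va_list ap;
    double total = 0.0;

    assert(n >= 0);
    va_start(ap, n);
    while (n-- > 0)
        total += va_arg(ap, double);
    va_end(ap);
    return total;
}
```

A call such as sum_floats(2, f, 2.0) works the same whether f is
declared float or double, at the cost of the implicit conversion when
it is a float.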
 8-Feb-85 14:49:09-PST,8342;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Fri 8 Feb 85 14:48:46-PST
Date: Fri 8 Feb 85 14:47:28-PST
From: Ken Harrenstien <[email protected]>
Subject: Minor optimization
To: [email protected]
cc: [email protected]

Code of the form (*proctab[3])(arg1, arg2)
results in something like
	MOVE 12,PROCTA+3
	PUSHJ 17,(12)
Although there seems to be no good reason why
	PUSHJ 17,@PROCTA+3
would not work better.  (*proctab[i])(...) could likewise be
	[optional: MOVE 12,<wherever "i" is>, if not already in register]
	PUSHJ 17,@PROCTA(12)

Since proctab is defined as an array of procedure addresses, you
should never find any entry with an additional indirect bit turned on,
and if the word is garbaged, the indirection doesn't matter since you
will jump into randomness anyway.  While I can't say this is a
high-priority item, it does seem like an easily performed
local-optimization check.  This could be extended to anything which
references an address, since no C addresses should ever have indirect
bits in them.


Date: Fri 8 Feb 85 17:32:06-PST
From: Greg Satz <[email protected]>

The reason that code sequence is generated that way is because
KCC supports extended addressing. Your examples only make 18 bits
available while KCC, even though it uses an extra instruction,
has the full word to use (even if not all of it will be).


Date: Fri 8 Feb 85 18:01:29-PST
From: Ken Harrenstien <[email protected]>

Are you quite sure that my suggestion (PUSHJ 17,@PROCTA+4) makes only
18 bits available?  I looked carefully at the DEC flowchart for
extended address calculation and don't see where there is a problem.
The value in PROCTA+4 can certainly be global.  If you are objecting
to the first address calculation (PROCTA+4 being only 18 bits) then
you should also object to the sequence
	MOVE 12,PROCTA+4
	PUSHJ 17,(12)
which is what KCC currently generates!


Date: Fri 8 Feb 85 18:35:06-PST
From: Ken Harrenstien <[email protected]>
Subject: Compiler heads over cliff

Here is a piece of code that produces rather strange FAIL output.
------------
typedef union {
	double is_str;
	struct { char *is_bp; int is_len; };
	} IDSTR;
main()
{	IDSTR name, getit();

	name.is_str = getit();
}
------------
	<the usual preliminary stuff deleted>
main:
	MOVEM	3,3(17)		; ?????!!!!
	MOVEM	4,4(17)		; ?????!!!!
	ADJSP	17,4
	PUSHJ	17,getit
	POP	17,4
	POP	17,3
	PUSHJ	17,$DFLOT
	IFIW	3,1
	DMOVEM	3,-1(17)
	ADJSP	17,-2
	POPJ	17,
------------

Now, I don't know what KCC thinks it was doing, and don't even know if
what I am trying is legal C, but KCC should NEVER, NEVER generate ANY
code which makes a positive-offset reference to the stack!!!  (I have
commented on the two instructions above which do.)  Otherwise, interrupt
handling could easily smash the "saved" values.

Any attempt to generate such offsets should probably be considered an
internal error and reported.  I won't be surprised if there are other
things wrong with the syntax, but KCC didn't complain...


Date: 9 Feb 1985  13:33 EST (Sat)
From: David Eppstein <[email protected]>
Subject: Minor optimization and lemminghood

KCC in fact used to indirect the way you suggest (this was before it
knew about function variables, but it happened for dereferencing
pointers and for certain structure references).  I took it out one day
because I got confused and thought that under certain circumstances
involving byte pointers of some format or other that the indirect bit
might accidentally be set in the pointer being indirected through.
I no longer believe this.  It is probably safe in terms of accidental
double indirection to perform the optimization.  It is not necessarily
safe in terms of the peephole optimizer, though - that should be
thoroughly checked before doing the fold again.  There is definitely
no problem with numbers of bits available.

Re using places above the stack:  It is documented somewhere (SIGNAL.FAI?)
that interrupts must use another stack because this happens.  There is
one routine, called I believe only on emission of an ADJSP instruction
that is responsible for most of this behavior.  However, if it is
changed not to use locations above the top of the stack, there are a
couple of less often hit places that need to be fixed as well:

(1) TOPS20.FAI, in fork() or wait() or both and maybe some other
    places, uses large chunks above the top of the stack for page
    buffers and other such storage.  I guess it should do an ADJSP and
    take the storage for real.

(2) Structures as return values live above the stack, and are then
    moved around a couple times before going wherever they were
    assigned.  I don't know how to change the calling sequence to
    avoid this (I have heard rumors that some PCC VAX or 68K or
    something code breaks when using interrupts for just this reason).


Date: Sat 9 Feb 85 18:30:55-PST
From: Ken Harrenstien <[email protected]>

(1) Yes, it should be easy to do an ADJSP and "take the storage for real".
Considering the overhead of the JSYSes involved, a couple of ADJSPs are
nothing.

(2) I am ignorant of the complications of KCC's inner workings, which
allows me to suggest that if you know a function is going to return a
structure, you can easily reserve stack storage for it before calling
the function.  This space reservation should be done just after the
last argument PUSH and just before the function call itself.  The
function, since it knows it is declared as returning a structure,
knows how much space there is between the return address and the
first, second, etc. arguments.  (You cannot reserve this space before
pushing arguments since then you could not implement printf with its
indefinite # of arguments.)  So, the "return(foostruc)" copies foostruc
into this reserved space; upon return to the caller, the caller has
plenty of opportunity to hack the structure before popping it and the
function arguments off the stack.  For the possibly typical case of an
assignment statement like

	struct bar foo, getstruc();
	foo = getstruc(a,b,c);

then you have copied it two different times, once to return the
structure and again to assign it to a more permanent place before
popping it off the stack.  Of course it is much more efficient to do:

	struct bar foo, *getstruc();
	getstruc(&foo, a,b,c);

which is why (a) I don't recommend that functions return structures in
general and the feature should be avoided, and (b) the feature need
not be super efficient as it should rarely be encountered.  It SHOULD,
however, work correctly in the face of interrupts.  That it sometimes
doesn't for V7 C is such a brain-damaged and imbecilic idea that I
can't believe they are serious about it.  I guess this might be
another reason to avoid the feature, since your program becomes less
safely portable.
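
The two styles compared above can be put side by side. This is only a
sketch: the members of `struct bar` and the names `getstruc_val` and
`getstruc_ptr` are assumptions, since the message gives only `getstruc`.

```c
#include <assert.h>

struct bar { int x; int y; };

/* Style 1: return the structure by value; the result is copied out
 * to the caller and then copied again by the assignment. */
struct bar getstruc_val(int a, int b)
{
    struct bar r;

    r.x = a + b;
    r.y = a - b;
    return r;
}

/* Style 2: the caller supplies the destination and the callee writes
 * directly into the caller's object. */
struct bar *getstruc_ptr(struct bar *out, int a, int b)
{
    assert(out != 0);
    out->x = a + b;
    out->y = a - b;
    return out;
}
```

Both produce the same contents; the second avoids any temporary copy of
the structure, whatever calling sequence the compiler uses.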

Suggesting that "another stack" be used is not a solution.  On sophisticated
systems you can have several different levels and classes of interrupts
pending.  Surely each one should not be responsible for its own stack.
By far the simplest and safest method is to never use anything above the
top of the stack.

I guess the fact I have gone on at length shows I feel strongly about it.
It is very tempting to use the space above the stack, but using just a tiny
bit is like getting just a tiny bit pregnant!

One additional comment.  It may not have been apparent from the code
fragments I sent, but what I was trying to do was coerce C into
passing a two-word structure as argument and return-value WITHOUT
invoking the general structure-handling hacks.  Since the code already
exists to handle "doubles" quite efficiently (with DMOVE etc), I was
hoping to find a way to piggyback other kinds of two-word entities
onto this, by using a union declaration and doing the actual passing
with the "double" type.  If you can arrange it so that structures of
one or two words in length are "special" and moved around like ints
and doubles, that would be really terrific.  I suppose the concept could
be extended even farther depending on how many registers you care to
tie up to hold the return value (overkill department, though).

--Ken
 9-Feb-85 04:01:41-PST,771;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Sat 9 Feb 85 04:01:36-PST
Date: Sat 9 Feb 85 04:00:25-PST
From: Ken Harrenstien <[email protected]>
Subject: More fun with typedef'd unions
To: [email protected]
cc: [email protected]

This code:
	union us {
		double is_str;
		struct { char *is_bp; int is_len; };
	};
	typedef union us IDSTR;
	IDSTR teststr;
	main() {}

Produces this FAIL code:
	<..the usual preliminaries..>

		RELOC
	testst:	BLOCK	777777777777

		RELOC
	main:
		POPJ	17,
	<etc>

Something is wrong of course.  In addition to fixing that, I submit
that as another robustness measure KCC should check before it
accidentally tries to furnish negative arguments to a BLOCK.
-------
 9-Feb-85 04:38:10-PST,785;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Sat 9 Feb 85 04:38:07-PST
Date: Sat 9 Feb 85 04:36:58-PST
From: Ken Harrenstien <[email protected]>
Subject: Another candidate for trivial optimization
To: [email protected]
cc: [email protected]

This is particularly relevant in light of what seems to be an agreement
to pass all floats as doubles.  I note that when a double is pushed on the
stack as an argument to a function, the compiler first DMOVE's the variable
to a pair of registers, and then PUSHes each register separately.  What
it should do is eliminate the DMOVE and simply PUSH the two words directly
from the variable.  KCC is already clever about doing this for integers
and most other one-word stuff.
-------
 4-Feb-85 23:45:21-PST,977;000000000001
Mail-From: DAGONE created at  4-Feb-85 23:45:20
Date: Mon 4 Feb 85 23:45:20-PST
From: Dan Newell  <[email protected]>
Subject: what's to be done with this...
To: [email protected]

what is the sizeof foo? 4 * sizeof(int)? or 0?
====
int foo[] = {1,2,3,4};

int i;

main()
	{
	i = sizeof(foo);
	i++;
	i = sizeof(foo[0]);
	}
====
	This is a direct result of code in ccdecl.c that
	is parsing the foo[]
====
	case LBRACK:
	    nextoken();
	    pp = addpp(pp, pushtype(ARRAY,
				    (token == RBRACK)? 0 : pconst(),
				    NULL));

====
   If it sees the '[' with the ']' following immediately,
then it uses a size of 0. But the array is initialized later
to a definite length. I don't have K&R with me but once I
get to it I will look this one up. I am working on this one
to get the <.MS> suites running.

   It is also doing funny things with arrays of functions
but I need to get past this before I get to them.
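
One way to picture the missing step is a fix-up applied after the
initializer has been parsed. The sketch below is hypothetical:
`struct arrtype`, `complete_array` and `array_sizeof` are illustrative
names and do not reflect the real data structures in ccdecl.c.

```c
#include <assert.h>

/* Hypothetical, simplified array type record; not KCC's real type node. */
struct arrtype {
    int nelem;    /* 0 means "size not given", as in  int foo[]  */
    int elsize;   /* size of one element */
};

/* After the initializer has been parsed, fill in a missing size from
 * the number of initializers seen.  An explicit size is left alone. */
void complete_array(struct arrtype *t, int ninit)
{
    assert(t != 0 && ninit >= 0);
    if (t->nelem == 0)
        t->nelem = ninit;
}

/* sizeof for the array type: element count times element size. */
int array_sizeof(const struct arrtype *t)
{
    return t->nelem * t->elsize;
}
```

With this fix-up, `int foo[] = {1,2,3,4};` would end up with four elements,
so sizeof(foo) would be 4 * sizeof(int) rather than 0.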

    Dan
-------
11-Feb-85 18:41:48-PST,846;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Mon 11 Feb 85 18:41:40-PST
Date: Mon 11 Feb 85 18:40:50-PST
From: Ken Harrenstien <[email protected]>
Subject: Uh-oh, a double bug
To: [email protected]
cc: [email protected]

This code:

	double main(){ return(0); }
	double foo() { return(0.0); }
	double bar() { return((double)0.0); }

produces these instructions:

main:
	SETZ	1,
	POPJ	17,
foo:
	SETZ	1,
	POPJ	17,
bar:
	SETZ	3,
	SETZ	4,
	MOVE	1,3
	MOVE	2,4
	POPJ	17,

Pretty gross, huh?  All of these examples should consist of
	SETZB	1,2
	POPJ	17,
but the first two reveal a serious bug (KCC is only setting 1 word, not both)
and the third produces code which is technically correct but somewhat
absurd.  I expected at worst something like DMOVE 1,[0 ? 0].
-------
 7-Feb-85 06:57:11-PST,920;000000000011
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Thu 7 Feb 85 06:57:09-PST
Date: 7 Feb 1985  09:55 EST (Thu)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Bill Palmer <[email protected]>
Subject: is realcode() injurious to your health?
In-reply-to: Msg of 6 Feb 1985  19:47-EST from Bill Palmer <whp4 at SU-SIERRA.ARPA>

No, realcode() isn't very careful about keeping the data structures intact.
It assumes you're done with that pseudo-op, and hacks it up in certain
cases so that later parts of the routine will use the hacked op
instead of what was there before.  Also there are some optimizations
built into realcode() such as turning IMULI by power of two into ASH
that may or may not be reflected in the buffer when it returns.

Is there some reason for wanting it otherwise?
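
The IMULI-to-ASH rewrite mentioned above is ordinary strength reduction:
a multiply by 2^k becomes a left shift by k. A rough C sketch of the test
involved, with hypothetical names (`log2_exact`, `mul_const`) and unsigned
arithmetic to keep the shift well defined; it says nothing about how
realcode() itself is written:

```c
#include <assert.h>

/* Return k if n == 2^k, otherwise -1. */
int log2_exact(unsigned long n)
{
    int k = 0;

    if (n == 0 || (n & (n - 1)) != 0)
        return -1;              /* zero, or more than one bit set */
    while ((n >>= 1) != 0)
        k++;
    return k;
}

/* Multiply x by the constant c, using a shift when c is a power of two. */
unsigned long mul_const(unsigned long x, unsigned long c)
{
    int k = log2_exact(c);

    return (k >= 0) ? (x << k) : (x * c);
}
```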
14-Feb-85 00:30:27-PST,1155;000000000001
Mail-From: WHP4 created at 14-Feb-85 00:30:15
Date: Thu 14 Feb 85 00:30:15-PST
From: Bill Palmer <[email protected]>
Subject: [Greg Satz <[email protected]>: kcc and link]
To: [email protected]

The modifications described below have been effected.
                ---------------

Mail-From: SATZ created at  7-Feb-85 17:32:07
Date: Thu 7 Feb 85 17:32:07-PST
From: Greg Satz <[email protected]>
Subject: kcc and link
To: [email protected]
Phone: (415) 497-1004

Bill, now that you are the link/exec/kcc expert, it would be nice
to make another change. When kcc is invoked without the EXEC
it would be nice to make it behave more like Unix. In other words,
if I do a "cc foo.c" I should get a file called a.out. Since that
is not really the -20 way, we should at least make a core image.
So unless I do a "cc -c foo.c" I should get a core image that
is either runnable or saveable. The -c flag means generate
a .rel file but not a core image. The -S flag (not -g) means
leave the macro file (fail or macro actually) around after you are
finished. On Unix, if you do a -S and -c only the -S is done.
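
The flag precedence described above can be summarized in a few lines of C.
This is a sketch of the Unix-style behavior only; the enum and the function
name `cc_final_output` are invented here and are not taken from KCC.

```c
#include <assert.h>

enum cc_output {
    OUT_IMAGE,   /* default: link and produce a core image */
    OUT_REL,     /* -c: stop after producing the .rel file */
    OUT_ASM      /* -S: stop after producing the assembler file */
};

/* Decide the last thing the driver produces.  -S takes precedence
 * over -c, as on Unix. */
enum cc_output cc_final_output(int flag_c, int flag_S)
{
    if (flag_S)
        return OUT_ASM;
    if (flag_c)
        return OUT_REL;
    return OUT_IMAGE;
}
```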
-------
-------
14-Feb-85 16:02:26-PST,1117;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Thu 14 Feb 85 16:02:20-PST
Date: Thu 14 Feb 85 14:02:12-PST
From: Ken Harrenstien <[email protected]>
Subject: Loader barfage on undef syms
To: [email protected]
cc: [email protected]

Whenever I am trying to load the results of a compilation (preparatory to
SAVEing it) and there are some undefined symbols, what happens is that the
loader complains

%LNKFLE LOOKUP error (0) file was not found PS:CLIB.REL[4,1230]

and you are left with the distinct impression that something is wrong,
but you're not sure what to do about it.  As it happens, entering the
name of any random .REL file will make it happy, and it will proceed
to tell you about the undefined symbols.  But I sure don't want to
explain this to every naive user!

I recall Greg saying that this is because we are using a FTP'd CC.EXE
rather than building it from scratch on our own system.  ([4,1230] does
not correspond to any valid directory here).  I submit that this is
suboptimal behavior and really ought to be fixable...
-------
14-Feb-85 16:32:00-PST,881;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Thu 14 Feb 85 16:31:43-PST
Date: Thu 14 Feb 85 14:15:05-PST
From: Ken Harrenstien <[email protected]>
Subject: Bug with #include
To: [email protected]
cc: [email protected]

Don't you just love these bug reports.

If you use #include with angle brackets and the file does not
exist in C: then you get the following sort of message:

	Error at line 3 of intest.c:
	#include <intest.h>
	Problem with file -- C:C:intest.h.

I think this error message could be a little more informative (and correct).
I.E. it shouldn't double the C: (makes you think the compiler is looking
in the wrong place to begin with!) and it can certainly say "File not found"
rather than "problem with file" which makes you think that it found the
file but thereby incurred indigestion.
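
A doubled prefix like C:C: usually means the device name was prepended to
a name that already carried it. A minimal sketch of a guard against that,
assuming a hypothetical helper `include_path` and a case-sensitive
comparison (real TOPS-20 file names are not case-sensitive):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "<dev><name>" in out, unless name already starts with dev
 * (e.g. dev "C:" and name "C:intest.h"), in which case name is copied
 * unchanged.  Returns 0 on success, -1 if out is too small. */
int include_path(char *out, size_t outlen, const char *dev, const char *name)
{
    size_t devlen = strlen(dev);
    int n;

    assert(out != NULL && outlen > 0);
    if (strncmp(name, dev, devlen) == 0)
        n = snprintf(out, outlen, "%s", name);
    else
        n = snprintf(out, outlen, "%s%s", dev, name);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

The resulting string is also what a "File not found" message would want to
print, so the user sees the name that was actually looked up.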
-------
14-Feb-85 17:26:20-PST,815;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Thu 14 Feb 85 17:26:13-PST
Date: Thu 14 Feb 85 17:24:35-PST
From: Ken Harrenstien <[email protected]>
Subject: Worse and worse
To: [email protected]
cc: [email protected]

Hmm.  I brought over the latest CC.EXE thinking that the new
hookup with LINK might have solved some problems.  Instead, a worse
problem has been introduced.  If there are undefined symbols, one
gets the same barfage as before, but after typing in some random
.REL file name to make LINK happy, IT NEVER SHOWS THE UNDEFINED SYMBOLS!
Instead it just pretends that it completed loading without any errors.
This is so bad that I am simply going to flush this version and wait
for the next CC, which may interact better with LINK.
-------
15-Feb-85 09:25:17-PST,669;000000000001
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Fri 15 Feb 85 09:25:13-PST
Date: 15 Feb 1985  12:04 EST (Fri)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Ken Harrenstien <[email protected]>
Cc:   [email protected]
Subject: Loader barfage on undef syms
In-reply-to: Msg of 14 Feb 1985  17:02-EST from Ken Harrenstien <KLH at SRI-NIC.ARPA>

You need to rebuild your CLIB.REL from sources.  FAIL insists on
putting absolute PPNs in its requests for library files, so it doesn't
work very well just to copy a CLIB.REL from elsewhere.
15-Feb-85 09:41:49-PST,445;000000000001
Mail-From: SATZ created at 15-Feb-85 09:41:21
Date: Fri 15 Feb 85 09:41:21-PST
From: Greg Satz <[email protected]>
Subject: Re: Loader barfage on undef syms
To: [email protected]
cc: [email protected], [email protected]
In-Reply-To: Message from "David Eppstein <[email protected]>" of Fri 15 Feb 85 12:04:00-PST
Phone: (415) 497-1004

I will look into seeing how much effort it would
take to fix fail.
-------
17-Feb-85 11:29:10-PST,2168;000000000001
Mail-From: SATZ created at 17-Feb-85 11:29:04
Date: Sun 17 Feb 85 11:29:03-PST
From: Greg Satz <[email protected]>
Subject: Fix to Fail for .REQUEST and .REQUIRE
To: [email protected]
Phone: (415) 497-1004

Here is a fix to Fail to allow CLIB.REL to be machine independent.
Before, Fail would generate .REL blocks with the PPN inside of them
and the device field set to PS. Now Fail will use the device field
passed to it and not use a PPN at all. This behavior is consistent
with Macro.

A new KCC, CLIB, and FAIL can be found on Sierra (if it stays up
long enough) as <subsys>cc.exe, c:clib.rel, and sys:fail.exe.
Sources for Fail are in sra:<fail>. Here is a diff in case you don't
want to FTP the whole thing.

^%LBLCK:
.
.
.
TNX,<	PUSHJ	P,TGETFY	;GET TENEX FILE SPEC
	JRST	LIBER1		;CAN'T PARSE NAME
	MOVEI	3,0
	IDPB	3,1		;NULL TO FINISH TEXT STRING
	MOVSI	1,(1B2) 	;OLD ONLY.
	MOVEM	1,GTTBL
	MOVEI	1,[ASCIZ/REL/]	;DEFAULT EXTENSION OF REL.
	MOVEM	1,GTEXT
	MOVEI	1,GTTBL		; GET A JFN FROM THIS MESS.
	HRROI	2,GTNAM		; STRING START THERE.
	GTJFN
	JRST	[OUTSTR	[ASCIZ/CAN'T FIND /]
		 OUTSTR	GTNAM
		 OUTSTR	[ASCIZ/, PASSING FILE NAME TO LINKER
/]
		 MOVSI 	1,(1B12!1B17) ; JUST TRY FOR THE FAKE FILE NAME
		 HRROI	2,GTNAM	; AND LET LINK DO THE REST.
		 GTJFN
		 JRST	LIBER1	; THROW UP OUR HANDS IN DISGUST.
		 JRST	.+1]	; ELSE WIN.
	HRRZM	1,LIBBLK+1	;A SAFE PLACE.
	SETZM	LIBBLK+4	;CLEAR DEVICE FIELD
	MOVE	1,[POINT 7,GTNAM1] ;A PLACE TO PUT DEVICE STRING
	MOVE	2,[POINT 7,GTNAM] ;FROM HERE
LBLCK2:	ILDB	3,2		;GET A BYTE
	IDPB	3,1		;PUT A BYTE
	JUMPN	3,LBLCK2	;UNTIL WE SEE NULL
	PUSHJ	P,MSIX
	CAIN	3,":"		;FIELD TERMINATED BY THIS CHARACTER
	 MOVEM	1,LIBBLK+4	;SAVE DEVICE
T20,<
REPEAT 0,<
	HRROI	1,GTNAM1	;GET DIRECTORY AND DEVICE
	MOVE	2,LIBBLK+1
	MOVE	3,[110000,,1]	;PUNCTUATE PROPERLY
	JFNS
	MOVSI	1,1		;EXACT MATCH
	HRROI	2,GTNAM1	
	RCDIR			;CONVERT TO DIRECTORY NUMBER FOR SPECIFIC STRUCTURE
	TLNE	1,(1b3)		;skip if ok.
	ERROR	[ASCIZ/CAN'T TRANSLATE TO DIRECTORY NUMBER/]
	HRLI	3,4		;THE "PROJECT" PART
	MOVEM	3,LIBBLK+3	;SAVE PPN
>;REPEAT 0
	SETZM	LIBBLK+3	;ZERO PPN
>;T20	
-------
17-Feb-85 12:00:04-PST,597;000000000001
Mail-From: SATZ created at 17-Feb-85 12:00:01
Date: Sun 17 Feb 85 12:00:01-PST
From: Greg Satz <[email protected]>
Subject: Re: Bug with #include
To: [email protected]
cc: [email protected]
In-Reply-To: Message from "Ken Harrenstien <[email protected]>" of Thu 14 Feb 85 16:31:53-PST
Phone: (415) 497-1004

That one was easy. I fixed the #include processing to issue the error:
Can't open file. I also fixed the C:C: problem. For consistency with
Unix, I turned this error into a warning. At least KCC will tell you
that it couldn't open the file; pcc isn't so helpful.
-------
17-Feb-85 12:26:02-PST,2347;000000000001
Return-Path: <MAILER-DAEMON@Berkeley>
Received: from UCB-VAX.ARPA by SU-SIERRA.ARPA with TCP; Sun 17 Feb 85 12:25:59-PST
Received: from SU-SIERRA.ARPA (su-sierra.ARPA.ARPA) by UCB-VAX.ARPA (4.24/4.41)
	id AA09389; Sun, 17 Feb 85 12:24:27 pst
Date: Sun 17 Feb 85 12:25:37-PST
From: MAILER-DAEMON@Berkeley (Mail Delivery Subsystem)
Subject: Returned mail: Host unknown
Message-Id: <[email protected]>
To: <[email protected]>

   ----- Transcript of session follows -----
bad system name: fortune
uux failed. code 68
550 <[email protected]>... Host unknown

   ----- Unsent message follows -----
Received: from SU-SIERRA.ARPA (su-sierra.ARPA.ARPA) by UCB-VAX.ARPA (4.24/4.41)
	id AA09386; Sun, 17 Feb 85 12:24:27 pst
Message-Id: <[email protected]>
Date: Sun 17 Feb 85 12:25:37-PST
From: Greg Satz <[email protected]>
Subject: Re: "C" for the PDP-10
To: fortune!redwood!rpw3@BERKELEY
In-Reply-To: Message from "fortune!redwood!rpw3@Berkeley" of Sat 9 Feb 85 07:47:39-PST
Phone: (415) 497-1004

Sorry for the long delay. The compiler is still in a state of flux and I
needed to check how we could give it to you, if at all.

This compiler was written a while ago by a graduate student here at
Stanford. It was heavily modified by another student who made the output
code production quality. However, other parts of the compiler are still
immature and there are quite a few bugs left.

We plan on distributing it once it is finished, but we don't want
various random versions appearing. If you are willing to:

1) not redistribute the source under any conditions without Stanford's
approval and

2) report back all bug fixes, modifications and enhancements.

we would be willing to give you a copy.

The compiler will generate code for the Stanford Waits operating system.
It used to be TOPS-10 years ago but has since changed drastically. There
are switches which disable code generation of the ADJBP and ADJSP
instructions. I am not sure how well tested that code is. Also, runtimes
will be a small problem. Most of our runtimes will only work in the
TOPS-20 environment and the direction we are going won't ever work on
anything but TOPS-20.

If you are still interested, then get in touch with me either by mail or
phone, (415) 497-1004.
-------
17-Feb-85 12:44:12-PST,734;000000000001
Mail-From: SATZ created at 17-Feb-85 12:05:30
Date: Sun 17 Feb 85 12:05:30-PST
From: Greg Satz <[email protected]>
Subject: predefined manifest constants
To: [email protected]
Phone: (415) 497-1004

What constants do you think KCC should predefine? Here are a few
I thought of. Comments?

tops20 or TOPS20	if the operating system we are compiling
			is a TOPS20 system.

tops10 or TOPS10	if this is TOPS10

waits or WAITS		If this is Stanford Waits

unix or UNIX		KCC cross compiling on a Vax?

tenex or TENEX		Tenex?

kl10 or KL10		KL machine (model A or B or should we care)

ki10 or KI10		This is probably stretching it

vax or VAX		Am I serious?

What about things like DEC20 or DEC10?
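
For illustration, this is how user code would typically consume such
constants. The macro names TOPS20, TOPS10 and WAITS are the proposals from
the list above, not settled definitions, and `target_os` is a made-up
function:

```c
#include <assert.h>

/* Name of the target system, chosen at compile time.  TOPS20, TOPS10
 * and WAITS are proposed (not yet agreed) predefined names. */
const char *target_os(void)
{
#if defined(TOPS20)
    return "TOPS-20";
#elif defined(TOPS10)
    return "TOPS-10";
#elif defined(WAITS)
    return "WAITS";
#else
    return "unknown";
#endif
}
```

Supporting both spellings (tops20 and TOPS20) would double every such
test, which is one argument for picking a single form.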
-------
17-Feb-85 13:28:24-PST,884;000000000011
Mail-From: LOUGHEED created at 17-Feb-85 13:28:20
Date: Sun 17 Feb 85 13:28:20-PST
From: Kirk Lougheed <[email protected]>
Subject: Re: predefined manifest constants
To: [email protected]
cc: [email protected]
In-Reply-To: Message from "Greg Satz <[email protected]>" of Sun 17 Feb 85 12:05:33-PST

The only version of KCC we stand to do a good job of is one that runs
on an extended addressing PDP-10 (KL10 or Mars or whatever) under
TOPS-20 Release 6.X or greater.  Defining manifest constants for
environments we have no experience with is akin to misleading
advertising -- someone might believe that the compiler works on KI10
running TENEX.

I suggest TOPS20 and KL10 (and their variants) for the time being.
As we or other people extend KCC to other machines and operating systems,
then other manifest constants can be defined.

Kirk
-------
17-Feb-85 13:44:19-PST,716;000000000011
Return-Path: <@COLUMBIA-20.ARPA:GINGELL@CWR20B>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Sun 17 Feb 85 13:44:12-PST
Received: from CWR20B by CUCS20 with DECnet; 17 Feb 85 16:42:31 EST
Date: Sun 17 Feb 85 16:42:31-EST
From: Rob Gingell <GINGELL@CWR20B>
Subject: Re: predefined manifest constants
To: Lougheed%SU-SIERRA@CUCS20
cc: SATZ%SU-SIERRA@CUCS20, bug-kcc%SU-SIERRA@CUCS20
In-Reply-To: Message from "Kirk Lougheed <[email protected]>" of Sun 17 Feb 85 16:32:16-EST

I concur with Kirk's comments, but would suggest the addition of a
constant 'PDP10' to signify any machine which conforms to the 
"non-processor-specific" portion of the PDP-10 architecture.

	Rob
-------
17-Feb-85 17:02:25-PST,572;000000000011
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Sun 17 Feb 85 17:02:20-PST
Date: Sun 17 Feb 85 17:01:45-PST
From: Ken Harrenstien <[email protected]>
Subject: SIZEOF doesn't work with initialized arrays
To: [email protected]
cc: [email protected]

Consider the code:
	int footab[10];
	int foosiz = sizeof(footab);

	int bartab[] =  { 1,2,3,4,5,5,6,7,9,10 };
	int barsiz = sizeof(bartab);

"foosiz" will be properly initialized.  "barsiz" will be ZERO!

It took me a long time to track this one down...
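
For comparison, the expected behavior on a conforming compiler: the
initializer list fixes the array length, so sizeof sees the full array.
The accessor names `footab_count` and `bartab_count` are added here only
for illustration.

```c
#include <assert.h>
#include <stddef.h>

int footab[10];
int bartab[] = { 1, 2, 3, 4, 5, 5, 6, 7, 9, 10 };

/* Element counts derived from sizeof; both arrays have ten elements,
 * the second because its initializer list has ten entries. */
size_t footab_count(void) { return sizeof footab / sizeof footab[0]; }
size_t bartab_count(void) { return sizeof bartab / sizeof bartab[0]; }
```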
-------
18-Feb-85 01:01:38-PST,1495;000000000001
Return-Path: <fortune!redwood!rpw3@sri-tsc>
Received: from sri-tsc by SU-SIERRA.ARPA with TCP; Mon 18 Feb 85 01:01:33-PST
Received: by sri-tsc at Sun, 17 Feb 85 23:13:21 pst
From: <fortune!redwood!rpw3@sri-tsc>
Received: by fortune.UUCP (4.12/4.7)
	id AA21434; Sun, 17 Feb 85 19:25:47 pst
Message-Id: <[email protected]>
Date: Sun, 17 Feb 85 19:25:47 pst
To: [email protected]
Subject: Re: C for PDP-10

Conditions (1) no redistribution and (2) you get bug-fixes and enhancements,
etc., seem fine. I'll be calling you early this week.

p.s. If we do get it up with reasonable cleanliness on a 6.03A KA-10, would you
be willing to make it available to others AFTER your formal release of your
version? (Just asking...)

p.p.s. The reason the first return path didn't work is that system "dual"
rewrites its name out of any return paths it passes on to "ucbvax", since
there is software on "ucbvax" that knows how to use "dual" as a sort of
gateway to sites "ucbvax" doesn't talk to directly (Erik Fair maintains both
systems' mailers). UNFORTUNATELY, while "fortune" is in "dual"'s list of hosts,
it's not in that of "ucbvax". Oops! The correct path through "ucbvax" is
"[email protected]".  Another working choice is
"[email protected]".


Rob Warnock
Systems Architecture Consultant

UUCP:	{ihnp4,ucbvax!dual}!fortune!redwood!rpw3
DDD:	(415)572-2607
USPS:	510 Trinidad Lane, Foster City, CA  94404

18-Feb-85 10:54:42-PST,2048;000000000005
Mail-From: SATZ created at 18-Feb-85 10:54:35
Date: Mon 18 Feb 85 10:54:35-PST
From: Greg Satz <[email protected]>
Subject: Fail Block Symbols
To: g.gorin@LOTS-A
cc: [email protected]
Phone: (415) 497-1004

Ralph,

Fail generates block symbols that always type out in DDT as the registers.
Do you think it would hurt to generate 54 rad50 instead of 14 rad50?
This does suppress the type out and Fail was able to recompile itself
without any apparent ill effects.


Date: Mon 18 Feb 85 11:14:23-PST
From: Ralph Gorin <G.GORIN@LOTS-A>

The true home for FAIL maintenance is at SAIL where DDT certainly knows
about FAIL's block-structured symbol table.  One can't help but imagine
that the problem is DDT, not FAIL.  In MACRO assemblies, the program
name is nearly always typed out as AC #1, a further argument that DDT
is broken.

May I suggest that you look at DDT[S,SYS] at SAIL for ideas about
symbol table handling that might be incorporated into DDT for the 20?
Block structure in FAIL (for those programs that use it) is a big
win. I think it'd be wrong to break FAIL in accordance with the
current brain-damaged DDT.

(I wouldn't think such a change would affect anything assembled by
FAIL, but it could further confuse DDT, and possibly confuse
programs that look at the symbol table.)

	Ralph

Date: Mon 18 Feb 85 11:29:27-PST
From: Ralph Gorin <G.GORIN@LOTS-A>

Well, as far as I know, I was the last FAIL maintainer.  The "official"
sources are on SAIL, on [CSP,SYS] I think.  Martin Frost (ME@SAIL) is
the system's programmer.  I suggest that you ask him for permission
to add that to the source at SAIL, with copy to Len Bosack.

By the way, your change seems like a good idea.  There are several places
where that kind of nonsense may go on; you might study the sources further.
Of course the problem arises because both FAIL and LINK operate in the
old way.  Who ever thought that ...    Well, you've got the right idea
about fixing it to pass the logical name to LINK.

	Ralph
18-Feb-85 11:02:10-PST,427;000000000001
Mail-From: SATZ created at 18-Feb-85 11:02:06
Date: Mon 18 Feb 85 11:02:05-PST
From: Greg Satz <[email protected]>
Subject: Re: predefined manifest constants
To: [email protected]
cc: [email protected]
In-Reply-To: Message from "Kirk Lougheed <[email protected]>" of Sun 17 Feb 85 13:28:24-PST
Phone: (415) 497-1004

It turns out that someone wants to bring up KCC on a KA TOPS-10 system!
-------
18-Feb-85 12:43:41-PST,518;000000000001
Mail-From: SATZ created at 18-Feb-85 12:43:35
Date: Mon 18 Feb 85 12:43:35-PST
From: Greg Satz <[email protected]>
Subject: Re: Bizarre "preprocessor" bug
To: [email protected]
cc: [email protected]
In-Reply-To: Message from "Ken Harrenstien <[email protected]>" of Tue 5 Feb 85 06:04:31-PST
Phone: (415) 497-1004

Unmatched #endifs will make KCC issue an error message.

Also, #ifdef foo without matching #endif was only working if foo
was defined. I fixed the case when it wasn't defined.
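
Both fixes come down to tracking conditional nesting whether or not the
condition is true. A simplified sketch of that bookkeeping follows; the
names `is_directive` and `check_conditionals` are hypothetical and this is
not KCC's preprocessor code.

```c
#include <assert.h>
#include <string.h>

/* Does `line` begin with the directive `word` (e.g. "#endif")?
 * Simplification: no leading blanks and no space after '#'. */
static int is_directive(const char *line, const char *word)
{
    return strncmp(line, word, strlen(word)) == 0;
}

/* Scan nlines source lines.  Returns 0 if every #if/#ifdef/#ifndef has
 * a matching #endif, the 1-based line number of an unmatched #endif,
 * or -1 if a conditional is still open at end of input. */
int check_conditionals(const char *const lines[], int nlines)
{
    int depth = 0;
    int i;

    assert(nlines >= 0);
    for (i = 0; i < nlines; i++) {
        if (is_directive(lines[i], "#if")) {      /* #if, #ifdef, #ifndef */
            depth++;
        } else if (is_directive(lines[i], "#endif")) {
            if (depth == 0)
                return i + 1;                     /* unmatched #endif */
            depth--;
        }
    }
    return depth == 0 ? 0 : -1;
}
```

The depth is counted the same way for defined and undefined symbols, so a
missing #endif is caught in either case.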
-------
18-Feb-85 15:21:42-PST,1148;000000000001
Return-Path: <[email protected]>
Received: from Mojave by SU-SIERRA.ARPA with TCP; Mon 18 Feb 85 15:21:36-PST
Received: from SRI-NIC.ARPA by Mojave with TCP; Mon, 18 Feb 85 15:21:08 pst
Date: Mon 18 Feb 85 15:20:30-PST
From: Ken Harrenstien <[email protected]>
Subject: Re: replies
To: [email protected]
Cc: [email protected], [email protected]
In-Reply-To: Message from "Greg Satz <satz@Mojave>" of Sat 16 Feb 85 19:15:20-PST

Sure, STDIO is one of the things I was looking at.  You want me to
tackle it and flesh it out, spiff it up, etc?

Probably my first step would be to split up the existing source files
into manageable modules similar to their UNIX counterparts.  Second,
furnishing the missing routines.  Third, making efficiency improvements.
This last step could involve some restructuring of the way file streams
are managed, although there really is not too much that can be done.  The
modularization will make it easier to substitute assembly language versions
for critical portions (I see no reason not to do so, since the C versions
will always be on hand if KCC's internal conventions ever change).
-------
18-Feb-85 19:15:28-PST,599;000000000001
Mail-From: SATZ created at 18-Feb-85 19:15:24
Date: Mon 18 Feb 85 19:15:24-PST
From: Greg Satz <[email protected]>
Subject: anomolous declaration
To: [email protected]
Phone: (415) 497-1004

The 4.2bsd compiler supports the following declaration:

int foo[];

where foo is an external declaration. The compiler generates code
that reserves zero (0) memory, but the loader thinks that this
symbol is an undefined external. I guess KCC should do the same
thing, but does anyone know how this would be used? It is sort of like
the declaration "extern int foo" but different.
-------
18-Feb-85 21:46:59-PST,700;000000000001
Mail-From: SATZ created at 18-Feb-85 21:46:54
Date: Mon 18 Feb 85 21:46:54-PST
From: Greg Satz <[email protected]>
Subject: Re: SIZEOF doesn't work with initialized arrays
To: [email protected]
cc: [email protected]
In-Reply-To: Message from "Ken Harrenstien <[email protected]>" of Sun 17 Feb 85 17:02:25-PST
Phone: (415) 497-1004

I think I may have fixed this bug. At least your example will
now generate a correct sizeof. I don't think this solution
will handle the case:

struct foo {
	int a:18;
	int b:6;
	int c:6;
	};

struct foo bar[] = {1,2,3};

I think sizeof(bar) will return three instead of one. I didn't have
the heart to test it, but will shortly.
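
For reference, what a conforming compiler should do with this case: with
no inner braces, the three values fill the three fields of the first
element, so the array has one element. The helper name `bar_count` is
added here for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct foo {
    int a:18;
    int b:6;
    int c:6;
};

/* Without inner braces, the three values initialize a, b and c of
 * bar[0], so the array has one element, not three. */
struct foo bar[] = { 1, 2, 3 };

size_t bar_count(void) { return sizeof bar / sizeof bar[0]; }
```

Counting initializers is therefore not enough for aggregates; the count
has to be in units of whole elements.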
-------
18-Feb-85 23:57:56-PST,962;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Mon 18 Feb 85 19:34:53-PST
Date: Mon 18 Feb 85 19:34:17-PST
From: Ken Harrenstien <[email protected]>
Subject: Re: anomolous declaration
To: [email protected], [email protected]
cc: [email protected]
In-Reply-To: Message from "Greg Satz <[email protected]>" of Mon 18 Feb 85 19:17:15-PST

Why not interpret "int foo[];" as "extern int foo[];"?  That is what
is meant, I believe.  An example of how it can be used is for
various modules to #include a header file which declares "extern int vectab[]"
so that code can refer to vectab properly (as an array), and one of the
modules that gets loaded will contain the actual specification of vectab
(typically an initialized array).  This allows you to modify the contents
of the initial array by changing and recompiling just one module, instead
of all of the modules which make up the program.
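
A minimal sketch of that arrangement, squeezed into one file so it
can be compiled as is. The header name vectab.h and the names
vectab_count and vectab_sum are hypothetical, not from any real
program:

```c
#include <assert.h>

/* What the shared header (say "vectab.h") would contain. */
extern int vectab[];		/* an array; size not known here */
extern int vectab_count;	/* hypothetical: number of entries */

/* Any module can index vectab through the declaration alone. */
int vectab_sum(void)
{
	int i, sum = 0;

	for (i = 0; i < vectab_count; i++)
		sum += vectab[i];
	return sum;
}

/*
 * The one module that actually defines the table.  Changing the
 * contents means recompiling only this part.
 */
int vectab[] = { 10, 20, 30 };
int vectab_count = sizeof(vectab) / sizeof(vectab[0]);
```

Only the defining module can use sizeof on the table, which is why
the sketch exports a separate count.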
-------
18-Feb-85 23:58:09-PST,487;000000000001
Received: from LOTS-A by Sierra with Pup; Mon 18 Feb 85 23:11:31-PST
Date: Mon 18 Feb 85 23:07:37-PST
From: Dan Newell  <D.DAGONE@LOTS-A>
Subject: Re: anomalous declaration
To: SATZ@Sierra
cc: bug-kcc@Sierra
In-Reply-To: Message from "Greg Satz <SATZ@Sierra>" of Mon 18 Feb 85 19:13:40-PST

   It's used the same as EQUIVALENCE in FORTRAN, I think.
The linker is supposed to resolve all these undefined externals
to the one who actually allocates memory for it.
	Dan
-------
19-Feb-85 19:24:42-PST,529;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Tue 19 Feb 85 19:24:34-PST
Date: Tue 19 Feb 85 19:23:40-PST
From: Ken Harrenstien <[email protected]>
Subject: Bug in new KCC
To: [email protected]

This happens with version 25 (<SUBSYS>CC.EXE.25) of the compiler,
which I have once again had to bump off:

	char foo[] = "test";

	main() {}

Produces

	Error at line 3 of test.c:
	main(
	Initializer mismatched with variable type -- foo.
	?1 errors detected
-------
19-Feb-85 21:49:37-PST,714;000000000001
Mail-From: SATZ created at 19-Feb-85 21:49:29
Date: Tue 19 Feb 85 21:49:29-PST
From: Greg Satz <[email protected]>
Subject: Re: Bug in new KCC
To: [email protected]
cc: [email protected]
In-Reply-To: Message from "Ken Harrenstien <[email protected]>" of Tue 19 Feb 85 19:24:40-PST
Phone: (415) 497-1004

Unfortunately the fix I made to handle initialization of undefined
size arrays is not general and far from complete. It seems that
KCC doesn't have any code to handle this for any type of variable
and thus needs to be written. I am working on it, but don't have a lot
of time to devote to it at the moment. Expect any undefined size
declaration not to work until you hear further.
-------
21-Feb-85 11:43:00-PST,988;000000000001
Mail-From: SATZ created at 21-Feb-85 11:42:48
Date: Thu 21 Feb 85 11:42:48-PST
From: Greg Satz <[email protected]>
Subject: more foo[]
To: [email protected]
Phone: (415) 497-1004

Could someone please explain to me what int foo[] means as an automatic
variable? The following code compiles on 4.2bsd:

foo()
{
    char foo[], *bar = "hi there";

    foo = bar;
}

What does this mean? K+R seems to say that empty brackets are only
permitted with initializers or when making something external.  It seems
to me that empty brackets without an initializer on an AUTO variable
should be an error. Comments?

I just went over to the vax to play some more. It seems that the foo[]
can mark a place on the stack and is another way of referencing the
same variable. For example:

foo()
{
    int foo;
    int foo2[];

    foo = 1;
    printf("foo2=0%o %d\n", foo2, *foo2);
}

This will print the address of foo and 1. Does anyone rely upon this?
-------
21-Feb-85 13:37:32-PST,1131;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Thu 21 Feb 85 13:37:27-PST
Date: Thu 21 Feb 85 13:35:43-PST
From: Ken Harrenstien <[email protected]>
Subject: Re: more foo[]
To: [email protected], [email protected]
cc: [email protected]
In-Reply-To: Message from "Greg Satz <[email protected]>" of Thu 21 Feb 85 11:54:55-PST

Personally I see no problem with prohibiting unsized AUTO vars.  There
certainly seems to be no use for them.  If you wish to get the address
of something on the stack, you can take its address with "&".  This is
more clear, more portable, and works fine.
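
A minimal sketch of the "&" approach, assuming any standard C
compiler; address_demo is a hypothetical name:

```c
#include <assert.h>

int address_demo(void)
{
	int foo;
	int *foop = &foo;	/* explicit address of a stack variable */

	foo = 1;
	assert(foop == &foo);
	return *foop;		/* reads foo through the pointer */
}
```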

Of course there is some inconsistency because C permits you to have
"implicit external" definitions within a procedure, such as "int foo();".
You could extrapolate this to "int foo[];"  (i.e. foo is going to be
globally declared somewhere else as an integer array, of unknown size).
But from your VAX experimentation it looks like this is not the case.
I guess procedure declarations are special because you know they can never
be defined while inside another procedure!
-------
21-Feb-85 17:40:28-PST,352;000000000001
Mail-From: WHP4 created at 21-Feb-85 17:40:23
Date: Thu 21 Feb 85 17:40:23-PST
From: Bill Palmer <[email protected]>
Subject: a few more exec and kcc bugfixes
To: [email protected]


1)  The exec shouldn't try to pass PPNs to kcc now.  

2)  KCC is more intelligent about what parts of filenames to hand to LINK.

					Bill
-------
21-Feb-85 03:39:13-PST,1510;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Thu 21 Feb 85 03:38:39-PST
Date: Thu 21 Feb 85 03:31:44-PST
From: Ken Harrenstien <[email protected]>
Subject: Re: Bug in new KCC
To: [email protected]
cc: [email protected]
In-Reply-To: Message from "Greg Satz <[email protected]>" of Thu 21 Feb 85 00:44:07-PST

Unless the loader is very smart, I don't think you can expect sizeof
to work on external references to things of unknown size.  The important
word here is "external".  If the thing is actually defined in some way in
the current file, then sizeof should work.  It is only if the size/contents
really are furnished by an external file that you have to punt.
Thus char foo[]="abc"; is okay, as is extern char foo[]="abc"; BUT
extern char foo[]; is of unknown size, and so is simply char foo[];.
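
A minimal sketch of the distinction, assuming a standard C compiler;
the names here, elsewhere and here_size are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

char here[] = "abc";		/* defined in this file: size is known */
extern char elsewhere[];	/* contents furnished by another file */

size_t here_size(void)
{
	return sizeof(here);	/* 'a', 'b', 'c' and the terminating null */
}

/*
 * sizeof(elsewhere) is not written anywhere in this file: the array
 * type is incomplete here, so a compiler has to reject it.
 */
```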

I don't have my C manual with me, but seem to recall that C makes some
assumptions about when a variable is external or not.  It is obviously
better to always make an explicit "external" declaration, but for
compatibility I guess KCC needs to know about the defaults.  It is a bit
messy.

Expecting the user to know that sizeof doesn't work for externals is
like expecting assembly language users to understand what 1-pass and
2-pass processing implies.  Once you understand it, it makes sense, but
documentation never seems to get around to explaining the basic facts of life.

I'm not sure if I make sense myself this late.
-------
24-Feb-85 09:54:49-PST,359;000000000011
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Sun 24 Feb 85 09:54:39-PST
Date: Sat 23 Feb 85 16:05:21-EST
From: David Eppstein <[email protected]>
Subject: Should the following compile?
To: [email protected]

main()
{
	int x,y;
#define x y
	x = 3;
#undef x
	x = 3;
}
-------
25-Feb-85 09:01:13-PST,1800;000000000001
Return-Path: <[email protected]>
Received: from SU-SCORE.ARPA by SU-SIERRA.ARPA with TCP; Mon 25 Feb 85 09:00:50-PST
Date: Mon 25 Feb 85 01:09:32-PST
From: Greg Satz <[email protected]>
Subject: Extended Addressing Comparison
To: [email protected]

Someone asked me the differences in speed of a KCC compiled program when
run under extended addressing and normal addressing. I found a program
called bench which has the following tests.

The first run is with normal addressing. The second is with extended
addressing where code is in section one, stack in section two, and data
in section three. The last run was done on a 4.2bsd Unix system just for
grins.

Perm	a permutation routine
Towers	Solves Towers of Hanoi
Queen	8 Queens
Intmm	Matrix multiply, integer
Mm	Matrix multiply, float
Puzzle	A compute-bound program from Forest Baskett.
Quick	Quick sort
Tree	Tree sort
Bubble	Bubble sort
FFT

@bench
   Perm  Towers  Queens   Intmm      Mm  Puzzle   Quick  Bubble    Tree     FFT
    801     817     465    1011     994    5059     567     605    1885    1526

Nonfloating point composite is      1741.

Floating point composite is      2466.
@get (PROGRAM) bench.EXE.1 
@depOSIT (MEMORY LOCATION) $exadf (contents) 1
 [Shared] 
@st
   Perm  Towers  Queens   Intmm      Mm  Puzzle   Quick  Bubble    Tree     FFT
    756     807     439     951     962    5071     610     620    1948    1516

Nonfloating point composite is      1740.

Floating point composite is      2456.

% bench
   Perm  Towers  Queens   Intmm      Mm  Puzzle   Quick  Bubble    Tree     FFT
   2440    2410     990    1730    2040   11240    1160    1680    2830    3880

Nonfloating point composite is       3779

Floating point composite is       5518
-------
25-Feb-85 09:14:29-PST,773;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Mon 25 Feb 85 09:14:17-PST
Date: Sun 24 Feb 85 18:03:35-PST
From: Ken Harrenstien <[email protected]>
Subject: Re: Should the following compile?
To: [email protected], [email protected]
cc: [email protected]
In-Reply-To: Message from "David Eppstein <[email protected]>" of Sun 24 Feb 85 09:56:29-PST

It should.  #-statements are in theory supposed to be handled by a preprocessor
before the C compiler proper sees anything.  So the fact that the string "x"
is already mentioned in a declaration doesn't mean anything, because that
is C code, which the preprocessor pays no attention to.  #define'd symbols
and C symbols are separate things.
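
A minimal sketch of the same point that can be checked at run time,
assuming a standard preprocessor; macro_demo and its two output
parameters are hypothetical:

```c
#include <assert.h>

int macro_demo(int *xp, int *yp)
{
	int x = 0, y = 0;
#define x y
	x = 3;		/* the preprocessor turns this into: y = 3; */
#undef x
	*yp = y;
	x = 4;		/* no macro any more: this is the variable x */
	*xp = x;
	return 0;
}
```

The second assignment uses 4 rather than 3 only so that the two
stores can be told apart.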
-------
20-Feb-85 18:08:42-PST,1864;000000000001
Return-Path: <@COLUMBIA-20.ARPA:GINGELL@CWR20B>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Wed 20 Feb 85 18:08:36-PST
Received: from CWR20B by CUCS20 with DECnet; 20 Feb 85 21:07:00 EST
Date: Wed 20 Feb 85 21:07:05-EST
From: Rob Gingell <GINGELL@CWR20B>
Subject: PAUNIX Update
To: Satz%SU-SIERRA@CUCS20

Been a while since we talked, thought I'd give you a little update.

I got pretty far into the terminal driver emulation stuff, and found out
I blew some things.  In frustration, I started working on fork(), which
seemed easier by comparison.  Thus, PAUNIX now has about 3/4's of both
fork() and tty(4) working and the next toss of it will hold both of them.
I want to do that toss pretty soon, maybe early next week.  I've been
hampered a little by bringing up TOPS-20 6.1 here, it's been a real
chore.  We have something akin to your Ethertip, but it runs over 
regular DECnet, and does things like execute BIN% on the PDP-11, however
to make it all efficient it used internal entry points in DECnet.  The
big thing in 6.1?  Rewritten DECnet -- sigh.

"you win some -- you lose some".

Also, do you and Dave Eppstein keep your copies of KCC generally in sync?
I've been grabbing KCC from Columbia, but it seems from recent posts that
you've been applying all the fixes.  My interest stems from a) I'm slowly
turning the Bourne shell into a program that conforms to the C specification
and thus is compilable by KCC; and b) a couple of people here are working
on things I'd like to see done in C and I'd like to get them started with
the most stable KCC I can find.

Finally, do you remember Laura Nekola?  She liked California (and her boy-
friend who lives out there) so much she decided to leave here and go work
there.  You may be hearing from her, if you haven't already.

Take care,

	Rob
-------
25-Feb-85 19:50:39-PST,412;000000000001
Mail-From: SATZ created at 25-Feb-85 19:50:32
Date: Mon 25 Feb 85 19:50:32-PST
From: Greg Satz <[email protected]>
Subject: Scope bug fixed
To: [email protected]
Phone: (415) 497-1004

The problem with

{
    int i = 5;
    {
	int i = 10;
    }
}

hopefully has been fixed. I had to rewrite the symbol table routines
to use a doubly linked list instead of a fixed array (heap).
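
A minimal sketch of the behavior the fix is meant to give, assuming
standard C scoping rules; scope_demo is a hypothetical name:

```c
#include <assert.h>

int scope_demo(void)
{
	int i = 5;
	int seen_inside;
	{
		int i = 10;	/* new variable; hides the outer i here */
		seen_inside = i;
	}
	assert(seen_inside == 10);
	return i;		/* the outer i is untouched */
}
```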
-------
25-Feb-85 20:01:05-PST,5261;000000000005
Return-Path: <[email protected]>
Received: from SU-SCORE.ARPA by SU-SIERRA.ARPA with TCP; Mon 25 Feb 85 20:00:43-PST
Date: 25 Feb 1985 19:58-PST
Sender: [email protected]
Subject: Symbol tables...
From:  William "Chops" Westfield <[email protected]>
To: [email protected]
Message-ID: <[SU-SCORE.ARPA]25-Feb-85 19:58:40.BILLW>

How difficult would it be to have the C compiler hash symbols that
didn't fit in FAIL/LINK into $nnnnn or whatever?  One of the major
remaining incompatibilities is symbol length and case significance.
True, this would make debugging harder - something like Kashtan's
method of leaving them alone if possible should be used...

BillW

Date: Mon 25 Feb 85 20:37:12-PST
From: Ken Harrenstien <[email protected]>

An interesting idea, but it would only work well for self-contained
programs.  Given the universal use of the C library functions, there are
no self-contained C programs.  Thus all of your externals need to be
hashed in the same way, so they will match up with the externals in
the libraries you have.  Then you get REALLY screwed when your hash
algorithm produces a conflict, because you don't have any good way of
determining, from the LINK multiply-defined error message, what C symbols
are actually in conflict.

An alternative suggestion: make use of block-structured symbols.  Both
FAIL and MIDAS support this, and IDDT seems to do reasonable things
when used on programs with such symbols.  I don't know about MACRO
as my manual is not here.  Anyway, the basic idea would be to divvy up
an over-long symbol into 6-character chunks; all but the last chunk would
be a block name.  get_next_character() would translate to
GET.NE"XT.CHA"RACTER:  ... an abomination of sorts, but it guarantees
that you know what is going on at all times if you insist on using longish
names.  Now, if you also insist on handling case distinction, you can prefix
all upper-case letters with $.  This makes some symbols a bit longer, but
with our new technique for handling verbosity, this is okay.  Thus
Get_Next_Char() becomes $GET.$"NEXT."$CHAR: and all is well, except
possibly in the mind of the programmer.
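
A minimal sketch of the transformation just described, under stated
assumptions: _ becomes ., upper-case letters get a $ prefix,
everything else is upper-cased, and the result is cut strictly every
6 characters with a " between chunks.  The function name mangle and
its interface are hypothetical; nothing here is taken from KCC:

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define CHUNK 6		/* characters per block name */

/*
 * Hypothetical helper: write the mangled form of the C identifier
 * `name` into `out` (capacity `outsize`).  Returns 0 on success,
 * -1 if a buffer is too small.
 */
int mangle(const char *name, char *out, size_t outsize)
{
	char flat[256];
	size_t n = 0, i, o = 0;

	/* Pass 1: map each character. */
	for (; *name != '\0'; name++) {
		unsigned char c = (unsigned char)*name;

		if (n + 2 >= sizeof(flat))
			return -1;
		if (c == '_') {
			flat[n++] = '.';
		} else if (isupper(c)) {
			flat[n++] = '$';
			flat[n++] = (char)c;
		} else {
			flat[n++] = (char)toupper(c);
		}
	}

	/* Pass 2: emit 6-character chunks separated by a double quote. */
	for (i = 0; i < n; i++) {
		if (i > 0 && i % CHUNK == 0) {
			if (o + 1 >= outsize)
				return -1;
			out[o++] = '"';
		}
		if (o + 1 >= outsize)
			return -1;
		out[o++] = flat[i];
	}
	if (o >= outsize)
		return -1;
	out[o] = '\0';
	return 0;
}
```

With these rules get_next_character comes out as
GET.NE"XT.CHA"RACTER.  Strict 6-character chunking turns
Get_Next_Char into $GET.$"NEXT.$"CHAR, which places the second quote
one character later than the hand-written example above.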

Personal rant: I think anyone who uses case distinction, or even mixes
upper and lower case within a symbol, should be dumped on some faraway
Pacific island and left there.  Ditto those who use 20-char symbols.
I grant 6 is a trifle small when dealing with large collections of routines,
but getc is still so much handier than Get_Next_Character_From_File_Stream.


Date: 26 Feb 1985  15:21 EST (Tue)
From: David Eppstein <[email protected]>

LINK supposedly supports long symbol names.  The problem is that
neither FAIL nor MACRO knows about them.  So if we were producing REL
files rather than assembly we could at least tell DEC about the bugs
in that part of LINK.  Another possibility is to teach FAIL about long
symbols so that KCC can send them to it...

If we come up with some convention for flagging upper case in symbol
names (e.g. with $) we should be careful not to conflict with the
internal routines (like $SPUSH, $ADJBP, etc) that also use $.  On
similar lines, perhaps ..STRT and .START should be renamed to $EVEC
and $START in case someone wants to use __start or _start in a program?


Date: Tue 26 Feb 85 15:38:07-PST
From: Ken Harrenstien <[email protected]>

Either $ or % should be fine for KCC-internal symbols since there is no
way (that I know of) to create a C symbol with those characters.  (_ is
changed to .)  In order to avoid any possible conflict with user-defined
symbols if one of them is used for internal routine addresses, just make
sure that you DOUBLE it for the latter purpose.  For example, use $$SPSH
instead of $SPUSH, $$ADJB instead of $ADJBP, etc.

Don't forget that (I)DDT needs to know about the long symbols too.  I
can't find any documentation in the LINK manual about the format of
the resulting symbol table (what .JBSYM points to).  If the table
hasn't changed, obviously it is pointless to talk about using long
symbols (what good are they if you can't debug them).  But if it has
changed, someone will have to grovel around and find out what the new
format is supposed to be.

This is an awful lot of work.  On the other hand, a few tests should
quickly reveal whether block structure works or not.


Date: Tue 26 Feb 85 16:44:38-PST
From: Greg Satz <[email protected]>

I need to print out and read through the FAIL manual. I am not
really familiar with FAIL's block structure so I won't comment
about that except to say it is a good idea.

As for debuggers, it will always be useful to have [I]DDT work
with KCC output; however, we need a real source level debugger
along the lines of the Pascal debugger. Any takers?


Date: Wed 27 Feb 85 17:56:00-PST
From: Greg Satz <[email protected]>

	From: David Eppstein <[email protected]>
	Subject: Symbol tables...
	
	On similar lines, perhaps ..STRT and .START should be renamed to
	$EVEC and $START in case someone wants to use __start or _start
	in a program?

This has been done (except I used $$STRT). If you get binaries, make
sure you get both the library and compiler this time.
26-Feb-85 06:55:21-PST,1383;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Tue 26 Feb 85 06:55:08-PST
Date: Tue 26 Feb 85 06:35:58-PST
From: Ken Harrenstien <[email protected]>
Subject: Subtly vicious bug
To: [email protected]
cc: [email protected]

The following code is buggy:
	------------
struct ts { int tsi[20]; };
tsdo(tp)
struct ts *tp;
{	struct ts tstab[1];
	tstab[0] = tp;
}
foo() { return(0); }
morefoo(a) { return(a*a); }
	-------------
When compilation is attempted, this happens:
	@cc test
	KCC:    test

	Error at tsdo+5, line 7 of test.c:
	foo(
	Unsupported type coercion.
	?1 errors detected
	@

The problem is that the error message does NOT happen at the guilty
statement, namely "tstab[0] = tp;".  Instead it happens at the end of
the procedure (tsdo).  This makes it mighty puzzling to track down,
especially when there are many, many lines of code between the statement and
the end of the procedure!!!  Furthermore, not a single one of the following
procedures is compiled, even though the symbols are noticed and declared
internal in the resulting FAIL code.

It took me about an hour just to whittle down the original program to
the test case you see here.  If I hadn't been able to try compiling it
on a 4.2 system (which pinpointed the guilty statement), it would have
taken a lot longer...
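
For reference, one plausible correction of the guilty statement,
assuming the intent was to copy the structure that tp points to (the
message does not say).  The return value is added only so the sketch
can be checked:

```c
#include <assert.h>

struct ts { int tsi[20]; };

int tsdo(struct ts *tp)
{
	struct ts tstab[1];

	assert(tp != 0);
	tstab[0] = *tp;		/* copy the structure, not the pointer */
	return tstab[0].tsi[19];
}
```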
-------
26-Feb-85 07:13:01-PST,2666;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Tue 26 Feb 85 07:12:55-PST
Date: Tue 26 Feb 85 07:11:38-PST
From: Ken Harrenstien <[email protected]>
Subject: KCC sloppiness in type checking
To: [email protected]
cc: [email protected]

I never thought I'd be complaining about this, because most of the
time when a compiler barfs about illegal type combinations it is just
being overly picky and annoying.  However, for KCC this is serious:
	-----------
main()
{	printf("String %s\n", foo(1));
	
}
char *
foo(a)
{	char *cp;
	cp = bar(a);
	return(cp);
}
bar(a) { return(a*a); }
	-----------

There are two problems exhibited here.  The first problem is that
although "foo" is implicitly declared as "int" by its first appearance,
the explicit declaration as "char *" does NOT produce any warning
message about a type redeclaration.  It should; 4.2 CC does.
In this particular example, no serious harm is done, but read on...

In the second oversight, bar is implicitly "int", but is being
assigned to a char pointer.  This should definitely cause an error
message; 4.2 CC barfs.  Now here we have a potential run-time crash
because "bar" really is returning an integer.

The reason type checking is so important for KCC is because char
pointers are really byte pointers; they are nothing like integers, or
even like pointers to any other kind of thing.  There is lots of old C
code that sort of assumes they can all be considered equivalent,
especially when passing things around as arguments and stuff.
However, because KCC actually has to perform various crunchings to
convert from one form to another, you can be unpleasantly surprised
when a char pointer is cast to an integer (or vice versa) without your
knowledge, and just as surprised when the crunching should have been
done but wasn't.  I've been screwed a couple of times this way.

P.S. Not connected with the above bug(s): I would guess that the most
common way of fouling up is when calling a subroutine which wants
a "general pointer" as argument, and relies on using (char *) as this
generalized pointer.  QSORT is a good example of this, as are FREE
and REALLOC.  The screw happens when the user provides an argument that
is a pointer to something other than char, without explicitly casting it.
There isn't really anything KCC can do about this -- it is a general C
problem -- but so far KCC is the only compiler I know of which is
affected, because (char *) and (anything-else *) are so different.
All the more reason for being as picky as possible about those things
that CAN be caught.
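
A minimal sketch of the explicit-conversion habit being argued for.
It is written for a later, ANSI-style compiler, where the generic
pointer is (void *) rather than (char *); cmp_int and sort_ints are
hypothetical names:

```c
#include <assert.h>
#include <stdlib.h>

/* Comparison routine: convert the generic pointers back explicitly. */
static int cmp_int(const void *a, const void *b)
{
	int x = *(const int *)a;
	int y = *(const int *)b;

	return (x > y) - (x < y);
}

/* The caller states the pointer conversion instead of leaving it implicit. */
void sort_ints(int *v, size_t n)
{
	assert(v != 0 || n == 0);
	qsort((void *)v, n, sizeof(v[0]), cmp_int);
}
```

On a machine where the two pointer formats differ, that cast is the
point at which the compiler gets a chance to do the conversion.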
-------
27-Feb-85 13:24:14-PST,773;000000000001
Mail-From: SATZ created at 27-Feb-85 13:24:10
Date: Wed 27 Feb 85 13:24:10-PST
From: Greg Satz <[email protected]>
Subject: Re: Should the following compile?
To: [email protected]
cc: [email protected]
In-Reply-To: Message from "David Eppstein <[email protected]>" of Sun 24 Feb 85 09:54:49-PST
Phone: (415) 497-1004

	Date: Sat 23 Feb 85 16:05:21-EST
	From: David Eppstein <[email protected]>
	Subject: Should the following compile?
	
	main()
	{
		int x,y;
	#define x y
		x = 3;
	#undef x
		x = 3;
	}

It will now. I added a new routine called findasym() which does exactly
what findsym() does but also takes a symbol class to look for. All
routines now look for macros specifically instead of haphazardly.
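
A rough sketch of the idea, not the KCC source: the structure layout,
the symbol classes, and the function signatures below are all
assumptions made for illustration, and the functions carry a _sketch
suffix to keep them apart from the real routines:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum symclass { SC_MACRO, SC_AUTO, SC_EXTERN };	/* hypothetical classes */

struct symbol {				/* hypothetical table entry */
	const char *name;
	enum symclass sclass;
	struct symbol *prev, *next;	/* doubly linked list */
};

/* First entry with this name, whatever its class. */
struct symbol *findsym_sketch(struct symbol *head, const char *name)
{
	struct symbol *s;

	assert(name != NULL);
	for (s = head; s != NULL; s = s->next)
		if (strcmp(s->name, name) == 0)
			return s;
	return NULL;
}

/* Same walk, but only an entry of the requested class matches. */
struct symbol *findasym_sketch(struct symbol *head, const char *name,
			       enum symclass sc)
{
	struct symbol *s;

	assert(name != NULL);
	for (s = head; s != NULL; s = s->next)
		if (s->sclass == sc && strcmp(s->name, name) == 0)
			return s;
	return NULL;
}
```

With a macro x and a variable x both in the table, a lookup by class
finds the right one regardless of which was entered first.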
-------
 2-Mar-85 16:05:02-PST,787;000000000001
Mail-From: SATZ created at  2-Mar-85 16:04:54
Date: Sat 2 Mar 85 16:04:54-PST
From: Greg Satz <[email protected]>
Subject: runtimes and preprocessor
To: [email protected]
Phone: (415) 497-1004

CC now understands the -E flag. This will just run the preprocessor over
the .c file with the output going into the same filename with a .i
extension. This should make getting lint running a little easier.

C programs will now expand wild card file names. They will also treat
characters between quotes as a single token.

I hacked up an Emacs tags generator for .c files. Source is in
<kcc.unix.src>ctags.c and binary in <kcc.unix.bin>ctags. It seems to
work fairly well except tags uses a substring search which sometimes
results in finding a wrong routine.
-------
 1-Mar-85 04:10:13-PST,4041;000000000011
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Fri 1 Mar 85 04:10:07-PST
Date: Fri 1 Mar 85 04:08:23-PST
From: Ken Harrenstien <[email protected]>
Subject: stdio (and other ruminations)
To: [email protected]
cc: [email protected]
In-Reply-To: Message from "Greg Satz <[email protected]>" of Thu 28 Feb 85 19:40:56-PST

4.2, huh.  Trouble is, I don't have a 4.2 manual and don't, at the
moment, know what frobs it has which the traditional V7 (K&R) stdio
doesn't.  My sights were only on the latter.  Well, I will check with
Gus, and see how soon I can dig one up.  I guess I could also look at
the 4.2 stdio sources, but am afraid it might influence my work
unduly.  Safer, and more fun, to do it from the descriptions.

	The only thing that isn't straightforward is how to resolve the
LF/CRLF question.  My feeling is that the transformation is best done
at the point closest to system readin/writeout.  In other words, the
STDIO buffers will always exist in "UNIX" form, and it won't be until
it's time to actually force a buffer out (or refill it) that anything
unusual is done to it.  This would normally involve a total re-copy of
the STDIO buffer to and from a temporary buffer (on stack?) which does
conversion of LF to CRLF and vice versa along the way.  This approach
has some overhead, but it is much less than the current overhead of
always calling a function for the simplest getc/putc operations (glork!)
Come to think of it, on TOPS-20 conversion-copy could even use one of the
fancy DEC-20 string move instructions, although it would have to be
tested to make sure it was faster than plain vanilla instructions.
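
	A minimal sketch of such a conversion-copy in portable C (no
string instructions).  The function names are hypothetical, and the
caller is assumed to supply a large enough temporary buffer:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy n bytes from the "UNIX form" buffer src into dst, writing
 * CR LF for every LF.  dst must have room for 2*n bytes.
 * Returns the number of bytes written.
 */
size_t lf_to_crlf(const char *src, size_t n, char *dst)
{
	size_t i, o = 0;

	assert(src != NULL && dst != NULL);
	for (i = 0; i < n; i++) {
		if (src[i] == '\n')
			dst[o++] = '\r';
		dst[o++] = src[i];
	}
	return o;
}

/*
 * The reverse direction: drop a CR that is immediately followed by
 * LF.  dst must have room for n bytes.  A CR that is the last byte
 * of one buffer, with its LF in the next, is not handled here.
 */
size_t crlf_to_lf(const char *src, size_t n, char *dst)
{
	size_t i, o = 0;

	assert(src != NULL && dst != NULL);
	for (i = 0; i < n; i++) {
		if (src[i] == '\r' && i + 1 < n && src[i + 1] == '\n')
			continue;
		dst[o++] = src[i];
	}
	return o;
}
```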

	However, I'm not sure whether the conversion should be done by
STDIO or by the read/write "system calls".  At the moment, my
inclinations are to mung the calls, since this is a more general
solution, and helps make STDIO itself more portable.  The calls
already need to maintain some state information for each "file
descriptor", and that is a logical place to include a "convert" flag
(after all, there will be some FD's for which conversion is not
desired, and presumably a way to indicate this).  I am worried that
trying to use PAUNIX may render this approach unworkable.  For all I know
it already provides this conversion feature, but if it doesn't, we
need to do something.

	On TNX systems, PMAP I/O is faster than SIN/SOUT, and
at some point read/write should probably use this when it is feasible
(it can't be done for a pipe JFN for example).  This raises the possibility
of losing some buffered output when the program dies suddenly without
taking a normal exit.  However, it won't help even if there was one
SOUT for each write(), because of the stupid TNX feature of forgetting
all about a file if you reset the jfn instead of closing it properly.
So any UNIX-type program that does its own writes and expects them to
make it out even in the event of its untimely demise (as is usual on UNIX)
is going to lose, and will need some attention no matter what read/write
tries to do.

	Incidentally, there are certain problems with fseek/lseek when
you think about it.  If you rely on counting the number of characters
you have output in order to determine where you should seek back to,
or if you think seeking to EOF will tell you how many characters you
can read from the file, you're in trouble.  Unfortunately, some
programs like ELLE do exactly this.  There just is no way to
make fseek/lseek work right for FDs that need conversion, except for
the trivial cases of BOF/EOF.  Perhaps attempting to do a non-trivial
lseek on a conversion-flagged FD should cause a seek error, so that
these programs can be rooted out easily without subjecting their users
to extremely mysterious malfunctions.  Yes, I think this should be done.
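
	A minimal sketch of that check, assuming some per-descriptor
state with a conversion flag.  The structure fdstate and the function
seek_allowed are hypothetical:

```c
#include <assert.h>
#include <stdio.h>	/* SEEK_SET, SEEK_CUR, SEEK_END */

struct fdstate {	/* hypothetical per-descriptor state */
	int convert;	/* nonzero: LF/CRLF conversion is on */
};

/* Return 0 if the seek may proceed, -1 if it should be refused. */
int seek_allowed(const struct fdstate *fd, long offset, int whence)
{
	assert(fd != NULL);
	if (!fd->convert)
		return 0;	/* plain FD: any seek is fine */
	if (offset == 0 && (whence == SEEK_SET || whence == SEEK_END))
		return 0;	/* BOF and EOF are the trivial cases */
	return -1;		/* position would not match the byte count */
}
```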

	Well, in spite of our hopes, sometimes we have to admit that
there is some code that just cannot be made portable without a bunch
of conditionals.
-------
 2-Mar-85 16:23:03-PST,1653;000000000011
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Sat 2 Mar 85 16:23:00-PST
Date: Sat 2 Mar 85 16:21:14-PST
From: Ken Harrenstien <[email protected]>
Subject: fast-copy addition to libc?
To: [email protected]
cc: [email protected]

(I think I may have mentioned this before)

There is a need for a LIBRARY-supported routine that can be used for
fast copying.  Not only is this a real common function, the exact
best implementation is highly machine-dependent, which makes it an ideal
candidate for inclusion in the C library.

I realize the chances of achieving a true addition to the standard LIBC are
minuscule, but that does not mean we cannot proceed on our own.  Since
the code will be public, it can be obtained along with any program that
uses it; besides, its function is trivial to program if you just want
it to work and don't care about efficiency.

The question is, what to call it?  Does 4.2BSD already have something of
this sort, i.e. is it already in existence?

There is an interesting screw case involved with support of this kind
of thing on KCC, because (char *) byte pointers are not always of
bytesize 9.  This means that code which tries to optimize the
word-aligned case by using (int) copies instead of byte copies will
lose badly.  Thus it has to know about byte pointers in order to avoid
disaster; another argument for providing such a routine, since it can
then be made very machine-dependent without requiring any user to know
anything about it.  (This all came up while I was looking at ELLE to see
how much of it would still work with KCC.)
-------
 2-Mar-85 16:52:00-PST,818;000000000001
Mail-From: SATZ created at  2-Mar-85 16:52:00
Date: Sat 2 Mar 85 16:51:59-PST
From: Greg Satz <[email protected]>
Subject: Re: fast-copy addition to libc?
To: [email protected]
cc: [email protected]
In-Reply-To: Message from "Ken Harrenstien <[email protected]>" of Sat 2 Mar 85 16:23:03-PST
Phone: (415) 497-1004

4.2bsd has byte zero, copy, and compare routines:

bzero(array, n)
char *array;
int n; /* number of bytes in array to zero */

bcopy(from, to, n)
char *from, *to;
int n; /* number of bytes */

bcmp(array1, array2, n)
char *array1, *array2;
int n;

There is a manual page for these in 4.2 under byte manipulation. We may
want to support them as well as izero, icopy, and icmp for full words
where a blt could be used. The byte routines could call the word
routines.
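
A minimal sketch of the "just want it to work" versions of the three
byte routines, with the argument order shown above.  The k_ prefix is
a hypothetical choice to avoid clashing with a system library, and the
copy makes no attempt to handle overlapping arrays:

```c
#include <assert.h>

/* Zero n bytes starting at array. */
void k_bzero(char *array, int n)
{
	assert(array != 0 || n <= 0);
	while (n-- > 0)
		*array++ = 0;
}

/* Copy n bytes; note the source comes first. */
void k_bcopy(const char *from, char *to, int n)
{
	while (n-- > 0)
		*to++ = *from++;
}

/* Return 0 if the first n bytes are identical, nonzero otherwise. */
int k_bcmp(const char *array1, const char *array2, int n)
{
	while (n-- > 0)
		if (*array1++ != *array2++)
			return 1;
	return 0;
}
```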
-------
 2-Mar-85 16:59:40-PST,661;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Sat 2 Mar 85 16:59:39-PST
Date: Sat 2 Mar 85 16:57:53-PST
From: Ken Harrenstien <[email protected]>
Subject: Re: runtimes and preprocessor
To: [email protected]
cc: [email protected]
In-Reply-To: Message from "Greg Satz <[email protected]>" of Sat 2 Mar 85 16:52:32-PST

Hmmm.  I guess that means KCC should have a similar behavior (do it after
warning) for -P.  I can't think of any reason why a switch name should
change either; they are all basically meaningless as far as mnemonic value
goes.

(yet another message... my aren't we busy today)
-------
 2-Mar-85 17:07:28-PST,463;000000000001
Mail-From: WHP4 created at  2-Mar-85 17:07:27
Date: Sat 2 Mar 85 17:07:27-PST
From: Bill Palmer <[email protected]>
Subject: now that ctags works
To: [email protected]

You could make ctags understand all the other languages that emacs:tags
understands so that you could have tags files for programs/systems that
aren't written entirely in C.

On the other hand, it might be easier to write a program that merged
tags files.

					Bill
-------
 2-Mar-85 17:24:26-PST,1458;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Sat 2 Mar 85 17:24:23-PST
Date: Sat 2 Mar 85 17:22:38-PST
From: Ken Harrenstien <[email protected]>
Subject: Re: fast-copy addition to libc?
To: [email protected]
cc: [email protected]
In-Reply-To: Message from "Greg Satz <[email protected]>" of Sat 2 Mar 85 16:57:20-PST

As long as KCC uses 7-bit constant strings, the 9-bit assumption is
not very safe.  But fortunately there is an easy test for this!
We just compare the appropriate BP fields together and if they are
identical then word moves are kosher.  This checks both size and
alignment at one shot.  So I think we can live well with b*.
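
A minimal sketch of that test, modelling a 36-bit byte pointer in the
low bits of a 64-bit integer.  It assumes the usual one-word byte
pointer layout (6-bit position and 6-bit size fields at the top of the
word); the names word36, make_bp and word_moves_ok are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t word36;	/* low 36 bits model one PDP-10 word */

/* Position (6 bits) and size (6 bits) are the top 12 of the 36 bits. */
#define BP_PS(bp)	(((bp) >> 24) & 07777)

/* Build a byte pointer from an 18-bit left half and an 18-bit address. */
word36 make_bp(unsigned lh, unsigned addr)
{
	assert(lh <= 0777777 && addr <= 0777777);
	return ((word36)lh << 18) | (word36)addr;
}

/* Same size and same position within the word: word moves are safe. */
int word_moves_ok(word36 from, word36 to)
{
	return BP_PS(from) == BP_PS(to);
}
```

For example, two 331100 pointers (9-bit bytes) pass the test, while a
331100 pointer paired with a 350700 pointer (7-bit bytes) does not.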

What grates on me is the necessity for continually changing simple
addresses into byte pointers just to satisfy the needs of some library
function that requires (char *) as argument.  No help for it, though,
unless you want to start a (int *) revolution (hmm, I wonder just how
many of those functions there are.  qsort, malloc/realloc/free, ...
uh, er, maybe there aren't that many after all... guess storage
allocation is the only real instance... okay, I'll eat my words.
Chomp, chomp.)

P.S. There exist specialized hairy routines that use word moves and logical
shifts to transfer non-aligned strings at high speed (faster than the
DEC-20 string copy instruction) but let's not get into that just now.
Libraries are wonderful.
-------
 2-Mar-85 17:48:11-PST,699;000000000001
Mail-From: SATZ created at  2-Mar-85 17:48:06
Date: Sat 2 Mar 85 17:48:06-PST
From: Greg Satz <[email protected]>
Subject: sizeof question/bug
To: [email protected]
Phone: (415) 497-1004

What should the following sizeofs be?

char foo[6] = "abcde";
char bar[] = "abcde";
foosiz = sizeof(foo);
barsiz = sizeof(bar);

I fixed the initialization code to handle [] character arrays. Now both
sizeofs return 8! Sizeof is returning the number of characters available
in two words worth of storage and not the number initialized.

This is inconsistent with the Vax PCC compiler. I will see if the
compiler relies upon this to see how hard it would be to fix. Comments?
-------
 2-Mar-85 18:08:46-PST,481;000000000001
Mail-From: SATZ created at  2-Mar-85 18:08:35
Date: Sat 2 Mar 85 18:08:35-PST
From: Greg Satz <[email protected]>
Subject: array initializer bug fixed
To: [email protected]
Phone: (415) 497-1004

The latest compiler will generate correct sizes for arrays including
structures with bit fields which don't predefine the array size but
use initialization. I still need to check the no memory = external
case and what happens when foo[] is an auto variable.
-------
 2-Mar-85 18:40:19-PST,1488;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Sat 2 Mar 85 18:40:11-PST
Date: Sat 2 Mar 85 18:38:21-PST
From: Ken Harrenstien <[email protected]>
Subject: Re: sizeof question/bug
To: [email protected], [email protected]
cc: [email protected]
In-Reply-To: Message from "Greg Satz <[email protected]>" of Sat 2 Mar 85 17:48:08-PST

The size of a char array is a special case.  I think this is the ONLY
instance where it is reasonable for sizeof to provide the exact number
of characters in the array, since (despite K&R's disclaimer) just about
everything assumes that sizeof is talking about chars, rather than bytes.

In all other cases, sizeof should always return the allocated number of
bytes.  For example, "struct it {int i; char c;}" should result in
sizeof(it) == 8, rather than the "exact" sizeof(it) == 5.  I can supply
any number of reasons why this is necessary if the world is not to
break horribly.

I would not even be very upset if sizeof also returned the allocated
size for character arrays.  But I can't think of any way in which
using the exact size for this one special case would cause harm, and
must admit it does make some sense.  So, given that the Vax compiler
also does it, it should be okay to go ahead and "fix" it.

P.S. In your examples, note that both sizes would then become 6.
barsiz would not be 5 since the terminating null byte is part of the
resulting char array.
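
For reference, a minimal sketch of the two cases discussed above, as a
present-day C compiler would treat them.  The accessor function names are
made up for illustration.  The struct size depends on the target machine,
so the sketch only shows that padding makes it larger than the sum of its
members.

```c
#include <stddef.h>

/* The two arrays from the original question. */
char foo[6] = "abcde";
char bar[] = "abcde";   /* size comes from the initializer: 5 chars + NUL */

/* The struct from the reply; its size includes trailing padding. */
struct it { int i; char c; };

/* Hypothetical accessors, so the sizes can be inspected from a caller. */
size_t foo_size(void) { return sizeof(foo); }
size_t bar_size(void) { return sizeof(bar); }
size_t it_size(void)  { return sizeof(struct it); }
```

Both foo_size() and bar_size() yield 6, matching the P.S. above; it_size()
is the allocated size, not the "exact" one.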
-------
 2-Mar-85 19:18:54-PST,940;000000000001
Received: from LOTS-A by Sierra with Pup; Sat 2 Mar 85 19:18:52-PST
Date: Sat 2 Mar 85 19:17:09-PST
From: Dan Newell  <D.DAGONE@LOTS-A>
Subject: re: sizeof, my two sense
To: SATZ@Sierra
In-Reply-To: Message from "Greg Satz <SATZ@Sierra>" of Sat 2 Mar 85 17:47:01-PST

   char foo[6] = "abcde";
   char bar[] = "abcde";

   Sizeof should probably generate what you would expect,
6 for foo and 6 for bar. Kcc has enough problems with
char's not really being char's but instead kludged up
int's. If you don't have the sizeof right, you may introduce
really rude sizeof bugs when someone relies on sizeof to
measure strings that may have nulls (\000) in them.
We used binary strings like this at Microsoft all the
time though we may not have used sizeof on them that much.
I could see someone trying though. 
   Char's should be chars and if you have 5 of them, the
compiler should think 5 and not think 2.
	Dan
-------
 3-Mar-85 03:54:50-PST,903;000000000015
Return-Path: <[email protected]>
Received: from SU-SCORE.ARPA by SU-SIERRA.ARPA with TCP; Sun 3 Mar 85 03:54:49-PST
Date: Sun 3 Mar 85 03:53:32-PST
From: William "Chops" Westfield <[email protected]>
Subject: I thought you said KCC's Pre-processor...
To: [email protected]

Had infinite length symbols?   It doesn't seem to???

TEST.C:
#define PerProcess pp
#define PerProcessArea pprcar

typedef struct
  {
    int nameserver;	/* Default file (name) server */
    int contextid;	/* Default context for above */
    unsigned stackSize;		/* Stack size for this process */
  }
PerProcessArea;

extern PerProcessArea *PerProcess;

main()
{
	printf("\nHello World");
}
------
TEST.I



typedef struct
  {
    int nameserver;	
    int contextid;	
    unsigned stackSize;		
  }
pprcar;

extern pprcar *pprcar;

main()
{
	printf("\nHello World");
}
-------
 4-Mar-85 02:49:28-PST,634;000000000001
Mail-From: SATZ created at  4-Mar-85 02:49:27
Date: Mon 4 Mar 85 02:49:27-PST
From: Greg Satz <[email protected]>
Subject: Re: I thought you said KCC's Pre-processor...
To: [email protected]
cc: [email protected]
In-Reply-To: Message from "William "Chops" Westfield <[email protected]>" of Sun 3 Mar 85 03:54:50-PST
Phone: (415) 497-1004

sorry, but I lied. It turns out that everything is restricted
to 10 characters. This is because everything gets hashed through
the symbol table and it only uses the first 10 letters. I could up that.
What do you think is reasonable, 15? It is all still pretty bad.
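
A minimal sketch of why the two macro names in the TEST.C example collide,
assuming the symbol table compares only the first 10 characters as
described.  The function name and the use of strncmp are illustrative
assumptions, not KCC's actual code.

```c
#include <string.h>

#define SIGNIFICANT 10  /* the limit reported above */

/* Hypothetical key comparison: only the first SIGNIFICANT characters
   take part, so longer names sharing that prefix look identical. */
int same_symbol(const char *a, const char *b)
{
    return strncmp(a, b, SIGNIFICANT) == 0;
}
```

"PerProcess" is exactly 10 characters and is a prefix of "PerProcessArea",
so same_symbol() treats them as one name, which is consistent with both
being replaced by pprcar in TEST.I.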
-------
 2-Mar-85 16:43:21-PST,1607;000000000011
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Sat 2 Mar 85 16:43:19-PST
Date: Sat 2 Mar 85 16:41:28-PST
From: Ken Harrenstien <[email protected]>
Subject: Re: [Rob Gingell <GINGELL@CWR20B>: Re: paunix]
To: [email protected]
cc: [email protected]
In-Reply-To: Message from "Greg Satz <[email protected]>" of Sat 2 Mar 85 16:25:11-PST

Sounds reasonable.  I have a couple of comments which you may wish to pass
on.  In fact I recommend it...

(1) Should emphasize that the disabling of lseek only happens for FDs
	which are conversion-flagged, regardless of the byte size.
	ELLE, as one example of an editor, most definitely would require
	that LSEEK worked for 7-bit files, and would be responsible for
	handling the representation of CRLF in its own way.
	A secondary implication is that the program should be able to
	specify a non-default conversion setting at the time the FD is
	opened.

(2) Even for FDs which are conversion-flagged, some cases of lseek should
	still work.  Specifically, given an offset of 0, lseek should
	ALWAYS work, for all cases of "origin" (0,1,2).  The value it
	returns should be the absolute position as if no conversion
	was happening.
	However, given any non-zero offset, lseek should fail.
	I believe it is far better that it fail (returns -1) than that
	it simply act as a no-op.  This way, it likewise does nothing, but
	it also provides some indication to the program that something
	is wrong, and this will be extremely valuable while ferreting out
	the inevitable portability bugs.
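
A rough sketch of the rule in (2), under stated assumptions: the struct,
its fields, and the function name are hypothetical and are not taken from
the KCC runtime.  It only models the bookkeeping, not any real I/O.

```c
#include <stddef.h>

/* Hypothetical per-descriptor state; none of these names come from KCC. */
struct fd_state {
    int  converting;  /* nonzero if CRLF conversion is enabled on this FD */
    long pos;         /* absolute position, as if no conversion happened */
    long size;        /* file length, in the same units */
};

/* Returns the new absolute position, or -1 on failure. */
long sketch_lseek(struct fd_state *fd, long offset, int origin)
{
    long base;

    switch (origin) {
    case 0: base = 0;        break;   /* from start of file */
    case 1: base = fd->pos;  break;   /* from current position */
    case 2: base = fd->size; break;   /* from end of file */
    default: return -1;
    }
    if (fd->converting && offset != 0)
        return -1;            /* fail visibly rather than act as a no-op */
    if (base + offset < 0)
        return -1;
    fd->pos = base + offset;
    return fd->pos;
}
```

On a conversion-flagged descriptor, sketch_lseek(&fd, 0, 1) still reports
the current position, while any non-zero offset returns -1 and leaves the
position untouched.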
-------
 4-Mar-85 16:51:09-PST,956;000000000001
Mail-From: SATZ created at  4-Mar-85 16:50:55
Date: Mon 4 Mar 85 16:50:55-PST
From: Greg Satz <[email protected]>
Subject: question and bug
To: [email protected]
Phone: (415) 497-1004

Systems Concepts now has a copy of KCC. I just got a call from someone
there with a question and a potential problem. When a char pointer is
coerced into an integer pointer, KCC emits three instructions to do
this: TLC, TLZ, and XMOVEI.  Why is the TLZ necessary?

The possible bug is: if a character pointer is assigned to an integer
pointer, KCC will do a MOVE/MOVEM instead of converting it to an
address. Is there a good reason for this?

    char *p, buf[80];
    int *q, *r;

    p = buf;
    q = p;
    r = (int *)p;

	XMOVEI	3,2(17)
	IOR	3,$BYTE
	MOVEM	3,1(17)		; p = buf
	MOVEM	3,26(17)	; q = p		/* why? */
	TLC	3,400000
	TLZ	3,370000	; is this instruction necessary?
	XMOVEI	3,0(3)
	MOVEM	3,27(17)	; r = (int *)p	
-------
 4-Mar-85 19:44:51-PST,3883;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Mon 4 Mar 85 19:44:37-PST
Date: Mon 4 Mar 85 19:42:37-PST
From: Ken Harrenstien <[email protected]>
Subject: Re: question and bug
To: [email protected], [email protected]
cc: [email protected]
In-Reply-To: Message from "Greg Satz <[email protected]>" of Mon 4 Mar 85 16:51:31-PST

My response would be:

1.	Assignment of a char pointer to an int pointer should require
an explicit cast.  It is indeed a bug that KCC permits this to happen
without casting; this is why it compiles into MOVE/MOVEM, because it
doesn't realize the cast is needed.  A similar problem happens with:
	intp = malloc(20);
because malloc is declared as returning a char pointer.  Unless the
compiler complains, you will be screwed pretty badly.  What
you should be doing is:
	intp = (int *)malloc(20);

The compiler knows enough to be able to force the cast automatically
(akin to other type conversions it does), but because this action is
NOT part of standard C (sigh), KCC should at the very least print a
warning message, because the construct will not be portable.

2.	TLC	3,400000
	TLZ	3,370000	; is this instruction necessary?
	XMOVEI	3,0(3)
	MOVEM	3,27(17)	; r = (int *)p	

This is a tricky one.  The basic answer I came up with is "no, the TLZ
is not necessary".  If you are not interested in the gory details, skip
the rest of this message, which is largely to satisfy myself that I
understand what is going on.

	This code assumes however that you will never encounter
a byte pointer which has indexing or indirection.  This seems a reasonable
assumption for the KCC environment.  My best guess as to why the TLZ is there
is because the XMOVEI may have been intended to use indirection instead of
indexing.  That is, the sequence
	TLC 3,400000
	TLZ 3,370000
	XMOVEI 3,@3
would serve to boil down any indexing/indirection of local-format byte
pointers, since bits 0,1 would have the value 10 for such pointers, which
makes them a local-format indirection word.  For global-format pointers,
of course, all you need is a TLZ 3,770000.  But because the current
code uses indexing and not indirection (ie XMOVEI 3,(3)), bits 1-5 of
the register are ignored no matter what the situation is.  Indexing
while in a non-zero section looks at bit 0 of the index register to
decide whether to use bits 6-35 (sign=0) or bits 18-35 (sign=1).

Note that KCC is assuming that the sign bit indicates whether the byte
pointer is in global format (sign=1) or not (sign=0).  This is not
true for all byte pointers (local-format pointers with P values of
32,33,34,35,36 will mistakenly be interpreted as global) but this
would only be a problem if you were trying to deal with byte strings
of 4 bits or less, or if you have some routine that assembles byte
pointers using the familiar point-to-start-of-word (P = 36.) format.
This latter possibility would most likely occur by mistake.
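
To make the sign-bit point concrete, here is a small C model.  It assumes
a 36-bit word held in the low bits of a 64-bit integer, with the P field
in the top six bits; the type and function names are invented for this
sketch.

```c
#include <stdint.h>

/* A 36-bit PDP-10 word, stored in the low bits of a uint64_t. */
typedef uint64_t word36;

/* P field of a local-format byte pointer: the top 6 bits (bits 0-5). */
unsigned p_field(word36 w)
{
    return (unsigned)((w >> 30) & 077);
}

/* The test being described: sign bit (bit 0) set means "global". */
int looks_global(word36 w)
{
    return (int)((w >> 35) & 1);
}
```

Any P value of 32 (decimal) or more has the top bit of the field set, so
a local pointer such as 440700,,1234 (P = 36.) satisfies looks_global()
even though it is local, while 350700,,1234 (P = 29.) does not.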

Since I went to the trouble of sketching a table to figure this out,
I'll include it for the record:
---------------------------
Running in:	Given BP type	Optimal conversion (assumes BP in A)

Zero-section	local		MOVEI A,@A	; for indir/indexing

				MOVEI A,(A)	; w/o    "  "

				<or just HRRZM A, w/o    "  ">

		global		<illegal in zero section>

Non-Z-sec	local		TLO A,400000	; For indir/indexing
				TLZ A,200000
				XMOVEI A,@A

				TLO A,400000	; W/O indir/indexing
				XMOVEI A,(A)	; OK as long as sign bit is off

		global		TLZ A,770000	; No indexing/indirection.

		2-word		<not used as KCC char ptrs -- ugh!>
--------------------

The trouble is that KCC must assemble code that will work in multiple
sections as well as still working in section 0.  Anyway, it looks like
the best coding is simply TLC A,400000 ? XMOVEI A,(A).
-------
 5-Mar-85 07:44:07-PST,1014;000000000001
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Tue 5 Mar 85 07:44:02-PST
Date: 5 Mar 1985  10:41 EST (Tue)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   [email protected]
Subject: char pointer to int pointer conversion
In-reply-to: Msg of 4 Mar 1985  22:42-EST from Ken Harrenstien <KLH at SRI-NIC.ARPA>

For the byte pointers KCC uses, the TLC is sufficient to distinguish
between local and global byte pointers.  To make this sequence correct
for all pointers would require a lot more instructions.  I had thought
the TLZ was necessary to avoid indirections but from what you say it
seems I was wrong.

I guess the best thing to do with assignments from char pointers to
int pointers is go ahead and coerce but give a warning message.
Perhaps there should also be a warning for int to either kind of
pointer, to catch people who neglect to declare malloc() at all...
 9-Mar-85 09:19:06-PST,18538;000000000001
Mail-From: KRONJ created at  9-Mar-85 09:19:01
Date: 9 Mar 1985  09:19 PST (Sat)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   [email protected]
Subject: coercion of null

Is it a bug that
    (char *) (int *) 0  !=  (char *) 0
and in extended addressing
    (int *) (char *) 0  !=  (int *) 0 ?
The latter causes pitfalls for careless error checking of malloc().
Of course in extended addressing you're not likely to run out of
memory anyway...

By the way, don't expect the sources on Sierra to be stable for a
while...I just went through a major rewrite of register allocation and
I haven't got all the bugs out yet.


Date: Mon 11 Mar 85 17:10:50-PST
From: Ken Harrenstien <[email protected]>

Sigh.  My recommendation would be that of MACLISP: "NIHIL EX NIHILIS".
In other words zero should always stay zero no matter what the type
is.  This may not conform to the perfect specification of a language
but in this case it seems to be a practical necessity; there is lots
of code that assumes it.  What this means for KCC is that all
instances of (char *)0 should result in a zero word, and all type coercions
which convert something to (char *) SHOULD CHECK FOR ZERO before
conversion is done.

The effect on runtime code is basically that a CAIE test should
come before the IOR of $BYTE.  This can be optimized to some extent
by using SKIPE instead of MOVE & CAIE where the combination turns up.

Since coercion to (char *) does not really happen that often (I
looked, and could only find a handful of such cases, primarily
malloc), this seems like a very reasonable thing to do.  Note that
coercions from (char *) to other kinds of things will not require any
different handling, because all instances of (char *)0 will simply be
zero (rather than 331100,,0 for example), and the present
pointer-coercion code will work with this.
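
A sketch of the proposed "zero stays zero" rule for conversion to
(char *), modelled in C on a 36-bit word held in a 64-bit integer.  The
BYTE_BITS value is an assumption chosen for illustration (the 331100,,0
pattern mentioned above); the real bits come from the $BYTE table, which
the runtime sets up.

```c
#include <stdint.h>

/* A 36-bit PDP-10 word, stored in the low bits of a uint64_t. */
typedef uint64_t word36;

/* Assumed byte-pointer bits, 331100,,0 in octal (illustrative only). */
#define BYTE_BITS ((word36)0331100 << 18)

/* (int *) to (char *): zero stays zero, anything else gets the P/S bits. */
word36 to_char_ptr(word36 addr)
{
    if (addr == 0)            /* plays the role of the CAIE/SKIPE test */
        return 0;
    return addr | BYTE_BITS;  /* plays the role of the IOR of $BYTE */
}
```

With this check, to_char_ptr(0) is a zero word rather than 331100,,0.
The sketch covers only this one direction of conversion.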


Date: 12 Mar 1985  10:32 EST (Tue)
From: David Eppstein <[email protected]>

    Date: Monday, 11 March 1985  20:10-EST
    From: Ken Harrenstien <KLH at SRI-NIC.ARPA>

							...  Note that
    coercions from (char *) to other kinds of things will not require any
    different handling, because all instances of (char *)0 will simply be
    zero (rather than 331100,,0 for example), and the present
    pointer-coercion code will work with this.

Nope,
	SETZ	1,		;(char *) NULL
	TLC	1,400000	;Invert global/local byte ptr flag
	XMOVEI	1,(1)		;(int *) (char *) NULL

executed in section 1 still results in  1,,0  rather than  0.
So the (char *) => (int *) coercion would need work also.
CAIE 1,0  just before the TLC still gives  1,,0.
As far as I can tell the least instructions this can be done right in
is four:
	CAIE	1,0
	TLCA	1,400000
	TRNA
	XMOVEI	1,(1)
and of course we can no longer fold the XMOVEI into further ops.
Does anybody see a better way?


Date: Wed 13 Mar 85 10:21:01-EST
From: David Eppstein <[email protected]>
Subject: (char *) to (int *) conversion

I thought about it some more last night, and came up with the following:

	TLCN	R,770000	;Flip local/global bit, test if byte ptr
	TLCA	R,770000	;Not a byte pointer, flip them back
	XMOVEI	R,(R)		;Was a byte pointer, canonicalize int ptr

works with the same assumptions as we have now.  If we add the (reasonable)
assumption that we only use OWGBPs in extended addressing, we can do

	TLZE	R,770000	;Clear P/S field, skip if not byte pointer
	XMOVEI	R,(R)		;Clear left half of local byte pointer

And if we assume that we only have 23 bit addresses (unreasonable) we can do:

	TLZ	R,777700	;Clear P/S field(s) and unused addr bits

My current favorite is the middle one of the above.
As if the extra instruction makes a difference...


Date: Wed 13 Mar 85 13:14:31-EST
From: David Eppstein <[email protected]>

    Date: Wed 13 Mar 85 09:48:08-PST
    From: Ken Harrenstien <[email protected]>

    You're right, the existing code does not work with (char *)0.
    This extended-addressing global-pointer business is a real PAIN.
    As further proof, your code examples are not exactly correct
    either:

    	TLCN	R,770000	;Flip local/global bit, test if byte ptr
    	 TLCA	R,770000	;Not a byte pointer, flip them back
    	  XMOVEI R,(R)		;Was a byte pointer, canonicalize int ptr

    This fails for a local byte pointer which is pointing to the last
    byte in a word, e.g. [001100,,foo].

Yeah, you're right.  I had forgotten that locals can have a zero P field.

					 Also, this code does make the
    additional assumption that only OWGBPs are seen when running with
    extended addressing.  Consider what happens when a local-format
    pointer like 331100,,foo is seen -- it will become 1100,,foo because
    the XMOVEI is gobbling a 30-bit address.  Even if the memory hardware
    only allows 23 bits, the full address field is supposed to be 30 bits,
    and you will definitely get an illegal memory reference with 1100,,0
    (I tried it).

Not true, it becomes  441100,,foo  and treated as an IFIW by the XMOVEI.

    Your second case (TLZE ? XMOVEI) fails for the same reason (P=00) since it
    has to work when in section 0.

Right again.

    It seems as if your original suggestion of:
    	CAIE 1,0
    	 TLCA 1,400000
    	  TRNA			; whatever the fastest skip is
    	   XMOVEI 1,(1)
    is the only thing that works for all cases.  I don't see any problem
    with "folding" XMOVEI, since the only thing that could possibly be
    different is to have a different AC for the result (eg XMOVEI 2,(1))
    but you can accomplish the same thing by using SKIPE 2,1 at the
    start instead of CAIE and then the following instructions refer to 2 of
    course.  The SKIPE is also good for replacing the MOVE that is most
    often seen.  Of course you can always use a JUMPE if the stuff is
    already in the right register.

If the instruction after the coercion dereferences the pointer, with
the current code this is
	TLC	1,400000
	XMOVEI	1,(1)
	MOVE	1,(1)
which gets folded to
	TLC	1,400000
	MOVE	1,(1)
because the XMOVEI doesn't do anything for you if you are just going
to use that register as an index anyway.  What I am saying is that
with the code sequences being proposed, this would no longer happen.
But this is not a big concern because in most cases code that uses the
pointer without checking it like this is wrong, and I'm not
particularly concerned with making bugs as efficient as possible.

    It would be worthwhile to determine whether there is in fact any
    possibility of encountering a local-format byte pointer when using
    extended addressing.

Not unless you call a routine written in something other than C.
All of the byte pointers made by KCC except those used in literal LDBs
and DPBs get constructed using $BYTE.

			  KCC exhibits some schizophrenic behavior in this
    respect; if you assign the address of a string, such as when pushing a
    printf control string on the stack, KCC will painstakingly put a byte
    pointer together by hand (first an XMOVEI of the string address, then
    an IOR of the appropriate byte-pointer bits from a table which the
    runtime initializes at startup).  However, if you retrieve the
    contents of a specific char array element (good example: the ctype
    array and the macros which reference it) then KCC just compiles a fast
    LDB which addresses a constant local-format byte pointer.

If you look at the unoptimized code for the latter case you will see
the same painstaking assembly of the byte pointer.  One of the
optimizations performed is to turn it into a local byte pointer when
there is no possibility of it getting onto the stack.

    You can't have it both ways.  KCC should either be assuming that all
    code, constants, and variables will be in a single section (thus the
    only stuff in other sections will be the stack and anything seized by
    malloc) OR it should assume that code, constants, and variables can be
    scattered throughout several sections.

It assumes that all code, constants, and variables are in a single section.
Perhaps you are a little confused about local byte pointers in extended
addressing.  The "local" section they reference is the section the
byte pointer is stored in, not the section that the PC is currently in.
If you move a local byte pointer between the code section and the
stack section it changes its meaning.  This is why all stored byte
pointers, even those known to be to the code section, must be global.

    I don't have any problems with the idea of restricting
    code/consts/vars to a single section.  I don't think anyone is going
    to re-write the TOPS-20 monitor in C.

Last I heard Len had some ideas...

					   If this can be agreed upon, then
    certain optimizations become feasible which are currently not done;
    for example, all char pointers to anything with a fixed address can
    dispense with the byte-pointer construction code.  Thus the current
    very common sequence of
    	XMOVEI 12,<address of stuff>
    	IOR 12,$BYTE+<appropriate offset>
    	PUSH P,12
    can be replaced by
    	PUSH P,[<local-fmt-BP>]

No, for reasons described above.  Another good reason not to do this is that in
	extern int x, y;
	(char *) y;
	x = &y;
	(char *) x;
the resulting byte pointers would be different even though they point
to the same byte of the same location.  This would cause problems with
pointer comparisons.

    The implication of course is that local format BPs will be seen here
    and there by extended-address programs.  Considering that the vast majority
    of programs will not use extended addressing, and that pointer coercion
    is a fairly rare event (mainly associated with malloc), I think it
    is reasonable to allow this.  Other opinions should be solicited.

Again, my feeling is that local BPs should never be seen by an
extended addressing program, except when that BP is constructed
locally (in a literal) and can never escape to the stack or malloc space.


Date: Wed 13 Mar 85 10:26:56-PST
From: David Eppstein <[email protected]>
Subject: yet another try at (char *) => (int *)

I still have aesthetic problems with the four-instruction solution.
It just shouldn't take that many.  So, since my last attempt was
such a loser, let me try again:
	TLNE	R,400000	;Check whether local or global BP
	TLZA	R,770000	;Global, just clear P/S field
	TLZ	R,777777	;Local in section 0 or null, clear left half
Anybody see anything wrong with this one?


Date: Thu 14 Mar 85 02:50:02-PST
From: Ken Harrenstien <[email protected]>
Subject: Coercion OK, and a new soapbox platform (Ban $exadf)

    Perhaps you are a little confused about local byte pointers in extended
    addressing.  The "local" section they reference is the section the
    byte pointer is stored in, not the section that the PC is currently in.
    If you move a local byte pointer between the code section and the
    stack section it changes its meaning.  This is why all stored byte
    pointers, even those known to be to the code section, must be global.

You're right, I was assuming it was relative to the PC, not the BP
location.  Damn.  I definitely need to get another copy of the latest
processor manual, my copy at home doesn't have anything about extended
addressing and it seems I do most of my thinking there.

    ...  This would cause problems with pointer comparisons.

Another unexpected aspect, although char-pointer comparisons COULD be
done specially.  After all, long integers require special comparison
code in PDP-11 C, and KCC already hacks special-purpose code for doing
relative comparisons (of the form "xp > yp").  But never mind, I'm
convinced already.

About your new coercion code:
	TLNE	R,400000	;Check whether local or global BP
	 TLZA	R,770000	;Global, just clear P/S field
	  TLZ	R,777777	;Local in section 0 or null, clear left half

I think we have a winner here!  Congratulations!  Here's to elegance!
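
The three-instruction sequence can be modelled in C as follows.  This is
only a sketch: it assumes a 36-bit word held in the low bits of a 64-bit
integer, and the names word36, LH, and char_to_int_ptr are invented here.
The skip logic of TLNE/TLZA is written as an if/else.

```c
#include <stdint.h>

/* A 36-bit PDP-10 word, stored in the low bits of a uint64_t. */
typedef uint64_t word36;

/* Place an 18-bit octal value in the left half of a word. */
#define LH(x) ((word36)(x) << 18)

/* (char *) to (int *), following the TLNE / TLZA / TLZ sequence. */
word36 char_to_int_ptr(word36 r)
{
    if (r & LH(0400000))      /* TLNE: sign bit set, so global format */
        r &= ~LH(0770000);    /* TLZA: clear only the 6-bit P/S field */
    else
        r &= ~LH(0777777);    /* TLZ: local or null, clear the left half */
    return r;
}
```

In this model a null pointer stays zero, a local pointer such as
001100,,1234 (P = 0) becomes 1234, and a global pointer keeps its section
number: 620001,,1234 becomes 1,,1234.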

Now that you have gotten your itch out of the way, allow me to scratch
mine a little more.  I still don't think constant byte pointers should
always have to be put together dynamically.  Yes, I am now convinced
that we need to always use OWGBPs except where the BP will never
"escape", and yet I still think it is possible to dispense with the
painful construction of such pointers.  This really ties in with the
whole notion of trying to output code that is run-time compatible
with either extended or non-extended addressing; this often leads to
code which is not optimal for either case, and this irritates me.
Here are some possibilities:

	* Having the microcode understand global-format byte pointers
even when non-extended.  A minor thing, unlikely to happen.
	* Providing LINK with special knowledge of C.  There is an
incredible amount of this already for Fortrash and others.  Also
unlikely, although for such things as a C debugger it might eventually
be useful.
	* If the machine supports extended addressing, always
compiling and loading extended-code ONLY.  If not, then always
compiling/loading non-extended-code ONLY.  Simplifies things, no?
Just think of the new optimizations possible...  you still want that
TLZ R,770000?  Any problems with this?  Note KCC is already KA/KI/KL
machine-dependent and this just adds an extra aspect.
	* A compromise approach which relies on a CRT.REL "initial
module" file similar to UNIX "crt.o".  This is elaborated on at
length below.

COMPROMISE IDEA:
	Do away with $EXADF.  Require specification at load time
	of the section number (0 or N) to use for code.  When CC is doing the
	invocation of LINK, use the "-i" switch to request this.

Now, this may be completely impractical if there are no convenient
switches to LINK (let's see, add the LINK manual to my home-copy
shopping list), although my quick scan of the LINK sources indicates
that LINK is capable of loading multi-section programs (well I suppose
it ought to, if only to build the monitor!)  The key question is whether
it is possible to defer the section assignment until load time.  This
OUGHT to be possible.

If it is, then we win.  For example, whenever building a pointer to a
"known" location, such as a printf control string, instead of this
sequence:
	XMOVEI R,foo
	IOR R,$BYTE+x
	PUSH P,R
we can use this:
	PUSH P,[$$BPOx+foo]	; hip hip hooray!

Where $$BPOx is a set of symbols which have the appropriate BP values
depending on whether you are loading up an extended or non-extended
program.  eg for a non-extended program you would have 350700,,0 and
for the extended version you would have 620000,,0.  Note that KCC does
NOT need to know whether it is compiling for an extended or
non-extended program!  The .REL file will be the same either way, and
the loader does all the dirty work.  This is what loaders are supposed
to be good for!  This requires that any runtime symbols and code which
are different for extended and non-extended programs must be confined
to a single "C:CRT.REL" (akin to UNIX "crt.o") module, which precedes
any other load modules.  C:CLIB.REL trails all other load modules, as
usual.  CC, when doing a normal load, will use C:CRT.REL, when doing
an extended load (-i specified) will use C:CRTX.REL.

It remains to be seen whether FAIL/MACRO can produce the appropriate
polish fixup and whether LINK can be told where to start loading
code.  The answer ought to be yes.  It may even be that the CRT(X).REL
file specification alone is sufficient, depending on how well FAIL/MACRO
can communicate with LINK.

"But... but..." diehards may object, "you lose the wonderful $EXADF
feature, and need special loading..."  Bah.  I don't think this is as
important as optimal code production.  If you really wanted to, you
could make the code run-time compatible with KAs, KIs, KLs, and KSs;
but would anyone want to use the resulting mess?  I think using
a different CRT.REL initial module for extended and non-extended versions
is a perfectly simple and convenient method.  (It may be that forcing
always-extended on KLs is even simpler.)  Granted, the desire to
keep user .REL files identical either way does mean that certain things
cannot be easily excised into CRT.REL, but we still get an improvement.
Further points:
	* We are not really supposed to ever be talking directly to
LINK, just as on UNIX we almost never invoke "ld" directly -- and if
we do we have to know all about the nitty-gritty things which CC
shields us from.  Yes, I expect that KCC will eventually understand
"module.rel" and "-l" library specifications.
	* Traditional PDP-11 C uses a scheme almost exactly like what
I am proposing, and people are used to it.  "Profiling" works this
way, too.  Some situations take this even farther, such as the UCB
user-overlay scheme, where the compiler output is DIFFERENT and thus
different (parallel) libraries must be used when building an overlaid
program.  We don't need to go quite this far.
	* Compiler output is like microprogramming -- it is OK to
spend extra time and hair on optimization, because the final version
of a program will be used many many times.  Slight inconveniences when
turning out this final version (like extra switches, or extra compilation
time) are entirely reasonable.  Take me for example: this zealot will gladly
type 50 extra switches if it gets rid of those byte pointer contortions
(but I won't have to type any!)


Date: Thu 14 Mar 85 07:30:45-EST
From: Rob Gingell <GINGELL@CWR20B>

Re:

"	* Having the microcode understand global-format byte pointers
even when non-extended.  A minor thing, unlikely to happen."

In V6 and V6.1 of TOPS-20, the microcode does in fact support OWGBP's
in section 0.  However, JSYS's (primarily the I/O JSYS's) will not 
accept an OWGBP from section 0 to avoid conflicting with the universal
device designator argument format.

Of course, if one is running 6.0 or 6.1, you've got a model B KL and
the rest of the arguments presented here about always being extended
on such machines would seem to hold.  I personally would rather it
just always generate extended code too since if you're using my kernel
emulator you're going to be doing multi-section stuff anyway.
13-Mar-85 05:52:23-PST,879;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Wed 13 Mar 85 05:52:19-PST
Date: Wed 13 Mar 85 05:50:33-PST
From: Ken Harrenstien <[email protected]>
Subject: Sigh, new KCC still bites it
To: [email protected]
cc: [email protected]

This is probably the 4th time I have eagerly brought over the latest KCC
only to find that some aspect of its interaction with LINK still has
problems:


[PHOTO:  Recording initiated  Wed 13-Mar-85 5:43am]

@cc hock
KCC:    hock
<KLH>HOCK.FAI.1
FAIL:  hock
LINK:   Loading

?PA1050: Illegal instruction 0,,40 at user 227663
@load hock
LINK:   Loading

EXIT
@save hock
 HOCK.EXE.132 Saved
@pop

[PHOTO:  Recording terminated Wed 13-Mar-85 5:44am]

As you can see, the KCC-invoked LINK fails, but the EXEC-invoked LINK wins.
I hope this makes sense to someone.
-------
13-Mar-85 07:11:01-PST,950;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Wed 13 Mar 85 07:10:58-PST
Date: Wed 13 Mar 85 07:09:13-PST
From: Ken Harrenstien <[email protected]>
Subject: Bite, bite, bite
To: [email protected]

If you interrupt KCC (f'rinstance if it's obvious that a missing brace is
causing semi-infinite error messages) then starting it again causes the
following:
?Could not open output file -- test.fai
LINK:   Loading

I am painfully aware of the idiotic way in which RESET fails to clear
things sufficiently for the OPEN to succeed -- many other programs are
screwed by this -- but it seems undesirable that KCC should proceed to
invoke LINK after encountering such an error.  Sure, a .REL file may
already exist, but KCC should either ignore it, or should check the
creation dates to make sure the .REL postdates the source file.  Even the
latter cleverness may backfire some day.
-------
13-Mar-85 07:14:38-PST,882;000000000001
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Wed 13 Mar 85 07:14:34-PST
Date: 13 Mar 1985  10:12 EST (Wed)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Ken Harrenstien <[email protected]>
Cc:   [email protected]
Subject: Sigh, new KCC still bites it

Yeah, I remember seeing that one a couple days ago.
If hock doesn't have main() defined in it you are going to lose,
because KCC tells FAIL to tell LINK to produce an EXE file.
If it does have a main() and still does this I guess it is a bug.
If you just want a REL file you can use the -c flag.

By the way, the sources in <KCC.CC> are not likely to be stable for
about a week.  I'm busy making all sorts of changes.  SYS:CC.EXE will
usually work (if it doesn't I've made a mistake).
13-Mar-85 07:21:52-PST,653;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Wed 13 Mar 85 07:21:48-PST
Date: Wed 13 Mar 85 07:20:04-PST
From: Ken Harrenstien <[email protected]>
Subject: Re: Sigh, new KCC still bites it
To: [email protected]
cc: [email protected], [email protected]
In-Reply-To: Message from "David Eppstein <[email protected]>" of Wed 13 Mar 85 10:12:00-PST

hock does have main(), and I normally just copy the binary (CC.EXE.43 in
this case).  Incidentally, it is *NOT* producing an .EXE even when it
loads without error.  I still need to SAVE the image manually.  Is this
a clue?
-------
13-Mar-85 07:35:29-PST,863;000000000001
Mail-From: KRONJ created at 13-Mar-85 07:35:27
Date: Wed 13 Mar 85 07:35:27-PST
From: David Eppstein <[email protected]>
Subject: CC biting it
To: [email protected]
cc: [email protected]

Version 43 is before I made KCC tell FAIL to tell LINK to make EXEs.
So I'm not sure exactly why it dies like that but it probably doesn't
do the same any more.

By the way, in later versions I've also started adding the stronger
type checking that you asked for.  It now gives warnings for non-explicit
(char *) to (int *) coercions and vice versa, and also for symbols that
you define twice with different types.  I haven't gotten around to
warning about (int) to (int *) and (char *) coercions because the
multiple definitions with different types produced many many warnings
for KCC compiling itself that I want to straighten out first.
-------
13-Mar-85 08:55:10-PST,635;000000000001
Mail-From: SATZ created at 13-Mar-85 08:55:03
Date: Wed 13 Mar 85 08:55:03-PST
From: Greg Satz <[email protected]>
Subject: Re: Sigh, new KCC still bites it
To: [email protected]
cc: [email protected]
In-Reply-To: Message from "Ken Harrenstien <[email protected]>" of Wed 13 Mar 85 07:21:50-PST
Phone: (415) 497-1004

KCC currently won't produce an .EXE file. Someday a -o flag will exist
to do this. I thought it would be better if KCC left you a core image
like most TOPS compilers instead of an a.out file.  Seemed a little more
intuitive.

Is it possible that your PA1050 is older than Sierra's? Bill?
-------
13-Mar-85 09:00:54-PST,867;000000000001
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Wed 13 Mar 85 09:00:46-PST
Date: Wed 13 Mar 85 11:59:42-EST
From: David Eppstein <[email protected]>
Subject: Re: Sigh, new KCC still bites it
To: [email protected]
cc: [email protected], [email protected]
In-Reply-To: Message from "Greg Satz <[email protected]>" of Wed 13 Mar 85 11:55:28-EST

Nope, KCC now produces an EXE.  "Most TOPS compilers" are invoked with COMPILE,
and when KCC is run that way it produces a REL as it should.  When KCC is run
as CC then my feeling is it should act like UNIX cc.

I doubt the version of PA1050 has much to do with it.  KCC was getting these
errors on Sierra too a while back.  I don't know whether it was from anything
I did, but my change to produce EXEs seems to have stopped it.
-------
13-Mar-85 11:51:27-PST,533;000000000001
Mail-From: WHP4 created at 13-Mar-85 11:51:18
Date: Wed 13 Mar 85 11:51:18-PST
From: Bill Palmer <[email protected]>
Subject: kcc vs link
To: [email protected]

If people will kindly compile a list of their favorite bugs *with* files
that will tickle the bugs, I will fix them over spring break.  I'm too busy
right now with finals to worry about chasing these things.  Note that each
and every one of you is also welcome to look for the cause of these bugs
instead of merely pointing out that they exist.
-------
14-Mar-85 21:26:58-PST,743;000000000001
Received: from LOTS-A by Sierra with Pup; Thu 14 Mar 85 21:26:51-PST
Received: from LOTS-C by LOTS-A with Pup; Thu 14 Mar 85 21:26:07-PST
Date: Thu 14 Mar 85 21:25:58-PST
From: Frank Chen <F.Frank@LOTS-C>
Subject: [TUKKAR ERIK HOKANSO <C.CLARKE@LOTS-C>:]
To: BUG-C@LOTS-C
Also-known-as: Franky@SCORE, Frank@SIERRA, Frank@CSLI
Telephone: (415) 424-8166


Will LOTS be running the new Sierra SYS:CC.EXE and EXEC (with support
for the KCC compiler) soon?

Frank

                ---------------

Mail-From: C.CLARKE created at 14-Mar-85 20:54:55
Date: Thu 14 Mar 85 20:54:54-PST
From: TUKKAR ERIK HOKANSO <C.CLARKE@LOTS-C>
To: f.frank@LOTS-C




Is there a C compiler on LOTS??

Thanks Erik H.



-------
-------
15-Mar-85 08:17:56-PST,1046;000000000001
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Fri 15 Mar 85 08:17:46-PST
Date: 15 Mar 1985  11:14 EST (Fri)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Ken Harrenstien <[email protected]>
Cc:   [email protected]
Subject: Some LINK info (re "Ban $exadf")
In-reply-to: Msg of 14 Mar 1985  23:33-EST from Ken Harrenstien <KLH at SRI-NIC.ARPA>

Note that if you are already concerned about taking extra memrefs to
construct byte pointers (which I think is a silly thing to be
concerned about, almost as silly as the number of instructions in
pointer conversions (but I was doing that for hack value not efficiency))
you should be even more concerned about putting global data and code
in a different section.  The simplest way of doing it that I can think
of would be to keep a permanent index register to the base of the data
section, and then all your global array references take another instruction.
15-Mar-85 18:25:31-PST,3244;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Fri 15 Mar 85 18:25:20-PST
Date: Fri 15 Mar 85 18:22:42-PST
From: Ken Harrenstien <[email protected]>
Subject: Re: Some LINK info (re "Ban $exadf")
To: [email protected]
cc: [email protected], [email protected]
In-Reply-To: Message from "David Eppstein <[email protected]>" of Fri 15 Mar 85 11:14:00-PST

You have a good point about needing to compile extra references (or a
"base reg") if data might reside in other sections; however, Vic's
information seems to indicate that the same issue is being dealt with
by new FORTRAN/LINK conspiracies.  We'll have to wait and see.

I happen to be one of those people who think that machine-code
efficiency is important.  This doesn't mean I fanatically examine and
pare down every piece of code I run across, but I do think that in
very special cases, it is worthwhile to spend some time dealing with
microseconds.  A compiler is one such special case, since it is
responsible for producing many, many, many instructions which will be
used many, many times.  The more efficient its code is, the greater
the incentive to use the compiler.  The justification for putting sweat
into the optimization routines is perhaps not as great as for sweating
over microcode, but the same idea applies.

Of course people will disagree about how much sweat should be "required"
or "wasted" on particular optimizations.  I guess for this instance, I
can see a very simple and direct method of constructing the byte pointers
at load time rather than run time, and there is essentially no sweat
involved, so I think it is a worthwhile improvement.

In light of recent comments, I now suggest that PSECTs not be bothered
with for the time being, and $EXADF be retained; all the work would still
be done by the appropriate CRT.REL (selected by existence or nonexistence
of "-i" switch), in the following fashion:
	(1) Byte pointers to known data locations take the form of
		<$$BPsymbol>+<varsymbol>
	(2) $EXADF: $$EXAV
Where 
	$$EXAV is defined 0 by CRT.REL, 1 by CRTX.REL.
	$$BPsymbol is the appropriate P&S specifier.  All $$BPnn
		values defined in local format by CRT.REL,
		and global format PLUS <$$EXAV,,> by CRTX.REL.
	The $BYTE table would still exist for runtime coercions.
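
The arithmetic behind <$$BPsymbol>+<varsymbol> can be sketched in C for
the local (section-0) format only.  This is an illustration, not LINK's
or CRT.REL's code: bp_prefix() and make_local_bp() are made-up names,
and a 36-bit word is assumed to be held in the low bits of a uint64_t.
The P&S layout (P in the top 6 bits, S in the next 6, address in the
low 18) matches the 331100,,addr pointers KCC emits for 9-bit bytes.

```c
#include <stdint.h>

/* Build the constant P&S part of a local byte pointer: P occupies
 * bits 0-5 and S bits 6-11 of the 36-bit word (DEC bit numbering). */
uint64_t bp_prefix(unsigned p, unsigned s)
{
    return ((uint64_t)(p & 077) << 30) | ((uint64_t)(s & 077) << 24);
}

/* What the loader would do at load time: add the 18-bit address of
 * the variable to the precomputed prefix. */
uint64_t make_local_bp(uint64_t prefix, uint32_t addr)
{
    return prefix + (addr & 0777777);
}
```

With P=27 and S=9 the prefix is 331100,,0, so a variable at address 140
(octal) gets the pointer 331100,,140 with no run-time instructions.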

I have tested this out and it works.  I created CRTX and CRT, munged
the FAIL output of KCC to use link-constructed BPs, and then by saying
LOAD CRT,TEST or LOAD CRTX,TEST created either a section-0 or extended-addr
version of TEST.

I grant that the argument for this method would be somewhat stronger
if there were definitely other situations where LINK could help out
the optimization.  Unfortunately I'm not familiar enough yet with the
possible local vs extended differences to identify further instances
beyond the BP format ones.  Are there others?  Or is it known that
there are no others?

Of course all this can be junked if you just decide to always produce
extended-only code whenever the machine supports it, and local otherwise.
The only objection I can think of to this is that IDDT doesn't know how
to debug anything but section-0 programs.
-------
15-Mar-85 18:42:58-PST,958;000000000001
Return-Path: <[email protected]>
Received: from SU-SCORE.ARPA by SU-SIERRA.ARPA with TCP; Fri 15 Mar 85 18:42:52-PST
Date: 15 Mar 1985 18:36-PST
Sender: [email protected]
Subject: Re: Some LINK info (re "Ban $exadf")
From:  William "Chops" Westfield <[email protected]>
To: [email protected]
Cc: [email protected], [email protected]
Message-ID: <[SU-SCORE.ARPA]15-Mar-85 18:36:18.BILLW>
In-Reply-To: The message of Fri 15 Mar 85 18:22:42-PST from Ken Harrenstien <[email protected]>

On machine efficiency...

Although I'm not sure, I think that you will find that on a KL with
an MCA20, putting code and data in separate sections will be slower
than having them in the same section.  This is because the hardware
pager table has to be reloaded when going from address X in section
N to address X in some other section, which takes many memory references...

Things may be better if you have the MCA25 upgrade...

BillW
16-Mar-85 11:28:05-PST,1520;000000000001
Mail-From: WHP4 created at 16-Mar-85 11:27:56
Date: Sat 16 Mar 85 11:27:56-PST
From: Bill Palmer <[email protected]>
Subject: Re: Some LINK info (re "Ban $exadf")
To: [email protected]
cc: [email protected], [email protected], [email protected]
In-Reply-To: Message from "William "Chops" Westfield <[email protected]>" of Fri 15 Mar 85 18:36:00-PST

That's the understatement of the year if you pick your addresses right!


[PHOTO:  Recording initiated  Sat 16-Mar-85 11:19AM]

!exe gronk
LINK:   Loading
[LNKXCT PGRONK execution]
5E5 refs between 0,,20 and 2,,20 took 12406 ms cpu, 23842 ms console
5E5 refs between 0,,20 and 2,,400020 took 972 ms cpu, 1380 ms console
5E5 refs between 0,,20 and 1,,20 took 972 ms cpu, 1103 ms console
5E5 refs between 0,,20 and 1,,400020 took 976 ms cpu, 1359 ms console
!exe gronk4
LINK:   Loading
[LNKXCT PGRONK execution]
5E5 refs between 4,,20 and 2,,20 took 12483 ms cpu, 24167 ms console
5E5 refs between 4,,20 and 2,,400020 took 981 ms cpu, 1377 ms console
5E5 refs between 4,,20 and 1,,20 took 985 ms cpu, 1405 ms console
5E5 refs between 4,,20 and 1,,400020 took 974 ms cpu, 1341 ms console
!pop

[PHOTO:  Recording terminated Sat 16-Mar-85 11:21AM]


The program itself runs in section 1, and a reference is the pair

move 1,@2	;2/ 4,,20
move 1,@3	;3/ 2,,20

I would expect this behavior to be different on a system with an MCA25, but
alas, I have no access to such a beast for experimentation.

						Bill
-------
17-Mar-85 16:09:07-PST,2919;000000000001
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Sun 17 Mar 85 16:08:49-PST
Date: 17 Mar 1985  19:07 EST (Sun)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Bug-KCC@Sierra
Cc:   Bosack@Score
Subject: Muchly improved KCC

Today's the last day of spring break for me, so I've stopped hacking
up the compiler for now.  Some recent changes:

- Real register allocation.  Double arithmetic should no longer lose
  registers.  Expressions may be arbitrarily complex without running
  out of registers.  Saving of intermediate values over function calls
  is now done more cleanly (but you shouldn't notice much difference
  in generated code).

- Generated code no longer references storage above the top of the
  stack.  Assembly runtimes no longer do so either.  The calling
  conventions for the opcode simulation runtimes ($DFLOT etc) are now
  reentrant.

- Structs as arguments, return values, and in assignments.  This may
  have worked before in a somewhat buggy fashion.  Now it really
  works.  You can extract any member from a structure return value,
  including another structure.  Structures of two words or less are
  stored in a register or pair; structures bigger than that are stacked.
  Struct return values now go below the return value, so that another
  source of references to storage above the stack pointer is removed.
  The stack format has been more adequately documented in CC.DOC.

- Void is now a built in type.

- Much stronger type checking.  Many cases that were generating bad
  code are now errors; most other type mismatches now cause warnings.
  Some cleanup was needed to make the compiler itself compile without
  these warnings.  Function return values are also typechecked and
  coerced to the appropriate type.

- Pointer coercions now keep NULL unchanged.

- Floating point function arguments are now passed as doubles.  Double
  precision in printf() now prints out the whole doubleword (it used
  to throw away the second word).  Double comparisons now also compare
  both words of the doubles.

- LINK is no longer run if a file could not be opened.

- Enum specs with = are now understood.

- Unclosed comments no longer cause an infinite loop.

- Fixing a float now truncates (as per K&R and PCC) rather than rounding.

- Many other minor bugfixes.
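
For the float-fixing item in the list above, a small C sketch of the
difference.  fix_truncate() and fix_round() are hypothetical names; the
rounding rule shown is only one possible reading of the old behavior,
not taken from the KCC sources.  Truncation toward zero is what a cast
does in standard C from C89 onward.

```c
/* Convert by discarding the fractional part (truncation toward zero),
 * which is what an ordinary cast does in standard C. */
int fix_truncate(float f)
{
    return (int)f;
}

/* Round to nearest, shown for contrast.  This particular rule is an
 * assumption made for the example. */
int fix_round(float f)
{
    return (int)(f < 0 ? f - 0.5f : f + 0.5f);
}
```

For 2.7 the first function gives 2 and the second gives 3; for -2.7 they
give -2 and -3.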

Of course I don't guarantee that it is now bug-free, but at least it
should now mostly have different bugs than before (is this an advantage?).
Because it's schooltime again I won't be working on this for a while,
so someone else will have to fix any of the problems I've left behind
that can't wait for the summer.

Note that if you pick up the new version of the compiler you will also
need a new copy of the runtimes.
17-Mar-85 17:36:42-PST,911;000000000001
Mail-From: LOUGHEED created at 17-Mar-85 17:36:32
Date: Sun 17 Mar 85 17:36:32-PST
From: Kirk Lougheed <[email protected]>
Subject: [Tim Gonsalves <[email protected]>: (f)grep don't accept wildcard in filename]
To: [email protected]

It looks like the runtimes don't understand wildcards.  Another cute thing
to try is putting an "&" at the end of a command line.  I'm not sure how
you could persuade the EXEC to run the process in the background, however.

Kirk
                ---------------

Mail-From: FAT.TAG created at 14-Mar-85 11:00:55
Date: Thu 14 Mar 85 11:00:55-PST
From: Tim Gonsalves <[email protected]>
Subject: (f)grep don't accept wildcard in filename
To: [email protected]
Reply-To: [email protected]


fgrep string f*.xxx  results in:
fgrep: can't open f*.xxx
Similarly for grep. (Both in <kcc.unix.bin>)

	Tim Gonsalves
-------
-------
19-Mar-85 12:28:15-PST,368;000000000001
Mail-From: SATZ created at 19-Mar-85 12:28:09
Date: Tue 19 Mar 85 12:28:08-PST
From: Greg Satz <[email protected]>
Subject: macro bug fixed
To: [email protected]
Phone: (415) 497-1004

Using getchar(), which is a macro, would return an error about mismatched
arguments. The variable ac in expmacro wasn't being initialized to zero
(cclex.c).
-------
19-Mar-85 12:38:48-PST,520;000000000001
Mail-From: SATZ created at 19-Mar-85 12:38:43
Date: Tue 19 Mar 85 12:38:43-PST
From: Greg Satz <[email protected]>
Subject: Re: [Tim Gonsalves <[email protected]>: (f)grep don't accept wildcard in filename]
To: [email protected]
cc: [email protected], [email protected]
In-Reply-To: Message from "Kirk Lougheed <[email protected]>" of Sun 17 Mar 85 17:36:36-PST
Phone: (415) 497-1004

The runtimes already understand * and %. grep and fgrep just needed to
be recompiled.
-------
19-Mar-85 21:22:21-PST,716;000000000001
Mail-From: WHP4 created at 19-Mar-85 21:22:16
Date: Tue 19 Mar 85 21:22:16-PST
From: Bill Palmer <[email protected]>
Subject: &
To: [email protected]

The runtimes now understand '&', so you should be able to run your
favorite kcc programs in the background.  Of course, they will need to
be recompiled, first.  There is also a slight modification to EXECP
required.  If you don't use it, you don't need the EXEC mod.  I will
hopefully get around to putting together a set of REDIT files or the
equivalent for the EXEC mods I've made to support KCC in various ways;
those of you who aren't "Stanford sites" may find this an easier way
to install the mods if you want them.

					Bill
-------
20-Mar-85 12:05:51-PST,1749;000000000001
Mail-From: WHP4 created at 20-Mar-85 12:05:46
Date: Wed 20 Mar 85 12:05:46-PST
From: Bill Palmer <[email protected]>
Subject: [Ralph Gorin <G.GORIN@LOTS-A>: Re: FAIL question]
To: [email protected], [email protected]


This is wrt the bug where if you compile more than one file, FAIL complains
about the pseudo-ops we purge being undefined when it tries to purge them
for the second file.

					Bill
                ---------------

Received: from LOTS-A by Sierra with Pup; Wed 20 Mar 85 12:02:09-PST
Date: Wed 20 Mar 85 08:27:22-PST
From: Ralph Gorin <G.GORIN@LOTS-A>
Subject: Re: FAIL question
To: whp4@Sierra
In-Reply-To: Message from "Bill Palmer <whp4@Sierra>" of Wed 20 Mar 85 01:17:13-PST

My understanding of the PURGE pseudo-op is that it removes the
named entry from whatever symbol table it finds the entry in.

FAIL doesn't rebuild the initial symbol table prior to each
assembly (indeed, UNIVERSAL is defined to make a permanent
change in the symbols available (although SEARCH is needed
in the case of UNIVERSAL to activate that portion of the
symbol table)).

Once a symbol has been purged, it's gone.

Three ideas:
	1. get a new copy of FAIL prior to each KCC compilation
	2. fix PURGE to ignore symbols that can't be found.
	3. fix PURGE to move symbols (of the pre-defined variety)
           to the PURGED list; upon starting the assembly of
           another file, copy the PURGED list to the main symbol
           table and zero the PURGED list.
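
Ideas 2 and 3 can be sketched together in C.  This is a toy model, not
FAIL's symbol table: the fixed-size arrays and all function names are
invented for the example, and every symbol is treated as pre-defined.

```c
#include <string.h>

#define MAXSYM 16

/* Hypothetical tables standing in for the assembler's symbol storage. */
static const char *active[MAXSYM];
static int nactive;
static const char *purged[MAXSYM];
static int npurged;

void define_symbol(const char *name)
{
    if (nactive < MAXSYM)
        active[nactive++] = name;
}

int is_defined(const char *name)
{
    int i;

    for (i = 0; i < nactive; i++)
        if (strcmp(active[i], name) == 0)
            return 1;
    return 0;
}

/* Idea 2: a symbol that cannot be found is ignored (returns 0).
 * Idea 3: a symbol that is found moves to the purged list (returns 1). */
int purge_symbol(const char *name)
{
    int i;

    for (i = 0; i < nactive; i++) {
        if (strcmp(active[i], name) == 0) {
            if (npurged < MAXSYM)
                purged[npurged++] = active[i];
            active[i] = active[--nactive];
            return 1;
        }
    }
    return 0;
}

/* Called before assembling another file: copy the purged list back
 * into the main table and zero the purged list. */
void begin_assembly(void)
{
    while (npurged > 0 && nactive < MAXSYM)
        active[nactive++] = purged[--npurged];
    npurged = 0;
}
```

With this arrangement a second file can purge the same pseudo-ops again,
because begin_assembly() has put them back first.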

[Fail is not undergoing active maintenance at this time.  The
definitive sources are at SU-AI, on [CSP,SYS] I think.  Contact
Martin Frost if you want to feed improvements back to the real
sources.]
	Ralph
-------
-------
22-Mar-85 18:24:53-PST,2074;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Fri 22 Mar 85 18:24:44-PST
Date: Fri 22 Mar 85 18:23:44-PST
From: Ken Harrenstien <[email protected]>
Subject: New KCC can't talk with LINK either
To: [email protected]
cc: [email protected]

This one is worse than the previous one.  Now, instead of just bombing
during the automatic LINK invocation, it not only bombs during the automatic
invocation (with a different message, to be sure), but it also turns out
to be impossible to load the .REL file manually!!!  Sigh, yet another
roll-back to the previous version.

I pulled over CC.EXE.69 and CLIB.REL.107.  Was there anything else?

[PHOTO:  Recording initiated  Fri 22-Mar-85 6:15pm]

@cc hock
KCC:    hock
<KLH>HOCK.FAI.1
FAIL:  hock
LINK:   Loading
?LNKSIF Symbol insert failure, non-zero hole found
[       Type CONTINUE for more information]
@continue

        LINK's hashing algorithms  failed;   they  are  trying  to
        write  a  new  symbol over an old one.  You may be able to
        load your files in a different order.  This is an internal
        error.   This  message  is  not  expected to occur.  If it
        does, please notify your Software  Specialist  or  send  a
        Software Performance Report (SPR) to DIGITAL.

EXIT
@v hock.rel.0		; See if it wrote out a .REL file anyway.

   PS:<KLH>
 HOCK.REL.22;P775200       36 18090(36)  22-Mar-85 18:16:21 KLH       
@load hock		; Yeah, try to load it up.
LINK:   Loading
@i mem

1. pages, Entry vector loc 525030 len 254000

 Section 0      R, W, E,  Private
0        Private   R, W, E	; This doesn't look right...
@v hock.exe.0			; Look at previously compiled .EXE

   PS:<KLH>
 HOCK.EXE.155;P775200      42 21504(36)  22-Mar-85 18:13:20 KLH       
@save hock
 HOCK.EXE.156 Saved
@v hock.exe.0			; Now look at what new version gives us.

   PS:<KLH>
 HOCK.EXE.156;P775200       2 1024(36)   22-Mar-85 18:17:31 KLH       
@pop

[PHOTO:  Recording terminated Fri 22-Mar-85 6:17pm]
-------
23-Mar-85 03:33:17-PST,1454;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Sat 23 Mar 85 03:33:12-PST
Date: Sat 23 Mar 85 03:32:20-PST
From: Ken Harrenstien <[email protected]>
Subject: Serious CLIB bug
To: [email protected]

I was trying some other ways of loading programs compiled by the new
KCC, and although I managed to finally do this (by using the
-c switch so as to bypass LINK altogether, snarl), I ran into other
problems which I eventually traced to a problem with $SUBBP.  The
calling/return sequence for this support routine has evidently
changed, but the library routine may or may not have been changed.
I was careful to FTP and install the current SIERRA version (107).
In any case, it DOES NOT WORK.  Since lots of my code depends on this,
I'll have to punt this version altogether until the bug is fixed.

[PHOTO:  Recording initiated  Sat 23-Mar-85 3:27am]

@v c:clib.rel.0

   PS:<SATZ.KCC>
 CLIB.REL.107;P775252      25 12418(36)  19-Mar-85 21:12:09 KLH       
@ty test.c
char buff[100];
main()
{
        char *cp;
        cp = buff;
        cp++;
        cp++;
        cp++;
        printf("Start: %o Ptr: %o (Ptr-Start)=%o\n",
                buff, cp, (cp-buff));
}
@sys:cc.exe.69 test
KCC:    test
<KLH>TEST.FAI.4
FAIL:  test
LINK:   Loading
@test
Start: 331100000140 Ptr: 1100000140 (Ptr-Start)=0
@pop

[PHOTO:  Recording terminated Sat 23-Mar-85 3:28am]
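
For reference, the value the test should have printed can be worked out
from the two pointers in the transcript.  The C below is only a model of
the arithmetic for local 9-bit byte pointers, held in a uint64_t; it is
not the $SUBBP routine, and bp9_index() and bp9_diff() are made-up names.

```c
#include <stdint.h>

/* Absolute byte number addressed by a local 9-bit byte pointer.
 * P (top 6 bits) is 27, 18, 9 or 0 for bytes 0..3 of the word whose
 * address is in the low 18 bits. */
long bp9_index(uint64_t bp)
{
    long p = (long)((bp >> 30) & 077);
    long word = (long)(bp & 0777777);

    return word * 4 + (27 - p) / 9;
}

/* Pointer difference in bytes, as (cp - buff) should compute it. */
long bp9_diff(uint64_t a, uint64_t b)
{
    return bp9_index(a) - bp9_index(b);
}
```

For Ptr = 1100000140 and Start = 331100000140 (octal) the difference is
3, not the 0 shown in the transcript.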
-------
23-Mar-85 17:30:35-PST,1289;000000000001
Mail-From: LOUGHEED created at 23-Mar-85 17:30:33
Date: Sat 23 Mar 85 17:30:33-PST
From: Kirk Lougheed <[email protected]>
Subject: Re: kcc
To: [email protected]
cc: [email protected]
In-Reply-To: Message from "jld@sri-unix" of Thu 21 Mar 85 19:51:55-PST

KCC has no AT&T copyrighted code in the compiler or runtimes, so there are
no licensing issues.  Recent source versions of the compiler and runtimes
are to be found on SU-SIERRA in the following areas: PS:<KCC.C>,
PS:<KCC.CC>, and PS:<KCC.CLIB>.  All you really need is

SYS:CC.EXE			!The compiler
SYS:FAIL.EXE			!The assembler
PS:<KCC.C>*.*.*			!The runtimes and libraries

You will need a logical name C: pointing to PS:<KCC.C>.

Bug reports can be sent to BUG-KCC@SIERRA; let me know if you wish to
be on that mailing list.

A number of runtime routines have not yet been written.  If a standard
routine is missing that you need, it will appear sooner if you write it.
We would be glad to take back any such routines provided they are 1.)
not copied from AT&T copyrighted code and 2.) of acceptable quality.

Bear in mind that we (Stanford University) do not feel that the compiler
is mature enough for a general release; it may be ready for such a
release late this summer.

Kirk
-------
24-Mar-85 14:07:23-PST,747;000000000001
Mail-From: WHP4 created at 24-Mar-85 14:07:09
Date: Sun 24 Mar 85 14:07:09-PST
From: Bill Palmer <[email protected]>
Subject: Re: New KCC can't talk with LINK either
To: [email protected], [email protected]
In-Reply-To: Message from "Ken Harrenstien <[email protected]>" of Fri 22 Mar 85 18:24:52-PST

The problem with SAVEing what you get from LOADing HOCK.REL isn't surprising;
it's what is left of LINK after it shuts down having done a HOCK/SAVE/GO.
If you edit the .FAI file you can remove the offending .TEXT pseudo op and
load it again with probably no problem.

You didn't read my previous message about this sort of thing very carefully.
If you want the bug fixed, give me a file that will cause this problem.
-------
24-Mar-85 14:39:17-PST,669;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Sun 24 Mar 85 14:39:08-PST
Date: Sun 24 Mar 85 14:35:54-PST
From: Ken Harrenstien <[email protected]>
Subject: Re: New KCC can't talk with LINK either
To: [email protected], [email protected]
cc: [email protected]
In-Reply-To: Message from "Bill Palmer <[email protected]>" of Sun 24 Mar 85 14:06:43-PST

I was trying to do some experimenting to find a smaller test case,
but will cheerfully punt if you are willing to look at it.  I have
put a temporary copy in C:HOCK.C on SRI-NIC (anonymous FTP should
work).  It is a fairly large file.  Good luck!
-------
26-Mar-85 10:50:43-PST,1520;000000000001
Mail-From: SATZ created at 26-Mar-85 10:50:42
Date: Tue 26 Mar 85 10:50:42-PST
From: Greg Satz <[email protected]>
Subject: Re: BLISS-36
To: M.MACHEFSKY@LOTS-A
cc: [email protected], [email protected]
In-Reply-To: Message from "Ira Machefsky <M.MACHEFSKY@LOTS-A>" of Tue 5 Mar 85 09:27:49-PST
Phone: (415) 497-1004

	Date: Tue 5 Mar 85 09:28:19-PST
	From: Ira Machefsky <M.MACHEFSKY@LOTS-A>

	It depends on whether the work is connected with SUNDEC research.
	Is it? and if so, what is it?


Sorry for taking so long to get back to you with this. Here is the dirt:

As you probably know, Stanford is working on a C compiler for the
DEC-20s.  It turns out that the runtimes are being developed at Case
Western Reserve in the BLISS language.  The runtime package is modeled
on PA1050, the TOPS-20 package for TOPS-10 compatibility: it will be
called PAUNIX and will make all of the Unix system calls available to
any program that wants to use them (not just C).

Since we have this package now, it would be useful for us to be able to
compile and debug or modify it if necessary. We can't do this without
BLISS-36.

How will this affect SUNDEC? The goal is to make the C compiler as
compatible with 4.2bsd as possible. In other words, we want to be able to
copy a 4.2bsd program and compile and run it on TOPS-20 with little or
no modifications. SUNDEC will be able to take advantage of this
cross-development environment, especially with the plethora of DEC-20s
on campus.
-------
 4-Apr-85 18:14:37-PST,536;000000000001
Received: from LOTS-A by Sierra with Pup; Thu 4 Apr 85 18:14:31-PST
Received: from LOTS-B by LOTS-A with Pup; Thu 4 Apr 85 18:12:25-PST
Date: Thu 4 Apr 85 18:12:42-PST
From: Paul Feffer <D.DONPABLO@LOTS-B>
Subject: C [Gripe, TTY104:, LOTS Dial-in: 323-7635]
To: Bug-C@LOTS-B


Why am I asked to "Please retype the incorrect parts of the file specification"
when I try to EXECUTE a program?  The system says that PS:CLIB.REL was not 
found.  I'm new at this, so I don't know if the problem is with the system or
me.
-------
 6-Apr-85 08:36:26-PST,821;000000000001
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Sat 6 Apr 85 08:36:16-PST
Date: Sat 6 Apr 85 11:36:04-EST
From: David Eppstein <[email protected]>
Subject: minor bugfix
To: [email protected]

I discovered this in a program to calculate the date of Easter.
Common subexpression elimination interacted badly with integer
division.  The problem had been hidden by some code in the division
code generation that could no longer work under the new register
allocation and had therefore been flushed.  The symptom was bogus
results from a % operation where the dividend was available as a
common subexpression.

I've put the fixed version of cccode.c back in [Sierra]<KCC.CC>
but haven't recompiled it there.
-------
 6-Apr-85 13:12:32-PST,787;000000000001
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Sat 6 Apr 85 13:12:25-PST
Date: Sat 6 Apr 85 16:12:11-EST
From: David Eppstein <[email protected]>
Subject: Library for EMACS C mode
To: [email protected]
cc: [email protected]

I have created an EMACS library for C mode (and changed HAKLIB to use it).
So far this is only installed on Sierra, in EMACS:CMODE.*.
The code for tab is kind of ugly but it seems to work (at least for
the style of indentation I like; no hooks are provided to change the style).
Also included is a macro on C-M-* to create a block comment of the form
	/*
	** Comment
	** More comment
	*/
Tab or linefeed within such a comment will make more starred comment lines.
-------
 7-Apr-85 20:04:40-PST,396;000000000001
Mail-From: SATZ created at  7-Apr-85 20:04:33
Date: Sun 7 Apr 85 20:04:33-PST
From: Greg Satz <[email protected]>
Subject: Re: minor bugfix
To: [email protected]
cc: [email protected]
In-Reply-To: Message from "David Eppstein <[email protected]>" of Sat 6 Apr 85 08:36:23-PST
Phone: (415) 497-1004

I recompiled KCC and installed a new binary on Sierra.
-------
 9-Apr-85 12:59:19-PST,774;000000000011
Mail-From: SATZ created at  9-Apr-85 12:59:07
Date: Tue 9 Apr 85 12:59:07-PST
From: Greg Satz <[email protected]>
Subject: Two bugs from Systems Concepts
To: [email protected]
Phone: (415) 497-1004

Here are two floating point problems:

Problem 1:
Bad code generated for floating point increment. f += 1; works
correctly.
--------------------
float f;

main()
{
    f = 0;
    f++;
    printf("f=%f\n", f);
}
main:
	SETZB	3,f
	AOS	10,f		;huh?
	SETZ	11,
	PUSH	17,10
--------------------

Problem 2:
Long floating point constants are turned into garbage.
--------------------
float f;

main()
{
    f = 1.01234567890123456789;
    printf("f=%f\n", f);
}
main:
	MOVE	3,[8069415189E-20]
	MOVEM	3,f
--------------------
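
Problem 1 amounts to incrementing the word that holds f as if it were an
integer.  The effect can be imitated in portable C, with the caveat that
this sketch assumes a 32-bit float as on most current machines rather
than the PDP-10 format, and that increment_value() and increment_bits()
are names invented for the example.

```c
#include <stdint.h>
#include <string.h>

/* What f++ should do: arithmetic on the floating value. */
float increment_value(float f)
{
    return f + 1.0f;
}

/* What an integer increment of the same storage does: the bit pattern
 * goes up by one, which is not the same as adding 1.0.
 * Assumes sizeof(float) == sizeof(uint32_t). */
float increment_bits(float f)
{
    uint32_t bits;

    memcpy(&bits, &f, sizeof bits);
    bits += 1;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Starting from 0, the first function returns 1.0 while the second returns
the smallest representable positive value, which is why f += 1 works and
the AOS does not.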
-------
19-Apr-85 13:04:02-PST,1243;000000000001
Return-Path: <[email protected]>
Received: from SU-SCORE.ARPA by SU-SIERRA.ARPA with TCP; Fri 19 Apr 85 13:03:40-PST
Date: Fri 19 Apr 85 11:11:53-PST
From: Peter Samson <[email protected]>
Subject: remainder bug
To: [email protected]

The remainder function ("%") in KCC seems to have problems.
The following C program--
	main () {
		int i, j;
		for (i=1; i<11; ++i) {
			printf("\n");
			for (j=0; j<=(3*i); ++j)
				printf("%2d",j%i);
			}
		}
does not compile correctly.  The FAIL code for the inner
"for" loop (the one on j) looks like--
	SETZB	7,0(17)
$5::
	MOVE	12,0(17)	;(room for optimizing here)
	MOVE	14,-1(17)	;i
	IMULI	14,3		;3*i
	CAMLE	12,14		;j<=(3*i)
	JRST	$3
	IDIV	12,-1(17)	;for j%i
	MOVE	4,12		;*** DMOVE WOULD WORK, MOVE DOESN'T ***
	PUSH	17,5		;(PUSH 17,13 would work and save the (D)MOVE)
	XMOVEI	10,$7
	IOR	10,$BYTE+4	;concoct string pointer
	PUSH	17,10
	PUSHJ	17,printf
	ADJSP	17,-2
	AOS	12,0(17)
	JRST	$5		;(room for optimizing here too)

Note that my complaint is with the code that doesn't work, not with minor
lack of optimization.  (As one who was programming the PDP-6 before there
was a PDP-6, I just can't pass up the chance to point out optimizations.)
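
The register convention at issue can be modeled in C.  IDIV leaves the
quotient in the named AC and the remainder in the next one, so j%i is
the second word of the pair.  The struct and function names below are
invented for the illustration.

```c
/* Result of an IDIV: quotient in AC, remainder in AC+1. */
struct acpair {
    int ac;         /* quotient, e.g. AC 12 in the listing above */
    int ac_plus_1;  /* remainder, e.g. AC 13 */
};

struct acpair idiv(int dividend, int divisor)
{
    struct acpair r;

    r.ac = dividend / divisor;
    r.ac_plus_1 = dividend % divisor;
    return r;
}

/* The % operator must take the second register of the pair. */
int remainder_of(int j, int i)
{
    return idiv(j, i).ac_plus_1;
}
```

The listing copies only AC 12 (the quotient) and then pushes AC 5, which
was never loaded; pushing AC 13 directly corresponds to remainder_of().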
-------
20-Apr-85 10:16:29-PST,820;000000000001
Return-Path: <@SU-SCORE.ARPA:[email protected]>
Received: from SU-SCORE.ARPA by SU-SIERRA.ARPA with TCP; Sat 20 Apr 85 10:16:25-PST
Received: from COLUMBIA-20.ARPA by SU-SCORE.ARPA with TCP; Sat 20 Apr 85 10:15:46-PST
Date: 20 Apr 1985  13:15 EST (Sat)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Peter Samson <[email protected]>
Cc:   bug-kcc%SU-SIERRA.ARPA@Score
Subject: remainder bug (third try)

I can't seem to get mail directly to Sierra, probably because replies
come back via the new net 10 address and don't get recognized.
So here's this message for the third time (sorry Peter):

Yeah, I noticed that one a couple of weeks ago and fixed it on Sierra.
Maybe someone should move the latest version across to Score?
22-Apr-85 12:02:24-PST,595;000000000001
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Mon 22 Apr 85 12:02:18-PST
Date: 19 Apr 1985  16:06 EST (Fri)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Peter Samson <[email protected]>
Cc:   [email protected]
Subject: remainder bug
In-reply-to: Msg of 19 Apr 1985  14:11-EST from Peter Samson <G.PRS at SU-SCORE.ARPA>

Yeah, I noticed that one a couple of weeks ago and fixed it on Sierra.
Maybe someone should move the latest version across to Score?
22-Apr-85 12:02:41-PST,592;000000000001
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Mon 22 Apr 85 12:02:33-PST
Date: 20 Apr 1985  13:12 EST (Sat)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Peter Samson <[email protected]>
Cc:   [email protected]
Subject: remainder bug (retry)

This message didn't seem to make it out to the net.  Let me try again...

Yeah, I noticed that one a couple of weeks ago and fixed it on Sierra.
Maybe someone should move the latest version across to Score?
22-Apr-85 18:38:34-PST,1479;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Mon 22 Apr 85 18:38:27-PST
Date: Mon 22 Apr 85 18:37:43-PST
From: Ken Harrenstien <[email protected]>
Subject: Solution to LINK problems
To: [email protected]
cc: [email protected]

I went ahead and took a look at this.  Fortunately it turns out that the
problem has nothing to do with PRARG%.  The problem is that KCC is mapping
LINK into its own address space before starting it, and LINK is being
screwed when it finds data in a place which it thinks should contain zero.
Yes, this is a bug in LINK, or in PA1050.  However, it is trivial to fix
this, by the following pragmatic expedient.  When KCC is setting up the
AC instructions, it should set up this sequence:
	1/ -1
	2/ .FHSLF,,1	= 400000,,1
	3/ PM%CNT,,777	= 400000,,777
	4/ .FHSLF,,<lj>	= 400000,,<jfn to SYS:LINK.EXE>
	5/ PMAP
	6/ ERJMP 7
	7/ MOVE 1,4
	10/ GET
	11/ MOVEI 1,.FHSLF
	12/ GEVEC
	13/ JRST 1(2)
and then do a JRST 5.  Note PM%CNT only works for TOPS-20, not TENEX.
Also, note the instruction in AC 13; I was horrified to find that KCC
currently uses an ADD 2,-4(17).  Ye gods, you have absolutely no
assurance that the stack will still be there after the GET!  If you must
use a variable entry vector offset, put it in an AC half!

I tried this out by hand, and my notorious LINK-breaking program then
loaded without complaint.

Put this in soon, huh?  Pretty please?
-------
22-Apr-85 19:19:30-PST,485;000000000001
Mail-From: WHP4 created at 22-Apr-85 19:19:26
Date: Mon 22 Apr 85 19:19:26-PST
From: Bill Palmer <[email protected]>
Subject: KLH's fix to LINK linkage installed
To: [email protected]

Sierra's sources now have Ken's fix in them, and the CLIB.REL and CC.EXE
files also contain the fix.  Source sites can just snarf <KCC.LIB>PFORK.FAI,
remake CLIB.REL, and relink CC.EXE; alternatively just steal CLIB and CC.

			wearer of much albumin on face,

					Bill
-------
22-Apr-85 20:38:18-PST,1220;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Mon 22 Apr 85 20:38:07-PST
Date: Mon 22 Apr 85 20:37:23-PST
From: Ken Harrenstien <[email protected]>
Subject: New bugs
To: [email protected]
cc: [email protected]

Now that the new KCC is over here, I notice some new bugs.  The following
piece of code produces a reference to $SPUSH which FAIL complains about
as being undefined.

struct gtstuff { char *gt_ptr; int gt_val; int gt_another; };
struct win { int g_foo; struct gtstuff g_cur; struct gtstuff g_oth; } win;

int foo;
main() { xtest(&win, 0); }
xtest(gp, gt)
struct win *gp; struct gtstuff *gt;
{	foo = (gt == gp->g_cur) ? 0 : 1;
}


There is of course some question as to whether the $SPUSH should be invoked
at all.  The VAX seems to compile gp->g_cur as a pointer whereas KCC thinks
it refers to the whole structure.  The old KCC in fact also considers it
a pointer.  Using &gp->g_cur makes things less ambiguous and the code then
works, but the inconsistency bothers me (as well as the attempt to perform
a comparison that KCC has rendered meaningless by its misinterpretation).
So there are probably at least two bugs here.
-------
22-Apr-85 20:56:16-PST,1866;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Mon 22 Apr 85 20:56:10-PST
Date: Mon 22 Apr 85 20:55:27-PST
From: Ken Harrenstien <[email protected]>
Subject: Semi-bug
To: [email protected]
cc: [email protected]

There is a minor problem with some warning messages, to wit:

[PHOTO:  Recording initiated  Mon 22-Apr-85 8:43pm]

@ty test.c
char *ptr;
main()
{       ptr = findp(1,
                2,
                3,4
                )
                ;
}
findp(){}
@cc test
KCC:    test

Warning at main+5, line 7 of test.c:
                ;
Implicit coercion assumed -- int to char pointer.
<KLH>TEST.FAI.8
FAIL:  test
LINK:   Loading
@pop

[PHOTO:  Recording terminated Mon 22-Apr-85 8:44pm]

The problem is that the warning message (which, by the way, is a Good
Thing!) can be misleading since the line pointed out has no apparent
relevance to the cause of the error.  It would be more useful if there
was some way that the line with the function call on it could be
printed instead.  This may not seem like such a big deal when given an
example as obvious as above, but when the arguments to a function are
rather complex (which is usually the reason they span more than one
line!) then you can be misled for quite some time (as I was) into trying to
figure out what is wrong with a line that in fact is perfectly all
right.

Alternatively, instead of trying to remember the right line (which
feels like it might be a tough job), KCC could simply identify the
mis-cast identifier in its warning message, eg

"Implicit coercion assumed - 'findp' - int to char pointer"

This would be a major improvement since then you can zero in on bugs
even in the middle of a big hairy line of code.  Same thing applies to
other types of error messages, of course.
-------
23-Apr-85 01:14:08-PST,1143;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Tue 23 Apr 85 01:14:01-PST
Date: Tue 23 Apr 85 01:13:18-PST
From: Ken Harrenstien <[email protected]>
Subject: Simple optimization
To: [email protected]
cc: [email protected]

One minor problem with doubles is that the code:
	double foo;
	foo = 0;
results in a baroque assemblage of instructions that starts with
a MOVEI 16,0 and goes on to do some random MOVEs, EXCHs, and MOVEMs,
in the middle of which is a call to $DFLOT.  Two points should be
brought up here:
(1) if the assigned value is a constant, the compiler should figure out
the value of the constant and use that, rather than doing it dynamically
every time.  This is really basic stuff.
(2) If the value happens to be zero, just do a SETZB AC,AC+1.  This gets
kind of ridiculous when you have a function which decides to do a
return(0)... SETZB 1,2 would suffice, but no sir, KCC goes whole hog.

Incidentally, $DFLOT is not really that complex, and could just as well
be a piece of inline code (which allows you to flush a lot of overhead).

Onward...
-------
23-Apr-85 11:35:46-PST,3879;000000000011
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Tue 23 Apr 85 11:35:16-PST
Date: 23 Apr 1985  11:44 EST (Tue)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Ken Harrenstien <[email protected]>
Cc:   [email protected]
Subject: New bugs

    Now that the new KCC is over here, I notice some new bugs.  The following
    piece of code produces a reference to $SPUSH which FAIL complains about
    as being undefined.

    struct gtstuff { char *gt_ptr; int gt_val; int gt_another; };
    struct win { int g_foo; struct gtstuff g_cur; struct gtstuff g_oth; } win;

    int foo;
    main() { xtest(&win, 0); }
    xtest(gp, gt)
    struct win *gp; struct gtstuff *gt;
    {	foo = (gt == gp->g_cur) ? 0 : 1;
    }

    There is of course some question as to whether the $SPUSH should be invoked
    at all.  The VAX seems to compile gp->g_cur as a pointer whereas KCC thinks
    it refers to the whole structure.  The old KCC in fact also considers it
    a pointer.  Using &gp->g_cur makes things less ambiguous and the code then
    works, but the inconsistency bothers me (as well as the attempt to perform
    a comparison that KCC has rendered meaningless by its misinterpretation).
    So there are probably at least two bugs here.

What I had done to make structure assignments work right was to always
treat a ref to a structure as the whole structure (and push the thing
onto the stack if bigger than 2 words) so I wouldn't have to have lots
of special cases in the assignment, function call, and function return
code generation (when any of those gets a structure it's now always
pushed; before it was sometimes pushed and sometimes an address and
the code in any two places to figure out which was never the same).

There is supposed to be code in I think cctype.c that will make sure
two operands are compatible and if not do things like coerce a struct
ref into its address.  I don't know why it isn't being done in this
case, maybe I neglected to call it for comparisons.  I agree that this
should either produce correct code with a warning, or no code with an
error, but certainly not the current bogus code silently.

There is another bug here which is why FAIL was barfing: KCC somehow
forgot to declare $SPUSH as an extern.  I don't know why.

    One minor problem with doubles is that the code:
    	double foo;
    	foo = 0;
    results in a baroque assemblage of instructions that starts with
    a MOVEI 16,0 and goes on to do some random MOVEs, EXCHs, and MOVEMs,
    in the middle of which is a call to $DFLOT.  Two points should be
    brought up here:

    (1) if the assigned value is a constant, the compiler should figure out
    the value of the constant and use that, rather than doing it dynamically
    every time.  This is really basic stuff.

I've never managed to figure out how to tell FAIL and MACRO to produce
doubleword floating point constants.  I agree with you that this
should happen.

    (2) If the value happens to be zero, just do a SETZB AC,AC+1.  This gets
    kind of ridiculous when you have a function which decides to do a
    return(0)... SETZB 1,2 would suffice, but no sir, KCC goes whole hog.

Once the first part happens, this will be easier.

    Incidentally, $DFLOT is not really that complex, and could just as well
    be a piece of inline code (which allows you to flush a lot of overhead).

This is in fact on my list of things to be done to KCC once quals and
the other end-of-semester things are over here, and once I've fixed
the more urgent places where correct C becomes incorrect FAIL.
Another is to fix up the code around the runtimes too complicated to make
inline to avoid all those stupid DMOVEs and EXCHs.
23-Apr-85 15:24:55-PST,2105;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Tue 23 Apr 85 15:24:18-PST
Date: Tue 23 Apr 85 15:22:58-PST
From: Ken Harrenstien <[email protected]>
Subject: Re: New bugs
To: [email protected]
cc: [email protected], [email protected]
In-Reply-To: Message from "David Eppstein <[email protected]>" of Tue 23 Apr 85 11:44:00-PST

I think I can help out with one of those problems.  You want to know how
to get floating constants (which must be doubles) into FAIL or MACRO?  Nothing
to it!  Just compute the value of the constant inside KCC and then output
an octal constant, for example
	DMOVE 1,[123456,,32131
		342423,,343200]

Ah, but how to compute the value?  Simple, cheat and use the DFIN
JSYS!  I could not find anyplace in the TOPS-20 doc (even in the monitor
code!) that describes the acceptable input format, but from what I saw
it should handle anything that is legal C syntax -- if screw cases exist
KCC should have no trouble massaging the input string into a more standard
syntax.  DFIN also exists on TENEX, although there is some strange hack
to extend the exponent if over E+32.

There will be some problems using doubles on non-T20 systems not only
because of the lack of DFIN but because KA double-precision format
is different and the KL double floating instructions are missing.  Probably
the most convenient hack for the time being would be to continue allocating
two words for doubles, but do all arithmetic with single-precision
instructions (which even exist on the PDP6 - terrific for nifty spacewar
orbit calculations!), so that the 2nd word is basically ignored on KAs.
This avoids futzing with the argument passing stuff (1 wd vs 2) and still
allows code to work instead of blowing up completely.  At some later time,
if it becomes desirable, someone can endeavor to fix up the KA code
generation so double-precision arithmetic uses the KA double instructions
(like DFN, FADL, etc) instead of cheating with FADR, etc...

Let's all not even think about the new G format.
-------
27-Apr-85 15:48:08-PST,385;000000000011
Return-Path: <[email protected]>
Received: from SU-SCORE.ARPA by SU-SIERRA.ARPA with TCP; Sat 27 Apr 85 15:48:01-PST
Date: Sat 27 Apr 85 15:48:01-PST
From: Len Bosack <[email protected]>
Subject: Register allocation error: unreleased registers left over from previous code.
To: [email protected]

...is elicited by [score]<kcc.ctex1>c2b.c and a few others.
-------
28-Apr-85 04:43:10-PDT,1502;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Sun 28 Apr 85 04:43:06-PDT
Date: Sun 28 Apr 85 04:41:24-PDT
From: Ken Harrenstien <[email protected]>
Subject: New KCC found harmful
To: [email protected]
cc: [email protected]

Sigh.  I have had to roll back to KCC version 43 (again) because
the current version (71) turns out to generate some incorrect code.
I have narrowed it down to the code fragments which I have put in the
test file [SRI-NIC]<KLH>CC-BUG.C as it is a little long for inclusion in
a message  (the code itself is very small, but the declarations of the
structures referenced are lengthy).  Basically KCC is losing track of
what values are in what registers, and computing an index offset twice.
Just stupid luck that this put data in unused locations rather than
corrupting good stuff.  Anyway, I'm not going to test #71 any further.

Also, while this is not relevant to the current bug, I note that
KCC does not complain about pointer references to undeclared structures,
at least if nothing uses the pointer.  For example,
	struct foo { struct bar *ptr; };
will not complain if nothing ever references foo->ptr.  I don't know
enough about KCC internals to know if it would be easy to do a sweep for
undefined pointer types at the end of processing a file, but it would
be a nice touch if KCC could print a warning about them, I think.
(Has to be at the end, so that forward pointer defs will work...)
-------
27-Apr-85 10:02:39-PST,2277;000000000001
Return-Path: <@COLUMBIA-20.ARPA:GINGELL@CWR20B>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Sat 27 Apr 85 10:02:32-PST
Received: from CWR20B by CUCS20 with DECnet; 27 Apr 85 13:02:15 EST
Date: Fri 26 Apr 85 14:15:14-EST
From: Rob Gingell <GINGELL@CWR20B>
Subject: PAUNIX status
To: Satz%SU-Sierra@CUCS20

Been a while, so I thought I'd let you know what I've been up to.

PAUNIX now supports a 4.1bsd-style terminal driver.  Some of the features
are not useful (such as the job control characters) because the supporting
stuff (like extended signals) is not implemented -- however, the hooks are
there so that when such things become available the tty driver should start
working with them right away.  The tty driver performs the CR/LF -> NL
canonicalization as one would expect for UNIX.  It is also efficient, using
as much of the monitor as possible to do echoing and such.

In addition to some simple test programs, it also appears to do the right
thing for the 4.1bsd stty.c program.

PAUNIX now reads directory files.  In particular, pwd.c works under it.
Reading "/" will give you a pseudo-directory of all the mounted structures,
for instance, on our 20B machine reading "/" will return directory
entries for:

	.
	..
	psb
	ps3
	arjcc
	ps1
	ps2

and a lot of empty entries.  Reading "/dev" will give you all the non-disk
devices, so for instance:

	.
	..
	mta0
	mta1
	...
	tcp

Running pwd.c on the directory where I do PAUNIX development yields:

	/ps3/gingell/paunix/1/sources

where the TOPS-20 directory is

	PS3:<GINGELL.PAUNIX.1.SOURCES>

I am working with getting byte-size handling fixed up, as well as getting
fork() and exec() completed.  Uncertain as to how much progress I'll make
on those until the end of next week, as I'll be out of town this weekend
and I just got the next 6.1 FT set so I know what I'll be doing next week.

---

For generating a really good test I'm working on getting the Bourne shell
compiled under KCC.  Did you ever decide what pre-defined symbols should be
available for the pre-processor to indicate TOPS-20?  It'd be nice to be
able to say

	#ifdef TOPS20

That's about it for now.  Hope things are well out with you, take care.

-------
12-May-85 04:24:03-PDT,391;000000000011
Mail-From: WHP4 created at 12-May-85 04:24:00
Date: Sun 12 May 85 04:24:00-PDT
From: Bill Palmer <[email protected]>
Subject: #undef bug
To: [email protected]

Trying to undefine something that isn't defined gets an error.  For example,
this fragment will do the trick:

#undef FOOBAR
main(){}

The 4.2 compiler doesn't seem to get upset with this.

					Bill
-------
25-May-85 11:42:56-PDT,2330;000000000001
Date: Friday, 24 May 1985  18:53-PDT
From: David Fuchs <DRF at SU-SCORE.ARPA>
To:   kronj at SU-SCORE.ARPA
Subject: Sorry, but...

I'm still getting "Register allocation error: unreleased registers..."
when I compile <4SCRATCH>CA.C at Score.  Back to the old drawing board.
	-david

Date: Friday, 24 May 1985  23:32-PDT
From: David Fuchs <DRF at SU-SCORE.ARPA>
To:   kronj at SU-SCORE.ARPA
Subject: more CC problems

The runtimes seem to have a routine called END in them?  Shouldn't it
be _END or something, to keep from conflicting with my program?

Date: Saturday, 25 May 1985  00:10-PDT
From: David Fuchs <DRF at SU-SCORE.ARPA>
To:   kronj at SU-SCORE.ARPA
Subject: No I/O buffer left.

You must be forgetting to release JFNs or something, because when I do
an EXECUTE C1.C, C2.C... C15.C, I eventually get the message "No I/O buffer
left" from the compiler.
	-david

Date: Saturday, 25 May 1985  02:35-PDT
From: David Fuchs <DRF at SU-SCORE.ARPA>
To:   kronj at SU-SCORE.ARPA
cc:   bosack at SU-SCORE.ARPA, drf at SU-SCORE.ARPA
Subject: Bugs!

The following program produces bad code on CC.  I also believe that
CC does not handle source files that end without a CRLF gracefully.
	-david

char Pmode[20];

foo()
	{
	Pflsho(1);
	printf("Done.\n");
	}

Pflsho(f)
int f;
	{
	int l,x;
	if (Pmode[f]!='W') printf("Bad!!!\n");
	}
main()
	{
	Pmode[1]='W';
	foo();
	}

Date: Saturday, 25 May 1985  11:41-PDT
From: David Eppstein <Kronj at SU-SIERRA.ARPA>
To:   David Fuchs <DRF at SU-SCORE.ARPA>
cc:   Bosack@Score
Subject: Sorry, but...

The version of CC on Score was the same one that had given you the
register allocation errors before.  I have put up a (with luck
improved) version now.

END is not really used by KCC, but it is a pseudo-op in both FAIL and
MACRO.  Another argument for KCC generating REL files itself rather
than leaving it to the assembler.  A way around this is to #define
end to be something else (or just use a different symbol).

KCC is supposed to handle multiple files (as in EXECUTE ...) but I
don't think it really does.  What I have been doing when building KCC
itself is to COMPILE the files separately, then LOAD or EXECUTE the
REL files so produced.

Tell me if any of your bugs have gone away or are still around.
25-May-85 12:41:29-PDT,514;000000000011
Date: Saturday, 25 May 1985  11:46-PDT
From: David Fuchs <DRF at SU-SCORE.ARPA>
To:   Kronj at SU-SIERRA.ARPA
Subject: Sorry, but...

OK, thanks.  The bug that causes "Bad!!!" to print out on the following
program still exists.  KCC did handle about a dozen files in a multiple
execute, then it bombed out.	-david


char Pmode[20];

foo()
	{
	Pflsho(1);
	printf("Done.\n");
	}

Pflsho(f)
int f;
	{
	int x;
	if (Pmode[f]!='W') printf("Bad!!!\n");
	}
main()
	{
	Pmode[1]='W';
	foo();
	}
25-May-85 20:48:58-PDT,2360;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Sat 25 May 85 20:48:47-PDT
Date: Sat 25 May 85 20:48:50-PDT
From: Ken Harrenstien <[email protected]>
Subject: News
To: [email protected]

I recently learned about a previously unknown (to me) C compiler for the
PDP-10.  While talking with some ACCENT-R (DBMS) marketing reps, they
said that they had re-written their DBMS in C so as to use the same
software on both PDP-10 (T-10, T-20) systems and VAX/VMS systems.
The package was originally developed on the PDP-10 in Macro.

Anyway, the compiler they used is evidently known as the "Saratoga" C
compiler, written by some professor (named Saratoga?) at Tufts who
then decided to market it.  Anybody else ever heard of it??

I took a look at the binary of ACCENT-R, not to see how it worked (we
have a test version under the usual non-disclosure agreement) but to
see what kind of stuff the compiler put out.  Some things are still
unclear (most of the standard libc routines are not there or are not
used by this program) but it looks like a fairly straightforward
implementation, sort of what you would expect.  That is, ac 17 is the
stack, 16 is a frame pointer, and all others are temporaries, with
function values returned in ac 0 (ugh, KCC's ac 1 is much better).
Args are passed on the stack; routines are responsible for setting
the frame pointer and saving/restoring all registers.  It does PUSHes or
a BLT depending on the # of registers.  Char pointers are byte pointers
which point to the first byte, like KCC.  Constant strings are stored
as 7-bit bytes; I couldn't tell whether a generic (char *) is 7, 8, or 9
bits since I didn't run across any in my brief looksee.  Uses ADJSP,
ADJBP, DMOVx; the code is not set up for extended addressing, although
that may only mean the compiler optimizes for extended or non-extended
depending on a compile-time switch.  Either the compiler has a hell of
an optimizer, or there are still a lot of MACRO-coded routines in there,
because I noticed a fair number of routines that never bothered to
deal with the frame pointer and did other things you wouldn't expect a compiler
to be smart enough to do.  Considering the original ACCENT-R was in MACRO,
I doubt it's the compiler.

Just thought this would be interesting.
-------
25-May-85 22:33:27-PDT,809;000000000005
Date: Saturday, 25 May 1985  19:44-PDT
From: David Fuchs <DRF at SU-SCORE.ARPA>
To:   kronj at SU-SCORE.ARPA
cc:   bosack at SU-SCORE.ARPA
Subject: status

OK, the register problem has sure enough gone away.  I'm still left not
being able to run because of the weird array-reference bug I reported
earlier.

Another problem I've noticed:  My program does a

	char ary[99];
	cnt=read(0,ary,99);

when it wants to get input from the terminal.  The idea is that at
most 99 characters will be read in, but when the user hits <RETURN>,
then "read" fills in as much of the array as it can, and returns the
number of characters it got to be put into "cnt".  I use this on
a number of machines and it works, but it seems that KCC insists
on reading in 99 characters from the terminal...
	-david
28-May-85 16:12:18-PDT,1358;000000000001
Date: Tuesday, 28 May 1985  15:27-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   Kronj at SU-SIERRA.ARPA
cc:   KLH at SRI-NIC.ARPA
Subject: sources still not stable

Since this is going to be a fair amount of work, I'd like to start
with the most recent stuff.  So let me know when they're ready.  The
other problem with the *.*.-2 versions is that if they correspond
to the "current" binary then they contain several bugs that I mentioned
(and which you have recently fixed)... I would much prefer to get the
fixed sources.  On the other hand, I need to get something up during
this week.  Time pressure.

Date: Tuesday, 28 May 1985  16:11-PDT
From: David Eppstein <Kronj at SU-SIERRA.ARPA>
To:   Ken Harrenstien <KLH at SRI-NIC.ARPA>
Subject: sources still not stable

Well I think I am about ready to have a stable version...we will see
after this next compile.  Actually the sources to the current
SYS:CC.EXE are in <KCC.ATBAT> as I had to recompile the "old" CC with
some fixes to enums to be able to compile my newer sources.
I can't think of any important bugs that were in the "current" CC
which aren't still with me in the new one.

In any case, you might as well at least take something to use in
working out your modified runtimes, and then maybe later upgrade it to
a newer and with luck more bugfree version.
28-May-85 17:40:42-PDT,268;000000000001
Date: Tue 28 May 85 17:40:42-PDT
From: David Eppstein <[email protected]>
To: [email protected]
Subject: bad bug
In-Reply-To: Message from "David Fuchs <DRF at SU-SCORE.ARPA>" of Sat 25 May 85 12:41:29-PDT

Ok, the "Bad!!!" bug should now be fixed.
-------
28-May-85 17:42:44-PDT,255;000000000001
Date: Tue 28 May 85 17:42:44-PDT
From: David Eppstein <[email protected]>
Subject: sources stable again
To: [email protected]

that is, until i start working on them again tomorrow.
a few bugs have been fixed between then and now, also.
-------
29-May-85 10:06:11-PDT,369;000000000001
Date: Wednesday, 29 May 1985  05:36-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   Kronj at SU-SIERRA.ARPA
cc:   KLH at SRI-NIC.ARPA
Subject: sources stable again

OK, as of this timestamp I've snarfed <KCC.CC> plus SYS:CC.EXE and
C:CLIB.REL.  (already have the .H stuff).  Will work on that, and let
you know how it goes (probably take a couple days).
30-May-85 13:34:51-PDT,1528;000000000005
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Thu 30 May 85 13:31:01-PDT
Date: Thu 30 May 85 01:52:02-PDT
From: Ken Harrenstien <[email protected]>
Subject: Unflavorful feature
To: [email protected]
cc: [email protected]

I don't like the following "feature" of KCC which I just discovered
when testing the latest version.

A command string of the form
	CC FOO
produces a FOO.REL and loads it, and makes LINK produce a SAVE file
called FOO.EXE.  Okay, this is usually what you want, but the FOO.REL
file is still there, and if I try loading it by hand (in order to link with
a different library for example) I find that apparently there is something
funny about the .REL file -- LINK automatically produces an .EXE without
telling me.  This makes for large amounts of confusion.  If I try
recompiling with -c I get a .REL file that does not exhibit this behavior.
If I recompile with -S I get a .FAI file which, when assembled, is identical
to the .REL file of -c.  The .REL file produced without switches is
DIFFERENT from these latter versions!!!

I don't know what arcane magic is being injected into the .FAI file to
make this happen, but I VERY STRONGLY DISAGREE with this method.  It
should be possible to achieve the same thing in a more logical way, by
furnishing a /SAVE argument to the invocation of LINK.  The user will notice
no difference in behavior, and the .REL files will be consistent, without
including any invisible bombs.
-------
30-May-85 15:49:34-PDT,983;000000000001
Return-Path: <jld@sri-unix>
Received: from sri-unix.ARPA by SU-SIERRA.ARPA with TCP; Thu 30 May 85 15:44:39-PDT
Received: by sri-unix.ARPA (4.12/4.16)
	id AA06557; Thu, 30 May 85 15:44:07 pdt
Message-Id: <[email protected]>
Date: Thu May 30 14:58:50 1985
From: jld@sri-unix
To: bug-kcc@su-sierra
Cc: jld@sri-unix
Subject: Flavorful feature

	I VERY STRONGLY DISAGREE :-) with Ken Harrenstien's complaint
about the new save feature in KCC.  On all the other systems I've used
(about 10), either the link step creates an executable file or it
automatically branches to the new executable in memory.  I originally
found "nosave" to be quite baffling -- the program has compiled
correctly, but the executable has vanished without a trace!  Automatic
"saving" seems much more normal to me.

	Of course, "nosave" is useful, so a NOSAVE option should be
provided.  While I'm on the subject of options, how about a -D option
(e.g., -DTOPS20)?

Jim Dein
30-May-85 17:00:11-PDT,759;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Thu 30 May 85 16:58:37-PDT
Date: Thu 30 May 85 16:57:10-PDT
From: Ken Harrenstien <[email protected]>
Subject: Re: Flavorful feature
To: [email protected], [email protected]
cc: [email protected]
In-Reply-To: Message from "jld@sri-unix" of Thu 30 May 85 15:55:02-PDT

Note, I am not objecting to having KCC produce an .EXE file -- in fact
I think that is a good idea.  What I object to is the method -- by
inserting some bogosity in the .REL file which causes any attempt to
LOAD that .REL to produce an .EXE whether or not that is what is
desired.  KCC should produce a normal .REL and then add a /SAVE switch
to its LINK invocation.  Got it?
-------
30-May-85 17:34:40-PDT,775;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Thu 30 May 85 17:30:28-PDT
Date: Thu 30 May 85 17:29:12-PDT
From: Ken Harrenstien <[email protected]>
Subject: Brain-damaged TENEX flushage
To: [email protected]

I note with disgust that someone removed all the TENEX conditionals
from the TOPS20.FAI runtime support which I had so carefully written a
couple years ago.  This means that I now have to do it all over again
in order to port the latest version back to our F4.  Gee, thanks a
bunch.  This was supposed to be a PDP-10 C compiler, not a
DEC-20+TOPS-20 C compiler.  Will future hackers please refrain from
the temptation to flush all references to differently colored
operating systems.  Thank you.
-------
30-May-85 18:20:47-PDT,346;000000000001
Mail-From: KRONJ created at 30-May-85 18:20:22
Date: Thu 30 May 85 18:20:22-PDT
From: David Eppstein <[email protected]>
Subject: /SAVE
To: [email protected]
cc: [email protected]

Would it be ok to simply delete the REL file?  That would be more consistent
with its other behavior and with the UNIX version in any case.
-------
30-May-85 18:24:23-PDT,1089;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Thu 30 May 85 18:24:01-PDT
Date: Thu 30 May 85 18:22:47-PDT
From: Ken Harrenstien <[email protected]>
Subject: OS conditionals
To: [email protected]

A while back, Greg Satz asked what kinds of names the compiler pre-processor
should have predefined, like "tops20" etc.  I'd like to revive the subject
because right now I have to decide what to use for the following:
	(1) Code for TOPS-20 only
	(2) Code for TENEX only
	(3) Code for either TOPS-20 or TENEX

Mainly I'm trying to decide on a name for (3), since the bulk of the
runtime stuff falls into that category and the other two names are
obviously TOPS20 and TENEX.  In assembler programs I've used TNX (as
opposed to T20 and 10X) but that isn't very clear.  Some possibilities
are TEN20, TOPSEX (!), or plain TENEX_TOPS20 or TOPS20_TENEX.  Sigh, I
guess I'll be conservative and use the latter.

No, I'm not going to install anything in the preprocessor, but would
like to encourage some thinking about it.
-------
30-May-85 18:38:00-PDT,1396;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Thu 30 May 85 18:34:24-PDT
Date: Thu 30 May 85 18:33:09-PDT
From: Ken Harrenstien <[email protected]>
Subject: Re: /SAVE
To: [email protected]
cc: [email protected], [email protected]
In-Reply-To: Message from "David Eppstein <[email protected]>" of Thu 30 May 85 18:19:55-PDT

That's a possibility.  However, UNIX does not necessarily flush the .o
version.  If there is more than one thing on the line, as in
	cc foo.c bar.c
then foo and bar are both linked together into a.out and both
foo.o and bar.o are left lying around.  This actually has been handy
in my experience, especially for modules that take a long time to
compile.  KCC could not do this and still stick a /SAVE request into one
or more of these .REL files -- well it could but it would be just as
wrong as its current behavior.

What I would suggest, then, is:
	Always produce .REL files in the same way.  No funny switches.
	Specify /SAVE to LINK (not to the .REL) if we are trying to make
		an .EXE.
	If there was only one module specified, then delete the .REL
		if the loading has no problems.  (I'm not sure if it is
		possible to detect whether there were problems.)  How
		about deleting it normally (no expunge), so that the user
		can undelete the file if problems require inspection?
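
The suggestion above can be sketched in C as two small decisions.  This is
only an illustration: the struct and function names are hypothetical and
none of them come from KCC itself.

```c
#include <stdbool.h>

/* Hypothetical summary of one CC invocation (names are illustrative). */
struct cc_run {
    int  module_count;  /* number of source modules on the command line */
    bool making_exe;    /* true when an .EXE is the requested result */
    bool link_ok;       /* true when loading reported no problems */
};

/* /SAVE is given to LINK itself, never written into a .REL file. */
bool pass_save_to_link(const struct cc_run *run)
{
    return run->making_exe;
}

/* Delete (without expunging) the .REL only after a clean single-module load. */
bool delete_rel_after_link(const struct cc_run *run)
{
    return run->making_exe && run->module_count == 1 && run->link_ok;
}
```

With two or more modules the .REL files stay around, matching the
"cc foo.c bar.c" behavior described earlier in the message.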
-------
30-May-85 20:28:45-PDT,832;000000000005
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Thu 30 May 85 20:28:28-PDT
Date: Thu 30 May 85 20:27:17-PDT
From: Ken Harrenstien <[email protected]>
Subject: Yet another quirk - environment vars
To: [email protected]
cc: [email protected]

(Don't worry, this is the last one for today)

While checking out the C library for non-TENEX code, I ran into getenv()
which turns out to use TOPS-20 logical names as environment variables.
Is this really what we want to do?  I'm worried about conflicts between
standard UNIX env vars and TOPS-20 file search paths.  In any case, I
will need to provide another mechanism for TENEX as logical names are
not implemented there.  If anyone has thought about this before and
arrived at some alternative scheme, let's hear it.
-------
30-May-85 20:46:33-PDT,1131;000000000001
Return-Path: <[email protected]>
Received: from RUTGERS.ARPA by SU-SIERRA.ARPA with TCP; Thu 30 May 85 20:46:08-PDT
Date: 30 May 85 23:46:05 EDT
From: Mel <[email protected]>
Subject: Re: /SAVE
To: [email protected]
cc: [email protected], [email protected]
In-Reply-To: Message from "Ken Harrenstien <[email protected]>" of 30 May 85 21:33:09 EDT
Uucp: ...{seismo, allegra, ihnp4!packard}!topaz!pleasant
Work: Hill-28, PO Box 879, Rutgers U, Piscataway, NJ (201) 932-2287
Home: 81 Gage Road, East Brunswick, NJ 08816, (201) 828-8252

No, no... please don't change the behavior of TOPS-20 just because
you're writing a C compiler.  If you were emulating the entire Unix
environment that would be one thing.  This type of stuff only tends to
confuse users.  Please, if I'm running on TOPS-20, give me TOPS-20.  If
I'm running on Unix then give me Unix.  If you're going to write a
complete Unix emulator then give me a complete emulator.  But don't give
me something half-assed :-).  We hackers :-) can deal with this but the
unsophisticated user is only going to get confused.....

-Mel
-------
30-May-85 22:34:12-PDT,518;000000000005
Date: Thursday, 30 May 1985  18:54-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   kronj at SU-SIERRA.ARPA
cc:   klh at SRI-NIC.ARPA
Subject: Mods to files

I am munging TOPS20.FAI, RUNTM.C, and RUNTM.T during my F4 port.
When I am reasonably sure they are stabilized I will let you know,
so you can pull them back over.  Fortunately I don't think there are
actually very many things that need changing, since the compiler/library
is pretty modular.  I was surprised how quickly it all went together!
30-May-85 22:39:01-PDT,1350;000000000001
Mail-From: KRONJ created at 30-May-85 22:38:52
Date: 30 May 1985  22:38 PDT (Thu)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Mel <[email protected]>
Cc:   [email protected]
Subject: /SAVE
In-reply-to: Msg of 30 May 1985  20:46-PDT from Mel <Pleasant at RUTGERS.ARPA>

    Date: Thursday, 30 May 1985  20:46-PDT
    From: Mel <Pleasant at RUTGERS.ARPA>

    No, no... please don't change the behavior of TOPS-20 just because
    you're writing a C compiler.  If you were emulating the entire Unix
    environment that would be one thing.  This type of stuff only tends to
    confuse users.  Please, if I'm running on TOPS-20, give me TOPS-20.  If
    I'm running on Unix then give me Unix.  If you're going to write a
    complete Unix emulator then give me a complete emulator.  But don't give
    me something half-assed :-).  We hackers :-) can deal with this but the
    unsophisticated user is only going to get confused.....

If you want TOPS-20 you can use the COMPILE command and it should work
exactly like for other languages.  The idea is to also make running CC
directly look like running cc on a UNIX machine.  Since that's
something TOPS-20 users normally don't do, there's no conflict with
how it would be expected to behave under TOPS-20.
30-May-85 22:43:53-PDT,1939;000000000001
Date: Thursday, 30 May 1985  19:13-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   kronj at SU-SIERRA.ARPA
cc:   klh at SRI-NIC.ARPA
Subject: KCC switch initializations, and portability enhancements

Hmm, how would you suggest dealing with this problem.  I would like
"noadjsp" and "noadjbp" to default ON if the compiler is running on a
machine that doesn't support them.  It would become a pain to always
use -ab for every compile done on the F4.  Some possibilities:

(1) Initialize them to a preprocessor var value which is defined in
	a site-dependent configuration file.  Generalize this to
	allow specifying default value for any switch.
(2) Surround the init code in CC.C with #ifdef's corresponding to system
	types.  Quick, dirty, and (I think) wrong.
(3) Provide for some run-time checking.  Runtime code to determine system
	and machine at init time and set appropriate vars/flags.
	Goal is to reduce system/mach #ifdefs to absolute minimum.
	This could be part of runtime support code so that any C program
	could use it, not just the compiler.  Code to do machine/system
	identification already exists.
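
As a rough illustration of possibility (1), the switch defaults could come
from macros that a site-dependent file sets before the compiler's own
initialization code is compiled.  The macro, struct, and function names
below are assumptions made for this sketch, not KCC's.

```c
/* Hypothetical site configuration.  A real one would live in a separate
 * site-dependent header; the macro names are illustrative only. */
#ifndef DEFAULT_NOADJSP
#define DEFAULT_NOADJSP 1   /* e.g. a machine that lacks ADJSP */
#endif
#ifndef DEFAULT_NOADJBP
#define DEFAULT_NOADJBP 1   /* e.g. a machine that lacks ADJBP */
#endif

/* Hypothetical switch block; KCC's real variables may differ. */
struct switches {
    int noadjsp;
    int noadjbp;
};

/* Start every compile from the site defaults; command-line switches
 * would then override individual fields. */
void init_switches(struct switches *sw)
{
    sw->noadjsp = DEFAULT_NOADJSP;
    sw->noadjbp = DEFAULT_NOADJBP;
}
```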

Of course, this only applies during the common case where the target
machine is the same as the compiling machine.  Doing cross-compiles
will (should) still require explicit switch setting to determine what
gets output from the compilation.  In this respect there is one
problem I noticed during a brief scan -- in CCDATA.C there are a
number of "Cross-compilation" definitions that I think should not
exist.  Can we flush them?  Use a couple new switches?  You should not
have to generate a new compiler just to compile code for another
machine.  Well, at least I don't think this should be needed for KCC;
the -a and -b switches go far towards making this unnecessary.  Now if
we wanted to cross-compile for a 68000 then I'd grant a new compiler
needs to be built!

Thoughts?
30-May-85 22:58:33-PDT,1674;000000000001
Return-Path: <jld@sri-unix>
Received: from sri-unix.ARPA by SU-SIERRA.ARPA with TCP; Thu 30 May 85 22:57:23-PDT
Received: by sri-unix.ARPA (4.12/4.16)
	id AA10225; Thu, 30 May 85 22:56:52 pdt
Message-Id: <[email protected]>
Date: Thu May 30 20:58:09 1985
From: jld@sri-unix
To: bug-kcc@su-sierra
Cc: jld@sri-unix
Subject: kcc environ

	I have been writing C source to run on both TOPS20 and 4.2
UNIX, and would like to see as much convergence as possible to minimize
the amount of conditionalizing.  Of course, we can't escape the fact
that in some ways TOPS20 is very un-UNIX-like.

	When a UNIX process modifies an environ var, its parent does
not see the change.  TOPS20 logical names, however, are seen by all
processes in the job at once.  To maximize both compatibility and
utility, I propose the following:

1.  There are two logical KCC environments: a TOPS20 one in system
space, and a UNIX one in user space.

2.  The UNIX env is passed to the child process at the point of an
exec in usual UNIX fashion (the method depends on which execxx).

3.  The TOPS20 environ can be read only via getenv().  To avoid name
collision, TOPS20 logical names are available in lower case translation.
When asked for USER, HOME, TERM, and PATH, getenv() looks up the
appropriate TOPS20 names instead.

4.  Getenv() searches first the UNIX space, then the TOPS20 space.
Thus one could make the result of a getenv("HOME") different in a
subprocess.

5.  To change logical names, the setlnam() runtime routine is provided
to perform the necessary system call.

6.  Some appropriate substitute is supplied for TENEX.
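
A minimal C sketch of points 3 and 4, assuming both spaces can be modelled
as simple name/value tables.  The table type and the function name are
hypothetical, and a real version would query logical names with a system
call.  The special handling of USER, HOME, TERM, and PATH is left out
because the corresponding TOPS20 names are not given here.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name/value table standing in for either environment. */
struct env_pair {
    const char *name;
    const char *value;
};

static const char *lookup(const struct env_pair *t, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(t[i].name, name) == 0)
            return t[i].value;
    return NULL;
}

/* Search the UNIX space first, then the TOPS20 space.  Only lower-case
 * names reach the logical-name table, where they are matched in upper case. */
const char *sketch_getenv(const struct env_pair *unix_env, size_t n_unix,
                          const struct env_pair *logicals, size_t n_log,
                          const char *name)
{
    char upper[64];
    size_t len = strlen(name);
    const char *v = lookup(unix_env, n_unix, name);

    if (v != NULL)
        return v;
    if (len == 0 || len >= sizeof upper)
        return NULL;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)name[i];
        if (isupper(c))
            return NULL;            /* upper-case names stay UNIX-only */
        upper[i] = (char)toupper(c);
    }
    upper[len] = '\0';
    return lookup(logicals, n_log, upper);
}
```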

Jim Dein
30-May-85 23:00:03-PDT,1782;000000000001
Date: Thursday, 30 May 1985  22:58-PDT
From: David Eppstein <Kronj at SU-SIERRA.ARPA>
To:   Ken Harrenstien <KLH at SRI-NIC.ARPA>
Subject: KCC switch initializations, and portability enhancements

What I want to do is instead of the standard assembly preamble
hardwired into KCC, have KCC include a preamble from a file in C:.
Then on an F4 the file could define ADJSP and ADJBP to be an ADD and a
call to $ADJBP respectively.  I guess I would have to tell the
peepholer that ADJBP can be more than one instruction but no big deal.
The -a, -b, and -m flags could be dumped and replaced by one flag to
control which header file is included.  This also has the advantage
that it can easily be made to use PSECTs without complicated casing in
the compiler for whether FAIL or MACRO is being used.

In the meantime I guess you should default the switches on for your F4,
or do something more complicated if you wish.  Feel free to make the
assembly conditionals more rational (or better nonexistent).  I will
be continuing to hack the compiler (but not the runtimes) but it isn't
likely to be much hassle merging whatever changes you make back.

Re: getenv().  It was done with logicals because that's the way the
Utah PCC did it.  The other advantage of doing it this way is things
like TERM can be set globally on login and everybody gets them.  The
disadvantages are they are job wide rather than per fork/inferiors,
they are ugly, the values must look like file name lists, and they may
conflict with real logicals.  If you can find a better way of passing
them around please tell me.  Something similar to think about would be
a better way (than RSCAN) to pass arguments in exec() at least when
the program being executed is known to be a KCC product.
31-May-85 10:02:15-PDT,2997;000000000001
Date: Friday, 31 May 1985  03:06-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   Kronj at SU-SIERRA.ARPA
cc:   KLH at SRI-NIC.ARPA
Subject: KCC switch initializations, and portability enhancements

Hmmm, I think your basic idea is real good, since that would replace
a whole bunch of special switches and conditionals with a single
switch that points to a large amount of arbitrary setup.  I'm not
sure you can do it all just with an assembler preamble, though;
I think KCC has to understand the contents of this file to some
extent.  That is, it should be in a KCC-specific format, and the
assembler-preamble would merely be one datum of several in the file.
The specification of how many locations an "instruction" takes up is
one example of a datum that KCC needs to know.  Besides hacking
ADJBP and ADJSP, this also allows hacking DMOVx which is the only
other non-portable instruction I have seen used.  Oh yes, another
advantage to KCC's grokking it is that you can use this to define a
bunch of preprocessor switches (like "unix", "tops20", etc) without
needing to use -D in the command line (yet another hairy switch to
remember).  This would make it REAL easy to do cross-compiles,
set up for FAIL or MACRO (or even MIDAS!), etc.

I have vague recollections that there may be some reason why it
is better to use SUB than ADD if the ADJSP has a negative E, but
I won't swear to it.  In any case, ADJSP should never be seen with
anything but a constant E, so a macro can always do the right thing.
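
One plausible reason for preferring SUB, shown as a small C model rather
than as a statement about any particular machine: if the stack pointer is a
36-bit word with a count in the left half and an address in the right half,
and ADJSP adjusts each half separately, then a full-word ADD of [-n,,-n]
carries out of the right half and leaves the left half off by one, while a
SUB of [n,,n] does not.  The helper names are invented for this sketch.

```c
#include <stdint.h>

#define HALF_MASK 0777777ULL         /* 18 bits */
#define WORD_MASK 0777777777777ULL   /* 36 bits */

/* Build a 36-bit word from two halves (each truncated to 18 bits). */
uint64_t word(int64_t lh, int64_t rh)
{
    return (((uint64_t)lh & HALF_MASK) << 18) | ((uint64_t)rh & HALF_MASK);
}

/* Reference behavior assumed here: each half is adjusted independently. */
uint64_t adjsp(uint64_t ac, int e)
{
    return word((int64_t)(ac >> 18) + e, (int64_t)(ac & HALF_MASK) + e);
}

/* Ordinary full-word arithmetic, as a macro expansion would have to use. */
uint64_t add_word(uint64_t ac, uint64_t w) { return (ac + w) & WORD_MASK; }
uint64_t sub_word(uint64_t ac, uint64_t w) { return (ac - w) & WORD_MASK; }
```

For a pointer word(0777000, 01000), sub_word(sp, word(3, 3)) agrees with
adjsp(sp, -3), but add_word(sp, word(-3, -3)) does not.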

Re getenv().  I agree that a "real" solution to this will encompass
both environment and argument vars.  RSCAN is definitely not the best
way to set things up, although for non-KCC programs it's pretty good.
PRARG would seem much better for KCC programs, although I feel
constrained to point out that it, like RSCAN, isn't available on TENEX.
There seem to be only two general solutions that avoid depending on OS
funnies: munging the address space of the new process before starting
it, or passing on a pointer to some mutually accessible place (eg a
sharable file).
	The former is what UNIX does, in effect, by setting
up the stack space.  But this only works because everything on UNIX
knows about this.  Not true for our environment!  It would certainly
be fast, though.
	The latter approach has the advantage of preserving the
environment across layers of non-KCC processes, and doesn't require
crossing your fingers and hoping the new process isn't being clobbered
by the munging, but the file has to be structured carefully to
preserve per-process variable associations.  I can see some advantages
and disadvantages to having one file for all processes in a job, or
one file for each process; speed, robustness, etc.  Basically it seems
like a do-able challenge.  I will continue to think about it.

Question of the day: is there any way one can tell if a particular
.EXE is a KCC product?  Seems like a good idea, but how?
31-May-85 10:02:32-PDT,1015;000000000001
Date: Friday, 31 May 1985  04:33-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   Kronj at SU-SIERRA.ARPA
cc:   KLH at SRI-NIC.ARPA
Subject: KCC switch initializations, and portability enhancements

Oh!  I just thought of another thing the setup/preamble file stuff can
do.  It could say whether the code should run non-extended, extended,
or both.  Non-KLs will never have any need for the extended hair, and
doing without it removes a needless handicap when comparing KCC with
other compilers.  (I think extendedness is a vital feature, I just
want to beat the others at their own game too!)  This would be more
straightforward and efficient than the complicated load-time hacks I
suggested before.

Incidentally, I consider KCC so fast that I think you could pack a lot
more processing into it without noticeably slowing it down, which is
one reason I feel free to suggest new frobs.  I don't know if it's the
program design, or just the PDP-10 showing its muscles, but it sure is
nice.
31-May-85 10:06:31-PDT,2694;000000000001
Date: Friday, 31 May 1985  05:15-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   kronj at SU-SIERRA.ARPA
cc:   klh at SRI-NIC.ARPA
Subject: Register variables

Throwing in another topic is probably a bad idea but as long as we are
exchanging lots of mail I thought I might as well ask about plans for
handling register variables.  I recall you hinted at improving this
during the summer.  While I don't really know what I am talking about,
not having examined the relevant KCC code or studied the subject in a
compiler course or anything, I find it interesting both because I'm
one of those instruction-efficiency nuts (as you well know by now) and
because this, to my knowledge, is the only remaining area where KCC
might possibly be regarded as inferior to other C-10 compilers.

I assume one straightforward step would be to identify loop
constructions that don't include external-to-loop gotos or procedure
calls, and then use registers for any local vars in the loop that don't
need their address taken (as in &var).

Paying attention to "register" declarations in KCC would imply either
using a traditional save/restore in the called routine of all ACs that
might possibly be declared for use by the caller, OR (if we follow
what appears to be the KCC philosophy) saving/restoring them across
every subroutine call in that procedure.  At least they already have a
place for them on the stack, so (D)MOVE(M)s would work instead of the
slower PUSH/POPs, which is a good thing.
	I find this inversion one of the most interesting and potentially
winning (jury still seems to be out) features of KCC.  This puts the
burden of saving on the caller rather than the callee which overall is
probably better, since the small subroutines at the bottom of the
hierarchy, which call nothing else but are called the most often, then
have no saving/restoring overhead.  If the compiler is real clever it
might even be able to figure out when to postpone the restore after
the call returns (cuz another proc is called "too soon").  Maybe there
should be some ratio of references to calls which determines whether
it makes sense to use registers or not.  This can actually be computed
to yield a true optimization (assuming straight in-line execution of
the references) that balances reg+save/restore expected time versus
stack-only expected time.  Vars declared "register" would have their
balance point artificially shifted farther to the reg decision as
compared with normal locals.
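
The balance described above might be computed roughly as follows.  This is
a sketch only: the cost figures are placeholders to be measured, and the
struct, function, and "bias" parameter are hypothetical names.

```c
/* Per-operation costs in arbitrary time units (assumed, not measured). */
struct reg_costs {
    double mem_ref;       /* one reference to a stack-resident variable */
    double reg_ref;       /* one reference to a register-resident variable */
    double save_restore;  /* saving and restoring one register around a call */
};

/* Return nonzero when keeping the variable in a register is expected to win.
 * bias is 1.0 for ordinary locals; a value above 1.0 shifts the balance
 * toward registers for variables declared "register". */
int prefer_register(const struct reg_costs *c,
                    unsigned refs, unsigned calls, double bias)
{
    double stack_time = refs * c->mem_ref;
    double reg_time = refs * c->reg_ref + calls * c->save_restore;
    return reg_time < stack_time * bias;
}
```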

Oh well, I am just daydreaming.  I should stop before I get into
really radical schemes for arm-twisting LINK into doing the work based
on reg-use external symbols!
31-May-85 10:07:12-PDT,677;000000000001
Date: Friday, 31 May 1985  10:06-PDT
From: David Eppstein <Kronj at SU-SIERRA.ARPA>
To:   Ken Harrenstien <KLH at SRI-NIC.ARPA>
Subject: Register variables

    Date: Friday, 31 May 1985  05:15-PDT
    From: Ken Harrenstien <KLH at SRI-NIC.ARPA>

					If the compiler is real clever it
    might even be able to figure out when to postpone the restore after
    the call returns (cuz another proc is called "too soon").

This already happens.  When a register is saved on the stack it
doesn't get taken back off until somebody wants to use it.
I don't currently have any real plans to do register variables but it
is fun to daydream about, and maybe someday...
31-May-85 14:01:24-PDT,342;000000000001
Date: Friday, 31 May 1985  12:17-PDT
From: Greg Satz <SATZ at SU-SIERRA.ARPA>
To:   KLH at SRI-NIC.ARPA
cc:   kronj at SU-SIERRA.ARPA
Subject: IOCTL.C trashed?

That file wasn't trashed, just not finished. I started writing it
but stopped when I found out about Gingell's intentions. It didn't
seem worth it to duplicate the effort.
31-May-85 14:59:12-PDT,1142;000000000001
Return-Path: <fortune!redwood!rpw3@sri-unix>
Received: from sri-unix.ARPA by SU-SIERRA.ARPA with TCP; Fri 31 May 85 14:58:53-PDT
Received: by sri-unix.ARPA (4.12/4.16)
	id AA19038; Fri, 31 May 85 14:57:52 pdt
Received: by fortune.UUCP id AA04170; Fri, 31 May 85 14:56:47 pdt
Date: Fri, 31 May 85 14:45:41 PDT
Message-Id: <[email protected]>
From: fortune!redwood!rpw3@sri-unix (Rob Warnock)
To: [email protected]
Cc: [email protected]
Subject: Re: OS conditionals
In-Reply-To: Your message of Thu 30 May 85 18:22:47-PDT
References: <[email protected]>

+---------------
| ...Mainly I'm trying to decide on a name for [either TOPS-20 or TENEX],
| since the bulk of the runtime stuff falls into that category and the
| other two names are obviously TOPS20 and TENEX.  ...but would like to
| encourage some thinking about it.
+---------------

What about the venerable "TWENEX"???    ;-}


Rob Warnock
Systems Architecture Consultant

UUCP:	{ihnp4,ucbvax!dual}!fortune!redwood!rpw3
DDD:	(415)572-2607
USPS:	510 Trinidad Lane, Foster City, CA  94404

31-May-85 17:54:25-PDT,552;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Fri 31 May 85 17:51:33-PDT
Date: Fri 31 May 85 17:50:18-PDT
From: Ken Harrenstien <[email protected]>
Subject: Re: OS conditionals
To: [email protected]
cc: [email protected]
In-Reply-To: Message from "fortune!redwood!rpw3@sri-unix (Rob Warnock)" of Fri 31 May 85 15:00:00-PDT

Unfortunately, in my experience TWENEX has normally been considered a
synonym for TOPS-20 rather than referring to either T20 or 10X.  Shucks...
-------
 1-Jun-85 09:19:55-PDT,339;000000000001
Date: Sat 1 Jun 85 09:19:55-PDT
From: David Eppstein <[email protected]>
Subject: Tenex and pipes
To: [email protected]

Does Tenex have PTYs?  I had been thinking about making the pipe support
use them if PIP: didn't work, and I guess you would need to do something
like that for Tenex (which I assume never has PIP:).
-------
 1-Jun-85 14:22:47-PDT,980;000000000001
Date: Saturday, 1 June 1985  13:05-PDT
From: David Eppstein <Kronj at SU-SIERRA.ARPA>
To:   Bosack at Score
Subject: thought for PDP-10 opcode hack

Next time you're building a PDP-10, you might want to think of this:
The address field of a POPJ instruction is currently useless, and in
general left zero.  How about making POPJ do an ADJSP of that value
before what it normally does.  Then the usual POPJ 17, would do the
same as it always did but the combination of ADJSP 17,-n and POPJ 17,
often found in KCC output could be compacted into POPJ 17,-n.  I guess
a positive value would be pretty useless but that's better than all
but one value being useless.

The other possible thing to do with the address of a POPJ is use it as
a skip count, i.e. RETSKP would become POPJ 17,1.  I suppose both of
these meanings could exist concurrently (one for negative numbers, one
for positive, both meaning the same as now for zero) but that's
getting a little hairy.
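
A toy C model of the combined proposal, only to make the semantics
concrete.  It is not a description of any real PDP-10: the stack is a plain
array, sp is the index of the top word, and all names are made up.

```c
/* Toy machine: sp indexes the top of stack, -1 means empty. */
struct machine {
    int stack[64];
    int sp;
    int pc;
};

/* Proposed POPJ with a meaningful address field e:
 *   e < 0  : first discard -e words (the ADJSP 17,-n case)
 *   e == 0 : today's POPJ
 *   e > 0  : return, then skip e instructions (RETSKP is e == 1) */
void popj(struct machine *m, int e)
{
    if (e < 0)
        m->sp += e;               /* drop locals above the return address */
    m->pc = m->stack[m->sp--];    /* ordinary POPJ: pop and jump */
    if (e > 0)
        m->pc += e;               /* skip return */
}
```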
 1-Jun-85 14:23:26-PDT,1297;000000000001
Date: Saturday, 1 June 1985  14:22-PDT
From: Bill Palmer <whp4 at SU-SIERRA.ARPA>
To:   KLH at SRI-NIC.ARPA
cc:   kronj at SU-SIERRA.ARPA, satz at SU-SIERRA.ARPA
Subject: KCC -> TENEX hassles

A few thoughts:

The reason that I did getenv() with logical names was that a) pcc did it
that way and b) I didn't/don't anticipate having a shell around any time
soon to do it any differently (e.g. a pointer to an array or whatever it
is that unix usually does).  I didn't know that TENEX doesn't have logical
names.  There are lots of things I don't know about TENEX; not too surprising
since I have never used it.

I thought out what needed to be done with .TMP files, at least for TOPS20,
when I originally did the LINK stuff because I discovered that running out
of PRARG block space was one of the things that made it impossible to just
build the compiler by doing something like "LOAD @CC.CMD".  I haven't really
had a chance to sit down and do it; maybe when June 12 rolls around I'll
get going on it if someone doesn't beat me to it (don't all jump at once).
I can attempt to pass on all of my knowledge about this cruft to anyone who
wants to try to make it work under TENEX - or if someone can give me access
to such a machine I could give it a go myself.

					Bill
 1-Jun-85 14:23:51-PDT,2801;000000000005
Date: Saturday, 1 June 1985  04:37-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   kronj at SU-SIERRA.ARPA, whp4 at SU-SIERRA.ARPA,
      satz at SU-SIERRA.ARPA
cc:   klh at SRI-NIC.ARPA
Subject: KCC -> TENEX hassles

(this is addressed just to those people who seem to be doing most
of the hacking so far)

Well, I think the only real problem I will have with bringing up the
new KCC on TENEX again is the CCASMB.C module.  The runtime library is
a different story.  Here is what I've done so far:
	I've defined a global runtime flag, "_tenexf", which is set
non-zero by the startup code when it detects it is running on a TENEX
system.  All CLIB routines check this flag and do the right things at run
time.  There are still some routines I have not been able to completely
convert because no equivalent exists (eg GETENV since TENEX does not
have logical names); however, I expect that even these will succumb to
simulations written as the need arises.  The only CLIB routine that relies
on compile/assembly time system determination is TOPS20.FAI, and I may
make this runtime-dependent too (not finished with it yet).
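
In outline, the run-time approach looks something like the C below.  The
variable tenexf stands in for the "_tenexf" flag, and the enum and function
names are invented for the sketch; the real detection code is not shown.

```c
/* Stand-in for the global flag set once by the startup code. */
static int tenexf;

enum os_kind { OS_TOPS20, OS_TENEX };

/* Called once at startup with whatever the system identification found. */
void startup_detect(enum os_kind detected)
{
    tenexf = (detected == OS_TENEX);
}

/* A library routine picks its behavior at run time, not assembly time.
 * Logical names exist on TOPS-20 but not on TENEX. */
int have_logical_names(void)
{
    return !tenexf;
}
```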
	The reason I took this approach was to get things working more
quickly, and avoid having to carry around lots of duplicates.  [Note:
To some extent the NIC can also make good use of binary portability
since we would like to shuffle things back and forth between our 2065
and our F4 fairly often; this IS possible, but I don't think I'm
really serious about it, and this shouldn't be a factor.]
	Here is what I would like to do:
(0) Put more organization into the machine-lang support code by
	using a single CRUNT.FAI file which will .INSERT others as
	required by the various conditionals.  $SPUSH, $ADJBP, etc
	would be part of this.  I can do this.
(1) Arrive at some agreement on KCC-predefined macro names to indicate
	system types.  Right now I need something for tops20, something
	for tenex, and some way of assembling code for either.
	Someone else to implement this.  KCC only.
(2) Implement the -D switch in KCC!  If I had this I probably wouldn't have
	used _tenexf.  Someone else to do this.  KCC only.
(3) Design and code a TOPS20_TENEX-compatible way of passing arguments
	and environment vars.  Me to do this.  CLIB stuff.
(4) Revise the CLIB routines to implement (3).  Me to do
	some or all of this.  CLIB stuff.
(5) Fix KCC to at least call FAIL when on TENEX.  If possible, LINK would
	be nice, but the invocation is so hairy I don't know for sure what is
	going on.  I would hope WHP4 knows enough to know what needs to
	be done.  There is some mumble about .TMP files??

Too tired now to think further.  Most of these items are pretty
independent of each other and so could be done in parallel, I hope.
 2-Jun-85 09:15:16-PDT,3064;000000000001
Date: Sunday, 2 June 1985  01:47-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   Kronj at SU-SIERRA.ARPA
cc:   KLH at SRI-NIC.ARPA
Subject: Tenex and pipes

I'm not sure about PTYs.  The idea seems fraught with unknown perils
to me.  I think that a more reliable fallback scheme would simply be
to crudely simulate pipes by writing stuff out to a temporary file, then (when
the writer is done) starting up the next process and SPJFN'ing its
primary input to that temporary file.  If you wanted to get fancier,
and the next process is known to be a KCC product, you could run in
parallel since the reader will know how to detect that its input is a
pipe-type temporary file (use some weird name), and will be able to
cooperate with the writer.  Actually, once we have a better way of
passing info to the other process, we can do spiffier things like
sharing a mapped page buffer so that pipe-ing (or general real-time
IPC) is super fast, better even than PIP:.  The only problem that comes
to mind is detecting EOF if the writing process croaks unexpectedly.
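
The crude sequential fallback can be sketched in portable C, using the
standard tmpfile() in place of a specially named temporary file and plain
function calls in place of separate processes.  The function names are
hypothetical.

```c
#include <stdio.h>

typedef void (*writer_fn)(FILE *out);
typedef long (*reader_fn)(FILE *in);

/* Run the writer to completion, then hand the same file to the reader
 * as its input.  Returns the reader's result, or -1 on failure. */
long run_fake_pipe(writer_fn writer, reader_fn reader)
{
    FILE *tmp = tmpfile();
    long result;

    if (tmp == NULL)
        return -1;
    writer(tmp);          /* stage 1: the writer finishes first */
    rewind(tmp);          /* stage 2: the reader starts from the beginning */
    result = reader(tmp);
    fclose(tmp);          /* the temporary file is removed on close */
    return result;
}

/* Example stages: write a short line, then count the bytes read back. */
void write_greeting(FILE *out) { fputs("hello\n", out); }

long count_bytes(FILE *in)
{
    long n = 0;
    while (fgetc(in) != EOF)
        n++;
    return n;
}
```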

My current opinion is that TENEX KCC can live without pipes.  There are things
to worry about but that isn't one of the big ones.

More on KCC-ness detection after thinking a bit: I don't see any good
way to do this prior to a GET.  After the GET, there are any number of ways
to detect KCC-ness, and if the fork is an inferior, everything's great.  The
problem happens when you are trying to replace yourself with the new program,
since you then have less than 16 instructions to do whatever needs to be done.
One real straightforward, if drastic, solution is the following.  If you
are about to do a real EXEC() and replace yourself with a new program, and
need to know what it's going to be so you can set things up properly, then
call this little hack routine which:
	(1) creates a new inferior fork,
	(2) GETs the program into that,
	(3) Inspects what it needs to (eg the entry vector, or magic words
		at magic locations)
	(4) Reports back, without killing the fork.

	Then one step in the AC program will kill this extra fork right
	after the GET.

	The reason for postponing the kfork is so the pages mapped
into memory by the first GET will still be available to share with the
second GET and thus the second invocation will be very fast, hopefully
reducing the overhead to roughly that for CFORK/KFORK alone.  A hack,
but not too bad a trick, especially if KCC-ness allows the use of
efficient hairy features like mapped pipes, args, etc.  Isn't shared
memory wonderful?

Date: Sunday, 2 June 1985  09:13-PDT
From: David Eppstein <Kronj at SU-SIERRA.ARPA>
To:   Ken Harrenstien <KLH at SRI-NIC.ARPA>
Subject: KCC-ness prior to GET

The only way I can think of checking this before doing the GET is to
somehow set the user settable word in the FDB to some special value
(this is for instance how MM knows an editor is TECO based).
But it seems more reasonable to wait until after the GET and check for
some magic number somewhere.
 2-Jun-85 11:13:19-PDT,1021;000000000001
Date: Sunday, 2 June 1985  02:03-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   whp4 at SU-SIERRA.ARPA
cc:   kronj at SU-SIERRA.ARPA, satz at SU-SIERRA.ARPA, KLH at SRI-NIC.ARPA
Subject: KCC -> TENEX hassles

I suspect that .TMP files will work on TENEX, since they are really a
TOPS-10 hack which PA1050 supports, and PA1050 definitely exists on TENEX.
So if it is made to work on TOPS-20 I am pretty sure it will work on TENEX
with minimal (if any) tweaks.

As far as TENEX access goes, I would dearly love to throw the machine open.
Unfortunately it is supposed to be "protected" (not secret, just hard to get
at), which means for example that it neither runs any network servers nor
has any dialup access.  A real pain especially when one is accustomed to
working from home!

Supposedly Tymshare or Foonly has the capability to beef up the machine
to run TOPS-20 (this is now being done for SRI-CSL!) but our breath ran
out a long time ago and we gotta do things now with what we have now.
 3-Jun-85 17:42:20-PDT,616;000000000001
Mail-From: KRONJ created at  3-Jun-85 17:42:07
Date: Mon 3 Jun 85 17:42:07-PDT
From: David Eppstein <[email protected]>
Subject: New features
To: [email protected]

Lately I've mostly been fixing bugs and improving the code KCC generates.
But I just implemented some switches y'all might be interested in:

-Dname or -Dname=expansion  as with other cc's
-Uname  (this is a no-op since at that point nothing is defined)
-C  pass comments through preprocessor (only if -E is on)
-E  now sends output to stdout like the 4.2 cc
-w  don't type out any warnings (errors will still be shown)
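
For illustration, the two -D forms could be split apart as in the C sketch
below.  This is not KCC's code; the struct and function names are
hypothetical, and giving a bare -Dname the expansion "1" is an assumption
based on what Unix cc's commonly do.

```c
#include <string.h>

/* Hypothetical holder for one command-line definition. */
struct macro_def {
    char name[32];
    char body[64];
};

/* Parse "-Dname" or "-Dname=expansion".  Returns 0 on success, -1 if the
 * argument is malformed or does not fit. */
int parse_define(const char *arg, struct macro_def *out)
{
    const char *name, *eq, *body;
    size_t nlen;

    if (strncmp(arg, "-D", 2) != 0 || arg[2] == '\0' || arg[2] == '=')
        return -1;
    name = arg + 2;
    eq = strchr(name, '=');
    nlen = eq ? (size_t)(eq - name) : strlen(name);
    body = eq ? eq + 1 : "1";     /* assumed default for a bare -Dname */
    if (nlen >= sizeof out->name || strlen(body) >= sizeof out->body)
        return -1;
    memcpy(out->name, name, nlen);
    out->name[nlen] = '\0';
    strcpy(out->body, body);
    return 0;
}
```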
-------
 3-Jun-85 23:00:22-PDT,2256;000000000001
Date: Monday, 3 June 1985  18:07-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   kronj at SU-SIERRA.ARPA
cc:   klh at SRI-NIC.ARPA
Subject: Plan and suggestion

Unless you scream quickly, I will start restructuring the library stuff
as described in the following plan.  Note the implied suggestion that -I
will someday be implemented, as well as the possibility of applying
the preprocessor to assembler files.  (This has been a very useful feature
in configuring unix systems).  I no longer think that we need to have
predefined macros in KCC, as I have found some correspondence in a unix
mailing list which strongly suggests this will not be a panacea.
	------------
Overall setup:

Files in standard include directory:

	C-ENV.H		- C Environment definitions.  Should be included
			by every CLIB routine which has any system or
			environment dependencies.
			Use IFNDEFs to allow testing changes.
			Different versions to exist for different
			systems/machines/configurations.

	C-ENV.FAI	- Definitions & parameters inserted into every
			assembly-language routine, especially CLIB.
			Equivalent of C-ENV.H with additional assembly-lang
			macro defs, etc.

All source files for KCC and the CLIB routines should refer to these
definition files with "" instead of <>, so that different versions
of KCC/CLIB can be tested easily.  Normal compilation for one's own
system will simply point to the standard include-file location with
the -I switch.  Unfortunately, .FAI files cannot use logical names in
their .INSERT request since not all systems have logical names.  This
problem can be solved (and much interesting functionality added) if
KCC were able to run its preprocessor over assembly files like Unix CC does.

Files in CLIB:

	CRT.FAI	 - Standard C runtime startup & auxiliaries for KCC.
			All necessary assembly time definitions
			are furnished by C-ENV.FAI.
			Other files may be inserted as needed.
			May want to allow variant startups to exist in
			the .REL library, with different names (CRTxxx).
			Or maybe not.

	<funct>.FAI	- Try to isolate all external subroutine calls into
			individual files, with separate OS condits
			within each file.  All insert C-ENV of course.
 3-Jun-85 23:00:28-PDT,584;000000000001
Date: Monday, 3 June 1985  19:01-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   kronj at SU-SIERRA.ARPA
cc:   klh at SRI-NIC.ARPA
Subject: Argh - DMOVx

Foo, it turns out the F4 differs from the F3 in that DMOVx is not supported.
What are the chances of adding another switch to disallow production of
DMOVx in KCC code?  Alternatively, would it be easier for the switch to
prevent anything from trying to skip over a DMOVx?  The latter would allow
us to get away with redefining the DMOVE/M instructions as macros (DMOVN is
kind of screwy, I hope it is not used).
 3-Jun-85 23:25:45-PDT,1934;000000000001
Date: Monday, 3 June 1985  23:22-PDT
From: David Eppstein <Kronj at SU-SIERRA.ARPA>
To:   Ken Harrenstien <KLH at SRI-NIC.ARPA>
Subject: Argh - DMOVx and runtime rearrangement

Sure, I could add another switch for DMOVx.  I do use DMOVN; I don't have my
documentation with me but I thought that was the same as two MOVNs.  Does the
F4 have DFAD etc?  If not maybe I should flush doublewords altogether when
generating code for it and treat doubles identically to floats.  I would like
to be able to keep the switches used by KCC simple; is there any reason why you
would want one of ADJSP, ADJBP, and DMOVx but not the others?  If not, I could
just have one switch for KL features.

Re your C-ENV plan.  If I understand your scheme, you plan to merge TOPS20.FAI
and WAITS.FAI into CRT.FAI, and then split out as many routines as possible
into separate files.  I assume C-ENV.* would have definitions for things like
operating system, whether PIP: exists, and so on.  Sounds reasonable enough.
For now I would like to keep extended addressing out of it, and keep with the
current scheme of the same code running extended and not.

I had been basing the decision of keeping in TOPS20 and RUNTM or separately on
whether the routines were needed by startup code (e.g. pipe() is in TOPS20 and
not separate because if the startup sees a | it will call it).  But maybe
splitting them further is a good idea.  I also hadn't come to a decision about
whether routines that called JSYSes should be in FAIL or in C calling jsys().
Currently some are one way and some are the other.  Your decision whether you
want to change this.

I don't understand why you would want to pass assembly output through the
preprocessor, except perhaps to generate C-ENV.FAI from C-ENV.H.  Isn't that
what macro assemblers are for?

Enough for now.  Have fun munging the runtimes.  It will be interesting to see
how they come out.
 4-Jun-85 10:24:25-PDT,6624;000000000005
Date: Tuesday, 4 June 1985  04:32-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   Kronj at SU-SIERRA.ARPA
cc:   KLH at SRI-NIC.ARPA
Subject: Argh - DMOVx and runtime rearrangement

Yes, your understanding of my plan is correct, and I will take care to
follow the current extended-addressing scheme (ie provide for either
if processor allows it).  Thanks for your comments about the
JSYS-routine locations.  As for running the preprocessor over
assembler files, it's not a big deal.  I could show you an example of
how a C include file is used to configure things for an AS assembly
program (the kernel, in fact); it's real convenient to have just one
place where switches are set, and use the same search-path conventions
for both C and assembler files.  Admittedly the fact that AS is not a
macro assembler makes it much more in need of help, but for us it is
primarily the switch-definition convenience that looks attractive.
Now for the REAL problem: floats vs doubles vs KA-10.

		-------------------

First, the F4 should be considered a KA-10 (several times faster, but
the current microcode still only implements the KA-10 set).  Thus,
DFAD, DFSB, etc are out, as well as DMOVx.  Instead there is FADL,
FSBL, etc.

More on DMOVN:
	Ah, ha, I knew keeping my old DEC10 Hardware Ref Manual around
would come in handy someday!  It gives me an exact sequence for
simulating DMOVN on a KA (whereas the newer 10/20 Processor Ref Man
just lists DMOVN and that's that).  It is rather bulky:
	DMOVN AC,E ==>
		SETCM AC,E	; Take ones complement of high wd
		MOVN AC+1,E+1	; Take twos complement of low wd
		TLZ AC+1,400000	; Clear bit 0 in low word
		SKIPN AC+1	; If low word is all zero,
		ADDI AC,1	; change high word to twos complement.

However, I think I know why this sequence was flushed.  This simulation
is only necessary if you are trying to support hardware doubles on a
machine without DMOVN.  There is no such machine, since the KI was the
first processor to hack hardware doubles, and it had DMOVN.  The KA
uses what is called "software format" for doubles and DMOVN will not
work for it anyway -- there is something else called DFN for that case.
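
The five-instruction DMOVN simulation quoted above can be traced in C as
a rough sketch.  It assumes 36-bit words held in 64-bit integers and a
hardware-format doubleword whose low word carries its 35 data bits with
bit 0 clear; the name dmovn_sim is hypothetical and is not part of KCC
or its runtime.

```c
#include <stdint.h>

#define WORD_MASK 0777777777777ULL  /* 36 one-bits: one PDP-10 word */
#define SIGN_BIT  0400000000000ULL  /* bit 0 of a word (the sign bit) */

/* Hypothetical helper: negate the doubleword (*hi, *lo) in place by
 * following the KA-10 sequence from the message one step at a time. */
static void dmovn_sim(uint64_t *hi, uint64_t *lo)
{
    uint64_t ac  = ~*hi & WORD_MASK;        /* SETCM AC,E      : ones complement of high word */
    uint64_t ac1 = (0 - *lo) & WORD_MASK;   /* MOVN AC+1,E+1   : twos complement of low word  */
    ac1 &= ~SIGN_BIT;                       /* TLZ AC+1,400000 : clear bit 0 in low word      */
    if (ac1 == 0)                           /* SKIPN AC+1      : skips the ADDI when nonzero  */
        ac = (ac + 1) & WORD_MASK;          /* ADDI AC,1       : carry into the high word     */
    *hi = ac;
    *lo = ac1;
}
```

Negating twice should give back the original pair, which is a quick way
to check the carry handling when the low word is zero.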


About your suggestion for equating doubles to floats, and thus doing
away with double-word floating point altogether on KAs.  Well, this is
a tougher problem than I expected!  I came up with the following points:
	- KAs do have instructions for hacking a double precision
		format.  However, this so-called "software format"
		is not the same as the "hardware format" used by KI/KL
		double-precision instructions.
	- I have this horrible fear of running into problems with
		routines that expect a 2-word double argument on the stack
		and are given a 1-word float instead.  Everything needs
		to be appropriately conditionalized... ugh!  MAYBE this
		is only a problem with printf.  Maybe.
	- This shouldn't be your concern, but happens to be mine: I have
		a package of routines which pass 2-word string descriptors
		around by hiding them as doubles.  This code also works
		on other machines because double (as far as I've ever seen)
		is always at least 2 machine words, which is good enough
		for the descriptor.  Guess what happens if doubles are
		one word.  Hack hack to change package, and bye bye to all
		the nifty DMOVE/M optimization on the 20.  I know it was
		a hack, but gee, not all compilers know how to pass
		structures like KCC does!
	- Trying to do software double-prec stuff on the KA will make
		things somewhat more complicated to compile.  Maybe not
		very much.
	- However, this is what a compiler is supposed to do.  KCC is
		supposed to make it easy to hack double-precision stuff;
		if the user wanted single-prec, it's trivial to write
		your own stuff in assembler with FADR, etc!


My overall inclination is that I would like KCC to hack true doubles on
the KA.  It is okay if the resulting code is not as efficient as possible,
as long as the arithmetic result is correct.  Since I seem to be the
main agitator for this (why?  why?  why me?) I guess I would be willing
to help dig up the appropriate code sequences for the basic arithmetic ops,
if you tell me what is needed. (oh no... I can't believe I am saying this...)

The real problem to be faced is that the PDP-10 has, to date, six (6) number
representation formats:				C declaration:
	(1) single-word fixed			int
	(2) double-word fixed			-
	(3) single-word float			float
	(4) software double-word float		-
	(5) hardware double-word float		double
	(6) G format double-word float		-

I think it would be nifty if KCC's code generation tables for "double"
were capable of generating code for any selected one of the 3 double-prec
formats.  After all, adding the capability of selecting one of two
means all the work is done for selecting one of three, and the G format
instructions are parallel and comprehensive.  What the hell.
(Hmmm... who knows, maybe someday "superlong" will be a defined C type and
then we can use double fixed too!)

P.S. on switches:
	The DMOVx switch (or hardware-double switch) should be
independent of the others because the KI has hardware-double stuff but
no ADJSP/ADJBP.  However, it should be OK to combine the ADJSP and
ADJBP switches, because if one exists, the other should also; at least
on DEC processors.  I think the only reason they were separate was
because our (dear departed) F3 supported ADJSP but not ADJBP, and
Kok Chen wanted to be able to use ADJSP when possible.  I consider
this a fairly worthless distinction, since I have never seen KCC code
that used anything but a constant for ADJSP, and thus an ADD or SUB
does as well.  (I did find one runtime support hack ($PUSH) that will
need changing, but that's easy).

You may simply want to provide a single processor-type switch, such as
"-C=KA".  Or make this part of the idea for specifying a
configuration/ preamble file, which can contain a processor
definition.  Clearly all of the above cruft derives directly from the
processor in question, and this can appear as a single switch to the
user, even though there will probably be several individual-feature
switch variables within KCC.  The ones to know about are the
KA,KI,KS,KL.  If we ever optimize for extendedness too, then we would
also need to distinguish between what I call the KLS (single-section
KL) and KLX (Extended-section KL).  Presumably if some non-DEC processor
comes along that doesn't quite fit one of those categories we can add
another type.

Whew, am I done yet?
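
The single processor-type switch suggested in the P.S. could be sketched
as a lookup from processor name to individual feature flags.  The struct,
the function name, and the table layout are all hypothetical.  The KA and
KI rows follow the message (the KI has hardware doubles but no
ADJSP/ADJBP); the KS row is an assumption that the KS matches the KL for
these features.

```c
#include <string.h>

/* Hypothetical per-feature switches derived from one processor name. */
struct cpu_features {
    int has_adjsp_adjbp;  /* ADJSP and ADJBP may be generated      */
    int has_hw_double;    /* DMOVx, DFAD, etc. may be generated    */
};

/* Fill *out for the named processor.  Returns 0 on success,
 * -1 if the name is not one of KA, KI, KS, KL. */
static int cpu_lookup(const char *name, struct cpu_features *out)
{
    static const struct {
        const char *name;
        struct cpu_features f;
    } table[] = {
        { "KA", { 0, 0 } },  /* also covers the F4, per the message */
        { "KI", { 0, 1 } },  /* hardware doubles, no ADJSP/ADJBP    */
        { "KS", { 1, 1 } },  /* assumption: same as KL here         */
        { "KL", { 1, 1 } },
    };
    size_t i;

    for (i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (strcmp(name, table[i].name) == 0) {
            *out = table[i].f;
            return 0;
        }
    }
    return -1;
}
```

Keeping the flags separate inside the compiler while exposing only the
processor name to the user matches the split described above; a new
non-DEC processor would just be another table row.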
 4-Jun-85 22:54:10-PDT,1014;000000000005
Date: Saturday, 1 June 1985  18:14-PDT
From: David Fuchs <DRF at SU-SCORE.ARPA>
To:   kronj at SU-SIERRA.ARPA
Subject: compiler problem

KCC is giving a nonsense error message on SCORE:<4SCRATCH>C1F.C,
but only when I say COMP C1F.C, not when I say CC -c C1F.C.  I tried
using the -E switch to get a complete, single file to gripe to you with,
but the resulting .I file does NOT contain all the macro expansions!  So,
it fails to compile for other reasons.  To try C1F.C, you'll need C1F.H,
C0.H and PASCAL.H.
	-david

Date: Saturday, 1 June 1985  18:31-PDT
From: David Fuchs <DRF at SU-SCORE.ARPA>
To:   kronj at SU-SIERRA.ARPA
Subject: more 

OK, so I got it all to compile and link, and the result is on
SCORE:<4SCRATCH>010LNK.EXE, but when I run it, it doesn't even
get to main() before bombing out with a Pushdown overflow (that
somehow seems to keep repeating).  DDT shows that we've only got
a dozen or so instructions into the startup code when this happens,
at $ISTART+5.
	-david
 4-Jun-85 23:00:48-PDT,349;000000000001
Date: Tue 4 Jun 85 23:00:48-PDT
From: David Eppstein <[email protected]>
Subject: hiding struct in double hack
To: [email protected]

Actually KCC will use DMOVx and register pairs just as nicely for
two-word structs as it will for doubleword floats.  Or is there some
reason why you want to use this hack for some other machine?
-------
 5-Jun-85 11:26:23-PDT,4330;000000000001
Date: Wednesday, 5 June 1985  03:38-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   kronj at SU-SIERRA.ARPA
cc:   klh at SRI-NIC.ARPA
Subject: New features

That was almost too fast!  Hmm, I think almost all of the necessary
pieces and tools are now in place.  I'm going to finish up CLIB, then
I'll copy over KCC again and try to put it all together.

Another victory for packrats dept: I found in another of my old DEC
processor manuals a complete set of KA-10 double precision routines,
which I instantly transformed into macros for DFAD, DFSB, DFMP, DFDV,
DMOVN, and DMOVNM.  We win!  Win!  BUT...  there's just one little
problem.  They require the use of a 3rd AC.  That is, DFAD A,M (for
example) will leave the result in A and A+1, but will also clobber A+2
to something random.  Can KCC be informed about this somehow?

Note there is more to it than just improving optimization.  Even if
the macros did a PUSH/POP of A+2, grave trouble would result if the
compiler ever output something like DFAD 15,M.

You know, it bothers me a little bit that we are faking things out by
using macros to simulate instructions.  I think it might be better to
be more explicit about what is going on, by having KCC generate macro
calls rather than "instructions".  These can be flagged by having a
"%" as the first character.  For example, the operation for double
floating add would be output by KCC as %DFAD(A,M).  Note the use of
explicit parentheses.  This approach would also let us handle fix and
float conversions better (in line or via call) by just using %DFIX
and %DFLOT, although I guess some KCC assumptions about register use
might need updating.  (more on this farther on)

[By the way, one of the wins of the new G format is the complete set
of instructions for transforming G format doubles to and from (1)
integers, (2) double integers, and (3) single-word floats!  Enticing.]

This scheme has the advantage of being very simple to implement, and
making the KCC/runtime interface a lot cleaner.  However, I still
don't think this is the ultimately "right" approach, because KCC will be
unable to apply any optimization without special knowledge hacked into
some table, like whether the macro will expand into one or N
instructions, whether certain ACs are preferable for setup -- certainly
an issue with the current routines like $SUBBP, $DFIX, etc -- and thus
you don't really have very much flexibility after all.

My thinking is that the right way would allow a data file to specify,
at compile time, what code needs to be generated for specific operations.
In other words, we need some way for KCC itself to understand what these
"macro calls" would actually generate, and generate that itself, so that
the assembler only sees real instructions.

When KCC encounters an operation that has a non-default code specifier,
it would expand this code (sort of like macro expansion) and then knows
whether it is dealing with a skippable instruction, whether special
AC setup is needed, and what ACs might need saving.  Because it understands
the specification, it can derive this information itself and readily use
it to optimize the code however it wishes.

The catch is that I don't know how difficult it would be to implement
such a scheme; the data structures of KCC may require specifications
to be written in an exotic fashion (i.e. not simply a macro-type list
of instructions).  However, I would suggest a simple compromise:
provide a way for the same data file to both specify the assembler
macros (as part of the literal-string preamble spec) AND to tell KCC
about the relevant features of each operation, without trying to
completely describe the associated macros.  The latter spec can be in
some trivially parsed format, whatever is easiest.  This would be
satisfactory, I think, since all information would be in the same
place, making it easy to coordinate, and dynamic, making it easy to change.

Anyway, using explicit macro calls in place of the "instructions" like
DFAD is still a good idea, because it helps whether or not the further
step of allowing dynamic operation-attribute specification is ever added.
This first step can also be taken without breaking anything since
your simple preamble file feature now exists.
 5-Jun-85 11:26:51-PDT,899;000000000001
Date: Wednesday, 5 June 1985  03:46-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   kronj at SU-SIERRA.ARPA, satz at SU-SIERRA.ARPA
cc:   klh at SRI-NIC.ARPA
Subject: WAITS?

I suspect the WAITS code has not been used in a long time.  There are
many things missing, and many artifacts that I recognize as belonging
to the Kok Chen era.  I am not sure whether I should try to bring that
stuff up to snuff or not.  Currently what I am doing is enclosing dubious
fragments in comments (ie preserve but don't use) and inserting things like
IFN WAITS,<.FATAL Routine FOO not coded yet>
where I notice deficiencies.  There isn't much of it anyway.  Do you
know what the story is about it?  Should BUG-KCC be queried?

When I get done with this, it should be much easier to port KCC elsewhere,
so I don't see any reason why WAITS could not be supported, if someone wanted
to do so.
 5-Jun-85 11:31:47-PDT,305;000000000001
Date: Wednesday, 5 June 1985  11:28-PDT
From: David Eppstein <Kronj at SU-SIERRA.ARPA>
To:   Ken Harrenstien <KLH at SRI-NIC.ARPA>
cc:   satz at SU-SIERRA.ARPA
Subject: WAITS?

I don't think anyone else on Bug-KCC knows much about this either.
If you want to make it work you're welcome to do so.
 5-Jun-85 11:31:55-PDT,179;000000000001
Date: Wednesday, 5 June 1985  11:30-PDT
From: David Eppstein <Kronj at SU-SIERRA.ARPA>
To:   KLH at NIC
Subject: need third AC for double macros

That's what AC16 is for...
 5-Jun-85 13:16:51-PDT,601;000000000001
Date: Wednesday, 5 June 1985  11:58-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   Kronj at SU-SIERRA.ARPA
cc:   KLH at SRI-NIC.ARPA
Subject: need third AC for double macros

The third ac must be A+2.  It cannot be a random AC.  So unless you
are saying that A will always be 14, I don't see how this helps, or how
this prevents A from being 15?

Date: Wednesday, 5 June 1985  13:16-PDT
From: David Eppstein <Kronj at SU-SIERRA.ARPA>
To:   Ken Harrenstien <KLH at SRI-NIC.ARPA>
Subject: need third AC for double macros

So do an EXCH A+2,16 before the sequence and again after.
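
A rough C model of the EXCH suggestion, assuming a 16-entry array as the
register file.  fake_dfad is a made-up stand-in for one of the simulated
double operations: it leaves a result in A and A+1 and junk in A+2.  None
of these names come from KCC.

```c
#include <stdint.h>

#define SCRATCH_AC 016  /* AC16, the register KCC keeps free */

/* Stand-in for a simulated double op: result in A and A+1, A+2 clobbered. */
static void fake_dfad(uint64_t ac[16], int a)
{
    ac[a] = 1;
    ac[(a + 1) & 017] = 2;
    ac[(a + 2) & 017] = 0xDEAD;  /* "something random" */
}

/* EXCH X,Y: swap the contents of two accumulators. */
static void exch(uint64_t ac[16], int x, int y)
{
    uint64_t t = ac[x];
    ac[x] = ac[y];
    ac[y] = t;
}

/* Wrap the op so the caller's value in A+2 survives; the junk ends up
 * in the scratch AC instead. */
static void guarded_dfad(uint64_t ac[16], int a)
{
    exch(ac, (a + 2) & 017, SCRATCH_AC);  /* park A+2 in AC16    */
    fake_dfad(ac, a);
    exch(ac, (a + 2) & 017, SCRATCH_AC);  /* bring A+2 back      */
}
```

When A is 14 the third AC is AC16 itself, so both exchanges do nothing
and the junk simply lands in the scratch register.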
 6-Jun-85 10:12:42-PDT,265;000000000001
Date: Thursday, 6 June 1985  07:15-PDT
From: Rich Cower <COWER at COLUMBIA-20.ARPA>
To:   Eppstein at COLUMBIA-20.ARPA
cc:   COWER at COLUMBIA-20.ARPA
Subject: c compiler

no reason not to have two - actually, i hear the new mex one is very
good.

..rich
 6-Jun-85 10:17:54-PDT,1182;000000000005
Date: Thursday, 6 June 1985  02:27-PDT
From: Jim Lewinson <Jiml at SU-SUSHI.ARPA>
To:   Kronj at SU-SUSHI.ARPA
Subject: Structure members in KCC?

Is it possible to get better support for structure members in KCC?

For example, making the set of offsets be local to the structure,
so that you can have two structures that are different, but have the same
members in a different order, for example?

Also, a possible side effect of the above and more important - Could
it check that the member given is correct for the given structure?
That is, if I have foo in structure A and bar in structure B, then
B.foo should be invalid.

					Jim

Date: Thursday, 6 June 1985  10:11-PDT
From: David Eppstein <Kronj at SU-SIERRA.ARPA>
To:   Jim Lewinson <Jiml at SU-SUSHI.ARPA>
Subject: Structure members in KCC?

It already lets you have different named members in different structures.
This has been true for almost a year.  There have been some bugs in
supporting this; maybe you were using an outdated version of the
compiler.

Re warnings if the member isn't correct.  Maybe I'll do this once I
fix the compiler sources to use correct members from their structs.
 6-Jun-85 10:18:03-PDT,1615;000000000001
Date: Thursday, 6 June 1985  10:17-PDT
From: David Eppstein <Kronj at SU-SIERRA.ARPA>
To:   Ken Harrenstien <KLH at SRI-NIC.ARPA>
Subject: need third AC for double macros

    Date: Thursday, 6 June 1985  02:49-PDT
    From: Ken Harrenstien <KLH at SRI-NIC.ARPA>

    The fact that there is a register available for saving stuff in is
    meaningless (I could have used register 0, or done a PUSH/POP) -- the
    point I'm trying to make is that the float operations require 3
    sequential ACs and unless you know that you'll never invoke them on 15
    (or 16 or 17 obviously) then trouble is possible.

    The question is: are you certain that none of the floating instructions
    will ever be invoked with an AC argument >= 15 ???

That's what I meant when I said that AC16 was free to be used.
All registers used by KCC internally are 1-15, with 1&2 being special.
0 is never used so dereferencing NULL won't lose too badly.
17 is never used because it's the stack pointer.
16 is reserved for situations just such as this one.
Therefore the highest numbered register pair you will see is 14.

    You also didn't answer the original question, of whether KCC can be told
    about the AC usage of the operation.  The implication I gather is that
    whether it is or isn't, you'd rather not bother with it.  That's okay but
    it is better to just say so.

I suppose it could, but I'd rather not.  The way I was thinking of it
was that KCC wouldn't know the difference between these ops and the
regular KL instructions, except that it couldn't skip over the
simulated ones.
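
The register argument above can be written down as a small check.  This
is only a sketch: the enum and function names are hypothetical, and the
AC numbers are taken to be octal, as elsewhere in the thread.

```c
/* AC assignments as described in the message (octal). */
enum {
    AC_FIRST   = 01,   /* lowest AC KCC allocates          */
    AC_LAST    = 015,  /* highest AC KCC allocates         */
    AC_SCRATCH = 016,  /* reserved, never allocated        */
    AC_SP      = 017   /* stack pointer                    */
};

/* Highest AC that can be the base of a two-word register pair. */
static int highest_pair_base(void)
{
    return AC_LAST - 1;
}

/* Nonzero if an op that uses A, A+1 and A+2 stays within the
 * allocatable ACs plus the scratch AC, never touching the stack
 * pointer or wrapping around to AC 0. */
static int triple_is_safe(int a)
{
    return a >= AC_FIRST && a + 2 <= AC_SCRATCH;
}
```

Every legal pair base from 1 through 14 passes the check; 15 fails, but
15 can never be a pair base because its partner would be AC16.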
 6-Jun-85 11:50:27-PDT,885;000000000001
Date: Thursday, 6 June 1985  10:34-PDT
From: Jim Lewinson <a.Jiml at SU-GSB-WHY.ARPA>
To:   Kronj at SU-SIERRA.ARPA
Subject: Structure members in KCC?

I am working on Sushi and I am having lots of problems with structures
that have a similar named member with different offsets.  That is:

Foo {
	Y
    }

Bar  {
	X
	Y
}
 

Is bad because Y has "two" different offsets.  However, Unix C is quite happy
with this and it makes porting Unix code a real nightmare.

						Jim

Date: Thursday, 6 June 1985  11:49-PDT
From: David Eppstein <Kronj at SU-SIERRA.ARPA>
To:   Jim Lewinson <a.Jiml at SU-GSB-WHY.ARPA>
Subject: Structure members in KCC?

Calm down. I already said that KCC is supposed to have been working
this way for a long time.  Your problem was merely that the version on
Sushi was a little buggy. I have installed what should be a better one.
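
For reference, the case under discussion looks like this in C when each
structure has its own member namespace.  The struct and function names
are made up for illustration.

```c
#include <stddef.h>

/* The same member name, y, at two different offsets. */
struct foo {
    int y;
};

struct bar {
    int x;
    int y;
};

/* Each access resolves y against its own structure type. */
static int sum_y(const struct foo *f, const struct bar *b)
{
    return f->y + b->y;
}
```

With per-structure member names, offsetof(struct foo, y) is zero while
offsetof(struct bar, y) is not, and a compiler that checks membership
can reject a reference such as f->x.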
 6-Jun-85 15:12:05-PDT,890;000000000001
Date: Thu 6 Jun 85 15:12:04-PDT
From: David Eppstein <[email protected]>
Subject: nasty assembly language problem
To: [email protected], [email protected], [email protected]

I have this code in a switch statement that wants to do
	JRST	@$56-10056(4)
This is what it should want to do, no problem with that.
The problem is that the address it assembles to is completely
bogus, causing the switch statement to jump into never never
land.  This is in KCC, so it is not good that it dies horribly.

My feeling is that it's related to falling off the front of
highseg.  But I don't know whether that's the real problem,
or what to do about it in any case.

Have either of you seen anything like this before?
Do you have any ideas about how to get around it?

In case it matters, it should be assembling to 431605 and
it actually assembles to 645071.
-------
 6-Jun-85 15:39:25-PDT,217;000000000001
Date: Thursday, 6 June 1985  15:38-PDT
From: Jim Lewinson <Jiml at SU-SUSHI.ARPA>
To:   Kronj at SU-SIERRA.ARPA
Subject: Thank You!!!

It works now.  I copied over C-HDR.FAI as well as the new CC.DOC.
					Jim
 8-Jun-85 09:48:34-PDT,4255;000000000001
Date: Saturday, 8 June 1985  07:43-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   kronj at SU-SIERRA.ARPA
cc:   klh at SRI-NIC.ARPA
Subject: Win! Win! If, if...

This started out as a progress report, but I began explaining what the
last major hassle was, which led to a suggestion for fixing it, which
led to a much better idea that not only fixes that but allows us to
fix problems that haven't even come up yet!  In an attempt to limit my
messages to one topic at a time, I'm just going to describe the idea
here, with some of the justification leading up to it.

	The main current irritant in the porting procedure is the
inability to use a context-free ".INSERT C-ENV.FAI" in every .FAI
module.  FAIL has nothing like CC's -I, nor #include <>, and it's
important not to put in a directory specification since that requires
changing EVERY source file when the pathname changes (and it
definitely changes from system to system!) If you recall, this was one
of my motivations for wishing that KCC could apply its preprocessing
to assembler files; #include <> (or "") then wins and everything is
consistent.

	However, I've reviewed my UNIX makefiles which do something
similar and have concluded that KCC should not be doing this for files
of type .FAI or in fact anything but .C.  With the new switches you've
added, we can right now win to some degree by doing the following:
	Rename all *.FAI modules to *.C, and stuff in preprocessor
	control statements, with #includes and whatnot -- the whole works.
	Then, in the .MIC (or equivalent) file to build a program, these
	modules are built as follows:
		@CC -E module.C > module.I
		@FAIL
		*module=module.i

	Nifty, huh?  The only thing you have to be careful of is
remembering to use the -E switch rather than trying to actually
compile them like real C code.  This works, but is still a somewhat
inelegant kludge, whereas the right solution is MUCH more powerful:

	==============================================================
	Since KCC is already able to hack -E mode (just scanning for
	preprocessor stuff) then it should not be difficult to turn
	this mode on and off dynamically within the file itself, by using
	directives such as #ASM and #ENDASM.  Thus we can pass (assembler)
	text directly to the output file.
	==============================================================

	I've seen this in other C compilers; it's not standard, but
this doesn't matter since its use is always conditionalized with
system-dependent switches.  This would allow winnage similar to SAIL's
"quick!code" construct where you can embed assembly text directly in
the high-level routines; AUGMENT's L10 language (which a great deal of
our stuff is written in, and which we want to convert to C) also has a
similar feature.  Very, VERY convenient!  The combination of #ASM and
C-HDR is so powerful that it is hard to believe just how many formerly
hard or impossible things become easy and possible.  Some sample
winnages:

WIN: Just compile the .C module without worrying about what's inside it.
No special funniness with -E (or anything else) required.

WIN: By being a little careful about what is included within #ASM's,
and providing appropriate C-HDR files to define the preliminary macros
and stuff, people can generate code suitable for any assembler they please.

WIN: There are several things in KCC and the library that depend on using
FAIL.  By using KCC's preprocessing we can achieve a meta-selection of
assembler syntax and no longer depend on FAIL.

WIN: Cross-assemblies become real easy, especially with -I and -D.

WIN: We can use a single environment file for KCC/CLIB, not two or more.

WIN: We can put all possible versions of a particular routine into the
same .C file, even if the code is partly in C, partly in assembler, and
differs drastically depending on the system.  Much easier to keep track
of the software.

WIN: For a number of reasons (standardization, commonality, enveloping,
integration) it will become easy to write assembler code in such a way
that many implementation details of KCC can be changed without
breaking the hand-coded portions.

and on and on.  Gee, I can't wait...
 8-Jun-85 09:50:35-PDT,1685;000000000001
Date: Saturday, 8 June 1985  00:41-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   kronj at SU-SIERRA.ARPA
cc:   klh at SRI-NIC.ARPA
Subject: Snarfed files

OK, I have snarfed (via FTP UPDATE -- wonderful feature!) the latest
KCC.CC stuff, and will be working with that over the weekend.  This is
when I will actually modify some compiler source (conditionalized), so
need to coordinate things a little.  I'll let you know as soon as I
have something put together.

Date: Saturday, 8 June 1985  01:52-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   kronj at SU-SIERRA.ARPA
cc:   klh at SRI-NIC.ARPA
Subject: Oops!  Bug in new KCC (85)

I tried the current [sierra]sys:cc.exe.86 on my favorite KCC bug testing
program and it came up with several error messages, all saying
"Register allocation error: release of a spilled register"

If you want to see this for yourself, the program is <KLH>HOCK.C.  It
is pretty big.  The previous KCC compiles it OK.

Date: Saturday, 8 June 1985  01:53-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   kronj at SU-SIERRA.ARPA
cc:   klh at SRI-NIC.ARPA
Subject: "previous KCC" => version 76.

Got the version digits transposed.  The one I tested was 83 (not 86).
The previous version, which appeared to work OK on the same test program,
was 76.

Date: Saturday, 8 June 1985  09:50-PDT
From: David Eppstein <Kronj at SU-SIERRA.ARPA>
To:   Ken Harrenstien <KLH at SRI-NIC.ARPA>
Subject: Snarfed files

Actually the current version of the sources is not exactly together.
I think it mostly works, but there is some strangeness I was still
investigating.

I'll try your HOCK.C bug generator...
 8-Jun-85 10:50:04-PDT,264;000000000001
Date: Sat 8 Jun 85 10:50:04-PDT
From: David Eppstein <[email protected]>
Subject: release of spilled register
To: [email protected]

I found the bug.  Seems I broke nested ?: constructions when I fixed
something else.  I'll get to this Monday...
-------
 8-Jun-85 11:33:43-PDT,2050;000000000001
Date: Saturday, 8 June 1985  08:32-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   kronj at SU-SIERRA.ARPA
cc:   klh at SRI-NIC.ARPA
Subject: #ASM approach

I looked through the code and think it is possible with a little work.
This is what I would try:

CC.C:
	Define another EXT prepsw; as the -E switch indicator.
	Also define EXT asmstate;
	Replace prepf with prepsw at:
		start of main() (inits to 0)
		cswitch() (when -E seen)
	As part of init():
		prepf = prepsw;
		asmstate = 0;

At the beginning of each file, prepf now contains the global -E switch value.
prepf will be turned on whenever KCC is to simply pass stuff through, without
looking at it except to handle preprocessor stuff.

During processing:

CCINP.C:
	In preprocess(), add two new cases to the switch statement:
		for #asm,	if(asmstate) {error - #asm during #asm}
				else if(!prepsw) prepf = asmstate = 1;
				else {ignore line}
		for #endasm,	if(!asmstate) {error - #endasm without #asm}
				else if(!prepsw) prepf = asmstate = 0;
				else {ignore line}

The latter two cases need a little elaboration to fill in the
hand-waved (ignore & error) statements; I'm not familiar enough with
the reading routines to do it myself.  Anyway, it looks as if this is
simple enough to be worth trying.  The main reason for the asmstate variable
is so that unmatched-type errors will be uncovered even when doing just -E.
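
The #asm/#endasm handling outlined above can be sketched as a small
state machine.  The variable names prepsw, prepf and asmstate come from
the message; the struct and function names are hypothetical.  One
deliberate difference: read literally, the outline never sets asmstate
when -E is on, so this sketch always tracks asmstate and only leaves
prepf alone under -E, which is an assumption about the stated intent
(catching unmatched directives even with -E).

```c
struct pp_state {
    int prepsw;    /* global -E switch, fixed for the whole run      */
    int prepf;     /* nonzero while text is passed straight through  */
    int asmstate;  /* nonzero between #asm and #endasm               */
};

/* Per-file initialization, as in init(). */
static void pp_init(struct pp_state *s, int prepsw)
{
    s->prepsw = prepsw;
    s->prepf = prepsw;
    s->asmstate = 0;
}

/* Handle #asm.  Returns 0 if accepted, -1 for #asm during #asm. */
static int pp_asm(struct pp_state *s)
{
    if (s->asmstate)
        return -1;
    s->asmstate = 1;
    if (!s->prepsw)
        s->prepf = 1;   /* start passing text through */
    return 0;
}

/* Handle #endasm.  Returns 0 if accepted, -1 for #endasm without #asm. */
static int pp_endasm(struct pp_state *s)
{
    if (!s->asmstate)
        return -1;
    s->asmstate = 0;
    if (!s->prepsw)
        s->prepf = 0;   /* resume normal C parsing */
    return 0;
}
```

Under -E, prepf stays on throughout, so the directives only affect error
checking.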

Date: Saturday, 8 June 1985  11:32-PDT
From: David Eppstein <Kronj at SU-SIERRA.ARPA>
To:   Ken Harrenstien <KLH at SRI-NIC.ARPA>
Subject: #ASM approach

I'm a little wary of adding a #asm directive for a couple of reasons:
First, because at some point I want to be able to generate REL files
directly without passing anything through FAIL, and second, because it
will encourage people to use assembly when it's not absolutely necessary.
But as you suggest the implementation would be particularly easy.
I guess I can add it to my list, and stop caring about the possibility
of losers doing themselves in.
 8-Jun-85 20:21:40-PDT,2536;000000000001
Date: Saturday, 8 June 1985  17:36-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   Kronj at SU-SIERRA.ARPA
cc:   KLH at SRI-NIC.ARPA
Subject: #ASM approach

Some thoughts about your #ASM concerns:

Generating .REL files directly is a nasty job.  But if this is ever done,
then it would not be very hard (relatively) to add some extra code
which understands a simple assembler syntax.  This is actually what
SAIL and L10 do; they do the "assembly" themselves, although of course
they only parse a very simple syntax.  That's fine.

I confess I've had notions of porting KCC to ITS.  Given the
newly available features and modularity it will be pretty easy.
However, the ITS .REL format is STINK, not DECREL.  What this means is
that retaining the capability of using an assembler stage will buffer
us from this problem.

The best defense against people shooting themselves in the foot with #ASM
will be the documentation, which should have several warnings about using
the feature.  If they then insist on using it without a good reason, that is
definitely their problem.  Otherwise, there must be a good reason, and I
see the feature as actually reducing lossage, because KCC will take care of
most of the dirty details in a standard/consistent way, and you only need
to #asm the bare minimum of things.  Just think how much cruft is saved
by, for example:

	#include "c-env.h"
	entry sleep;
	sleep(secs)
	unsigned secs;
	{
	#ifdef SYS_T20
	#asm
		move 1,arg1(p)
		imuli 1,^D1000
		disms%
	#endasm
	#endif
	}

Date: Saturday, 8 June 1985  20:21-PDT
From: David Eppstein <Kronj at SU-SIERRA.ARPA>
To:   Ken Harrenstien <KLH at SRI-NIC.ARPA>
Subject: #ASM approach

Your example reminds me of another reason why I am wary of #asm.
People are going to expect it to work in the middle of loops and such.
And they are going to expect nice symbols for things like picking up
arguments (which is not simple, because KCC doesn't bother with a frame
pointer and so code generation has to track a moving stack offset).
This is much harder to implement than merely making it a switch to
pass text straight through without processing.

Oh, and I forgot.  The way -E is done now is completely unrelated to
the way C parsing is done.  So in any case it would take a little more
work than you previously suggested.  But that's no big deal.

If you think KCC would be useful for ITS, go ahead.  I was under the
impression that ITS was dead, at least in terms of software development.
 9-Jun-85 17:58:58-PDT,1090;000000000001
Date: Sunday, 9 June 1985  17:47-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   Kronj at SU-SIERRA.ARPA
cc:   KLH at SRI-NIC.ARPA
Subject: #ASM approach

Right, those are some of the caveats.  I can think of some straightforward
ways to give #asm code some hooks into the surrounding C stuff (for example,
emit a few standard symbols every time #asm is encountered, like %arg1==-5)
but really that is overkill.  Plain pass-through is good enough for now.

MIT-AI was recently resurrected by a bunch of people who hacked the
microcode of a KS10 (2020) and did all kinds of other impossible things
to bring up ITS on it.  (Evidently 2020s are almost being given away since
they can't run the DEC monitor any more!)  I would worry more about WAITS
than I would about ITS, but probably both will always have a few hard-core
fanatics around to keep them alive.  It may be that the ability to run
C code will help keep the PDP-10 people going, considering how much stray
software nowadays is in C.  And a 10 still leaves most other UNIX processors
choking in the dust.
10-Jun-85 09:50:04-PDT,527;000000000001
Date: Monday, 10 June 1985  02:32-PDT
From: Jim Lewinson <Jiml at SU-SIERRA.ARPA>
To:   Kronj at SU-SIERRA.ARPA
Subject: CLib Hex output bug

In PRINTF.C, you use a recursive procedure to do both Hex and Octal output.
However, you call the Octal routine in the Hex one in order to print
out the first part of the number.  Needless to say, this does not
work at all.

I can't figure out how to rebuild the library, so I guess I will let you fix
it and suffer with the broken one for the evening.

Argh.
						Jim
10-Jun-85 09:52:41-PDT,913;000000000001
Date: Monday, 10 June 1985  02:55-PDT
From: Jim Lewinson <Jiml at SU-SUSHI.ARPA>
To:   Kronj at SU-SIERRA.ARPA
Subject: New CLIB built...

I noticed CLIB.MIC, and so I built a fixed library on Sushi.  The sources
are now in <KCC.LIB> on Sushi, and the library is in <kcc.C> also.

I had got an error from runtm.rel, so I suggest you rebuild the library
on Sierra with all the right .H files.  I am not using runtm, so I don't
care - I do care about printf.  __PO was also missing a set of parens
in the if statement, so it was always doing a-f when it should have been
doing 0-9 some of the time.

					Jim
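
[Illustration, not the PRINTF.C source: a minimal sketch of a recursive
digit printer of the kind described in the two messages above.  The
names put_digits and format_unsigned are hypothetical.  The two points
it tries to show are that the recursion stays in the same base all the
way down (rather than the hex routine calling the octal one), and that
the 0-9 versus a-f choice is fully parenthesized and made per digit.]

```c
#include <stddef.h>

/* Hypothetical helper, not the CLIB routine: write the digits of n in the
   given base (2..16) into buf, most significant digit first, by recursing
   on n/base before emitting n%base.  Returns the number of characters. */
static size_t put_digits(unsigned long n, unsigned base, char *buf)
{
    size_t len = 0;
    unsigned digit = (unsigned)(n % base);

    if (n >= base)
        len = put_digits(n / base, base, buf);  /* same base all the way down */

    /* Parenthesized so that 0-9 or a-f is chosen for each digit. */
    buf[len] = (char)((digit < 10) ? ('0' + digit) : ('a' + (digit - 10)));
    return len + 1;
}

/* Format n as a NUL-terminated string in the given base; returns its length. */
size_t format_unsigned(unsigned long n, unsigned base, char *buf)
{
    size_t len = put_digits(n, base, buf);
    buf[len] = '\0';
    return len;
}
```

[With a buffer of at least 65 characters, format_unsigned(n, 16, buf)
gives hex and format_unsigned(n, 8, buf) gives octal through the same
code path.]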

Date: Monday, 10 June 1985  09:51-PDT
From: David Eppstein <Kronj at SU-SIERRA.ARPA>
To:   Jim Lewinson <Jiml at SU-SUSHI.ARPA>
Subject: New CLIB built...

Actually someone else (KLH@NIC) is working on CLIB right now.  When he
gives the sources back to me I'll look at hex for you.
10-Jun-85 10:22:19-PDT,275;000000000001
Date: Monday, 10 June 1985  10:20-PDT
From: Jim Lewinson <a.Jiml at SU-GSB-WHY.ARPA>
To:   Kronj at SU-SIERRA.ARPA
Subject: New CLIB built...

That's cool - my version works, so I am happy.  :-)

I guess not many people want to do hex output on DEC-20's.

					Jim
10-Jun-85 16:28:50-PDT,1283;000000000005
Date: Monday, 10 June 1985  16:22-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   kronj at SU-SIERRA.ARPA
cc:   klh at SRI-NIC.ARPA
Subject: SIGNAL.FAI

There are some problems with this module.  I don't want to spend time
on it now, or divert your attention from getting the current KCC bug
fixed, but will set them down for later reference.

The code probably will not work in a non-zero section because CHNTAB addresses
do not have the proper section number.  Will need runtime initialization.
LEV1PC also needs 2 words.  Minor stuff which I will fix as part of TENEX
edits.  A more serious problem is what happens when a C function responsible
for handling a signal decides to do a longjmp.  The stack is okay but no
debrk% is ever done.  This technique is fairly common in V7 UNIX programs
and I am not sure how to deal with it.  Between two wrongheaded interrupt
systems (Unix and TOPS-20) it is pretty tough.  Well... later.
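
[Illustration, not SIGNAL.FAI: a minimal portable-C sketch of the V7
idiom in question, where a signal handler leaves via longjmp instead of
returning.  The names recover, on_signal and run_once are hypothetical.
Because the handler never returns, whatever the runtime would normally
do after the handler comes back (on TOPS-20, the debrk%) is skipped,
which is the concern raised above.]

```c
#include <setjmp.h>
#include <signal.h>

static jmp_buf recover;
static volatile sig_atomic_t handler_ran = 0;

/* The handler never returns to the interrupted code. */
static void on_signal(int sig)
{
    (void)sig;
    handler_ran = 1;
    longjmp(recover, 1);
}

/* Returns 1 if control came back through longjmp, 0 if raise() returned
   normally, and -1 if the handler could not be installed. */
int run_once(void)
{
    handler_ran = 0;
    if (signal(SIGINT, on_signal) == SIG_ERR)
        return -1;
    if (setjmp(recover) != 0)
        return 1;              /* arrived here from the handler */
    raise(SIGINT);
    return 0;                  /* only reached if the handler returned */
}
```

[The sketch is meant to be run once per process.  Whether the signal is
still deferred after such a longjmp varies between systems, so it makes
no claim about a second delivery.]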

Date: Monday, 10 June 1985  16:28-PDT
From: David Eppstein <Kronj at SU-SIERRA.ARPA>
To:   Ken Harrenstien <KLH at SRI-NIC.ARPA>
Subject: SIGNAL.FAI

I thought I returned from the interrupt before I called the handler
routine, so that the longjmp() problem could be solved.  If not,
that's the way it should be done.
10-Jun-85 16:36:46-PDT,412;000000000001
Date: Monday, 10 June 1985  16:32-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   Kronj at SU-SIERRA.ARPA
cc:   KLH at SRI-NIC.ARPA
Subject: SIGNAL.FAI

No, and yes, that change will satisfy V7 -- but 4.2BSD has some more
sophisticated hacks for deferring interrupts and I'm not completely
sure (left my manual in wrong place) that debrk% is always the right
thing.  I will get to it eventually.
11-Jun-85 10:09:43-PDT,1373;000000000001
Date: Tuesday, 11 June 1985  05:21-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   kronj at SU-SIERRA.ARPA
cc:   klh at SRI-NIC.ARPA
Subject: Progress

How goes it?  While waiting for the fixed KCC (the only thing holding
up the TENEX port) I have occupied myself with more runtime twiddling.
I discovered that the ITS version of FAIL knows how to output stuff in
the ITS relocatable format (it actually translates it on the fly) which
eliminates a whole bunch of potential hair.  Should be fun.  As for WAITS,
it's become quite clear that the WAITS code was totally broken a long time
ago.  I have been fixing it up where it was obvious what needed to be done,
but I don't seriously expect it to work unless some WAITS person (who is
familiar with SAILON 55, the U-boat book) takes an interest in bringing it
up.  At least it will be better organized to begin with...

Date: Tuesday, 11 June 1985  10:09-PDT
From: David Eppstein <Kronj at SU-SIERRA.ARPA>
To:   Ken Harrenstien <KLH at SRI-NIC.ARPA>
Subject: Progress

Which fixes are you waiting for?  I fixed the ?: problem on Sunday.
If it's just a working version with up to date source you're waiting
for, you would do better by taking *.*.-2 (this almost always matches
SYS:CC.EXE.0, and typically will be the same as *.*.0 exactly when I
know the most recent sources to be working).
11-Jun-85 17:39:53-PDT,634;000000000001
Date: Tuesday, 11 June 1985  17:27-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   Kronj at SU-SIERRA.ARPA
cc:   KLH at SRI-NIC.ARPA
Subject: Retrieval

Uh huh, though I sometimes find the auxiliary stuff (eg NOTES) interesting
too.

Incidentally, although I am now proceeding OK with the port, I don't want to
give back the runtimes yet until #asm is there, because otherwise the
documentation has to go into all kinds of grungy things that I intend to
flush as soon as possible anyway.  Once that is in place I can make a final
sweep and everything will then be nicely wrapped up.  (that's the plan,
anyway...)
12-Jun-85 17:14:50-PDT,456;000000000001
Date: Wednesday, 12 June 1985  17:11-PDT
From: Bill Palmer <whp4 at SU-SIERRA.ARPA>
To:   Kronj at SU-SIERRA.ARPA
cc:   Lougheed at SU-SIERRA.ARPA
Subject: reloading host table

Well, you run KCC all the time.  KCC doesn't yet produce perfect, compact
code in all cases.  Therefore, you are causing the load to go through the
roof, and ought to stop reading your mail and get back to work on speeding
up KCC and the code it produces.  Geez!  :-)
12-Jun-85 19:34:50-PDT,555;000000000001
Mail-From: WHP4 created at 12-Jun-85 19:34:49
Date: Wed 12 Jun 85 19:34:49-PDT
From: Bill Palmer <[email protected]>
Subject: cmode bug?
To: [email protected]

I was having all sorts of problems just now with CMODE - getting
UEB errors whenever I tried to do C-I or C-J.  When I turned on
matching paren, I discovered that a single-quote I had in a printf
format string was matching a double-quote at the beginning of said
string and removing it seemed to fix things.  Does a syntax table
need diddling somewhere?  

					Bill
-------
12-Jun-85 19:43:35-PDT,735;000000000001
Date: Wed 12 Jun 85 19:43:35-PDT
From: David Eppstein <[email protected]>
Subject: #asm
To: [email protected]

#asm now works as an assembly language pass-through using the same code as -E.
I hope someday to turn this into a real assembler (that happens to re-emit
FAIL unless I do REL files someday) that will be able to do things like stick
code in the middle of functions with symbols defined for the arguments and
local variables and such a la SAIL, so you should keep stuff inside it simple.
Currently if you try to stick #asm text in the middle of a function (or even
immediately after the function) it will get emitted before any of the function
(except maybe static local variables).

Hope this helps.
-------
12-Jun-85 19:40:46-PDT,440;000000000001
Mail-From: KRONJ created at 12-Jun-85 19:39:27
Date: Wed 12 Jun 85 19:39:27-PDT
From: David Eppstein <[email protected]>
Subject: Re: cmode bug?
To: [email protected]
In-Reply-To: Message from "Bill Palmer <[email protected]>" of Wed 12 Jun 85 19:34:50-PDT

Yeah, probably.  There are some other problems I've been having with CMODE
(notably indentation after unbracketed if/while) that I may also "improve"
soon.
-------
13-Jun-85 09:59:00-PDT,704;000000000001
Date: Thursday, 13 June 1985  02:11-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   Kronj at SU-SIERRA.ARPA
cc:   KLH at SRI-NIC.ARPA
Subject: #asm

Hmm, I think I understand why this is so (#asm stuff appearing before
a function, if stuck inside it)... the whole function is parsed up and
digested before anything is output, right?

So that means #asm stuff has to encompass entire functions from label to
final RET.  That's fine.  Re simplicity, I'll definitely keep things
minimal, however I plan to use a few macros to help in this.  It occurs
to me that I might be able to use C macros instead of FAIL macros... hmmm...
an interesting notion... exciting even.  I'll experiment!
13-Jun-85 10:00:34-PDT,1901;000000000001
Date: Thursday, 13 June 1985  03:16-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   kronj at SU-SIERRA.ARPA
cc:   klh at SRI-NIC.ARPA
Subject: #asm prelim results

#asm basically works; #ifdefs and C macro expansions within the #asm
also basically work.  However I found the following glitches.  When
expanding a macro while in #asm, if the first char in the macro is
something "unusual" like ^ or % (I didn't make an exhaustive test for
others), then the resulting expansion leaves the macro name in, and skips
the first char!
	For example:
#define DEC(a) ^D<a>
#asm
	move 1,DEC(23)
#endasm

results in
	move 1,DECD<23>

That should be considered some kind of bug.  Interestingly enough,
when I tried using CC -E instead of CC -S, the macros expanded
properly, but at the end I got the warning message "Unterminated #asm
at end of file".


One annoying problem that I encountered is the fact that newlines
cannot be made part of a macro definition.  This is not a KCC bug, it is
another one of those language defects.  This makes it hard to
expand a C macro into a series of instructions.  This would not be a
problem with MIDAS ("?" can separate words), or with MACRO (the EXP
pseudo should work), but for FAIL one would have to define some kind
of word separator macro and in that case one might almost just as well
use a FAIL macro for the whole thing.

I think what I'll do is define a few standard macros (maybe just %EXP)
and put them in the header file.  All will start with % so there is no
possible symbol conflict.  One reminder for us when the doc gets
written: IFN, IFE pseudos cannot be used in code or in macros, because
they get purged!  (Could equate them to %IFN etc if really necessary).
I guess the problem of collision between assembler pseudos and
user-defined symbols is one argument for eventually skipping the
assembler step.
13-Jun-85 10:01:05-PDT,447;000000000001
Date: Wednesday, 12 June 1985  17:22-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   kronj at SU-SIERRA.ARPA
cc:   klh at SRI-NIC.ARPA
Subject: New KCC bug

The file <KLH>RUNTM.C causes KCC to go into an infinite loop printing
error messages.  Try it.  (You also need the include file <KLH>C-ENV.H).
This stuff is not supposed to work (I started an accidental compile) but
at least it did something useful by uncovering a KCC bug.
13-Jun-85 10:01:22-PDT,805;000000000001
Date: Wednesday, 12 June 1985  16:27-PDT
From: Bill Palmer <whp4 at SU-SIERRA.ARPA>
To:   kronj at SU-SIERRA.ARPA
Subject: am I being dense?

I have this little program called <whp4>prtld.c which doesn't seem to behave
correctly.  Specifically, it does a whole series of iread() calls but after
the first one iread() doesn't seem to get called with the right arguments -
it looks like the stack pointer is getting messed up or something.  Or, maybe
I'm being dense and making some really stupid mistake.  Could you take a look
at it some time at your convenience (like when the load is in single figures)?

					Bill

Date: Wednesday, 12 June 1985  16:28-PDT
From: Bill Palmer <whp4 at SU-SIERRA.ARPA>
To:   kronj at SU-SIERRA.ARPA
Subject: prev. msg.

oops, it is <whp4.load>prtld.c.
13-Jun-85 17:28:42-PDT,717;000000000001
Date: Thursday, 13 June 1985  16:57-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   kronj at SU-SIERRA.ARPA
cc:   klh at SRI-NIC.ARPA
Subject: More bugs

Sigh, this one is a real problem.  Apparently KCC does not understand
#if when using the -E switch.  (as opposed to #ifdef, which is OK).

There is also an apparent bug with #endasm; a following #endif is not
recognized properly.

There may be a problem with ENTRY and TITLE statements not separated by
a newline.   Not sure what is going on.  Look at the output .FAI file.

All of the above can be demonstrated by trying to compile (-c, -S, or -E)
the file <KLH>CPUTM.C (also uses C-ENV.H and C-ENV.FAI, both of which
have been revised).
17-Jun-85 22:37:45-PDT,564;000000000001
Date: Thursday, 13 June 1985  18:40-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   satz at SU-SIERRA.ARPA, kronj at SU-SIERRA.ARPA
cc:   klh at SRI-NIC.ARPA
Subject: FAIL BUG IN SEARCH

The following test program causes FAIL to spill its guts on the floor:

TITLE FAILER

	PURGE IFE,IFN,IFG,IFGE,IFL,IFLE,IFDEF,IFNDEF,IFIDN,IFDIF
	SEARCH MONSYM

GO:	HRROI 1,[ASCIZ /Hello World
/]
	PSOUT
	HALTF

END GO


There is some strange interaction between the PURGE and the SEARCH.  If
the SEARCH happens prior to the PURGE, everything is OK.
17-Jun-85 22:38:56-PDT,1935;000000000001
Date: Thursday, 13 June 1985  19:27-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   kronj at SU-SIERRA.ARPA
cc:   klh at SRI-NIC.ARPA
Subject: Argh Argh Argh

Sometimes it feels like there is some vast conspiracy attempting to
frustrate everything I try.  The latest screw is that FAIL doesn't have
any way to equate one symbol with another; thus there is no way to, for
example, transfer the meaning of IFE to %IFE prior to purging IFE.
MIDAS lets you do this.  What this means is that I cannot use any macros
within an #asm which depend on IF-type tests, because KCC is going to
include the C-HDR, which purges all of the IFs (I can see why this would
be considered helpful, although there are plenty of other assembler
pseudos still at large).

This for example screws up the straightforward way of defining ADJSP
as either ADD or SUB depending on the sign of the constant.  (This does
make a difference; ADD P,[-1,,-1] will not do what you think!  It has
to be either SUB P,[1,,1] or ADD P,[-2,,-1]).  I managed to overcome
this one with a complicated expression, but other situations are
tougher.
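
[Worked arithmetic for the ADD/SUB remark above, as a small C model of
36-bit words.  The helper names xwd, add36, sub36, left_half and
right_half are hypothetical.  ADD is a full-word add, so [-1,,-1] is
just the 36-bit value -1: the carry out of the right half cancels the
-1 in the left half, and only the right half ends up decremented
(assuming the right half is nonzero).  [-2,,-1] is the 36-bit negation
of [1,,1], which is why it matches SUB P,[1,,1].]

```c
#include <stdint.h>

#define WORD_MASK 0777777777777ULL   /* 36 bits */
#define HALF_MASK 0777777ULL         /* 18 bits */

/* Build a 36-bit word from signed left and right halves, each taken as an
   18-bit two's complement value (the assembler's  left,,right  notation). */
uint64_t xwd(int left, int right)
{
    return (((uint64_t)left & HALF_MASK) << 18) | ((uint64_t)right & HALF_MASK);
}

/* Full-word add and subtract, wrapping at 36 bits. */
uint64_t add36(uint64_t a, uint64_t b) { return (a + b) & WORD_MASK; }
uint64_t sub36(uint64_t a, uint64_t b) { return (a - b) & WORD_MASK; }

unsigned left_half(uint64_t w)  { return (unsigned)((w >> 18) & HALF_MASK); }
unsigned right_half(uint64_t w) { return (unsigned)(w & HALF_MASK); }
```

[For a stack pointer modelled as xwd(-8, 0100), adding xwd(-1, -1)
leaves the left half at 777770 and moves the right half to 77, while
both sub36(p, xwd(1, 1)) and add36(p, xwd(-2, -1)) give 777767,,77.]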

The SEARCH MONSYM bug came about because there is some code that wants to
use bit values and JSYSes and the like from MONSYM.  However, if the code
does the SEARCH MONSYM itself, it is going to fail dismally because
the C-HDR's PURGE already happened and FAIL just messes itself up.  I can
include the SEARCH in C-HDR itself prior to the PURGE, but then I worry
about what might happen to symbol references in user code which happen to
match one of the hundreds of monitor symbols.  Barf.  I suppose it can
be done without, but anyone writing any system programs in C is still going
to wish that some easy access to monitor symbols existed.  Unless we want
to fill out a truly monstrous .H file...

Well, let me know when the preprocessor bugs are vanquished and I'll get
on with the rest.
17-Jun-85 22:39:30-PDT,888;000000000001
Date: Saturday, 15 June 1985  01:11-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   kronj at SU-SIERRA.ARPA
cc:   klh at SRI-NIC.ARPA
Subject: More on #if vs -E

The following test program suffices:

#define AMAC 0
#if (AMAC+2)
main(){}
#endif

I looked at the code and the problem appears to lie with the
"expmacro" routine in CCLEX which would normally do a nextoken() before
returning, but checks prepf and inasm and does something else if
either is on.  Normally this works OK except when the #if handling
routine (cif in CCINP) is actually trying to read in a value,
unlike anything else in the macro processor.  I figure this could
be kludged around either by having CIF and EXPMACRO conspire with
some flag unique to those two, or by having CIF save, reset, and
later restore the prepf and inasm flags.  The latter is probably the more
correct method.
17-Jun-85 22:40:28-PDT,2355;000000000001
Date: Saturday, 15 June 1985  01:40-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   kronj at SU-SIERRA.ARPA
cc:   klh at SRI-NIC.ARPA
Subject: More on "apparent bug with #endasm"

I assumed this problem had to do with not recognizing a following #endif.
Wrong.  I was being faked out because an #endif just happened to be the
last thing in the file, and that is what KCC printed out when giving
the following error messages.  This program demonstrates the "bug":

[PHOTO:  Recording initiated  Sat 15-Jun-85 1:27am]

 End of PS:<KLH>COMAND.CMD.1
@ty test.c
entry foo;
#ifdef BAR
#endif /*BAR*/
@cc -c test
KCC:    test

Error at line 3 of test.c:
#endif /*BAR*/
Symbol previously defined -- .

Error at line 3 of test.c:
#endif /*BAR*/
Expected token not found -- semicolon.
?2 error(s) detected
@pop

[PHOTO:  Recording terminated Sat 15-Jun-85 1:27am]

There are two problems here.  The first is that when KCC sees an
"entry" statement without any code, it barfs.  It does not barf if it
sees an entry statement with some code, even if the entry is never
defined!  That is, "entry foo; bar(){}" will pass through KCC without
complaints, although FAIL will catch it (properly).
The second problem is that KCC's error message is pretty unhelpful.

A related problem is that this behavior makes it impossible to use
#asm for its intended purpose, because a library routine which happens
to be entirely in assembler will crap out... since the "entry" is
specified to KCC, but all the remaining code is within #asm, including
the entry point label.  This is why I wanted to be able to put #asm
code within a function definition as long as that was the only thing
inside the braces, e.g.
	entry foo;
	foo()
	{
	#asm
		..code..
	#endasm
	}

Well, I can think of two interim kludges.  One is to simply not specify
an entry statement, and instead make "ENTRY FOO" part of the #asm stuff.
However, I strongly suspect this will bomb too because the C-HDR.FAI
file is likely to emit a couple words before it encounters the ENTRY.
FAIL lives up to its name again.  The other kludge is to always
provide a null routine like "static flushme(){}" in such library
module files, which can be taken out later when this stuff is done
right (ie #asm allowed in function body).

Progress flounders on...
17-Jun-85 22:42:01-PDT,1154;000000000001
Date: Saturday, 15 June 1985  14:48-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   kronj at SU-SIERRA.ARPA
cc:   klh at SRI-NIC.ARPA
Subject: -E vs #if: fix

Well, after waking up I tried out the fix and it seems to work OK.
I am including a srccom so you know what I tried; it's not intended to
be a real edit, so feel free to redo it if you incorporate it.


;COMPARISON OF SS:<C.KCC.CC>CCINP.C.48 AND SS:<C.KCC.CC>CCINP.C.49
;OPTIONS ARE    /3

**** FILE SS:<C.KCC.CC>CCINP.C.48, 6-34 (14081)
**** FILE SS:<C.KCC.CC>CCINP.C.49, 6-33 (14079)
	int saveprepf, saveinasm;	/* KLH: ADDED */
***************

**** FILE SS:<C.KCC.CC>CCINP.C.48, 6-38 (14239)
    nextoken();				/* start up token parser again */
    yes = pconst();			/* parse constant expression */
**** FILE SS:<C.KCC.CC>CCINP.C.49, 6-39 (14284)
	saveprepf = prepf;	/* KLH: ADDED */
	saveinasm = inasm;	/* KLH: ADDED */
	prepf = inasm = 0;	/* KLH: ADDED */
    nextoken();				/* start up token parser again */
    yes = pconst();			/* parse constant expression */
	prepf = saveprepf;	/* KLH: ADDED */
	inasm = saveinasm;	/* KLH: ADDED */
***************
17-Jun-85 22:43:17-PDT,1406;000000000001
Date: Sunday, 16 June 1985  16:40-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   kronj at SU-SIERRA.ARPA
cc:   klh at SRI-NIC.ARPA
Subject: Something for the to-do list

Two (three?) things.

(1) The string handling loop at "passthru" in CCINP.C needs to have its
handling of \ beefed up so that it recognizes things like \034, i.e.
the stuff after a \ is not always a single char.

This doesn't impact anything I am doing at the moment.  I noticed it
when I was trying to figure out why one of my files got an "unexpected
EOF" message while using -E.  Turned out to be because the handling of
' and " is the same, and there was a line like this in my file:
	; Don't change this definition

(2) I think it is safe to assume that character constants will only consist
of a single character value, thus finding a ' should process only
one char (invoking \-handling subroutine if necessary) and then barf
if a ' isn't seen thereafter.  The C ref man consistently states
that a char constant is a single char value.  (I recall seeing somewhere
that some compilers allow more than 1 char, but this is highly non-portable
as well as non-standard.)

(3) As an added feature, if an unexpected EOF does happen, it would be
nice if the error message could tell you where the scan (for what)
started.  It took me quite a while to figure this one out, even though
it was my fault.
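
[Illustration of points (1) and (2), not the CCINP.C code: a minimal
sketch with the hypothetical names scan_escape and scan_char_const.  It
assumes an escape is either one to three octal digits or a single
character, and that a character constant holds exactly one character
value followed by a closing quote.]

```c
/* s points just past a backslash.  Stores the character value in *value
   and returns how many source characters the escape used (1 to 3). */
int scan_escape(const char *s, int *value)
{
    int n = 0, v = 0;

    /* Up to three octal digits form a single value, e.g. \034. */
    while (n < 3 && s[n] >= '0' && s[n] <= '7') {
        v = v * 8 + (s[n] - '0');
        n++;
    }
    if (n > 0) {
        *value = v;
        return n;
    }
    switch (s[0]) {
    case 'n': *value = '\n'; break;
    case 't': *value = '\t'; break;
    case 'b': *value = '\b'; break;
    case 'r': *value = '\r'; break;
    case 'f': *value = '\f'; break;
    default:  *value = (unsigned char)s[0]; break;  /* \\ \' \" and the rest */
    }
    return 1;
}

/* s points at the opening quote of a character constant.  Returns the total
   length including both quotes, or -1 if a single character value is not
   followed by a closing quote. */
int scan_char_const(const char *s, int *value)
{
    int i;

    if (s[0] != '\'' || s[1] == '\0' || s[1] == '\'' || s[1] == '\n')
        return -1;
    if (s[1] == '\\') {
        if (s[2] == '\0')
            return -1;
        i = 2 + scan_escape(s + 2, value);
    } else {
        *value = (unsigned char)s[1];
        i = 2;
    }
    return (s[i] == '\'') ? i + 1 : -1;
}
```

[Under these rules the apostrophe in a comment such as "Don't change
this definition" is rejected after one character instead of swallowing
text up to the next quote or EOF.]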
18-Jun-85 14:40:47-PDT,455;000000000001
Date: Tuesday, 18 June 1985  13:54-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   kronj at SU-SIERRA.ARPA
cc:   klh at SRI-NIC.ARPA
Subject: KCC portability changes

I have modified some KCC files in order to make them more portable.
Here are the ones that should be copied from SS:<C.KCC.CC>:
	CCSITE.H	- New file
	CC.H
	CC.C
	CCDATA.C
	CCASMB.C

There is also C-ENV.H which is not strictly a KCC file but is
included by CCSITE.H.
18-Jun-85 16:46:20-PDT,720;000000000001
Date: Tue 18 Jun 85 16:46:20-PDT
From: David Eppstein <[email protected]>
Subject: recent changes
To: [email protected]

I think I've fixed all of your preprocessor problems, and more:
 - Entry with no code or data now works.
 - Entry checks a little harder for correct format.
 - Entry() etc at start of file work.
 - #if works with -E (expmacro no longer looks at inasm or prepf).
 - Single quote in #asm is now an ordinary character (easier than doing right).
 - Semicolon comments are skipped in #asm unless -C set.
 - /* */ comments in #asm are emitted (as ; or COMMENT) if -C set.
 - I've brought your new files across (they seem to work fine).
 - Code generation has been slightly improved.
-------
18-Jun-85 16:54:14-PDT,447;000000000001
Date: Tuesday, 18 June 1985  16:51-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   Kronj at SU-SIERRA.ARPA
cc:   KLH at SRI-NIC.ARPA
Subject: recent changes

OK, this should be the final round then... I will go through the CLIB
files, take out the various hacks I used to get around the previous
problems, recompile everything, and barring new problems it will then
be ready for slurping back.  There are no longer any .FAI files.
18-Jun-85 22:31:35-PDT,932;000000000001
Date: Tuesday, 18 June 1985  18:37-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   kronj at SU-SIERRA.ARPA
cc:   klh at SRI-NIC.ARPA
Subject: Serious KCC optimization bug

There's a bad bug in the latest KCC which I just noticed.  It may have
existed for a while (like the past week or so) without being noticed; I
ran into it just now when I recompiled CLIB and found that PRINTF was
broken!  Try recompiling PRINTF and you'll see that the piece of code here:

	case '%':
	    control++;
	    if (ladjust = (*control == '-')) control++;
	    if (zfill = (*control == '0')) control++;
	    width = 0;			/* initialize width */

becomes:

$4::
	ILDB	7,-44(17)
	CAIE	7,55
	TDZA	11,11
	MOVEI	11,1
	MOVEM	11,-41(17)
	CAIE	11,0
	ILDB	5,-44(17)		;;; BAD!!!!!!!  What if we skip?
	CAIE	5,60
	TDZA	7,7
	MOVEI	7,1
	MOVEM	7,-40(17)
	CAIE	7,0
	IBP	-44(17)
	SETZB	14,-37(17)

Better fix this soon, huh?
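
[Illustration of what the C fragment above is meant to do, as a
stand-alone function.  The names ladjust, zfill and control come from
the quoted PRINTF code; struct flags and parse_flags are hypothetical.
The point is that the test of *control for '0' must happen whether or
not the '-' was seen.  In the generated code, CAIE 11,0 skips the next
instruction when ladjust is zero, and that instruction is the ILDB that
both advances the pointer and loads the character for the following
comparison, so the load is lost along with the increment.]

```c
struct flags { int ladjust; int zfill; };

/* control points at the '%'.  Fills in *f and returns a pointer just past
   any '-' and '0' flags. */
const char *parse_flags(const char *control, struct flags *f)
{
    control++;                                   /* step over '%' */
    if ((f->ladjust = (*control == '-')) != 0)
        control++;                               /* advance only if '-' seen */
    if ((f->zfill = (*control == '0')) != 0)     /* always re-reads *control */
        control++;                               /* advance only if '0' seen */
    return control;
}
```

[For "%05d" this should give ladjust 0 and zfill 1, leaving the pointer
at the '5'; that is the case where the conditional increment is skipped.]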
18-Jun-85 22:34:03-PDT,3255;000000000001
Date: Tuesday, 18 June 1985  20:12-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
To:   kronj at SU-SIERRA.ARPA
cc:   klh at SRI-NIC.ARPA
Subject: Additional fix

Hmm, there is one minor problem with the new runtime stuff, which I'm
not sure yet how to fix.  Here's the background:
	The new C-HDR.FAI required for the runtimes is very careful
NOT to emit any code, so that #asm stuff can use ENTRY pseudos.  This
is necessary because sometimes the entry symbols cannot be represented
in C.  The C-HDR file wins by putting any code-emitting initialization
into a macro which the first %PURE or %IMPURE ($$CODE or $$DATA)
invokes.  Any code within an #asm has to be sure to use one of those;
my CLIB guidelines explain this.  The new macros are clever about
knowing the segment state, and do nothing if you are already in the
right segment, so it is OK to use them indiscriminately.  However,
there is only one caveat: KCC must not output either one until it is
actually outputting C code, otherwise it would screw up any #asm ENTRY
statements.
	Currently, KCC uses its own cleverness and knows that the preamble
will start it off in the code segment, and thus it doesn't do a $$CODE
right after the preamble.  This allows #asm stuff to win, which is good, BUT
(and here is the problem) if there is no #asm stuff, and no C data either,
then KCC will never output either a $$DATA or $$CODE and thus the
initialization macro will never be invoked!  A little pitfall more visible
to tiny test programs than big real ones...

After some thought, I have two suggestions for fixes.  The first one is
faster, but the second one seems more elegant.  Either would work, or perhaps
you have another idea.

-----------------------------------------------------------------
(1) Replace the invocation of "codeseg()" by #asm with a call to a function
like this: preseg(){ if(whichseg==0) outpreamble(); }
In other words it will output the preamble if necessary, but do nothing else.
Then, change codeseg() so that it ALWAYS outputs a $$CODE even if outpreamble()
was invoked.
	As long as codeseg() is only called once for each complete
function being output, the unnecessary $$CODEs shouldn't matter very
much (remember the macro does nothing if already in the right seg).
	You cannot use the new C-HDR directly without the new
runtimes, but I have copied the seg-switch macros into the old C-HDR
so that you can use them compatibly if you decide to use the above
sort of fix.  (File is C:C-HDR.FAI)
-----------------------------------------------------------------
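
[Illustration of suggestion (1), not the KCC source: a minimal sketch
of how preseg() and codeseg() might cooperate.  The names preseg,
codeseg, outpreamble and whichseg come from the message; the segment
constants, the output buffer, the "TITLE x" stand-in for the preamble
and the helpers emit, output_so_far and reset_output are assumptions
made so the sketch is self-contained.]

```c
#include <string.h>

enum { SEG_NONE = 0, SEG_CODE = 1, SEG_DATA = 2 };

static int  whichseg = SEG_NONE;   /* 0 until the preamble has been written */
static char outbuf[256];           /* stands in for the .FAI output file */

static void emit(const char *text)
{
    strncat(outbuf, text, sizeof outbuf - strlen(outbuf) - 1);
}

void outpreamble(void)
{
    emit("TITLE x\n");             /* stand-in for the real preamble */
    whichseg = SEG_CODE;           /* preamble leaves us in the code segment */
}

/* Called for #asm: write the preamble if needed, but nothing else. */
void preseg(void)
{
    if (whichseg == SEG_NONE)
        outpreamble();
}

/* Called before each function: ALWAYS emit $$CODE, even right after the
   preamble, so the initialization macro is sure to be invoked. */
void codeseg(void)
{
    preseg();
    emit("$$CODE\n");
    whichseg = SEG_CODE;
}

const char *output_so_far(void) { return outbuf; }
void reset_output(void) { outbuf[0] = '\0'; whichseg = SEG_NONE; }
```

[With this arrangement a file containing only #asm gets the preamble
and no $$CODE ahead of its ENTRY lines, while a tiny file with one C
function and no data still gets a $$CODE.]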

(2) Modify the "entry" keyword syntax to accept a wider range of
characters as part of a symbol.  Specifically, $ and % would be
acceptable.  Since such code would never be seen in portable programs
anyway (in the worst case, such PDP-10isms would simply be surrounded
by conditionals), this seems to be an acceptable extension, especially
since the "entry" statement by itself is already non-standard.
	Then we can let KCC worry about putting the ENTRY statements
at the beginning, and C-HDR can forget about hairy initializations
altogether!  Of course it is still a good idea to retain the cleverness
of the seg-switch macros for robustness.
19-Jun-85 10:21:26-PDT,1119;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Wed 19 Jun 85 02:10:22-PDT
Date: Wed 19 Jun 85 02:09:53-PDT
From: Ken Harrenstien <[email protected]>
Subject: One other oddity
To: [email protected]
cc: [email protected]

There is another strange bug which causes trouble when building a new
CLIB.  Compiling a file FOO.C with an "entry foo,bar;" statement, and
then just a bunch of #asm stuff, produces a FAIL file where the first
line is like this: "ENTRY foo,bar TITLE foo".  FAIL doesn't complain,
but the resulting .REL file introduces a lot of confusion into the
library!  The "entstmt()" code looks as if it should add a newline,
but I suspect something else is invisibly reaching an "EOF - no more
tokens" type of state and spewing out the #asm contents before the
entstmt() code gets a chance to break out of its loop and reach the
call to nl() just before returning.  This could be patched kludgily by
having the preamble's TITLE line be prefaced with a newline just in
case, but it is probably better to fix the underlying reason, whatever
it is.
19-Jun-85 14:05:12-PDT,475;000000000001
Mail-From: SATZ created at 19-Jun-85 14:04:07
Date: Wed 19 Jun 85 14:04:07-PDT
From: Greg Satz <[email protected]>
Subject: Re: FAIL BUG IN SEARCH
To: [email protected]
cc: [email protected]
In-Reply-To: Message from "Ken Harrenstien <[email protected]>" of Thu 13 Jun 85 18:44:35-PDT
Phone: (415) 497-1004

	PURGE IFE,IFN,IFG,IFGE,IFL,IFLE,IFDEF,IFNDEF,IFIDN,IFDIF
	SEARCH MONSYM

Kirk pointed this out. MONSYM uses some of the things that you just
purged.
19-Jun-85 15:58:14-PDT,658;000000000001
Date: Wed 19 Jun 85 15:58:14-PDT
From: David Eppstein <[email protected]>
Subject: optimizer bug and funny chars in entry
To: [email protected]

The bug where a skipped-over IBP was being folded into a following DPB/LDB
has been fixed.  There was code in the routine that did it (the common sub
finder) to be careful of skipped ops but it wasn't being called in this case.

Entry statements can now have % and $ in their identifiers.

I also rearranged some internal numbers, which led me to a typo in the code
to make automatic coercions from doubles to integers.  I guess this didn't
happen often enough for people to notice/care.
-------
19-Jun-85 21:25:43-PDT,1105;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Wed 19 Jun 85 20:10:08-PDT
Date: Wed 19 Jun 85 20:08:46-PDT
From: Ken Harrenstien <[email protected]>
Subject: Re: optimizer bug and funny chars in entry
To: [email protected]
cc: [email protected]
In-Reply-To: Message from "David Eppstein <[email protected]>" of Wed 19 Jun 85 15:58:12-PDT

Good, I have now put together a new CLIB.  Unfortunately, the current KCC
still has some over-optimization problems -- I noticed this bug earlier but
didn't track it down because I thought it might be caused by the other bug
I did find.  Anyway, my favorite bug finding program (HOCK.C) has found
another one.  <KLH>HOCK.FAI has the output -- search for the function
"xgame:" and the line marked with "===>" to find the error.  I have edited
in the sections of the source that the code was compiled from, but you can
get the whole thing as HOCK.C of course.  I believe this bug dates from
the same time as the earlier one.

I hope (!!!) this is the very last hangup... I can't think of anything else.
20-Jun-85 11:43:42-PDT,3858;000000000011
Received: from LOTS-B by Sierra with Pup; Thu 20 Jun 85 10:38:09-PDT
Date: Thu 20 Jun 85 10:38:15-PDT
From: Andrew "666" Gideon <G.GIDEON@LOTS-B>
Subject: A problem with KCC
To: kronj@Sierra
cc: A.Jiml@GSB-WHY
Reply-To: Gideon@Sierra
Office Phone: (415) 497-4816

Dave:

I have a problem with the C compiler on which you were working before
you left.  I am lucky that you are around now (although you may not,
of course, see it this way) as I get to ask the question of you.  If
you are no longer a person to whom I may quest in this matter, I would
appreciate knowing who is.

By the way, is there an "official" name for the compiler?  I never
do know how to refer to it.

At the end of this message, I am enclosing a C program (a series of
procedures, actually).  In it are two comments which I have inserted,
"/*XXXX*/" and "/*YYYY*/".  These are indicators of location.

The program is part of a set I am trying to move to TOPS-20 from UNIX.
When I compiled the program as it was, I got errors due to the fact that
STDIO.H was "#include"d twice.  The second time was in location XXXX.

When I removed this, the program would no longer yield errors, or anything
else.  It would merely sit (compiling) forever (it appeared to be looping
around FINDPA if cntl/T is to be trusted).

On a hunch, I removed the "(double)" at location YYYY.  Compilation then
occurred properly, yielding the errors (variables not unique in only
six characters, mostly) I expected (actually, those errors occurred in
the FAIL compilation stage, which it never reached with the "(double)"
in there).
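
A side effect of dropping the "(double)" is probably worth keeping in mind:
n and d are longs, so n/d is then an integer division that truncates before
the result is ever widened.  The sketch below is illustrative only; the
function names are hypothetical, and the sample values in the comment are
the numerator and denominator a TeX-written DVI file normally carries
(assumed here, they do not appear in the program above).

```c
/* Integer division happens first; only the truncated quotient is widened. */
double scale_truncating(long n, long d)
{
    return n / d;
}

/* Casting one operand forces the division itself into floating point. */
double scale_floating(long n, long d)
{
    return (double) n / d;
}

/* With n = 25400000 and d = 473628672 the first form yields 0.0,
   while the second yields roughly 0.0536. */
```

So even once the compiler stops looping, the version without the cast may
compute a PXLperDVIunmag of zero for typical inputs.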

Please, if you do look at this, let me know what you find.  I am quite
curious as to why this would cause the compiler to effectively loop
infinitely.

			Andy


/* dimen.c	Samuel W. Bent		7/30/84 */
/* utilities for dimensions */

#include <stdio.h>
#include "dimen.h"

int PXLperIN;
double PXLperFIX;
double PXLperDVI;
double PXLperDVIunmag;

static long magnification;
static double realmag;
static long numerator;
static long denominator;

set_device_resolution(p)
int p;
{
    PXLperIN = p;
    PXLperFIX = (double) PXLperIN / FIXperIN;
}

set_DVI_dimens(n,d,m)
long n,d,m;
{
    magnification = m;
    realmag = GetRealMag(magnification);
    numerator = n;  denominator = d;
    PXLperDVIunmag =  /*YYYY*/ n/d * INperRSU * PXLperIN;
    PXLperDVI = PXLperDVIunmag * realmag;
}

/*XXXX*/
long DUtoDVI(du,z)
long du;	/* tfm width in DUs (=2^-20 DSs) */
long z;		/* DVIs per DS */
{
int a,b,c,d,e;
long alpha;
long w;
    a = (du>>24) & 0377;  b = (du>>16) & 0377;
    c = (du>>8)  & 0377;  d = du & 0377;
    alpha = z<<4;  e = 0;
    while (z>=040000000) {
	z >>= 1;  ++e;
	};
    w = ( ( (((d*z)>>8) + c*z) >>8) + b*z) >> (4-e);
    if (a!=0 && a!=0377) Error(0,"Bad tfm width %d",du);
    if (a==0377)  w -= alpha;
    return(w);
}

pxlwidth(du,fix_per_ds,realmag,pxl_per_in)
long du, fix_per_ds;
float realmag;
int pxl_per_in;
{
    return ( FIXtoPXL((int) (DUtoDS(du) * fix_per_ds * realmag) ));
}

/* this routine takes an integer representation of a mag factor (value of the
   magnification times 1000) and returns the float representation (no 1000
   factor).  The routine does a certain amount of faking to make sure that
   the magnification returned is correct. */
double GetRealMag(intmag)
int intmag;
{
double realmag;
    if (intmag == 1095)		realmag = 1.095445;	/* stephalf */
    else if (intmag == 1315)	realmag = 1.314534;	/* stepihalf */
    else if (intmag == 2074)	realmag = 2.0736;	/* stepiv */
    else if (intmag == 2488)	realmag = 2.48832;	/* stepv */
    else if (intmag == 2986)	realmag = 2.985984;	/* stepiv */
    else			realmag = (double) intmag / 1000;
    					/* remaining mags have been ok */
    return (realmag);
}
20-Jun-85 11:44:18-PDT,7876;000000000005
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Thu 20 Jun 85 11:38:24-PDT
Date: Thu 20 Jun 85 11:36:53-PDT
From: Ken Harrenstien <[email protected]>
Subject: FAIL is not MIDAS
To: [email protected]
cc: [email protected]

Sigh, the floating point macros are being screwed up because KCC outputs
stuff like this:
	DFAD 2,[0
		0]
The macro cannot handle such things.  It would be better to avoid all use
of multi-word literals.  For the above case, do something like
	DFAD 2,$ZERO
and the KCC runtime will have a $ZERO: location defined which will contain
enough zero words to cover the largest basic data type (for now this is just
2 words).  The header file will take care of the EXTERN.
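
A minimal sketch of what the emitting side might look like, assuming the
output routine simply formats one instruction line at a time.  The name
format_dfad_zero is hypothetical and is not taken from CCOUT.C; the only
things carried over from the message above are the DFAD opcode and the
$ZERO symbol.

```c
#include <stdio.h>

/*
 * Hypothetical helper: write "DFAD reg,$ZERO" into buf instead of
 * emitting a two-word literal.  Register numbers are printed in octal,
 * as in the assembly listings in this file.  Returns the number of
 * characters written, or -1 if reg is not a valid AC (0 through 017).
 */
int format_dfad_zero(char *buf, size_t size, unsigned reg)
{
    if (reg > 017)
        return -1;
    return snprintf(buf, size, "\tDFAD\t%o,$ZERO\n", reg);
}
```

Since the operand is then a plain symbol, the floating point macros never
see a literal that spans more than one line.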


Date: Thu 20 Jun 85 13:27:06-PDT
From: David Eppstein <[email protected]>

Ok, you want to fix this in your version?  Then I can back-edit it in if
necessary to my copy of CC when I bring your stuff across.

Oh, I found and fixed your hock bug.  I think I'll keep around a copy of
hock.c so I can track how the assembly output changes with changes in
the peepholer and find some of these bugs before they get out...I already
do this with KCC itself but apparently that isn't enough...


Date: Thu 20 Jun 85 13:29:25-PDT
From: Ken Harrenstien <[email protected]>

No problem, it's pretty trivial.  So I'll snarf the latest stuff now,
edit, and report either "OK, it's ready" or "Uh, another bug..."


Date: Thursday, 20 June 1985  13:33-PDT
From: David Eppstein <Kronj at SU-SIERRA.ARPA>

Well you'd actually do better to wait until later this evening.  I've
just found another bug (I rearranged register allocation and there's a
problem with case statements).


Date: Thu 20 Jun 85 13:35:44-PDT
From: Ken Harrenstien <[email protected]>

Oh well, I guess I will amuse myself by re-arranging more runtimes in
the interim...


Date: Thursday, 20 June 1985  17:25-PDT
From: Ken Harrenstien <KLH at SRI-NIC.ARPA>
Subject: DFLOT fix, and DFIX buggy

While you are fixing the other stuff why don't you just change the DFLOT R,
output in CCOUT.C to spit out a DFAD R,$ZERO instead of its current text.
Very simple.  For compatibility until the new runtime is there, add this to
your C-HDR.FAI file.
	DEFINE $ZERO
	<[ 0
	   0]
	>

By the way, while looking at CCOUT I noticed that the calling
sequences for DFIX are vulnerable to screwage if R=14 or =15.  In
general, I think it would be a lot simpler, faster, and safer for the
runtime stuff to have both 15 and 16 available to it instead of just
16.  The hassle, overhead, and risk don't seem to be worth the
marginal benefit of one extra AC.  Comment?


Date: Thu 20 Jun 85 18:06:14-PDT
From: David Eppstein <[email protected]>

Ok, C-HDR and CCOUT now use $ZERO.  I've also fixed DFIX for the case
DFIX 14,14 (the only case as far as I can tell that was producing bogus
results) and improved the code it emits for the case DFIX 15,X.
I'd rather not give up any more registers for such things (especially
as I consider things like DFIX rather unimportant).

What I would like to do eventually is change the calling convention
for things like DFIX to return values in ACs 1 and 2 like normal
functions, but also to take arguments there.  Then code generation can
force the value into those registers and return it from there without
all these worries about exchanging things and doing different code
sequences to preserve the meaning of AC15 even if it isn't being used
(and with my latest register allocation, AC15 is very rarely used).
The DFIX op would continue to be used so that I can distinguish it
from a normal PUSHJ, though.


Date: Thu 20 Jun 85 19:23:33-PDT
From: Ken Harrenstien <[email protected]>

Well, I'm not real hung up about which registers are used, but am not
sure I understand what you have in mind.  From what I know, the balance
seems to tilt towards reserving 15&16 rather than 1&2.  Here are some
observations:
	15 is claimed to be "very rarely used" already.
	15&16 would never have to be saved/restored, whereas:
	1&2 by contrast are frequently used and have to be saved/restored.
		Alternatively, keeping them out of the code gen stuff
		(like 15&16) could involve more frequent shuffles for
		the function return values.
	Note that the high ACs always have to be treated specially since
		a double can never be stored in the last one.  In fact,
		if a double is stored in 15+16, the KA double-prec macros
		will smash 17.  There is no good way to fix this.
		So you have to ignore 16 (at least) anyway.
	DFIX may be infrequent, but it certainly isn't the only thing that
		can benefit from having 2 reserved ACs instead of one.
		All the other runtime hooks (SPUSH,SPOP,ADJBP,SUBBP)
		can use them too, and collectively they amount to something.
	I shudder to think about having to massage the ADJBP macro to
		save/load/restore 1&2 especially if they might already
		contain an address/value needed for loading.
		KCC would have to know a lot more than it does now about
		what is really going on.

Gotta go, sys coming down.


Date: Thursday, 20 June 1985  22:50-PDT
From: David Eppstein <Kronj at SU-SIERRA.ARPA>

Well, the register pair used isn't really important.  I just thought
it would be easier to use 1/2 since there are already routines for
allocating those specific registers (all other register allocation
stuff picks its own rather than being given numbers).  This is
currently not a consideration because the current way of doing things
doesn't go near the register allocator.

The current code that emits a double to int coercion looks like
	r = genstmt (n->left);		/* get double into register pair */
	code0 (DFIX, r, r);		/* emit DFIX R,R pseudo-op */
	narrow (r, 0);			/* forget about R+1 */
	return r;			/* result is in R */
and all the work is hidden within the DFIX op.

What I would do would instead look something like
	release (getrpair());		/* kludge make sure ACs 1/2 unused */
	code0 (DMOVE, RETVAL, genstmt (n->left)); /* get double into 1/2 */
	code5 (DFIX, 0);		/* fix double in ACs 1/2 into AC1 */
	return getret();		/* result is in AC1 */
where the calling and return conventions of the routine are more
explicitly set out here rather than hidden to all but ccout.

The internal code after a little peepholing would then tend to look like
	DMOVE	1,x
	DFIX			;really PUSHJ 17,$DFIX
(depending on where the double came from) rather than the current
	DFIX	R,x
but that DFIX R,x would have expanded out to several more instructions
than the two of the DMOVE and PUSHJ except in the rare case that R=15.

Drawbacks would be that if this result is to be combined with some
value already calculated into ACs 1 or 2, then that other value would
have to be spilled (during the call to getrpair()), and that common
subexpressions previously available in ACs 1 or 2 would be lost.  It
would be possible to avoid both of these problems if we used two
reserved registers (say by making 15 reserved and using the current
calling conventions) but then we would lose another instruction
putting the result back into a normal register - results can't live in
reserved registers or we lose the point of having reserved them.  I
think that it would be rare for the drawbacks to add up to more than
one memref, and common for them to not apply at all; whereas if we
used 15 as a new reserved register we'd always lose that instruction
as well as having one less normal register.

Anyway, I'm not going to do anything about this until I have the
runtimes back and until I have nothing better to be working on instead.
Since the calling conventions seem to work now, and since I don't see
these routines getting much use, there should be plenty of things that
count as better.
20-Jun-85 13:25:04-PDT,1116;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Thu 20 Jun 85 13:19:24-PDT
Date: Thu 20 Jun 85 13:17:58-PDT
From: Ken Harrenstien <[email protected]>
Subject: Minor annoyance for later
To: [email protected]
cc: [email protected]

This isn't important, but I thought I would mention it for posterity.

It is often convenient for a function to have a little assembler "helper"
routine defined next to it in the same file.  This reduces the number of
random external/entry symbols and improves locality.  However, when KCC
sees the C function's reference to its helper function, it has no way of
knowing that the helper is actually internal, rather than external, and
so KCC adds an EXTERN statement for the helper's name.  FAIL, having already
seen this symbol, barfs about it.

Now, this doesn't affect the goodness of the .REL file so it isn't
anything that has to be fixed; it just scares the user for a moment.
Of course this will go away if/when KCC is able to grok #asm within a
function body (even if #asm stuff is the only thing inside the body).
20-Jun-85 13:40:20-PDT,454;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Thu 20 Jun 85 13:37:11-PDT
Date: Thu 20 Jun 85 13:35:44-PDT
From: Ken Harrenstien <[email protected]>
Subject: Re: DFAD [0 \ 0]
To: [email protected]
cc: [email protected]
In-Reply-To: Message from "David Eppstein <[email protected]>" of Thu 20 Jun 85 13:33:00-PDT

Oh well, I guess I will amuse myself by re-arranging more runtimes in
the interim...
20-Jun-85 15:37:54-PDT,958;000000000001
Date: Thu 20 Jun 85 15:37:54-PDT
From: David Eppstein <[email protected]>
Subject: Re: A problem with KCC
To: G.Gideon@LOTS-B
cc: A.Jiml@GSB-WHY.#Pup
In-Reply-To: Message from "Andrew "666" Gideon <G.GIDEON@LOTS-B>" of Thu 20 Jun 85 11:43:42-PDT

(1) The compiler is called KCC
(2) I get some errors when I compile it but they are all supposed to happen.
    Most of them are due to not having a copy of "dimen.h".  One is because
    GetRealMag() is implicitly declared (int()) by its first use and then
    redeclared as (double()) at its definition.  I don't see the looping you
    describe.  Therefore I have to assume you have an old version of KCC.
(3) You can get a new version from SYS:CC.EXE and C:*.* on Sierra.
(4) Please fix the "To:" fields you are sending out to conform to RFC822.
    The current format (with the "666" in the middle) causes MM to
    want to reply to Gideon locally rather than G.Gideon@LOTS-B.
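
The redeclaration mentioned in (2) goes away if the function is declared
before its first use.  A small sketch, trimmed from the posted program;
set_magnification and current_realmag are hypothetical stand-ins for the
surrounding code, and the prototype syntax is the later ANSI form rather
than anything this compiler is known to accept (the old-style equivalent
would be "double GetRealMag();").

```c
/* Declare before first use so callers see the double return type. */
double GetRealMag(int intmag);

static double realmag;

/* Hypothetical caller standing in for set_DVI_dimens(). */
void set_magnification(long m)
{
    realmag = GetRealMag((int) m);
}

/* Hypothetical accessor, only here so the value can be inspected. */
double current_realmag(void)
{
    return realmag;
}

/* Definition now agrees with the earlier declaration. */
double GetRealMag(int intmag)
{
    if (intmag == 1095)
        return 1.095445;            /* stephalf */
    return (double) intmag / 1000;  /* remaining mags */
}
```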
-------
20-Jun-85 23:22:45-PDT,1510;000000000001
Date: Thu 20 Jun 85 23:22:45-PDT
From: David Eppstein <[email protected]>
Subject: angle brackets in C program args
To: [email protected]

Is it my imagination or does  KCC <dir>file.c  no longer work?
Has somebody (I would assume neither you nor me since neither of
us has been hacking the CLIB I'm running) broken this recently?


Date: Fri 21 Jun 85 02:10:19-PDT
From: Ken Harrenstien <[email protected]>

I don't even have write access to the SIERRA CLIB stuff.  That syntax
also fails here.  Since I am hacking CLIB anyway I can investigate and
fix it in my copy.


Date: Fri 21 Jun 85 02:24:42-PDT
From: Ken Harrenstien <[email protected]>

OK, I found the problem with CC <dir>fname.  The code that WHP4 added to
handle "&" in the JCL was inserted in a place where the <dir> handling
expected to drop through to the default arg delimiter parsing.  So trying
to use that syntax made the runtime think you were using &, and other
nonsense.
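
To illustrate the kind of ordering problem described, here is a sketch of
a dispatch where the "<" case relies on dropping through to the default.
Everything in it is hypothetical (the names, the idea that a bare "<"
means input redirection, the looks_like_dir flag); it is not the CLIB
code, only the shape of the fix: the "&" case gets its own arm, placed
where nothing can fall into it.

```c
enum action { ACT_ORDINARY, ACT_BACKGROUND, ACT_REDIRECT_IN };

/*
 * Classify the character that starts a JCL argument.
 * looks_like_dir is nonzero when the text has the form <dir>file.
 */
enum action classify(int c, int looks_like_dir)
{
    switch (c) {
    case '&':
        return ACT_BACKGROUND;      /* separate arm, nothing falls into it */
    case '<':
        if (!looks_like_dir)
            return ACT_REDIRECT_IN;
        /* FALLTHROUGH: "<dir>file" is just an ordinary argument */
    default:
        return ACT_ORDINARY;
    }
}
```

If the '&' arm were instead inserted between the '<' arm and the default,
the fall-through would land in the "&" code, which matches the symptom.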

I will fix it here, so you will get it when the new CLIB is brought back
to SIERRA.  Assuming the current KCC has no remaining funnies, that should
be possible sometime tomorrow, though I'm not holding my breath!  Will let
you know.


Date: Friday, 21 June 1985  16:30-PDT
From: Bill Palmer <[email protected]>

I could have sworn that both CC <dir>foo  and   foobar -k baz & worked
at the same time when I put in the "&" code.  I just noticed the funny
behavior today myself.  Of course, I could be mistaken.

					Bill
21-Jun-85 10:16:57-PDT,771;000000000005
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Fri 21 Jun 85 05:54:12-PDT
Date: Fri 21 Jun 85 05:54:58-PDT
From: Ken Harrenstien <[email protected]>
Subject: KCC error message
To: [email protected]
cc: [email protected]

Something that ought to be improved:

[PHOTO:  Recording initiated  Fri 21-Jun-85 5:53am]

 End of PS:<KLH>COMAND.CMD.1
@ty test.c
extern int bar;

entry foo;

foo()
{       return(1);
}
@cc -c test
KCC:    test

Error at line 3 of test.c:
entry foo;
Expected token not found -- semicolon.
?1 error(s) detected
@pop

[PHOTO:  Recording terminated Fri 21-Jun-85 5:53am]

The error message is pretty cryptic.  Presumably KCC is saying the "entry"
statement has come too late.
21-Jun-85 10:17:24-PDT,5095;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Fri 21 Jun 85 07:23:57-PDT
Date: Fri 21 Jun 85 07:22:05-PDT
From: Ken Harrenstien <[email protected]>
Subject: FAIL FAIL FAIL
To: [email protected]
cc: [email protected]

Well, not today.  I have beat my head to a pulp trying to figure out why
FAIL is suddenly assembling PRINTF into something that LINK barfs on.
I have at least determined that the file without a SEARCH MONSYM will
load OK, but with it the loader complains.  I don't understand why
this doesn't happen with any other files except this one.  Have you
ever experienced LINK messages like this?

[PHOTO:  Recording initiated  Fri 21-Jun-85 7:20am]

 End of PS:<KLH>COMAND.CMD.1
@link
*prin:long
*printf
%LNKJPB Junk at end of polish block
        Detected in module PRINTF from file DSK:PRINTF.REL
        The  specified  module  contains  an incorrectly formatted
        polish fixup block  (type  11).  Either  the  last  unused
        halfword  (if  it  exists) is non-zero, or there are extra
        halfwords following  all  valid  data.  LINK  ignores  the
        extra  data.  This  error is probably caused by a fault in
        the language translator used for the program.  This  error
        is  not expected to occur.  If it does, please notify your
        Software Specialist or send a Software Performance  Report
        (SPR) to DIGITAL.
*^C
@pop

[PHOTO:  Recording terminated Fri 21-Jun-85 7:20am]

What I need, in order to figure out what is going on, is a program that can
list out and interpret the blocks of a .REL file.  Have you ever
heard of such a utility?  Seems there must be one someplace.


Date: 21 Jun 1985  10:30 PDT (Fri)
From: David Eppstein <[email protected]>

Presumably some macro from MONSYM gets confused by the PURGEs.
FAIL is easy to break.  Why do you even need MONSYM loaded?
Surely you can do everything you want with #defines...


Date: Friday, 21 June 1985  11:14-PDT
From: Ken Harrenstien <[email protected]>

I want to avoid having to look up all the bit values for all the
JSYSes and JSYS arguments and stuff.  Note: the same behavior happens
even when the PURGE is commented out!  Now THAT really bothers me.
Oh well, back to sleep.


Date: Friday, 21 June 1985  16:32-PDT
From: Bill Palmer <[email protected]>

Greg Satz has a RELPRT program (available from <SU-UTILITIES>RELPRT.PAS)
that does at least part of what you want - I imagine it could be easily
beaten into doing all of what you want.

					Bill


Date: Friday, 21 June 1985  16:35-PDT
From: Ken Harrenstien <[email protected]>

Aha, interesting.  I should probably suppress my knee-jerk loathing for
PASCAL for a moment and see what this frob has...


Date: Friday, 21 June 1985  15:51-PDT
From: Ken Harrenstien <[email protected]>
Subject: FAIL bug fixed (%LNKJPB - Junk in Polish Block)

OK, I can hardly believe it, but I was able to track down the cause of
this lossage.  Basically LINK was complaining that there was junk in
a polish block when loading one of the C library files.  I found that
removing the SEARCH MONSYM caused the LINK error message to go away.
Well, the reason why it wins without, and loses with, is because the
SEARCH loads up a lot of stuff into FAIL's temporary storage area, which
is also used later on in the assembly to build polish fixup blocks (type 11)
for output.  The code was not clearing memory before writing into it,
so if there was anything not actually written into (such as unused bits in
the relocation word, or an extra halfword at the end), it would acquire the
value of whatever happened to previously be there... which of course is
almost random.  LINK thinks a non-zero extra halfword implies "junk" (and,
you know, I kind of agree).

Anyway, I fixed FAIL's POLOUT routine to zap the extra halfword, and for
good measure it also makes sure that unused reloc bits are clear (they are
pretty confusing when trying to compare binary programs, or make sense of
the .REL file).
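
For anyone who wants the general shape of the fix without reading FAIL:
the sketch below packs 18-bit halfwords two to a 36-bit word into a
scratch buffer that may still hold stale data, and clears the words it is
about to use first.  It is written in C with hypothetical names
(pack_halfwords, HALF_MASK) and 36-bit words held in 64-bit integers; it
is not a transcription of POLOUT.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define HALF_MASK 0777777ULL    /* low 18 bits */

/*
 * Pack n halfwords into words[], left half first.  Returns the number
 * of words used, or 0 if they do not fit.  With an odd n, the last
 * right halfword is left as zero instead of whatever was in the buffer.
 */
size_t pack_halfwords(uint64_t *words, size_t max_words,
                      const uint32_t *halves, size_t n)
{
    size_t nwords = (n + 1) / 2;

    if (nwords > max_words)
        return 0;

    /* The fix: clear the target words before writing into them. */
    memset(words, 0, nwords * sizeof words[0]);

    for (size_t i = 0; i < n; i++) {
        uint64_t h = halves[i] & HALF_MASK;
        if (i % 2 == 0)
            words[i / 2] |= h << 18;    /* left half */
        else
            words[i / 2] |= h;          /* right half */
    }
    return nwords;
}
```

Without the memset, an odd count leaves the final right halfword holding
old buffer contents, which is the "junk" LINK objects to.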

You can get the new source from [SRI-NIC]SRC:<LOC.SUBSYS>FAIL.FAI.
I believe it is anonymously accessible.  If you discover that my source
in fact is an old version, please merge the stuff and let me know!!

This has nothing to do with the "FAIL BUG IN SEARCH" bug, which remains.


Date: 21 Jun 1985  17:04 PDT (Fri)
From: David Eppstein <[email protected]>

If there's any way to avoid having to distribute a fixed version of
FAIL with KCC, I would prefer to do things that way.

Which is not to say that I don't want the fix.  I just don't want to
depend on having the fix.


Date: Friday, 21 June 1985  17:15-PDT
From: Ken Harrenstien <[email protected]>

LINK loads the .REL anyway -- this just prevents it from complaining
bitterly while doing so.  Actually it would be better to depend on MACRO
than on FAIL.  Now that everything is in .C I think I could toggle between
MACRO, FAIL, and MIDAS with little trouble.  Would you like me to set things
up to allow this?
21-Jun-85 10:18:31-PDT,1094;000000000001
Return-Path: <[email protected]>
Received: from SU-GSB-HOW.ARPA by SU-SIERRA.ARPA with TCP; Fri 21 Jun 85 09:09:11-PDT
Date: Fri 21 Jun 85 09:08:58-PDT
From: Andrew "VaxBuster" Gideon <[email protected]>
Subject: Re: A problem with KCC
To: [email protected]
cc: [email protected]
In-Reply-To: Message from "David Eppstein <[email protected]>" of Thu 20 Jun 85 15:47:04-PDT
Office Phone: (415) 497-4816

I just got a version of KCC from SIERRA last week.  That may be the
problem, I suppose, so I will get it again.

I got no errors at all, but then I had a copy of the .H file.  Do you care
if the looping occurs with the new compiler, and the .H file?  Should
I let you know?

			Andy

P.S.	The "To:" field is set by MM.  I have never heard a complaint
	about it before.  I must confess that I have not read RFC822, but
	I assume that the reason it tried to reply locally is that
	I put a "Reply-To" address in the header of Gideon@SIERRA.  I
	do this from machines not in the NIC host table by habit, using
	my mailbox at either SIERRA or SCORE.
21-Jun-85 10:32:06-PDT,1416;000000000001
Mail-From: KRONJ created at 21-Jun-85 10:31:51
Date: Tuesday, 18 June 1985  17:10-PDT
Message-ID: <[email protected]>
Sender: David Eppstein <[email protected]>
From: David Eppstein <[email protected]>
To: [email protected]
cc: [email protected]
Subject:   compiling nested conditionals
ReSent-From: [email protected]
ReSent-To: bug-c-archive
ReSent-Date: Fri 21 Jun 1985 10:31-PDT

Sure, I do it in the DEC-20 C compiler I maintain.  I picked up
the trick in the introductory compilers course I took at Stanford
but I don't remember whether I was taught it there or made it up.
Sure beats fixing it up afterwards.

Date: Thursday, 20 June 1985  06:36-PDT
From: seismo!philabs!sbcs!debray at columbia.arpa (Saumya Debray)
To:   eppstein at columbia.arpa
Re:   compiling nested conditionals

Thanks for the message.  In the intro compiler course I took here at
Stony Brook, we generated the goto-chains and fixed them afterwards, so
I thought it really neat when I came up with this trick of passing labels
around to generate optimal branching code directly.  I've discovered,
since, a very similar idea for generating code for nested conditionals
and "while" statements in Aho & Ullman's "Principles of Compiler Design".
Guess I'll have to look elsewhere for my claim to fame! :-)

Saumya Debray
SUNY at Stony Brook
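
For reference, the label-passing idea discussed in this exchange can be
sketched roughly as follows.  This is a reconstruction under assumptions,
not code from KCC or from either correspondent: the expression type, the
JT/JF/JMP pseudo-ops, and all function names are made up for the example.
Each call is handed a "true" label and a "false" label, with 0 meaning
"fall through", and at least one of the two must be nonzero.

```c
#include <stdio.h>
#include <stddef.h>

enum kind { LEAF, AND, OR, NOT };

struct expr {
    enum kind kind;
    const char *name;           /* LEAF only */
    const struct expr *l, *r;   /* operands; NOT uses l only */
};

struct gen {
    char buf[512];              /* generated pseudo-assembly */
    size_t len;
    int next_label;             /* next fresh label number */
};

static void append(struct gen *g, const char *text)
{
    int n = snprintf(g->buf + g->len, sizeof g->buf - g->len, "%s", text);
    if (n > 0) {
        size_t room = sizeof g->buf - g->len - 1;
        g->len += ((size_t) n < room) ? (size_t) n : room;
    }
}

static void emit_jump(struct gen *g, const char *op, const char *name, int label)
{
    char line[64];
    if (name)
        snprintf(line, sizeof line, "%s %s,L%d\n", op, name, label);
    else
        snprintf(line, sizeof line, "%s L%d\n", op, label);
    append(g, line);
}

static void emit_label(struct gen *g, int label)
{
    char line[32];
    snprintf(line, sizeof line, "L%d:\n", label);
    append(g, line);
}

/* Emit code that goes to tl if e is true and to fl if e is false. */
void gen_cond(struct gen *g, const struct expr *e, int tl, int fl)
{
    int skip;

    switch (e->kind) {
    case LEAF:
        if (tl != 0) {
            emit_jump(g, "JT", e->name, tl);
            if (fl != 0)
                emit_jump(g, "JMP", NULL, fl);
        } else {
            emit_jump(g, "JF", e->name, fl);
        }
        break;
    case NOT:                   /* negation just swaps the two targets */
        gen_cond(g, e->l, fl, tl);
        break;
    case AND:
        if (fl != 0) {
            gen_cond(g, e->l, 0, fl);
            gen_cond(g, e->r, tl, fl);
        } else {                /* false falls through: need a local label */
            skip = g->next_label++;
            gen_cond(g, e->l, 0, skip);
            gen_cond(g, e->r, tl, 0);
            emit_label(g, skip);
        }
        break;
    case OR:
        if (tl != 0) {
            gen_cond(g, e->l, tl, 0);
            gen_cond(g, e->r, tl, fl);
        } else {                /* true falls through: need a local label */
            skip = g->next_label++;
            gen_cond(g, e->l, skip, 0);
            gen_cond(g, e->r, 0, fl);
            emit_label(g, skip);
        }
        break;
    }
}
```

Calling gen_cond on (a && b) || !c with tl = 0 and fl set to the label of
the else part emits only conditional jumps and local labels, with no
jump-to-a-jump chains left to clean up afterwards.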
21-Jun-85 17:46:33-PDT,634;000000000001
Mail-From: KRONJ created at 21-Jun-85 17:46:28
Date: Fri 21 Jun 85 17:46:28-PDT
From: David Eppstein <[email protected]>
Subject: WHP4's file I/O bug
To: [email protected]


WHP4, TTY164, 20-Jun-85 4:22PM
sigh.  at least part of the lossage (probably all) with the program I asked
you to take a look at is my fault.  

WHP4, TTY164, 20-Jun-85 4:23PM
it got primary input confused with a file jfn - the problem was that I 
was doing iread (foo->_file, &rest) when I wanted iread(foo, &rest) and
so it did an extra reference and ended up reading from jfn -1 which of
course was the terminal.

-------
22-Jun-85 08:51:18-PDT,819;000000000001
Mail-From: KRONJ created at 22-Jun-85 08:51:16
Date: Friday, 21 June 1985  19:01-PDT
Message-ID: <[email protected]>
Sender: Ken Harrenstien <[email protected]>
From: Ken Harrenstien <[email protected]>
To: [email protected]
cc: [email protected]
Subject:   KCC -m slightly broken
ReSent-From: [email protected]
ReSent-To: bug-c-archive
ReSent-Date: Sat 22 Jun 1985 08:51-PDT

It turns out that when -m is specified, KCC continues to emit code
with references to $$ONE, but neglects to include the specifications
that it normally puts into the preamble and trailer of a FAIL file.
I'm not sure whether MACRO has the same problem with relocations to
the right segment, so am not sure whether -m should skip hacking $$ONE or
if it should include the same preamble/trailer stuff.
23-Jun-85 10:27:58-PDT,1515;000000000001
Mail-From: KRONJ created at 23-Jun-85 10:27:57
Date: Sun 23 Jun 85 10:27:57-PDT
From: David Eppstein <[email protected]>
Subject: Broken 18-bit OWGBP ADJBP in 6.1
To: [email protected]

Something to be very careful of if we ever want to do 18-bit shorts.
Maybe it would be better merely to use numbers, complicated load
sequences.  Maybe it would be better to forget the idea entirely...
                ---------------

Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Wed 19 Jun 85 13:49:43-PDT
Date: Wed 19 Jun 85 16:46:24-EDT
From: Bill Schilit <[email protected]>
Subject: TOPS-20 6.1 and PSL
To: [email protected], [email protected]
cc: [email protected]


In an earlier message I told you about a problem we were having
with PSL and tops-20 version 6.1 -- this is an update.

We found a microcode bug in our current field test version,
tape #5 of 6.1.  This bug caused the ADJBP instruction to
fail about every 4000-30000 times it was executed when the
byte pointer was an OWGBP of size 18.

I sent a four line program to DEC which displayed this problem
and they have a u-code fix.  Kevin Paetzold says it will be
coming my way in a matter of days and that the release of
6.1 will include the fix.

In the meantime any 6.0 or 6.1 sites, as well as TOPS-10 sites
with new microcode might have problems running the extended
addressing version of Portable Standard Lisp.

- Bill
-------
-------
23-Jun-85 21:55:45-PDT,737;000000000001
Mail-From: KRONJ created at 23-Jun-85 21:55:42
Date: Sunday, 23 June 1985  18:28-PDT
Message-ID: <[email protected]>
Sender: Ken Harrenstien <[email protected]>
From: Ken Harrenstien <[email protected]>
To: [email protected]
cc: [email protected]
Subject:   KCC update
ReSent-From: [email protected]
ReSent-To: bug-c-archive
ReSent-Date: Sun 23 Jun 1985 21:55-PDT

I am adding a few things to KCC for flexibility.  Changed files are
CCASMB, CCDATA, CCOUT, CC, plus CC.H and CCSITE.H.  SS:<C.KCC.CC>.
Basically just replaced "fail" with "asmtyp".  I need to split now,
but when I get back later tonight (or tomorrow) unless I hear from you
I will look at the $$ONE problem and fix it if possible.
23-Jun-85 21:56:41-PDT,3516;000000000001
Date: Sunday, 23 June 1985  18:58-PDT
From: Bill Palmer <[email protected]>
To: [email protected]
Subject: bizarreness with kcc

I decided I was too lazy to look up the values of some jsyses and bits I wanted
to use in a program, and remembered that greg had at one point generated a
monsym.h file.  So, I stuck #include <sys/monsym.h> in my program and tried
to compile it only to run into some bizarre behavior from kcc.  Here's the
simplest case:


[PHOTO:  Recording initiated  Sun 23-Jun-85 6:55PM]

!ty test.c
#include <sys/monsym.h>

main(){}
!cc test.c
KCC:	test

Error at line 2060 of C:sys/monsym.h:
#define _HPELP 0000000
.

Error at line 2060 of C:sys/monsym.h:
#define _HPELP 0000000
.

Warning at line 2321 of C:sys/monsym.h:
#define _MOPC
.

Warning at line 2322 of C:sys/monsym.h:
#define _MOPC
.

Warning at line 2329 of C:sys/monsym.h:
#define _MOPRD
.

Warning at line 2335 of C:sys/monsym.h:
#define _MORB
.

Warning at line 2336 of C:sys/monsym.h:
#define _MORC
.

Warning at line 2337 of C:sys/monsym.h:
#define _MORC
.

Warning at line 2338 of C:sys/monsym.h:
#define _MORD
.

Warning at line 2339 of C:sys/monsym.h:
#define _MORD
.

Warning at line 2340 of C:sys/monsym.h:
#define _MORD
.

Warning at line 2341 of C:sys/monsym.h:
#define _MORD
.

Warning at line 2342 of C:sys/monsym.h:
#define _MORD
.

Warning at line 2343 of C:sys/monsym.h:
#define _MORE
.

Warning at line 2344 of C:sys/monsym.h:
#define _MORE
.

Warning at line 2345 of C:sys/monsym.h:
#define _MORF
.

Warning at line 2346 of C:sys/monsym.h:
#define _MORH
.

Warning at line 2347 of C:sys/monsym.h:
#define _MORI
.

Warning at line 2349 of C:sys/monsym.h:
#define _MORLI
.

Warning at line 2350 of C:sys/monsym.h:
#define _MORL
.

Warning at line 2351 of C:sys/monsym.h:
#define _MORL
.

Warning at line 2352 of C:sys/monsym.h:
#define _MORL
.

Warning at line 2354 of C:sys/monsym.h:
#define _M
.

Warning at line 2355 of C:sys/monsym.h:
#define _M
.

Warning at line 2356 of C:sys/monsym.h:
#define _M
.

Warning at line 2357 of C:sys/monsym.h:
#define _M
.

Warning at line 2358 of C:sys/monsym.h:
#define _M
.

Warning at line 2359 of C:sys/monsym.h:
#define _M
.

Warning at line 2360 of C:sys/monsym.h:
#define _M
.

Warning at line 2361 of C:sys/monsym.h:
#define _M
.

Warning at line 2362 of C:sys/monsym.h:
#define _M
.

Warning at line 2363 of C:sys/monsym.h:
#define _M
.

Warning at line 2364 of C:sys/monsym.h:
#define _M
.

Warning at line 2365 of C:sys/monsym.h:
#define _M
.

Warning at line 2366 of C:sys/monsym.h:
#define _M
.

Warning at line 2367 of C:sys/monsym.h:
#define _M
.

Wa^C
!pop

[PHOTO:  Recording terminated Sun 23-Jun-85 6:56PM]

The warning message could be a bit more explicit, perhaps?

						Bill

Date: Sun 23 Jun 85 22:00:25-PDT
From: David Eppstein <[email protected]>

Maybe it ran out of symbol table?  I think symbol records are picked up from
allocated memory, but I think it uses some form of rehashing rather than
hash buckets so things can overflow.  Can't think of any other explanation...


Date: Sun 23 Jun 85 22:13:12-PDT
From: David Eppstein <[email protected]>

It's overflowing the macro string pool.  All those digits in all those
definitions, and it just can't handle it.  Another thing to redo to come
from dynamic memory.  Sigh.  16000 chars just isn't enough.
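
One possible shape for a pool that comes from dynamic memory, as a sketch
only.  The names (strpool, pool_add, pool_get, pool_free) are hypothetical
and this is not KCC's code.  Strings are identified by offset rather than
by pointer, on the assumption that the buffer may move when it grows.

```c
#include <stdlib.h>
#include <string.h>

struct strpool {
    char *data;     /* all strings, each NUL-terminated, back to back */
    size_t used;    /* bytes in use */
    size_t cap;     /* bytes allocated */
};

/* Copy s into the pool; return its offset, or (size_t)-1 if out of memory. */
size_t pool_add(struct strpool *p, const char *s)
{
    size_t n = strlen(s) + 1;

    if (p->used + n > p->cap) {
        size_t ncap = p->cap ? p->cap : 16;
        char *nd;

        while (ncap < p->used + n)
            ncap *= 2;              /* grow geometrically */
        nd = realloc(p->data, ncap);
        if (nd == NULL)
            return (size_t) -1;     /* old buffer is still valid */
        p->data = nd;
        p->cap = ncap;
    }
    memcpy(p->data + p->used, s, n);
    p->used += n;
    return p->used - n;
}

/* Valid only until the next pool_add, since the buffer may move. */
const char *pool_get(const struct strpool *p, size_t off)
{
    return p->data + off;
}

void pool_free(struct strpool *p)
{
    free(p->data);
    p->data = NULL;
    p->used = p->cap = 0;
}
```

A pool like this starts empty (all fields zero) and has no fixed ceiling
such as 16000 characters; the cost is that callers must hold offsets
instead of raw pointers.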
24-Jun-85 10:05:30-PDT,524;000000000001
Mail-From: KRONJ created at 24-Jun-85 10:05:27
Date: Monday, 24 June 1985  02:01-PDT
Message-ID: <[email protected]>
Sender: Ken Harrenstien <[email protected]>
From: Ken Harrenstien <[email protected]>
To: [email protected]
cc: [email protected]
Subject:   $$ONE
ReSent-From: [email protected]
ReSent-To: bug-c-archive
ReSent-Date: Mon 24 Jun 1985 10:05-PDT

Well, from my testing it appears that MACRO does the right thing
so I fixed CCOUT to only use $$ONE if the assembler is FAIL.
24-Jun-85 10:26:49-PDT,344;000000000001
Mail-From: KRONJ created at 24-Jun-85 10:26:39
Date: Mon 24 Jun 85 10:26:39-PDT
From: David Eppstein <[email protected]>
Subject: Your KCC and FAIL changes
To: [email protected]

ok, I've picked up the sources for your changes to fix -m support in KCC
and to not fail quite so badly with MONSYM.FUN.  will try compiling now.
-------
24-Jun-85 11:52:17-PDT,433;000000000001
Mail-From: KRONJ created at 24-Jun-85 11:51:54
Date: Mon 24 Jun 85 11:51:54-PDT
From: David Eppstein <[email protected]>
Subject: KLH's new improved FAIL
To: [email protected], [email protected], G.Gorin@LOTS-A
cc: [email protected]

I have a copy of KLH's fixes for garbage bits in relocation/polish
in SRA:<FAIL>FAIL.FAI on Sierra.  It seems to work, and I've put
up a binary in NEW:FAIL.EXE also on Sierra.
-------
25-Jun-85 10:15:16-PDT,1235;000000000001
Mail-From: KRONJ created at 25-Jun-85 10:14:57
Date: Tuesday, 25 June 1985  03:52-PDT
Message-ID: <[email protected]>
Sender: Ken Harrenstien <[email protected]>
From: Ken Harrenstien <[email protected]>
To: [email protected]
cc: [email protected]
Subject:   Status

What I am doing now is making sure the CLIB stuff assembles OK when
using MACRO instead of FAIL, so that having an up-to-date FAIL is not
necessary in order to put things together (note, though, FAIL could
be distributed with KCC, just as MIDAS is included with TECO/EMACS).
Since I am likely to continue hacking CLIB for a while (mainly for
portability purposes), the question arises of whether to update
SIERRA as soon as this step is done (everything consistent & working)
or wait until the hack pace slackens off.  Your preference?


Date: 25 Jun 1985  10:18 PDT (Tue)
From: David Eppstein <[email protected]>

If you were done hacking it I would want to take it back so I could be
free to hack it, but since you are going to continue there doesn't
seem to be any point in doing so.

I had made a directory <KCC.FAIL> on Sierra with the latest version of
FAIL so that it could be distributed with KCC as you suggest.
25-Jun-85 17:37:49-PDT,4783;000000000001
Date: Tuesday, 25 June 1985  17:24-PDT
From: Ken Harrenstien <[email protected]>
To: [email protected]
Subject: Register alloc bug in KCC

The test file <KLH>CONDCC.C provokes a KCC bug.  Sigh.


Date: Tuesday, 25 June 1985  17:25-PDT
From: Ken Harrenstien <[email protected]>

Also, why does KCC produce calls to $SPOP when it compiles CONDCC, especially
when it never calls $SPUSH and I'm not aware that my code moves any
structures around?


Date: Tue 25 Jun 85 17:47:43-PDT
From: David Eppstein <[email protected]>

I tried getting this but SRI-NIC now complains "Password incorrect" when
I attempt to log in as anonymous.  Will this be fixed soon?


Date: Tue 25 Jun 85 20:45:39-PDT
From: David Eppstein <[email protected]>

Well now I can log in as ANONYMOUS again, but I still can't get the file.
FTP complains "File not accessable. Read access required".


Date: Tuesday, 25 June 1985  20:49-PDT
From: Ken Harrenstien <[email protected]>

Foo.  Try again now.  The system went down for V6 testing before I could
fix the protection, and during V6 it would not recognize anonymous login,
which was why you were having problems the first time.  We will be doing
more such testing for about 1700-2000 in the next couple of days.


Date: Tue 25 Jun 85 20:58:14-PDT
From: David Eppstein <[email protected]>

Ok, I've got it and it does in fact produce a register allocation error.
I will poke at it some more tomorrow.

I am less sure of the use for the program itself.  At least at Sierra,
the EXEC has for some time understood about CC.  I just today modified
KCC so that /LANGUAGE-SWITCHES:" -Dfoo" to the COMPILE command will work
(the space is necessary).

I'm sure the SPOP is from the same bug that is causing the register
allocation error.  What this usually means is that some value is being
computed into a register or register pair but that later code for some
reason believes that it was generated as a stacked struct.


Date: Tue 25 Jun 85 23:07:19-PDT
From: David Eppstein <[email protected]>

The problem is your declaration
	char *av[];
which declares an empty array of char pointers.  Then when you assign
realloc() to av it uncleverly thinks it's making an assignment
to a zero-length object and fucks up.

I think KCC should complain that av is zero-length, and I think it should
also complain that you're assigning something to an array (i.e. constant
address).


Date: Wednesday, 26 June 1985  01:58-PDT
From: Ken Harrenstien <[email protected]>

Hmmm.  I was going to ask rhetorically why KCC should complain, since
I thought "char *av[];" was supposed to be the same thing as "char
**av;" -- in both cases, "av" is a pointer to an array of char
pointers.  HOWEVER... attempting to compile CONDCC on a 4.2 VAX
revealed the following error messages:

$ cc condcc.c
"condcc.c", line 114: illegal lhs of assignment operator
"condcc.c", line 128: illegal lhs of assignment operator
"condcc.c", line 136: syntax error
"condcc.c", line 137: illegal type combination
"condcc.c", line 138: syntax error
$ ed condcc.c
2749
114p
        av = (char **)calloc(1, sizeof(char **));
128p
                av = (char **)realloc((char *)av, argc*sizeof(char **));
136p
wsp(char)
137p
int char;
138p
{       return(char == ' ' || char == '\t');
q
$ 


This forced me to do some grungy reading of K&R and taught me
something I wasn't clearly aware of before.  Namely, **av and *av[]
are only equivalent when declared as FORMAL PARAMETERS of a function.
When declaring them elsewhere (eg as auto vars), they are NOT
equivalent.  Argh!  Grumble, grumble, C declaration syntax strikes
again!  I agree, error messages are required.

As you might also notice from the errors above, the 4.2 CC is fussy
about finding a reserved keyword ("char") used as a parameter name.
Silly me.  I guess KCC should be fussy too.
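
The distinction can be sketched in a few lines of C.  The function names
are made up for illustration and are not from CONDCC; the sizes use
sizeof(char *), the element type, where the transcript above has
sizeof(char **).

```c
#include <assert.h>
#include <stdlib.h>

/* As a formal parameter, "char *av[]" is adjusted to "char **av",
 * so assigning to it is legal here.  On failure this returns NULL and
 * the caller's original block is left untouched. */
char **grow_param(char *av[], size_t n)
{
    assert(n > 0);
    av = realloc(av, n * sizeof(char *));
    return av;
}

/* Outside a parameter list, the object being assigned to must be
 * declared as a pointer.  Declaring "char *av[];" here instead would
 * make av an array, and "av = ..." would be rejected. */
char **make_vector(size_t n)
{
    char **av;

    assert(n > 0);
    av = calloc(n, sizeof(char *));
    return av;
}
```
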


Date: Wednesday, 26 June 1985  02:07-PDT
From: Ken Harrenstien <[email protected]>

Well, I already had the basic structure there for doing TENEX
cross-compiles so I thought I would add an indirect file feature (to
try it out there before possibly adding it to KCC).  The EXEC only
expands an indirect filespec if it is the first or second thing on a
command line (with nothing else after it); otherwise the program has
to do the work.  Anyway, this makes it easy to keep just one list of
library modules which various system-dependent command files can make
use of, otherwise I have to update all of them every time a module is
added or deleted.  I don't like to depend on EXEC modifications, which
are a lot harder to port (!!).

An interesting project for some gullible student would be to re-write a
public MAKE...
25-Jun-85 20:53:04-PDT,1337;000000000001
Date: 25 Jun 1985  20:53 PDT (Tue)
From: David Eppstein <[email protected]>
To: [email protected]
Subject: CC68 port

<KCC.CC68.PORT> will now produce assembly code from a C source.
There seem to be problems if I pipe SYS:CC -E into its standard input
though (in place of CPP as the macro preprocessor)...


Date: Tuesday, 25 June 1985  22:18-PDT
From: Len Bosack <[email protected]>

Congratulations! That's a significant body of code to run through and
have work.

Since this work includes CPP, you have a one-on-one comparison of the
macro expansion process. I wonder what horrors lurk.

How about the rest of those tools? as68 and lk68 are their names (I think).
Ask Phil about the whole process, all the way through to either something
you can boot or something to program into a EPROM.

Len

Date: 25 Jun 1985  22:45 PDT (Tue)
From: David Eppstein <[email protected]>

Well actually I haven't gotten CPP to compile yet.  That's why I
wanted to use KCC -E instead.  Turns out the problem with that is in
the runtimes rather than KCC so it would happen with CPP if I got that
to compile anyway.

I don't have sources to as68 and lk68.  But even if I did it's less
clear how to do anything useful with them, because their output is
eight-bit binary rather than the text cc68 produces.
25-Jun-85 21:34:33-PDT,1731;000000000001
Mail-From: KRONJ created at 25-Jun-85 21:34:32
Date: Tue 25 Jun 85 21:34:32-PDT
From: David Eppstein <[email protected]>
Subject: bug in runtimes
To: [email protected]

If I pipe two programs together, then when they exit I will get an
illegal instruction 0 at 777777 with the stack pointing to alternating
zeros and ones.  This looks like it might have been caused by fork()
rather than pipe() so it probably also happens if you try to run a
subfork even if you don't do any plumbing.

Also fork() seems to take inordinately long in copying the address
space but I don't know if that is from this bug or just because it
has to do a lot of work to make a safe copy (we don't want to leave
any shared pages because that's not the way UNIX works).

How'd you like to look into these?


Date: Tue 25 Jun 85 21:42:02-PDT
From: David Eppstein <[email protected]>
Subject: while you're looking at fork()

Another useful thing to do would be to create vfork(), which could be
like fork() except it keeps the same map as the parent rather than
carefully copying it.  I don't think we need to worry about the
"borrowing the parent's thread of control" the 4.2 manual talks about.
Then piping on | could be changed to use it and run much faster...


Date: Wednesday, 26 June 1985  01:13-PDT
From: Ken Harrenstien <[email protected]>

I've already found and fixed a bug thereabouts which had exactly the
symptoms you describe.  Found it while using fork() and wait().
About vfork(), I'll check it out in the 4.2BSD stuff.

Looks like it would be a good idea to copy over the runtimes as soon as
possible, if only to fix stuff like that.  I'll set things up and send
you a message when it's ready.
26-Jun-85 13:23:24-PDT,1513;000000000001
Date: Wednesday, 26 June 1985  04:56-PDT
From: Ken Harrenstien <[email protected]>
To: [email protected]
Subject: Bug with MALLOC.C

I had an infernal time trying to figure out why a new version of
CONDCC was losing, and finally discovered that the free storage is being
clobbered; MALLOC seems to think that a previously allocated area is
free, when it really isn't.  This may be due to the use of REALLOC by
the program.  I haven't investigated closely yet.  What I did try was
to substitute my own versions of malloc/etc from ELLE, and CONDCC then
worked.  I am too tired to think clearly right now.  It is of course
possible that CONDCC is doing some clobbering of its own which misleads
malloc.  I should have further opinions in the light of day.


Date: Thursday, 27 June 1985  00:26-PDT
From: Ken Harrenstien <[email protected]>
Subject: False alarm

Turns out that MALLOC is okay, the program was indeed clobbering a count
word.  Those routines need to be replaced someday (fragmentation problems)
but there is no urgent bug.  Amazing what a little rest does.  Onward again...


Date: Thursday, 27 June 1985  09:54-PDT
From: David Eppstein <Kronj at SU-SIERRA.ARPA>
Subject: False alarm

Actually all that I think needs to be done to malloc and friends to
solve fragmentation is to merge blocks when freeing them.  Of course
this would be quicker (at the expense of a little space) if the
headers consisted of forward and backward links as well as the current
count...
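
A rough sketch of that idea, assuming a small static arena and sizes
counted in header-sized units to keep alignment simple.  The names (blk,
ualloc, ufree, largest_free) are hypothetical, and this is not the layout
used by MALLOC.C.

```c
#include <assert.h>
#include <stddef.h>

/* Block header: forward and backward links plus the current count. */
struct blk {
    struct blk *prev, *next;  /* neighbours in address order */
    size_t      size;         /* payload units following the header */
    int         free;         /* nonzero if available */
};

#define NUNITS 256
static struct blk arena[NUNITS];
static struct blk *head;

static void init(void)
{
    head = &arena[0];
    head->prev = head->next = NULL;
    head->size = NUNITS - 1;
    head->free = 1;
}

/* First-fit allocation of n units; splits a block when the remainder
 * can hold another header plus at least one unit. */
void *ualloc(size_t n)
{
    struct blk *b, *rest;

    assert(n > 0);
    if (head == NULL)
        init();
    for (b = head; b != NULL; b = b->next) {
        if (!b->free || b->size < n)
            continue;
        if (b->size >= n + 2) {
            rest = b + 1 + n;
            rest->size = b->size - n - 1;
            rest->free = 1;
            rest->prev = b;
            rest->next = b->next;
            if (b->next != NULL)
                b->next->prev = rest;
            b->next = rest;
            b->size = n;
        }
        b->free = 0;
        return b + 1;
    }
    return NULL;
}

/* Merge b with its successor; both are free and adjacent in memory. */
static void absorb_next(struct blk *b)
{
    struct blk *n = b->next;

    b->size += n->size + 1;   /* reclaim the successor's header too */
    b->next = n->next;
    if (n->next != NULL)
        n->next->prev = b;
}

/* Free a block and merge it with free neighbours on either side. */
void ufree(void *p)
{
    struct blk *b;

    if (p == NULL)
        return;
    b = (struct blk *)p - 1;
    b->free = 1;
    if (b->next != NULL && b->next->free)
        absorb_next(b);
    if (b->prev != NULL && b->prev->free)
        absorb_next(b->prev);
}

/* Size in units of the largest free block, for checking fragmentation. */
size_t largest_free(void)
{
    struct blk *b;
    size_t best = 0;

    if (head == NULL)
        init();
    for (b = head; b != NULL; b = b->next)
        if (b->free && b->size > best)
            best = b->size;
    return best;
}
```

With the backward link, merging at free time touches at most two
neighbours and needs no search, which is the speed-for-space trade
mentioned above.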
26-Jun-85 13:24:40-PDT,5587;000000000001
Mail-From: KRONJ created at 26-Jun-85 13:24:36
Date: Wednesday, 26 June 1985  12:54-PDT
Message-ID: <[email protected]>
Sender: Bill Palmer <[email protected]>
From: Bill Palmer <[email protected]>
To: [email protected], [email protected], [email protected]
Subject:   for you dedicated unix-blizzards non-readers out there...
ReSent-From: [email protected]
ReSent-To: bug-c-archive
ReSent-Date: Wed 26 Jun 1985 13:24-PDT

From: [email protected] (Landon Noll)
Subject: 2nd Annual Obfuscated Contest Winners
Date: 24 Jun 85 23:14:23 GMT
To:       [email protected]

*** OBFUSCATE THIS LINE WITH YOUR MESSAGE ***

The following programs were judged good (bad? weird?) enough to win
awards. This year, rather than trying to rank the programs in order of
obfuscatedness, we gave a single award in each of 4 categories and a
grand prize.

________________________________________________________________________

1. The most obscure program:
(submitted by Lennart Augustsson <seismo!mcvax!enea!chalmers!augustss> )

#define p struct c
#define q struct b
#define h a->a
#define i a->b
#define e i->c
#define o a=(*b->a)(b->b,b->c)
#define s return a;}q*
#define n (d,b)p*b;{q*a;p*c;
#define z(t)(t*)malloc(sizeof(t))
q{int a;p{q*(*a)();int b;p*c;}*b;};q*u n a=z(q);h=d;i=z(p);i->a=u;i->b=d+1;s
v n c=b;do o,b=i;while(!(h%d));i=c;i->a=v;i->b=d;e=b;s
w n o;c=i;i=b;i->a=w;e=z(p);e->a=v;e->b=h;e->c=c;s
t n for(;;)o,main(-h),b=i;}main(b){p*a;if(b>0)a=z(p),h=w,a->c=z(p),a->c->a=u,a->c->b=2,t(0,a);putchar(b?main(b/2),-b%2+'0':10);}

________________________________________________________________________

2. The worst abuse of the C preprocessor:
(submitted by Col. G. L. Sicherman <decvax!sunybcs!colonel> )

#define C_C_(_)~' '&_
#define _C_C(_)('\b'b'\b'>=C_C>'\t'b'\n')
#define C_C _|_
#define b *
#define C /b/
#define V _C_C(
main(C,V)
char **V;
/*	C program. (If you don't
 *	understand it look it
 */	up.) (In the C Manual)
{
	char _,__; 
	while (read(0,&__,1) & write((_=(_=C_C_(__),C)),
	_C_,1)) _=C-V+subr(&V);
}
subr(C)
char *C;
{
	C="Lint says "argument Manual isn't used."  What's that
	mean?"; while (write((read(C_C('"'-'/*"'/*"*/))?__:__-_+
	'\b'b'\b'|((_-52)%('\b'b'\b'+C_C_('\t'b'\n'))+1),1),&_,1));
}

[ This program confused the C preprocessor so badly that it left some
comments in the preprocessed version. Also, lint DID complain that
"argument Manual isn't used". ]

________________________________________________________________________

3. The strangest appearing program:
(submitted by Ed Lycklama <decvax!cca!ima!ism780!ed> )

#define o define
#o ___o write
#o ooo (unsigned)
#o o_o_ 1
#o _o_ char
#o _oo goto
#o _oo_ read
#o o_o for
#o o_ main
#o o__ if
#o oo_ 0
#o _o(_,__,___)(void)___o(_,__,ooo(___))
#o __o (o_o_<<((o_o_<<(o_o_<<o_o_))+(o_o_<<o_o_)))+(o_o_<<(o_o_<<(o_o_<<o_o_)))
o_(){_o_ _=oo_,__,___,____[__o];_oo ______;_____:___=__o-o_o_; _______:
_o(o_o_,____,__=(_-o_o_<___?_-o_o_:___));o_o(;__;_o(o_o_,"\b",o_o_),__--);
_o(o_o_," ",o_o_);o__(--___)_oo _______;_o(o_o_,"\n",o_o_);______:o__(_=_oo_(
oo_,____,__o))_oo _____;}

[it looks like tty noise]

________________________________________________________________________

4. The best "small" program:
(submitted by Jack Applin [with help from Robert Heckendorn]
<hplabs!hp-dcd!jack> )

main(v,c)char**c;{for(v[c++]="Hello, world!\n)";(!!c)[*c]&&(v--||--c&&execlp(*c,*c,c[!!c]+!!c,!c));**c=!c)write(!!*c,*c,!!**c);}

________________________________________________________________________

5. The grand prize (most well-rounded in confusion):
(submitted by Carl Shapiro <sdcrdcf!otto!carl> )

#define P(X)j=write(1,X,1)
#define C 39
int M[5000]={2},*u=M,N[5000],R=22,a[4],l[]={0,-1,C-1,-1},m[]={1,-C,-1,C},*b=N,
*d=N,c,e,f,g,i,j,k,s;main(){for(M[i=C*R-1]=24;f|d>=b;){c=M[g=i];i=e;for(s=f=0;
s<4;s++)if((k=m[s]+g)>=0&&k<C*R&&l[s]!=k%C&&(!M[k]||!j&&c>=16!=M[k]>=16))
a[f++]=s;if(f){f=M[e=m[s=a[rand()/(1+2147483647/f)]]+g];j=j<f?f:j;f+=c&-16*!j;
M[g]=c|1<<s;M[*d++=e]=f|1<<(s+2)%4;}else e=d>b++?b[-1]:e;}P(" ");for(s=C;--s;
P("_"))P(" ");for(;P("\n"),R--;P("|"))for(e=C;e--;P("_ "+(*u++/8)%2))
P("| "+(*u/4)%2);}

[As submitted, this program was 3 lines (2 of defines and 1 of code).
To make news/mail/etc. happy we split the last line into 7. Join them
back without the newlines to get the original version]

 ----------------------------------------------------------------------

Congratulations to the winners (and anyone else who wasted their time
creating such weird programs).

For your own enjoyment, you can figure out what these programs do. Or 
if this is too hard, compile and run them! All of these compiled and
ran on the vax. Lint was even happy with most of the entries.

The winning programs may be published in a programming magazine.
The columnist will post further details to the net, if there are any.

Next year's contest will be held somewhat earlier in the year so
that the winners will be announced to the summer Usenix's Usenet BOF.

[We will not post or mail the non-winning programs.]
[Although they are more interesting and constructive than most of net.flame]

[It is sad to say that there are programs which are part of UNIX systems
(sh,finger,config,etc.) which are not much easier to understand than
these award winners - and much longer] :-)

From the obfuscated keyboards of:

chongo <char *grepal="pheep";main(){printf("%s? %s\n",grepal,"grepal");}>
								/\??/\
27-Jun-85 09:58:10-PDT,1554;000000000001
Mail-From: KRONJ created at 27-Jun-85 09:58:06
Date: Thursday, 27 June 1985  02:42-PDT
Message-ID: <[email protected]>
Sender: Ken Harrenstien <[email protected]>
From: Ken Harrenstien <[email protected]>
To: [email protected]
cc: [email protected]
Subject:   Fixes to KCC, and CLIB ready (?)
ReSent-From: [email protected]
ReSent-To: bug-c-archive
ReSent-Date: Thu 27 Jun 1985 09:58-PDT

I found a couple of minor things that needed fixing in KCC; the files CC.C
and CCOUT.C have new versions.  KCC was running out of FILE ptrs since it
was forgetting to fclose the header file and possibly the main input file,
although I'm not certain about the latter.  Also, I fixed floating-point
constant output to be compatible with both FAIL and MACRO.  The $$ONE
fix I have already mentioned (also in CCOUT).  SS:<C.KCC.CC> as usual.

With those and CONDCC I was able to compile CLIB with MACRO.  I think you
should grab a copy at this point, by FTP'ing version .0 of everything in
SS:<C.KCC.LIB>.  If at all possible I think you should put the new stuff
in a new directory so as not to conflict with the existing stuff in KCC.LIB;
too much has changed.  Keep the old directory around in case there are
any problems, and so code can be SRCCOM'd if any questions arise.

There are some notes in CLIB.DOC which describe the various files.

Since the configuration params are identical for NIC and SIERRA, a
complete recompile isn't really necessary; it should work to just copy
CLIB.REL and C-HDR.FAI to C:.
27-Jun-85 16:36:22-PDT,6959;000000000001
Mail-From: KRONJ created at 27-Jun-85 16:36:20
Date: Thu 27 Jun 85 16:36:20-PDT
From: David Eppstein <[email protected]>
Subject: status
To: [email protected]

I have taken your sources both to the KCC changes and to the runtimes.
Some comments:

- I didn't take the change in CC.C to close in -- instead I fixed CCINP.C
  to do that.  I had made some changes to CC.C so I had to do a merge
  anyway, and so I put it where I thought it fit better.

- I have changed printf() to use _doprnt() as in the UNIX versions.
  This means there is no longer a limit on how long a printf can be.

- When recompiling PRINTF.C using CLIB.MIC I discovered that it produces
  an EXTERN declaration for _flout() [declared implicitly by the calls
  in what is now _doprnt()].  I fixed this by declaring it static() and
  by fixing KCC to notice such a declaration.

- Bill Palmer had made an edit to time.c that got lost.  It turns out his
  edit was incorrect, and has been replaced by instead fixing the KCC bug
  that made it necessary.  I am still in the process of recompiling KCC
  now that the bug has been fixed (Sierra is rather loaded).

- The %PURE etc macros in C-HDR seem excessively complicated to me.
  KCC emitted code is guaranteed to alternate between the two, and
  entry statements will always be emitted before either of them.
  Therefore unless you have strong objections I will redo them the old way.

- Defining P in C-HDR is a very bad idea.  I plan to either rename it to
  something innocuous like $SP or flush the definition altogether and go
  back to using 17.

- I haven't put this up on Sierra yet.  I plan to do so once I have made the
  above changes and tested a KCC loaded using the resulting CLIB and C-HDR.
  Load and downtime permitting this should happen sometime tomorrow.

- KA format doubles will not work because KCC emits the sequence for $DFLOT
  in line rather than calling it like $DFIX and friends.  Similarly G format
  will not work (there seems to be less support in the runtimes for the
  latter, also).
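
The printf() layering mentioned in the list above can be sketched as
follows.  The standard vfprintf() stands in for _doprnt(), whose real
interface is not shown in this thread; doprnt_sketch and my_printf are
hypothetical names.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Stand-in for _doprnt(): formats straight to the stream, so no
 * intermediate fixed-size buffer limits the output length. */
static int doprnt_sketch(FILE *fp, const char *fmt, va_list ap)
{
    assert(fp != NULL && fmt != NULL);
    return vfprintf(fp, fmt, ap);
}

/* printf() reduced to a thin variadic wrapper around the formatter. */
int my_printf(const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = doprnt_sketch(stdout, fmt, ap);
    va_end(ap);
    return n;
}
```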

In general the runtimes seem to be much better organized than before.
The idea of using assembler in C files rather than FAI files works
better than I had thought it might.  I hope there aren't too many new bugs...


Date: Friday, 28 June 1985  14:48-PDT
From: Ken Harrenstien <[email protected]>

First, some bad news.  One of our RP07's is down, and SS: will be
unavailable for the next day (or 2 or 3?)  The good news, I perceive,
is that you were able to snarf everything before then, thus outwitting
the gremlins!

Since I now have nothing better to do, some comments on your comments:

- I didn't take the change in CC.C to close in -- instead I fixed CCINP.C
  to do that.  I had made some changes to CC.C so I had to do a merge
  anyway, and so I put it where I thought it fit better.

[Fine, I was just nailing up everything that looked like a mouse hole]

- I have changed printf() to use _doprnt() as in the UNIX versions.
  This means there is no longer a limit on how long a printf can be.

[Actually, the UNIX version does have a limit.  It isn't very big either.
So portable code unfortunately will have to avoid huge printf strings anyway.]

- When recompiling PRINTF.C using CLIB.MIC I discovered that it produces
  an EXTERN declaration for _flout() [declared implicitly by the calls
  in what is now _doprnt()].  I fixed this by declaring it static() and
  by fixing KCC to notice such a declaration.

[If you read CLIB.DOC you'll notice this problem is mentioned.  Hmm, so
your solution is to allow a "static _flout();" statement?  That seems like
a good hack -- I couldn't think of anything obvious myself -- but we should
remember someplace that this needs to be changed back if/when #asm inside
a function body becomes possible, because at that point it will be an
error to declare a static function which isn't actually in the module.
You will get a lot of such "errors" when you compile the rest of CLIB; you can
either fix them or ignore them.  By the way, if you use -m you can also
avoid the silly "FAIL BUG IN SEARCH" messages.]

- The %PURE etc macros in C-HDR seem excessively complicated to me.
  KCC emitted code is guaranteed to alternate between the two, and
  entry statements will always be emitted before either of them.
  Therefore unless you have strong objections I will redo them the old way.

[At the time "entry" was fixed, much of the complication became
unnecessary.  I plan to simplify it further, but I would still very
much like to retain the feature of immunity to a double switch.  Costs
nothing and ensures peace of mind.]

- Defining P in C-HDR is a very bad idea.  I plan to either rename it to
  something innocuous like $SP or flush the definition altogether and go
  back to using 17.

[Right -- that one slipped past.  It has not caused a problem so far,
but obviously could.  Some #asm code will need changing.  As I mention
in CLIB.DOC I used a convention of $ for external locations, and % for
macros or symbolic values; to follow this I would suggest %P.
Probably the #asm code can be fixed most easily by just inserting a
P==%P in each module that needs it. ]

- KA format doubles will not work because KCC emits the sequence for $DFLOT
  in line rather than calling it like $DFIX and friends.  Similarly G format
  will not work (there seems to be less support in the runtimes for the
  latter, also).

[Hmm I didn't realize it was now inline.  Would it work for KCC to
emit a %DFLOT macro?  I can't get at the source just now and don't
remember how complicated the code is.  I'm not sure what you mean
about lack of G format support; do you mean I should have added macros
to replace DFAD/DFSB/DFMP/DFDV?  That's easy enough, but it needs to
be decided whether to have the variant floating point stuff done by
C-HDR or by KCC; if by C-HDR, then KCC should emit macros, since
otherwise C-HDR cannot always use the best instruction sequence (note
the single instruction frobs for G float and fix!); if by KCC then KCC
needs a format switch and some CCOUT beefup.]

In general the runtimes seem to be much better organized than before.
The idea of using assembler in C files rather than FAI files works
better than I had thought it might.  I hope there aren't too many new bugs...

[Thanks.  One thing I was trying to do with the reorganization was to 
put in CRT.C everything that KCC "C" needed to run, as a language, regardless
of what libraries or system calls the language might be used with.  URT.C
by contrast contains the runtime stuff necessary to provide a UNIX-style
environment... which may not always be wanted.  Not that I expect anyone
to re-write TOPS-20 in C, but who knows.
	I have been using the new stuff when building each new version of
KCC the past couple weeks, so it shouldn't be too buggy...]
27-Jun-85 20:48:27-PDT,345;000000000001
Mail-From: KRONJ created at 27-Jun-85 20:48:26
Date: Thu 27 Jun 85 20:48:26-PDT
From: David Eppstein <[email protected]>
Subject: your double and adjbp simulations
To: [email protected]

These need to be more careful to not disturb any registers other than
the ones expected to be changed by the operation.  I am fixing them.
-------
29-Jun-85 14:47:04-PDT,861;000000000001
Date: Thursday, 27 June 1985  23:09-PDT
From: Bill Palmer <[email protected]>
To: [email protected]
Subject: kcc bug

Compiling <whp4.plot>crtplo.c causes kcc to output some fail code that has
multiple instructions on the same line in the line: routine.  Could this
maybe be the fault of some of KLH's funky macros?

					Bill

Date: Thursday, 27 June 1985  23:24-PDT
From: Bill Palmer <[email protected]>

Turns out it was your bug after all.  In CCOUT.C you dropped a \n in one
of the statements in the DFIX R,R code - I put it back for you but didn't
make a new KCC.  It also generates a spurious \n before the code, but
that could be regarded as a paranoia feature, I guess, so I left it in.

					Bill

Date: 29 Jun 1985  17:23 PDT (Sat)
From: David Eppstein <[email protected]>

I've put up a CC.EXE with your DFIX fix.
29-Jun-85 14:53:48-PDT,662;000000000001
Mail-From: KRONJ created at 29-Jun-85 14:53:46
Date: Friday, 28 June 1985  21:26-PDT
Message-ID: <[email protected]>
Sender: Bill Palmer <[email protected]>
From: Bill Palmer <[email protected]>
To: [email protected]
Subject:   kcc spirited away
ReSent-From: [email protected]
ReSent-To: bug-c-archive
ReSent-Date: Sat 29 Jun 1985 14:53-PDT

I moved copies of <kcc.c>, <kcc.lib>, <kcc.cc>, <kcc.lib.old>, <kcc.c.sys>,
<kcc.c.local> and maybe something else to csli in <whp4.kcc.*>.  You don't
seem to have an account there, so I'll make sure they are publicly
readable in case you need to swipe them.

					Bill
29-Jun-85 14:55:56-PDT,305;000000000001
Mail-From: KRONJ created at 29-Jun-85 14:55:54
Date: Sat 29 Jun 85 14:55:54-PDT
From: David Eppstein <[email protected]>
Subject: Runtimes
To: [email protected]

Sierra was down all Friday so I didn't have a chance to do anything about
finishing installing the runtimes.  Maybe Monday.
-------
29-Jun-85 17:14:42-PDT,754;000000000001
Mail-From: KRONJ created at 29-Jun-85 17:14:40
Date: Saturday, 29 June 1985  17:13-PDT
Message-ID: <[email protected]>
Sender: Kirk Lougheed <[email protected]>
From: Kirk Lougheed <[email protected]>
To: [email protected]
cc: [email protected]
Subject:   Sierra
ReSent-From: [email protected]
ReSent-To: bug-c-archive
ReSent-Date: Sat 29 Jun 1985 17:14-PDT

Hello Rob -
	The Sierra dialup number is 321-0211 (ten lines in rotary, 1200+300
Racal Vadics).  I've also put your new mailbox (gingell!sun@Glacier) in the
BUG-KCC mailing list.

David Eppstein (KRONJ@Sierra) is the one banging away at the compiler these
days.  He can help you with access to KCC and similar matters.

Kirk
29-Jun-85 18:58:43-PDT,429;000000000001
Mail-From: KRONJ created at 29-Jun-85 18:58:42
Date: Sat 29 Jun 85 18:58:42-PDT
From: David Eppstein <[email protected]>
Subject: terminal message
To: [email protected]

WHP4, TTY2, 29-Jun-85 6:23PM
maybe it might not be a bad idea to set up a nightly batch job to update
kcc sources elsewhere? given that other machines have copies of the sources,
they might as well be reasonably up to date...
-------
 1-Jul-85 15:18:42-PDT,758;000000000001
Return-Path: <[email protected]>
Received: from SU-SCORE.ARPA by SU-SIERRA.ARPA with TCP; Mon 1 Jul 85 15:15:50-PDT
Date: Mon 1 Jul 85 15:04:30-PDT
From: Andrew "Droid" Gideon <[email protected]>
Subject: Question on returned value in KCC
To: [email protected]
Office-Phone: (415) 497-4816

Hi again.

This time, a simple question.  What is the returned value of
the function "sizeof()" in KCC?

A CHAR returns 1, INT returns 4, and DOUBLE returns 8.  Is this
the number of 9 bit bytes?  

		Thanks,
		  Andy

Date: Monday, 1 July 1985  15:18-PDT
From: David Eppstein <Kronj at SU-SIERRA.ARPA>
To:   Andrew "Droid" Gideon <GIDEON at SU-SCORE.ARPA>
Re:   Question on returned value in KCC

Yes, it's the number of 9-bit bytes.
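
A short sketch of the arithmetic, assuming the standard <limits.h>
header (the thread does not say whether KCC provided one at the time).
sizeof is an operator that counts bytes, and CHAR_BIT gives the width of
one byte; with the 9-bit bytes described above, an int of size 4 fills
one 36-bit word.  bits_in is a hypothetical helper.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Convert a size in bytes (as reported by sizeof) to a size in bits.
 * CHAR_BIT would be 9 under the KCC convention described above; on
 * most current machines it is 8. */
size_t bits_in(size_t nbytes)
{
    assert(nbytes > 0);
    return nbytes * CHAR_BIT;
}
```

For example, bits_in(sizeof(int)) gives the width of an int on whatever
machine compiles it.
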
 1-Jul-85 15:36:27-PDT,622;000000000001
Mail-From: KRONJ created at  1-Jul-85 15:36:08
Date: Mon 1 Jul 85 15:36:08-PDT
From: David Eppstein <[email protected]>
Subject: better extended addressing test
To: [email protected]

Occasionally in the runtimes I've been seeing
	XHLLI reg,.
	JUMPN reg,<extended addressing>
sometimes not being careful to make sure that the reg starts non-zero.
XHLLI doesn't change the right half of the register.
A much better test is XMOVEI reg,0.


Date: Tuesday, 2 July 1985  14:13-PDT
From: Ken Harrenstien <[email protected]>

I'm not sure what I was thinking of when I used XHLLI... must have been
real late.
 1-Jul-85 18:12:57-PDT,2977;000000000001
Mail-From: KRONJ created at  1-Jul-85 18:12:55
Date: Mon 1 Jul 85 18:12:55-PDT
From: David Eppstein <[email protected]>
Subject: new runtimes working, installed on Sierra
To: [email protected]

After a little trouble with brk() and sbrk() involving the XHLLI I mentioned
and a rewrite of sbrk() in C that I didn't get right the first time, I've
gotten your runtimes to work well enough to make the compiler compile itself
correctly using them, and so I have put them up in C: on Sierra.  You should
probably take back a copy for yourself.

- The only remaining "E" errors from MACRO are in CRT.C - the EXTERN
  declarations were in C-HDR.FAI so I couldn't get rid of them with static.

- I've flushed the definitions of P and XHLLI.  The latter was reasonable
  to have but it wasn't used after I got through.

- %IMPURE and %PURE are back to being $$DATA and $$CODE which both expand
  to RELOC.  I like the idea of your convention for macros vs symbols but
  I didn't feel like changing KCC again for it.

- WHP4 wrote two new modules, GETTIM and RENAME.

- I implemented vfork().  After some reflection I decided that it was necessary
  for the superior to wait for the inferior exec() as the UNIX man describes.
  To do this I needed a flag which would cause the inferior to HALTF when it
  does the exec(), at which point the superior returns from a WFORK and does
  a continue, then returns knowing that it is finally safe to do things like
  storing the returned pid in a variable without the inferior being able to
  write on top of it.  So anyway since both exec() and vfork() need to see this
  flag, I merged EXEC.C and FORK.C.  It's all less complicated than it sounds,
  and it saves 10 cpu seconds for every command line pipe setup.

- sbrk() now only takes the number of bytes wanted rather than rounding up
  to words.  It's written in C, and is no longer quite so machine dependent.

- Somewhere in the ITS code (I think) there was a call to calloc() after
  which the code expected that the pushed arguments had not been munged.
  This is in general a bad assumption.  I didn't do anything about it though,
  beyond putting up a large block comment explaining why this is bad.

- malloc() now checks the return value from sbrk().

- printf() hex output now works; printf now goes through _doprnt which uses
  file descriptors rather than string buffers for output.

- exec() does a BLT to set up the registers rather than doing DMOVEs or
  (for your TENEX code) MOVEs.  I used AC0 which is probably not a good
  idea but all the others were taken except for AC17 which is even worse.
  Someday pfork() should go through there rather than having its own
  copy of the chain-to-program code.

The stuff before you hacked it is still in <KCC.LIB.OLD>, and will probably
stay there a while.  I am unlikely to be doing any major runtime hacking
soon so if you have things you want done, go ahead.
-------
 3-Jul-85 15:50:55-PDT,889;000000000001
Received: from LOTS-A by Sierra with Pup; Wed 3 Jul 85 14:47:47-PDT
Date: Wed 3 Jul 85 14:47:30-PDT
From: Andrew "Droid" Gideon <G.GIDEON@LOTS-A>
Subject: YAKQ - Yet another KCC question
To: kronj@Sierra
Office-Phone: (415) 497-4816

Hi again.

What does the error "Register allocation error: unreleased registers left -
	over from previous code." mean?

It seems to be occurring at the start of procedure declarations, and
I do not know what to look for so far as errors are concerned.

		Andy

Date: Wednesday, 3 July 1985  15:50-PDT
From: David Eppstein <Kronj at SU-SIERRA.ARPA>
To:   Andrew "Droid" Gideon <G.GIDEON at LOTS-A>
Re:   YAKQ - Yet another KCC question

It means you have a broken version of KCC.  If it still happens with
the latest version from Sierra (you will also need to copy the files
in C:) please point me at the source that it barfs on.
 3-Jul-85 22:42:34-PDT,683;000000000001
Return-Path: <[email protected]>
Received: from SU-SCORE.ARPA by SU-SIERRA.ARPA with TCP; Wed 3 Jul 85 22:30:06-PDT
Date: Wed 3 Jul 85 22:29:51-PDT
From: Len Bosack <[email protected]>
Subject: as68, ld68
To: [email protected]

Look in [score]<bosack.d.as> and [score]<bosack.d.ld>

You will likely need some more header files. If you can figure out which
ones, I can probably find them for you.

Len


Date: Wed 3 Jul 85 23:55:12-PDT
From: David Eppstein <[email protected]>

Ok, I will probably pick these up Friday or Monday.
It would also be useful to get a copy of c2 (the peepholer)
and maybe cc (the mother program that puts it all together).
 5-Jul-85 00:00:26-PDT,705;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Thu 4 Jul 85 23:55:47-PDT
Return-Path: <[email protected]>
Received: from SIMTEL20.ARPA by SRI-NIC.ARPA with TCP; Wed 3 Jul 85 23:50:43-PDT
Date: Thu, 4 Jul 1985  00:50 MDT
Message-ID: <[email protected]>
From: "Frank J. Wancho" <[email protected]>
To:   KLH@NIC
cc:   [email protected]
Subject: KCC Bug Report: Underscores
ReSent-Date: Thu 4 Jul 85 23:56:00-PDT
ReSent-From: Ken Harrenstien <[email protected]>
ReSent-To: [email protected]

Ken,

KCC ignores underscores (and anything to the left of the underscore)
in filenames supplied for compilation.

--Frank
 5-Jul-85 12:31:00-PDT,764;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Fri 5 Jul 85 12:27:01-PDT
Date: Fri 5 Jul 85 12:26:59-PDT
From: David Roode <[email protected]>
Subject: program name for C compiler
To: [email protected], [email protected], [email protected]
Location:  EJ286    Phone: (415) 859-2774

Why do the error messages and so forth indicate it should be called
KCC.EXE whereas it is really called CC.EXE.  The KCC name would be
superior since a bounced character on a C for Continue command
invokes CC, and this is the opposite of what the user
desires when he is trying to continue his fork!!!!!
No one need ever run the compiler directly anyway, since
it is accessible via the LOAD-class commands.
-------
 5-Jul-85 13:02:15-PDT,447;000000000001
Mail-From: KRONJ created at  5-Jul-85 13:02:13
Date: Fri 5 Jul 85 13:02:13-PDT
From: David Eppstein <[email protected]>
Subject: hock
To: [email protected]

my latest versions of KCC complain about uses of structure members
from the wrong structure in cmpasup().  Is this my error or yours?


Date: Fri 5 Jul 85 14:47:46-PDT
From: Ken Harrenstien <[email protected]>

Mine, should be "struct supvsr".  Guess KCC is getting smarter.
 6-Jul-85 09:07:04-PDT,10362;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Sat 6 Jul 85 06:18:23-PDT
Date: Sat 6 Jul 85 06:18:34-PDT
From: Ken Harrenstien <[email protected]>
Subject: PFORK Chain bug  [and later: system() -- DE]
To: [email protected]
cc: [email protected]

I don't feel confident that I know what is going on in pfork (or more
accurately, why) so maybe you should handle this.
                ---------------

Date: Fri, 5 Jul 1985  21:21 MDT
From: "Frank J. Wancho" <[email protected]>
To:   KLH@NIC
Re:   PFORK Chain bug

Ken,

PFORK with the chain option will enter the program at the entry vector
+ 1 (and ignore the start_offset argument).  If anything, it should
have been +0...

--Frank


Date: Sat 6 Jul 85 09:16:16-PDT
From: David Eppstein <[email protected]>

Ok, I've fixed the runtimes to look at the offset argument even when chaining.


Date: Sat, 6 Jul 1985  10:38 MDT
From: "Frank J. Wancho" <[email protected]>

Dave,

Would you put my address on your BUG-KCC list so I can track progress
and have an idea of when to grab newer versions of things?  Also, you
might be interested in a related runtime I've been working on for the
past several days, and now have working: system().  It bypasses the
kludge of a wasteful and slow fork()/execv() sequence.  It does not
currently handle the case of feeding it EXEC built-ins because EXEC
appears not to do a RSCAN and I don't know how to use pipe() to talk
to it (yet).

--Frank


Date: 6 Jul 1985  12:35 PDT (Sat)
From: David Eppstein <[email protected]>

Ok, you're now on Bug-KCC.  I'd be interested in any runtimes you care to
write.  We now have vfork() so running programs isn't quite so bad, but
system() would be useful in any case.

Too bad you can't do "EXEC -c commands" the way you can with the shell...
pipe() is pretty easy but you need monitor mods to do it.  I intend
one day to make them work with PTYs if the PIP: device doesn't work, but
it hasn't yet been done.


Date: Sat, 6 Jul 1985  15:18 MDT
From: "Frank J. Wancho" <[email protected]>
Subject: system()

Thanks for putting me on your list.

fork() does more work than it has to, i.e., it maps ALL sections, even
if they are unused, and that takes a lot of time.  I'm not sure about
vfork().  Certainly system() is the better way to go for many cases.

Right now I have system() imbedded in a test program and it may need
to have TNX conditionals reinstalled by an expert.  It depends on
RSCAN% so the TNX conditional is "easy".  Perhaps the "right" thing to
do is merge this code into FORK.C as it uses a variant of that code
and PFORK.C.  The current version without the experimental EXEC code
is in <WANCHO>TESTIT.C and is subject to change if I ever can get the
EXEC part to work.

As for passing commands to EXEC, I guess what we really need is a
lobotomized version of EXEC that uses RSCAN% (or a third entry point
in the real EXEC that does an RSCAN%).

We do have the PIP: device in our MONITR; I just never quite
understood how to use it.  I have examined URT.C, but even with the
comments, I'm not sure of my ground.  It may be a case of just being
too easy to use that I'm overlooking the obvious.

--Frank


Date: Saturday, 6 July 1985  14:23-PDT
From: David Eppstein <Kronj at SU-SIERRA.ARPA>
Re:   vfork()

...is like fork() except that the map is set by using CR%MAP in the CFORK%
(i.e. impure sections are mapped the same rather than copies of each other),
and that the superior waits for the new fork to do an exec() before
continuing (to avoid races).  So it runs much faster.
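
On a UNIX system, the calling pattern this vfork() is meant to support
looks roughly like the sketch below.  It is only an illustration:
spawn_and_wait is a hypothetical helper name, and none of the TOPS-20
details (CR%MAP, HALTF, WFORK) appear in it.  The point is that the child
does nothing but exec (or _exit), and the parent does not run again until
that has happened.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `prog` (looked up on PATH) with no arguments and return its exit
 * status, or -1 if the child could not be created or did not exit
 * normally.  Hypothetical helper, not part of the KCC runtimes. */
int spawn_and_wait(const char *prog)
{
    int status;
    pid_t pid;

    assert(prog != NULL);
    pid = vfork();
    if (pid < 0)
        return -1;                  /* could not create the child */
    if (pid == 0) {
        /* Child: it borrows the parent's memory until the exec, so it
         * must do nothing here except exec or _exit. */
        execlp(prog, prog, (char *)0);
        _exit(127);                 /* only reached if the exec failed */
    }
    /* Parent: resumes only after the child has exec'ed or exited, so it
     * is now safe to store and use the returned pid. */
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```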

I thought that fork() was pretty efficient about copying empty
sections, and that all that time was going into copying the pages that
were really there.  But I don't really remember.

Why do you need to use pfork() instead of fork() for this, anyway?


Date: Sat, 6 Jul 1985  15:31 MDT
From: "Frank J. Wancho" <[email protected]>
Subject: pfork() vs fork()

I don't.  I just stole code from both.

--Frank


Date: Tue, 9 Jul 1985  20:17 MDT
From: "Frank J. Wancho" <[email protected]>
Subject: system()

Dave,

Have you had a chance to look at my system()?  Any questions?

I've also been wondering about the possibilities of implementing
popen() and pclose() without the necessity of doing a v/fork() first,
and piping the I/O directly to the inferior fork.  This would get us
the remaining feature for system(): being able to "talk" to an EXEC!

--Frank


Date: Tuesday, 9 July 1985  20:36-PDT
From: David Eppstein <Kronj at SU-SIERRA.ARPA>

No, I don't recall you telling me where you are keeping your sources.
So I haven't been able to pick up your system().

I'm not sure I understand your second question.  What are popen() and
pclose()?  Maybe you mean pipe()?  If you need to understand the
calling conventions of pipe() you can look it up in a UNIX manual...
In any case you would also need to do a [v]fork() and exec() to get
your inferior EXEC.  It should be possible to use a pipe to send input
to the EXEC, and if you want (although I don't think this is necessary
for system()) to use another one to pick up the results.  This may or
may not work if the EXEC command wants to run a program or do I/O
redirection in a PCL command, though.


Date: Tue, 9 Jul 1985  22:11 MDT
From: "Frank J. Wancho" <[email protected]>

Dave,

I *thought* I sent you a message pointing to my test file which
includes system().  It is here in <WANCHO>TESTIT.C.

FILE *popen(command, type) 
char *command, *type;

int pclose(stream)
FILE *stream;

	The arguments to popen are pointers to null-terminated strings
	containing, respectively, a shell command line and an I/O
	mode, either r for reading or w for writing.  Popen creates a
	pipe between the calling process and the command to be
	executed.  The value returned is a stream pointer that can be
	used (as appropriate) to write to the standard input of the
	command or read from its standard output.

	A stream opened by popen should be closed by pclose, which
	waits for the associated process to terminate and returns the
	exit status of the command.

	Because open files are shared, a type r command may be used as
	an input filter, and a type w as an output filter.

There's more, but I think you get the idea.  I do not understand the
need to (perhaps ever) use a [v]fork/exec combination, when the
internal routine, _xfork(), of system() will do, i.e., get the program
loaded directly in an inferior fork, with no intervening mapping of
the current program (an unnecessary duplication, except when spawning a
duplicate for a good reason) and an execX().  And even if you do need
an EXEC, you should be able to get one directly (with _xfork()), and
(somehow) use pipe() and dup2() to pass it the given command.  EXEC
wants to read the console, and I don't see why you can't fake it with
a dup2().
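
For reference, the popen()/pclose() interface quoted above is used on a
UNIX system roughly as in the following sketch.  It assumes a POSIX
environment with a shell; first_line_of is a hypothetical helper name and
says nothing about how a TOPS-20 version would be built on _xfork().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>

/* Run `command` through the shell, copy the first line of its standard
 * output into buf (at most size-1 characters), and return the status
 * reported by pclose(), or -1 if the pipe could not be opened.
 * Hypothetical helper for illustration only. */
int first_line_of(const char *command, char *buf, int size)
{
    FILE *p;
    int c;

    assert(command != NULL && buf != NULL && size > 0);
    buf[0] = '\0';
    p = popen(command, "r");        /* type "r": read the command's output */
    if (p == NULL)
        return -1;
    if (fgets(buf, size, p) == NULL)
        buf[0] = '\0';              /* command produced no output */
    while ((c = getc(p)) != EOF)
        ;                           /* drain the rest so the command can finish */
    return pclose(p);               /* waits for the command to terminate */
}
```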

--Frank


Date: Wed 10 Jul 85 10:28:40-PDT
From: David Eppstein <[email protected]>

Looks reasonable.  I'm not convinced the overhead this way is much less
than that of a vfork()/execlp() -- vfork() copies the map with CR%MAP in
the CFORK% so it's pretty efficient -- but the only reason I can think
of for not doing it your way would be to keep as little as possible
in assembly language routines.  One minor nit: it would be better to use
SFRKV% than GEVEC% followed by SFORK%.  Not only does this save a JSYS,
but it also has a better chance of working if the program wants to start
in extended addressing.

I forget -- is system(NULL) supposed to push to a shell?  I would guess
that from your recent comments you're working on doing EXEC commands,
and maybe popen()/pclose().  If you get those working well, I can flush
the vfork()/execlp() that handle piping in the startup code...


Date: Wed, 10 Jul 1985  12:15 MDT
From: "Frank J. Wancho" <[email protected]>

Take a copy of my sample driver and code the same thing using
vfork()/execlp() combination.  Run your version with something
relatively constant, say, FINGER KRONJ, and note the startup time.
Then run mine with the same command.  Note the difference.  It is
perhaps more noticeable and dramatic on this small 2040...

I'll change the code as you suggest to use SFRKV%.

I don't know about system(NULL).  The books don't say.  I suspect it
would do nothing and return.  I'll have to check on a real Unix
machine.  Because of the way I coded system(), it expects a PROGRAM to
be run.  My thoughts were to allow system("EXEC DIR") as a special
case, since there is no way for it to otherwise determine that
system("DIR") is really an EXEC command (without trying to open
SYS:DIR.EXE and then try as if "EXEC DIR" was given, which may not
necessarily be what was intended).
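
On the system(NULL) question: the behaviour later written into the C
standard is that system(NULL) runs nothing and returns nonzero only if a
command processor is available.  That comes from the standard, not from
anything settled in this thread.  A minimal sketch of a caller relying on
it (run_command is a hypothetical name):

```c
#include <assert.h>
#include <stdlib.h>

/* Run `command` through the command processor if there is one.
 * Returns -1 when system(NULL) reports that none is available,
 * otherwise whatever system(command) returns. */
int run_command(const char *command)
{
    assert(command != NULL);
    if (system(NULL) == 0)
        return -1;                  /* no command processor to hand it to */
    return system(command);
}
```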

The real problem here with these commands is a basic one between the
Unix concept of a process/fork and the TOPS-20 one.  In Unix, you
cannot start up a process without doing a v/fork() first.  Here we
don't have to.  In Unix, you can do any of the execx() functions, but
none of them return - they are run under the current environment.
That is why you need a forked environment "above" an execX().  The
closest Unix comes to what we have is the shell only command option, a
trailing ampersand.

Anyway, enough ranting and raving.  I'll see if I can figure out a
popen() and pclose(), unless someone else beats me to it.

BTW, I mentioned to KEN and forgot to mention to you: if you don't
have it or have never seen it, I have the sources for the so-called
MIT C compiler runtimes, originally from Alan Snyder at SUMEX and most
recently maintained by EBM@XX.  You are welcome to look at them for
ideas.  Let me know.

Lastly, I have the NMIMT C compiler online here, but not the sources,
which are on a tape I cannot read yet.  I had Ken try it already.  He
was not overly impressed.  It is, however, a strictly v7 compiler, and
a native one...  If you are interested in playing with it, let me know
also and I'll send you the pointer and basic instructions to use it
(and Ken's comments, which I've forwarded to the author, who is on
vacation, for his comments before I send them on to the group at
large).

--Frank
 8-Jul-85 12:58:04-PDT,2623;000000000001
Received: from LOTS-B by Sierra with Pup; Mon 8 Jul 85 12:10:32-PDT
Date: Mon 8 Jul 85 12:09:54-PDT
From: Andrew "Droid" Gideon <G.GIDEON@LOTS-B>
Subject: KCC (yet again!!)
To: kronj@Sierra
Office-Phone: (415) 497-4816

Hi again.

I copied all of KCC fresh from Sierra, and all the compilation woes
disappeared.  Well and good.

But after Linking and Saving, the run goes for a bit, and then yields
an "illegal instruction" error message.  The .EXE is a result of seven 
.C files and about ten .H files...no assembler.  So, is there anything 
that I could have done within a C program to cause an illegal instruction 
to be generated?

Here is a "photo" of the run.  What do you think?


				Andy


[PHOTO:  Recording initiated  Mon 8-Jul-85 12:02PM]

B!lndvif.EXE.8 
lndvif: Starting
Input File:mobius.dvi
Output File:mobius.ln0
lndvif: Read font log
lndvif: Print DVI file (?) 
Preamble:  TeX output 1985.06.24:1605
  ver=2  num=25400000  denom=473628672  mag=1100
?Illegal instruction 402405 at 223157
?Undefined operation code
B!pop

[PHOTO:  Recording terminated Mon 8-Jul-85 12:03PM]

Date: Monday, 8 July 1985  12:57-PDT
From: David Eppstein <Kronj at SU-SIERRA.ARPA>
To:   Andrew "Droid" Gideon <G.GIDEON at LOTS-B>
Re:   KCC (yet again!!)

The PC at which you got the error is in the data section, so you must
have erroneously jumped there.  I suppose you could be trying to call
an array as a function or something, but it's more likely that
something is being trashed in your stack.

You shouldn't have had to recompile CC; in fact, at some points it has
changed such that older versions would have been unable to compile it.
It's easier just to take SYS:CC.EXE from Sierra.  When you copied KCC,
did you also get the newer C:CLIB.REL and C-HDR.FAI?


Date: Mon 8 Jul 85 13:30:54-PDT
From: Andrew "VaxBuster" Gideon <[email protected]>

Well, I didn't recompile it.  I took all from Sierra.  Including the
CLIB & C-HDR files.

What could I be doing to trash the stack in the programs?

			Andy


Date: Mon 8 Jul 85 16:22:48-PDT
From: David Eppstein <[email protected]>

Well, you could be using a local array out of bounds.  Or you could have
declared the return value to a function to be a structure in one place and
a pointer to a structure in another place.  Or you could have not passed
enough arguments to a function.  Or you could have trashed AC17 somehow.
What I usually do in this sort of case is poke around at the corpse in DDT.
What we really need is a source-level debugger.  You are welcome to try
to write one...
 8-Jul-85 16:05:03-PDT,819;000000000001
Mail-From: KRONJ created at  8-Jul-85 16:04:38
Date: Mon 8 Jul 85 16:04:38-PDT
From: David Eppstein <[email protected]>
Subject: library bugs fixed
To: [email protected]

I've just fixed a couple of bugs in the runtimes:
- feof() wasn't working due to a confusion between stdio and read()
  over return values for errors vs. eof.
- malloc() was blowing away the end of the argument strings because
  sbrk() no longer bothers to word-align its arguments.  malloc()
  has been changed to make sure the start of any new block of memory
  is word aligned.
These fixes are for CLIB.REL since version 112, of 1-Jul.  If you
have an older CLIB.REL you probably don't need them.  However there
are a lot of other useful changes in that version so you might
as well pick up the new one anyway.
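
The word-alignment step in the malloc() fix amounts to rounding a byte
address up to a multiple of the number of bytes per word.  The sketch
below shows only that arithmetic; align_up is a hypothetical name, and
the figure of 4 bytes per word assumes KCC's 9-bit bytes in a 36-bit
word.  It is not the code that is in CLIB.

```c
#include <assert.h>
#include <stddef.h>

/* Round addr up to the next multiple of align (align need not be a
 * power of two).  Hypothetical helper for illustration. */
size_t align_up(size_t addr, size_t align)
{
    size_t rem;

    assert(align > 0);
    rem = addr % align;
    return rem == 0 ? addr : addr + (align - rem);
}
```

With 4 bytes per word, a break value of 13 handed back by a byte-granular
sbrk() would be rounded to 16 before being used as the start of a block.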
-------
 9-Jul-85 14:35:39-PDT,2365;000000000001
Mail-From: KRONJ created at  9-Jul-85 14:34:42
Date: 9 Jul 1985  14:34 PDT (Tue)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To: Andrew "Droid" Gideon <[email protected]>
Cc: [email protected], [email protected]
Subject: A question of buffers??
In-reply-to: Msg of 9 Jul 1985  14:19-PDT from Andrew "Droid" Gideon <GIDEON at SU-SIERRA.ARPA>

    Date: Tuesday, 9 July 1985  14:19-PDT
    From: Andrew "Droid" Gideon <GIDEON at SU-SIERRA.ARPA>

    I am using the fseek() call on a file that I have open in eight
    bit mode (mode="R").  I 'fseek' somewhere, and then read in a
    byte with getc().

    But the fseek only seems to work in two cases.  First, on the first
    'getc()' call, and second, at EOF.  Other times, the 'getc()' reads
    in the next byte in sequence, as if the fseek had not occurred.

    I looked at the code in lseek.c, fseek.c, and stdio.c.  In the first
    two, fseek seems to merely call the SFPTR jsys.  In the third, getc()
    appears to work from a buffer until the buffer is empty.  I could see
    nothing which refills the buffer after the SFPTR.  I am guessing that
    this is why the fseek() call works only in those two cases.

There was code in fseek() to do an fflush(), which clears out all
buffering.  However, Greg Satz had apparently disabled this for input files
[Greg: Why??].  I have turned it back on for all files.
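
The behaviour Andy expected can be checked with a small test along these
lines.  It is a sketch in standard C, assuming a stdio whose fseek()
discards any buffered input; byte_at is a hypothetical helper name, not
something in FSEEK.C.

```c
#include <assert.h>
#include <stdio.h>

/* Seek to absolute position pos in fp and return the byte found there,
 * or EOF if the seek fails or pos is at or past end of file.
 * If fseek() leaves stale data in the input buffer, this returns the
 * next sequential byte instead of the byte at pos. */
int byte_at(FILE *fp, long pos)
{
    assert(fp != NULL && pos >= 0);
    if (fseek(fp, pos, SEEK_SET) != 0)
        return EOF;
    return getc(fp);
}
```

Reading one byte first (so that the buffer is filled) and then calling
byte_at() for a different position is the case that was failing.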


Date: Tue 9 Jul 85 14:41:42-PDT
From: Andrew "Droid" Gideon <G.GIDEON@LOTS-A>
Subject: Hey...I actually solved that one!

After I do the fseek() call, I set the stream's pointer and counter
back to start, forcing the fillbuf() call (the buffer is now empty).
This works.

I got the idea from the code for rewind() in FSEEK.C.

Does this sound like a reasonable solution?

	Thanks...

		Andy

P.S.	Another thing I considered, but haven't tried, is replacing
	j = getc(fp);

	with...

	read(fp->_file,&j,1);

	Would this work?


Date: Tue 9 Jul 85 14:49:27-PDT
From: Greg Satz <[email protected]>

I barely remember why, but I do remember it screwing up a program
that was trying to read stdin, possibly from a pipe. I think the program
wasn't getting the correct amount of data towards the end of a stream.

I will see if I can find what I was working on then.
10-Jul-85 13:04:23-PDT,5075;000000000001
Mail-From: WHP4 created at 10-Jul-85 11:33:16
Date: Wed 10 Jul 85 11:33:16-PDT
From: Bill Palmer <[email protected]>
Subject: kcc broken
To: [email protected]

[PHOTO:  Recording initiated  Wed 10-Jul-85 11:29AM]

!ty foo.c
#include <sys/time.h>
main(){time();}
!remark
Type remark.  End with CTRL/Z.
yes, I know the program doesn't make much sense, but that shouldn't matter.
look what happens when I try loading it.
^Z
!load foo.c
LINK:	Loading
^C
!del foo.rel
 FOO.REL.2 [OK]
!load foo.c
KCC:	FOO
<WHP4.LOAD>FOO.FAI.1
FAIL:  FOO
?Illegal instruction 0 at 0 (PC = 0)
?Undefined operation code
!del foo.rel
 FOO.REL.3 [OK]
!cc foo
KCC:	foo
<WHP4.LOAD>FOO.FAI.1
FAIL:  foo
?Illegal instruction 0 at 0 (PC = 0)
?Undefined operation code
!remARK (MODE) 
Type remark.  End with CTRL/Z.
looks like someone broke the stuff that chains to link.  I'll take a look at
it, but if you think you know what is wrong....
^Z
!pop

[PHOTO:  Recording terminated Wed 10-Jul-85 11:32AM]


Date: Wednesday, 10 July 1985  13:04-PDT
From: David Eppstein <Kronj at SU-SIERRA.ARPA>

I changed pfork().  The purpose of this was to use the offset argument
even when chaining.  I am not sure how this could have broken FAIL,
since KCC runs FAIL without chaining.  I didn't think I broke it but I
guess I might have been wrong.  Feel free to fix it.


Date: Wed 10 Jul 85 13:11:39-PDT
From: Bill Palmer <[email protected]>

No, it's not FAIL that is breaking, it is LINK.

KLH "fixed" pfork to use SFRKV when chaining.  This doesn't work, probably
because of a monitor bug.  My version did a GEVEC and then did a JRST 1(2).
I fixed it up to assemble the proper JRST instruction with whatever offset
was passed along and it works again.  The fixed copy of PFORK.C is in my
directory.  Was going to send a followup msg, but my stomach interrupted me
for lunch.


Date: Wednesday, 10 July 1985  13:14-PDT
From: David Eppstein <Kronj at SU-SIERRA.ARPA>

I think that was my "fix" rather than KLH's.  I thought SFORK would
break for extended addressing programs (and there isn't room in the
ACs to do an XSFRK and fall back to SFORK).  Why doesn't it work?


Date: Wed 10 Jul 85 13:18:35-PDT
From: David Eppstein <[email protected]>
Re:   more pfork

oops, sfork is only for when you're starting an inferior fork.  that happens
elsewhere.  anyway, i have put your pfork back in <KCC.LIB>.  I also
edited it to save an instr (JRST x(2) rather than ADDI 2,x / JRST 0(2)).


Date: Wed 10 Jul 85 13:22:30-PDT
From: Bill Palmer <[email protected]>

Oh, yeah, something seemed dumb about what I was doing there.  Hard to think
over the growls of my stomach.

Actually, the code had SFRKV, not SFORK, and the documentation doesn't seem
to explicitly rule out doing it on .FHSLF, but it seems unnecessary to do
so given that it works the other way with no apparent ill effects.  


Date: Wed 10 Jul 85 14:12:18-PDT
From: David Eppstein <[email protected]>
Re:   pfork resolved

Ok, here is what was happening:

LINK has a TOPS-10 format entry vector, pointed to by .JBSA.  The address
contained in .JBSA is what GEVEC% returns.  .JBREN is 0.  So when you do
the GEVEC% followed by a JRST 1(2) you go to 1+C(.JBSA) which is what you
want.  When you do a SFRKV% with AC2 containing 1 you go to 0 which loses.
What you really want to do is a SFRKV% with AC2 containing 1,,0 -- this is
a kluge to let you start LINK or something like that.
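
For readers not used to the notation: 1,,0 is a 36-bit word with 1 in the
left (high) 18-bit half and 0 in the right half, whereas a plain 1 is
0,,1.  The sketch below only shows that arithmetic, with the word held in
a 64-bit integer; xwd, left_half and right_half are hypothetical names,
and nothing here models what SFRKV% itself does with the value.

```c
#include <assert.h>
#include <stdint.h>

#define HALF_MASK 0777777u          /* 18 bits, written in octal */

/* Build a 36-bit word from a left half and a right half ("lh,,rh"). */
uint64_t xwd(uint32_t lh, uint32_t rh)
{
    assert(lh <= HALF_MASK && rh <= HALF_MASK);
    return ((uint64_t)lh << 18) | rh;
}

/* Extract the left (high-order) 18 bits of a 36-bit word. */
uint32_t left_half(uint64_t w)
{
    return (uint32_t)((w >> 18) & HALF_MASK);
}

/* Extract the right (low-order) 18 bits of a 36-bit word. */
uint32_t right_half(uint64_t w)
{
    return (uint32_t)(w & HALF_MASK);
}
```

So the two AC2 values discussed above are xwd(0, 1), which is octal 1,
and xwd(1, 0), which is octal 1000000.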

I am not sure whether I want to change pfork() and kcc to work this way,
or to leave it in its current working state.


Date: Wed 10 Jul 85 14:15:32-PDT
From: Bill Palmer <[email protected]>

From my very cursory glance at the code in FORK.MAC, it looks like SFRKV with
.FHSLF as the process handle does very little except fudge the PC and stack
and return to the right place.  So, I think it's probably just extra overhead
to bother going through with the SFRKV at all - might as well just leave things
as they are unless something is found that breaks the current assumptions.


Date: Wednesday, 10 July 1985  14:20-PDT
From: David Eppstein <Kronj at SU-SIERRA.ARPA>

Well GEVEC% doesn't work with extended addressing programs so there
would be some advantage in using SFRKV%.  Until I want to pass CCL
to such a program I won't worry about it.  But I think I will erect a
monument in pfork() explaining the story.

I looked at FORK.MAC too.  It seems reasonable, and looks like it
would work for .FHSLF exactly as well as for other forks.  So at least
now I don't have to go fix all the exec() code to use GEVEC%.


Date: Wed 10 Jul 85 14:31:11-PDT
From: Bill Palmer <[email protected]>

Well, how about using XGVEC% in all cases?


Date: Wednesday, 10 July 1985  14:33-PDT
From: David Eppstein <Kronj at SU-SIERRA.ARPA>

XGVEC% doesn't work before release 5.1.  SFRKV% always works.


Date: Wed 10 Jul 85 14:33:57-PDT
From: Bill Palmer <[email protected]>

Hey, what's a few dozen more conditionals?
11-Jul-85 12:34:39-PDT,4190;000000000001
Return-Path: <jld@sri-unix>
Received: from sri-unix.ARPA by SU-SIERRA.ARPA with TCP; Thu 11 Jul 85 12:11:42-PDT
Received: by sri-unix.ARPA (4.12/4.16)
	id AA29948; Wed, 10 Jul 85 19:19:46 pdt
Message-Id: <[email protected]>
Date: Wed Jul 10 18:49:43 1985
From: jld@sri-unix
To: kronj@su-sierra
Cc: jld@sri-unix
Subject: kcc byte pointers

	I have been working on a KCC interface for the COMND jsys.  I
am stymied by the fact that COMND requires 7-bit byte pointers, but KCC
uses 9-bit byte pointers except for constant strings.

	How does one get around this problem?

Jim Dein


Date: Thursday, 11 July 1985  12:34-PDT
From: David Eppstein <Kronj at SU-SIERRA.ARPA>

As far as I am aware, COMND% takes byte pointers as arguments which
can be of any byte size; in particular 9-bit bytes should work.
Be sure that you are decrementing your pointers to make ILDB pointers
rather than LDB pointers.

If you are not already doing so, I strongly suggest you model your
subroutine interface after the PASCMD package available with the
Rutgers P20 Pascal compiler.  You would also need to look at SETJMP.C
from the C runtime library, for reparse handling.


Date: Thu Jul 11 16:32:32 1985
From: jld@sri-unix

Thanx for your quick response to my query.

Alan Larson (our TOPS20 expert) has shown me the MACRO code for COMND%
that checks for 7-bit-bytedness in the local one-word byte pointers for
the prompt (CM%RTY), input buffer (CM%BFP), and probably everything
else.  COMND also assumes section 0.  Perhaps these restrictions have
been removed from the TOPS20 monitor version you are using.

I will try the following kluge.  The only buffer that the *user* needs
to write is the prompt string; the others are only written by COMND.
If I use a union to line up a buffer on a word bdry, then I can take
the 9-bit buffer pointer, modify the upper half, back up a byte, and
assign the result to the appropriate word in the command status block.
Sizeof(buffer) will safely underestimate the buffer size.  The user
will have to use a special setprompt() routine to copy his/her/its
version of the prompt string to the internal 7-bit-byte version.
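
The copying step that such a setprompt() would need can be sketched as
follows.  This is an illustration only: pack_ascii7 is a hypothetical
name, each 36-bit word is held in a 64-bit integer, and the layout assumed
is the usual PDP-10 text format of five 7-bit characters per word,
left-justified with the low-order bit unused.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CHARS_PER_WORD 5            /* five 7-bit bytes per 36-bit word */

/* Pack the string s, including its terminating NUL, into words[] as
 * 7-bit bytes.  Returns the number of words used, or 0 if the string
 * does not fit in nwords words.  Hypothetical helper for illustration. */
size_t pack_ascii7(const char *s, uint64_t *words, size_t nwords)
{
    size_t i = 0, w;

    assert(s != NULL && words != NULL);
    for (w = 0; w < nwords; w++)
        words[w] = 0;
    for (;;) {
        unsigned char c = (unsigned char)s[i];
        size_t word = i / CHARS_PER_WORD;
        /* First byte sits in the top 7 bits (shift 29), then 22, 15, 8, 1. */
        unsigned shift = 29u - 7u * (unsigned)(i % CHARS_PER_WORD);

        if (word >= nwords)
            return 0;               /* out of room before reaching the NUL */
        words[word] |= (uint64_t)(c & 0177) << shift;
        if (c == '\0')
            return word + 1;
        i++;
    }
}
```

Note that a buffer of n words holds 4n nine-bit bytes but 5n seven-bit
ones, which is why sizeof on the 9-bit declaration underestimates the
space available to COMND.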

All this would be unnecessary if KCC always used 7-bit bytes.  Of
course, sizeof calculations would then be more "interesting".  (The new
ANSI C standard will require a byte size >= 8 bits, unfortunately.)

I plan to steal some ideas from PASCMD.  The C interface will be more
compact, however -- fewer routines, more flag args, a la UNIX.  The
"multiple mode" idea is nice.

Jim Dein


Date: Thu 11 Jul 85 17:37:33-PDT
From: David Eppstein <[email protected]>

You got me curious, so I looked in the monitor.  If the routine you mean
is RDCBP, then what the version we have (and likely yours) actually does
is make sure that the byte size is not less than 7 bits, and that if OWGBPs
are used then that only happens in a non-zero section.  So KCC-generated
9-bit pointers and extended addressing OWGBPs should be safe.

I am not so sure what TBLUK et al (used by COMND% for .CMKEY and .CMSWI)
do in extended addressing.  But there again there should be no byte size
problems.


Date: Fri Jul 12 12:38:55 1985
From: jld@sri-unix
Re:   KCC test pgm

	I have set up a (reasonably) clean version of my KCC pgm and
deprotected it so you can access it via FTP in the usual
anonymous/guest manner.  The files are on host sri-ai, and they are

	<jld>kronj.c
	<jld>kronj.h

	To further test the 7-bit byte hypothesis, I have added a
conditionally compilable variation that assigns pointers to constant
"scratch" strings when TEST7 is defined (see top of .h file).  This
version works as expected (prints the prompt and waits).

	Manithanx for your help.

Jim Dein


Date: Friday, 12 July 1985  13:10-PDT
From: David Eppstein <Kronj at SU-SIERRA.ARPA>

Ok, now I believe you.  CHKBP does in fact require 7-bit local byte
pointers.  Chomp chomp.

Oh, another program you should look at is the C version of Kermit.
They have a quite complete implementation of COMND under UNIX, and it
should be possible to port this to TOPS-20 as well.
12-Jul-85 10:50:22-PDT,5757;000000000001
Return-Path: <jld@sri-unix>
Received: from sri-unix.ARPA by SU-SIERRA.ARPA with TCP; Thu 11 Jul 85 17:50:16-PDT
Received: by sri-unix.ARPA (4.12/4.16)
	id AA20127; Thu, 11 Jul 85 17:49:04 pdt
Message-Id: <[email protected]>
Date: Thu Jul 11 17:26:50 1985
From: jld@sri-unix
To: kronj@su-sierra
Cc: jld@sri-unix
Subject: kcc command

It seems clear from the KCC docu and from tests that every KCC file
that does not contain main() is treated as a library source, and that
every global entity in such source must be identified at the beginning
of the file by an "entry" statement.  Furthermore, commands like

	@CC MAIN.C NOTMAIN.C

do not produce MAIN.EXE, and in fact multiple-source-file pgms cannot
be compiled and linked in one step, even though such pgms are common in
the C world.  It took me some time to guess that the necessary
commandage is

	@CC -c MAIN.C NOTMAIN.C
	...
	@LINK
	*MAIN,NOTMAIN
	*/GO
	@

It seems to me that since KCC has a default peephole phase, unresolved
references discovered during the parsing phase could be saved and
written out first, before the rest of the (possibly peepholed) FAIL
code.  Also, perhaps KCC could be taught to identify .REL, .FAI, and
.MAC files and incorporate them into the program at the appropriate
point.  The way the present KCC command works is inconvenient and
non-standard.

Jim Dein

Date: Friday, 12 July 1985  10:49-PDT
From: David Eppstein <Kronj at SU-SIERRA.ARPA>
To:   jld at sri-unix
Re:   kcc command

Well, you are sort of correct in that
	CC MAIN.C NOTMAIN.C
should in fact give you a MAIN.EXE.  Currently it merely separately
compiles the two modules.  On Sierra you would instead type
	LOAD MAIN.C,NOTMAIN.C
but I guess that doesn't work at SRI.  Instead you should do
	CC -c MAIN.C NOTMAIN.C
	LOAD MAIN.REL,NOTMAIN.REL
(simpler than running LINK by hand as you suggest).

You seem to be a little confused about some other things:

In spite of the peephole phase, KCC is one-pass.  The most of the
parse tree for the program it ever has in memory at one time is one
function, and the most of the to-be-emitted FAIL code it ever has in
memory (in the peephole buffer) at one time is one basic block.

Unresolved references are what the extern declaration is for, and are
also noticed implicitly if you use a function that has not yet been
declared.  Because KCC is one-pass, it is impossible to tell until the
end of the file whether the actual function is defined within the file
or not, but in any case it makes little difference to FAIL where in
the FAIL code the EXTERN pseudo-ops are written out.

You only need entry if your module is going in the C runtime library.
If it is just a secondary module that will be linked in explicitly
with one or more programs as above, then it is not necessary.  I.e.,
you don't need it, but I do.  The reason I need it is because after a
bunch of REL files have been stuck together with MAKLIB into a
library, LINK will only load a REL file from this library when there
is an unresolved external reference to a symbol from that REL file
that has been declared an entry.  The reasons it needs to go at the
front of the file are:

  - ENTRY blocks can only go at the start of REL files.  FAIL is
    one-pass, so it can only emit them there if the ENTRY pseudo-op
    comes before any code is emitted.  KCC is one-pass, so it can only
    put the ENTRY at the start of the code if that's where the
    corresponding entry statement occurs in the C source.

  - This way I can treat entry as an ordinary symbol if it occurs
    anywhere else in the C source, and thus correctly compile programs
    whose authors didn't believe K&R when they said it was a reserved word.

I dislike the UNIX convention of having CC be the equivalent of the
COMPILE command, knowing about REL and MAC and so on files.  CC is the
C compiler.  COMPILE is the command to compile other things.  If you
really want CC to deal with assembly, you can delimit it in the C
source with #asm and #endasm.  Some caveats:

  - It is not likely to appear where you expect in the FAIL output.
    Only put whole routines in #asm, and don't make them depend on
    where in the FAIL output they are defined.

  - Data structures should call the $$DATA macro before their
    definitions and should always (even if they're the last of the
    #asm) call $$CODE after them.  On second thought, don't use them
    at all -- declare them in C instead.

  - If you use the -m flag (to use MACRO instead of FAIL) be aware
    that any routine declared in #asm and used in the same program
    will get an EXTERN declaration in the MACRO output rather than the
    INTERN it should have gotten.  MACRO doesn't like this.  You can
    get around it by declaring all such functions static (which you
    should do even if you use FAIL).

  - The calling conventions for KCC functions are documented in
    CC.DOC, which you should have a copy of somewhere.

  - Keep your assembly inclusions simple, because at some point I plan
    to make KCC parse the stuff and I don't want it to be too complicated.

  - Use C preprocessor conditional compilation features and macros
    rather than FAIL or MACRO conditional compilation and macros.
    This is partly so that it will be parseable by KCC when I do that,
    and partly because the assembly header KCC inserts at the start of
    its output undefines all the conditional compilation pseudo-ops.

  - Only do assembly when necessary -- remember that the more assembly
    you have, the less portable and understandable your program becomes.

Hope this has cleared things up a little...
					David
13-Jul-85 23:59:26-PDT,717;000000000001
Mail-From: KRONJ created at 13-Jul-85 23:59:17
Date: Sat 13 Jul 85 23:59:17-PDT
From: David Eppstein <[email protected]>
Subject: cute hack in KCC assembly output
To: [email protected]

Friday I finished installing the SKIPPED flag (but there are still some
quirks left in the emitted code so I haven't put it up on SYS: yet).
So anyway, I wanted to see which instructions KCC thought were being
skipped over, in case it made a mistake.  Now I've made it put an extra
space before those instructions, just like a human programmer
(too bad I couldn't add a number of spaces to match the level of
skip cascading, but that's not in the data structure).  If only
it could comment its code...
-------
15-Jul-85 14:37:13-PDT,479;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Mon 15 Jul 85 13:57:26-PDT
Date: Mon 15 Jul 85 13:57:28-PDT
From: Ken Harrenstien <[email protected]>
Subject: KCC infinite loop
To: [email protected]
cc: [email protected]

One of the users here has created a test program which puts KCC into
an infinite loop without an error message.  The file is <KLH>TEST3.C.
Appears to be a problem with static initializations but not sure.
15-Jul-85 14:52:38-PDT,278;000000000001
Mail-From: KRONJ created at 15-Jul-85 14:52:28
Date: Mon 15 Jul 85 14:52:28-PDT
From: David Eppstein <[email protected]>
Subject: test3
To: [email protected]

Odd.  Both SYS:CC and the new version I am testing on Sierra happily
compile test3.c and terminate.
-------
15-Jul-85 15:31:40-PDT,425;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Mon 15 Jul 85 14:58:20-PDT
Date: Mon 15 Jul 85 14:58:16-PDT
From: Ken Harrenstien <[email protected]>
Subject: Re: test3
To: [email protected]
cc: [email protected]
In-Reply-To: Message from "David Eppstein <[email protected]>" of Mon 15 Jul 85 14:53:57-PDT

Maybe that means I need a new version of KCC.  I will try that.
15-Jul-85 15:31:58-PDT,488;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Mon 15 Jul 85 15:26:31-PDT
Date: Mon 15 Jul 85 15:17:04-PDT
From: Ken Harrenstien <[email protected]>
Subject: Re: test3
To: [email protected]
cc: [email protected]
In-Reply-To: Message from "David Eppstein <[email protected]>" of Mon 15 Jul 85 14:53:57-PDT

OK, I brought over your SYS:CC and used that, and TEST3 now compiles OK.
I guess you must have fixed something along the way.
17-Jul-85 10:04:03-PDT,742;000000000001
Mail-From: LOUGHEED created at 17-Jul-85 00:56:32
Date: Wed 17 Jul 85 00:56:32-PDT
From: Kirk Lougheed <[email protected]>
Subject: KCC export directory
To: [email protected]
cc: [email protected], [email protected]

David -
	Would it be possible to set up a KCC export directory that we
could point people at?  Such a directory (or group of directories) would
contain a working snapshot of the compiler sources and binaries as well
as enough documentation to allow someone to install KCC for the first time.
Sources to the EXEC hacks for .C extensions would also be useful.

I've done something like this for FTP, e.g. created PS:<FTP.EXPORT> while
maintaining PS:<FTP> as the primary source directory.

Kirk
17-Jul-85 10:20:14-PDT,678;000000000001
Mail-From: KRONJ created at 17-Jul-85 10:20:08
Date: Wed 17 Jul 85 10:20:08-PDT
From: David Eppstein <[email protected]>
Subject: mysterious reappearance of <SUBSYS>CC.EXE
To: [email protected]
cc: [email protected]

I recently moved CC.EXE from <SUBSYS> to LSYS:.  Today I was making comparisons
between my latest version in testing and what I thought was my latest installed
version, and was greatly confused to discover that SYS:CC.EXE was a couple of
weeks behind what I thought it should be.  Turns out that LSYS:CC.EXE was fine,
but <SUBSYS>CC.EXE had mysteriously reappeared.  Was this related to yesterday's
filesystem mungage?
-------
17-Jul-85 12:50:38-PDT,459;000000000001
Mail-From: LOUGHEED created at 17-Jul-85 12:50:31
Date: Wed 17 Jul 85 12:50:31-PDT
From: Kirk Lougheed <[email protected]>
Subject: Re: mysterious reappearance of <SUBSYS>CC.EXE
To: [email protected]
cc: [email protected]
In-Reply-To: Message from "David Eppstein <[email protected]>" of Wed 17 Jul 85 10:20:11-PDT

Yes, CC's reappearance on <SUBSYS> is a direct result of restoring that
directory from tape.

Kirk
-------
17-Jul-85 12:59:24-PDT,468;000000000001
Return-Path: <[email protected]>
Received: from SU-GSB-HOW.ARPA by SU-SIERRA.ARPA with TCP; Wed 17 Jul 85 11:45:10-PDT
Date: Wed 17 Jul 85 11:43:48-PDT
From: Andrew "VaxBuster" Gideon <[email protected]>
Subject: KCC - array initialization
To: [email protected]
Office-Phone: (415) 497-4816

Is there anything wrong with the line:

int vvv[2] = { 20, 30 };   ?

It should work, and yet I receive the error "illegal initialization".

		Andy
18-Jul-85 14:56:26-PDT,925;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Thu 18 Jul 85 14:55:34-PDT
Date: Thu 18 Jul 85 14:56:06-PDT
From: Ken Harrenstien <[email protected]>
Subject: Useful C book
To: [email protected]
cc: [email protected]

"C: A Reference Manual" by Samuel P. Harbison & Guy L. Steele Jr.
pub. by Prentice-Hall Inc.  ISBN 0-13-110008-4

This is a good book for people writing C compilers or portable C code.
It isn't an introduction, or tutorial, or anything like that, but does
provide a much more specific description of what a C compiler should
understand and what it should do.  I found it very helpful in clarifying
many fuzzy details, especially as it often tells you whether a feature is
implemented on "all", "most", or "few" compilers; doesn't go far enough
in actually identifying specific compilers by name, but compared to K&R this
is a big help.
-------
18-Jul-85 14:58:04-PDT,1692;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Thu 18 Jul 85 14:48:11-PDT
Date: Thu 18 Jul 85 14:48:05-PDT
From: Ken Harrenstien <[email protected]>
Subject: Re: NMIMT v7 C ("GCC")
To: c20%[email protected]
cc: [email protected], [email protected], [email protected],
    [email protected]
In-Reply-To: Message from "Greg Titus <c20%[email protected]>" of Wed 17 Jul 85 10:22:15-PDT

Now that you are on SIMTEL20 you should probably ask to be put on the
BUG-KCC@SIERRA mailing list.  David Eppstein (KRONJ@SIERRA) is the
person who has been doing most of the actual compiler work; I've been
overhauling a lot of library routines and worrying about portability
to other systems.  My impression is that Greg Satz (SATZ@SIERRA) is
overall coordinator (the official SIERRA person), although he hasn't
been visibly active recently.

I don't know whether GCC and KCC have enough commonality any more to
consider sharing code fragments, although the ideas may be beneficial.
The library routines are probably the most likely place where stuff
developed for one can be used with the other.  However, before even
looking at the sources with an eye to this, we need to agree whether it
is OK in principle to share code.  While I don't mind carrying
copyrights around, I don't want to touch anything that cannot be
freely distributed.  In fact this is my motivation for copyrighting...
to ensure that company X cannot acquire the code and then charge money
for it.


Perhaps you could bring the others up to speed by sending a short summary of
NMIMT C (or Frank can just re-send his own summaries).

--Ken
18-Jul-85 16:36:08-PDT,774;000000000001
Return-Path: <[email protected]>
Received: from SIMTEL20.ARPA by SU-SIERRA.ARPA with TCP; Thu 18 Jul 85 15:42:40-PDT
Date: Thu, 18 Jul 1985  16:41 MDT
Message-ID: <[email protected]>
From: "Frank J. Wancho" <[email protected]>
To:   Ken Harrenstien <[email protected]>
Cc:   c20%[email protected], [email protected],
      [email protected], [email protected], [email protected]
Subject: NMIMT v7 C ("GCC")
In-reply-to: Msg of 18 Jul 1985  15:48-MDT from Ken Harrenstien <KLH at SRI-NIC.ARPA>

Greg Titus is set up here as GTITUS.  Perhaps I should let Greg
clarify the issue of code-sharing from his side of the fence before we
get into any further discussions/summaries of our correspondence so
far.

--Frank
20-Jul-85 09:40:11-PDT,983;000000000001
Return-Path: <jld@sri-unix>
Received: from sri-unix.ARPA by SU-SIERRA.ARPA with TCP; Fri 19 Jul 85 19:13:24-PDT
Received: by sri-unix.ARPA (4.12/4.16)
	id AA17938; Fri, 19 Jul 85 19:10:53 pdt
Message-Id: <[email protected]>
Date: Fri Jul 19 18:14:42 1985
From: jld@sri-unix
To: kronj@su-sierra
Cc: jld@sri-unix
Subject: byte ptr conversion

	See <jld>coerce.c on host sri-ai for an example of converting
9-bit-byte strings to 7.  The string has to be copied into a newly
allocated buffer.  Unfortunately, there is no mechanism to return
memory, since the buffer ptr is not saved.

	The key was to get around the bad (int) char* cast (result is
zero) by doing an (int) (int *) char* simulcast.

Jim Dein

Date: Saturday, 20 July 1985  09:39-PDT
From: David Eppstein <Kronj at SU-SIERRA.ARPA>
To:   jld at sri-unix
Re:   byte ptr conversion

If you are interested in my looking at this file it would be helpful
to set public read permission.
20-Jul-85 10:42:04-PDT,447;000000000001
Return-Path: <jld@sri-unix>
Received: from sri-unix.ARPA by SU-SIERRA.ARPA with TCP; Sat 20 Jul 85 10:28:01-PDT
Received: by sri-unix.ARPA (4.12/4.16)
	id AA24088; Sat, 20 Jul 85 10:25:26 pdt
Message-Id: <[email protected]>
Date: Sat Jul 20 10:21:33 1985
From: jld@sri-unix
To: kronj@su-sierra
Subject: file prot

	Ooops.  I have modified the mode bits, as appropriate, of file
<jld>coerce.c on host sri-ai.

Jim Dein
20-Jul-85 11:02:59-PDT,1008;000000000001
Mail-From: KRONJ created at 20-Jul-85 11:02:57
Date: 20 Jul 1985  11:02 PDT (Sat)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   jld@sri-ai
Subject: coerce

Looks reasonable.  You don't really need the &777777 -- (int *) does
the same thing for you.  I am not quite sure what you mean by (int)
zeroing your char pointers -- it should merely copy all the bits.
Your code will also not work for extended addressing, but I guess
that's not a particularly serious shortcoming.

Here is what I use for roughly the same thing in KCC itself, to make
PRARG block strings for passing off to LINK and FAIL:

char *bp7(ip)
int *ip;
{
    int i = ip;
    i |= ((i &~ 0777777) ? 0620000000000 : 0350700000000);
    return (char *) i;
}

This takes an int pointer and turns it into a 7-bit byte pointer, no
malloc and no string copy.  I guess the non-PDP-10 version (not a
useful thing in my case) would merely coerce it back to (char *).
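
The bit manipulation in bp7() can be tried out on a machine without 36-bit words.  The sketch below is an assumption-laden simulation, not KCC code: it holds a PDP-10 word in a uint64_t, and the names word36, bp7_word, RH_MASK, BP7_LOCAL and BP7_GLOBAL are made up for illustration.  The two octal constants are copied from the function above.

```c
#include <stdint.h>

/* Simulated 36-bit PDP-10 word held in a uint64_t (an assumption of this sketch). */
typedef uint64_t word36;

#define RH_MASK    0777777ULL        /* right half: the 18-bit in-section address */
#define BP7_LOCAL  0350700000000ULL  /* P=35 octal, S=7: leftmost 7-bit byte of the word */
#define BP7_GLOBAL 0620000000000ULL  /* P/S code 62 octal, used when a section number is present */

/* Mirror of bp7(): OR the appropriate byte-pointer bits into a word address. */
word36 bp7_word(word36 addr)
{
    if (addr & ~RH_MASK)             /* bits above the right half: extended address */
        return addr | BP7_GLOBAL;
    return addr | BP7_LOCAL;         /* plain 18-bit address */
}
```

For example, bp7_word(0140) gives 0350700000140, and an address with a section number, such as 01000140, gives 0620001000140.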
20-Jul-85 15:26:40-PDT,1679;000000000011
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Sat 20 Jul 85 15:20:44-PDT
Date: Sat 20 Jul 85 15:21:21-PDT
From: Ken Harrenstien <[email protected]>
Subject: KCC broken
To: [email protected]
cc: [email protected]

Well, the new KCC breaks HOCK again; the sizeof construct is messed up.
Here is a demonstration test program:


[PHOTO:  Recording initiated  Sat 20-Jul-85 3:20pm]

 End of COMAND.CMD.1
@ty test5.c
#define SW_FLG 0
#define SW_VAR 1
#define SW_STR 2
#define SW_SPC_ID 3
#define SW_SPC_RL 4
struct swarg {
        char *sw_name;
        int sw_type;
        union { int *sw_avar; char **sw_astr; } sw_v;
} swtab[] = {
        "Identplayer",  SW_SPC_ID, 0,
        "RList",        SW_SPC_RL, 0,
        "Gamereport",   SW_FLG, 0,
        "GSchedule",    SW_VAR, 0,
        "Standings",    SW_FLG, 0,
        "Missing",      SW_FLG, 0,
        "Indivstats",   SW_FLG, 0,
        "Headtype",     SW_VAR, 0,
        "Creditlist",   SW_VAR, 0,
        "ITotals",      SW_VAR, 0,
        "Ranking",      SW_VAR, 0,
        "Rdebug",       SW_FLG, 0,
        "Team",         SW_STR, 0,
        "Reflist",      SW_VAR, 0
};
main()
{
        printf("Table at %o, size %d, elsize %d, nelems %d, nelwds %d\n",
                        swtab,
                        sizeof(swtab),
                        sizeof(struct swarg),
                        (sizeof(swtab))/(sizeof(struct swarg)),
                        (sizeof(struct swarg))/(sizeof(char *)));
}
@
@test5
Table at 140, size 4, elsize 12, nelems 0, nelwds 3
@
@pop

[PHOTO:  Recording terminated Sat 20-Jul-85 3:20pm]
20-Jul-85 18:39:54-PDT,573;000000000001
Mail-From: KRONJ created at 20-Jul-85 18:39:51
Date: Sat 20 Jul 85 18:39:51-PDT
From: David Eppstein <[email protected]>
Subject: Re: KCC broken
To: [email protected]
In-Reply-To: Message from "Ken Harrenstien <[email protected]>" of Sat 20 Jul 85 15:26:40-PDT

Ok, it's fixed now.  This happened a while back when I changed array and
function uses to generate pointer-type objects (to reduce confusion in
later type checking and code generation).  The solution was to distinguish
between pointer types created in this manner and normal pointer types.
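
The distinction matters because sizeof is one of the few places where an array name must not be treated as a pointer.  A minimal sketch in present-day C, using a cut-down table with hypothetical helper names (swtab_nelems and size_seen_through_pointer are not from HOCK or KCC):

```c
#include <stddef.h>

struct swarg {
    const char *sw_name;
    int sw_type;
};

struct swarg swtab[] = {
    { "Gamereport", 0 },
    { "GSchedule",  1 },
    { "Team",       2 },
};

/* Here swtab is still an array, so sizeof yields the size of the whole table. */
size_t swtab_nelems(void)
{
    return sizeof(swtab) / sizeof(swtab[0]);
}

/* Once passed to a function, only a pointer is left, whatever the table length. */
size_t size_seen_through_pointer(const struct swarg *p)
{
    return sizeof(p);
}
```

A compiler that converts the array to a pointer too early would compute the second value where the first is wanted, which is consistent with the "size 4, nelems 0" output in the report.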
-------
25-Jul-85 19:10:58-PDT,475;000000000001
Mail-From: WHP4 created at 24-Jul-85 20:46:53
Date: Wed 24 Jul 85 20:46:53-PDT
From: Bill Palmer <[email protected]>
Subject: bmgrep
To: [email protected], [email protected]

I snarfed bmgrep off of Mojave the other night and got around to compiling
it tonight.  It seems to work, so I renamed the directory to 
<KCC.UNIX.SRC.BMGREP> and I suppose I'll stick the .EXE in whatever the
directory is that .EXEs live in.

It's not as fast as XSEARCH, I think.
25-Jul-85 19:18:28-PDT,474;000000000001
Return-Path: <[email protected]>
Received: from SU-SCORE.ARPA by SU-SIERRA.ARPA with TCP; Thu 25 Jul 85 19:17:09-PDT
Date: 25 Jul 1985 16:24-PDT
Sender: [email protected]
Subject: Re: system logical name C:
From:  William "Chops" Westfield <[email protected]>
To: [email protected]
Message-ID: <[SU-SCORE.ARPA]25-Jul-85 16:24:03.BILLW>
In-Reply-To: <[email protected]>

C: will now be defined at system startup to be PS:<KCC.C>

BillW
30-Jul-85 18:11:33-PDT,1811;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Tue 30 Jul 85 17:31:11-PDT
Date: Tue 30 Jul 85 17:31:08-PDT
From: Ken Harrenstien <[email protected]>
Subject: Yet another KCC overoptimization bug
To: [email protected]
cc: [email protected]

HOCK once again tickles a bug.  Here is the test program extract:
	-----------------------------------------
#include <ctype.h>
#define isnum(a) isdigit(a)

main(argc,argv)
int argc;
char **argv;
{	char *penname;
	int penlen, pentyp;

	if(argc >= 2) penname = argv[1];
	else penname = 0;
	penlen = 2;
	pentyp = 0;
	if(!penname) penname = "???";
	else
	  {	if(isnum(penname[0]))
		  {	penlen = penname[0] - '0';
			if(isnum(penname[1]))
				penlen = penlen*10 + (penname[1] - '0');
			if(penlen == 5) pentyp = 100;
			if(penlen == 10) pentyp = 1000;
		  }
	  }
}
	--------------------------------

Here is the bad portion of the FAIL output:

	LDB 3,-2(17)			; this is the first ISDIGIT test
	ADJBP 3,[221100,,.ctype]
	LDB 5,3
	TRNN 5,4
	 JRST $3
	LDB 14,-2(17)
	SUBI 14,60
	MOVEM 14,-1(17)
	MOVE 13,-2(17)			; Here KCC puts penname in 13
	ILDB 7,13			; KCC now gets penname[1] but also
					; leaves penname+1 in 13...
	ADJBP 7,[221100,,.ctype]	; This is the second ISDIGIT test
	LDB 11,7
	TRNN 11,4
	 JRST $6
	MOVE 10,13			; KCC now tries to reuse penname,
	ILDB 6,10			; but forgets that it was already ++'d!
	SUBI 6,60
	IMULI 14,12
	ADD 6,14
	MOVEM 6,-1(17)
$6::

Please fix soon... who knows what else is breaking.


Date: Tuesday, 30 July 1985  18:10-PDT
From: David Eppstein <Kronj at SU-SIERRA.ARPA>

Fixed now.  I had been thinking of doing something similar for AOS reg,reg
anyway -- but I hadn't realized that actual code was losing because of this.
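
For checking this kind of bug, the extract can be wrapped in a function whose result is easy to compare.  The following is a sketch in present-day C; penlen_of is a hypothetical name, and the pentyp assignments are left out for brevity.

```c
#include <ctype.h>

/* Same logic as the extract, returning penlen so the result can be checked. */
int penlen_of(const char *penname)
{
    int penlen = 2;

    if (penname && isdigit((unsigned char)penname[0])) {
        penlen = penname[0] - '0';
        /* penname[1] is read twice; both reads must see the same character. */
        if (isdigit((unsigned char)penname[1]))
            penlen = penlen * 10 + (penname[1] - '0');
    }
    return penlen;
}
```

Going by the annotated FAIL output, the miscompiled version would have taken its second read from penname[2] instead, so an argument such as "10" would not have produced 10.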
31-Jul-85 16:56:07-PDT,1947;000000000001
Return-Path: <jld@sri-unix>
Received: from sri-unix.ARPA by SU-SIERRA.ARPA with TCP; Wed 31 Jul 85 16:47:07-PDT
Received: by sri-unix.ARPA (4.12/4.16)
	id AA02901; Wed, 31 Jul 85 16:47:56 pdt
Message-Id: <[email protected]>
Date: Wed Jul 31 15:38:07 1985
From: jld@sri-unix
To: kronj@su-sierra
Cc: jld@sri-unix
Subject: reply to msgs re kcc

	This is just to let you know that I have indeed been getting
your long msgs re KCC.  They have been quite helpful.  To clarify one
point of mine: (int) of a (char *) does not return 0 as I claimed --
Thank God -- whatever example I thought was doing that unfortunately
got debugged into oblivion long ago.

	My COMND% interface actually works -- as long as you only want
to parse quoted strings.  I finally decided to uniformly use malloc to
get reliably word-aligned entities, although that's overkill, since I
never give the space back.

	As for services offered by the CC command: I won't bother you
with details here, except to say that COMPILE and LINK are just the
most odious commands I have ever seen in this universe, and it's too
bad KCC can't do more to protect me from their horrors.  However, I am
able to limp along with the help of clumsy .MIC scripts.  I was
surprised to find that names of "extern fn();" declarations get passed
to the loader even if the names are never used.  I am not sure, though,
whether that's a bad or good feature.  Other problems: cascades of
error messages, and failure of the compiler to catch dereferencing of
non-pointers (the assembler complains instead).

	Please understand that for the most part I am quite satisfied
with KCC.  Without it I would not be programming for TOPS20 users.  The
code it produces is good, and the runtime lib greatly assists
portability.  I now view the byte-size problem as a monitor anachronism,
not a KCC limitation or a problem with the machine architecture.

Jim Dein
 1-Aug-85 00:59:10-PDT,1398;000000000001
Received: from LOTS-B by Sierra with Pup; Thu 1 Aug 85 00:59:05-PDT
Date: Thu 1 Aug 85 00:56:58-PDT
From: Elgin Lee <P.PAVANE@LOTS-B>
Subject: KCC bugs
To: bug-c@LOTS-B

I think I've found a couple of KCC bugs.  The first is that printf() and
company do not seem to properly handle the %x format specification.

The second is more obscure and best demonstrated by example.  The following
short program demonstrates both bugs (I won't show the output in the
interest of saving space):

#include <stdio.h>

/* #define KCC_RETURN_BUG */

#define BYTEMASK 0xff

short
func()
{
	register int c;
	short value = 0;

	printf("enter two bytes: ");
	while ((c = getchar()) == '\n');
	value = (c & BYTEMASK) << 8;
	c = getchar();
	value |= (c & BYTEMASK);
#ifdef KCC_RETURN_BUG
	/* bug -- if this isn't present, only the second byte is returned */
	c = value + 1;
#endif
	return value;
}

main()
{
    int i;

    printf("hex number? ");
    scanf("%x", &i);
    printf("\ndecimal equivalent:\t%d\n", i);
    printf("hex equivalent:\t%x\n", i);

    printf("\nint value = %d\n", func());
}

-----

The first bug should be clear on running the program.  The second may be
demonstrated by compiling the program with and without KCC_RETURN_BUG
defined -- func() returns different values in these two cases, which
shouldn't be the case.

			Elgin
-------
 1-Aug-85 10:53:46-PDT,486;000000000001
Mail-From: KRONJ created at  1-Aug-85 10:52:54
Date: 1 Aug 1985  10:52 PDT (Thu)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Elgin Lee <P.PAVANE@LOTS-B>
Cc:   [email protected]
Subject: KCC bugs
In-reply-to: Msg of 1 Aug 1985  00:56-PDT from Elgin Lee <P.PAVANE at LOTS-B>

I thought I had fixed %x in printf(), but apparently not.  My latest
version seems to work.  The other bug was fixed a couple of days ago.
 1-Aug-85 23:19:49-PDT,754;000000000001
Received: from LOTS-B by Sierra with Pup; Thu 1 Aug 85 23:19:41-PDT
Date: Thu 1 Aug 85 23:17:26-PDT
From: Elgin Lee <P.PAVANE@LOTS-B>
Subject: Re: KCC bugs
To: [email protected]
cc: bug-c@Sierra
In-Reply-To: Message from "David Eppstein <[email protected]>" of Thu 1 Aug 85 10:52:00-PDT

I just updated the LOTS C compiler from [SIERRA]<KCC.C>,<KCC.SYS>, and
SYS:CC.EXE.  I recompiled the program that I had mailed to bug-c yesterday,
and the %x in printf() now works, but the bug with return only returning
the low order byte of the calculated short value still remains -- the
program still behaves differently with and without the KCC_RETURN_BUG
preprocessor constant defined, which should not be.

			Thanks,
				Elgin
-------
 2-Aug-85 11:42:43-PDT,532;000000000001
Mail-From: KRONJ created at  2-Aug-85 11:42:29
Date: 2 Aug 1985  11:42 PDT (Fri)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Elgin Lee <P.PAVANE@LOTS-B>
Cc:   bug-c@Sierra
Subject: KCC bugs
In-reply-to: Msg of 1 Aug 1985  23:17-PDT from Elgin Lee <P.PAVANE at LOTS-B>

Oops.  Different other bug than the one I had fixed (which would
likely have caused you to get 1 + the correct return value if
KCC_RETURN_BUG was defined).  Anyway, it too has now been fixed.
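
A reduced check for this sort of bug might look like the sketch below, which takes the two input bytes as arguments instead of reading them with getchar().  The name combine_bytes is hypothetical.  High-order bytes of 0x80 and above are best avoided when testing, since the result of converting such values to short depends on the implementation.

```c
#define BYTEMASK 0xff

/* Hypothetical helper: the two getchar() results are passed in as arguments. */
short combine_bytes(int hi, int lo)
{
    short value = 0;

    value = (short)((hi & BYTEMASK) << 8);
    value |= (short)(lo & BYTEMASK);
    return value;   /* both bytes should survive the return */
}
```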
 5-Aug-85 14:08:23-PDT,1078;000000000001
Return-Path: <jld@sri-unix>
Received: from sri-unix.ARPA by SU-SIERRA.ARPA with TCP; Mon 5 Aug 85 14:03:46-PDT
Received: by sri-unix.ARPA (4.12/4.16)
	id AA18482; Mon, 5 Aug 85 14:02:43 pdt
Message-Id: <[email protected]>
Date: Mon Aug  5 13:47:50 1985
From: jld@sri-unix
To: kronj@su-sierra
Cc: jld@sri-unix
Subject: void problem

	This isn't about KCC.  It's about C.  I have already shown this
problem (described below) to 5 experts.  It concerns what happens when
you assign a "pointer to a fn returning nothing".  For instance:

main()
{
	void (*fnptr)();		/* ptr to fn returning nothing */
	extern void fn();		/* fn that returns nothing */
	fnptr = fn;			/* <-- problem here */
	(*fnptr)();			/* execute fn */
}

void fn()				/* fn that returns nothing */
{
	printf("This is a test.\n");
}

KCC warns of illegal pointer coercion but does the right thing; for the
Pyramid compiler this is a fatal error.  However, if "void" is
translated globally to "int", the problem vanishes.

	Any idea what's going on here??

Jim Dein
 5-Aug-85 14:14:06-PDT,415;000000000001
Mail-From: KRONJ created at  5-Aug-85 14:14:03
Date: Mon 5 Aug 85 14:14:03-PDT
From: David Eppstein <[email protected]>
Subject: function returning void
To: [email protected]

The version of KCC I have doesn't complain at all.  I have heard before
that PCC doesn't like pointers to functions returning void, but I don't
know why that is.  The code is perfectly acceptable as far as I can tell.
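
For reference, the same construct written with prototypes, as a sketch in present-day C.  The counter and the name call_through_pointer are additions for illustration, so that the effect of the call can be observed.

```c
int calls = 0;

void fn(void)                 /* function that returns nothing */
{
    calls++;
}

int call_through_pointer(void)
{
    void (*fnptr)(void);      /* pointer to function returning nothing */

    fnptr = fn;               /* the assignment under discussion */
    (*fnptr)();               /* call through the pointer */
    return calls;
}
```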
-------
 5-Aug-85 16:17:02-PDT,429;000000000001
Mail-From: KRONJ created at  5-Aug-85 16:16:59
Date: Mon 5 Aug 85 16:16:59-PDT
From: David Eppstein <[email protected]>
Subject: TermInfo/Curses
To: [email protected], [email protected]

The mod.sources newsgroup sent out public domain versions of these a while
back.  We may still have them on one of our UNIX machines, or there may
be a way of getting them sent to us (electronically or magtape).
-------
 6-Aug-85 15:17:23-PDT,715;000000000001
Mail-From: KRONJ created at  6-Aug-85 15:17:17
Date: 6 Aug 1985  15:17 PDT (Tue)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Bill Palmer <[email protected]>
Subject: >& construct broken in runtimes?
In-reply-to: Msg of 6 Aug 1985  15:14-PDT from Bill Palmer <whp4 at SU-SIERRA.ARPA>

    Date: Tuesday, 6 August 1985  15:14-PDT
    From: Bill Palmer <whp4 at SU-SIERRA.ARPA>

    Isn't the command line

    foo >& bar

    supposed to pipe the stderr output of foo into the file bar?  It
    doesn't seem to work; did it ever?

I never implemented anything like that, so if you didn't either then I
guess it never worked.  YKWTSA...
 6-Aug-85 17:59:05-PDT,497;000000000001
Return-Path: <[email protected]>
Received: from SU-SCORE.ARPA by SU-SIERRA.ARPA with TCP; Tue 6 Aug 85 17:15:15-PDT
Date: Tue 6 Aug 85 17:08:47-PDT
From: Len Bosack <[email protected]>
Subject: Re: TermInfo/Curses
To: [email protected]
In-Reply-To: Message from "David Eppstein <[email protected]>" of Mon 5 Aug 85 16:16:52-PDT
Message-ID: <[email protected]>

Could you poke around a bit and see if you can find them? Maybe Greg
knows something...

Len
 6-Aug-85 20:25:26-PDT,934;000000000005
Mail-From: KRONJ created at  6-Aug-85 20:25:26
Date: Tue 6 Aug 85 20:25:26-PDT
From: David Eppstein <[email protected]>
Subject: Things to do to filename hacker
To: [email protected]

We should eventually support ~ and ~user.  PCC uses /str (e.g. /ps/kronj)
so maybe we should do that too.  The thing that expands logical names should
be some sort of backtracking search so multiple paths can work (e.g. if
I define C: to be <KCC.LIB>,<KCC.C> and want to include <sys/types.h> where
<.SYS> doesn't exist in <KCC.LIB> or TYPES.H isn't there).  We should do
something reasonable for /tmp, but I don't know what (maybe try SCR: then
revert to using the home directory).  We should support /dev, at least
for /dev/null and maybe for /dev/tty and /dev/ttyx (return .NULIO and .CTTRM
(or maybe get JFN on NUL: and TTY:), expand to TTYx:).  Maybe find out
what 4.2 ptys look like and simulate them too.
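
One possible shape for the multiple-path lookup, sketched in C.  This is a plain try-each-directory search, which is simpler than a full backtracking expander, and everything in it is hypothetical: resolve_in_path, exists_fn and fake_exists are made-up names, and fake_exists merely stands in for a real file lookup.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical predicate type: returns nonzero if the named file exists. */
typedef int (*exists_fn)(const char *path);

/* Stand-in for a real lookup: pretends only files in <KCC.C> exist. */
int fake_exists(const char *path)
{
    return strncmp(path, "<KCC.C>", 7) == 0;
}

/*
 * Try each directory of a multi-directory logical name in turn.
 * On success, leaves the first existing dir+name in out and returns the
 * index of that directory; returns -1 if no candidate exists.
 */
int resolve_in_path(const char *const dirs[], size_t ndirs, const char *name,
                    exists_fn exists, char *out, size_t outlen)
{
    size_t i;

    for (i = 0; i < ndirs; i++) {
        int n = snprintf(out, outlen, "%s%s", dirs[i], name);
        if (n < 0 || (size_t)n >= outlen)
            continue;           /* candidate does not fit; try the next one */
        if (exists(out))
            return (int)i;      /* first hit wins */
    }
    if (outlen > 0)
        out[0] = '\0';
    return -1;
}
```

With dirs set to { "<KCC.LIB>", "<KCC.C>" } and the stand-in predicate, looking up TYPES.H falls through the first directory and resolves to <KCC.C>TYPES.H.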
-------
 6-Aug-85 22:40:01-PDT,599;000000000001
Received: from LOTS-B by Sierra with Pup; Tue 6 Aug 85 22:39:20-PDT
Date: Tue 6 Aug 85 22:36:23-PDT
From: Elgin Lee <P.PAVANE@LOTS-B>
Subject: KCC code speed?
To: kronj@Sierra

I'm trying to port a UNIX program to TOPS-20 using KCC.  After your timely
bugfixes, I think that it is close to working, but the execution speed is
quite slow -- about half as fast as on a 780.  Are there some speed
issues in KCC that I don't understand?  If it makes a difference, the
program is probably quite disk-bound -- it basically translates files
from one format to another.

		Thanks,
			Elgin
 6-Aug-85 22:50:29-PDT,1169;000000000001
Mail-From: KRONJ created at  6-Aug-85 22:50:28
Date: 6 Aug 1985  22:50 PDT (Tue)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Elgin Lee <P.PAVANE@LOTS-B>
Subject: KCC code speed?
In-reply-to: Msg of Tue 6 Aug 85 22:36:23-PDT from Elgin Lee <P.PAVANE@LOTS-B>

Byte operations are fairly slow on DEC-20s compared to a byte oriented
machine such as the Vax, so that might be part of it.  The standard
I/O library for KCC means an extra layer of bit-fondling, which is
perhaps less the case under UNIX.  You might also check that you are
comparing times for unloaded machines.  For compute-bound tasks I have
found that KCC programs run, as expected, 3 or so times as fast as on a Vax.

If you are looking for ways to speed things up, one thing to be aware
of is that *++x is typically faster than *x++ for x a byte pointer.
Register declarations won't get you anything because KCC ignores them.
If you are doing bulk input from stdin, you should make sure it is
buffered using setbuf() ... this should make a big difference as it
lets stdio do buffer-at-a-time rather than char-at-a-time reads.
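
A minimal sketch of the setbuf() suggestion, written for an ordinary file so that it can be tested; for stdin the call would be setbuf(stdin, buf), made before the first read.  The name count_bytes is hypothetical, and how much difference the buffer makes depends on the stdio implementation in use.

```c
#include <stdio.h>

/* Counts the bytes in a file, reading through an explicitly supplied buffer. */
long count_bytes(const char *path)
{
    static char buf[BUFSIZ];    /* must stay valid until the stream is closed */
    FILE *fp = fopen(path, "rb");
    long n = 0;

    if (fp == NULL)
        return -1;
    setbuf(fp, buf);            /* must come before any other operation on fp */
    while (getc(fp) != EOF)
        n++;
    fclose(fp);
    return n;
}
```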
 7-Aug-85 20:58:17-PDT,721;000000000011
Return-Path: <jld@sri-unix>
Received: from sri-unix.ARPA by SU-SIERRA.ARPA with TCP; Wed 7 Aug 85 15:25:53-PDT
Received: by sri-unix.ARPA (4.12/4.16)
	id AA25734; Wed, 7 Aug 85 15:24:41 pdt
Message-Id: <[email protected]>
Date: Wed Aug  7 15:16:32 1985
From: jld@sri-unix
To: kronj@su-sierra
Cc: jld@sri-unix
Subject: KCC bug (peephole optimizer)

	The following pgm fails to execute "fn" unless it is compiled
with the peephole optimizer turned off or DEBUG is defined.  (Aug 2
version) -- Jim Dein

main()
{
	int (*fnptr)();
	extern int fn();
	fnptr = fn;
	(*fnptr)();
#ifdef DEBUG
	printf("fnptr, fn: %o, %o\n", fnptr, fn);
#endif
}

int fn() { printf("This is a test.\n"); }
 7-Aug-85 23:14:10-PDT,836;000000000011
Received: from LOTS-B by Sierra with Pup; Wed 7 Aug 85 23:14:00-PDT
Date: Wed 7 Aug 85 23:11:02-PDT
From: Elgin Lee <P.PAVANE@LOTS-B>
Subject: Null Byte Bug
To: bug-c@Sierra

I think I've found yet another bug in KCC.  fwrite does not write a leading
null into a file.  Witness the following test program:

#include <stdio.h>

FILE *out;

main()
{
    char buf[10];

    out = fopen("test.out", "W");
    if (out == NULL)
	exit(1);

    buf[0] = '\0';
    (void) strcpy(buf + 1, "A test.");
    fprintf(stderr, "buf[0] = %d, buf[1] = %d\n", buf[0], buf[1]);
    fwrite(buf, 1, 10, out);
}

----------

Running the program, we see that the first character in buf is a null (zero)
byte.  However, the output file (test.out) begins with the 'A' -- leaving
out the leading null entirely.

		Elgin
-------
 8-Aug-85 15:08:18-PDT,336;000000000001
Mail-From: KRONJ created at  8-Aug-85 15:08:13
Date: Thu 8 Aug 85 15:08:13-PDT
From: David Eppstein <[email protected]>
Subject: Re: KCC bug (peephole optimizer)
To: [email protected]
In-Reply-To: Message from "jld@sri-unix" of Wed 7 Aug 85 20:58:17-PDT

Ok, tail-recursing to pointers to functions should now work.
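
The pattern in question, as a sketch in present-day C with hypothetical names (test_fn, fn_calls, call_last).  The call through the pointer is the last thing the function does, which is the situation in which a compiler may replace the call with a jump.

```c
int fn_calls = 0;

int test_fn(void)
{
    fn_calls++;
    return 7;
}

int call_last(void)
{
    int (*fnptr)(void) = test_fn;

    /* Tail position: nothing remains to be done after this call returns. */
    return (*fnptr)();
}
```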
-------
 8-Aug-85 15:09:29-PDT,426;000000000001
Mail-From: KRONJ created at  8-Aug-85 15:09:15
Date: Thu 8 Aug 85 15:09:15-PDT
From: David Eppstein <[email protected]>
Subject: Re: Null Byte Bug
To: P.PAVANE@LOTS-B
cc: [email protected]
In-Reply-To: Message from "Elgin Lee <P.PAVANE@LOTS-B>" of Wed 7 Aug 85 23:14:10-PDT

I have reworked stdio so that '\0' as the first byte written to a file
now actually gets written.  Thanks for the bug report.
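
A round-trip check along the lines of the original report, sketched in present-day C.  The name roundtrip_leading_nul is hypothetical, and the "wb" and "rb" modes are an assumption of this sketch, not what the report used.

```c
#include <stdio.h>
#include <string.h>

/*
 * Writes a 10-byte buffer whose first byte is '\0', reads it back into
 * back[], and returns the number of bytes read (10 if nothing was dropped).
 */
int roundtrip_leading_nul(const char *path, char *back)
{
    char buf[10];
    FILE *fp;
    size_t n;

    memset(buf, 0, sizeof buf);
    strcpy(buf + 1, "A test.");         /* buf[0] stays '\0' */

    fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;
    n = fwrite(buf, 1, sizeof buf, fp);
    if (fclose(fp) != 0 || n != sizeof buf)
        return -1;

    fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    n = fread(back, 1, sizeof buf, fp);
    fclose(fp);
    return (int)n;
}
```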
-------
 9-Aug-85 13:31:41-PDT,530;000000000001
Return-Path: <jld@sri-unix>
Received: from sri-unix.ARPA by SU-SIERRA.ARPA with TCP; Fri 9 Aug 85 11:51:15-PDT
Received: by sri-unix.ARPA (4.12/4.16)
	id AA12223; Fri, 9 Aug 85 11:51:19 pdt
Message-Id: <[email protected]>
Date: Fri Aug  9 11:47:07 1985
From: jld@sri-unix
To: kronj@su-sierra
Cc: jld@sri-unix
Subject: all OK

	The newest version of KCC handles void (*)() things correctly,
both in parsing and in code-generation.  Now I get to complain to
Pyramid about their cc and lint!

Jim Dein
18-Aug-85 11:16:00-PDT,1311;000000000001
Return-Path: <[email protected]>
Received: from SIMTEL20.ARPA by SU-SIERRA.ARPA with TCP; Fri 16 Aug 85 17:09:44-PDT
Date: Fri 16 Aug 85 18:09:22-MDT
From: Greg Titus <[email protected]>
Subject: NMIMT v7 C (sharing)
To: [email protected], [email protected], [email protected],
    [email protected]
cc: [email protected]
Message-ID: <[email protected]>

Code sharing from our side of the fence:

Sorry for being uncommunicative for so long -- I should be logging
in here more frequently from now on.

Our position on code sharing is that we would want someone to cross
our palm with silver in return for our code.  Tech is, quite frankly,
trying to make some money from the sale of this compiler and library.
It would not be in our best interest, therefore, to give the sources
away free to potential competitors (even unintending competitors,
i.e., those not after profit).  So, how about buying a distribution
(it comes with sources) for the going rate, which is $100?

This is not a particularly altruistic attitude for us to take, but the
NM state legislature can pinch a mean penny when it wants to, and we
will take value where we can get it.

As for other things, I will be happy to test stuff and generally kibitz,
if that's ok.

greg
18-Aug-85 11:16:50-PDT,1596;000000000001
Return-Path: <[email protected]>
Received: from SIMTEL20.ARPA by SU-SIERRA.ARPA with TCP; Fri 16 Aug 85 18:48:55-PDT
Date: Fri, 16 Aug 1985  19:41 MDT
Message-ID: <[email protected]>
From: "Frank J. Wancho" <[email protected]>
To:   Greg Titus <[email protected]>
Cc:   [email protected], [email protected], [email protected],
      [email protected]
Subject: NMIMT v7 C (sharing)
In-reply-to: Msg of 16 Aug 1985  18:09-MDT from Greg Titus <GTITUS>

Greg,

You still haven't answered the question about code sharing:  suppose
this group just happened to find the 100 bucks, and against their
principles, bought a copy of your compiler and sources.  They would
still not necessarily have the rights to use what they get for their
give-away project.  Conversely, it appears that you may borrow ideas
and code at no cost to Tech and no strings, then turn around and
profit by it.

I believe what we have here is a special case with all quibbling about
$100 aside, since such a purchase would not benefit either side all
that much in the long run.  However, the proposed sharing of runtime
libraries might be another matter, and both sides can save face and
their principles.  Both can keep their compiler approaches while
bending a little to accommodate a common runtime environment.  Propose
this to your management for careful consideration and let us know.  If
it turns out to be workable, we might have something; if not, your
compiler may have to fend for itself on entirely different grounds,
and we all lose...

--Frank
22-Aug-85 18:22:12-PDT,1470;000000000001
Return-Path: <[email protected]>
Received: from SIMTEL20.ARPA by SU-SIERRA.ARPA with TCP; Thu 22 Aug 85 13:13:19-PDT
Date: Thu 22 Aug 85 14:13:10-MDT
From: Greg Titus <[email protected]>
Subject: NMIMT v7 C (sharing)
To: [email protected]
cc: [email protected], [email protected], [email protected],
    [email protected]
Message-ID: <[email protected]>

Frank,

I'm not sure I understand you.  We are offering to sell SRI the right to
use the code.  What we are asking is that they pay the same price for
that right that other people pay for the right to simply run it.  Granted,
$100 is not that much money, in terms of what SRI or White Sands or even
NMT spends in a year.  On the other hand, for every five $100 checks we
collect, we can buy another terminal (or whatever).

The thing is, we don't get much by sharing code with SRI.  Our compiler
is done and functional, as is our library.  You're saying that we're
missing out on the best of both worlds;  we're saying that we've already
chosen our world.

What do you mean by "against their principles"?  If you mean that it
is against their principles to steal code that wasn't offered to them,
no one is asking them to do that.  If you mean that it is against their
principles to pay for code, then that's too bad.  SRI pays them for
their code;  SRI can pay us for our code.

In any case, I'd like to hear from KLH or SATZ on this, also.

greg
24-Aug-85 15:54:38-PDT,368;000000000001
Mail-From: KRONJ created at 24-Aug-85 15:54:31
Date: Sat 24 Aug 85 15:54:31-PDT
From: David Eppstein <[email protected]>
Subject: varargs
To: [email protected]

I've written a varargs.h for KCC, and installed it on Sierra.
The various runtimes which ought to use it (printf, scanf, execl)
should probably be rewritten to do so at some point.
-------
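
A sketch of the kind of routine that would use such a header.  It is
written against <stdarg.h>, the later ANSI C counterpart, because the
contents of the KCC varargs.h are not shown here; sum_ints is a
hypothetical example function, not one of the runtimes named above.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical example: add up `count` int arguments. */
int sum_ints(int count, ...)
{
    va_list ap;
    int i, total = 0;

    assert(count >= 0);
    va_start(ap, count);          /* start after the last fixed argument */
    for (i = 0; i < count; i++)
        total += va_arg(ap, int); /* fetch the next argument as an int */
    va_end(ap);
    return total;
}
```

Routines such as printf walk their argument list the same way, with the
format string telling them which type to fetch next.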
26-Aug-85 14:02:20-PDT,1193;000000000001
Mail-From: SATZ created at 26-Aug-85 14:01:15
Date: Mon 26 Aug 85 14:01:15-PDT
From: Greg Satz <[email protected]>
Subject: Re: NMIMT v7 C (sharing)
To: [email protected]
cc: [email protected], [email protected], [email protected]
In-Reply-To: Message from "Greg Titus <[email protected]>" of Thu 22 Aug 85 13:14:07-PDT
Phone: (415) 497-1004

Just to set the record straight, Stanford is developing this compiler
with help from others around the internet. This piece of code will be
copyrighted by the university so it can't be resold or otherwise
exchanged for money. Stanford, at this point, doesn't have any interest
in asking for any remuneration. Furthermore the code is free from any
licensing restrictions.

I don't remember who asked (Wancho?) the original question but I thought
it was something along the lines of letting us (Stanford) use some of
NMT's runtimes so we wouldn't have to write them ourselves. Greg Titus
seems to imply that $100 will give us the right to use NMT's software
but not redistribute it in the form of our compiler/runtime system. That
seems to answer the original query.

Stanford will continue developing the runtimes
26-Aug-85 23:12:54-PDT,1989;000000000001
Return-Path: <jld@sri-unix>
Received: from sri-unix.ARPA by SU-SIERRA.ARPA with TCP; Mon 26 Aug 85 18:10:45-PDT
Received: by sri-unix.ARPA (4.12/4.16)
	id AA05079; Mon, 26 Aug 85 18:11:57 pdt
Message-Id: <[email protected]>
Date: Mon Aug 26 17:22:38 1985
From: jld@sri-unix
To: satz@su-sierra
Cc: jld@sri-unix, sarvela@sri-unix, kronj@su-sierra
Subject: kcc runtime lib

	David Eppstein tells me that he won't be doing much maintenance
on KCC until the end of the year because of a fellowship conflict.  We
are concerned enough about certain runtime library non-functionalities
(see below) to even do some of the work ourselves in the meantime, but
we would like to coordinate with SU -- and with D.E.

	1.  Symptom: programs that use stdio to read from a terminal
never see an end-of-file.  The problem is that read() assumes that
monitor-level reads will fail after end-of-file has been reached -- not
true for terminals.  Correct UNIX emulation requires setting a flag on
hitting real EOF, then returning 0 bytes and resetting the flag on the
next read, etc.  The flag must be associated with the file descriptor,
so we propose augmenting _uioch[] to become

	struct {
		short jfn;
		short flag;
	} _uioch[];

Modules that refer to _uioch[] will have to be changed.

	2.  Symptom: pgms hang in sscanf() because _fillbuf doesn't
check for _IOSTRG flag and return immediate EOF, but tries to read from
nowhere.

	3.  File descriptors are not really dup()'d on fork(), so
closing a descriptor in more than one process causes problems.  We propose
to follow D.E.'s suggestion of mapping a special page into all child
processes to contain job-wide file structures with reference counts and
type information.  This would be rather tricky, as fork, exec, open,
and close would all be affected.  Unused parts of the shared page could
support an optional KCC pipe mechanism, for those poor systems that
don't have pipe devices.

Jim Dein
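
Item 1 above can be sketched roughly as follows.  This is an
illustration under stated assumptions, not the KCC runtime: the table is
called uioch here (without the leading underscore), monitor_read() is a
stand-in that reports EOF together with the last short read, and
emu_type() exists only to feed it data.

```c
#include <assert.h>
#include <string.h>

#define NFILES 4

/* Per-descriptor entry as proposed: a JFN plus an EOF flag. */
struct uioch_entry {
    short jfn;
    short flag;                   /* set when the monitor reported EOF */
};
static struct uioch_entry uioch[NFILES];

/* Stand-in for the monitor-level read (an assumption of this sketch):
   copies up to n bytes of pending input and sets *eof when the input
   ran out during this call. */
static const char *pending = "";

void emu_type(const char *s)
{
    pending = s;
}

static int monitor_read(char *buf, int n, int *eof)
{
    int avail = (int)strlen(pending);
    int got = avail < n ? avail : n;

    memcpy(buf, pending, (size_t)got);
    pending += got;
    *eof = (got < n);
    return got;
}

/* read() emulation: EOF seen along with data is remembered and
   delivered as a zero count on the next call, then forgotten. */
int emu_read(int fd, char *buf, int n)
{
    int eof, got;

    assert(fd >= 0 && fd < NFILES);
    if (uioch[fd].flag) {
        uioch[fd].flag = 0;
        return 0;
    }
    got = monitor_read(buf, n, &eof);
    if (eof && got > 0)
        uioch[fd].flag = 1;
    return got;
}
```

After the zero count the flag is clear again, so a later call goes back
to the monitor and can pick up new input.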
26-Aug-85 23:38:28-PDT,2426;000000000001
Mail-From: KRONJ created at 26-Aug-85 23:38:26
Date: 26 Aug 1985  23:38 PDT (Mon)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   [email protected]
Cc:   [email protected], [email protected]
Subject: kcc runtime lib
In-reply-to: Msg of Mon Aug 26 17:22:38 1985 from jld@sri-unix

First your specific proposals, and then some more general comments.

1.  Re: getting EOF from terminals.  You shouldn't need the flag.
Just check what type of device the JFN you're reading is to (I think
you do a DVCHR% or some such JSYS and then look for .DVTTY; if pipes
are ever done with PTYs we would want to be more careful) and if it's
a terminal do a SIN% with break on ^D and character count rather than
pure count.  Then you can return from the read() with a short count
that one time and it will look to the rest of the world like an EOF
(UNIX signals EOF by a short count, not necessarily by a zero count).
So this only needs to change the low level read, except that you
should also make sure stdio handles it correctly.

2. Re: getting EOF from strings in sscanf().  Sounds good.  I hadn't
realized _IOSTRG was useful for anything.

3. Re: making close of dup()ed fd's work correctly.  I thought some
more about this after I mentioned it to you and realized that the
scheme I described wouldn't work.  The problem is the memory map gets
blown away across exec(), and you can't know that you're running a C
program so you can't not let it get blown away.  What you need to do
is get the filename with JFNS, and open another JFN on that same name.
But I don't think there is any way to duplicate a pipe properly -- it
should be tried, but I don't think it will work (not to worry overmuch
though, since pipes are a local hack and could likely be improved to
some usable state).  And this would be complicated to get right for
output files in any case.


In general, the appropriate thing to do would be to make what changes
you think appropriate, and then send pointers to your files to Bug-KCC
so that someone can take a look at them and incorporate them into the
master sources.

As you mention I will not be doing much on KCC during the school year,
so I guess that someone would be Greg or Bill or Frank.  But in any
case you should also discuss these things with KLH, as he knows
probably more than I about the runtimes.
27-Aug-85 11:39:18-PDT,420;000000000001
Mail-From: WHP4 created at 27-Aug-85 11:37:54
Date: Tue 27 Aug 85 11:37:54-PDT
From: Bill Palmer <[email protected]>
Subject: extended addressing broken
To: [email protected]

Just for kicks, I tried making an extended version of KCC.  It seems to have
real problems parsing files that a non-extended version parses with ease.  It
also gets illegal memory reads.

Try it, you won't like it.

					Bill
27-Aug-85 11:54:10-PDT,1016;000000000001
Return-Path: <[email protected]>
Received: from SIMTEL20.ARPA by SU-SIERRA.ARPA with TCP; Tue 27 Aug 85 11:53:30-PDT
Date: Tue 27 Aug 85 12:52:48-MDT
From: Greg Titus <[email protected]>
Subject: Re: NMIMT v7 C (sharing)
To: [email protected]
In-Reply-To: Message from "Greg Satz <[email protected]>" of Mon 26 Aug 85 15:02:36-MDT
Message-ID: <[email protected]>

Well, I don't know how to make it any clearer -- $100 will give you the
right to use our code in any way you want:  hack it, sell it, throw it away,
laugh at it, whatever.  We're not offering the standard deal, here;  we're
offering (joint) ownership of the code.

I think (he said, objectively) that you ought to go for it.  You'd get a
lot of runtimes, along with some useful compiler stuff (like the syntactic
error recovery technique, for example).  You can probably find someone who
already owns our C who will let you drive it around a little and maybe even
peek under the hood ...

greg
-------
27-Aug-85 11:59:21-PDT,1349;000000000001
Mail-From: SATZ created at 27-Aug-85 11:58:01
Date: Tue 27 Aug 85 11:58:01-PDT
From: Greg Satz <[email protected]>
Subject: [Greg Titus <[email protected]>: Re: NMIMT v7 C (sharing)]
To: [email protected], [email protected]
Phone: (415) 497-1004

You may want to do something about this.
                ---------------

Return-Path: <[email protected]>
Received: from SIMTEL20.ARPA by SU-SIERRA.ARPA with TCP; Tue 27 Aug 85 11:53:30-PDT
Date: Tue 27 Aug 85 12:52:48-MDT
From: Greg Titus <[email protected]>
Subject: Re: NMIMT v7 C (sharing)
To: [email protected]
In-Reply-To: Message from "Greg Satz <[email protected]>" of Mon 26 Aug 85 15:02:36-MDT
Message-ID: <[email protected]>

Well, I don't know how to make it any clearer -- $100 will give you the
right to use our code in any way you want:  hack it, sell it, throw it away,
laugh at it, whatever.  We're not offering the standard deal, here;  we're
offering (joint) ownership of the code.

I think (he said, objectively) that you ought to go for it.  You'd get a
lot of runtimes, along with some useful compiler stuff (like the syntactic
error recovery technique, for example).  You can probably find someone who
already owns our C who will let you drive it around a little and maybe even
peek under the hood ...

greg
27-Aug-85 20:11:57-PDT,1280;000000000001
Return-Path: <jld@sri-unix>
Received: from sri-unix.ARPA by SU-SIERRA.ARPA with TCP; Tue 27 Aug 85 19:41:56-PDT
Received: by sri-unix.ARPA (4.12/4.16)
	id AA17866; Tue, 27 Aug 85 19:43:03 pdt
Message-Id: <[email protected]>
Date: Tue Aug 27 11:50:07 1985
From: jld@sri-unix
To: kronj@su-sierra
Cc: sarvela@sri-unix, satz@su-sierra
Subject: kcc runtime lib

	Thanx for your reply to my proposals for fixing/improving the
KCC runtime lib.  I will submit my improvements to Bug-KCC.

	By the way, it is not true that a short-count read signals EOF
in UNIX (see, e.g., K&R p.  160).  For example, in UNIXland most reads
from the terminal (in cooked mode) return short counts.  It is only on
the zero count that you know you have hit the wall.  However, TOPS20
signals EOF on the last short read; by the time you go back for the
zero count, the monitor has "forgotten" about the EOF, and the monitor
read looks for more input.  It is therefore up to the KCC runtime
system to somehow remember EOF on file descriptors between reads.  This
would be required even if terminal reads were set to break on ^D.  For
the sake of uniformity, all UNIX pgms should use the zero read as the
basis for finding EOF -- see the code for _fillbuf().

Jim Dein
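
The convention argued for above, that only a zero count means EOF, looks
like this in a caller.  A sketch only: reader_fn, read_all and the
canned line_reader are hypothetical names, and line_reader assumes the
caller's buffer can hold a whole line.

```c
#include <assert.h>
#include <stddef.h>

/* read()-like callback: >0 bytes read, 0 at EOF, <0 on error. */
typedef int (*reader_fn)(char *buf, int n);

/* Collect input until the reader returns 0.  A short count is NOT
   treated as end-of-file.  Returns the total byte count, or -1 on
   error or if `cap` bytes were filled before EOF was seen. */
int read_all(reader_fn rd, char *out, int cap)
{
    int total = 0, got;

    assert(rd != NULL && out != NULL && cap >= 0);
    while (total < cap) {
        got = rd(out + total, cap - total);
        if (got < 0)
            return -1;
        if (got == 0)
            return total;         /* the zero count is the EOF signal */
        total += got;
    }
    return -1;
}

/* Stand-in for a cooked-mode terminal: one line per call, then 0. */
static const char *lines[] = { "one\n", "two\n", NULL };
static int next_line;

int line_reader(char *buf, int n)
{
    const char *s = lines[next_line];
    int len = 0;

    if (s == NULL)
        return 0;
    while (s[len] != '\0' && len < n) {
        buf[len] = s[len];
        len++;
    }
    next_line++;
    return len;
}
```

Both lines come back as short counts relative to the request, yet the
loop stops only on the third call, which returns 0.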
27-Aug-85 20:13:34-PDT,550;000000000001
Mail-From: KRONJ created at 27-Aug-85 20:13:31
Date: 27 Aug 1985  20:13 PDT (Tue)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   [email protected]
Cc:   [email protected], [email protected]
Subject: kcc runtime lib
In-reply-to: Msg of Tue Aug 27 11:50:07 1985 from jld@sri-unix

Another way to remember EOF would be to BKJFN% the ^D if you got other
chars with it.  The overhead involved would not occur particularly
often and it would save having to add an end-of-file flag.
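
The nearest portable analogue of backing up over the ^D is ungetc().
The sketch below uses it on a stdio stream to show the idea; it is not
TOPS-20 code, and read_until_eot and CTRL_D are names chosen for this
illustration.

```c
#include <assert.h>
#include <stdio.h>

#define CTRL_D 004

/* Read up to n bytes from fp, stopping at ^D.  If ^D arrives after
   other characters, push it back so the NEXT call sees it alone and
   returns 0; no separate end-of-file flag is kept. */
int read_until_eot(FILE *fp, char *buf, int n)
{
    int c, got = 0;

    assert(fp != NULL && buf != NULL && n > 0);
    while (got < n && (c = getc(fp)) != EOF) {
        if (c == CTRL_D) {
            if (got > 0)
                ungetc(c, fp);    /* analogue of backing up the JFN */
            break;
        }
        buf[got++] = (char)c;
    }
    return got;
}
```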
29-Aug-85 00:11:53-PDT,2525;000000000001
Return-Path: <jld@sri-unix>
Received: from sri-unix.ARPA by SU-SIERRA.ARPA with TCP; Wed 28 Aug 85 17:24:06-PDT
Received: by sri-unix.ARPA (4.12/4.16)
	id AA28729; Wed, 28 Aug 85 17:24:52 pdt
Message-Id: <[email protected]>
Date: Wed Aug 28 10:46:17 1985
From: jld@sri-unix
To: kronj@su-sierra
Cc: jld@sri-unix, sarvela@sri-unix, satz@su-sierra, klh@sri-nic
Subject: kcc read

	[To KLH: the present ongoing discussion stems from the
discovery that KCC pgms reading from the terminal never see
end-of-file.  The problem is in _read(), which assumes "once-EOF,
always EOF", which is not true for terminals; thus KCC read() from a
terminal can never return a zero count.]

	There is a fundamental diff between the way UNIX and TOPS20
pgms sense EOF -- UNIX pgms must execute an "extra" read() that returns
a zero byte count.  Also, UNIX pgms must be able to continue reading in
the hope that more data may become available; this is true for disk
files as well as for terminals.  Note that the TOPS20 EOF may occur on
the zero-count read if the previous read consumed the last available
byte, or on the previous read() itself if not enough info was available
to satisfy the request.  To put it another way, read() may resume
getting chars from the terminal iff the previous read returned 0 bytes,
a piece of state information that cannot be gleaned from the monitor.

	Now, you suggest getting around the problem (for terminals) by
doing a SIN% breaking on ^D, then pushing the ^D back with BKJFN% if
other chars were read.  The shortcomings of that scenario are somewhat
esoteric: suppose we arrange to have 2 processes trying to read from
the terminal at the same time.  (Yes, this is occasionally useful!)
Suppose the first one reads some chars terminated by a ^D.  The read is
successful, and the ^D gets pushed back.  Now the second process sees
a naked ^D and returns a 0 count.  Oh dear, the wrong process got the
EOF.

	In any case, I believe you are right that terminal input must
somehow involve special code because of the ^D or ^Z or whatever, so
read() must know what kind of device it's reading from.  Other weird
devices may also need special code.  There is probably no point in
writing read() in assembler because the C code would be nearly as
efficient and much clearer.

	Summary:

		struct _uioch {
			short jfn;		/* JFN */
			short devcode;		/* device designator */
			short was_eof;		/* last read returned 0 */
		} _uioch[NFILES];

Jim Dein
29-Aug-85 11:28:11-PDT,385;000000000001
Mail-From: KRONJ created at 29-Aug-85 11:28:08
Date: Thu 29 Aug 85 11:28:08-PDT
From: David Eppstein <[email protected]>
Subject: _uioch[]
To: [email protected]

If you're going to add flags for each file, at least do it in a parallel array
so that not nearly so much code needs to be changed.  And do them with bits
rather than taking a whole word for each flag.
-------
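
One way the suggestion above might look, as a sketch: the existing table
keeps its type, and a second array indexed by the same fd carries flag
bits.  The names uioflg, UIO_EOF and UIO_TTY are assumptions, and the
tables are spelled without the leading underscore.

```c
#include <assert.h>

#define NFILES 32

#define UIO_EOF 01                /* last read hit end-of-file */
#define UIO_TTY 02                /* descriptor refers to a terminal */

int uioch[NFILES];                /* existing table: one JFN per fd */
unsigned char uioflg[NFILES];     /* parallel table: flag bits per fd */

void uio_set(int fd, int bits)
{
    assert(fd >= 0 && fd < NFILES);
    uioflg[fd] |= (unsigned char)bits;
}

void uio_clear(int fd, int bits)
{
    assert(fd >= 0 && fd < NFILES);
    uioflg[fd] &= (unsigned char)~bits;
}

int uio_test(int fd, int bits)
{
    assert(fd >= 0 && fd < NFILES);
    return (uioflg[fd] & bits) != 0;
}
```

Code that only reads uioch[fd] for the JFN compiles unchanged; only the
routines that care about the flags touch the second array.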
 2-Sep-85 10:15:00-PDT,2749;000000000001
Return-Path: <jld@sri-unix>
Received: from sri-unix.ARPA by SU-SIERRA.ARPA with TCP; Sat 31 Aug 85 21:50:27-PDT
Received: by sri-unix.ARPA (4.12/4.16)
	id AA08179; Thu, 29 Aug 85 14:53:26 pdt
Message-Id: <[email protected]>
Date: Thu Aug 29 11:41:54 1985
From: jld@sri-unix
To: [email protected]
Cc: jld@sri-unix, sarvela@sri-unix
Subject: Re: _uioch[]

	I know hindsight is easy, but probably _uioch[] should have
been a structure to begin with; after all, it *does* constitute the
per-process file table entry.  We expect to find other things to add to
it later.  The whole business of copying fd's on fork, and similar
matters, is easier to maintain if all the information is kept together.
In any case, other improvements we have in mind may also affect many
places in the code.  I am not quite sure how to introduce the changes
to bug-kcc -- I suppose the best thing to do is submit a summary and
invite others to ftp our new code and try it out.  Incidentally, our
source stuff is kept in <kcc.whatever> on host sri-stripe.

	As for the struct layout: for device type and JFN, we propose
using 1/2-wds, as with monitor calls; they would be fields defined as
int :18.  These short fields would be changed to type short when short
is shortened to 1/2 wd :<).

	Here are some of our murkier ideas.  It appears that ENQ/DEQ
could be used to control the multiple-close problem on fd's; the
"resource" would be _uioch->jfn.  We really need to have logically
different JFN's in each process, however, because each process has its
own idea of the file offset.  New JFN's would also be required on dup()
for the same reason.  It is not clear, however, whether it would always
be possible to get new equivalent jfn's for instantiations of old
objects -- that is the real sticker.

	I would like to state my (UNIX-biased) opinion that assembler
should be used only where absolutely necessary in KCC (and in
everything else, for that matter), mainly to buffer KCC against changes
in itself (change in sizeof(_uioch), e.g.).  I would like to see a
char7 type added to KCC to facilitate communication with the monitor.
Naturally, some routines would be just ridiculous if written in C, and
there is also the problem of buffering KCC against changes in the
monitor -- though perhaps an equivalent of "SEARCH MONSYM" could be
implemented, perhaps effected with #include <monsym.h> ??!  Naturally,
I am learning how to program in MACRO, if only to avoid the charge of
using C by default.

 ________________________

	If I don't see you before you go back to Columbia -- have a fun
year, and say hello to good old Morningside Heights for me.  Be sure to
give DEM your net addr.

Jim Dein
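
The half-word layout described above could be declared along these
lines.  A sketch with assumptions: the fields are unsigned int rather
than int so that values with the top bit set read back unchanged, the
struct tag and make_entry are hypothetical, and on a host with 32-bit
ints the two 18-bit fields occupy separate storage units instead of
sharing one 36-bit word.

```c
#include <assert.h>

#define HALFWORD_MAX 0777777u     /* largest 18-bit value */

struct uioch_entry {
    unsigned int jfn     : 18;    /* JFN, half word */
    unsigned int devcode : 18;    /* device designator, half word */
    unsigned int was_eof : 1;     /* last read returned 0 */
};

/* Build an entry, refusing values that do not fit in a half word. */
struct uioch_entry make_entry(unsigned int jfn, unsigned int devcode)
{
    struct uioch_entry e;

    assert(jfn <= HALFWORD_MAX && devcode <= HALFWORD_MAX);
    e.jfn = jfn;
    e.devcode = devcode;
    e.was_eof = 0;
    return e;
}
```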
 5-Sep-85 08:18:01-PDT,1655;000000000001
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Thu 5 Sep 85 08:17:37-PDT
Date: Wednesday, 4 September 1985  18:44-EDT
Message-ID: <[email protected]>
Sender: Ken Harrenstien <[email protected]>
From: Ken Harrenstien <[email protected]>
To: [email protected]
cc: [email protected]
Subject:   KCC code gen bug
ReSent-From: [email protected]
ReSent-To: Bug-KCC@Sierra
ReSent-Date: Thu 5 Sep 1985 11:19-EDT

I was playing around to see what KCC would generate for various code
sequences and found the following disturbing bug.  Here is the test
program I used:
--------------------------------------
#define FLGBIT 040
struct header {
	unsigned he_id : 18;
	unsigned he_tbl : 18;
	unsigned he_len : 18;
	unsigned he_fre : 18;
	unsigned he_byt : 8;
	unsigned he_bit : 1;
} head;
struct header *hptr;
main()
{	int a,b,c;
	hptr = &head;
	a = hptr->he_id;
	b = hptr->he_tbl;
	c = hptr->he_len;
	if(a != 0123) printf("Bad");
	if(b+c) printf("ok");	
	if(b&FLGBIT) b=1;
	b = 0;
	a = hptr->he_byt;
	return(a+b);
}
------------------------------------------
The field stuff is incidental, I think.  The problem is with the
test for FLGBIT, which produces:
	...
	MOVE 4,-1(17)		; Get B
	TRNE 4,40		; Test it
	 SKIPA 3,[1]		; If set, get val 1
	 TRNA 			; If not set, we, uh, we... uh, what the hell?
	SETZ 1,1		; We only get here if FLGBIT was set!
	MOVE 7,hptr
	LDB 6,[341007,,2]
	ADD 1,6			; B gets used here!
	ADJSP 17,-3
	POPJ 17,

Even though the C code is not well written, it should still always
leave B set to zero!
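
The expected behavior can be pinned down with a reduced version of the
test.  This is a sketch: after_flag_test is a hypothetical function that
keeps the same statement sequence but takes plain parameters in place of
the struct fields.

```c
#include <assert.h>

#define FLGBIT 040

/* Same statements as the report, with the fields passed in directly. */
int after_flag_test(int b, int byt)
{
    int a;

    if (b & FLGBIT)
        b = 1;
    b = 0;               /* unconditional: must win whatever the test did */
    a = byt;
    assert(b == 0);
    return a + b;        /* so the result is always just `byt` */
}
```

A compiler with the bug described above would return byt + 1, or trip
the assertion, for one of the two values of the flag bit.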
 7-Sep-85 17:14:13-PDT,698;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Sat 7 Sep 85 17:10:39-PDT
Date: Sat 7 Sep 85 17:11:52-PDT
From: Ken Harrenstien <[email protected]>
Subject: IMR at SCONST+15
To: [email protected]
cc: [email protected]

KCC dies messily if it tries to compile a file that has too many constant
strings within one structure, by getting an illegal mem ref at SCONST+15
which is where it does an IDPB using a BP in CLOOP.  CLOOP gets smashed
when the stuff in .CLOOP becomes too large.

(1) Can't this stuff be allocated dynamically?
(2) Should check for overflow.

I have a file which produces this error, if a test case is needed.
-------
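
Both suggestions can be combined in one routine: grow the buffer on
demand and refuse the append if that fails.  The sketch below is generic
C, not KCC's actual string-constant storage; struct strpool and
strpool_add are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct strpool {
    char  *data;                  /* allocated storage, or NULL */
    size_t used;                  /* bytes in use */
    size_t cap;                   /* bytes allocated */
};

/* Append len bytes, growing the pool as needed.  Returns 0 on success,
   -1 if memory runs out (the pool is left unchanged in that case). */
int strpool_add(struct strpool *p, const char *s, size_t len)
{
    assert(p != NULL && s != NULL && p->used <= p->cap);
    if (len == 0)
        return 0;
    if (len > p->cap - p->used) {         /* would overflow: grow first */
        size_t ncap = p->cap ? p->cap : 64;
        char *nd;

        while (ncap - p->used < len) {
            if (ncap > (size_t)-1 / 2)
                return -1;                /* size arithmetic would wrap */
            ncap *= 2;
        }
        nd = realloc(p->data, ncap);
        if (nd == NULL)
            return -1;
        p->data = nd;
        p->cap = ncap;
    }
    memcpy(p->data + p->used, s, len);
    p->used += len;
    return 0;
}
```

A pool starts out zeroed ({0, 0, 0}) and the caller frees p->data when
done.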
14-Sep-85 01:20:22-PDT,571;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Sat 14 Sep 85 01:15:36-PDT
Date: Sat 14 Sep 85 01:16:43-PDT
From: Ken Harrenstien <[email protected]>
Subject: unsigned int support
To: [email protected]
cc: [email protected]

Are there any plans to support "unsigned int"?  Currently the code
produced appears to treat unsigned ints exactly like signed ints.
This may seem to work a lot of the time because 35 bits tend to be
enough, but there are real portability problems, especially when
doing compares.
-------
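
A small sketch of the kind of compare where the difference shows up. It assumes a two's-complement machine, so that UINT_MAX and -1 share a bit pattern; the function names are made up for illustration and are not from KCC or its library.

```c
#include <assert.h>
#include <limits.h>

/* Compare as unsigned: a value with the top bit set is a large number. */
int greater_unsigned(unsigned int a, unsigned int b) { return a > b; }

/* What a compiler effectively computes if it treats unsigned as signed. */
int greater_signed(int a, int b) { return a > b; }

/* Returns 1 when the two interpretations of the same bits disagree. */
int compare_results_differ(void)
{
    unsigned int big = UINT_MAX;  /* all value bits set */
    int as_signed = -1;           /* same bits on two's complement */

    assert(greater_unsigned(big, 1u));
    return greater_unsigned(big, 1u) != greater_signed(as_signed, 1);
}
```

Values that never set the top bit compare the same either way, which is why the problem can go unnoticed for a long time.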
23-Sep-85 20:11:09-PDT,994;000000000001
Return-Path: <[email protected]>
Received: from SIMTEL20.ARPA by SU-SIERRA.ARPA with TCP; Mon 23 Sep 85 20:11:03-PDT
Date: Mon, 23 Sep 1985  21:07 MDT
Message-ID: <[email protected]>
From: "Frank J. Wancho" <[email protected]>
To:   BUG-KCC@SIERRA
cc:   [email protected]
Subject: Another off-the-wall idea

Would there be any merit in trying to compile gnuemacs with KCC?  I
realize it may be an interesting exercise for the compiler.  But, the
real question is would the result, once successfully compiled and
linked, be a worthwhile program in the TOPS-20 environment that
already has the "original" semi-abandoned EMACS?

I suspect the answer may be "I don't know, why don't you go ahead and
try it".  However, what I'd like to know is, given that you
already know the "inefficiencies" of a TECO-based EMACS, would a
C-based EMACS be any "better"?  It would help if comments were based
on some knowledge of using gnuemacs...

--Frank
24-Sep-85 14:06:03-PDT,961;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Tue 24 Sep 85 14:05:46-PDT
Date: Tue 24 Sep 85 14:01:42-PDT
From: Ken Harrenstien <[email protected]>
Subject: Re: Another off-the-wall idea
To: [email protected], [email protected]
cc: [email protected]
In-Reply-To: Message from ""Frank J. Wancho" <[email protected]>" of Mon 23 Sep 85 20:08:54-PDT

Personally, I think the only immediate advantages that a C-based EMACS might
provide over the real EMACS are (1) ability to edit huge files, and
(2) better redisplay.  For some time I have planned to try bringing
up my ELLE editor on TOPS-20, partly just for fun and partly so that
I will be able to edit files of any size whatsoever when that becomes
necessary.  I do not know whether GNU emacs requires that the entire
file be mapped in or not.  I suspect it does.  However, with extended
addressing this might not be too severe a limitation.
-------
24-Sep-85 23:44:32-PDT,359;000000000001
Mail-From: ROODE created at 24-Sep-85 23:44:26
Date: Tue 24 Sep 85 23:44:25-PDT
From: David Roode <[email protected]>
Subject: math/trig routines
To: [email protected]

Has anyone thought of building an interface to call math routines
from the fortran library in KCC?  Are any of the routines in
C:MATH.H available for KCC anywhere?
-------
30-Sep-85 04:21:49-PDT,450;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Mon 30 Sep 85 04:20:18-PDT
Date: Mon 30 Sep 85 04:16:02-PDT
From: Ken Harrenstien <[email protected]>
Subject: KCC status?
To: [email protected]
cc: [email protected]

What is the status of KCC at this point?  I would like to start another
round of intensive hacking (mostly the C library) but need to know if
things are quiescent or not.
-------
30-Sep-85 08:09:05-PDT,596;000000000001
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Mon 30 Sep 85 08:05:12-PDT
Date: 30 Sep 1985  11:00 EDT (Mon)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Ken Harrenstien <[email protected]>
Cc:   [email protected]
Subject: KCC status?
In-reply-to: Msg of 30 Sep 1985  07:16-EDT from Ken Harrenstien <KLH at SRI-NIC.ARPA>

I'm certainly not doing anything, and likely won't be doing much until
at least next summer.  I can't speak for the people at Sierra...
 3-Oct-85 07:31:02-PDT,724;000000000001
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Thu 3 Oct 85 07:29:13-PDT
Date: Thursday, 3 October 1985  01:30-EDT
Message-ID: <[email protected]>
Sender: Andrew "VaxBuster" Gideon <[email protected]>
From: Andrew "VaxBuster" Gideon <[email protected]>
To: [email protected]
Subject:   KCC question
Office-Phone: (415) 497-4816
ReSent-From: [email protected]
ReSent-To: bug-c-archive@sierra
ReSent-Date: Thu 3 Oct 1985 10:25-EDT

David:

Is there any way to switch stdin & stdout from seven bit to eight bit?

	Andy


P.S.	If I were to close/open them to do this, what file name would
	be appropriate?
 3-Oct-85 07:32:40-PDT,884;000000000001
Return-Path: <[email protected]>
Received: from COLUMBIA-20.ARPA by SU-SIERRA.ARPA with TCP; Thu 3 Oct 85 07:32:37-PDT
Date: 3 Oct 1985  10:28 EDT (Thu)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Andrew "VaxBuster" Gideon <[email protected]>
Subject: Way to switch stdin & stdio from seven bit to eight bit
In-reply-to: Msg of 3 Oct 1985  01:30-EDT from Andrew "VaxBuster" Gideon <A.ANDY at SU-GSB-HOW.ARPA>

The simple answer is that you're out of luck.

The more complicated answer is that in certain cases (i.e. not pipes)
you can do a JFNS% on .PRIIN or .PRIOU, and reopen that file.  If you
think that sounds more like assembly language than C, you're right.

By the way, I'm off being a student again, so you would do better to
send questions and bug reports to BUG-KCC@SIERRA.
 8-Oct-85 09:45:41-PDT,653;000000000001
Mail-From: SATZ created at  8-Oct-85 09:45:32
Date: Tue 8 Oct 85 09:45:32-PDT
From: Greg Satz <[email protected]>
Subject: [Sierra]PS:<KCC.DBUG>
To: [email protected]
Phone: (415) 497-1004

I found a debugging system on Usenet a while ago and recently came
across it in my archives. Today, I just brought it up on Sierra in the
above directory and it seems to work. Documentation and test programs as
well as the compiled library exist in this directory.

This type of debugger requires you to add extra instrumentation into
your program to glean any benefit. However, this is better than what we
currently have. Enjoy...
-------
23-Oct-85 19:31:08-PDT,455;000000000001
Return-Path: <[email protected]>
Received: from SIMTEL20.ARPA by SU-SIERRA.ARPA with TCP; Wed 23 Oct 85 19:28:54-PDT
Date: Wed, 23 Oct 1985  20:24 MDT
Message-ID: <[email protected]>
From: "Frank J. Wancho" <[email protected]>
To:   BUG-KCC@SIERRA
Subject: FOPEN and generation numbers

Is it a known bug that if you give fopen a filename with an explicit
generation number, fopen returns a NULL (failure)?

--Frank
 1-Nov-85 11:13:57-PST,344;000000000001
Mail-From: SATZ created at  1-Nov-85 11:13:37
Date: Fri 1 Nov 85 11:13:36-PST
From: Greg Satz <[email protected]>
Subject: DECUS version of cpp in <KCC.CPP> on Sierra
To: [email protected]
Phone: (415) 497-1004

I put a copy of the DECUS public domain version of the C preprocessor
on sierra in PS:<KCC.CPP>. Have fun.
-------
 4-Nov-85 14:17:05-PST,998;000000000001
Return-Path: <[email protected]>
Received: from CS.COLUMBIA.EDU by SU-SIERRA.ARPA with TCP; Mon 4 Nov 85 14:15:16-PST
Date: 4 Nov 1985  17:12 EST (Mon)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   "Joel P. Bion" <[email protected]>
Subject: Extended addressing in C
In-reply-to: Msg of 4 Nov 1985  17:09-EST from Joel P. Bion <JPBion at SU-SIERRA.ARPA>

    Date: Monday, 4 November 1985  17:09-EST
    From: Joel P. Bion <JPBion at SU-SIERRA.ARPA>

    How do I get it to "turn on?" I've got examples of code in W.C and S.C
    in my directory, both of which generate rather odd results...

You can use /USE-SECTION in various EXEC commands, or you can deposit
the number 1 in $EXADF.  It is possible that extended addressing has
at some point been broken, in which case you should probably send mail
to Bug-KCC rather than me (I'm being a student right now and therefore
not looking at KCC at all).
11-Nov-85 10:36:03-PST,1031;000000000001
Return-Path: <[email protected]>
Received: from CS.COLUMBIA.EDU by SU-SIERRA.ARPA with TCP; Mon 11 Nov 85 10:34:54-PST
Date: Saturday, 9 November 1985  20:15-EST
Message-ID: <[email protected]>
Sender: Bill Palmer <[email protected]>
From: Bill Palmer <[email protected]>
To: [email protected], [email protected]
Subject:   drystones and KCC
ReSent-From: [email protected]
ReSent-To: bug-c-archive@sierra
ReSent-Date: Mon 11 Nov 1985 13:32-EST

After seeing a Dhrystone program on net.arch or something a few days ago,
I decided to try it out on KCC.  The NMT C compiler apparently benchmarks
at around 2000 /sec and KCC got slightly above 2600.  Just for kicks, I
also ran a copy on the new Systems Concepts machine at LOTS, which got
just over 4100.  

If you didn't see the original articles, the 8600 got 7100 and some flavor
of Sun-3 got 3500-4100 depending on whether or not register variables were
used.  Some Amdahl beast knocked off 23,000.

					Bill
13-Nov-85 14:17:29-PST,602;000000000001
Return-Path: <[email protected]>
Received: from CS.COLUMBIA.EDU by SU-SIERRA.ARPA with TCP; Wed 13 Nov 85 14:16:35-PST
Date: Wed 13 Nov 85 17:16:10-EST
From: David Eppstein <[email protected]>
Subject: getopt
To: [email protected]

I found a public domain version of getopt() lying around here (apparently
acquired from mod.sources) and since there didn't seem to already be such
a thing for KCC I left the sources for this one on <KCC.LIB> on Sierra.
I know of no reason why it shouldn't work as is.  Maybe someone should look
at it and include it in CLIB.REL?
-------
13-Nov-85 14:50:26-PST,807;000000000001
Return-Path: <[email protected]>
Received: from SIMTEL20.ARPA by SU-SIERRA.ARPA with TCP; Wed 13 Nov 85 14:47:01-PST
Date: Wed, 13 Nov 1985  15:46 MST
Message-ID: <[email protected]>
From: "Frank J. Wancho" <[email protected]>
To:   David Eppstein <[email protected]>
Cc:   [email protected]
Subject: getopt
In-reply-to: Msg of 13 Nov 1985  15:16-MST from David Eppstein <Eppstein at CS.COLUMBIA.EDU>

There are at least two other versions available.  See PD:<UNIX.UTILS1>
here for GETOPT.*.  You may wish to scan PD:<UNIX*> for other files of
possible interest, mostly culled from net.sources and more recently,
mod.sources.  There is an announcement list of what's new/changed.
Send requests to get on the list to UNIX-SW-REQUEST@SIMTEL20.

--Frank
19-Nov-85 11:43:39-PST,430;000000000001
Return-Path: <[email protected]>
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Tue 19 Nov 85 11:41:33-PST
Date: Tue, 19 Nov 1985  11:41 PST
Message-ID: <IAN.12160550924.BABYL@SRI-NIC>
From: Ian Macky <Ian@SRI-NIC>
To:   bug-kcc@sierra
Subject: conditional bug


	if (1) ; else <statement>

takes the else branch, whether the if branch is empty or not.  Something
like int i = 1; followed by if (i) works, however...
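
A sketch of a test case for the two forms described. The function names are hypothetical; each one returns 1 if the else branch ran and 0 otherwise, so a correct compiler returns 0 from both.

```c
#include <assert.h>

/* Constant condition: the form reported as taking the else branch. */
int else_taken_constant(void)
{
    int taken = 0;

    if (1)
        ;
    else
        taken = 1;
    return taken;
}

/* Same test through a variable: the form reported as working. */
int else_taken_variable(void)
{
    int i = 1;
    int taken = 0;

    if (i)
        ;
    else
        taken = 1;
    return taken;
}

/* Hypothetical checker: neither form may take the else branch. */
void check_else_branches(void)
{
    assert(else_taken_constant() == 0);
    assert(else_taken_variable() == 0);
}
```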
21-Nov-85 08:32:13-PST,544;000000000001
Return-Path: <[email protected]>
Received: from CS.COLUMBIA.EDU by SU-SIERRA.ARPA with TCP; Thu 21 Nov 85 08:32:07-PST
Date: 21 Nov 1985  11:32 EST (Thu)
Message-ID: <[email protected]>
From: David Eppstein <[email protected]>
To:   Ken Harrenstien <[email protected]>
Subject: KCC crunch imminent
In-reply-to: Msg of 20 Nov 1985  18:52-EST from Ken Harrenstien <KLH at SRI-NIC.ARPA>

There is already a varargs.h, or did you mean actually using it?
The write lock is of course all right by me.
21-Nov-85 08:33:06-PST,1586;000000000001
Return-Path: <[email protected]>
Received: from CS.COLUMBIA.EDU by SU-SIERRA.ARPA with TCP; Thu 21 Nov 85 08:32:40-PST
Date: Wed 20 Nov 85 15:52:57-PST
Sender: Ken Harrenstien <[email protected]>
Return-Path: <@SU-SIERRA.ARPA:[email protected]>
Received: from SU-SIERRA.ARPA by CS.COLUMBIA.EDU with TCP; Wed 20 Nov 85 23:09:36-EST
Received: from SRI-NIC.ARPA by SU-SIERRA.ARPA with TCP; Wed 20 Nov 85 15:59:55-PST
From: Ken Harrenstien <[email protected]>
Subject: KCC crunch imminent
To: [email protected], [email protected]
cc: [email protected], [email protected]
Message-ID: <[email protected]>
ReSent-From: [email protected]
ReSent-To: bug-c-archive@sierra
ReSent-Date: Thu 21 Nov 1985 11:32-EST

	Ian has located a public C math library, but before we can use
it there are several things that need to be fixed in KCC, for example
its handling of floating-point constants.  They should be doubles but
are not being handled with sufficient precision.  As long as we are at
it, I think now is the time that we (NIC) should exert a write lock and
apply several other fixes and improvements to KCC and the library that
have been piling up.
	If no one else has intentions of touching the stuff for the
next month or so, this shouldn't cause any problems.  Is this a
correct assumption?  Also, if you like, we could probably incorporate
some other additional features/fixes in the process (varargs.h for
example) if you let us know what they are.  Send comments... send them
faster if you don't like the write-lock idea.

--Ken
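
One way to see whether floating-point constants are kept at double precision is sketched below. It assumes that float is narrower than double and that 0.1 is not exactly representable in either format, both true of binary floating point generally; the function names are hypothetical and this is not a test taken from KCC.

```c
#include <assert.h>

/* Returns 1 if an unsuffixed constant carries more precision than the
   same value rounded to float first. */
int constant_keeps_double_precision(void)
{
    float narrow = 0.1f;      /* rounded to float precision */
    double widened = narrow;  /* widening adds no new bits  */
    double full = 0.1;        /* unsuffixed constant: type double */

    return full != widened;
}

/* Hypothetical checker for the property above. */
void check_constant_precision(void)
{
    assert(constant_keeps_double_precision());
}
```

A compiler that stores its constants in single precision would make the two values equal, and the function would return 0.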