Trailing-Edge
-
PDP-10 Archives
-
SRI_NIC_PERM_FS_1_19910112
-
kccdist/lib/usys.doc
There are 9 other files named usys.doc in the archive. Click here to see a list.
USYS.DOC - KCC Un*x System-call simulation
This file documents various things about the USYS library routines,
which are intended to provide simulation and support for Un*x system
call functions.
Specifications for the interfaces were taken from the March
1984 4.2BSD Unix Programmer's Manual (UPM), plus the 4.3BSD man pages,
and all code here works as described in those references unless
otherwise documented.
Implementors should also read the CODSYS.DOC file in the
source directory.
Contents:
Intro - listing format
Summary of USYS routines
Definitions
Individual Routine Descriptions (as needed)
Library function listing format:
Name Module Port-status Comments
(routine name) (source file) (see below)
Port-status code:
E = file #includes "c-env.h" for environment config.
<sys> - runs on the given sys, one of: T20,10X,T10,WAITS,ITS.
*10 = portable to all PDP-10 OS (T20,10X,T10,WAITS,ITS)
* = fully portable (either no OS-dependent stuff, or a
fully-portable conditional exists)
Comments:
"U" means the call is USYS_macro enclosed and cannot be
interrupted by signals.
"-" means it isn't enclosed
(this better be cuz interrupts don't affect the call!)
"I" means it is interruptible (i.e. can return EINTR error).
"IC" means interrupts may be continued within the call.
USYS function summary: Src: lib/usys/
Name Module Port-status Comments
access ACCESS T20,10X U 10X only partial.
alarm ALARM T20 U
brk SBRK *10 U
chdir CHDIR T20,10X U
chmod CHMOD T20,10X U
chown CHOWN T20,10X U
close CLOSE T20,10X,ITS U
creat OPEN T20,10X U
dup DUP *10 U
dup2 DUP *10 U
errno (data) URT *10 -
exec[lv][ep] FORK T20,10X U
exit EXIT *10 -
fchmod CHMOD T20,10X U
fchown CHOWN T20,10X U
fcntl FCNTL *10 U
fork FORKEX T20,10X U
forkexec FORKEX T20,10X U KCC-specific routine.
fstat STAT T20,10X U (also: xfstat)
ftime TIME *10 -
geteuid GETUID T20,10X U
getpid GETPID *10 U see format note.
gettimeofday TIME *10 -
getuid GETUID T20,10X U
gtty SGTTY T20 U
ioctl IOCTL T20 UIC Partial.
kill SIGVEC T20,10X U
lseek LSEEK T20,10X U
open OPEN T20,10X U (Uses BSD flags; mode not supported)
pause PAUSE *10 -I Always returns with EINTR
pipe PIPE T20 U (monitor must have PIP: device)
psignal PSIGNA *10 -
raise SIGVEC T20,10X U (ANSI function, not syscall)
read READ T20,10X U
rename RENAME T20,10X U
sbrk SBRK *10 U
sigblock SIGVEC T20,10X U
signal SIGNAL T20,10X U
sigpause SIGVEC T20,10X UI Always returns with EINTR
sigreturn SIGVEC T20,10X U
sigsetmask SIGVEC T20,10X U
sigstack SIGVEC T20,10X U
sigvec SIGVEC T20,10X U
sleep SLEEP *10 -I (returns no value)
stat STAT T20,10X U (also: xstat)
stty SGTTY T20 U
tell LSEEK T20,10X U
time TIME *10 - (also: tadl_xxx routines)
times TIMES *10 -
unlink UNLINK T20,10X,ITS U (10X doesn't expunge)
utime UTIME T20,10X U
vfork FORK T20,10X U
wait WAIT T20,10X UIC
write WRITE T20,10X,ITS UIC
_exit EXIT *10 U
_runtm URT *10 (internal) C programs start here.
_urtsud (data) URTSUD *10 (internal) Runtime startup defs.
DEFINITIONS:
The UPM introduction contains some definitions which provide
a convenient way to start describing how the KCC simulations differ from
Un*x; some concepts are supported and others are not.
Process ID (PID): Supported, see long description.
Parent Process ID: Supported, see long description.
Process Group ID: not implemented
TTY Group ID: not implemented
Real User ID: Supported.
The user ID on T20/10X is the "user number".
Real Group ID: not implemented
Effective UID, GID, and Access Groups: not implemented
Super-user: not implemented
Special Processes: not implemented
File Descriptor (FD): Supported.
Small non-negative integers in the range 0 to 63 inclusive.
FDs 0, 1, and 2 correspond to standard input, output, and error output.
On T20/10X these are initially set to the JFNs .PRIIN, .PRIOU, .CTTRM.
You can obtain the JFN for a FD by using the fcntl() call.
You can assign a JFN to a FD by using the open() call with O_SYSFD.
File Name: Supported.
On T20/10X, up to 39 chars per component. Must be 7-bit ASCII.
Can quote with ^V.
Path Name: Supported.
Un*x style / paths are permitted, where foo/ is taken to mean foo
is a subdirectory of the current directory.
If the monitor worked right, a filespec of the form "C:foo/bar.h" could
be turned into "C:<.foo>bar.h", assuming standard-type relative
directory fixes, and we would win. But no. Instead, we have to do
the work manually: the logical device is recursively expanded until
an end device-directory pair is found, at which point the file
lookup is tried. If it fails, the expansion/traversal continues.
Directory: Supported.
<.> and <..> work on some T20 systems (Stanford mods).
Root Directory and Current Working Directory:
A directory of the form "/foo" is taken to mean "<foo>".
The notion of a current working directory is supported.
File Access Permissions:
Each set of T20/10X owner, group, and world access bits
corresponds to a set of Un*x owner, group, and other bits.
Un*x T20/10X
04 040 Read access
02 020 Write access
01 010 Execute access
- 04 Append access
- 02 Directory listing access
- 01 -
Thus, a call such as "chmod(foo.bar, 0644)" will set the T20/10X
protection of "foo.bar" to 604040.
There is no way a user program can either read or set the last
three T20/10X protection bits, except by doing a CHFDB% itself.
Finally, there is no T20/10X counterpart to the set-UID or set-GID
bits.
Sockets and Address Families: not implemented (yet!)
ERRORS:
The global "errno" is set by all failing USYS calls.
The standard UN*X error numbers from <errno.h> are used where possible,
although a few T20-specific error codes have been defined. See <errno.h>
for details.
There is currently no easy way to find out exactly what
TOPS-20/TENEX error (if any) caused errno to be set. The best one can
do is find the most recent error for the process with GETER%; perhaps
someday this will be improved. Note however that the "strerror" function
can be furnished with an argument of -1 which will cause it to return a
string describing the last TOPS-20/TENEX error. Since this is not a
widespread convention, it is probably best to define a LASTSYSERR
macro such as the following for portability:
#include <errno.h>
#ifndef _ERRNO_LASTSYSERR /* If have capability, */
#define LASTSYSERR -1 /* use this arg to get last sys err */
#else
#define LASTSYSERR errno /* else just do std unix stuff */
#endif
This definition can then be given as an argument to strerror(), as in
fprintf(stderr, "Failed - %s\n", strerror(LASTSYSERR));
Exceptions (never set errno):
ftime, time, gettimeofday, times
getpid
_exit (never returns)
PROCESS ID (PID): Long Description
PID values are generated by:
(1) getpid() - self process ID. This must not change over
the lifetime of the process!
(2) fork() - to identify the child process.
(3) wait() - to identify the child process that stopped.
This should match the value returned by fork().
PID values are used by:
(1) kill() - to send signals to self, child, or parent.
(It is rare to send them anywhere else.)
(2) Code that checks the return value of wait().
(3) Code that generates unique filenames, port numbers, or the like
which should not conflict with those of any other
active process.
For ITS, TOPS-10, and WAITS the PID is simply the job #, a
small positive integer; zero and negative PIDs will never be seen, as
job 0 is the monitor itself and no system can support 2**35 jobs.
However, TENEX (and hence TOPS-20) has never had a notion of a
unique process identifier, except internally inside the monitor; this
fork ID is simply not accessible to the user. Fork handles are all
relative, in an obscene attempt to prevent programs from referencing
any process they shouldn't. This makes it impossible to implement
getpid() in a straightforward way. The subterfuge I have resorted to
is as follows:
T20 PID = <IPCF PID>,,<frk #><job #>
<IPCF PID> - Left half of a PID generated by MUTIL% for process
This is guaranteed by system to be unique.
<frk #> - low 9 bits of relative fork handle, in 0777000.
<job #> - low 9 bits of job number, in 0777.
getpid() remembers the value generated on first call and
returns that thereafter. This satisfies the uniqueness and constancy
criterion, as well as being efficient.
fork() and wait() convert the relative fork handles from
CFORK% and GFRKS% to a child PID with a zero LH but with the other
fields set. Since relative fork handles are from 400000 to 400777, we
only need the low 9 bits.
fork() in the new child process copies the saved getpid()
value, if any; this is its parent's PID and may be used by kill(). The
saved value is then cleared so if the child calls getpid() it will generate
its own unique value.
kill() checks its PID argument first against the saved
getpid() value to see if a signal is being sent to itself. If not it then
sees whether it matches that of its parent (if any) and sends a signal
to .FHSUP if so. Otherwise, if the LH is 0 it assumes the signal is
being sent to a child, and generates the appropriate relative fork
handle from the 9 bits in the PID value. Note: There is no good way
to identify "miscellaneous" signals generated by another process (PSIs
on the "CHNmisc" channel); only those PSIs uniquely mapped to a single
signal can be successfully sent between processes.
This scheme fails only if PIDs are somehow passed from one
process to another either via pipe, file, or vfork() shared memory, since
the result of a child's getpid() won't match what its parent's fork()
returned. But this should practically never happen.
TENEX:
On TENEX, which doesn't have IPCF, we just use GFRKS% to
locate our fork within the job fork structure and hope the resulting
number, which we stick in the LH, doesn't change. At least TENEX
doesn't have extended addressing either so we can munch the GFRKS%
data on the stack.
OPEN() - Some I/O details:
On TOPS-20/TENEX, the open() call basically derives a filename
from the given pathname (which may be in either TOPS-20 or Un*x syntax)
and applies GTJFN% and OPENF% to it. However, in order to avoid some
frustrating problems with "invalid simultaneous access" and "file busy"
errors, a successful OPENF% when writing a new file is followed
immediately by a CLOSF% and another OPEN%, which has the effect of
immediately making the file "real" and visible to other processes. The
main difference with this strategem becomes visible only if the process
happens to be killed or reset prematurely; if so, the files it was
writing will still be left around and have zero length, whereas for older
versions of open() (or non-C programs) the files would not exist at all.
The open() call has several additional flags which are intended to
help specify the proper actions on TOPS-20/TENEX systems, since the
defaults assumed by open() may not always be correct.
Standard BSD flags:
O_RDONLY open for reading only
O_WRONLY open for writing only
O_RDWR open for reading and writing
O_NDELAY do not block on open (not supported)
O_APPEND append on each write
O_CREAT create file if it does not exist
O_TRUNC truncate size to 0
O_EXCL error if create and file exists
KCC-specific flags (not portable)
O_BINARY Open in binary (9-bit byte) mode
O_CONVERTED Force LF-conversion
O_UNCONVERTED Force NO LF-conversion
O_BSIZE_7 Force 7-bit bytesize
O_BSIZE_8 Force 8-bit bytesize
O_BSIZE_9 Force 9-bit bytesize
O_SYSFD 1st arg is a system-dependent I/O handle
(on T20/10X, a JFN)
TOPS-20 and TENEX specific flags
O_T20_WILD Allow wildcards on GTJFN%
O_T20_WROLD For writes, do NOT use GJ%FOU
O_T20_SYS_LOG logical device is system-wide!
O_T20_THAWED Open file for thawed access
The BSD flags behave as per the UPM documentation, and the T20
flags are fairly straightforward. The KCC flags however are more
subtle; they affect the characteristics of the I/O that will be done,
rather than how a file will be found or created. The two decisions
that must be made are: (1) Bytesize, and (2) LF-conversion. These are
explained below.
BYTESIZE:
The decision of which bytesize to use for I/O is somewhat
complicated. The bytesize on UN*X is always 8 bits, but on PDP-10s it
can be anything from 0 to 36 bits. The algorithm we use is as follows:
If a byte size (one of 7, 8, or 9) is explicitly requested, use that.
Otherwise, for a new file, use 9 if O_BINARY, else 7.
for an old file, use the file's bytesize.
A size of 0 or 36 is treated as for a new file.
Any other size is simply used. If this is not
one of 7, 8, or 9 then the results are unpredictable.
LF-CONVERSION:
UN*X text files use the convention that a LF alone is a
"newline", whereas PDP-10 systems use CRLF together. Thus, the normal
mode of I/O uses LF-conversion, wherein read() converts a input CRLF
sequence to LF, and write() converts an output LF to CRLF. The algorithm
used to determine whether LF-conversion should be done is:
Conversion is normally only done if the bytesize is 7.
Any other size implies NO conversion.
However, this default can be overriden by certain flags:
If O_CONVERTED is set, conversion is ALWAYS done.
If O_UNCONVERTED or O_BINARY is set, conversion is NEVER done.
O_SYSFD:
This is a semi-portable flag which allows the user to bind a
system-dependent I/O channel or JFN to a Un*x-style FD. When set, the
first argument must be this "system FD" cast to a char pointer, rather
than a real char pointer identifying a pathname. On TOPS-20 for example
it could be used in this way:
fd = open((char *)jfn, O_SYSFD|O_RDONLY, mode);
The "mode" argument is not currently used for anything on
TOPS-20/TENEX.
LSEEK() - Problems with LF-conversion
lseek() deals only with system-level file pointers. When no
LF-conversion is being done, this corresponds exactly to the UN*X notion
of a file position, namely the # of bytes offset from the start of the file,
and it is possible to create your own file positions arithmetically.
However, when LF-conversion is being done then this is not possible;
the position returned by lseek will correspond to the system's position,
rather than to the number of bytes fed through read() or write(). In this
case you can only lseek to a position previously returned by lseek() itself.
(Note that 0 is a special case that always works). Typically the pointer
returned will be larger than the number of bytes read or written thus far,
since the system is aware of the CR's in the file even though the C program
isn't.
STAT() - file status information
This section describes the correspondence between the components
of the stat() structure (as defined for Un*x) and the TOPS-20 file system
information.
struct stat
{
dev_t st_dev; /* The .DVxxx device type */
ino_t st_ino; /* .FBADR - Disk address of file index blk */
unsigned int st_mode; /* Un*x-style mode bits */
int st_nlink; /* 1 (always) */
int st_uid; /* T20: User #, 10X: directory # */
int st_gid; /* 0 (always, for now) */
dev_t st_rdev; /* 0 (always, for now) */
off_t st_size; /* .FBSIZ - size in bytes (any bytesize) */
time_t st_atime; /* .FBREF - last ref (Un*x format time) */
int st_spare1;
time_t st_mtime; /* .FBWRT - last write (Un*x format time) */
int st_spare2;
time_t st_ctime; /* .FBCRE - last mod (Un*x format time) */
int st_spare3;
long st_blksize; /* # bytes in a page (derived from FB%BSZ) */
long st_blocks; /* # pages in file (FB%PGC of .FBBYV) */
long st_spare4[2];
};
There is one case in which fstat() may return incorrect information. On
TOPS-20 the file size cannot be obtained from the FDB if the file has just
been written and not yet closed. The fstat() code attempts to figure out the
size anyway by using RFPTR% and the USYS internal variables, but this is
not guaranteed to be correct.
FORKEXEC() - New KCC-specific call
This call is intended to combine the functions of fork() and
exec() so that a user program that wants to perform the very common
procedure of first calling fork() and then having the child call exec()
can now simply use forkexec() and accomplish the same thing much faster.
The calling sequence is simply:
#include <frkxec.h>
int forkexec(fxp);
struct frkxec *fxp;
See the include file for details on the contents of the frkxec
structure and the flags that can be provided.
All of the exec*() functions call forkexec() with FX_NOFORK set.
TTY Handling - IOCTL, GTTY, STTY
The library supports many (though not all) of the Un*x TTY
functions. The primary means of getting information about the TTY and
setting TTY parameters is with the ioctl() call. All 4.3BSD TTY-related
ioctl functions are recognized, although not all are completely supported.
In particular, all requests to "get" data structures will always return
as much information as is available; attempting to "set" some elements of
these structures may or may not work, as described in the following comments.
IOCTL function comments:
FIONREAD - Get # bytes to read on FD. Supported.
TIOCGETP - Get sgttyb parameters, same as V6/V7 gtty(). Supported.
TIOCSETP - Set sgttyb parameters, same as V6/V7 stty(). Supported.
sg_ispeed, sg_ospeed Can read and set.
sg_erase Cannot set to anything but DEL (fails if you try).
sg_kill Cannot set to anything but ^U (fails if you try).
sg_flags The following flags are used:
RAW, CRMOD, ECHO, CBREAK
All other flags are ignored, esp. LCASE and TANDEM.
TIOCSETN - V7: same as TIOCSETP, but without flushing TTY input. Supported.
TIOCEXCL - V7: set exclusive use of tty (not implemented).
TIOCNXCL - V7: reset exclusive use of tty (not implemented).
TIOCHPCL - V7: Hang up on last close (not implemented).
TIOCFLUSH - V7: Flush TTY input and output buffers. Supported.
All other functions are for BSD4.3.
TIOCSTI - Simulate terminal input. Supported.
TIOCSBRK - Set break bit. (not implemented)
TIOCCBRK - Clear break bit. (not implemented)
TIOCSDTR - Set data terminal ready. (not implemented)
TIOCCDTR - Clear data terminal ready. (not implemented)
TIOCGPGRP - Get pgrp of tty. (not implemented)
TIOCSPGRP - Set pgrp of tty. (not implemented)
TIOCGETC - Get special characters (tchars). Supported.
TIOCSETC - Set special characters (tchars). Supported (sort of).
t_intrc and t_quitc (for SIGINT and SIGQUIT) are initially -1 but
can be set to any control character. Note that because
chars are unsigned, the initial value when converted to
an integer is 0777, not -1!
No other elements of tchars can be set to anything but what they
already are:
t_stopc = ^S, t_startc = ^Q, t_eofc = ^Z, t_brkc = -1
TIOCLBIS - Set bits in local mode word. (not implemented)
TIOCLBIC - Clear bits in local mode word. (not implemented)
TIOCLGET - Get local mode mask. (not implemented)
TIOCLSET - Set local mode mask. (not implemented)
TIOCSLTC - Set local special chars (ltchars). Supported.
TIOCGLTC - Get local special chars (ltchars). Supported (sort of).
None of these chars can be set to anything but what they already are:
t_suspc = ^C, t_dsuspc = ^C, t_rprntc = ^R, t_flushc = ^O,
t_werasc = ^W, t_lnextc = ^V
TIOCGETD - Get line discipline. Supported.
TIOCSETD - Set line discipline. Supported (sort of).
The line discipline is always NTTYDISC and cannot be set otherwise.
TIOCGWINSZ - Get window size info. Supported.
TIOCSWINSZ - Set window size info. Supported.
ws_col and ws_row correspond to the terminal's width and height.
Both can be read and set.
ws_xpixel and wx_ypixel are initially 0 but can be set and then read.
SIGNAL(), SIGVEC() - Signal handling
Signals are complicated, both on Un*x and T20/10X. The KCC
implementation by default attempts to support the 4.3BSD signal
mechanism, which uses a variety of system calls. For those planning
to use signals, the file SIGNAL.DOC should be consulted.
FCNTL() - File (descriptor) control
This call implements almost none of the 4.3BSd functions, but it
provides a convenient interface for user programs to access system-dependent
parts of USYS.
General form:
int fcntl(int fd, int cmd, int arg);
fd - must be an active, opened file descriptor.
cmd - fcntl command, a F_xxx value.
arg - integer argument for command.
Commands:
F_GETFL Get flags for FD. This returns the internal flags
that USYS contains for this FD.
F_GETSYSFD Gets "system FD" for this FD. On TOPS-20/TENEX this returns
the JFN associated with the FD.
F_GETBYTESIZE Gets the byte size being used for I/O on this FD.
An illegal fcntl() call (command unknown or FD not open) will return -1
with errno set to EINVAL.