Trailing-Edge
-
PDP-10 Archives
-
SRI_NIC_PERM_FS_1_19910112
-
c/lib/usys.doc
There are 9 other files named usys.doc in the archive. Click here to see a list.
USYS.DOC - KCC Un*x System-call simulation
This file documents various things about the USYS library routines,
which are intended to provide simulation and support for Un*x system
call functions.
Specifications for the interfaces were taken from the March
1984 4.2BSD "Unix Programmer's Manual" (UPM), plus the 4.3BSD man pages,
and all code here works as described in those references unless
otherwise documented.
Implementors should also read the CODSYS.DOC file in the
source directory.
NOTE: THE DOCUMENTATION FOR SYSTEMS OTHER THAN TOPS-20/TENEX IS INCOMPLETE!
Contents:
Intro - listing format
Summary of USYS routines
Definitions
Individual Routine Descriptions (as needed)
Library function listing format:
Name Module Port-status Comments
<routine name> <source file> <port code>
Port-status <port-code>:
<sys> - runs on the given systems, as flagged by:
2 - T20, TOPS-20
X - 10X, TENEX
I - ITS
T - T10, TOPS-10
W - WTS, WAITS (T10-like)
C - CSI, CompuServe (T10-like)
*10 = not conditionalized for any OS, hence PDP-10 portable.
Comments:
"U" means the call is USYS_macro enclosed and cannot be
interrupted by signals.
"-" means it isn't enclosed
(this better be cuz interrupts don't affect the call!)
"I" means it is interruptible (i.e. can return EINTR error).
"IC" means interrupts may be continued within the call.
USYS function summary: Src: lib/usys/
Name Module Port-status Comments
access ACCESS 2X---- U 10X only partial.
alarm ALARM 2---- U
brk SBRK 2XITWC U
chdir CHDIR 2X---- U
chmod CHMOD 2X---- U
chown CHOWN 2X---- U
close CLOSE 2XITWC U
creat OPEN 2XI??? U
dup DUP *10 U
dup2 DUP *10 U
errno (data) URT *10 -
exec[lv][ep] FORK 2X-TWC U
exit EXIT *10 -
fchmod CHMOD 2X---- U
fchown CHOWN 2X---- U
fcntl FCNTL *10 U
fork FORKEX 2X---- U
forkexec FORKEX 2X-TWC U KCC-specific routine.
fstat STAT 2X-TWC U (also: xfstat)
ftime TIME 2XITWC -
geteuid GETUID 2X---- U
getpid GETPID 2XITWC U see format note.
gettimeofday TIME 2XITWC -
getuid GETUID 2X---- U
gtty SGTTY 2----- U
ioctl IOCTL 2----- UIC Partial.
kill SIGVEC 2X---- U
lseek LSEEK 2XI--- U
open OPEN 2XI--- U (Uses BSD flags; mode not supported)
pause PAUSE 2XITWC -I Always returns with EINTR
pipe PIPE 2----- U (monitor must have PIP: device)
raise SIGVEC 2X---- U (ANSI function, not syscall)
read READ 2XITWC U
rename RENAME 2X---- U
sbrk SBRK *10 U (invokes brk)
sigblock SIGVEC 2X---- U
signal SIGNAL 2X---- U
sigpause SIGVEC 2X---- UI Always returns with EINTR
sigreturn SIGVEC 2X---- U
sigsetmask SIGVEC 2X---- U
sigstack SIGVEC 2X---- U
sigvec SIGVEC 2X---- U
sleep SLEEP 2XITWC -I (returns no value)
stat STAT 2X-TWC U (also: xstat)
stty SGTTY 2----- U
tell LSEEK *10 U (invokes lseek)
time TIME 2XITWC - (also: tadl_xxx routines)
times TIMES 2XITWC - (poor simulation)
unlink UNLINK 2XI--- U (10X doesn't expunge)
utime UTIME 2X---- U
vfork FORK 2X---- U
wait WAIT 2X---- UIC
write WRITE 2XITWC UIC
_exit EXIT *10 U
_runtm URT 2XITWC (internal) C programs start here.
_urtsud (data) URTSUD *10 (internal) Runtime startup defs.
DEFINITIONS:
The UPM introduction contains some definitions which provide
a convenient way to start describing how the KCC simulations differ from
Un*x; some concepts are supported and others are not.
Process ID (PID): Supported, see long description.
Parent Process ID: Supported, see long description.
Process Group ID: not implemented
TTY Group ID: not implemented
Real User ID: Supported.
The user ID on T20/10X is the "user number".
Real Group ID: not implemented
Effective UID, GID, and Access Groups: not implemented
Super-user: not implemented
Special Processes: not implemented
File Descriptor (FD): Supported.
Small non-negative integers in the range 0 to 63 inclusive.
FDs 0, 1, and 2 correspond to standard input, output, and error output.
On T20/10X these are initially set to the JFNs .PRIIN, .PRIOU, .CTTRM.
You can obtain the JFN for a FD by using the fcntl() call.
You can assign a JFN to a FD by using the open() call with O_SYSFD.
File Name: Supported.
On T20/10X, up to 39 chars per component. Character values limited
to 7 bits (1-0177 inclusive). Can quote chars with ^V.
Path Name: Supported.
Un*x style / paths are permitted, where foo/ is taken to mean foo
is a subdirectory of the current directory.
If the monitor worked right, a filespec of the form "C:foo/bar.h" could
be turned into "C:<.foo>bar.h", assuming standard-type relative
directory fixes, and we would win. But no. Instead, we have to do
the work manually: the logical device is recursively expanded until
an end device-directory pair is found, at which point the file
lookup is tried. If it fails, the expansion/traversal continues.
Directory: Supported.
<.> and <..> work on some T20 systems (Stanford mods).
Root Directory and Current Working Directory:
On T20/10X, a directory of the form "/foo" is taken to mean "<foo>".
The notion of a current working directory is supported.
File Access Permissions:
Each set of T20/10X owner, group, and world access bits
corresponds to a set of Un*x owner, group, and other bits.
Un*x T20/10X
04 040 Read access
02 020 Write access
01 010 Execute access
- 04 Append access
- 02 Directory listing access
- 01 -
Thus, a call such as "chmod(foo.bar, 0644)" will set the T20/10X
protection of "foo.bar" to 604040.
There is no way a user program can either read or set the last
three T20/10X protection bits, except by doing a CHFDB% itself.
Finally, there is no T20/10X counterpart to the set-UID or set-GID
bits.
Sockets and Address Families: not implemented (yet!)
ERRORS:
The global "errno" is set by all failing USYS calls.
The standard UN*X error numbers from <errno.h> are used where possible,
although a few T20-specific error codes have been defined. See <errno.h>
for details.
There is currently no easy way to find out exactly what
TOPS-20/TENEX error (if any) caused errno to be set. The best one can
do is find the most recent error for the process with GETER%; perhaps
someday this will be improved. Note however that the "strerror" function
can be furnished with an argument of -1 which will cause it to return a
string describing the last TOPS-20/TENEX error. Since this is not a
widespread convention, it is probably best to define a LASTSYSERR
macro such as the following for portability in your programs:
#include <errno.h>
#ifndef _ERRNO_LASTSYSERR /* If have capability, then can use */
#define LASTSYSERR _ERRNO_LASTSYSERR /* this to get last sys err */
#else
#define LASTSYSERR errno /* else must do std stuff */
#endif
This definition can then be given as an argument to strerror(), as in
fprintf(stderr, "Call failed: %s\n", strerror(LASTSYSERR));
Exceptions (never set errno):
ftime, time, gettimeofday, times
getpid
_exit (never returns)
PROCESS ID (PID): Long Description
PID values are generated by:
(1) getpid() - self process ID. This must not change over
the lifetime of the process!
(2) fork() - to identify the child process.
(3) wait() - to identify the child process that stopped.
This should match the value returned by fork().
PID values are used by:
(1) kill() - to send signals to self, child, or parent.
(It is rare to send them anywhere else.)
(2) Code that checks the return value of wait().
(3) Code that generates unique filenames, port numbers, or the like
which should not conflict with those of any other
active process.
For ITS and TOPS-10 type systems, the PID is simply the job #, a
small positive integer; zero and negative PIDs will never be seen, as
job 0 is the monitor itself and no system can support 2**35 jobs.
However, TENEX (and hence TOPS-20) has never had a notion of a
unique process identifier, except internally inside the monitor; this
fork ID is simply not accessible to the user. Fork handles are all
relative, in an obscene attempt to prevent programs from referencing
any process they shouldn't. This makes it impossible to implement
getpid() in a straightforward way. The subterfuge I have resorted to
is as follows:
T20 PID = <IPCF PID>,,<frk #><job #>
<IPCF PID> - Left half of a PID generated by MUTIL% for process
This is guaranteed by system to be unique.
<frk #> - low 9 bits of relative fork handle, in 0777000.
<job #> - low 9 bits of job number, in 0777.
getpid() remembers the value generated on first call and
returns that thereafter. This satisfies the uniqueness and constancy
criterion, as well as being efficient.
fork() and wait() convert the relative fork handles from
CFORK% and GFRKS% to a child PID with a zero LH but with the other
fields set. Since relative fork handles are from 400000 to 400777, we
only need the low 9 bits.
fork() in the new child process copies the saved getpid()
value, if any; this is its parent's PID and may be used by kill(). The
saved value is then cleared so if the child calls getpid() it will generate
its own unique value.
kill() checks its PID argument first against the saved
getpid() value to see if a signal is being sent to itself. If not it then
sees whether it matches that of its parent (if any) and sends a signal
to .FHSUP if so. Otherwise, if the LH is 0 it assumes the signal is
being sent to a child, and generates the appropriate relative fork
handle from the 9 bits in the PID value. Note: There is no good way
to identify "miscellaneous" signals generated by another process (PSIs
on the "CHNmisc" channel); only those PSIs uniquely mapped to a single
signal can be successfully sent between processes.
This scheme fails only if PIDs are somehow passed from one
process to another either via pipe, file, or vfork() shared memory, since
the result of a child's getpid() won't match what its parent's fork()
returned. But this should practically never happen.
TENEX:
On TENEX, which doesn't have IPCF, we just use GFRKS% to
locate our fork within the job fork structure and hope the resulting
number, which we stick in the LH, doesn't change. At least TENEX
doesn't have extended addressing either so we can munch the GFRKS%
data on the stack.
OPEN() - Some I/O details:
On TOPS-20/TENEX, the open() call basically derives a filename
from the given pathname (which may be in either TOPS-20 or Un*x syntax)
and applies GTJFN% and OPENF% to it. However, in order to avoid some
frustrating problems with "invalid simultaneous access" and "file busy"
errors, a successful OPENF% when writing a new file is followed
immediately by a CLOSF% and another OPEN%, which has the effect of
immediately making the file "real" and visible to other processes. The
main difference with this strategem becomes visible only if the process
happens to be killed or reset prematurely; if so, the files it was
writing will still be left around and have zero length, whereas for older
versions of open() (or non-C programs) the files would not exist at all.
The open() call has several additional flags which are intended to
help specify the proper actions on TOPS-20/TENEX systems, since the
defaults assumed by open() may not always be correct.
Standard BSD flags:
O_RDONLY open for reading only
O_WRONLY open for writing only
O_RDWR open for reading and writing
O_NDELAY do not block on open (not supported)
O_APPEND append on each write
O_CREAT create file if it does not exist
O_TRUNC truncate size to 0
O_EXCL error if create and file exists
KCC-specific flags (not portable)
O_BINARY Open in binary (9-bit byte) mode
O_CONVERTED Force LF-conversion
O_UNCONVERTED Force NO LF-conversion
O_BSIZE_7 Force 7-bit bytesize
O_BSIZE_8 Force 8-bit bytesize
O_BSIZE_9 Force 9-bit bytesize
O_SYSFD 1st arg is a system-dependent I/O handle
(on T20/10X, a JFN)
TOPS-20 and TENEX specific flags
O_T20_WILD Allow wildcards on GTJFN%
O_T20_WROLD For writes, do NOT use GJ%FOU
O_T20_SYS_LOG logical device is system-wide!
O_T20_THAWED Open file for thawed access
The BSD flags behave as per the UPM documentation, and the T20
flags are fairly straightforward. The KCC flags however are more
subtle; they affect the characteristics of the I/O that will be done,
rather than how a file will be found or created. The two decisions
that must be made are: (1) Bytesize, and (2) LF-conversion. These are
explained below.
BYTESIZE:
The decision of which bytesize to use for I/O is somewhat
complicated. The bytesize on UN*X is always 8 bits, but on PDP-10s it
can be anything from 1 to 36 bits. The algorithm we use is as follows:
If a byte size (one of 7, 8, or 9) is explicitly requested, use that.
Otherwise, for a new file, use 9 if O_BINARY, else 7.
for an old file, use the file's bytesize.
A size of 0 or 36 is treated as for a new file.
Any other size is simply used. If this is not
one of 7, 8, or 9 then the results are unpredictable.
LF-CONVERSION:
UN*X text files use the convention that a LF alone is a
"newline", whereas PDP-10 systems use CRLF together. Thus, the normal
mode of I/O uses LF-conversion, wherein read() converts an input CRLF
sequence to LF, and write() converts an output LF to CRLF. The algorithm
used to determine whether LF-conversion should be done is:
Conversion is normally only done if the bytesize is 7.
Any other size implies NO conversion.
However, this default can be overriden by certain flags:
If O_CONVERTED is set, conversion is ALWAYS done.
If O_UNCONVERTED or O_BINARY is set, conversion is NEVER done.
O_SYSFD:
This is a semi-portable flag which allows the user to bind a
system-dependent I/O channel or JFN to a Un*x-style FD. When set, the
first argument must be this "system FD" cast to a char pointer, rather
than a real char pointer identifying a pathname. On TOPS-20 for example
it could be used in this way:
fd = open((char *)jfn, O_SYSFD|O_RDONLY, mode);
The "mode" argument is not currently used for anything on
TOPS-20/TENEX.
LSEEK() - Problems with LF-conversion
lseek() deals only with system-level file pointers. When no
LF-conversion is being done, this corresponds exactly to the UN*X notion
of a file position, namely the # of bytes offset from the start of the file,
and it is possible to create your own file positions arithmetically.
However, when LF-conversion is being done then this is not possible;
the position returned by lseek will correspond to the system's position,
rather than to the number of bytes fed through read() or write(). In this
case you can only lseek to a position previously returned by lseek() itself.
(Note that 0 is a special case that always works). Typically the pointer
returned will be larger than the number of bytes read or written thus far,
since the system is aware of the CR's in the file even though the C program
isn't.
STAT() - file status information
This section describes the correspondence between the components
of the stat() structure (as defined for Un*x) and the TOPS-20 file system
information.
struct stat
{
dev_t st_dev; /* The .DVxxx device type */
ino_t st_ino; /* .FBADR - Disk address of file index blk */
unsigned int st_mode; /* Un*x-style mode bits */
int st_nlink; /* 1 (always) */
int st_uid; /* T20: User #, 10X: directory # */
int st_gid; /* 0 (always, for now) */
dev_t st_rdev; /* 0 (always, for now) */
off_t st_size; /* .FBSIZ - size in bytes (any bytesize) */
time_t st_atime; /* .FBREF - last ref (Un*x format time) */
int st_spare1;
time_t st_mtime; /* .FBWRT - last write (Un*x format time) */
int st_spare2;
time_t st_ctime; /* .FBCRE - last mod (Un*x format time) */
int st_spare3;
long st_blksize; /* # bytes in a page (derived from FB%BSZ) */
long st_blocks; /* # pages in file (FB%PGC of .FBBYV) */
long st_spare4[2];
};
There is one case in which fstat() may return incorrect information. On
TOPS-20 the file size cannot be obtained from the FDB if the file has just
been written and not yet closed. The fstat() code attempts to figure out the
size anyway by using RFPTR% and the USYS internal variables, but this is
not guaranteed to be correct.
FORKEXEC() - New KCC-specific call
This call is intended to combine the functions of fork() and
exec() so that a user program that wants to perform the very common
procedure of first calling fork() and then having the child call exec()
can now simply use forkexec() and accomplish the same thing much faster.
The calling sequence is simply:
#include <frkxec.h>
int forkexec(fxp);
struct frkxec *fxp;
See the include file for details on the contents of the frkxec
structure and the flags that can be provided.
All of the exec*() functions call forkexec() with FX_NOFORK set.
TTY Handling - IOCTL, GTTY, STTY
The library supports many (though not all) of the Un*x TTY
functions. The primary means of getting information about the TTY and
setting TTY parameters is with the ioctl() call. All 4.3BSD TTY-related
ioctl functions are recognized, although not all are completely supported.
In particular, all requests to "get" data structures will always return
as much information as is available; attempting to "set" some elements of
these structures may or may not work, as described in the following comments.
IOCTL function comments:
FIONREAD - Get # bytes to read on FD. Supported.
TIOCGETP - Get sgttyb parameters, same as V6/V7 gtty(). Supported.
TIOCSETP - Set sgttyb parameters, same as V6/V7 stty(). Supported.
sg_ispeed, sg_ospeed Can read and set.
sg_erase Cannot set to anything but DEL (fails if you try).
sg_kill Cannot set to anything but ^U (fails if you try).
sg_flags The following flags are used:
RAW, CRMOD, ECHO, CBREAK
All other flags are ignored, esp. LCASE and TANDEM.
TIOCSETN - V7: same as TIOCSETP, but without flushing TTY input. Supported.
TIOCEXCL - V7: set exclusive use of tty (not implemented).
TIOCNXCL - V7: reset exclusive use of tty (not implemented).
TIOCHPCL - V7: Hang up on last close (not implemented).
TIOCFLUSH - V7: Flush TTY input and output buffers. Supported.
All other functions are for BSD4.3.
TIOCSTI - Simulate terminal input. Supported.
TIOCSBRK - Set break bit. (not implemented)
TIOCCBRK - Clear break bit. (not implemented)
TIOCSDTR - Set data terminal ready. (not implemented)
TIOCCDTR - Clear data terminal ready. (not implemented)
TIOCGPGRP - Get pgrp of tty. (not implemented)
TIOCSPGRP - Set pgrp of tty. (not implemented)
TIOCGETC - Get special characters (tchars). Supported.
TIOCSETC - Set special characters (tchars). Supported (sort of).
t_intrc and t_quitc (for SIGINT and SIGQUIT) are initially -1 but
can be set to any control character. Note that because
chars are unsigned, the initial value when converted to
an integer is 0777, not -1!
No other elements of tchars can be set to anything but what they
already are:
t_stopc = ^S, t_startc = ^Q, t_eofc = ^Z, t_brkc = -1
TIOCLBIS - Set bits in local mode word. (not implemented)
TIOCLBIC - Clear bits in local mode word. (not implemented)
TIOCLGET - Get local mode mask. (not implemented)
TIOCLSET - Set local mode mask. (not implemented)
TIOCSLTC - Set local special chars (ltchars). Supported.
TIOCGLTC - Get local special chars (ltchars). Supported (sort of).
None of these chars can be set to anything but what they already are:
t_suspc = ^C, t_dsuspc = ^C, t_rprntc = ^R, t_flushc = ^O,
t_werasc = ^W, t_lnextc = ^V
TIOCGETD - Get line discipline. Supported.
TIOCSETD - Set line discipline. Supported (sort of).
The line discipline is always NTTYDISC and cannot be set otherwise.
TIOCGWINSZ - Get window size info. Supported.
TIOCSWINSZ - Set window size info. Supported.
ws_col and ws_row correspond to the terminal's width and height.
Both can be read and set.
ws_xpixel and wx_ypixel are initially 0 but can be set and then read.
SIGNAL(), SIGVEC() - Signal handling
Signals are complicated, both on Un*x and T20/10X. The KCC
implementation by default attempts to support the 4.3BSD signal
mechanism, which uses a variety of system calls. For those planning
to use signals, the file SIGNAL.DOC should be consulted.
FCNTL() - File (descriptor) control
This call implements almost none of the 4.3BSd functions, but it
provides a convenient interface for user programs to access system-dependent
parts of USYS.
General form:
int fcntl(int fd, int cmd, int arg);
fd - must be an active, opened file descriptor.
cmd - fcntl command, a F_xxx value.
arg - integer argument for command.
Commands:
F_GETFL Get flags for FD. This returns the internal flags
that USYS contains for this FD.
F_GETSYSFD Gets "system FD" for this FD. On TOPS-20/TENEX this returns
the JFN associated with the FD.
F_GETBYTESIZE Gets the byte size being used for I/O on this FD.
An illegal fcntl() call (command unknown or FD not open) will return -1
with errno set to EINVAL.