SORT.DOC -- Changes from V3A(200) to V4(302).
Apr 1978
COPYRIGHT (C) 1978 BY
DIGITAL EQUIPMENT CORPORATION, MAYNARD, MASS.
THIS SOFTWARE IS FURNISHED UNDER A LICENSE AND MAY BE USED AND COPIED
ONLY IN ACCORDANCE WITH THE TERMS OF SUCH LICENSE AND WITH THE
INCLUSION OF THE ABOVE COPYRIGHT NOTICE. THIS SOFTWARE OR ANY OTHER
COPIES THEREOF MAY NOT BE PROVIDED OR OTHERWISE MADE AVAILABLE TO ANY
OTHER PERSON. NO TITLE TO AND OWNERSHIP OF THE SOFTWARE IS HEREBY
TRANSFERRED.
THE INFORMATION IN THIS SOFTWARE IS SUBJECT TO CHANGE WITHOUT NOTICE
AND SHOULD NOT BE CONSTRUED AS A COMMITMENT BY DIGITAL EQUIPMENT
CORPORATION.
DIGITAL ASSUMES NO RESPONSIBILITY FOR THE USE OR RELIABILITY OF ITS
SOFTWARE ON EQUIPMENT WHICH IS NOT SUPPLIED BY DIGITAL.
SRT4.DOC Page 2
1.0 SUMMARY
1.1 Functions
SORT is the high performance sort/merge package for the DECSYSTEM-20.
SORT may be run as a stand-alone sort/merge, or embedded in a COBOL or
FORTRAN program.
This release, SORT v4(302), contains all published edits up to and
including edit 302, has performance enhancements, can sequence check
merge files, can sort files via a nonstandard collating sequence, and
no longer requires the compatibility package. However, because of the
native mode command scanner, version 4 is not compatible with
version 3. In particular, this means that all BATCH and FORTRAN SORT
commands will have to be rewritten.
1.2 Monitor
SORT runs under all supported monitors, and has been tested under
release 2 and release 3.
SORT requires the release 3 (or later) compatibility package (PA1050)
in order to load correctly.
See section 4 for details of how to rebuild SORT.
1.3 Software Dependencies
The stand-alone sort, SORT.EXE, requires that the FORTRAN object time
system, FOROTS.EXE, be on SYS:, if floating point ASCII keys are to be
used.
The internal COBOL SORT requires version 12 of the COBOL object time
system, LIBOL.REL and LIBO12.EXE, to reside on SYS:.
The internal FORTRAN SORT requires the FORTRAN library, FORLIB.REL,
and the object time system, FOROTS.EXE, to reside on SYS:.
1.4 Relevant Documentation
SORT is documented in the following manuals:
DECSYSTEM-20 User's Guide, AA-4179B-TM
DECSYSTEM-20 SORT/MERGE User's Guide, AA-4186C-TM
COBOL-68 Language Reference Manual, AA-5057A-TK
FORTRAN Reference Manual, AA-4158B-TM
2.0 EXTERNAL CHANGES
2.1 Sequence Checking of MERGE Files
The COBOL MERGE statement has new syntax [WITH SEQUENCE CHECK]. When
WITH SEQUENCE CHECK is specified, SORT checks the record sequence
of the merge files to ensure that they are pre-sorted. There is a new
switch, /CHECK, to accomplish the same check in the stand-alone and
FORTRAN SORT.
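As an illustration of what a sequence check during a merge involves, here
is a minimal Python sketch. It is not SORT's implementation; the names
checked, merge_with_check and SequenceError are hypothetical, and records
are assumed to be in ascending key order.

```python
import heapq


class SequenceError(ValueError):
    """Raised when a merge input is not in ascending key order."""


def checked(records, key, name):
    """Yield records unchanged, raising if a key is lower than its predecessor."""
    previous = None
    for index, record in enumerate(records):
        current = key(record)
        if index > 0 and current < previous:
            raise SequenceError(f"{name}: record {index} is out of sequence")
        previous = current
        yield record


def merge_with_check(inputs, key=lambda record: record):
    """Merge pre-sorted inputs, verifying each input's order as it is read."""
    streams = [checked(s, key, f"input {i}") for i, s in enumerate(inputs)]
    return list(heapq.merge(*streams, key=key))
```

Without such a check, an out-of-order input simply produces an
out-of-order merged file with no indication that anything went wrong.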
2.2 DECSYSTEM-20 Performance Improvement
There is a minor performance improvement valid on the DECSYSTEM-20.
The BIS instruction CVTDBO is used to convert decimal to binary for
display keys. In addition, all UUO's have been replaced by JSYS's in
DECSYSTEM-20 SORT, which provides a significant performance
improvement (see section 2.6).
2.3 Alternate Collating Sequences
Support has been added for alternate collating sequences. Files may
now be sorted via a non-standard collating sequence. The following
new switches may be used to specify the desired collating sequence:
/COLLATE:ASCII
/COLLATE:EBCDIC
/COLLATE:FILE:file-spec
/COLLATE:LITERAL:'string'
/COLLATE:ADDRESS:address
The FILE switch argument is a file-specification and the collating
sequence is input from that file. The file consists of a string of
ASCII characters. The character's octal value is determined by the
recording mode switch (SIXBIT A=41, ASCII A=101, EBCDIC A=301). Each
character in this string, starting with the leftmost character, is
assigned successive ascending positions in the collating sequence of
the character set being specified. The characters may be a graphic,
an octal value or a delimiter. The delimiters are double quote,
comma, hyphen or equal sign. A graphic or string of graphics is
always enclosed in double quotes ("A",102,"CDE"). The octal value
represents the value of the character. This value may not exceed the
value which represents the total number of characters in the set. A
hyphen (-) may be used in place of a comma to represent all of the
characters, found in the normal collating sequence, between the
characters which precede and follow the hyphen. An equal sign (=)
may be used to denote equivalence; thus (="D") causes "D" to have the
same position in the collating sequence as the preceding character.
All of the following are equivalent:
(The character set is ASCII)
"A","B","C","D","E"="G","Z"
"A"-"D",105="G","Z"
101-"D","E"=107,"Z"
101-104,105=107,132
Unspecified characters will be treated as though they had appeared at
the end of the specified characters in their normal ascending
collating sequence. Hence, a file containing the single character "z"
would have the normal collating sequence with the exception that "z"
would have the first position instead of the one after "y".
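The rules above can be sketched in Python as follows. This is one reading
of the syntax described in this section, not SORT's actual parser; the
function name collating_table and the default 128-character ASCII set are
assumptions made for the illustration, and embedded blanks are not
handled.

```python
import re

# A quoted string of graphics, an octal value, or one of the delimiters , - =
TOKEN = re.compile(r'"([^"]*)"|([0-7]+)|([,=-])')


def collating_table(spec, charset_size=128):
    """Return a list in which entry [code] is that character's collating position."""
    rank = {}
    position = -1          # last position handed out
    previous = None        # code of the last character specified
    pending = ","          # delimiter seen before the next character
    offset = 0
    while offset < len(spec):
        match = TOKEN.match(spec, offset)
        if match is None:
            raise ValueError(f"bad collating specification at column {offset}")
        offset = match.end()
        graphics, octal, delimiter = match.groups()
        if delimiter:
            pending = delimiter
            continue
        codes = [ord(c) for c in graphics] if graphics is not None else [int(octal, 8)]
        for code in codes:
            if code >= charset_size:
                raise ValueError(f"value {code:o} is outside the character set")
            if pending == "-":
                # Hyphen: everything after the previous character, up to this one.
                if previous is None or code <= previous:
                    raise ValueError("a hyphen needs an ascending range")
                for between in range(previous + 1, code + 1):
                    position += 1
                    rank[between] = position
            elif pending == "=":
                # Equal sign: share the preceding character's position.
                rank[code] = position
            else:
                position += 1
                rank[code] = position
            previous = code
            pending = ","
    # Unspecified characters follow, in their normal ascending order.
    for code in range(charset_size):
        if code not in rank:
            position += 1
            rank[code] = position
    return [rank[code] for code in range(charset_size)]
```

Under this reading, the four equivalent specifications shown above all
produce the same table, and a specification consisting only of "z" moves
"z" to the first position while leaving every other character in its
normal relative order.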
It will be easy for the user to add new installation defined collating
sequence switch names.
The LITERAL argument is the text string that would otherwise have been
put in the file, delimited by some character (any character, not
just ') that does not occur in the text string.
The ADDRESS argument is mainly for FORTRAN. It is the address of an
array containing the literal text string. Although an octal address
could be given, the more usual form would be a formal argument number
^N (see sec. 2.4).
Only one collating sequence is allowed per sort command.
2.4 FORTRAN SORT
The special FORTRAN SORT option has been replaced by a small routine
which does a GET on the stand-alone SORT. This has two major
benefits:
1. The syntax is exactly the same as the stand-alone SORT and all
features of the stand-alone SORT are available to the FORTRAN
SORT.
2. User programs are now much smaller, since the SORT occupies
unused space above FOROTS and is shareable.
Three new switches have been added, mainly for FORTRAN SORT. These are:
/SUPPRESS-ERROR:arg where arg is one of:
   ALL          Suppress all error messages
   FATAL        Suppress fatal, warning and informational messages
                (same as ALL)
   WARNING      Suppress warning and informational messages
   INFORMATION  Suppress informational messages
   NONE         Suppress none of the messages
/ERROR-RETURN:ADDRESS of where to go on a fatal error.
/FATAL-ERROR-CODE:ADDRESS of where to store the SIXBIT error code
(SRTxyz).
Although /ERROR-RETURN and /FATAL-ERROR-CODE accept an octal
address, the most useful argument is one of the form ^N, where ^N is
the N'th argument in the FORTRAN call, e.g.
   CALL SORT ('... /ERROR:^2 /FATAL:^3 ...',$100,ERRCOD)
where label 100 and ERRCOD are defined in the FORTRAN program.
The /ERROR switch allows the user program to trap all SORT errors.
The /SUPPRESS switch allows the user program to prevent the terminal
user from seeing SORT messages. And the /FATAL switch allows the user
program to decode the SORT error and perhaps do something useful.
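The /SUPPRESS-ERROR arguments listed above form an ordered scale, which
the following Python sketch makes explicit. It only models the
documented behaviour; the names SEVERITY, SUPPRESS_UP_TO and
is_suppressed are hypothetical and do not correspond to anything in the
SORT sources.

```python
# Message classes, from least to most severe.
SEVERITY = {"INFORMATION": 1, "WARNING": 2, "FATAL": 3}

# Highest message class hidden by each /SUPPRESS-ERROR argument.
SUPPRESS_UP_TO = {
    "NONE": 0,
    "INFORMATION": 1,
    "WARNING": 2,
    "FATAL": 3,
    "ALL": 3,  # documented as the same as FATAL
}


def is_suppressed(message_class, suppress_arg):
    """Return True if a message of this class is hidden under the given argument."""
    return SEVERITY[message_class] <= SUPPRESS_UP_TO[suppress_arg]
```

For example, is_suppressed("WARNING", "INFORMATION") is False, so a
program that gives /SUPPRESS-ERROR:INFORMATION still lets warnings and
fatal messages reach the terminal.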
2.5 Memory allocation algorithm
The memory allocation algorithm for all flavours of SORT has been
centralized so that all requests for memory space now go through one
routine. In particular, the FORTRAN SORT now allocates space only via
the FUNCT. call. This means that SORT should be able to co-exist
with LINK overlays, and other user routines which allocate memory
correctly via FUNCT.
Two new switches have been added to replace the /CORE switch. These
are:
/BUFFER-PAGES:n Allocate n pages for I/O buffers
/LEAVES:n Build a SORT tree with n leaves
The /BUFFER-PAGES switch sets the total amount of space to be used for
all I/O buffers during the sort phase of a sort. If this is not
enough space to do the sort, the user will get an error message and
sort will use the minimum space possible in which to do the sort.
Similarly, if the argument is too large SORT will use the largest
possible buffer area. If the argument specified is large enough to do
the sort but not large enough to do the merge, SORT will allocate
enough additional core to do the merge.
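The handling of an out-of-range /BUFFER-PAGES value amounts to clamping
the request. A minimal Python sketch follows; the function name and the
minimum and maximum parameters are hypothetical, since the actual limits
depend on the sort being run and are not given here.

```python
def effective_buffer_pages(requested, minimum, maximum):
    """Clamp a /BUFFER-PAGES request into the range [minimum, maximum].

    Returns (pages, message_issued). A message is issued only when the
    request is too small, which is the one case the text says is reported.
    """
    if requested < minimum:
        return minimum, True
    if requested > maximum:
        return maximum, False
    return requested, False
```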
The /LEAVES switch sets the size of the sort tree (for an explanation
of the sort tree, see Knuth, vol. 3; the algorithm used is
replacement selection). This is the number of records which will be
held in core at any time during the sort phase. Varying this
parameter will affect the performance of the sort, but no simple rules
can predict what the effect will be. Generally speaking, the larger
this number the better, until it becomes so large that SORT causes
thrashing.
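To show why the number of leaves matters, here is a minimal Python sketch
of textbook replacement selection. It uses a heap in place of SORT's
tree and is not taken from the SORT sources; the function name
replacement_selection is hypothetical, and the leaves parameter plays
the role of the /LEAVES value.

```python
import heapq
from itertools import islice

_END = object()  # marks exhaustion of the input


def replacement_selection(records, leaves):
    """Split records into sorted runs, holding at most `leaves` records in memory."""
    source = iter(records)
    # Each heap entry is (run number, record); lower run numbers come out first.
    heap = [(0, record) for record in islice(source, leaves)]
    heapq.heapify(heap)
    runs, current, current_run = [], [], 0
    while heap:
        run, record = heapq.heappop(heap)
        if run != current_run:
            # Nothing left for the old run: close it and start the next one.
            runs.append(current)
            current, current_run = [], run
        current.append(record)
        incoming = next(source, _END)
        if incoming is not _END:
            # A record lower than the one just written must wait for the next run.
            next_run = run if incoming >= record else run + 1
            heapq.heappush(heap, (next_run, incoming))
    if current:
        runs.append(current)
    return runs
```

With the input 5, 1, 9, 2, 8, 3, 7, 4, 6, three leaves give two runs
while one leaf gives five. Fewer, longer runs mean less work in the
merge phase, which is the sense in which a larger value is generally
better until memory pressure takes over.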
In general, the default values SORT takes for /BUFFER-PAGES and
/LEAVES are adequate for normal sorts. For unusual cases,
experimentation may indicate that giving these switches will speed up
the sort. There are no cookbook rules for what values to use,
however. These must be determined by experimentation.
2.6 SORT/MERGE in Native Mode
The compatibility package has been removed from SORT, which now runs
in native mode. All TOPS-10 UUOs have been replaced with TOPS-20 JSYSes.
This has produced a significant performance improvement in SORT on
TOPS-20. In addition, temporary files are now automatically expunged
after a sort so that they do not needlessly consume disk space. SCAN
has been removed and replaced with a TOPS-20 style command scanner
using the COMND JSYS. Also, the merge algorithm has been modified to
use more internal files thus cutting down on data copying for very
large sorts.
2.7 COBOL SORT
The COBOL SORT has been put into the high segment. This allows all of
SORT except the impure data to be shareable. The impure data resides
in two pages (676 and 677) between LIBOL and PA1050. These two pages
appear in the user's address space only when SORT is being used.
3.0 KNOWN BUGS AND DEFICIENCIES
As of 12-Apr-78, the following deficiencies are known:
1. Stand-alone SORT may not reinitialize itself after a sort, so
it may be necessary to exit with ^C and run it again.
2. If more than 262,143 (2**18-1) records in the input file all
have identical keys (which is unlikely), SORT's internal
sequence number overflows. This causes the sort to become
unstable; i.e., the output data is sorted correctly but the
records with identical keys will not be in the same order as
they were in the input file. In previous versions of SORT
this simply happened without any warning. Version 4 now
prints a warning message if this occurs and continues
sorting.
3. Edits 303, 304, and 305 were made after the SORT sources were
frozen. These patches are in the beware file SORT.BWR. They
are also included for your convenience in the *.NEW files.
Note however that these patches have had only minimal
testing.
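The effect described in item 2 can be reproduced with a small Python
sketch. It assumes, for illustration only, that the sequence number is
kept in a fixed-width field and used as a tie-breaker after the key; the
function name sort_with_sequence and the seq_bits parameter are
hypothetical, and a 3-bit field is used in the example so that the
wrap-around is visible with only ten records.

```python
def sort_with_sequence(records, key, seq_bits=18):
    """Sort by key, breaking ties with an input sequence number of limited width."""
    mask = (1 << seq_bits) - 1
    # The sequence number wraps to zero once it no longer fits in seq_bits.
    tagged = [(key(record), index & mask, record)
              for index, record in enumerate(records)]
    tagged.sort(key=lambda entry: (entry[0], entry[1]))
    return [record for _, _, record in tagged]


# Ten records with identical keys; the second field records input order.
records = [("SAME", n) for n in range(10)]
wide = sort_with_sequence(records, key=lambda r: r[0])
narrow = sort_with_sequence(records, key=lambda r: r[0], seq_bits=3)
```

Here wide preserves the input order, while narrow places record 8
directly after record 0 because both carry sequence number 0. The keys
are still in order; only the relative order of equal keys is lost.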
4.0 INSTALLATION INSTRUCTIONS
4.1 Instructions for Loading and Installing SORT
Mount the SORT tape on MTA0: and type the following commands:
DUMPER
TAPE MTA0:
REWIND
DENSITY 1600-BPI
SKIP 1
RESTORE <*>SORT.EXE (TO) <SUBSYS>*.*.-1,
<*>SORT.HLP (TO) <SUBSYS>*.*.-1
REWIND
4.2 Instructions for Building SORT
Full building instructions for SORT can be found in SORT.CTL.
To generate stand-alone SORT, load all of the SORT sources from the
SORT tape into your directory area, and type:
SUBMIT SORT/TIME:20:00
If LIBOL.REL, LIBSHR.REL and FTDEFS.UNV are in your directory, a COBOL
SORT will be built and inserted into LIBO12.EXE and LIBOL.REL.
If FORLIB.REL is in your directory, a FORTRAN SORT will be built and
inserted into FORLIB.REL.
To build only a COBOL SORT type:
SUBMIT SORT/TIME:20:00/TAG:COBOL
To build only a FORTRAN SORT type:
SUBMIT SORT/TIME:20:00/TAG:FORTRA
The following table contains the assembly switches, their default
value, location and intended use.
ASSEMBLY   DEFAULT   DEFINED      USE WHEN SWITCH HAS
SWITCH     VALUE     IN MODULE    A NONZERO VALUE
------------------------------------------------------------------
FTOPS20    1         SRTPRM       Build SORT for a DECsystem-20
FTKL10     1         SRTPRM       Use KL instructions (BIS etc.)
FTKI10     1         SRTPRM       Use KI instructions (DMOVE etc.)
FTCOL      1         SRTPRM       Allow alternate collating sequence
FTDEBUG    0         SRTPRM       Debugging aids and additional info
FTCOBOL    1         SRTCBL       Build SORT for COBOL
5.0 INTERNAL CHANGES
Several of the sort source modules have been split to separate
DECsystem-10 and DECSYSTEM-20 code and two new modules were added.
The sources now consist of:
SRTPRM.MAC The common parameter definitions
SRTSTA.MAC The common stand-alone code
SRTCMD.MAC The DECSYSTEM-20 command scanner
SRTJSS.MAC DECSYSTEM-20 specific code
SRTCER.MAC Standard error message routines
SRTCMP.MAC The comparison generator
SORT.MAC The common algorithms
In addition the COBOL SORT has:
SRTCBL.MAC The interface for COBOL
and the FORTRAN SORT has:
FORSRT.MAC The FORTRAN interface to the stand-alone SORT.
The following edits were made to SORT as a result of bugs found:
EDIT #
201, SPR 10-22768
Attempts to combine /BINARY with /ASCII in FORTRAN SORT, and
additionally with /SIXBIT or /EBCDIC in stand-alone SORT, caused
SORT to become confused between the mode for key comparisons and
the mode for actual I/O. Symptoms of this were bad output files
or illegal memory references.
202, SPR 10-22767
FORTRAN SORT incorrectly required all binary files to have been
written with MODE='BINARY' in the OPEN statement. This
frequently resulted in the erroneous message
?SRTFCW FORTRAN binary control word incorrect
203, SPR 10-22769
Wide confusion in FORTRAN and stand-alone SORT's handling of
FORTRAN binary files of all kinds.
204
Change SIXBIT input to ignore zero words. This allowed sorts and
merges of COBOL RANDOM files, among other things.
205
Fix the handling of labeled multi-reel tapes for sorts.
206
Turn on the FTOPS20 conditional for TOPS-20 SORT on the QT011
tape.
207
Write variable length .TMP files in COBOL SORT. This can cause a
great savings of disk space during sorts and merges in some
cases.
210, SPR 10-22813
On TOPS-10, SORT occasionally would pad the last block of output
disk files with zeros. Edit 131 in a previous release was
intended to fix this, but one case was left out. This edit
includes that case.
211
FORTRAN SORT neglected to free one of the I/O channels it used
each time it was called. With enough calls to SORT, all the
channels were used up. This edit frees the extra channel.
212, SPR 10-22379
Multiple commands to SORT giving the /TEMP switch frequently
caused SORT to blow up. This was caused by SORT's not zeroing
the last pointer in its list of temporary devices.
213, SPR 10-24603
This edit removed many of the restrictions in the FORTRAN SORT
command scanner. Among these are:
1. Removal of some undocumented, redundant switches.
2. Allow recording mode switches anywhere in the command string,
not just following a /KEY.
3. Add /COMPUTATIONAL, /STANDARD and /DENSITY for compatibility
with stand-alone SORT.
4. Make /CORE operate as in stand-alone SORT.
5. Allow /FORMAT as a key data type on any /KEY, not just the
last one.
6. Allow the command string to be in an array, rather than
requiring it to be a literal.
7. Try default file types of .CCL and .CMD for indirect command
files.
214, SPR 10-22409
Several problems with SORT's command scanner when the /TEMP
switch is used were fixed. The fixes included added checks to
be sure that there was actually an input file (not all /TEMP
specs), and allowing SORT to skip temporary structures if they
were at least writable, but didn't contain the user's path.
215
Fix the handling of labeled multi-reel tapes in all cases. This
was a generalization of edit 205, to allow labeled tapes in
merges as well as sorts. This edit was not published due to its
immense size.
216
This fixed the sorting of line-sequenced ASCII files on KL CPUs.
217, SPR 10-24709
Fixed a bug in edit 207, which caused the data record in a COBOL
sort not to be copied into SORT's buffers.
220
Fixed an illegal instruction when using tapes in FORTRAN SORT.
221, SPR 20-11131
FORTRAN SORT grew and grew in memory size each time it was
called, until all of memory was used up.
222
Specifying more than one FORTRAN floating-point ASCII key
frequently resulted in a FOROTS run-time DECODE error.
223
When FORTRAN SORT was called by a huge FORTRAN program which was
loaded with FORDDT, SORT got address checks or I/O to unassigned
channel errors. This also occurred if the FORTRAN program ran
for a while or had complicated expressions preceding the call to
SORT.
224
If a user returned from an output procedure in a COBOL program
before the AT END path on the RETURN statement was taken, I/O to
unassigned channel or no free channel errors resulted.
225 thru 277
These were spares, and were never used.
300
Development of version 4.
301, 302
Field test edits.
The following edits are in the beware file and *.NEW sources:
303
Fix compares of two character EBCDIC alphanumeric keys in the
middle of a word.
304
Fix ?SRTRIE errors on EBCDIC fixed-length files.
305
On TOPS-20, fix blocking factor problems.
6.0 SUGGESTIONS
None.
[End of SRT4.DOC]