

SORT.DOC -- Changes from V3A(200) to V4(302).
Apr 1978






























COPYRIGHT (C) 1978 BY
DIGITAL EQUIPMENT CORPORATION, MAYNARD, MASS.


THIS SOFTWARE IS FURNISHED UNDER A LICENSE AND MAY BE USED AND  COPIED
ONLY  IN  ACCORDANCE  WITH  THE  TERMS  OF  SUCH  LICENSE AND WITH THE
INCLUSION OF THE ABOVE COPYRIGHT NOTICE.  THIS SOFTWARE OR  ANY  OTHER
COPIES  THEREOF MAY NOT BE PROVIDED OR OTHERWISE MADE AVAILABLE TO ANY
OTHER PERSON.  NO TITLE TO AND OWNERSHIP OF  THE  SOFTWARE  IS  HEREBY
TRANSFERRED.

THE INFORMATION IN THIS SOFTWARE IS SUBJECT TO CHANGE  WITHOUT  NOTICE
AND  SHOULD  NOT  BE  CONSTRUED  AS  A COMMITMENT BY DIGITAL EQUIPMENT
CORPORATION.

DIGITAL ASSUMES NO RESPONSIBILITY FOR THE USE OR  RELIABILITY  OF  ITS
SOFTWARE ON EQUIPMENT WHICH IS NOT SUPPLIED BY DIGITAL.
SRT4.DOC                                                        Page 2


SORT.DOC -- Changes from V3A(200) to V4(302)



1.0  SUMMARY

1.1  Functions

SORT is the high performance sort/merge package for the  DECSYSTEM-20.
SORT may be run as a stand-alone sort/merge, or embedded in a COBOL or
FORTRAN program.

This release, SORT v4(302), contains all published  edits  up  to  and
including  edit  302, has performance enhancements, has the capability
to sequence check merge  files,  may  sort  files  via  a  nonstandard
collating sequence and no longer requires the compatibility package.

However, because of the native mode command scanner, version 4 is not
compatible with version 3. In particular, this means that all BATCH
and FORTRAN SORT commands will have to be rewritten.


1.2  Monitor

SORT runs under all supported monitors,  and  has  been  tested  under
release 2 and release 3.
SORT requires the release 3 (or later) compatibility package  (PA1050)
in order to load correctly.

See section 4 for details of how to rebuild SORT.


1.3  Software Dependencies

The stand-alone sort, SORT.EXE, requires that the FORTRAN object  time
system, FOROTS.EXE, be on SYS:, if floating point ASCII keys are to be
used.

The internal COBOL SORT requires version 12 of the COBOL object time
system, LIBOL.REL and LIBO12.EXE, to reside on SYS:.

The internal FORTRAN SORT requires the  FORTRAN  library,  FORLIB.REL,
and the object time system, FOROTS.EXE, to reside on SYS:.


1.4  Relevant Documentation

SORT is documented in the following manuals:

DECSYSTEM-20 User's Guide, AA-4179B-TM
DECSYSTEM-20 SORT/MERGE User's Guide, AA-4186C-TM
COBOL-68 Language Reference Manual, AA-5057A-TK
FORTRAN Reference Manual, AA-4158B-TM


2.0  EXTERNAL CHANGES


2.1  Sequence Checking of MERGE Files

The COBOL MERGE statement has new syntax [WITH SEQUENCE CHECK]. When
WITH SEQUENCE CHECK is specified, SORT checks the record sequence of
the merge files to ensure that they are pre-sorted. A new switch,
/CHECK, performs the same check in the stand-alone and FORTRAN SORT.
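
The idea of a sequence-checked merge can be illustrated outside SORT.
The following is a minimal Python sketch, not SORT's implementation;
the names SequenceError, checked and merge_with_check are invented for
this illustration. Each input is verified as it is read, and the merge
is abandoned when a record sorts before its predecessor.

```python
import heapq

class SequenceError(Exception):
    """Raised when a merge input is not in ascending key order."""

def checked(records, name, key=lambda r: r):
    """Yield records unchanged, raising if one sorts before its predecessor."""
    previous = None
    first = True
    for record in records:
        if not first and key(record) < key(previous):
            raise SequenceError(f"{name}: record {record!r} out of sequence")
        previous = record
        first = False
        yield record

def merge_with_check(inputs, key=lambda r: r):
    """Merge named, supposedly pre-sorted inputs, verifying each as it is read."""
    streams = [checked(recs, name, key) for name, recs in inputs.items()]
    return list(heapq.merge(*streams, key=key))
```

Without the check, an out-of-sequence input would simply produce an
output file that is not fully sorted.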


2.2  DECSYSTEM-20 Performance Improvement

There is a minor performance improvement specific to the DECSYSTEM-20:
the BIS instruction CVTDBO is used to convert decimal to binary for
display keys. In addition, all UUO's have been replaced by JSYS's in
DECSYSTEM-20 SORT, which provides a significant performance
improvement (see section 2.6).


2.3  Alternate Collating Sequences

Support has been added for alternate collating sequences.   Files  may
now  be  sorted  via a non-standard collating sequence.  The following
new switches may be used to specify the desired collating sequence:

          /COLLATE:ASCII
          /COLLATE:EBCDIC
          /COLLATE:FILE:file-spec
          /COLLATE:LITERAL:'string'
          /COLLATE:ADDRESS:address


The FILE switch argument is a file-specification, and the collating
sequence is read from that file. The file consists of a string of
ASCII characters. Each character's octal value is determined by the
recording mode switch (SIXBIT A=41, ASCII A=101, EBCDIC A=301). Each
character in this string, starting with the leftmost character, is
assigned successive ascending positions in the collating sequence of
the character set being specified. Each item may be a graphic, an
octal value, or a delimiter. The delimiters are double quote, comma,
hyphen, and equal sign. A graphic or string of graphics is always
enclosed in double quotes ("A",102,"CDE"). An octal value represents
the value of the character. This value may not exceed the value which
represents the total number of characters in the set. A hyphen (-) may
be used in place of a comma to represent all of the characters, found
in the normal collating sequence, between the characters which precede
and follow the hyphen. An equal sign (=) may be used to denote
equivalence; thus (="D") causes "D" to have the same position in the
collating sequence as the preceding character.

All of the following are equivalent:
(The character set is ASCII)

          "A","B","C","D","E"="G","Z"
          "A"-"D",105="G","Z"
          101-"D","E"=107,"Z"
          101-104,105=107,132

Unspecified characters will be treated as though they had appeared  at
the  end  of  the  specified  characters  in  their  normal  ascending
collating sequence.  Hence, a file containing the single character "z"
would  have  the normal collating sequence with the exception that "z"
would have the first position instead of the one after "y".

It will be easy for the user to add new installation defined collating
sequence switch names.

The LITERAL argument is the same text string that would otherwise have
been put in the file. It is delimited by any character (not just ')
that does not appear in the text string.

The ADDRESS argument is mainly for FORTRAN. It is the address of an
array containing the literal text string. Although an octal address
could be given, the more usual form would be a formal argument number
^N (see sec. 2.4).

Only one collating sequence is allowed per sort command.


2.4  FORTRAN SORT

The special FORTRAN SORT option has been replaced by a  small  routine
which  does  a  GET  on  the  stand-alone  SORT.   This  has two major
benefits:

1.  The syntax is exactly the same as the  stand-alone  SORT  and  all
    features  of  the  stand-alone  SORT  are available to the FORTRAN
    SORT.

2.  User's programs are now  much  smaller  since  the  SORT  occupies
    unused space above FOROTS and is shareable.

Three new switches have been added, mainly for FORTRAN SORT. These are:

/SUPPRESS-ERROR:arg where arg is one of:
ALL         Suppress all error messages
FATAL       Suppress fatal, warnings and informational error  messages
            (same as ALL)
WARNING     Suppress warnings and informational messages
INFORMATION Suppress informational messages
NONE        Suppress none of the messages

/ERROR-RETURN:ADDRESS of where to go on a fatal error.

/FATAL-ERROR-CODE:ADDRESS of where to  store  the  sixbit  error  code
            (SRTxyz).

Although /ERROR-RETURN and /FATAL-ERROR-CODE accept an octal address,
the most useful argument is one of the form ^N, where ^N is the N'th
argument in the FORTRAN call. For example:
CALL SORT ('...  /ERROR:^2 /FATAL:^3 ...',$100,ERRCOD)
where label 100 and ERRCOD are defined in the FORTRAN program.

The /ERROR switch allows the user program to  trap  all  SORT  errors.
The  /SUPPRESS  switch allows the user program to prevent the terminal
user from seeing SORT messages.  And the /FATAL switch allows the user
program to decode the SORT error and perhaps do something useful.
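
The effect of the /SUPPRESS-ERROR arguments can be summarized as a
severity threshold. The Python sketch below merely restates the table
of arguments given earlier in this section; the names SEVERITY,
SUPPRESS_UP_TO and is_shown are invented for illustration and do not
correspond to anything in SORT.

```python
# Message classes, from least to most severe.
SEVERITY = {"INFORMATION": 1, "WARNING": 2, "FATAL": 3}

# Highest severity that each /SUPPRESS-ERROR argument hides.
SUPPRESS_UP_TO = {
    "NONE": 0,
    "INFORMATION": 1,
    "WARNING": 2,
    "FATAL": 3,
    "ALL": 3,   # documented as the same as FATAL
}

def is_shown(message_severity, suppress_arg="NONE"):
    """Return True if a message of this class would still be typed out."""
    return SEVERITY[message_severity] > SUPPRESS_UP_TO[suppress_arg]
```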


2.5  Memory allocation algorithm

The memory allocation algorithm for all flavours of SORT has been
centralized so that all requests for memory space now go through one
routine. In particular, the FORTRAN SORT now allocates space only via
the FUNCT. call. This means that SORT should be able to co-exist
with LINK overlays and other user routines which allocate memory
correctly via FUNCT.


Two new switches have been added to replace the /CORE  switch.   These
are:

/BUFFER-PAGES:n  Allocate n pages for I/O buffers

/LEAVES:n        Build a SORT tree with n leaves

The /BUFFER-PAGES switch sets the total amount of space to be used for
all  I/O  buffers  during  the  sort  phase of a sort.  If this is not
enough space to do the sort, the user will get an error message and
SORT will use the minimum space possible in which to do the sort.
Similarly, if the argument is too large  SORT  will  use  the  largest
possible buffer area.  If the argument specified is large enough to do
the sort but not large enough to do  the  merge,  SORT  will  allocate
enough additional core to do the merge.

The /LEAVES switch sets the size of the sort tree. (For an explanation
of the sort tree, see Knuth, vol. 3; the algorithm used is replacement
selection.) This is the number of records which will be held in core
at any time during the sort phase. Varying this parameter will affect
the performance of the sort, but no simple rules can predict what the
effect will be. Generally speaking, the larger this number the better,
until it becomes so large that SORT causes thrashing.
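
For readers without Knuth at hand, replacement selection can be
sketched in a few lines of Python. This is an illustration of the
general technique only, using a heap in place of SORT's tree; the name
replacement_selection and the parameter leaves are chosen for this
example. It shows how the number of records held in core determines
the length of the sorted runs produced for the merge phase.

```python
import heapq

_END = object()  # marks exhaustion of the input

def replacement_selection(records, leaves):
    """Split records into sorted runs, holding at most `leaves` records at once."""
    source = iter(records)
    # Each entry is (run number, record); entries for earlier runs come out first.
    heap = [(0, record) for _, record in zip(range(leaves), source)]
    heapq.heapify(heap)
    runs = []
    while heap:
        run, record = heapq.heappop(heap)
        if run == len(runs):
            runs.append([])          # first record of a new run
        runs[run].append(record)
        incoming = next(source, _END)
        if incoming is not _END:
            # A record smaller than the one just written must wait for the next run.
            next_run = run if incoming >= record else run + 1
            heapq.heappush(heap, (next_run, incoming))
    return runs
```

With the input 5, 1, 4, 2, 3, 9, 0 this sketch produces two runs when
leaves is 3 and four runs when leaves is 1, which is why a larger tree
generally means fewer runs to merge.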

In general, the default values SORT takes for /BUFFER-PAGES and
/LEAVES are adequate for normal sorts. For unusual cases, giving these
switches explicitly may speed up the sort. There are no cookbook rules
for what values to use, however; they must be determined by
experimentation.


2.6  SORT/MERGE in Native Mode

The compatibility package has been removed from SORT which now runs in
native mode.  All TOPS-10 UUOs have been replaced with TOPS-20 JSYSes.
This has produced a significant performance  improvement  in  SORT  on
TOPS-20.   In addition, temporary files are now automatically expunged
after a sort so that they do not needlessly consume disk space.   SCAN
has  been  removed  and  replaced with a TOPS-20 style command scanner
using the COMND JSYS.  Also, the merge algorithm has been modified  to
use  more  internal  files  thus cutting down on data copying for very
large sorts.


2.7  COBOL SORT

The COBOL SORT has been put into the high segment, which allows all of
SORT except the impure data to be shareable. The impure data resides
in two pages (676 and 677) between LIBOL and PA1050. These two pages
appear in the user's address space only when SORT is being used.


3.0  KNOWN BUGS AND DEFICIENCIES

As of 12-Apr-78, the following deficiencies are known:


     1.  Stand-alone SORT may not reinitialize itself after a sort, so
         it may be necessary to exit with ^C and run SORT again.

     2.  If more than 262,143 (2**18-1) records in the input file  all
         have  identical  keys  (which  is  unlikely), SORT's internal
         sequence number overflows.  This causes the  sort  to  become
         unstable;   i.e., the output data is sorted correctly but the
         records with identical keys will not be in the same order  as
         they  were  in  the input file.  In previous versions of SORT
         this simply happened without  any  warning.   Version  4  now
         prints  a  warning  message  if  this  occurs  and  continues
         sorting.
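
         The loss of stability can be illustrated with a Python
         sketch. It assumes, purely for illustration, that the
         sequence number wraps around to zero on overflow; this
         document does not say how SORT actually behaves beyond the
         loss of input order. The name sort_with_sequence is invented,
         and a small seq_bits value stands in for the 18-bit field.

```python
def sort_with_sequence(records, key, seq_bits=18):
    """Sort by key, breaking ties with a sequence number seq_bits wide."""
    modulus = 1 << seq_bits
    # Assumption: the sequence number wraps to zero when it overflows.
    tagged = [(key(r), i % modulus, r) for i, r in enumerate(records)]
    tagged.sort(key=lambda t: (t[0], t[1]))
    return [r for _, _, r in tagged]
```

         With a 2-bit sequence number and six records sharing one key,
         the records come out in the order 0, 4, 1, 5, 2, 3: still
         correctly sorted by key, but no longer in input order.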

     3.  Edits 303, 304, and 305 were made after the SORT sources were
         frozen.  These patches are in the beware file SORT.BWR.  They
         are also included for your convenience in  the  *.NEW  files.
         Note  however  that  these  patches  have  had  only  minimal
         testing.


4.0  INSTALLATION INSTRUCTIONS


4.1  Instructions for Loading and Installing SORT

Mount the SORT tape on MTA0: and type the following commands:

DUMPER
TAPE MTA0:
REWIND
DENSITY 1600-BPI
SKIP 1
RESTORE <*>SORT.EXE (TO) <SUBSYS>*.*.-1,
<*>SORT.HLP (TO) <SUBSYS>*.*.-1
REWIND


4.2  Instructions for Building SORT

Full building instructions for SORT can be found in SORT.CTL.

To generate stand-alone SORT, load all of the SORT  sources  from  the
SORT tape into your directory area, and type:

     SUBMIT SORT/TIME:20:00

If LIBOL.REL, LIBSHR.REL and FTDEFS.UNV are in your directory, a COBOL
SORT will be built and inserted into LIBO12.EXE and LIBOL.REL.

If FORLIB.REL is in your directory, a FORTRAN SORT will be  built  and
inserted into FORLIB.REL.

To build only a COBOL SORT type:

     SUBMIT SORT/TIME:20:00/TAG:COBOL


To build only a FORTRAN SORT type:

     SUBMIT SORT/TIME:20:00/TAG:FORTRA

The following table lists the assembly switches, their default values,
the modules in which they are defined, and their intended use.

  ASSEMBLY   DEFAULT  DEFINED    USE WHEN SWITCH HAS
  SWITCH     VALUE    IN MODULE  A NONZERO VALUE
  ------------------------------------------------------------------
  FTOPS20    1        SRTPRM     Build SORT for a DECSYSTEM-20
  FTKL10     1        SRTPRM     Use KL instructions (BIS etc.)
  FTKI10     1        SRTPRM     Use KI instructions (DMOVE etc.)
  FTCOL      1        SRTPRM     Allow alternate collating sequence
  FTDEBUG    0        SRTPRM     Debugging aids and additional info
  FTCOBOL    1        SRTCBL     Build SORT for COBOL


5.0  INTERNAL CHANGES

Several of the  sort  source  modules  have  been  split  to  separate
DECsystem-10 and DECSYSTEM-20 code and two new modules were added.

The sources now consist of:

SRTPRM.MAC  The common parameter definitions
SRTSTA.MAC  The common stand-alone code
SRTCMD.MAC  The DECSYSTEM-20 command scanner
SRTJSS.MAC  DECSYSTEM-20 specific code
SRTCER.MAC  Standard error message routines
SRTCMP.MAC  The comparison generator
SORT.MAC    The common algorithms

In addition the COBOL SORT has:
SRTCBL.MAC  The interface for COBOL

and the FORTRAN SORT has:
FORSRT.MAC  The FORTRAN interface to the stand-alone SORT.



The following edits were made to SORT as a result of bugs found:

EDIT #

201, SPR 10-22768

     Attempts to combine /BINARY with  /ASCII  in  FORTRAN  SORT,  and
     additionally  with /SIXBIT or /EBCDIC in stand-alone SORT, caused
     SORT to become confused between the mode for key comparisons  and
     the  mode for actual I/O.  Symptoms of this were bad output files
     or illegal memory references.

202, SPR 10-22767

     FORTRAN SORT incorrectly required all binary files to  have  been
     written   with   MODE='BINARY'   in  the  OPEN  statement.   This
     frequently resulted in the erroneous message

          ?SRTFCW FORTRAN binary control word incorrect

203, SPR 10-22769

     Fixed widespread confusion in FORTRAN and stand-alone SORT's
     handling of FORTRAN binary files of all kinds.

204

     Change SIXBIT input to ignore zero words.  This allowed sorts and
     merges of COBOL RANDOM files, among other things.

205

     Fix the handling of labeled multi-reel tapes for sorts.

206

     Turn on the FTOPS20 conditional for TOPS-20  SORT  on  the  QT011
     tape.

207

     Write variable length .TMP files in COBOL SORT. This can save a
     great deal of disk space during sorts and merges in some cases.

210, SPR 10-22813

     On TOPS-10, SORT occasionally would pad the last block of  output
     disk  files  with  zeros.   Edit  131  in  a previous release was
     intended to fix this, but one  case  was  left  out.   This  edit
     includes that case.

211

     FORTRAN SORT neglected to free one of the I/O  channels  it  used
     each  time  it  was  called.   With enough calls to SORT, all the
     channels were used up.  This edit frees the extra channel.

212, SPR 10-22379

     Multiple commands to SORT  giving  the  /TEMP  switch  frequently
     caused  SORT  to  blow up.  This was caused by SORT's not zeroing
     the last pointer in its list of temporary devices.

213, SPR 10-24603

     This edit removed many of the restrictions in  the  FORTRAN  SORT
     command scanner.  Among these are:

     1.  Removal of some undocumented, redundant switches.

     2.  Allow recording mode switches anywhere in the command string,
         not just following a /KEY.

     3.  Add /COMPUTATIONAL, /STANDARD and /DENSITY for  compatibility
         with stand-alone SORT.

     4.  Make /CORE operate as in stand-alone SORT.

     5.  Allow /FORMAT as a key data type on any /KEY,  not  just  the
         last one.

     6.  Allow the command string to  be  in  an  array,  rather  than
         requiring it to be a literal.

     7.  Try default file types of .CCL and .CMD for indirect  command
         files.


214, SPR 10-22409

     Several problems with SORT's command scanner when the /TEMP
     switch is used were fixed. The fixes included added checks to
     be sure that there was actually an input file (not all /TEMP
     specs), and allowing SORT to skip temporary structures if they
     were at least writable, but didn't contain the user's path.

215

     Fix the handling of labeled multi-reel tapes in all cases.   This
     was  a  generalization  of  edit  205,  to allow labeled tapes in
     merges as well as sorts.  This edit was not published due to  its
     immense size.

216

     This fixed the sorting of line-sequenced ASCII files on KL CPUs.

217, SPR 10-24709

     Fixed a bug in edit 207, which caused the data record in a  COBOL
     sort not to be copied into SORT's buffers.

220

     Fixed an illegal instruction when using tapes in FORTRAN SORT.

221, SPR 20-11131

     FORTRAN SORT grew and grew  in  memory  size  each  time  it  was
     called, until all of memory was used up.

222

     Specifying  more  than  one  FORTRAN  floating-point  ASCII   key
     frequently resulted in a FOROTS run-time DECODE error.

223

     When FORTRAN SORT was called by a huge FORTRAN program which  was
     loaded  with FORDDT, SORT got address checks or I/O to unassigned
     channel errors.  This was also caused if the FORTRAN program  ran
     for a while or had complicated expressions preceding the call  to
     SORT.

224

     If a user returned from an output procedure in  a  COBOL  program
     before  the AT END path on the RETURN statement was taken, I/O to
     unassigned channel or no free channel errors resulted.

225 thru 277

     These were spares, and were never used.


300

     Development of version 4.

301, 302

     Field test edits.


     The following edits are in the beware file and *.NEW sources:

303

     Fix compares of two character EBCDIC  alphanumeric  keys  in  the
     middle of a word.

304

     Fix ?SRTRIE errors on EBCDIC fixed-length files.

305

     On TOPS-20, fix blocking factor problems.



6.0  SUGGESTIONS

None.



[End of SRT4.DOC]