

			WESTERN MICHIGAN UNIVERSITY
				COMPUTER CENTER

LIBRARY PROGRAM #3.8.1

CALLING NAME:	SPELL
PREPARED BY:	RUSSELL R. BARR III 
PROGRAMMED BY:	*
APPROVED BY:	JACK R. MEAGHER
DATE:		JULY 25, 1974

			SPELLING CHECKER AND CORRECTION PROGRAM.

TABLE OF CONTENTS

1.0  GENERAL DESCRIPTION
2.0  FORMAT OF THE INPUT AND OUTPUT FILES
3.0  PROGRAM QUESTIONS AND HOW TO ANSWER THEM
4.0  FILE NAME FORMAT
5.0  HOW TO USE MULTIPLE DICTIONARIES
6.0  OTHER MODES
7.0  EXAMPLES

1.0  GENERAL DESCRIPTION

SPELL IS A PROGRAM TO READ TEXT FILES AND CHECK THEM FOR CORRECTNESS OF 
SPELLING.  IN ADDITION, IT WILL ATTEMPT TO GUIDE THE USER IN CORRECTING THOSE
WORDS WHICH THE PROGRAM DETERMINES MAY BE MISSPELLED.  THERE ARE FOUR KINDS OF
ERRORS THAT SPELL ATTEMPTS TO CORRECT:

		1)  ONE WRONG LETTER
		2)  ONE MISSING LETTER
		3)  ONE EXTRA LETTER
		4)  TWO TRANSPOSED LETTERS
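
THE FOUR ERROR TYPES ABOVE CAN BE ILLUSTRATED WITH A SHORT PYTHON SKETCH.  THIS
IS NOT SPELL'S ACTUAL CODE; THE NAMES "candidate_corrections" AND "guesses" ARE
HYPOTHETICAL, AND THE SKETCH ASSUMES THAT "TRANSPOSED" MEANS TWO ADJACENT
LETTERS AND THAT WORDS ARE COMPARED IN UPPER CASE.

```python
import string

ALPHABET = string.ascii_uppercase


def candidate_corrections(word):
    """Return every string obtained from `word` by undoing one of the four errors."""
    word = word.upper()
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    # 1) one wrong letter: substitute another letter at each position
    wrong = {a + c + b[1:] for a, b in splits if b for c in ALPHABET if c != b[0]}
    # 2) one missing letter: insert a letter at each position
    missing = {a + c + b for a, b in splits for c in ALPHABET}
    # 3) one extra letter: delete the letter at each position
    extra = {a + b[1:] for a, b in splits if b}
    # 4) two transposed letters: swap each adjacent pair (assumption: adjacent only)
    transposed = {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1}
    return (wrong | missing | extra | transposed) - {word}


def guesses(word, dictionary):
    """Return the candidates that are actually present in `dictionary` (a set)."""
    return sorted(candidate_corrections(word) & dictionary)
```

FOR EXAMPLE, "guesses" APPLIED TO "FORTRSN" WITH A DICTIONARY CONTAINING
"FORTRAN" RETURNS THAT ONE WORD, SINCE THE TWO DIFFER BY ONE WRONG LETTER.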

SPELL HAS A DICTIONARY OF OVER 10,000 WORDS AND ALLOWS THE USER TO ADD ONE
OR MORE AUXILIARY DICTIONARIES TO COVER OMISSIONS AND SPECIALIZED JARGON.

2.0  FORMAT OF THE INPUT AND OUTPUT FILES

THE TEXT FILE (THE ONE TO BE CHECKED FOR SPELLING) AND THE USER'S OPTIONAL
DICTIONARY FILE MAY BE IN EITHER LINED, SOS OR TECO FORMAT.**  WHEN A WORD IS
CORRECTED, THE REPLACEMENT IS WRITTEN TO THE OUTPUT TEXT FILE WITH THE SAME
UPPER/LOWER CASE FORMAT AS THE ORIGINAL WORD.  THE DICTIONARY INPUT FILE
CONSISTS OF ONE WORD PER LINE.  WORDS ARE STRICTLY ALPHABETIC AND LESS THAN 40
CHARACTERS LONG.  THE DICTIONARY NEED NOT BE IN ORDER.
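
THE DICTIONARY FILE RULES AND THE CASE-PRESERVING REPLACEMENT DESCRIBED ABOVE
CAN BE SKETCHED IN PYTHON AS FOLLOWS.  THE FUNCTION NAMES ARE HYPOTHETICAL.
THE MANUAL DOES NOT SAY WHAT SPELL DOES WITH AN INVALID DICTIONARY LINE OR
WITH A MIXED-CASE WORD, SO COLLECTING REJECTED LINES AND LEAVING MIXED-CASE
REPLACEMENTS UNCHANGED ARE ASSUMPTIONS MADE ONLY FOR ILLUSTRATION.

```python
def read_dictionary(lines):
    """Split dictionary lines into (valid words, rejected lines)."""
    words, rejected = set(), []
    for line in lines:
        word = line.strip()
        if not word:
            continue  # assumption: blank lines are ignored
        # one word per line, strictly alphabetic, less than 40 characters
        if word.isascii() and word.isalpha() and len(word) < 40:
            words.add(word.upper())
        else:
            rejected.append(word)
    return words, rejected


def match_case(original, replacement):
    """Give `replacement` the same upper/lower case pattern as `original`."""
    if original.isupper():
        return replacement.upper()
    if original.islower():
        return replacement.lower()
    if original[:1].isupper() and original[1:].islower():
        return replacement.capitalize()
    return replacement  # assumption: other mixed-case patterns are left alone
```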



*SPELL IS THE WMU IMPLEMENTATION OF THE DECUS (DIGITAL EQUIPMENT COMPUTER
USERS SOCIETY) LIBRARY PROGRAM #10-184.  SPELL WAS WRITTEN BY RALPH GORIN OF
THE STANFORD ARTIFICIAL INTELLIGENCE LABORATORY, STANFORD, CALIFORNIA.

	**LINED, SOS AND TECO ARE TEXT EDITORS.  LINED AND SOS FILES HAVE A
5-DIGIT LINE NUMBER IN THE FIRST WORD OF EACH LINE, AND BIT 35 OF THAT WORD IS
SET TO 1.
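
THE LINE-NUMBER WORD DESCRIBED IN THIS FOOTNOTE CAN BE MODELLED IN PYTHON AS A
36-BIT INTEGER.  THE SKETCH ASSUMES THE USUAL PDP-10 CONVENTIONS, WHICH THE
MANUAL DOES NOT SPELL OUT: BIT 0 IS THE MOST SIGNIFICANT BIT, BIT 35 IS THE
LEAST SIGNIFICANT, AND FIVE 7-BIT ASCII CHARACTERS ARE PACKED LEFT-JUSTIFIED
INTO BITS 0 THROUGH 34.  THE FUNCTION NAMES ARE HYPOTHETICAL.

```python
def pack_line_number(digits):
    """Pack five ASCII digits into a 36-bit word with bit 35 set to 1."""
    if len(digits) != 5 or any(ch not in "0123456789" for ch in digits):
        raise ValueError("a line number must be exactly five digits")
    word = 0
    for ch in digits:
        word = (word << 7) | ord(ch)  # five 7-bit characters fill bits 0-34
    return (word << 1) | 1            # bit 35 (the low-order bit) marks the word


def unpack_line_number(word):
    """Return the five-digit string if `word` is a line-number word, else None."""
    if word & 1 == 0:
        return None  # bit 35 is clear: ordinary text, not a line number
    text = "".join(chr((word >> shift) & 0x7F) for shift in (29, 22, 15, 8, 1))
    return text if all(ch in "0123456789" for ch in text) else None
```

A PROGRAM READING SUCH A FILE COULD USE A TEST LIKE "unpack_line_number" TO
TELL LINE-NUMBER WORDS FROM TEXT WORDS.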


3.0  PROGRAM QUESTIONS AND HOW TO ANSWER THEM

IN THIS SECTION, QUESTIONS TYPED BY THE COMPUTER ARE UNDERLINED.  <CR> MEANS
PRESS THE RETURN KEY; <ALTMODE> MEANS PRESS THE "ALT" OR "ESC" KEY.

3.1  DO YOU WANT TO AUGMENT THE DICTIONARY?

IF YOU HAVE A DICTIONARY ON DISK THAT YOU WISH TO ADD, TYPE "Y<CR>".  THE NEXT
QUESTION IS 3.2.  IF YOU DO NOT WISH TO ADD TO THE DICTIONARY, JUST ENTER
"<CR>".  THE NEXT QUESTION WILL BE 3.4.


3.2  DICTIONARY FILE NAME:

ENTER THE NAME OF YOUR DICTIONARY FILE.  (SEE SECTION 4.0 FOR THE FILE NAME
FORMAT.)  THE NEXT QUESTION IS 3.3.

3.3  TYPE "I" TO MARK THESE AS INCREMENTAL INSERTIONS:

IF YOU ENTER "I<CR>" THE WORDS IN YOUR DICTIONARY WILL BE AVAILABLE FOR AN
INCREMENTAL DUMP.  SEE SECTION 5.0 FOR INFORMATION ON INCREMENTAL DUMPS AND
INSERTIONS.  IF YOU DO NOT WISH TO MARK THE WORDS IN THIS DICTIONARY AS 
INCREMENTAL INSERTIONS, TYPE "<CR>".  IN EITHER CASE, THE PROGRAM WILL TYPE
OUT THE NUMBER OF WORDS IN THE DICTIONARY AND THE CURRENT SIZE OF THE PROGRAM.
THE NEXT QUESTION IS 3.4.

3.4  NAME OF THE FILE TO CHECK AND CORRECT:

ENTER THE NAME OF YOUR INPUT TEXT FILE.  THE NEXT QUESTION IS 3.5.  IF ONLY
"<CR>" IS ENTERED, THE PROGRAM WILL TERMINATE.

3.5  FILE NAME FOR OUTPUT:

ENTER A FILE NAME IF YOU WISH TO MAKE CORRECTIONS.  OTHERWISE TYPE "<CR>".
THE NEXT QUESTION IS 3.6.

3.6  FILE NAME FOR EXCEPTIONS:

ENTER THE NAME OF A FILE IF YOU WISH AN EXCEPTION FILE.  THIS FILE WILL CONTAIN
ALL THE LINES ON WHICH EXCEPTIONS (MISSPELLINGS) WERE FOUND AND THE REJECTED
(MISSPELLED) WORDS.  THIS FILE IS USUALLY UNNECESSARY.  TO OMIT THE EXCEPTION
FILE, TYPE "<CR>".  IN EITHER CASE THE PROGRAM WILL NOW START WORKING.  WHEN
THE PROGRAM ENCOUNTERS A WORD THAT ISN'T IN THE DICTIONARY, IT WILL TYPE THE
PAGE NUMBER, THE LINE ON WHICH THE WORD WAS FOUND AND THE WORD ITSELF.  THEN
THE FOLLOWING 'ONCE-ONLY' MESSAGE WILL BE TYPED:

IN GENERAL, YOU HAVE THE FOLLOWING OPTIONS:
	A  ACCEPT WORD
	I  ACCEPT WORD AND INSERT IT
	   IN THE DICTIONARY
	R  REPLACE THIS WORD. USER WILL BE 
	   ABLE TO TYPE REPLACEMENT WORD
	X  ACCEPT THIS WORD.
	   THEN FINISH RECOPYING WITHOUT ANY CHECKING.
	W  SAVE MY PRESENT INCREMENTAL INSERTIONS.
	   THEN ASK AGAIN ABOUT THIS WORD.
	D  DISPLAY THE CURRENT LINE AGAIN.
	S  SELECT FROM LIST OF GUESSES.


IF YOU WANT TO REVIEW THIS LIST, THEN TYPE SOMETHING NOT IN THE LIST.

IF A MISSPELLED WORD IS FOUND, THE NEXT QUESTION IS 3.7.  IF NOT, THE NEXT
QUESTION IS 3.12.

3.7  TYPE A,I,R,X,D,W, OR S

		OR

I GUESS:  #### TYPE C TO MAKE THIS CORRECTION, OR A,I,R,X,D, OR W

#### STANDS FOR THE PROGRAM'S GUESS.  THE LIST OF OPTIONS MAY CHANGE FROM TIME
TO TIME DEPENDING ON WHAT TYPE OF MATCHES THE PROGRAM FINDS IN ITS
DICTIONARIES.  IF YOU ENTER "A", "I", "X", OR "D", THE PROGRAM WILL PERFORM THE
SPECIFIED ACTION AND CONTINUE WORKING, AND THE NEXT QUESTION IS EITHER 3.7 OR
3.12.  IF YOU ENTER "R", THE NEXT QUESTION IS 3.8.  IF YOU ENTER "W", THE NEXT
QUESTION IS 3.10.  IF YOU ENTER "S", THE NEXT QUESTION IS 3.11.

3.8  REPLACE WITH:

ENTER THE NEW WORD.  IF THE NEW WORD IS IN THE DICTIONARY, THE NEXT QUESTION
IS EITHER 3.7 OR 3.12.  IF THE NEW WORD IS NOT IN THE DICTIONARY, THE NEXT 
QUESTION IS 3.9.

3.9  TYPE "I" TO INSERT THIS REPLACEMENT IN THE DICTIONARY:

THE NEXT QUESTION IS EITHER 3.7 OR 3.12.

3.10  FILE NAME:

ENTER THE FILE NAME FOR THE INCREMENTAL DUMP.  THE NEXT QUESTION IS 3.7.

3.11 TO SELECT A CHOICE:  TYPE
     Y<CR> TO SELECT A WORD
     <CR> TO SEE NEXT WORD
     <ALTMODE> TO ESCAPE FROM THIS MODE
     OR ^<CR> TO BACK UP IN THIS LIST
     TO REVIEW THIS LIST, TYPE ANYTHING ELSE
     TYPE Y,^,<ALTMODE> OR <CR>

IF YOU TYPE "Y", THE WORD JUST TYPED BY THE PROGRAM WILL BE INSERTED IN THE
OUTPUT FILE AND THE NEXT QUESTION WILL BE 3.7 OR 3.12.  IF YOU TYPE "<CR>", THE
NEXT WORD WILL BE DISPLAYED AND YOUR CHOICE OF RESPONSES IS THE SAME.  IF YOU
TYPE "^<CR>", THE PREVIOUS WORD WILL BE DISPLAYED AND YOUR CHOICE OF RESPONSES
IS THE SAME.  IF YOU TYPE "<ALTMODE>", THE NEXT QUESTION IS 3.7.
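
THE SELECTION DIALOGUE OF QUESTION 3.11 AMOUNTS TO MOVING A POSITION BACK AND
FORTH THROUGH THE LIST OF GUESSES.  THE FOLLOWING PYTHON SKETCH SHOWS ONE WAY
TO MODEL IT.  THE FUNCTION NAME IS HYPOTHETICAL, THE STRING "<ALTMODE>" STANDS
IN FOR THE KEY ITSELF, AND STOPPING AT EITHER END OF THE LIST IS AN ASSUMPTION,
SINCE THE MANUAL DOES NOT SAY WHAT HAPPENS THERE.

```python
def select_from_guesses(guesses, responses):
    """Step through `guesses`; return the selected word, or None on escape."""
    index = 0
    for response in responses:
        if response == "Y":
            return guesses[index]          # select the word being shown
        if response == "<ALTMODE>":
            return None                    # escape back to question 3.7
        if response == "":                 # <CR>: see the next word
            index = min(index + 1, len(guesses) - 1)
        elif response == "^":              # back up in the list
            index = max(index - 1, 0)
        # anything else: the instructions are shown again, position unchanged
    return None
```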

3.12  FINISHED

TYPE:
E	EXIT
C	CORRECT ANOTHER FILE
A	AUGMENT THE DICTIONARY AND CORRECT ANOTHER FILE
D	DUMP THE DICTIONARY TO DISK (DON'T DO THIS)
I	DUMP INCREMENTAL.  (REWRITES A FILE!)

TYPING "E" TERMINATES THE PROGRAM.  IF YOU TYPE "C", THE NEXT QUESTION IS
3.4.  IF YOU TYPE "A", THE NEXT QUESTION IS 3.1.  DO NOT TYPE "D" AS THIS WILL
WASTE A CONSIDERABLE AMOUNT OF SPACE ON THE DISK (ABOUT 150 BLOCKS).  TYPE "I"
TO DUMP ALL NEW WORDS AND ALL AUGMENTATIONS TO THE DICTIONARY THAT YOU MARKED
AS INCREMENTAL (QUESTION 3.3).  THE NEXT QUESTION IS THEN 3.13.

3.13  FILE NAME:

ENTER THE FILE NAME FOR THE INCREMENTAL DUMP.  THE DEFAULT IS "WORDS.LST".
THE NEXT QUESTION IS 3.12.
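
THE ORDER OF THE QUESTIONS IN THIS SECTION CAN BE SUMMARIZED AS A TRANSITION
TABLE.  THE PYTHON SKETCH BELOW RECORDS ONLY WHAT SECTIONS 3.1 THROUGH 3.13
STATE; ANSWERS WHOSE EFFECT IS NOT DESCRIBED THERE (SUCH AS "C" AT QUESTION
3.7, "D" AT QUESTION 3.12, OR THE NUMBERED FORMS OF SECTION 5.0) ARE LEFT OUT.
THE NAMES "FLOW" AND "next_questions" ARE HYPOTHETICAL.

```python
CR = ""                      # the user pressed only RETURN
ANY = "*"                    # any other answer, such as a file name
EXIT = "EXIT"
CONTINUE = ["3.7", "3.12"]   # the next rejected word, or FINISHED

FLOW = {
    "3.1": {"Y": ["3.2"], CR: ["3.4"]},
    "3.2": {ANY: ["3.3"]},
    "3.3": {ANY: ["3.4"]},
    "3.4": {CR: [EXIT], ANY: ["3.5"]},
    "3.5": {ANY: ["3.6"]},
    "3.6": {ANY: CONTINUE},
    "3.7": {"A": CONTINUE, "I": CONTINUE, "X": CONTINUE, "D": CONTINUE,
            "R": ["3.8"], "W": ["3.10"], "S": ["3.11"]},
    "3.8": {ANY: CONTINUE + ["3.9"]},   # 3.9 only if the word is not known
    "3.9": {ANY: CONTINUE},
    "3.10": {ANY: ["3.7"]},
    "3.11": {"Y": CONTINUE, CR: ["3.11"], "^": ["3.11"], "<ALTMODE>": ["3.7"]},
    "3.12": {"E": [EXIT], "C": ["3.4"], "A": ["3.1"], "I": ["3.13"]},
    "3.13": {ANY: ["3.12"]},
}


def next_questions(question, answer):
    """Return the possible next questions, or [] if the manual does not say."""
    choices = FLOW[question]
    key = answer.strip().upper()
    if key in choices:
        return choices[key]
    return choices.get(ANY, [])
```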

4.0  FILE NAME FORMAT

IN GENERAL, FILE NAMES CONSIST OF 3 PARTS: A NAME, AN EXTENSION AND A
PROJECT-PROGRAMMER NUMBER (E.G. FILNAM.EXT[PJ,PG]).

"NAME" IS A STRING OF UP TO 6 LETTERS AND/OR NUMBERS WITH NO IMBEDDED SPACES OR
SPECIAL CHARACTERS.

".EXT" IS A PERIOD FOLLOWED BY A STRING OF UP TO 3 LETTERS AND/OR NUMBERS
WITH NO IMBEDDED SPACES OR SPECIAL CHARACTERS.

"[PJ,PG]" IS THE DESIRED PROJECT-PROGRAMMER NUMBER ENCLOSED IN SQUARE BRACKETS.

DEFAULTS:  IF "[PJ,PG]" IS OMITTED, THE USER'S OWN IS ASSUMED.

	   IF ".EXT" IS OMITTED, A NULL (BLANK) EXTENSION IS ASSUMED.

	   IF THE ENTIRE FILE NAME IS OMITTED, THE DEFAULT FILE NAME FOR THAT
	   QUESTION WILL BE USED.
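
THE FILE NAME RULES AND DEFAULTS ABOVE CAN BE EXPRESSED AS A SMALL PYTHON
PARSER.  THIS IS ONLY A SKETCH: THE FUNCTION NAME, THE "default" AND "own_ppn"
PARAMETERS AND THE SAMPLE NUMBERS ARE HYPOTHETICAL, AND IT ASSUMES THAT EACH
HALF OF THE PROJECT-PROGRAMMER NUMBER CONSISTS OF DIGITS ONLY.

```python
import re

FILE_SPEC = re.compile(
    r"(?P<name>[A-Z0-9]{1,6})"                   # up to 6 letters and/or numbers
    r"(?:\.(?P<ext>[A-Z0-9]{0,3}))?"             # optional extension, up to 3
    r"(?:\[(?P<pj>[0-9]+),(?P<pg>[0-9]+)\])?"    # optional [PJ,PG]
)


def parse_file_spec(spec, default="", own_ppn=("10", "20")):
    """Return (name, extension, (project, programmer)) for a file name."""
    text = spec.strip().upper() or default       # omitted: use question default
    match = FILE_SPEC.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid file name: {spec!r}")
    ext = match.group("ext") or ""               # omitted: null extension
    if match.group("pj") is None:
        ppn = own_ppn                            # omitted: the user's own
    else:
        ppn = (match.group("pj"), match.group("pg"))
    return match.group("name"), ext, ppn
```

WITH THESE ASSUMPTIONS, "COMPUT.DCT" PARSES TO THE NAME "COMPUT", THE EXTENSION
"DCT" AND THE CALLER'S OWN PROJECT-PROGRAMMER NUMBER.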

5.0  HOW TO USE MULTIPLE DICTIONARIES

SPELL HAS A SET OF FEATURES WHEREBY THE USER CAN CREATE SEVERAL DISJOINT
INCREMENTAL DICTIONARIES.  IN THIS WAY, THE USER MAY COLLECT SEVERAL
DICTIONARIES OF SPECIAL TERMS.  INTERNALLY, ALL DICTIONARY ENTRIES ARE
CONSIDERED EQUIVALENT AS REGARDS SEARCHES FOR WORDS.  THE DISTINCTION BETWEEN
DICTIONARIES HAS ITS GREATEST IMPACT WHEN DOING INCREMENTAL DUMPS (THE I
COMMAND DURING THE EXIT SEQUENCE OR THE W COMMAND WHILE IN THE MIDDLE OF
EXECUTION).  WHEN AN INCREMENTAL DUMP IS REQUESTED, THE USER MAY SPECIFY A
NUMBER, E.G. W9, WHICH SELECTS THE PARTICULAR INCREMENTAL DICTIONARY TO DUMP.
IN THIS EXAMPLE, DICTIONARY #9 WILL BE DUMPED.  "DUMPING A DICTIONARY" MEANS
THAT IT WILL BE WRITTEN ON DISK UNDER THE NAME SPECIFIED BY THE USER.

DICTIONARY 0 IS THE MAIN DICTIONARY.  WORDS CANNOT BE ADDED TO THIS DICTIONARY,
EXCEPT BY READING AN AUXILIARY FILE.  ALL WORDS THAT ARE INCREMENTAL INSERTIONS
IN THE DICTIONARY WILL BE MARKED IN DICTIONARY #1, UNLESS THE USER SPECIFIES
OTHERWISE.

THE USER MAY SPECIFY WHICH DICTIONARY TO ADD TO IN THE FOLLOWING PLACES:

	1)  WHEN LOADING AN AUXILIARY DICTIONARY, IF THE USER RESPONDS WITH
"INN" TO THE QUESTION ABOUT MARKING NEW ENTRIES AS INCREMENTAL (QUESTION 3.3),
THEN THE NEW ENTRIES WILL BE MARKED IN DICTIONARY NUMBER NN (WHERE NN IS
INTERPRETED AS DECIMAL AND SHOULD BE LESS THAN 32).

	2)  AFTER A WORD HAS BEEN REJECTED, TYPE "INN" TO INSERT THE WORD IN
DICTIONARY NUMBER NN (QUESTION 3.7).

	3)  AFTER REPLACING A WORD, IF THE REPLACEMENT IS NOT IN THE DICTIONARY,
THEN TYPE "INN" TO INSERT THE REPLACEMENT INTO DICTIONARY NUMBER NN
(QUESTION 3.9).

WHEN REQUESTING AN INCREMENTAL DUMP, THE USER MAY SPECIFY THE PARTICULAR
DICTIONARY TO DUMP.  THIS IS ALLOWED IN TWO CASES:

	1)  AFTER SOME WORD HAS BEEN REJECTED, THE COMMAND "WNN" WILL CAUSE
DICTIONARY NUMBER NN TO BE DUMPED (QUESTION 3.7).

	2)  DURING THE EXIT SEQUENCE, THE COMMAND "INN" WILL CAUSE DICTIONARY
NUMBER NN TO BE DUMPED (QUESTION 3.12).

IN ALL FIVE CASES ABOVE, IF NN IS EITHER 0 OR OMITTED, THEN IT WILL BE TAKEN 
AS BEING 1.
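
THE NUMBERING RULES ABOVE CAN BE SKETCHED IN PYTHON AS FOLLOWS.  THIS IS AN
ILLUSTRATION, NOT SPELL'S INTERNAL DESIGN: THE NAMES "parse_dictionary_command"
AND "Dictionaries" ARE HYPOTHETICAL, AND KEEPING ONE DICTIONARY NUMBER PER WORD
AND DUMPING IN ALPHABETICAL ORDER ARE ASSUMPTIONS.

```python
def parse_dictionary_command(command):
    """Split 'I', 'I7' or 'W12' into (letter, dictionary number)."""
    text = command.strip().upper()
    letter, digits = text[:1], text[1:]
    if letter not in ("I", "W"):
        raise ValueError(f"expected I or W: {command!r}")
    if digits and not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"NN must be decimal: {command!r}")
    number = int(digits) if digits else 0
    if number >= 32:
        raise ValueError("NN should be less than 32")
    return letter, number or 1      # 0 or omitted is taken as 1


class Dictionaries:
    def __init__(self, main_words):
        # dictionary 0 is the main dictionary
        self.number_of = {word.upper(): 0 for word in main_words}

    def load_auxiliary(self, words, number=0):
        """Read an auxiliary file; number=0 adds to the main dictionary."""
        for word in words:
            self.number_of.setdefault(word.upper(), number)

    def insert(self, word, number=1):
        """Incremental insertion; marked in dictionary 1 unless told otherwise."""
        self.number_of.setdefault(word.upper(), number or 1)

    def __contains__(self, word):
        # searches treat all entries as equivalent, whatever their number
        return word.upper() in self.number_of

    def incremental_dump(self, number=1):
        wanted = number or 1
        return sorted(w for w, n in self.number_of.items() if n == wanted)
```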

CAUTION:  THERE IS NO PROVISION IN SPELL FOR REMEMBERING WHICH DICTIONARY
NUMBERS HAVE BEEN USED.  THEREFORE, IT REMAINS THE INDIVIDUAL USER'S
RESPONSIBILITY TO REMEMBER THE NUMBERS OF ALL THE DICTIONARIES THAT HE
CREATES.  (FORGETTING THE NUMBER WILL MEAN THAT THE FORGOTTEN DICTIONARY
CANNOT BE DUMPED INCREMENTALLY.)  THE WORDS IN A FORGOTTEN DICTIONARY WILL
STILL BE AVAILABLE, BUT THE ONLY WAY TO ACTUALLY GET THEM DUMPED OUT IS TO DUMP
THE ENTIRE DICTIONARY.

6.0  OTHER MODES

6.1  THE PICKUP FEATURE

IF ANY OF THE THREE FILE NAMES IN THE ENTRY SEQUENCE (WHERE THE SOURCE,
CORRECTION AND EXCEPTION FILES ARE SPECIFIED) IS FOLLOWED BY THE SWITCH "/P"
THEN, AFTER ACCEPTING THE THREE FILE NAMES, SPELL WILL ENTER PICKUP MODE.
THE USER WILL BE ASKED TO SPECIFY A PAGE NUMBER AND, IF THE FILE IS IN SOS
OR LINED FORMAT, A LINE NUMBER FOR PICKUP.  THE EFFECT IS TO SUSPEND SPELLING
CHECKING UNTIL THE SPECIFIED PAGE AND LINE ARE REACHED.  WHEN THE USER HAS A
PARTIALLY CORRECTED FILE, THIS SWITCH ENABLES HIM TO SKIP OVER THE PORTION OF
THE FILE THAT HAS ALREADY BEEN CORRECTED.  THE INPUT FILE WILL BE COPIED TO THE
OUTPUT WITHOUT CHECKING UNTIL THE SPECIFIED PAGE AND LINE, AT WHICH POINT
SPELLING CHECKING BEGINS.
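
THE EFFECT OF PICKUP MODE CAN BE SKETCHED IN PYTHON AS FOLLOWS.  THE FUNCTION
NAME AND THE RECORD LAYOUT (PAGE NUMBER, LINE NUMBER, TEXT) ARE HYPOTHETICAL;
HOW SPELL ITSELF COUNTS PAGES AND LINES IS NOT DESCRIBED IN THIS MANUAL.  FOR
A FILE WITHOUT LINE NUMBERS, THE SKETCH ASSUMES EVERY LINE NUMBER IS GIVEN AS 0.

```python
def pickup(records, start_page, start_line=0):
    """Yield (text, check) pairs; `check` stays False until the pickup point.

    `records` is an iterable of (page, line, text) tuples in file order.
    """
    checking = False
    for page, line, text in records:
        # once the requested page and line are reached, checking stays on
        if not checking and (page, line) >= (start_page, start_line):
            checking = True
        yield text, checking
```

LINES YIELDED WITH "check" FALSE WOULD SIMPLY BE COPIED TO THE OUTPUT; THE REST
WOULD GO THROUGH THE NORMAL SPELLING CHECK.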

6.2  THE TRAINING FEATURE

IF THE FILE NAME OF THE INPUT FILE IS FOLLOWED BY THE SWITCH "/T" THEN INSTEAD
OF CORRECTING THE FILE, SPELL WILL TREAT THE FILE AS A TRAINING SET.  ALL WORDS
IN THE FILE THAT ARE UNFAMILIAR TO SPELL WILL BE ENTERED IN THE DICTIONARY AS
INCREMENTAL INSERTIONS.  AFTER SPELL FINISHES READING THE FILE, THE USER HAS
AN OPPORTUNITY TO DUMP ALL THE WORDS THAT WERE INSERTED THIS WAY.  THE RESULTING
LIST OF WORDS MAY BE EDITED AND ANY WORDS WHICH ARE INCORRECT MAY BE DELETED.
THEN THIS FILE CAN BE USED AS AN AUXILIARY DICTIONARY WHILE CORRECTING THE
ORIGINAL SOURCE FILE.

THIS FEATURE IS PROVIDED FOR THE PURPOSE OF EASING THE PROBLEM OF CREATING A
SPECIALIZED DICTIONARY OF JARGON AND INFREQUENTLY USED WORDS.

6.3  Q-TRAINING

Q-TRAINING IS SPECIFIED BY THE SWITCH "/Q".  IN THIS MODE, AS IN ORDINARY
TRAINING, ALL WORDS IN THE SOURCE FILE THAT ARE UNFAMILIAR TO SPELL WILL BE
ADDED TO THE DICTIONARY.  THE DIFFERENCE IS THAT IF ANY "NEW" WORD IS "CLOSE
TO" SOME OLD WORD, THEN THE NEW WORD WILL BE OUTPUT IN THE EXCEPTION FILE.  THE
EXCEPTION FILE WILL CONTAIN ONLY SUCH WORDS.  IN THIS WAY, THE SPELLING CHECKER
CALLS TO YOUR ATTENTION THE FACT THAT THESE WORDS MAY BE MISSPELLED.
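
THE DIFFERENCE BETWEEN TRAINING AND Q-TRAINING CAN BE SKETCHED IN PYTHON AS
FOLLOWS.  THE NAMES "is_close" AND "train" ARE HYPOTHETICAL.  THE SKETCH
ASSUMES THAT "CLOSE TO" MEANS DIFFERING BY ONE OF THE FOUR ERRORS OF SECTION
1.0, THAT AN "OLD" WORD IS ONE PRESENT BEFORE TRAINING BEGAN, AND THAT A WORD
IS ANY RUN OF LETTERS; THE MANUAL DOES NOT DEFINE THESE POINTS.

```python
import re


def is_close(new, old):
    """True if `new` differs from `old` by one of the four errors of 1.0."""
    if new == old:
        return False
    if len(new) == len(old):
        diffs = [i for i in range(len(new)) if new[i] != old[i]]
        if len(diffs) == 1:
            return True                              # one wrong letter
        return (len(diffs) == 2 and diffs[1] == diffs[0] + 1
                and new[diffs[0]] == old[diffs[1]]
                and new[diffs[1]] == old[diffs[0]])  # two transposed letters
    if abs(len(new) - len(old)) == 1:                # one missing or extra letter
        longer, shorter = (new, old) if len(new) > len(old) else (old, new)
        return any(longer[:i] + longer[i + 1:] == shorter
                   for i in range(len(longer)))
    return False


def train(text, dictionary, q_mode=False):
    """Return (inserted words, exception words) for a training run."""
    known = set(dictionary)
    inserted, exceptions = [], []
    for word in re.findall(r"[A-Za-z]+", text):
        word = word.upper()
        if word in known:
            continue
        if q_mode and any(is_close(word, old) for old in dictionary):
            exceptions.append(word)   # possibly a misspelling of an old word
        known.add(word)               # inserted in both modes
        inserted.append(word)
    return inserted, exceptions
```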



7.0  EXAMPLES

7.1  TERMINAL EXAMPLES

THIS RUN SHOWS A SIMPLE CORRECTION OF A SMALL FILE ("TEXT") USING
A SPECIALIZED DICTIONARY ("COMPUT.DCT").  USER RESPONSES ARE UNDERLINED IN THIS
EXAMPLE.

.TYPE TEXT<CR>
THE FORTRSN COMPILER TRANSLATES SOURCE PRGRAMS
WRITTEB IN TTHE FORTRAN IV LANGUAGE INTO THE MACJINE
LANGAUGE OF THE COMPUTER.  THIS TRANSLATED VERSION
OF THE FORTRAN PROGRAM IS THEN WRITTEN ON A STORRAGE
DEVICE.


.TYPE COMPUT.DCT<CR>
SNOBOL
ALGOL
TECO
LINED
SOS
SPEAKEASY
AID
RAID
STATPACK
BASIC
COBOL
FORTRAN
MACRO


.R SPELL<CR>


Do you want to augment the dictionary? Y<CR>
Dictionary file name: COMPUT.DCT<CR>
Type "I" to mark these as incremental insertions: <CR>
There are now 10381 words in the dictionary.  32 K Core used.
Do you want to augment the dictionary? <CR>

Name of the file to check and correct: TEXT<CR>

File name for output: TEXTA<CR>

File for exceptions: <CR>
No exception file.
working...
Page 1
THE FORTRSN COMPILER TRANSLATES SOURCE PRGRAMS
FORTRSN

In general, you have the following options:
A Accept word
I Accept word and insert it
  in the dictionary
R Replace this word. User will
  be able to type replacement word
X accept this word,
  then finish recopying without
  any checking.
W Save my present incremental insertions,
  then ask again about this word.
D Display the current line again.
S Select from list of guesses.

If you want to review this list, then type something not in the list
I guess: FORTRAN  Type C to make this correction, or A,I,R,X,D, or W
*C<CR>
SOURCE
Type A,I,R,X,W or D
*I<CR>
PRGRAMS
Type A,I,R,X,W or D
*R<CR>
Replace with: PROGRAMS<CR>

type "I" to insert this replacement in the dictionary: I<CR>
Page 1
WRITTEB IN TTHE FORTRAN IV LANGUAGE INTO THE MACJINE
WRITTEB
I guess: WRITTEN  Type C to make this correction, or A,I,R,X,D, or W
*C<CR>
TTHE
Type A,I,R,X,D,W or S
*R<CR>
Replace with: THE<CR>
IV
Type A,I,R,X,D,W or S
*A<CR>
MACJINE
I guess: MACHINE  Type C to make this correction, or A,I,R,X,D, or W
*C<CR>
Page 1
LANGAUGE OF THE COMPUTER.  THIS TRANSLATED VERSION
LANGAUGE
I guess: LANGUAGE  Type C to make this correction, or A,I,R,X,D, or W
*C<CR>
Page 1
OF THE FORTRAN PROGRAM IS THEN WRITTEN ON A STORRAGE
STORRAGE
I guess: STORAGE  Type C to make this correction, or A,I,R,X,D, or W
*C<CR>

Finished.
Type:
E	Exit
C	Correct another file
A	Augment the dictionary and correct another file
D	Dump the dictionary to disk (DON'T DO THIS)
I	Dump incremental. (Rewrites a file!)
*I<CR>
file name: COMPUT.INC<CR>
Type:
E	Exit
C	Correct another file
A	Augment the dictionary and correct another file
D	Dump the dictionary to disk (DON'T DO THIS)
I	Dump incremental. (Rewrites a file!)
*E<CR>

EXIT

.TYPE TEXTA<CR>
THE FORTRAN COMPILER TRANSLATES SOURCE PROGRAMS
WRITTEN IN THE FORTRAN IV LANGUAGE INTO THE MACHINE
LANGUAGE OF THE COMPUTER.  THIS TRANSLATED VERSION
OF THE FORTRAN PROGRAM IS THEN WRITTEN ON A STORAGE
DEVICE.

.TYPE COMPUT.INC<CR>
PROGRAMS
SOURCE



7.2  BATCH DECK EXAMPLE

THIS SETUP WILL PRODUCE AN INCREMENTAL LIST OF THE UNRECOGNIZED WORDS FROM A
FILE ("TEXT").  NOTE: ALL BLANK CARDS SHOULD HAVE A STAR ("*") IN COLUMN 1.
THE COMMENTS TO THE RIGHT ARE NOT TYPED ON THE CARDS.

$JOB [##,##]			;REPLACE ##,## WITH USER PROJECT-PROGRAMMER
				;NUMBER
$PASSWORD ----			;REPLACE ---- WITH USER PASSWORD
.R SPELL			;START PROGRAM
*				;NO EXTRA DICTIONARY
TEXT/T				;FILE TO CHECK USING TRAINING FEATURE
I				;MAKE INCREMENTAL DICTIONARY
*				;DICTIONARY NAME IS DEFAULT WORDS.LST
E				;TERMINATE PROGRAM
.PRO<155> WORDS.LST		;PROTECT DICTIONARY
.KJOB				;TERMINATE JOB
(EOF)				;END OF FILE CARD