Trailing-Edge
-
PDP-10 Archives
-
decus_20tap5_198111
-
decus/20-0149/mulinp.hlp
There are 2 other files named mulinp.hlp in the archive. Click here to see a list.
Multiple Linear Regression Analysis
THE INPUT SPECIFICATION
To indicate which numbers or series of numbers in the input data
belong to which variable in the model formula and which numbers can be
skipped, the program expects an input specification. It consists of the
keyword "Input", followed by a formula (the input statement) which
describes the arrangement of the observations in the input data, while it
is terminated with a ';' (semicolon). The basic idea is that numbers from
the input data are identified with the names from the input formula in such
a way that (in order of entry) numbers belonging to the same name are put
in a queue appended to that name. For instance:
"Input" 100 * (codenr, 10 * [yvar], [xvar1, xvar2], -1);
means that one hundred series of numbers (each, as a check, terminated in
this example by -1) are present in the input data. Each series consists of
fourteen numbers: first one value which is read and assigned to the name
codenr, then ten values for the name yvar, then one value for the name
xvar1, followed by one value for the name xvar2 and finally the value -1.
The basic constituent of an input formula is a variable enclosed in
square brackets, in the example: [yvar]. The corresponding number from the
input data will be appended to the queue for that name. Several variables
can be put together in a variable list by separating them by commas and
enclosing them in square brackets, in the example: [xvar1, xvar2]. This
only serves to save the writing of several opening and closing brackets.
Separate numbers, series or blocks of numbers can be treated by
putting a repetition factor (control) followed by an asterisk in front of a
variable list (or in front of an input formula which must then be enclosed
in parentheses), in the example: 100 * and 10 * .
If a repetition factor is 1, it may be omitted together with the
asterisk and a parentheses pair, but square bracket pairs must remain.
When a name is used as a repetition factor, a value must already have been
assigned to it, which is done by giving that name, without square brackets
and followed by a comma (or closing parenthesis), earlier in the input
formula than the use of that name as a repetition factor. The corres-
ponding number from the input data is then assigned as a value to that
name. If such names are used repeatedly in the input formula, the
corresponding numbers from the input data are compared with the first one
and, in the case of inequality, an error message is supplied. This may
serve as a check against shifted data reading. A similar check can be
obtained by giving an explicit number followed by a comma (or closing
parenthesis), in the example: the -1. The corresponding number from the
input data is then compared with that given number and, in the case of
inequality, an error message is produced.
Also an expression is allowed as a repetition factor, or for that
matter, as a check value, provided that it is enclosed in angle brackets,
for instance: <k+n>. As in the case of single names used as a repetition
factor each (non-standard function and non-option) name used in such a
(special) expression must have been given, followed by a comma, earlier in
the input formula than the use of that name in the expression.
The linkage between the model formula and the input formula is
established by using the same names in the model terms and in the input
variable lists. Numbers from the input data that belong to such input
names will be treated as observations for the model variables, while
numbers that belong to input names between square brackets which do not
appear in the model formula, are skipped.
Often, repeated observations for the dependent variable are available.
In order to be able to process these observations automatically, it is
necessary that a variable list consisting entirely of dependent variables
is preceeded by a repetition factor (followed by an asterisk) indicating
the number of repetitions. If a variable list contains independent as well
as dependent variables, the number of replications is assumed to be 1. A
series of (say 100) observations for a dependent variable with no
replications is denoted as:
100 * ([dep var])
The repetition factor in front of the opening square bracket is omitted
(because it is 1), although the parentheses are not. Without the
parentheses it would mean 100 replications of [dep var].
EXAMPLE "Input" k, n, <k+n> * (c, m, m * [y], [x1,x2,x3,x4], c), -99;
means that: the first number is read and its value assigned to k,
the next number is read and its value assigned to n,
then k+n times the following happens:
a number is read and its value assigned to c,
the next number is read and its value assigned to m,
then the m replications for y are read,
next the observations for x1, x2, x3 and x4 are read,
then a number is read and its value compared with c,
finally a number is read and its value compared with -99.
If the comparisons fail, an error message is supplied and execution of the
job is terminated, otherwise (k+n) observations for x1, x2, x3, x4 and for
each quadruple m replications for y, have been identified.