.LM0;.RM75;.TS72;.LC;.AP;.FLAG CAPITAL;.NO PAGING;.NO NUMBER;#
.BR;^MULTIPLE ^LINEAR ^REGRESSION ^ANALYSIS
.SK;^^THE REGRESSION MODEL\\
^IN A REGRESSION PROBLEM THE RESEARCHER POSTULATES A CERTAIN RELATIONSHIP
BETWEEN A RANDOM VARIABLE Y (THE REALIZATIONS OF WHICH ARE SUBJECT
TO SOME FORM OF DISTURBANCE) ON THE ONE SIDE AND A NUMBER OF VARIABLES
X1,...,XP
(WHICH ARE WITHOUT OR AT LEAST ALMOST WITHOUT DISTURBANCES) ON THE
OTHER SIDE. ^THIS RELATIONSHIP IS EXPRESSED BY A MATHEMATICAL FORMULA,
WHICH IS CALLED THE (LINEAR) REGRESSION MODEL, FOR INSTANCE:
.TS72;.SK;.I18;Y = A0 + A1 * X1 +#...#+ AP * XP + E (1)
.SK;IN WHICH A0,...,AP REPRESENT UNKNOWN REGRESSION COEFFICIENTS (PARAMETERS)
WHICH ARE TO BE ESTIMATED AND E REPRESENTS THE DISTURBANCE.
^IF A CONSTANT TERM IS PRESENT IN THE MODEL FORMULA (A0 IN (1)), THE MODEL IS
SAID TO BE AN 'INTERCEPT#MODEL'; IF NO CONSTANT TERM IS PRESENT, THE MODEL IS
CALLED A 'NO-INTERCEPT#MODEL'.
.BR;^THE VARIABLES X1,...,XP AND THE VARIABLE Y CAN ALSO REPRESENT
(OTHER) TRANSFORMED VARIABLES. ^THE RESEARCHER MIGHT HAVE REASONS
TO BELIEVE (FROM BACKGROUND INFORMATION CONCERNING THE EXPERIMENT)
THAT TRANSFORMATIONS ARE NECESSARY, FOR INSTANCE:
.BR;1)#TO OBTAIN NORMALLY DISTRIBUTED DISTURBANCES,
.BR;2)#TO OBTAIN A GREATER HOMOGENEITY OF THE VARIANCES OF THE DISTURBANCES,
.BR;3)#TO LINEARIZE NON-LINEAR REGRESSION MODELS (IF POSSIBLE).
.BR;^THE TRANSFORMED REGRESSION MODEL CAN BE WRITTEN AS:
.SK;.I5;^G(Y) = A0 + A1 * ^F1(X1,...,XM) +#...#+ AP * ^FP(X1,...,XM) + E (2)
.SK;IN WHICH ^G, ^F1,...,^FP REPRESENT THE TRANSFORMATIONS,
.BR;.I12;A0,...,AP REPRESENT THE PARAMETERS TO BE ESTIMATED,
.BR;.I20;Y REPRESENTS THE DEPENDENT VARIABLE,
.BR;.I12;X1,...,XM REPRESENT THE INDEPENDENT VARIABLES,
.BR;.I20;E REPRESENTS THE DISTURBANCE.
^CHOOSING A TRANSFORMATION BY 'TRIAL AND ERROR' IS RATHER
TIME CONSUMING AND COSTLY. ^MUCH OF THE DIFFICULTY LIES IN THE LOCATION
PARAMETER: IT IS NOT UNUSUAL THAT ^LOG#(X) YIELDS NO IMPROVEMENT,
BUT THAT ^LOG#(C+X) GIVES BETTER RESULTS FOR A PARTICULAR CHOICE OF C.
^BECAUSE THIS HOLDS FOR ALMOST ANY TRANSFORMATION OF SOME IMPORTANCE,
WE MUST ACTUALLY SOLVE IN EACH CASE A NONLINEAR ADJUSTMENT PROBLEM. ^OFTEN
THOUGH, A SIMPLE FORM OF THE TRANSFORMATION IS SUGGESTED BY THE RESEARCHER
WHO IS BETTER ACQUAINTED WITH THE PECULIARITIES OF THE EXPERIMENT.
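.SK;^THE SEARCH FOR A SUITABLE C IN ^LOG#(C+X) CAN BE SKETCHED IN ^PYTHON AS FOLLOWS.
^THIS IS ONLY AN ILLUSTRATION AND NOT PART OF THE PROGRAM: THE FUNCTION NAMES,
THE SYNTHETIC DATA AND THE LIST OF CANDIDATE VALUES ARE ASSUMPTIONS, AND A
COARSE GRID SEARCH STANDS IN HERE FOR A PROPER NONLINEAR FIT.
.LITERAL
```python
import math

def residual_ss(u, y):
    """Residual sum of squares of the straight-line fit y = a0 + a1*u."""
    n = len(u)
    mu, my = sum(u) / n, sum(y) / n
    suu = sum((ui - mu) ** 2 for ui in u)
    suy = sum((ui - mu) * (yi - my) for ui, yi in zip(u, y))
    a1 = suy / suu
    a0 = my - a1 * mu
    return sum((yi - a0 - a1 * ui) ** 2 for ui, yi in zip(u, y))

def best_shift(x, y, candidates):
    """Return the candidate c for which log(c + x) gives the smallest residual SS."""
    return min(candidates,
               key=lambda c: residual_ss([math.log(c + xi) for xi in x], y))

# Synthetic data (an assumption): y is exactly linear in log(5 + x).
x = [0, 1, 2, 3, 4, 5]
y = [1 + 2 * math.log(5 + xi) for xi in x]

# Every candidate must keep c + x positive, otherwise the logarithm is undefined.
c_hat = best_shift(x, y, candidates=[1, 2, 3, 4, 5, 6, 7, 8])
print(c_hat)
```
.END LITERAL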
.SK2;^^LEAST SQUARES\\
^REGRESSION ANALYSIS CONSISTS IN FACT OF THE ADJUSTMENT OF A HYPERPLANE
OF THE REQUIRED DIMENSION TO THE DATA. ^THE FITTING IS DONE WITH THE METHOD
OF LEAST SQUARES, WHICH MEANS THAT THE SUM OF THE SQUARES OF THE DIFFERENCES
BETWEEN THE OBSERVED VALUES FOR Y AND THE ESTIMATED VALUES FOR THE EXPECTATION
OF Y IS MINIMIZED. ^THIS SUM OF SQUARES IS ALSO CALLED THE RESIDUAL
SUM OF SQUARES.
.BR;^IN MATRIX NOTATION THE REGRESSION MODEL CAN BE WRITTEN AS:
.SK;.I30;^Y = ^XA + E (3)
.SK;IN WHICH ^Y IS A (N*1) RANDOM VECTOR OF OBSERVATIONS,
.BR;.I9;^X IS A (N*P) MATRIX OF KNOWN (FIXED) VALUES,
.BR;.I9;A IS A (P*1) VECTOR OF (UNKNOWN) PARAMETERS,
.BR;.I5;AND E IS A (N*1) RANDOM VECTOR OF DISTURBANCES.
.SK;^IT IS SUPPOSED THAT ^E(E)#=#0 AND VAR(E)#=#^ISIGMA_^2, IN WHICH ^I
IS THE UNIT MATRIX, THUS:
.SK;.I31;^E(^Y) = ^XA (4)
^THE SUM OF SQUARES OF THE DIFFERENCES BETWEEN THE OBSERVED VALUES OF ^Y AND
THE ESTIMATED VALUES FOR THE EXPECTATION OF ^Y THUS EQUALS:
.SK;.I17;(^Y-^XA)'(^Y-^XA) = <Y'Y - 2A'<X'Y + A'^X'^XA (5)
.SK;(FOR A'<X'Y IS A SCALAR AND THEREFORE EQUAL TO ^Y'^XA).
^CHOOSING AS LEAST SQUARES ESTIMATOR B THAT VALUE OF A WHICH MINIMIZES (5),
INVOLVES DIFFERENTIATING WITH RESPECT TO THE ELEMENTS OF A AND EQUATING
THE RESULT TO ZERO:
.SK;.I19;-2<X'Y + 2^X'^XB = 0,##THUS:##<X'Y = ^X'^XB (6)
.SK;^THIS SYSTEM IS CALLED THE NORMAL EQUATIONS.
^IF THE RANK OF ^X EQUALS P, <X'X IS NONSINGULAR AND THE INVERSE OF <X'X
EXISTS. ^IN THAT CASE THE SOLUTION OF THE NORMAL EQUATIONS CAN BE WRITTEN AS:
.SK;.I29;B = INV(<X'X)X'Y (7)
.SK;^OBSERVE THAT P _<= N MUST HOLD, IN ORDER THAT THE RANK OF ^X CAN BE P AT ALL.
^THEREFORE AT LEAST AS MANY OBSERVATIONS MUST BE MADE, AS THERE ARE PARAMETERS IN THE MODEL.
^ALSO OBSERVE THAT ^E(B)#=#INV(^X'^X)^X'^E(^Y)#=#INV(^X'^X)^X'^XA#=#A; THUS B IS AN UNBIASED
ESTIMATOR OF A.
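.SK;^AS AN ILLUSTRATION, THE NORMAL EQUATIONS (6) AND THEIR SOLUTION (7) CAN BE
SKETCHED IN ^PYTHON. ^THE FUNCTION NAMES AND THE FOUR DATA POINTS ARE ASSUMPTIONS
MADE FOR THIS EXAMPLE AND DO NOT COME FROM THE PROGRAM; THE INVERSE IS FORMED
EXPLICITLY ONLY TO MIRROR (7).
.LITERAL
```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(u * w for u, w in zip(row, col)) for col in zip(*b)] for row in a]

def inverse(m):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular: the rank of X is less than p")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [value / scale for value in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [value - factor * w for value, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def least_squares(x, y):
    """b = inv(X'X) X'y, for x given as a list of rows and y as a list."""
    xt = transpose(x)
    xtx = matmul(xt, x)
    xty = matmul(xt, [[yi] for yi in y])
    return [row[0] for row in matmul(inverse(xtx), xty)]

# Intercept model with one independent variable: first column all ones.
x = [[1, 0], [1, 1], [1, 2], [1, 3]]
y = [1, 3, 4, 8]
b = least_squares(x, y)
print(b)
```
.END LITERAL
^FOR THESE FOUR POINTS ^X'^X HAS THE ROWS (4,#6) AND (6,#14), ^X'^Y#=#(16,#35)',
AND THE SKETCH GIVES B#=#(0.7,#2.2)' UP TO ROUNDING.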
.SK;^THE LEAST SQUARES ESTIMATOR HAS THE FOLLOWING PROPERTIES:
.LM+3;.I-3;1.#^IT IS AN ESTIMATOR WHICH MINIMIZES THE SUM OF SQUARES OF
DEVIATIONS, IRRESPECTIVE OF ANY DISTRIBUTION PROPERTIES OF THE DISTURBANCES.
^THE ASSUMPTION THAT THE DISTURBANCES ARE NORMALLY DISTRIBUTED IS, OF COURSE,
NECESSARY FOR TESTS WHICH DEPEND ON THIS ASSUMPTION, SUCH AS T- OR ^F- TESTS,
OR FOR OBTAINING CONFIDENCE INTERVALS BASED ON T- OR ^F- DISTRIBUTIONS.
.BR;.I-3;2.#^ACCORDING TO THE ^GAUSS-^MARKOV THEOREM, THE ELEMENTS OF B ARE
UNBIASED ESTIMATORS WHICH HAVE MINIMUM VARIANCE (AMONG ALL LINEAR FUNCTIONS OF
THE ^Y'S WHICH PROVIDE UNBIASED ESTIMATORS), AGAIN IRRESPECTIVE OF THE
DISTRIBUTION PROPERTIES OF THE DISTURBANCES.
.BR;.I-3;3.#^IF THE DISTURBANCES ARE MUTUALLY INDEPENDENT AND NORMALLY
DISTRIBUTED (WITH ^E(E)#=#0 AND VAR(E)#=#^ISIGMA_^2), THEN B IS ALSO THE
MAXIMUM LIKELIHOOD ESTIMATOR.
.LM-3;.SK;^THE VARIANCE-COVARIANCE MATRIX OF B IS:
.SK;.I25;VAR(B) = INV(^X'^X)SIGMA_^2 (8)
.SK;^THE VARIANCES ARE THE DIAGONAL AND THE COVARIANCES THE
OFF-DIAGONAL ELEMENTS.
.SK;^AN UNBIASED ESTIMATOR FOR SIGMA_^2 IS GIVEN BY:
.SK;.I23;S_^2 = (<Y'Y - B'<X'Y) / (N-P) (9)
.SK;^THE SQUARE ROOT OF THIS ESTIMATOR IS FREQUENTLY CALLED 'STANDARD
ERROR OF ESTIMATE'. ^IN THE PRINTED OUTPUT OF THE PROGRAM IT IS
INDICATED MORE PROPERLY AS 'STANDARD DEVIATION OF THE ERROR TERM'.
^LET VIJ BE THE ELEMENT IN THE I-TH ROW AND J-TH COLUMN OF INV(<X'X),
THEN SDI = S * ^SQRT(VII) ESTIMATES THE STANDARD DEVIATION OF BI, AND
CIJ = VIJ / ^SQRT(VII * VJJ) GIVES THE CORRELATION COEFFICIENT BETWEEN
BI AND BJ FOR I = 1,...,P AND J = 1,...,P. ^THUS:
.TS71;.SK;.I27;VII = (SDI / S)_^2 (10)
.BR;AND
.BR;.I9;VIJ = CIJ * ^SQRT(VII * VJJ) = CIJ * (SDI * SDJ) / S_^2 (11)
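.SK;^THE QUANTITIES SDI AND CIJ, AND THE WAY BACK FROM THEM TO VIJ, CAN BE
SKETCHED IN ^PYTHON. ^THE FUNCTION NAME AND THE NUMBERS (A STRAIGHT-LINE FIT
TO FOUR POINTS) ARE ASSUMPTIONS CHOSEN FOR ILLUSTRATION ONLY.
.LITERAL
```python
import math

def coefficient_summary(v, s):
    """Standard deviations and correlations of b from V = inv(X'X) and s."""
    p = len(v)
    sd = [s * math.sqrt(v[i][i]) for i in range(p)]
    corr = [[v[i][j] / math.sqrt(v[i][i] * v[j][j]) for j in range(p)]
            for i in range(p)]
    return sd, corr

# Illustrative numbers (assumptions): the design with rows (1,0), (1,1), (1,2),
# (1,3) has inv(X'X) = [[0.7, -0.3], [-0.3, 0.2]]; with y = (1, 3, 4, 8) the
# fit gives y'y = 90 and b'X'y = 88.2.
v = [[0.7, -0.3], [-0.3, 0.2]]
n, p = 4, 2
s = math.sqrt((90.0 - 88.2) / (n - p))   # square root of equation (9)
sd, corr = coefficient_summary(v, s)

# Going back: since sd_i = s * sqrt(v_ii), v_ij = c_ij * sd_i * sd_j / s**2.
v01 = corr[0][1] * sd[0] * sd[1] / s ** 2
print(sd, corr[0][1], v01)
```
.END LITERAL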
.SK; ^A FREQUENTLY USED STATISTICAL MEASURE FOR EVALUATING REGRESSION MODELS
IS THE MULTIPLE CORRELATION COEFFICIENT ^R WHICH IS DEFINED IN THE INTERCEPT MODEL AS THE SQUARE
ROOT OF THE PROPORTION OF THE CORRECTED TOTAL SUM OF SQUARES ACCOUNTED FOR BY THE
MODEL. ^IF THE CORRECTION FOR MEANS IS DENOTED BY NU_^2, WITH U = ^SUM(I,1,N,YI)/N,
THEN ^R CAN BE DEFINED AS:
.SK;.I7;^R_^2 = (B'^X'^Y-NU_^2)/(^Y'^Y-NU_^2) = 1 - (^Y'^Y-B'^X'^Y)/(^Y'^Y-NU_^2) (12)
.SK;^HOWEVER, WE MUST DIVIDE ^Y'^Y-B'^X'^Y BY N-P, NOT BY N, TO OBTAIN AN
UNBIASED ESTIMATOR OF SIGMA_^2; MOREOVER, IT IS CUSTOMARY TO DIVIDE ^Y'^Y-NU_^2
BY N-1, NOT BY N. ^IF WE ADOPT BOTH MODIFICATIONS WE OBTAIN THE ADJUSTED
MULTIPLE CORRELATION COEFFICIENT, WHICH CAN THUS BE DEFINED AS:
.SK;.I11;ADJ(^R)_^2 = 1 - (N-1)/(N-P) * (^Y'^Y-B'^X'^Y)/(^Y'^Y-NU_^2) (13)
^IN THE NO-INTERCEPT MODEL THE CORRECTION FOR MEANS IS IGNORED, GIVING
AS DEFINITION OF ^R_^2: B'^X'^Y/^Y'^Y#=#1#-#(^Y'^Y-B'^X'^Y)/^Y'^Y,#
WHILE THE ADJ(^R)_^2 IS DEFINED CORRESPONDINGLY AS: 1#-#N/(N-P)#*#(^Y'^Y-B'^X'^Y)/^Y'^Y.
^R_^2 ITSELF IS OFTEN CALLED THE 'PROPORTION OF VARIATION EXPLAINED'.
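.SK;^THE DEFINITIONS OF ^R_^2 AND ADJ(^R)_^2 FOR BOTH KINDS OF MODEL CAN BE
COLLECTED IN A SHORT ^PYTHON SKETCH. ^THE FUNCTION NAME, ITS ARGUMENTS AND THE
NUMBERS IN THE CALL ARE ASSUMPTIONS MADE FOR ILLUSTRATION.
.LITERAL
```python
def r_squared(yty, bxty, n, p, mean_y=None):
    """Return (R^2, adjusted R^2) from y'y, b'X'y, n and p.

    Pass mean_y (the mean of the observations) for the intercept model;
    leave it as None for the no-intercept model, in which the correction
    for means is ignored.
    """
    residual = yty - bxty
    if mean_y is None:
        total, total_df = yty, n
    else:
        total, total_df = yty - n * mean_y ** 2, n - 1
    r2 = 1 - residual / total
    adjusted = 1 - (total_df / (n - p)) * (residual / total)
    return r2, adjusted

# Intercept model, illustrative values: y = (1, 3, 4, 8), so y'y = 90 and the
# mean is 4; a straight-line fit to x = (0, 1, 2, 3) gives b'X'y = 88.2.
r2, adjusted = r_squared(90.0, 88.2, n=4, p=2, mean_y=4.0)
print(r2, adjusted)
```
.END LITERAL
^HERE THE CORRECTED TOTAL SUM OF SQUARES IS 90#-#64#=#26 AND THE RESIDUAL SUM
OF SQUARES IS 1.8, SO THAT ^R_^2#=#1#-#1.8/26 AND ADJ(^R)_^2#=#1#-#(3/2)#*#1.8/26.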
.SK2;^^WEIGHTED LEAST SQUARES\\
^IT SOMETIMES HAPPENS THAT SOME OF THE OBSERVATIONS FOR THE DEPENDENT
VARIABLE ARE 'LESS RELIABLE' THAN OTHERS. ^THIS USUALLY MEANS THAT THE
VARIANCES OF THE OBSERVATIONS ARE NOT ALL EQUAL; IN OTHER WORDS THE MATRIX
^V#=#VAR(E) IS NOT OF THE FORM ^ISIGMA_^2, BUT IS DIAGONAL WITH UNEQUAL
DIAGONAL ELEMENTS. ^THE BASIC IDEA FOR SOLVING THIS PROBLEM IS TO TRANSFORM
^Y TO OTHER VARIABLES WHICH DO APPEAR TO SATISFY THE USUAL TENTATIVE
MODEL ASSUMPTIONS, AND THEN APPLY THE USUAL (UNWEIGHTED) ANALYSIS
TO THE VARIABLES SO OBTAINED. ^THE ESTIMATES CAN THEN BE RE-EXPRESSED IN
TERMS OF THE ORIGINAL VARIABLES ^Y.
^LET THE ORIGINAL REGRESSION MODEL BE: ^Y#=#^XA#+#E, WITH ^E(E)#=#0 AND
VAR(E)#=#^VSIGMA_^2, WITH ^V DIAGONAL WITH UNEQUAL DIAGONAL ELEMENTS,
AND LET ^P#=#INV(^V). ^PREMULTIPLYING THE ORIGINAL REGRESSION MODEL
WITH ^Q#=#^SQRT(^P) GIVES AS TRANSFORMED REGRESSION MODEL:
.SK;.I29;<QY = ^Q^XA + ^QE (14)
.SK;WITH ^E(^QE)#=#0 AND VAR(^QE)#=#^ISIGMA_^2. ^THE NORMAL EQUATIONS THEN BECOME:
.SK;.I27;<(QX)'QY = (^Q^X)'^Q^XB (15)
.SK;GIVING AS SOLUTION IF THE INDICATED INVERSE MATRIX EXISTS:
.SK;.I16;B = INV((<QX)'QX)(QX)'QY = INV(<X'PX)X'PY (16)
.SK;WITH VARIANCE-COVARIANCE MATRIX:
.SK;.I23;VAR(B) = INV(^X'^P^X)SIGMA_^2 (17)
^IN PRACTICAL SITUATIONS IT IS OFTEN DIFFICULT TO OBTAIN SPECIFIC INFORMATION
ON THE FORM OF ^V AT FIRST. ^FOR THIS REASON IT IS SOMETIMES NECESSARY TO MAKE
THE (KNOWN TO BE ERRONEOUS) ASSUMPTION ^V#=#^I AND THEN ATTEMPT TO DISCOVER
SOMETHING ABOUT THE FORM OF ^V BY EXAMINING THE RESIDUALS FROM THE REGRESSION
ANALYSIS.
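.SK;^THE TRANSFORMATION (14) FOLLOWED BY AN UNWEIGHTED ANALYSIS CAN BE SKETCHED
IN ^PYTHON AS FOLLOWS. ^THE FUNCTION NAMES, THE DATA AND THE ASSUMED VARIANCES
ARE ILLUSTRATIVE ONLY; ^V IS TAKEN TO BE DIAGONAL AND KNOWN, AND ^X'^P^X IS
ASSUMED TO BE NONSINGULAR.
.LITERAL
```python
import math

def solve(a, b):
    """Solve a*z = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [value - factor * w for value, w in zip(m[r], m[col])]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (m[i][n] - sum(m[i][j] * z[j] for j in range(i + 1, n))) / m[i][i]
    return z

def normal_equations(x, y):
    """Return X'X and X'y for a design matrix x (list of rows) and observations y."""
    p = len(x[0])
    xtx = [[sum(row[i] * row[j] for row in x) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(x, y)) for i in range(p)]
    return xtx, xty

def weighted_least_squares(x, y, variances):
    """Scale row i by q_i = 1/sqrt(v_i), then apply unweighted least squares."""
    q = [1.0 / math.sqrt(vi) for vi in variances]
    qx = [[qi * xij for xij in row] for qi, row in zip(q, x)]
    qy = [qi * yi for qi, yi in zip(q, y)]
    return solve(*normal_equations(qx, qy))

x = [[1, 0], [1, 1], [1, 2], [1, 3]]
y = [1, 3, 4, 8]
variances = [1, 1, 4, 4]   # assumed diagonal of V: the last two points are less reliable
b = weighted_least_squares(x, y, variances)
print(b)
```
.END LITERAL
^BECAUSE ^Q IS DIAGONAL, PREMULTIPLYING BY ^Q AMOUNTS TO SCALING EACH ROW OF
^X AND EACH ELEMENT OF ^Y, SO THAT THE MATRIX ^Q NEVER HAS TO BE FORMED.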
.SK2;^^RESIDUAL ANALYSIS\\
^THE VECTOR OF RESIDUALS ^D IS DEFINED AS THE DIFFERENCE BETWEEN THE VECTOR
OF OBSERVATIONS ^Y AND THE VECTOR OF FITTED VALUES ^Z, OBTAINED BY USING THE
REGRESSION EQUATION##^Z#=#^XB. ^SO ^D#=#^Y#-#^Z OR DI#=#YI#-#ZI FOR I#=#1,...,N.
^IF THE MODEL IS CORRECT, THE RESIDUAL MEAN SQUARE <MSE = S_^2 ESTIMATES SIGMA_^2, AND
THE ESTIMATED STANDARD DEVIATION OF THE FITTED VALUE ZI AT XI = (XI1,...,XIP)' IS:
.SK;.I21;SD(ZI)#=#S * ^SQRT(XI'INV(^X'^X)XI) (18)
.SK;WHICH CAN BE USED TO CONSTRUCT A CONFIDENCE INTERVAL FOR THE EXPECTED
VALUE OF YI:#^E(YI) AT XI = (XI1,...,XIP)', OR TO CONSTRUCT A PREDICTION
INTERVAL FOR THE MEAN OF H NEW OBSERVATIONS AT THIS POINT. ^IN THE FIRST
CASE THE CONFIDENCE INTERVAL IS:
.SK;.I12;ZI +- T(N-P,1-ALPHA/2) * S * ^SQRT(XI'INV(^X'^X)XI) (19)
.SK;AND IN THE SECOND CASE THE PREDICTION INTERVAL IS:
.SK;.I9;ZI +- T(N-P,1-ALPHA/2) * S * ^SQRT(1/H + XI'INV(^X'^X)XI) (20)
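.SK;^BOTH INTERVALS DIFFER ONLY IN THE TERM 1/H UNDER THE SQUARE ROOT, WHICH
THE FOLLOWING ^PYTHON SKETCH MAKES EXPLICIT. ^THE FUNCTION NAMES AND THE NUMBERS
ARE ASSUMPTIONS MADE FOR ILLUSTRATION; THE QUANTILE OF THE T-DISTRIBUTION IS
NOT COMPUTED BUT SUPPLIED BY THE CALLER, FOR INSTANCE FROM A TABLE.
.LITERAL
```python
import math

def leverage(xi, v):
    """Quadratic form xi' V xi, with V = inv(X'X)."""
    p = len(xi)
    return sum(xi[i] * v[i][j] * xi[j] for i in range(p) for j in range(p))

def interval(z, xi, v, s, t, h=None):
    """Interval around the fitted value z at the point xi.

    h=None gives the confidence interval for the expected value;
    h=k gives the prediction interval for the mean of k new observations.
    """
    extra = 0.0 if h is None else 1.0 / h
    half_width = t * s * math.sqrt(extra + leverage(xi, v))
    return z - half_width, z + half_width

# Illustrative numbers (assumptions): inv(X'X) of the design with rows
# (1,0), (1,1), (1,2), (1,3), residual variance s^2 = 0.9, fitted line 0.7 + 2.2*x.
v = [[0.7, -0.3], [-0.3, 0.2]]
s = math.sqrt(0.9)
t = 4.303                      # tabulated 0.975 quantile of Student's t, 2 degrees of freedom
xi = [1, 1]
z = 0.7 + 2.2 * 1
confidence = interval(z, xi, v, s, t)
prediction = interval(z, xi, v, s, t, h=1)
print(confidence, prediction)
```
.END LITERAL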
^RESEARCHERS OFTEN DIVIDE THE RESIDUALS DI BY S, RESULTING IN THE STANDARDIZED
RESIDUALS, WHICH CAN BE EXAMINED TO SEE WHETHER THE ASSUMPTION
EI/SIGMA ~ ^N(0,1) APPEARS TO BE VIOLATED. ^IF THE ASSUMPTION HOLDS, ROUGHLY 95% OF THE
DI/S SHOULD LIE BETWEEN THE LIMITS (-2,2).
^HOWEVER, THE VARIANCES OF THE
RESIDUALS ARE NOT CONSTANT BUT A FUNCTION OF THE ^X MATRIX (SEE (18)),
WHICH SUGGESTS AS STANDARDIZATION:
.SK;.I19;TI = DI / S / ^SQRT(1 - XI'INV(^X'^X)XI) (21)
.SK;GIVING THE STUDENTIZED RESIDUAL.
^THE MAXIMUM STUDENTIZED RESIDUAL CAN BE USED IN A TEST FOR DETECTING
OUTLIERS, AS FOLLOWS: LET T_^2#=#MAX(TI_^2),
THEN##MIN(1,#N#*#(1-^FISHER(1,#N-P-1,#T_^2*(N-P-1)/(N-P-T_^2)))) IS AN
'UPPER BOUND FOR THE RIGHT TAIL PROBABILITY OF THE LARGEST
ABSOLUTE STUDENTIZED RESIDUAL'.
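.SK;^THE STUDENTIZED RESIDUALS (21) AND THE ARGUMENT OF THE ^FISHER DISTRIBUTION
FUNCTION IN THE OUTLIER TEST CAN BE SKETCHED IN ^PYTHON. ^THE FUNCTION NAMES AND
THE NUMBERS ARE ASSUMPTIONS MADE FOR ILLUSTRATION; THE DISTRIBUTION FUNCTION
ITSELF IS NOT EVALUATED HERE AND WOULD HAVE TO COME FROM A TABLE OR A
STATISTICAL LIBRARY.
.LITERAL
```python
import math

def studentized_residuals(x, d, v, s):
    """t_i = d_i / (s * sqrt(1 - x_i' V x_i)), with V = inv(X'X)."""
    p = len(v)
    result = []
    for xi, di in zip(x, d):
        h = sum(xi[i] * v[i][j] * xi[j] for i in range(p) for j in range(p))
        result.append(di / (s * math.sqrt(1.0 - h)))
    return result

def outlier_statistic(t, n, p):
    """Value at which the F(1, n-p-1) distribution function is evaluated."""
    t2 = max(ti ** 2 for ti in t)
    return t2 * (n - p - 1) / (n - p - t2)

# Illustrative numbers (assumptions): straight-line fit 0.7 + 2.2*x to the
# points (0,1), (1,3), (2,4), (3,8).
x = [[1, 0], [1, 1], [1, 2], [1, 3]]
v = [[0.7, -0.3], [-0.3, 0.2]]     # inv(X'X) for this design
d = [0.3, 0.1, -1.1, 0.7]          # residuals y - z
s = math.sqrt(0.9)                 # from s^2 = 1.8 / (4 - 2)
t = studentized_residuals(x, d, v, s)
f_arg = outlier_statistic(t, n=4, p=2)
print(t, f_arg)
```
.END LITERAL
^IF THE DISTRIBUTION FUNCTION AT THIS ARGUMENT EQUALS ^F, THE UPPER BOUND
OF THE TEST IS MIN(1,#N#*#(1-^F)).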
.BR;#