SUBROUTINE STPRG
PURPOSE
TO PERFORM A STEPWISE MULTIPLE REGRESSION ANALYSIS FOR A
DEPENDENT VARIABLE AND A SET OF INDEPENDENT VARIABLES. AT
EACH STEP, THE VARIABLE ENTERED INTO THE REGRESSION EQUATION
IS THE ONE THAT EXPLAINS THE GREATEST AMOUNT OF THE REMAINING
VARIANCE OF THE DEPENDENT VARIABLE (I.E., THE VARIABLE
WITH THE HIGHEST PARTIAL CORRELATION WITH THE DEPENDENT
VARIABLE). ANY VARIABLE CAN BE DESIGNATED AS THE DEPENDENT
VARIABLE. ANY INDEPENDENT VARIABLE CAN BE FORCED INTO OR
DELETED FROM THE REGRESSION EQUATION, IRRESPECTIVE OF ITS
CONTRIBUTION TO THE EQUATION.
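THE SELECTION RULE ABOVE CAN BE SKETCHED AS FOLLOWS. THIS IS A
PYTHON ILLUSTRATION, NOT THE STPRG SOURCE. IT ASSUMES THAT THE
CROSS-PRODUCT MATRIX HAS ALREADY BEEN ADJUSTED FOR ANY VARIABLES
IN THE EQUATION, SO THAT RANKING CANDIDATES BY D(J,Y)**2/D(J,J)
IS EQUIVALENT TO RANKING THEM BY PARTIAL CORRELATION WITH THE
DEPENDENT VARIABLE. THE NAME best_candidate AND THE 0-BASED
INDEXING ARE ASSUMPTIONS MADE FOR THE EXAMPLE.

```python
# Illustrative sketch only; not the STPRG source.
# d is the M x M matrix of sums of cross-products of deviations,
# y is the 0-based index of the dependent variable.

def best_candidate(d, y, candidates):
    """Return (index, reduction) for the candidate that removes the most
    sum of squares from the dependent variable, or (None, 0.0)."""
    best, best_red = None, 0.0
    for j in candidates:
        if d[j][j] <= 0.0:            # a zero pivot cannot be entered
            continue
        red = d[j][y] ** 2 / d[j][j]  # sum of squares reduced by entering j
        if red > best_red:
            best, best_red = j, red
    return best, best_red

# Small made-up matrix: variables 0 and 1 are candidates, 2 is dependent.
d = [[10.0, 2.0, 8.0],
     [2.0, 5.0, 1.0],
     [8.0, 1.0, 20.0]]
j, red = best_candidate(d, y=2, candidates=[0, 1])
proportion = red / d[2][2]   # the quantity a PCT-style threshold is compared with
print(j, red, proportion)
```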
USAGE
CALL STPRG (M,N,D,XBAR,IDX,PCT,NSTEP,ANS,L,B,S,T,LL,IER)
DESCRIPTION OF PARAMETERS
M     - TOTAL NUMBER OF VARIABLES IN DATA MATRIX
N     - NUMBER OF OBSERVATIONS
D     - INPUT MATRIX (M X M) OF SUMS OF CROSS-PRODUCTS OF
        DEVIATIONS FROM MEAN. THIS MATRIX WILL BE DESTROYED.
XBAR  - INPUT VECTOR OF LENGTH M OF MEANS
IDX   - INPUT VECTOR OF LENGTH M HAVING ONE OF THE FOLLOWING
        CODES FOR EACH VARIABLE.
          0 - INDEPENDENT VARIABLE AVAILABLE FOR SELECTION
          1 - INDEPENDENT VARIABLE TO BE FORCED INTO THE
              REGRESSION EQUATION
          2 - VARIABLE NOT TO BE CONSIDERED IN THE EQUATION
          3 - DEPENDENT VARIABLE
        THIS VECTOR WILL BE DESTROYED.
PCT   - A CONSTANT GIVING THE MINIMUM PROPORTION OF THE TOTAL
        VARIANCE THAT AN INDEPENDENT VARIABLE MUST EXPLAIN IN
        ORDER TO ENTER THE REGRESSION EQUATION. VARIABLES WHICH
        FALL BELOW THIS PROPORTION WILL NOT ENTER. TO ENSURE THAT
        ALL VARIABLES ENTER THE EQUATION, SET PCT = 0.0.
NSTEP - OUTPUT VECTOR OF LENGTH 5 CONTAINING THE FOLLOWING
        INFORMATION
          NSTEP(1)- THE NUMBER OF THE DEPENDENT VARIABLE
          NSTEP(2)- NUMBER OF VARIABLES FORCED INTO THE
                    REGRESSION EQUATION
          NSTEP(3)- NUMBER OF VARIABLES DELETED FROM THE
                    EQUATION
          NSTEP(4)- THE NUMBER OF THE LAST STEP
          NSTEP(5)- THE NUMBER OF THE LAST VARIABLE ENTERED
ANS   - OUTPUT VECTOR OF LENGTH 11 CONTAINING THE FOLLOWING
        INFORMATION FOR THE LAST STEP
          ANS(1)-  SUM OF SQUARES REDUCED BY THIS STEP
          ANS(2)-  PROPORTION OF TOTAL SUM OF SQUARES REDUCED
          ANS(3)-  CUMULATIVE SUM OF SQUARES REDUCED UP TO
                   THIS STEP
          ANS(4)-  CUMULATIVE PROPORTION OF TOTAL SUM OF
                   SQUARES REDUCED
          ANS(5)-  SUM OF SQUARES OF THE DEPENDENT VARIABLE
          ANS(6)-  MULTIPLE CORRELATION COEFFICIENT
          ANS(7)-  F RATIO FOR SUM OF SQUARES DUE TO
                   REGRESSION
          ANS(8)-  STANDARD ERROR OF THE ESTIMATE (RESIDUAL
                   MEAN SQUARE)
          ANS(9)-  INTERCEPT CONSTANT
          ANS(10)- MULTIPLE CORRELATION COEFFICIENT ADJUSTED
                   FOR DEGREES OF FREEDOM.
          ANS(11)- STANDARD ERROR OF THE ESTIMATE ADJUSTED
                   FOR DEGREES OF FREEDOM.
L     - OUTPUT VECTOR OF LENGTH K, WHERE K IS THE NUMBER OF
        INDEPENDENT VARIABLES IN THE REGRESSION EQUATION.
        THIS VECTOR CONTAINS THE NUMBERS OF THE INDEPENDENT
        VARIABLES IN THE EQUATION.
B     - OUTPUT VECTOR OF LENGTH K, CONTAINING THE PARTIAL
        REGRESSION COEFFICIENTS CORRESPONDING TO THE
        VARIABLES IN VECTOR L.
S     - OUTPUT VECTOR OF LENGTH K, CONTAINING THE STANDARD
        ERRORS OF THE PARTIAL REGRESSION COEFFICIENTS,
        CORRESPONDING TO THE VARIABLES IN VECTOR L.
T     - OUTPUT VECTOR OF LENGTH K, CONTAINING THE COMPUTED
        T-VALUES CORRESPONDING TO THE VARIABLES IN VECTOR L.
LL    - WORKING VECTOR OF LENGTH M
IER   - 0, IF THERE IS NO ERROR.
        1, IF RESIDUAL SUM OF SQUARES IS NEGATIVE OR IF THE
           PIVOTAL ELEMENT IN THE STEPWISE INVERSION PROCESS IS
           ZERO. IN THIS CASE, THE VARIABLE WHICH CAUSES THIS
           ERROR IS NOT ENTERED IN THE REGRESSION, THE RESULT
           PRIOR TO THIS STEP IS RETAINED, AND THE CURRENT
           SELECTION IS TERMINATED.
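AS AN ILLUSTRATION OF THE INPUTS D, XBAR AND IDX DESCRIBED ABOVE,
THE FOLLOWING PYTHON SKETCH BUILDS THE MEANS AND THE MATRIX OF
SUMS OF CROSS-PRODUCTS OF DEVIATIONS FROM A SMALL MADE-UP DATA
SET. THE FUNCTION NAME means_and_cross_products AND THE
ROW-PER-OBSERVATION LAYOUT ARE ASSUMPTIONS OF THE EXAMPLE. THE
STORAGE LAYOUT EXPECTED BY THE FORTRAN ROUTINE IS NOT ADDRESSED
HERE.

```python
# Illustrative sketch only; shows what D and XBAR contain.

def means_and_cross_products(x):
    """x is a list of N observations, each a list of M values.
    Returns (xbar, d): the M means and the M x M matrix of sums of
    cross-products of deviations from the means."""
    n, m = len(x), len(x[0])
    xbar = [sum(row[j] for row in x) / n for j in range(m)]
    d = [[sum((row[j] - xbar[j]) * (row[k] - xbar[k]) for row in x)
          for k in range(m)]
         for j in range(m)]
    return xbar, d

# Made-up data: N = 4 observations of M = 3 variables.
x = [[1.0, 2.0, 3.0],
     [2.0, 1.0, 5.0],
     [3.0, 4.0, 6.0],
     [4.0, 3.0, 10.0]]
xbar, d = means_and_cross_products(x)

# Codes in the sense of IDX: variables 1 and 2 available, variable 3 dependent.
idx = [0, 0, 3]
print(xbar)
print(d)
```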
REMARKS
THE NUMBER OF DATA POINTS MUST BE GREATER THAN THE NUMBER OF
INDEPENDENT VARIABLES PLUS ONE. FORCED VARIABLES
ARE ENTERED INTO THE REGRESSION EQUATION BEFORE ALL OTHER
INDEPENDENT VARIABLES. WITHIN THE SET OF FORCED VARIABLES,
THE ONE TO BE CHOSEN FIRST WILL BE THAT ONE WHICH EXPLAINS
THE GREATEST AMOUNT OF VARIANCE.
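THE TWO RULES IN THIS PARAGRAPH CAN BE SKETCHED IN PYTHON AS
BELOW. THE FUNCTION NAMES ARE HYPOTHETICAL, INDICES ARE 0-BASED,
AND THE SET OF VARIABLES ALREADY ENTERED IS KEPT SEPARATELY
RATHER THAN BY OVERWRITING IDX. COUNTING VARIABLES WITH CODES 0
AND 1 AS THE INDEPENDENT VARIABLES IS AN ASSUMPTION OF THE
EXAMPLE.

```python
# Illustrative sketch only; not the STPRG source.

def candidate_pool(idx, entered):
    """Return the 0-based indices eligible at the next step: forced
    variables (code 1) not yet entered, otherwise available ones (code 0)."""
    forced = [j for j, c in enumerate(idx) if c == 1 and j not in entered]
    if forced:
        return forced
    return [j for j, c in enumerate(idx) if c == 0 and j not in entered]

def enough_observations(n, idx):
    """True when N exceeds the number of independent variables plus one."""
    k = sum(1 for c in idx if c in (0, 1))
    return n > k + 1

idx = [0, 1, 2, 3, 1]
print(candidate_pool(idx, set()))    # forced variables are considered first
print(candidate_pool(idx, {1, 4}))   # then the freely selectable ones
print(enough_observations(5, idx))
```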
INSTEAD OF USING, AS A STOPPING CRITERION, A PROPORTION OF
THE TOTAL VARIANCE, SOME OTHER CRITERION MAY BE ADDED TO
SUBROUTINE STOUT.
SUBROUTINES AND FUNCTION SUBPROGRAMS REQUIRED
STOUT(NSTEP,ANS,L,B,S,T,NSTOP)
THIS SUBROUTINE MUST BE PROVIDED BY THE USER. IT IS AN
OUTPUT ROUTINE WHICH WILL PRINT THE RESULTS OF EACH STEP OF
THE REGRESSION ANALYSIS. NSTOP IS AN OPTION CODE WHICH IS
ONE IF THE STEPWISE REGRESSION IS TO BE TERMINATED, AND IS
ZERO IF IT IS TO CONTINUE. THE USER MUST CONSIDER THIS IF
A STOPPING CRITERION OTHER THAN VARIANCE PROPORTION IS TO
BE USED.
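THE ROLE OF SUCH AN OUTPUT ROUTINE CAN BE SKETCHED IN PYTHON AS
BELOW. THIS IS NOT A FORTRAN STOUT AND CANNOT BE LINKED WITH
STPRG. IT ONLY SHOWS THE SHAPE OF THE LOGIC. NSTOP IS RETURNED
HERE RATHER THAN PASSED AS AN ARGUMENT, LISTS ARE 0-BASED, AND
THE STOPPING RULE (A LIMIT ON THE NUMBER OF STEPS, MAX_STEPS) IS
A HYPOTHETICAL CRITERION CHOSEN FOR THE EXAMPLE.

```python
# Illustrative sketch only; a real STOUT must be written in FORTRAN.

MAX_STEPS = 2  # hypothetical limit, chosen only for illustration

def stout(nstep, ans, l, b, s, t):
    """Print the results of one step and return NSTOP:
    1 to terminate the stepwise regression, 0 to continue."""
    # nstep[3] and nstep[4] stand for NSTEP(4) and NSTEP(5).
    print("STEP", nstep[3], "VARIABLE ENTERED", nstep[4])
    for var, coef, se, tval in zip(l, b, s, t):
        print("  VARIABLE", var, "B =", coef, "S.E. =", se, "T =", tval)
    # ans[5] and ans[6] stand for ANS(6) and ANS(7).
    print("  MULTIPLE R =", ans[5], "F RATIO =", ans[6])
    return 1 if nstep[3] >= MAX_STEPS else 0

# Placeholder values, not results of any real analysis.
nstop = stout([3, 0, 0, 1, 1], [0.0] * 11, [1], [0.0], [0.0], [0.0])
print("NSTOP =", nstop)
```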
METHOD
THE ABBREVIATED DOOLITTLE METHOD IS USED TO (1) DECIDE WHICH
VARIABLES ENTER THE REGRESSION AND (2) COMPUTE REGRESSION
COEFFICIENTS. REFER TO C. A. BENNETT AND N. L. FRANKLIN,
'STATISTICAL ANALYSIS IN CHEMISTRY AND THE CHEMICAL
INDUSTRY', JOHN WILEY AND SONS, 1954, APPENDIX 6A.
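AS AN ILLUSTRATION OF HOW ENTERING A VARIABLE UPDATES THE
CROSS-PRODUCT MATRIX, THE PYTHON SKETCH BELOW USES A GAUSS-JORDAN
SWEEP. THIS IS A SUBSTITUTE TECHNIQUE CHOSEN FOR BREVITY, NOT THE
ABBREVIATED DOOLITTLE LAYOUT OF THE REFERENCE AND NOT THE STPRG
SOURCE. BOTH SOLVE THE SAME NORMAL EQUATIONS, SO THE COEFFICIENTS
FOR A GIVEN SET OF ENTERED VARIABLES AGREE. THE FUNCTION NAME
sweep, THE 0-BASED INDICES AND THE SMALL MATRIX ARE ASSUMPTIONS
OF THE EXAMPLE.

```python
# Illustrative sketch only; Gauss-Jordan sweep, not the Doolittle layout.

def sweep(a, k):
    """Sweep the square matrix a in place on pivot k (enter variable k)."""
    p = a[k][k]
    if p == 0.0:
        raise ZeroDivisionError("zero pivotal element")
    n = len(a)
    row = a[k][:]                       # pivot row before the update
    col = [a[i][k] for i in range(n)]   # pivot column before the update
    for i in range(n):
        for j in range(n):
            if i == k and j == k:
                a[i][j] = 1.0 / p
            elif i == k:
                a[i][j] = row[j] / p
            elif j == k:
                a[i][j] = -col[i] / p
            else:
                a[i][j] -= col[i] * row[j] / p

# Sums of cross-products of deviations; variable 2 is the dependent variable.
d = [[5.0, 3.0, 11.0],
     [3.0, 5.0, 5.0],
     [11.0, 5.0, 26.0]]
xbar = [2.5, 2.5, 6.0]
y = 2

sweep(d, 0)            # enter variable 0
sweep(d, 1)            # enter variable 1
b = [d[0][y], d[1][y]]                 # regression coefficients
residual = d[y][y]                     # residual sum of squares
intercept = xbar[y] - sum(bk * xbar[k] for k, bk in enumerate(b))
print(b, residual, intercept)
```

AFTER EACH SWEEP THE ENTRY D(Y,Y) HOLDS THE RESIDUAL SUM OF
SQUARES, SO THE DROP IN THAT ENTRY IS THE SUM OF SQUARES REDUCED
BY THE STEP.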