NB. Subroutines to process and solve Linear Statistical Models NB. L. Halliwell, 08Jan2007 coclass 'LSM' (;: 'z LSM') copath 'base' NB. ========================================================================================== LSMSolve =: 3 : 0 NB. Solves the Linear Statistical Model y = X*beta + e, Var[e] = s2*Phi NB. Unboxing reports with ";n{" avoids a leading singleton axis NB. To paste with blank rows between reports: "PasteToExcel ;, ([x] LSMSolve y),.a:" NB. L. Halliwell, 11Jan2007 (default_tol =. 1e_8) LSMSolve y : tolernce =. | x NB. relative tolerance for eigen significance mmult =. +/ . * NB. matrix multiplication diag =. ((2##) $ ,@([ ,"(0 1) 0"0)) NB. diagonal matrix from a rank-0-or-1 list comment =. > 0{y =. ,y NB. boxed comment (y was 2x1 matrix) model =. > 1{y NB. table containing model cols =. {. model NB. column headings table =. }. model NB. body of table Ycol =. +./ ('Y' ; 1$'Y') =/ cols NB. 1 for y column 'Model must have exactly one Y column' assert 1 = +/Ycol Xcols =. +./ ('X' ; 1$'X') =/ cols NB. 1 for x columns Vcols =. +./ ('PHI';'SIGMA') =/ cols NB. 1 for variance cols IDcols =. -. Ycol +. Xcols +. Vcols NB. 1 for remaining cols Vtype =. ~. Vcols#cols NB. <'PHI', <'SIGMA', or a: Y =. (Select =. #"1 & table) Ycol NB. dependent variable Predict =. ,(Y=a:)+.(Y=<_.)+.(<'literal')=datatype each Y NB. ways of signaling a prediction Actual =. -. Predict NB. observed Y y1 =. {.&> Actual#Y NB. unboxed observations X =. {.&> Select Xcols NB. indep vars Phi =. {.&> Select Vcols NB. variance structure (abs or rel) Labels =. Select IDcols NB. row labels Phi =. (1 >. +/Vcols) {."1 Phi ,. 1 NB. unit var if null (heteroskedastic ) Vtype =. (+./Vcols){ (<'PHI'),Vtype NB. <'PHI' if null (var relativities) X1 =. Actual#X [ X2 =. Predict#X NB. partition of X DiagVar =. 1 = 1{$Phi NB. 1 if diagonal variance structure if. DiagVar do. Phi =. diag ,Phi end. NB. change diagonal into square array 'asymmetric variance' assert (-: |:) Phi NB. check for symmetry Phi11 =. Actual # Actual #"1 Phi NB. 1st partition of variance matrix Phi12 =. Actual # Predict #"1 Phi NB. 2nd partition of variance matrix Phi21 =. Predict # Actual #"1 Phi NB. 3rd partition of variance matrix Phi22 =. Predict # Predict #"1 Phi NB. 4th partition of variance matrix erase 'Phi' NB. save space 'W lambda' =. 0 2 { Eigen Phi11 NB. eigen decompostion of Phi11 lambda =. lambda*(tolernce Signif lambda) NB. zero insignificant eigenvalues 'negative variance' assert *./ (0<:lambda) NB. stop if negative eigenvalue poseigen =. lambda > 0 NB. markers of positive eigenvals Ilambda =. poseigen } 0,:%lambda NB. g-inverses of eigenvals lambda1 =. poseigen#lambda NB. positive eigenvalues W1t =. |: W1 =. #"1&W poseigen NB. W columns corresponding to positive eigenvalues W2t =. |: W2 =. #"1&W (-. poseigen) NB. W columns corresponding to zero eigenvalues Phi11_1 =. W1 mmult (poseigen#Ilambda)*W1t NB. Moore-Penrose inverse of Phi11 PP21 =. Phi21 mmult Phi11_1 NB. intermediate calculation ProcVar =. Phi22 - PP21 mmult Phi12 NB. Process Variance (relative) lambda2 =. > 2 { Eigen ProcVar NB. eigenvalues of ProcVar maxeig2 =. >./ | lambda2 NB. largest absval eigenvalue negeig2 =. lambda2 < -tolernce*maxeig2 NB. 1 if significantly negative eigenvalue 'negative process variance' assert 0= +./ negeig2 NB. stop if ProcVar is not non-negative definite sqrPhi22 =. %: | (<0 1) |: Phi22 NB. sqrt of abs of Phi22 diagonal sigcov =. tolernce*sqrPhi22* %: <./lambda1 NB. zero significance for cov12 cov12 =. W2t mmult Phi12 NB. cov[W2t*e1, e2], should be zero test =. (| cov12) >"1 sigcov NB. 1 if any cov element significantly non-zero if. +./^:2 test do. NB. test for any non-zero ErrorReport =: Phi12;cov12;test NB. record problem 1 { 0 [ 'Phi12 improper; see ErrorReport' NB. fatal error to stop execution end. 'beta0 Q' =. SolveAxb (W2t&mmult) each X1;y1 NB. beta=beta0+Q*gamma solves W2t*y1 = Wt2*X1*beta if. (_ e. ,beta0) do. NB. _ in beta0 indicates inconsistency ErrorReport =: W2t;y1;X1 NB. record problem 1 { 0 [ 'Inconsistent contraints; see ErrorReport' NB. fatal error to stop execution end. y00 =. X1 mmult beta0 NB. y-shift required by gamma-transformation Ty1 =. (#y1) {. W1t mmult y1-y00 NB. transformed observations (0-fill for W2t rows) TX1 =. (#X1) {. W1t mmult X1 mmult Q NB. transformed design with gamma (0-filled) InvL =. (#Phi11) {. %lambda1 NB. diagonal g-inverse variance relativities XtZ =. |: InvL*TX1 NB. intermediate calculation XtZX =. XtZ mmult TX1 NB. first term of gammahat XtZy =. XtZ mmult Ty1 NB. secnd term of gammahat gammahat =. XtZy %. XtZX NB. the BLUE of gamma fitTy1 =. TX1 mmult gammahat NB. fitted values of Ty1 nonStoch =. ,. -.poseigen NB. constraint that has no random error setequal =. (Ty1=fitTy1)+.nonStoch NB. equality by J tolerance or by constraint fitTy1 =. setequal{"1 fitTy1,.Ty1 NB. set fitted to Ty1 (perfect fit) resid =. Ty1 - fitTy1 NB. residual values AFR =. Ty1 ,. fitTy1 ,. resid NB. actual, fitted, residual SSCP =. (|: InvL * AFR) mmult AFR NB. sums of squares and crossproducts SSCP =. tolernce CleanVar SSCP NB. zero insignificant covariances rhosq =. (<0 1)|: SSCP NB. diagonal of SSCP, prior to var decomp df =. (] , -/) df =. (#W1t),#gammahat NB. corresponding degrees of freedom s2hat =. rhosq%df NB. estimate of variance scale (s2hat is last) s2sel =. (Vtype = <'SIGMA') { (2{s2hat),1 NB. select relative or absolute variance scale rhosq =. (% {.) rhosq NB. variance decomp Vargamma =. s2sel* %. XtZX NB. variance of BLUE of gamma Varresid =. s2sel*diag (#Phi11){.lambda1 NB. begin calc Varresid =. Varresid - TX1 mmult Vargamma mmult|: TX1 NB. end calc of Var[e hat] Varresid =. tolernce CleanVar Varresid NB. zero insignificant covariances diagres =. , (<0 1) |: Varresid NB. diagonal elements (variances) mask =. *./~ tolernce Signif diagres NB. significance-indicators of var matrix Varresid =. mask * Varresid NB. insignificant rows/cols zeroed betahat =. beta0 + Q mmult gammahat NB. BLUE of beta Varbeta =. Q mmult Vargamma mmult |: Q NB. Variance of BLUE of beta (absolute) Varbeta =. tolernce CleanVar Varbeta NB. zero insignificant covariances diagVb =. , (<0 1) |: Varbeta NB. significance-indicators of variances mask =. *./~ tolernce Signif diagVb NB. significance-indicators of var matrix Varbeta =. mask * Varbeta NB. insignificant rows/cols zeroed Stdbeta =. ,. %: (<0 1)|: Varbeta NB. StdError of BLUE of beta AFR0 =. (y00 ,. y00 ,. 0) + W mmult AFR NB. AFR for original (untransformed) model setnonS =. (*: W) mmult setequal,.nonStoch NB. setequal and nonStoch for orig model nonStoch =. 0 1#"1 setnonS NB. nonStoch for orig model (continuum [0,1]) preveql =. 1 0#"1 setnonS NB. equality after transform (continuum) y1deriv =. 1 0 0#"1 AFR0 NB. actual column from AFR0 fity1 =. 0 1 0#"1 AFR0 NB. fitted column from AFR0 setequal =. (fity1 = y1)+.(fity1=y1deriv) NB. new equality by J tolerance (boolean) setequal =. setequal +. (1=preveql) NB. equality by J tolerance or by constraint fity1 =. setequal{"1 fity1,.y1 NB. set fitted to y1 (perfect fit) AFR0 =. y1 ,. fity1 ,. res0 =. y1-fity1 NB. AFR0 corrected for rounding errors Varres0 =. W mmult Varresid mmult |: W NB. Variance of residuals of original model diagres0 =. (<0 1)|: Varres0 NB. diagonal elements (variances) diagres0 =. (* tolernce & Signif) diagres0 NB. zero insignificant variances Stdres0 =. ,. %: diagres0 NB. StdErr of resids of original model tresid =. res0 % Stdres0 NB. studentized residuals Div00 =. (res0=0)+.(Stdres0=0) NB. indicator of 0%0 tresid =. (Div00*.nonStoch<1){"1 tresid,._. NB. stochastic 0/0 is indeterminate XX21 =. X2 - PP21 mmult X1 NB. intermediate calculation y2hat =. (X2 mmult betahat)+(PP21 mmult 0 0 1 #"1 AFR0) NB. BLUE of prediction ParmVar =. XX21 mmult Varbeta mmult |: XX21 NB. parameter variance y2hatVar =. ParmVar+s2sel*ProcVar NB. variance of predition error y2hatVar =. tolernce CleanVar y2hatVar NB. zero insignificant covariances diagVb =. , (<0 1) |: y2hatVar NB. significance-indicators of variances mask =. *./~ tolernce Signif diagVb NB. significance-indicators of var matrix y2hatVar =. mask * y2hatVar NB. insignificant rows/cols zeroed y2hatStd =. ,. %: (<0 1)|: y2hatVar NB. StdError of predition error report =. 5 1$a: NB. empty report (5x1 matrix) report =. (.xdrop) }. RawTable NB. selected rows RawTable =. ytake {."1 (0>.ydrop) }."1 RawTable NB. selected rows RTabtrim =. -.&' ' each RawTable NB. delete spaces loctitle =. ((-:&'Title:')+.(-:&'TITLE:')) @ (6&{.) &> RTabtrim NB. possible title title =. deb_jijs_ @ (6&}.) each (,loctitle) # ,RawTable NB. extract possible titles title =. {. (~:&a: title) # title NB. take first non-blank title x =. (x = a:) { x,title NB. override blank comment with title ysrch =. +./"1 ('y';'Y';(1$'y');(1$'Y')) e."1 RTabtrim Xsrch =. +./"1 ('x';'X';(1$'x');(1$'X')) e."1 RTabtrim datafield =. (+./\ ysrch *. Xsrch) NB. possible data 'yX search failed'] (0={:datafield) {0 NB. fatal error Table =. datafield # RawTable NB. top row has names NullCol =. *./ =&a: Table NB. Null columns NullColIx =. (# i.&#) NullCol NB. Null ColID indices Table =. (<'NoName') NullColIx"_} Table NB. Label null columns ColID =. toupper @: (-.&' ') each {. Table NB. Column IDs ColID =. ((~:&a:)@] { ,) /\ ColID NB. Fill null IDs from the left ColID =. (DelNC =. (-. NullCol)&#"1) ColID NB. Delete IDs of null columns Table =. DelNC Table NB. Delete null columns Table =. (*./\@ (+./"1@(~:&a:)) # ]) Table NB. Delete from first empty row Xsrch =. +. / ('X';1$'X') =/ ColID NB. Search for X labels Xcols =. }. Xsrch #"1 Table NB. Extract X columns; drop labels numeric =. 3!:0 each 1;_2;%:2 NB. Numeric datatypes validX =. e.&numeric 3!:0 each Xcols NB. Numeric cells in X columns validX =. (~:&a: Xcols) *. validX NB. and non-empty X cells validX =. *. /\ *./"1 validX NB. X rows until first invalid cell Table =. validX # }. Table NB. ID-less and trimmed to validX t =. # Table NB. # of observations or predictions ysrch =. +. / ('Y';1$'Y') =/ ColID NB. Search for y labels y =. {."1 ysrch #/"1 Table NB. DepVar (only one allowed) X =. Xsrch #/"1 Table NB. InDepVar Fsrch =. +. / ('PHI' ;'F';1$'F') =/ ColID NB. Search for Phi labels Ssrch =. +. / ('SIGMA';'S';1$'S') =/ ColID NB. Search for Sigma labels 'Variance Ambiguous'] ((+./Fsrch)*.(+./Ssrch)){0 NB. fatal error ColID =. (<'PHI' ) (I. Fsrch)} ColID NB. Standard rel-var labels ColID =. (<'SIGMA') (I. Ssrch)} ColID NB. Standard abs-var labels Var =. (Vsrch =. Fsrch +. Ssrch)#"1 Table NB. Provisional variances n0var =. +/ Vsrch NB. actual number of variance columns n1var =. (n0var>:#y) { (1<.n0var),#y NB. allowed " " " " VarCols =. Vsrch * +/\ Vsrch NB. 0 if ~var, sequence for var cols keepcols =. VarCols <: n1var NB. excessive var columns get 0 ColID =. keepcols # ColID NB. drop var labels Table =. keepcols #"1 Table NB. drop var columns (: (|x)*{:cumul NB. ignore lowest x portion of total signif (order{ i.#y)}signif NB. put markers in y-order ) CleanVar =: 3 : 0 NB. Zeroes negligibly small covariance elements to clean up a matrix (default =. 1e_8) CleanVar y : sdiag =. %: | (<0 1) |: y NB. sqrt of abs of diagonal nonzero =. (*:y) > (*/ ~ x*sdiag) NB. CorrCoef tolerantly non-zero nonzero =. (|: +. ]) nonzero NB. symmetry; either side can pass nonzero =. 1 (<"1 |: i. &> $ nonzero)} nonzero NB. preserve the diagonal elements nonzero*y ) SolveAxb =: 3 : 0 NB. Solution set for x of Ax=b, where A is (jxk), x (kxl), b (jxl). NB. y is A;b. x is the relative tolerance for consistency. (default =. 1e_6) SolveAxb y : 'A b' =. 2{.y NB. right argument tol =. 0 >. x NB. left argument assert 2 = #$A NB. A must be a matrix assert 2 = #$b NB. b must be a matrix 'j k' =. $A NB. A has j rows and k columns 'j1 l' =. $b NB. b has j1 rows and l columns assert j = j1 NB. conformity Ab =. A ,. b NB. A,b tableau Ab =. ~. Ab NB. duplicate equations (rows) eliminated zerorow =. *./"1 =&0 Ab NB. 1 if row all zero Ab =. (-. zerorow)#Ab NB. zero rows eliminated A =. k {."1 Ab NB. first k columns for A b =. (-l){."1 Ab NB. last l columns for b j =. 0{$A NB. A and b now have new j rows 'V Lambda W' =. SVD A NB. singular-value decomposition lambda =. (diag =. (<0 1)&|:) Lambda NB. diagonal of Lambda nonzero =. lambda ~: 0 NB. non-zero eigenvalues s =. +/ nonzero NB. number of ~:0 eigenvals, or rank(A) nzrows =. j {. nonzero NB. non-zero row IDs nzcols =. k {. nonzero NB. non-zero col IDs Lambd11 =. nzcols #"1 nzrows # Lambda NB. square non-singular submatrix of Lambda V1 =. nzrows #"1 V NB. column-wise partition of V V2 =. (-. nzrows) #"1 V W1 =. nzcols # W NB. row-wise partition of V W2 =. (-. nzcols) # W V1t =. |: V1 [ V2t =. |: V2 NB. transposes of V partitions W1t =. |: W1 [ W2t =. |: W2 NB. transposes of W partitions mmult =. +/ . * NB. matrix product GinvA =. W1t mmult V1t %. Lambd11 NB. Moore-Penrose inverse of A, viz. A+ x0 =. GinvA mmult b NB. base solution of Ax=b, viz. (A+)b error =. b - A mmult x0 NB. zero error if Ax=b consistent dist_e =. %: +/ *: error NB. distances of error columns from origin dist_b =. %: +/ *: b NB. distances of depvar columns from origin nosolve =. dist_e > tol*dist_b NB. 0 if error tolerably zero; 1 otherwise NB.'Inconsistent Equations' assert 0 = +. / nosolve nocols =. ./0, -(<&0 # ]) lambda2 NB. zero tolerance from negatives lambda2 =. 0 >. lambda2 NB. small negatives possible, set to 0 lambda =. %: lambda2 NB. square roots since really L^2 zerotol =. zerotol >. (x * >./lambda) NB. maxed with default tolerance rlambda =. lambda > zerotol NB. 1 if tolerably non-zero eigenvalue lambda =. rlambda * lambda NB. small eigenvalues set to zero V1t =. |: rlambda #"1 V NB. columns of V matching non-zero eigenvalues diag =. <@(2&#)"0 i. Trn{m,n NB. indices of diagonal elements Lambda =. lambda diag} Lambda NB. eigenvalues of y Lambda1 =. (m{.rlambda) # (n{.rlambda) #"1 Lambda W1 =. (V1t %. Lambda1) mmult matrix NB. first s rows of W, need to find n-s rows QF =. |: mmult ] NB. quadratic form XtX nrows =. m >. n NB. rows of W are max(m,n) ident =. Iden nrows NB. I(nrows) identity row =. s =. +/ rlambda NB. start with row with index s (0,1,...,nrows-1) W =. W1 NB. start W while. (row < nrows) do. N =. ident - QF W NB. symmetric idempotent orthogonal to W diagN =. (<0 1) |: N NB. diagonal of N, sums to rank(N) maxN =. (i. >./) diagN NB. index of largest diagonal element newW =. maxN { N NB. vector selected from N newW2 =. *: newW NB. squared elements not0 =. newW2 >: *: x NB. non-zero elements newW =. (not0*newW)% %: +/not0*newW2 NB. re-orthogonalized W =. W , newW NB. add new vector to W row =. >: row NB. increment row end. 'V W' =. (|.^:Trn) (|:^:Trn) each V;W end. V;Lambda;W ) Eigen =: 3 : 0 NB. Jacobi method of eigen-decomposition NB. Matrix assumed to be symmetric, not tested NB. Test statement: "result =. Eigen sigma =. -: (transpose + [)sigma =. (2#n) $ _10 + ? (*: n =. 100) # 20" (default =. 1e_10) Eigen y : mmult =. +/ . * trans =. |: Sigma =. y toler =. x toler2 =. *: toler m =. 0 { $Sigma n =. 1 { $Sigma rowindex =. n # i. m colindex =. (m*n) $ i.n diagonal =. , (i. m) (=/) (i. n) NB. diagonal elements upper =. , (i. m) (:n)*normd)*toler2) < (2*normu) do. sigmau2 =. *: sigmau maxu2 =. >. / sigmau2 maxid =. sigmau2 i. maxu2 p =. (maxid { rowidu) NB. Zero this element q =. (maxid { colidu) NB. Zero this element pq =. p,q r =. (i.n) -. pq spp =. (:iter end. order =. \: sigmad NB. order by descending eigenvalues sigmad =. order { sigmad W =. order {"1 W Sigma =. order { (order {"1 Sigma) W; Sigma; sigmad; iter; normd; normu )