Modified SAS macro from Kowalski and Tu (2001) for analyzing repeated ordinal/categorical data with incompatible response and/or covariate data formats and/or covariate measurement error using GEE NOTE A: -------- Modifications were made and permission was obtained to make available our extended version of the initial SAS macro from Lipsitz, Kim and Zhao (Stat. In Med. 1994) for analyzing repeated categorical data using GEE. NOTE B: -------- If a user finds any errors or the need to add/modify parts of this SAS macro, please document any changes below before distributing it for use by others and obtain appropriate permissions. Documentation of Additional Features by Kowalski and Tu (2001): =============================================================== For the following options: 0=no (absence) and 1=yes (presence) CMEFLG = indicates presence of covariate measurement error into model CMEV = refers to name of variable for empirical Bayes variance, i.e. for Var(xit|zitobs) NUMCME = number of covariates with measurement error. The program is currently restricted to only allowing one covariate with measurement error which should be listed at the end of the covariates (x), i.e. list all covariate without measurement first EBFLG = indicates whether an empricial Bayes estimate is being used as one of the covariates (used mostly for error checking) LAPPX = indicates whether or not to use a probit approximation to the logit, rather than the logit link (this is done for covariate measurement error models which require a logit link). Note that should use link=cprobit when using this option. However, do not need to specify it for including covariate measurement error. It will use the approximation by default in this case. RIF = indicates presence of incompatible response formats RIFVAR = refers to name of varible which flags records to indicate that the older format is observed, in which case the response contains a predicted probabilities. RRELAT = refers to the variable names given to response relational model parameter estimates, {gam01 gam02 gam1 gam2 btpos btind} CELLPROB = refers to variable names given to the cell probabilities calculated based on the response relational model. (Be sure that the ordinal structure used for the relational model matches the one used to model the probabilities for the model of interest so that the interpretation of parameter estiamtes is correct.). CIF = indicates presence of incompatible covariate formats CIFVAR = refers to name of varible which flags records (with a 1) to indicate that the older covariate format is observed for a given individual at a given time. MEPARM = refers to the names of the variables given to the measurement error parameter estimates, {muxt, sigxt, siget} CRELAT = refers to the names of the variables given to the covariate relational model parameter estimates, {eta0,eta1,mseon}; COVMSEV= estimates of the variance of mse within each group (negatives, indeterminates, positives) obtained in covariate relational sample REPDAT = refers to the names of the variables given to the number of replicates based on the observed covariate format for a given individual at a given time and the average (over the replicates) of the observed covariate format, i.e. {nrep,muz}. QX = number of parameters for covariate relational model CRELATV = estimate of asymptotic covariance corresponding to parameter estimates based on response relational model. Should be input as a vector to be read in horizontally. QY = number of parameters for response relational model RRELATV = estimate of asymptotic covariance corresponding to parameter estimates based on response relational model. Should be input as a vector to be read in horizontally. VARFLG = indicates whether or not variance should be modeled according to groups, time, etc. Varmod = refers to the names of the variables indicating which groups the variance should be modeled for and at which time points within each group. Note that if there is no time-dependent structure within a particular group, then the corresponding time-dependent variable should be set to 0. ssrelat = dataset indication sample sizes used in each relational sample nrx,nry /* BEGIN SAS MACRO */ %macro delem(reg,num); %do k=1 %to # %let covk=%scan(®,&k); if &covk=. then delete; %end; %mend delem; %macro multrpmd(data=_last_,cmeflg=cmeflg,cmev=cmev,meparm=meparm, ssrelat=ssrelat,numcme=numcme,y=y, rif=rif,rifvar=rifvar, rrelat=rrelat,qy=qy,rrelatv=rrelatv, cellprob=cprob, x=x,cif=cif,cifvar=cifvar, crelat=crelat,crelatv=crelatv,covmsev=covmsev, repdat=repdat,ebflg=ebflg, varflg=1,varmod=varmod, id=id,maxit=50, corr=1,time=,print=no,crit=.000001,link=clogit, lappx=lappx,it_hist=last, c1=,title1=,c2=,title2=, format=2.0); %* id = 1 to n; %* time = time of observation; %* y = ordinal response ; %* corr = 1 = independence = 2 = exchangeable = 3 = banded (must use time variable) = 4 = unstructured (must use time variable or have balanced data); %if &corr=2 %then %do; %let covj= ; %end; %*count the covariates; %let p=0; %let ordt=0; %do %while(%scan(&x,&p+1)^=); %let p=%eval(&p+1); %end; %if (&rif=1) %then %do; data predy(keep=&cellprob &id &x &cmev &meparm &crelat &rrelat &repdat &rifvar &cifvar &time &varmod) discy; set &data; if (&rifvar=1) then do; %delem(&x,&p); output predy; end; if (&rifvar=0) then do; one=1; if &y=. then delete; %delem(&x,&p); output discy; end; data one; set discy predy; data resasymv; set &rrelatv; %end; %else %do; data one; set &data; if &y=. then delete; %delem(&x,&p); run; %end; proc sort data=one; by &id; run; data one; set one; by &id; if first.&id then idm+1; run; proc sort data=one; by &id; run; *USE CORRECT SORTING OF RESPONSE VAR FOR INTERPRETATION; /* proc sort data=one out=mindat; by descending y; run; */ %if &link=clogit %then %do; proc logistic outest=esti; model &y=&x / link=logit; run; %end; %if &link=cprobit %then %do; %if &rif=1 %then %do; proc sort data=discy; by &id; run; proc logistic data=discy outest=esti; model &y=&x / link=normit; run; %end; %else %do; proc logistic data=one outest=esti; model &y=&x / link=normit; run; %end; %end; %if &link=logit %then %do; proc catmod data=one ; response logits/ outest=esti; direct &x; model &y=&x / ml nogls noprofile noiter nodesign; run; %end; %if &link=logit %then %do; data par(drop=_type_ _name_); set esti; if (_type_ ne 'PARMS') then delete; run; %end; %else %do; data par(drop=_type_ _name_ _lnlike_); set esti; if (_type_ ne 'PARMS') then delete; run; %end; data x(keep=&x); set one; length &x 8; run; %if &cmeflg=1 %then %do; data cmev(keep=&cmev); length &cmev 8; set one; run; data meparm(keep=&meparm); set one; run; %end; %if &cif=1 %then %do; data crelat(keep=&crelat); set one; run; data covasymv; set &crelatv; run; data coldind(keep=&cifvar); set one; run; data covmsev; set &covmsev; run; data ssrelat; set &ssrelat; run; %end; %if &rif=1 %then %do; data rrelat(keep=&rrelat); set one; run; data roldind(keep=&rifvar); set one; run; %end; data repdat(keep=&repdat); set one; run; data varmod(keep=&varmod); set one; run; %if &rif=1 %then %do; proc freq data=discy; tables &y/out=new noprint; run; %end; %else %do; proc freq data=one; tables &y/out=new noprint; run; %end; data new(keep=&y ordy); set new; ordy+1; run; data y(keep=&y junk &cellprob); set one; junk+1; run; %if &rif=0 %then %do; proc sort data=y; by &y; run; data y(keep=junk ordy); merge y new; by &y; run; %end; proc sort data=y out=y(drop=junk); by junk; run; data id(keep=idm); set one; run; %if &time^= %then %do; proc freq data=one; tables &time /out=new noprint; run; data new (keep=&time ordt); set new; ordt+1; run; data occas ( keep = &time junk); set one; junk+1; run; proc sort data= occas; by &time; run; data occas(keep=junk ordt); merge occas new; by &time; run; proc sort data=occas out=occas(drop=junk); by junk; run; %end; proc datasets; delete one new mindat; run; proc iml worksize=5000; *reset nolog noprint; corr = &corr; print "STARTING IML CODE"; /*initial marginal parameters */ USE PAR; READ ALL INTO BETA; beta=beta`; nbeta = nrow(beta); USE Y; READ ALL INTO Y; %if &rif=1 %then %do; maxy=max(y[,1]); USE RESASYMV; READ ALL INTO RESASYMV; USE RRELAT; READ ALL INTO RRELAT; USE ROLDIND; READ ALL INTO ROLDIND; %end; %else %do; maxy = max(y); %end; /* # levels of multinomial */ %if &link^=logit %then %do; trans = j(maxy-1,maxy-1,1); /* cumulative transform matrix */ do r = 1 to maxy-1; do c = 1 to maxy-1; if c > r then do; trans[r,c]=0; end; end; end; transinv =inv(trans); %end; USE X; READ ALL INTO X; USE REPDAT; READ ALL INTO REPDAT; %if &cmeflg=1 %then %do; USE CMEV; READ ALL INTO CMEV; USE MEPARM; READ ALL INTO MEPARM; qm=ncol(meparm); %end; %if &varflg=1 %then %do; USE VARMOD; READ ALL INTO VARMOD; *maxvgrp=max(varmod[,1]); /* Number of variance groups */ maxvgrp=3; totqm=maxvgrp*qm; /* Total number of parameters for sigma delta (covariance of m.e. parameters) */ maxvgrpt=J(maxvgrp,1,0); *do jk=1 to maxvgrp; /* Number of Levels within each variance group */ * maxvgrpt[jk,1]=max(varmod[,1]=jk); *end; *print"maxvgrpt=" maxvgrpt; %end; %if &cif=1 %then %do; USE SSRELAT; READ ALL INTO SSRELAT; nrx=ssrelat[,1]; nry=ssrelat[,2]; USE CRELAT; READ ALL INTO CRELAT; USE COVASYMV; READ ALL INTO COVASYMV; qx=nrow(covasymv)*maxvgrp; nqxprm=qx/maxvgrp; chkit=nrow(covasymv); cavnlo1=covasymv[,1:nqxprm]; cavnlo2=covasymv[,nqxprm+1:nqxprm+2]; cavnlo3=covasymv[,nqxprm+3:nqxprm+4]; crelpnlo=covasymv[,nqxprm+5:nqxprm+7]; crelpoln=covasymv[,nqxprm+8:nqxprm+10]; *covrelav=block(cavnlo1,cavnlo2,cavnlo3); USE COVMSEV; READ ALL INTO COVMSEV; blk1=block(cavnlo1,covmsev[1,1]); blk2=block(cavnlo2,covmsev[1,2]); blk3=block(cavnlo3,covmsev[1,3]); v3nlo=blk1//blk2//blk3; covrelav=block(blk1,blk2,blk3); totqx=nrow(covrelav); msenlo=covmsev[1,4]||covmsev[1,5]||covmsev[1,6]; mseoln=covmsev[2,4]||covmsev[2,5]||covmsev[2,6]; USE COLDIND; READ ALL INTO COLDIND; %end; nall =nrow(x); /* number records in dataset, times*ind */ USE ID; READ ALL INTO ID; n = max(id); /* number of indiv. (clusters) */ npair =0; /* number of pairs of times */ maxt = 0; /* maximum # times an indiv was seen */ do i=1 to n; times = loc(id=i); times = ncol(times); npair = npair + times#(times-1); if times > maxt then maxt=times; end; Imaxt = I(maxt); %if &time^= %then %do; USE occas; READ ALL INTO occas; %end; crit=1; Do it=1 to &maxit while (crit > &crit ); *variables used for estimating sigma delta; numoldt=J(maxvgrp,1,0); numnewt=J(maxvgrp,1,0); voldt=J(totqm,maxvgrp,0); vnewt=J(totqm,maxvgrp,0); avgrept=J(maxvgrp,1,0); aoldt=J(maxvgrp,1,0); boldt=J(maxvgrp,1,0); U= J(nbeta,1,0); dvd= J(nbeta,nbeta,0); usq = dvd; G=J(nbeta,totqx,0); F=J(nbeta,&qy,0); H=J(nbeta,qm,0); %if (&varflg=1) %then %do; H=J(nbeta,totqm,0); %end; /* ***************** Correlation matrix ******************** */ if corr > 1 then do; R=j(maxt#(maxy-1),maxt#(maxy-1),0); obs=R; Do i=1 to n; free Y_i X_i p_i A_i Yvar_i Xvar_i; times = loc(id=i); T_i = ncol(times); Yvar_i = Y[times,]; Xvar_i = X[times,]; %if &time^= %then %do; occas_i = occas[times,]; %end; A_i = 0; Do j=1 to T_i; Y_it = j(maxy,1,0); Y_it[ Yvar_i[j,] , 1 ] = 1; Y_it = Y_it[1:maxy-1,1]; Y_i = Y_i // Y_it; %if &link^=logit %then %do; X_it = j(maxy-1,1,1) @ Xvar_i[j,]; X_it = I(maxy-1) || X_it; X_i = X_i // X_it; %if &link=clogit %then %do; mp_it = exp(X_it*beta)/(1 + exp(X_it*beta) ); %end; %if &link=cprobit %then %do; mp_it = probnorm(X_it*beta); %end; p_it = transinv*mp_it; %end; p_i = p_i // p_it; A_it = diag(p_it) - p_it*p_it`; A_i = Block(A_i,A_it); end; nrA = nrow(A_i); A_i = A_i[2:nrA,2:nrA]; call eigen(M,ev,A_i); e_i = inv(ev*sqrt(DIAG(M))*ev`)*(Y_i - p_i); %if &time= %then %do; e_i = shape(e_i, maxt#(maxy-1), 1, 0); %end; %else %do; I_i = Imaxt[,occas_i`]; I_i = I_i @ I(maxy-1); e_i = I_i*e_i; obs_i = I_i[,+]; obs = obs + obs_i*obs_i`; %end; R = R + e_i*e_i`; end; end; if corr = 4 then do; %if &time= %then %do; R = R/(n-nbeta); %end; %else %do; R = R/(obs-nbeta); %end; ch = 1 - ( (I(maxt)) @ j(maxy-1,maxy-1,1) ); R = ch#R + I(nrow(R)); end; if corr = 3 then do; free co; do ti = 2 to maxt; ch = j(maxy-1,maxt#(maxy-1),0); ch[, (ti-1)#(maxy-1)+1 : ti#(maxy-1) ] = j(maxy-1,maxy-1,1); ch = toeplitz(ch); R_st = ch#R; obs_st = ch#obs; ch = j(maxt,1,1) @ i(maxy-1) ; R_st = (ch`*R_st*ch/2)/(ch`*obs_st*ch/2-nbeta); co = co || R_st; end; R = I(maxy-1) || co; R = toeplitz(R); end; if corr = 2 then do; ch = 1 - ( (I(maxt)) @ j(maxy-1,maxy-1,1) ); R = ch#R; ch = j(maxt,1,1) @ i(maxy-1) ; R_st = (ch`*R*ch/2)/(npair/2-nbeta); end; /* **************END Correlation matrix ******************** */ Do i=1 to n; free Y_i X_i p_i D_i A_i Yvar_i Xvar_i; times = loc(id=i); T_i = ncol(times); Xvar_i = X[times,]; %if (&varflg=1) %then %do; free Varg_i Vargt_i Dmeg1_i Dmeg2_i Dmeg3_i Deta1_i Deta2_i Deta3_i; Varg_i=Varmod[times,1]; Vargt_i=Varmod[times,2]; %end; %if (&rif=1) %then %do; free ROLD_i YOLD_i ygam F_i Dgam_i; Yvar_i = Y[times,1:3]; Yold_i = rrelat[times,1:2]; ygam = rrelat[times,3:6]; ygam = ygam[1,]; ygam = ygam`; ROLD_i = ROLDIND[times,]; %end; %else %do; Yvar_i = Y[times,]; %end; %if (&cmeflg=1) %then %do; free CMEV_i CMEF_i MEPARM_i REPDAT_i H_i Dme_i; CMEvar_i = CMEV[times,]; MEPARM_i = MEPARM[times,1:3]; REPDAT_i = REPDAT[times,1:3]; %end; %if (&cif=1) %then %do; free CRELAT_i COLD_i G_i Deta_i; CRELAT_i = CRELAT[times,1:3]; COLD_i = COLDIND[times,]; %end; %if &time^= %then %do; occas_i = occas[times,]; %end; A_i = 0; timeflgb=0; Do j=1 to T_i; Y_it = j(maxy,1,0); %if &rif=1 %then %do; if (Yvar_i[j,1] ^= .) then do; Y_it[ Yvar_i[j,1] , 1 ] = 1; Y_it = Y_it[1:maxy-1,1]; Y_i = Y_i // Y_it; end; else do; do s=2 to maxy; Y_it[s-1,1]=Yvar_i[j,s]; end; Y_it = Y_it[1:maxy-1,1]; Y_i = Y_i // Y_it; end; %end; %else %do; Y_it[ Yvar_i[j,] , 1 ] = 1; Y_it = Y_it[1:maxy-1,1]; Y_i = Y_i // Y_it; %end; %if (&varflg=1) %then %do; *Variable indicating which group at each time; varg_it=Varg_i[j,1]; *Varible indicating which time within a variance group; vargt_it=Vargt_i[j,1]; %end; %if &link^=logit %then %do; X_it = j(maxy-1,1,1) @ Xvar_i[j,]; X_it = I(maxy-1) || X_it; X_i = X_i // X_it; cappx=(15#(2#arcos(0)))/(16#sqrt(3)); c2appx=cappx##2; *Locate covariate with m.e.; covwme=nbeta-(maxy-1); timeflg=.; timeflga=.; *timeflgb=.; %if &cmeflg=1 %then %do; if (vargt_it ^= .) then do; *For modeling the variance between groups which does not depend on t, vargt_it=0; if (it=1 & j=2 & vargt_it=0) then timeflg=vargt_it; *For modeling the variance within a group which depends on t, vargt_it ne 0; if (vargt_it > 0) then do; *if (it=1) then do; timeflg=vargt_it; timeflga=vargt_it; if (j=1) then timeflgb=timeflga; varg_it=vargt_it; *end; end; end; best=beta[maxy,]; a1est=beta[1,]; a2est=beta[2,]; alphaest=a1est // a2est; CMEM_it = Xvar_i[j,covwme]; CMEV_it = CMEvar_i[j,]; CMEV_i = CMEV_i // CMEV_it; CMEF_it = 1/(sqrt(c2appx+((best##2)#CMEV_it))); CMEF_i = CMEF_i // CMEF_it; mux_it=meparm_i[j,1]; sige_it=meparm_i[j,2]; sigx_it=meparm_i[j,3]; muz_it=repdat_i[j,1]; r_it=repdat_i[j,2]; muzvarit=repdat_i[j,3]; sigz_it=((1/r_it)#sige_it)+sigx_it; sigzlxit=((1/r_it)#sige_it); a_it=(((1/r_it)#sige_it)#mux_it)/sigz_it; b_it=sigx_it/sigz_it; %end; %if &cif=1 %then %do; cnlo1_it=crelat[j,1]; cnlo2_it=crelat[j,2]; cnlov_it=crelat[j,3]; cold_it=cold_i[j,]; %if &cmeflg=1 %then %do; GFACT_it=b_it/CMEF_it; %end; %end; *Code for calculating sigma delta within each group, regardless of time; if (timeflg ^= .) then do; zerofill=J(maxvgrp,maxvgrp,0); avgrept[varg_it,]=avgrept[varg_it,]+r_it; if (cold_it=0) then do; numnewt[varg_it,]=numnewt[varg_it,]+1; antmp=muzvarit-sige_it; bntmp=muz_it-mux_it; cntmp=((muz_it-mux_it)##2)-sigz_it; vnewit=bntmp//antmp//cntmp; nonzero1=vnewit*vnewit`; if (varg_it=1) then addvnew=nonzero1//zerofill//zerofill; if (varg_it=2) then addvnew=zerofill//nonzero1//zerofill; if (varg_it=3) then addvnew=zerofill//zerofill//nonzero1; vnewt=vnewt+addvnew; end; if (cold_it=1) then do; numoldt[varg_it,]=numoldt[varg_it,]+1; aoldt[varg_it,]=aoldt[varg_it,]+muz_it; boldt[varg_it,]=boldt[varg_it,]+muzvarit; aotmp=muzvarit-((crelpoln[2,varg_it]##2)#(sige_it+msenlo[,varg_it])); botmp=muz_it-mux_it; cotmp=((muz_it-mux_it)##2)-((crelpoln[2,varg_it]##2)#(sigz_it+((1/r_it)#msenlo[,varg_it]))); voldit=botmp//aotmp//cotmp; nonzero2=voldit*voldit`; if (varg_it=1) then addvold=nonzero2//zerofill//zerofill; if (varg_it=2) then addvold=zerofill//nonzero2//zerofill; if (varg_it=3) then addvold=zerofill//zerofill//nonzero2; voldt=voldt+addvold; end; end; %if &rif=1 %then %do; rold_it=rold_i[j,]; %end; %if &link=clogit %then %do; mp_it = exp(X_it*beta)/(1 + exp(X_it*beta) ); %end; %if &link=cprobit %then %do; %if &cmeflg=1 %then %do; check_it=X_it*beta; mecov_it = (X_it*beta)#CMEF_it; mp_it = probnorm(mecov_it); %end; %else %do; %if &lappx=1 %then %do; mp_it = probnorm((X_it*beta)#(1/cappx)); %end; %else %do; mp_it = probnorm(X_it*beta); %end; %end; *USE FOR EMP BAYES LOGIT APPX; %if (&ebflg=1 | &rif=1) %then %do; do jk=1 to maxy-1; if (mp_it[jk,] < 0.0000001) then mp_it[jk,]=0.000001; if (mp_it[jk,] > 0.9999999) then mp_it[jk,]=0.999999; end; %end; %end; p_it = transinv*mp_it; %end; p_i = p_i // p_it; do jk=1 to maxy-1; if (p_it[jk,] < 0.0000001) then p_it[jk,]=0.000001; if (p_it[jk,] > 0.9999999) then p_it[jk,]=0.999999; end; A_it = diag(p_it) - p_it*p_it`; A_i = Block(A_i,A_it); %if &link=clogit %then %do; D_it = X_it`*Diag( mp_it#(1-mp_it) )*transinv`; %end; *Derivative needed for response incompatible formats, based on using cumulative logit model for relating the two responses; %if &rif=1 %then %do; Dgam_it=J(maxy-1,&qy,0); if rold_it=1 then do; yold_it = j(maxy-1,1,1) @ yold_i[j,]; yold_it = I(maxy-1) || yold_it; ycp_it = exp(yold_it*ygam)/(1 + exp(yold_it*ygam) ); Dgam_it = yold_it`*Diag( ycp_it#(1-ycp_it) )*transinv`; Dgam_it = Dgam_it`; end; %end; %if &link=cprobit %then %do; nrmden = (1/sqrt( 4#arcos(0) ))#exp( (-1/2) # ( (mecov_it)##2 ) ); *Derivative needed for covariate incompatible formats, based on using probit approximation to logit for model of interest; %if &cif=1 %then %do; Deta_it=J(maxy-1,nqxprm+1,0); if cold_it=1 then do; tmpgit=nrmden[1,] // (nrmden[2,]-nrmden[1,]); sigfact=-0.5#b_it#(1/r_it)#(CMEF_it##2); Deta_it=(best#tmpgit) || (muz_it#tmpgit) || (sigfact#tmpgit) ; Deta_it=(b_it#CMEF_it)#Deta_it; end; %end; %if &cmeflg=1 %then %do; D_it = X_it`*Diag(( 1/sqrt( 4#arcos(0) ) )#exp( (-1/2) # ( (mecov_it)##2 ) ))*transinv`; *do jk=1 to maxy-1; * if (nrmden[jk,] < 0.0000001) then nrmden[jk,]=0.000001; *end; *count number of covariates without m.e.; covwome=nbeta-&numcme; D_it[nbeta,1]=nrmden[1,]#((c2appx#CMEM_it)-(best#a1est#CMEV_it)); D_it[nbeta,2]=(nrmden[2,]#((c2appx#CMEM_it)-(best#a1est#CMEV_it)))-D_it[nbeta,1]; cmefact=(CMEF_it##3); D_it[nbeta,1]=D_it[nbeta,1]#cmefact; D_it[nbeta,2]=D_it[nbeta,2]#cmefact; do jk=1 to covwome; do st=1 to maxy-1; D_it[jk,st]=D_it[jk,st]#CMEF_it; end; end; *Calculate derivative of mean wrt measurement error parameters; Dmmu_it=sigzlxit/sigz_it; Dmsx_it=(sigzlxit/(sigz_it##2))#(muz_it-mux_it); Dvsx_it=(sigzlxit##2)/(sigz_it##2); Dmse_it=(((1/r_it)#sigx_it)/(sigz_it##2))#(mux_it-muz_it); Dvse_it=((1/r_it)#(sigx_it##2))/(sigz_it##2); %if &cif=1 %then %do; if cold_it=1 then do; Dmsx_it=(sigzlxit/(sigz_it##2))#((cnlo1_it+cnlo2_it#muz_it)-mux_it); Dvsx_it=Dvsx_it+((2#cnlov_it#sigx_it#sige_it#(1/(r_it)##2))/(sigz_it##3)); Dmse_it=(((1/r_it)#sigx_it)/(sigz_it##2))#(mux_it-(cnlo1_it+cnlo2_it#muz_it)); Dvse_it=Dvse_it-((2#cnlov_it#(sigx_it##2)#(1/(r_it)##2))/(sigz_it##3)); end; %end; tmpgit=nrmden[1,] // (nrmden[2,]-nrmden[1,]); Dmu_it=((best#Dmmu_it)#CMEF_it)#tmpgit; Dsx_it=Dmsx_it // Dvsx_it; Dse_it=Dmse_it // Dvse_it; Ds_it=Dsx_it || Dse_it; dma_it=-1#(((alphaest+best#CMEM_it)#best#Dvsx_it)#(CMEF_it##2)#0.5); Dfs_it= J(maxy-1,1,1) || dma_it; Dmev_it=Dfs_it*Ds_it; Dsig1_it=(best#CMEF_it)#(transinv*Diag(nrmden)*Dmev_it); Dsig_it=Dsig1_it[,2] || Dsig1_it[,1]; Dme_it=Dmu_it || Dsig_it; *For modelling variance between groups; %if (&varflg=1) %then %do; *HIV Stable Negatives; if (varg_it=1) then do; Dmeg1_it=Dme_it; Dmeg2_it=J(maxy-1,qm,0); Dmeg3_it=J(maxy-1,qm,0); Deta1_it=Deta_it; Deta2_it=J(maxy-1,nqxprm+1,0); Deta3_it=J(maxy-1,nqxprm+1,0); end; *HIV Stable Indeterminates; if (varg_it=2) then do; Dmeg2_it=Dme_it; Dmeg1_it=J(maxy-1,qm,0); Dmeg3_it=J(maxy-1,qm,0); Deta2_it=Deta_it; Deta1_it=J(maxy-1,nqxprm+1,0); Deta3_it=J(maxy-1,nqxprm+1,0); end; *HIV Stable Positives; if (varg_it=3) then do; Dmeg3_it=Dme_it; Dmeg1_it=J(maxy-1,qm,0); Dmeg2_it=J(maxy-1,qm,0); Deta3_it=Deta_it; Deta1_it=J(maxy-1,nqxprm+1,0); Deta2_it=J(maxy-1,nqxprm+1,0); end; %end; %end; %else %do; %if &lappx=1 %then %do; nrmdenla = ( 1/sqrt( 4#arcos(0) ) )#exp( (-1/2) # ( ((X_it*beta)#(1/cappx))##2 ) ); *USE ONLY FOR STDOD READING; %if &ebflg=0 %then %do; do jk=1 to maxy-1; if (nrmdenla[jk,] < 0.0000001) then nrmdenla[jk,]=0.000001; end; %end; D_it = X_it`*Diag(nrmdenla)*transinv`; D_it = D_it#(1/cappx); %end; %else %do; nrmdenla = ( 1/sqrt( 4#arcos(0) ) )#exp( (-1/2) # ( (X_it*beta)##2 ) ); *USE ONLY FOR STDOD READING; %if &ebflg=0 %then %do; do jk=1 to maxy-1; if (nrmdenla[jk,] < 0.0000001) then nrmdenla[jk,]=0.000001; end; %end; D_it = X_it`*Diag(nrmdenla)*transinv`; %end; %end; %end; D_i = D_i || D_it; %if &cmeflg=1 %then %do; %if (&varflg=1) %then %do; Dmeg1_i = Dmeg1_i // Dmeg1_it; Dmeg2_i = Dmeg2_i // Dmeg2_it; Dmeg3_i = Dmeg3_i // Dmeg3_it; Dme_i = Dmeg1_i || Dmeg2_i || Dmeg3_i; %end; %else %do; Dme_i = Dme_i // Dme_it; %end; %end; %if &cif=1 %then %do; %if (&varflg=1) %then %do; Deta1_i = Deta1_i // Deta1_it; Deta2_i = Deta2_i // Deta2_it; Deta3_i = Deta3_i // Deta3_it; Deta_i = Deta1_i || Deta2_i || Deta3_i; %end; %else %do; Deta_i = Deta_i // Deta_it; %end; %end; %if &rif=1 %then %do; Dgam_i = Dgam_i // Dgam_it; %end; timeflgb=timeflga; end; nrA = nrow(A_i); A_i = A_i[2:nrA,2:nrA]; nt_i = nrow(Y_i) / (maxy -1); if corr = 1 then do; V_i = A_i; end; if corr = 2 then do; R_i = I(maxy-1) || repeat(R_st,1,nt_i -1); R_i = toeplitz(R_i); call eigen(M,ev,A_i); A_i1_2 = ev*sqrt(DIAG(M))*ev`; V_i = A_i1_2*R_i*A_i1_2; end; %if &time= %then %do; if corr = 4 then do; call eigen(M,ev,A_i); A_i1_2 = ev*sqrt(DIAG(M))*ev`; V_i = A_i1_2*R*A_i1_2; end; %end; %else %do; if corr > 2 then do; I_i = Imaxt[occas_i`,]; I_i = I_i @ I(maxy-1); call eigen(M,ev,A_i); A_i1_2 = ev*sqrt(DIAG(M))*ev`; V_i = A_i1_2*I_i*R*I_i`*A_i1_2; end; %end; nrvi=nrow(V_i); jkinvar=solve(V_i,I(nrvi)); %if &cmeflg=1 %then %do; H_i=D_i*jkinvar*Dme_i; H=H+H_i; %end; %if &cif=1 %then %do; G_i=D_i*jkinvar*Deta_i; G=G+G_i; %end; %if &rif=1 %then %do; F_i=D_i*jkinvar*Dgam_i; F=F+F_i; %end; u_i = D_i*jkinvar*(Y_i - p_i); u = u + u_i; usq = usq + u_i*u_i`; dvd = dvd + D_i*jkinvar*D_i`; end; dvd=(1/n)#dvd; usq=(1/n)#usq; u=(1/n)#u; *Code for estimating sigma delta within each group; *print "cmeflg=" &cmeflg; %if (&cmeflg=1) %then %do; numt=numoldt+numnewt; propoldt=numoldt/numt; propnewt=numnewt/numt; pioldt=numoldt/nrx; boldt=boldt/numt; aoldt=aoldt/numt; avgrept=avgrept/numt; onefill=J(maxvgrp,maxvgrp,1); coldt=(1/numoldt); cnewt=(1/numnewt); repoldt=(coldt`@onefill)`; repnewt=(cnewt`@onefill)`; vvoldt=repoldt#voldt; vvnewt=repnewt#vnewt; row1b={1 0 0}; row2b={0 1 0}; do newjk=1 to maxvgrp; row3b=0||avgrept[newjk,]||1; basemat=row1b//row2b//row3b; pnewt=sqrt(propnewt[newjk,]); A11_newt=pnewt#basemat; A1_newt=A1_newt//A11_newt; poldt=sqrt(propoldt[newjk,]); etanlot=crelpnlo[2,newjk]; A21_newt=((etanlot##2)#poldt)#basemat; A21_newt[1,]=(1/etanlot)#A21_newt[1,]; A21_newt[3,2]=A21_newt[3,2]#-1; A2_newt=A2_newt//A21_newt; a3fact=(poldt#sqrt(pioldt[newjk,])); row31=1||aoldt[newjk,]||0; row322=2#etanlot#boldt[newjk,]; row32=0||row322||-1; row332=-1#row322#avgrept[newjk,]; row33=0||row332||avgrept[newjk,]; A31_newt=a3fact#(row31//row32//row33); A3_newt=A3_newt//A31_newt; end; sigmep1a=(A1_newt[1:3,])`*vvnewt[1:3,]*A1_newt[1:3,]; sigmep2a=(A1_newt[4:6,])`*vvnewt[4:6,]*A1_newt[4:6,]; sigmep3a=(A1_newt[7:9,])`*vvnewt[7:9,]*A1_newt[7:9,]; sigmep1b=(A2_newt[1:3,])`*vvoldt[1:3,]*A2_newt[1:3,]; sigmep2b=(A2_newt[4:6,])`*vvoldt[4:6,]*A2_newt[4:6,]; sigmep3b=(A2_newt[7:9,])`*vvoldt[7:9,]*A2_newt[7:9,]; sigmep1c=(A3_newt[1:3,])`*v3nlo[1:3,]*A3_newt[1:3,]; sigmep2c=(A3_newt[4:6,])`*v3nlo[4:6,]*A3_newt[4:6,]; sigmep3c=(A3_newt[7:9,])`*v3nlo[7:9,]*A3_newt[7:9,]; sigmep1=sigmep1a+sigmep1b+sigmep1c; sigmep2=sigmep2a+sigmep2b+sigmep2c; sigmep3=sigmep3a+sigmep3b+sigmep3c; sigmep=block(sigmep1,sigmep2,sigmep3); %end; DELTA= solve(dvd,U); beta = beta+DELTA; CRIT= MAX( ABS(DELTA)); end; print, it crit; intc = { "int" }; stop=maxy-1; do j=1 to stop; intn = char(j,1); int = concat(intc,intn); variable = variable || int; end; variable = variable || { &x }; variable = variable`; vb=inv(dvd); *variance matrix; sebeta=sqrt((vecdiag(vb))/n); *vector of estimated standard errors of beta; z=beta/sebeta; *z-statistics; zsq=z#z; p=1-probchi(zsq,1); *two-sided p-value; estimate=beta; se_est=sebeta; %if &link=clogit %then %do; print, { 'LINK: Cumulative Logit' }; %end; %if &link=cprobit %then %do; %if &lappx=0 %then %do; print, { 'LINK: Cumulative Probit' }; %end; %if (&lappx=1 | &cmeflg=1) %then %do; print, { 'LINK: Cumulative Probit Approximation to Logit' }; %end; %end; %if &link=logit %then %do; print, { 'LINK: Polytomous Logit' }; %end; if corr = 1 then do; print, { 'CORRELATION: Independence' }; R = I( maxt#(maxy-1) ); end; if corr = 2 then do; print, { 'CORRELATION: Exchangeable' }; R = ( 1 - Imaxt ) @ R_st; R = R + I(nrow(R)); end; if corr = 3 then do; print, { 'CORRELATION: banded' }; end; if corr = 4 then do; print, { 'CORRELATION: unstructured' }; end; /* print, { 'PARAMETER ESTIMATES with naive variance' }; print, variable estimate se_est z p; vb=vb*usq*vb; *robust variance matrix; sebeta=sqrt((vecdiag(vb))/n); *vector of estimated standard errors of beta; z=beta/sebeta; *z-statistic; zsq=z#z; p=1-probchi(zsq,1); *two-sided p-value; se_est=sebeta; print, { 'PARAMETER ESTIMATES with robust variance' }; print, variable estimate se_est z p; print, { 'ROBUST COVARIANCE MATRIX OF PARAMETER ESTIMATES' }; print, vb; */ /* print, { 'Sample Sizes Used' }; print, 'nrx=' nrx; print, 'nry=' nry; print, 'n=' n; */ print, { 'COVARIANCE COMPONENTS' }; bhatinv=inv(dvd); print, "Bhatinv"; print, bhatinv; print, "COVARIANCE OF SCORE"; print, "Sig_U="; print, usq; print, "Binv*Sig_U*Binv"; bsigu=bhatinv*usq*bhatinv; print, bsigu; %if &cmeflg=1 %then %do; print, "COVARIANCE OF COVARIATE MEASUREMENT ERROR"; print, "H"; H=(1/n)#H; print, H; print, "Sig_Delta"; print, sigmep; print, "H*Sig_Delta*t(H)"; covmeav=H*sigmep*H`; print, covmeav; print, "Binv*H*Sig_Delta*t(H)*Binv"; bsigme=bhatinv*covmeav*bhatinv; print, bsigme; %end; %if &cif=1 %then %do; print, "COVARIANCE OF COVARIATE INCOMPATIBLE FORMATS"; print, "G"; G=(1/n)#G; print, G; print, "Sig_Eta"; print, covrelav; print, "rhorx*G*Sig_Eta*t(G)"; covifav=(n/nrx)#(G*COVRELAV*G`); print, covifav; print, "Binv*rhorx*G*Sig_Eta*t(Ghat)*Binv"; bsigeta=bhatinv*covifav*bhatinv; print, bsigeta; %end; %if &rif=1 %then %do; print, "COVARIANCE OF RESPONSE INCOMPATIBLE FORMATS"; print, "F"; F=(1/n)#F; print, F; print, "Sig_Gamma"; print, resasymv; print, "rhory*F*Sig_Gamma*t(F)"; resifav=(n/nry)#(F*RESASYMV*F`); print, resifav; print, "Binv*rhory*F*Sig_Gamma*t(F)*Binv"; bsiggam=bhatinv*resifav*bhatinv; print, bsiggam; %end; %if (&rif=1 and &cif=1 and &cmeflg=1) %then %do; print "Total Variance"; totvarp=bhatinv*(usq+covmeav+resifav+covifav)*bhatinv; print, totvarp; print "Total Variance without Incompatible Formats"; varwoif=bhatinv*(usq+covmeav)*bhatinv; print, varwoif; %end; /* %if (&rif=0 and &cif=0 and &cmeflg=1) %then %do; print "Total Variance"; totvarp=bhatinv*(usq+covmeav)*bhatinv; %end; */ *ss for non-seroconverters; *ntu=2372; *ss for seroconverters; ntu=153; %if (&cmeflg=1) %then %do; vb=vb*usq*vb; *robust variance matrix; sebeta=sqrt((vecdiag(vb))/n); *vector of estimated standard errors of beta; sebetatu=sqrt((vecdiag(vb))/ntu); zn=beta/sebeta; zntu=beta/sebetatu; *z-statistic; zsqn=z#z; zsqntu=zntu#zntu; pn=1-probchi(zsqn,1); pntu=1-probchi(zsqntu,1); *two-sided p-value; print, { 'PARAMETER ESTIMATES Based on Variance of Score' }; print, variable estimate sebeta n zn pn sebetatu ntu zntu pntu; sebewoif=sqrt((vecdiag(varwoif))/n); *vector of estimated standard errors of beta; sebewotu=sqrt((vecdiag(varwoif))/ntu); zwoif=beta/sebewoif; zwoiftu=beta/sebewotu; *z-statistics; zsqnwoif=zwoif#zwoif; zsqwotu=zwoiftu#zwoiftu; pwoif=1-probchi(zsqnwoif,1); pwoiftu=1-probchi(zsqwotu,1); *two-sided p-value; print, { 'PARAMETER ESTIMATES Based on Variance of Score and Measurement Error' }; print, variable beta sebewoif n zwoif pwoif sebewotu ntu zwoiftu pwoiftu; sebetot=sqrt((vecdiag(totvarp))/n); *vector of estimated standard errors of beta; sebetotu=sqrt((vecdiag(totvarp))/ntu); zntot=beta/sebetot; *z-statistics; ztottu=beta/sebetotu; zsqn=zntot#zntot; zsqtottu=ztottu#ztottu; pntot=1-probchi(zsqn,1); ptottu=1-probchi(zsqtottu,1); *two-sided p-value; print, { 'PARAMETER ESTIMATES Based on Variance of Score, Incompatible Formats and Measurement Error' }; print, variable beta sebetot n zntot pntot sebetotu ntu ztottu ptottu; %end; create out from R; append from R; close out; *Contrast Statements; npars = nrow(estimate); contrast= { 0 &c1 }; do; if contrast = 0 then goto bottom; nc= ncol(contrast); contrast=contrast[1,2:nc]; nr=(nc-1)/(npars); nr=round(nr); contrast=shape(contrast,nr,npars); gsq=(contrast*estimate)`*inv(contrast*vb*contrast`)*(contrast*estimate); df=nrow(contrast); p=1-probchi(gsq,df); c1=contrast; print, {"CONTRAST: &title1" }; print , c1[format=&format]; print, gsq df p; bottom: stop; end; contrast= { 0 &c2 }; do; if contrast = 0 then goto bottom2; nc= ncol(contrast); contrast=contrast[1,2:nc]; nr=(nc-1)/(npars); nr=round(nr); contrast=shape(contrast,nr,npars); gsq=(contrast*estimate)`*inv(contrast*vb*contrast`)*(contrast*estimate); df=nrow(contrast); p=1-probchi(gsq,df); c2=contrast; print, {"CONTRAST: &title2" }; print , c2[format=&format]; print, gsq df p; bottom2: stop; end; quit; proc datasets; delete PAR y x id occas; run; proc print; title 'correlation matrix'; run; title ' '; %mend multrpmd; /* END SAS MACRO */