Readme file, SERIES A
| 1 | child id (2449 kids, ids in 1..14684) |
| 2 | family id (1558 families, ids in 2782) |
| 3 | community id (161 communities, ids in 1..242) |
| 4 | binary dependent variable (0,1) |
| 5 | corresponding latent variable (logistic) |
| 6 | child-level covariate (mean .0955621, range -.452557 to .541957) |
| 7 | family-level covariate (mean -.083816, range -1.485043 to 2.284106) |
| 8 | community-level covariate (mean -.6857591, range -1.818267 to -.0097808) |
All datasets were generated using true parameter values of 0.665267 for the constant and 1 for each of the three covariates, with normal variates of variance 1 for the family and community effects. Note that only variables 4 and 5 vary across datasets.
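As a rough illustration of the generating scheme just described, the sketch below draws one outcome per child in Python. It is not the authors' code. It assumes that the latent variable is the linear predictor plus a standard logistic error, that the binary outcome is 1 when the latent variable is positive, and that family ids are unique across communities. The function name `simulate` and the row layout are hypothetical.

```python
import math
import random

CONSTANT = 0.665267  # true constant stated in the README
SLOPE = 1.0          # true coefficient on each of the three covariates


def simulate(rows, seed=0):
    """rows: iterable of (family_id, community_id, x_child, x_family, x_community).

    Returns a list of (binary outcome, latent variable) pairs, one per row.
    """
    rng = random.Random(seed)
    family_effect = {}
    community_effect = {}
    results = []
    for family, community, x1, x2, x3 in rows:
        # One N(0, 1) effect per family and per community, shared by all members.
        if family not in family_effect:
            family_effect[family] = rng.gauss(0.0, 1.0)
        if community not in community_effect:
            community_effect[community] = rng.gauss(0.0, 1.0)
        eta = (CONSTANT + SLOPE * (x1 + x2 + x3)
               + family_effect[family] + community_effect[community])
        # Assumed latent form: linear predictor plus a standard logistic error,
        # drawn by inverting the logistic CDF at a uniform value in (0, 1).
        p = rng.random()
        while p == 0.0:
            p = rng.random()
        latent = eta + math.log(p / (1.0 - p))
        results.append((1 if latent > 0 else 0, latent))
    return results
```

Calling `simulate` with the same rows and different seeds would mimic the fact that only the outcome and the latent variable change from one simulated dataset to the next, while the ids and covariates stay fixed.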
Actual Data:
File rggudat.zip contains the two datasets analyzed in this paper.
guImmun.dat: The first dataset refers to complete immunization among children receiving any immunization. It has 2159 observations on 19 variables. The very first line is a header with variable names, so the file can be read into R or S-Plus using read.table(filename,header=T). The variables include child, family and community id numbers, the outcome coded 0-1, and a set of individual, family and community variables used as predictors. These appear in exactly the same order as Table 2 in the paper:
| 1 | kid: child id (2159 kids) |
| 2 | mom: family id (1595 families) |
| 3 | cluster: cluster id (161 communities) |
| 4 | immun: whether fully immunized (1=yes, 0=no) |
| 5 | kid2p: child aged 2+ years |
| 6 | mom25p: mother aged 25+ years |
| 7 | order23: birth order 2-3 |
| 8 | order46: birth order 4-6 |
| 9 | order7p: birth order 7+ |
| 10 | indNoSpa: indigenous, speaks no Spanish |
| 11 | indSpa: indigenous, speaks Spanish |
| 12 | momEdPri: mother's education primary |
| 13 | momEdSec: mother's education secondary+ |
| 14 | husEdPri: husband's education primary |
| 15 | husEdSec: husband's education secondary+ |
| 16 | husEdDK: husband's education missing |
| 17 | momWork: mother ever worked |
| 18 | rural: rural residence |
| 19 | pcInd81: proportion indigenous in 1981 |
The last predictor is a continuous variable. All others are 0-1 dummy variables, representing discrete factors coded using the reference cell method. The omitted categories are child aged 1 year, mother's age less than 25, birth order 1, ladino, mother with no education, husband with no education, mother never worked, and urban residence.
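For readers working outside R or S-Plus, a rough Python counterpart to the read.table call might look like the sketch below. It assumes the file is whitespace-delimited with a single header line (which is what read.table reads by default) and that every field is numeric. The helper names `parse_table` and `level_counts` are hypothetical.

```python
def parse_table(lines):
    """Parse whitespace-delimited lines whose first non-blank line is a header.

    Returns a list of dicts mapping each variable name to a float.
    """
    rows = [line.split() for line in lines if line.strip()]
    # Strip quotes in case the header names were written with them.
    header = [name.strip('"') for name in rows[0]]
    return [dict(zip(header, (float(v) for v in rec))) for rec in rows[1:]]


def level_counts(records, levels=("kid", "mom", "cluster")):
    """Count the distinct ids at each level of the hierarchy."""
    return {name: len({rec[name] for rec in records}) for name in levels}


# Usage, assuming guImmun.dat has been extracted from rggudat.zip
# into the working directory:
#
#     with open("guImmun.dat") as fh:
#         records = parse_table(fh)
#     print(level_counts(records))
```

Comparing the counts against the figures listed above (2159 kids, 1595 families, 161 communities) is a quick way to confirm that the file was read as intended.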
guPrenat.dat: The second dataset refers to use of modern prenatal care among women using some form of prenatal care. It has 2449 observations on 25 variables. The first line is a header with variable names, so the file can be read into R or S-Plus using read.table(filename,header=T). The variables include level ids, the outcome, and individual, family and community-level predictors. These appear in the same order as Table 3 in the paper.
| 1 | kid: child id (2449 kids) |
| 2 | mom: family id (1558 families) |
| 3 | cluster: cluster id (161 communities) |
| 4 | prenat: used modern prenatal care (1=yes, 0=no) |
| 5 | kid3p: child aged 3-4 years |
| 6 | mom25p: mother aged 25+ years |
| 7 | order23: birth order 2-3 |
| 8 | order46: birth order 4-6 |
| 9 | order7p: birth order 7+ |
| 10 | indNoSpa: indigenous, speaks no Spanish |
| 11 | indSpa: indigenous, speaks Spanish |
| 12 | momEdPri: mother's education primary |
| 13 | momEdSec: mother's education secondary+ |
| 14 | husEdPri: husband's education primary |
| 15 | husEdSec: husband's education secondary+ |
| 16 | husEdDK: husband's education missing |
| 17 | husProf: husband professional, sales, clerical |
| 18 | husAgrSelf: husband agricultural self-employed |
| 19 | husAgrEmp: husband agricultural employee |
| 20 | husSkilled: husband skilled service |
| 21 | toilet: modern toilet in household |
| 22 | tvNotDaily: television not watched daily |
| 23 | tvDaily: television watched daily |
| 24 | pcInd81: proportion indigenous in 1981 |
| 25 | ssDist: distance to nearest clinic |
All predictors are either continuous variables (numbers 24 and 25) or 0-1 dummy variables (all others) representing discrete factors coded using the reference cell method. Omitted categories are child aged 0-2, mother aged <25, birth order 1, ladino, mother with no education, husband with no education, husband not working or in unskilled occupation, no modern toilet in household, and no television in the household.
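To make the reference cell coding concrete, here is a small sketch for the birth order factor. The omitted category (birth order 1) is the row in which all three dummies are zero. The function name `birth_order_dummies` is hypothetical, and the raw birth order is not itself a column in the data files, so this only illustrates how dummies of this kind relate to the underlying factor.

```python
def birth_order_dummies(order):
    """Code a birth order (1, 2, 3, ...) as the three 0-1 dummies used above.

    Birth order 1 is the reference cell, so it maps to all zeros.
    """
    if order < 1:
        raise ValueError("birth order must be a positive integer")
    return {
        "order23": int(2 <= order <= 3),
        "order46": int(4 <= order <= 6),
        "order7p": int(order >= 7),
    }
```

Under this coding, a coefficient on order46 is read as a contrast between children of birth order 4-6 and first-born children, and the same logic applies to the other dummy-coded factors.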
For more information, please visit the authors' website at http://data.princeton.edu/multilevel or contact the corresponding author:
German Rodriguez
Office of Population Research
Wallace Hall
Princeton University
Princeton
NJ 08544-2091
E-mail: grodri@princeton.edu
Dataset (rgs3bb1.zip, 916kb)
Dataset (rgs3bb2.zip, 917kb)
Dataset (rgs3bb3.zip, 917kb)
Dataset (rgs3bb4.zip, 917kb)
Dataset (rggudat.zip, 40kb)