
Readme file
Analysis of functional status transitions by using a semi-Markov process model in the presence of left-censored spells, by
L. Cai, N. Schenker and J. Lubitz
Journal of the Royal Statistical Society, Series C, Applied Statistics,
Volume 55 (2006), 477-491
SAS Programs in text format
These programs are written in SAS and are best run in SAS 9.0 or later.
Before using them, change the extension of the program files from .txt
to .sas.
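The renaming step can be scripted. The following is a minimal Python sketch, not part of the original distribution. The function name rename_programs is hypothetical, and the sketch assumes that the data file C5808_DATA.txt sits in the same folder and should keep its .txt extension.

```python
from pathlib import Path


def rename_programs(folder, data_files=("C5808_DATA.txt",)):
    """Rename each .txt program in `folder` to .sas and return the new names.

    Assumption: files listed in `data_files` are data, not SAS programs,
    so they are left untouched.
    """
    renamed = []
    # sorted() materializes the listing before any file is renamed
    for path in sorted(Path(folder).glob("*.txt")):
        if path.name in data_files:
            continue  # keep the data file as .txt
        target = path.with_suffix(".sas")
        if target.exists():
            continue  # never overwrite an existing .sas file
        path.rename(target)
        renamed.append(target.name)
    return renamed
```

Calling rename_programs on the folder that holds the unzipped files returns the list of new file names, which makes it easy to check that only the programs were renamed.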
1. SEMAG1A.txt and SEMAG1B.txt
These are the main programs that fit the semi-Markov process (SMP)
model using the stochastic EM algorithm. The two programs are identical
in structure and purpose, except that SEMAG1A.txt produces estimates
for the overall population of 65-year-olds, whereas SEMAG1B.txt
produces estimates for male and female 65-year-olds separately.
Both programs start the algorithm from the same initial assumption of R=0
and run 45 iterations. The E-step and M-step are performed at each iteration.
During iterations 26 to 45, the program uses the current estimates of the
transition probabilities to simulate annual health status for 250000
65-year-olds and records the simulated data in separate data sets, from
which summary estimates are derived. An example of such data is described
below.
2. BS_SEMLEA.txt, BS_EMREPA.txt, BS_EMREPA_1.txt, BS_EMREPA_2.txt and
BS_PREV.txt
These programs estimate the bootstrap standard errors of life expectancy
at age 65 for the entire population. BS_SEMLEA.txt is the main program;
it controls the resampling of 50 bootstrap samples. Two samples are
generated and fitted simultaneously in a batch by BS_EMREPA_1.txt and
BS_EMREPA_2.txt. BS_EMREPA.txt fits the SMP model to a bootstrap sample
using the stochastic EM algorithm with the R=0 assumption. The first 20
iterations are the 'burn-in' period; during each of the following five
iterations, the program simulates annual status for a cohort of 100000
65-year-olds. The average of the life expectancy estimates from these five
cohorts is taken as the estimate for that bootstrap sample.
BS_PREV.txt calculates weighted disability prevalence estimates from each
bootstrap sample.
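The arithmetic behind these steps can be sketched in a few lines of Python. This is an illustration only, not a translation of the SAS programs. The function names are hypothetical, and the sketch assumes the usual bootstrap standard error, that is, the sample standard deviation of the per-sample estimates; this README does not state the exact formula used.

```python
from statistics import mean, stdev


def sample_estimate(cohort_estimates):
    """Estimate for one bootstrap sample.

    Averages the life expectancy estimates from the cohorts simulated
    after the burn-in period (five cohorts per sample in the README).
    """
    return mean(cohort_estimates)


def bootstrap_se(sample_estimates):
    """Bootstrap standard error across samples.

    Assumption: the sample standard deviation (n - 1 denominator) of
    the per-sample estimates, with one estimate per bootstrap sample.
    """
    return stdev(sample_estimates)
```

With 50 bootstrap samples, sample_estimate would be applied once per sample and bootstrap_se once to the resulting list of 50 values.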
3. BS_SEMLEB.txt, BS_EMREPB.txt, BS_EMREPB_1.txt, BS_EMREPB_2.txt
and BS_PREV.txt
These programs are similar to those described in item 2, except
that they are used to derive standard errors for male and female
65-year-olds.
4. MSLE65.txt, BS_MSLE.txt, BS_MSREP.txt, BS_MSREP_1.txt, BS_MSREP_2.txt
and BS_PREV.txt
MSLE65.txt estimates the multistate life-table transition probabilities
and simulates 400000 65-year-olds separately for males, females and both
sexes combined. Estimates of life expectancy, disability incidence and
recovery are summarized from these simulated data. The other programs estimate
the bootstrap standard errors of life expectancy at age 65. As with the
bootstrap programs for the SMP model, 50 bootstrap samples are generated,
and two are fitted simultaneously in a batch.
Data
Because of space constraints, C5808_DATA.txt contains only a small subset
of the full data set (250000 persons) that we use to derive summary
estimates. It holds the simulated health status from age 65 to death for
the first 5000 persons who have no functional limitations at age 65. Each
row of the data is one episode, or spell, in a health state. The columns
are described below.
Column 1: Personal identification number.
Column 2: Sex (1=male, 2=female).
Column 3: Age at the beginning of the spell.
Column 4: Status at the beginning of the spell (1=no limitations,
2=1+ limitations in physical functioning, 3=1+ limitations in IADL,
4=1+ limitations in ADL).
Column 5: Status at the end of the spell (1=no limitations,
2=1+ limitations in physical functioning, 3=1+ limitations in IADL,
4=1+ limitations in ADL, 5=death).
Column 6: Age at the end of the spell.
Column 7: Duration of the spell, in years. For the last spell, which
ends in death, death is assumed to occur at the middle of the annual
age interval, so the duration is the difference between column 6 and
column 3, minus 0.5.
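As a reading aid, the column layout can be expressed as a small Python sketch. It is not part of the distributed programs. It assumes that the columns in C5808_DATA.txt are separated by whitespace, and that the duration of a spell that does not end in death is simply column 6 minus column 3; neither point is stated in this README. The names Spell, parse_spell and expected_duration are hypothetical.

```python
from typing import NamedTuple

DEATH = 5  # code for death in column 5


class Spell(NamedTuple):
    person_id: int     # column 1
    sex: int           # column 2
    age_start: float   # column 3
    status_start: int  # column 4
    status_end: int    # column 5
    age_end: float     # column 6
    duration: float    # column 7


def parse_spell(line):
    """Parse one row, assuming seven whitespace-separated columns."""
    f = line.split()
    return Spell(int(f[0]), int(f[1]), float(f[2]), int(f[3]),
                 int(f[4]), float(f[5]), float(f[6]))


def expected_duration(spell):
    """Duration implied by the ages in columns 3 and 6.

    Spells ending in death lose half a year (mid-interval rule).
    Assumption: all other spells last exactly age_end - age_start.
    """
    span = spell.age_end - spell.age_start
    return span - 0.5 if spell.status_end == DEATH else span
```

Comparing expected_duration(spell) with spell.duration for every row is a quick way to check that a file has been read with the intended column order.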
Liming Cai
National Center for Health Statistics
3311 Toledo Road, Room 6330
Hyattsville
MD 20782
USA
E-mail: lcai@cdc.gov
Tel: +1 301-458-4133
Datasets (cai2.zip, size 226KB)