Readme file

SERIES C 
Applied Statistics

Application of Markov chain Monte Carlo methods to projecting cancer incidence and mortality by
I. Bray,
Journal of the Royal Statistics Society, Series C, Applied Statistics, Volume 51 (2002), Part 2, pages
151 - 164

DEPARTMENT OF MATHEMATICS AND STATISTICS
UNIVERSITY OF PLYMOUTH
PLYMOUTH PL4 8AA

E-mail I.Bray@plymouth.ac.uk
PHONE 44 (0) 1752 233316
FAX 44 (0) 1752 232780

http://www.tech.plym.ac.uk/maths/Staff/ibray/research.html

LARYNX.TXT, OESOPHAGUS.TXT, ORAL.TXT, HD.TXT and MM.TXT contain data sets comprising columns of cases, person-years (pyrs), age (a), period (p), cohort (c) and cc (re-parameterisation of c).
These are the full data sets used in the SPLUS functions
described below. Data files for analysis in BUGS should be edited by the removal of column headings, rows for p=5 and p=6, and removal of column cc. An example BUGS data file is given in PROGRAMS.TXT. The data are compiled according to cancer site.

In LARYNX.TXT, OESOPHAGUS.TXT and ORAL.TXT (oral cavity and pharynx), mortality data are given:

MALES FEMALES
bulgariam bulgariaf
czechm czechf
hungarym hungaryf
polandm polandf

In HD.TXT (Hodgkin's disease) and MM.TXT (multiple myeloma), incidence data are given:

MALES FEMALES
albertam albertaf
bombaym bombayf
nzm nzf
oxfordm oxfordf


BUGS FILES:

APC.bug Main BUGS program for Bayesian APC model
start.in File of initial values
dataset.dat Data file

SPLUS FUNCTIONS:

mcmciter Compiles BUGS output in SPLUS
ratemed Calculates rates based on Bayesian APC model
st.res Calculates standardised residuals
ssr Prints sum of standardised residuals (SSR)
m1 Defines model M1
m2 Defines model M2
llm1 Log likelihood for fitting M1
llm2 Log likelihood for fitting M2
hak Program to fit M1 and M2 and print SSR
os Program to fit classical APC model and print SSR

### BUGS PROGRAM ###

PROGRAM: APC.bug

USAGE

This program was run in Classic BUGS Version 0.30 for DOS
(http://www.mrc-bsu.cam.ac.uk/bugs/classic/contents.shtml).
The following commands compile the program, list the parameters to be saved in the output file, and specify the number of iterations. Note that the analyses in this paper used 1,000 burn-in iterations + 10,000 samples = 11,000 iterations.

compile("APC.bug")
monitor(alphac[])
monitor(betac[])
monitor(gammac[])
monitor(taua)
monitor(taup)
monitor(tauc)
monitor(rate[,])
update(number of iterations)
q()

ARGUMENTS

The following arguments can be changed in the BUGS program (see 'const') according to the number of age groups and periods in the data set, and the number of periods you wish to project.
Note that the defaults given in PROGRAMS.TXT are for a dataset with 15 age groups, with 2 periods projected based on 4 observed.

I = A x P
A = number of age groups
P = number of periods
C = number of cohorts = A + P - 1
PP = P + number of projected periods
CC = C + number of projected periods

INPUT FILES

dataset.dat Text file *.dat comprising columns of cases and
person-years (counts), age, period and cohort
(factors), with rows for periods 5 and 6 deleted
(see example in PROGRAMS.TXT)
start.in Text file giving starting values for age, period and cohort effects (including projected periods),and precision parameters for each time scale (see example in PROGRAMS.TXT)

OUTPUT FILES

Bugs.out Contains monitored samples
Bugs.ind Describes structure of Bugs.out
Bugs.log Log of session

### SPLUS FUNCTIONS FOR ANALYSING BUGS OUTPUT ###

FUNCTION: mcmciter

USAGE: mcmciter(file, niter, A, PP, CC)

ARGUMENTS

file Bugs.out imported into SPLUS
niter total number of iterations
A as in BUGS program
PP as in BUGS program
CC as in BUGS program

Note that the defaults given in PROGRAMS.TXT are for a dataset with 15 age groups, with 2 periods projected based on 4 observed.

VALUE

A matrix of MCMC output. The columns correspond to the parameters and rates monitored by BUGS, with a row for each iteration.

FUNCTION: ratemed

USAGE: ratemed(file, niter, A, PP, CC)

ARGUMENTS

file value of function 'mcmciter' above
niter total number of iterations
A as in BUGS program
PP as in BUGS program
CC as in BUGS program

Note that the defaults given in 'programs.txt' are for a dataset with 15 age groups, with 2 periods projected based on 4 observed.

VALUE

Vector of medians of MCMC iterations (excluding burn-in) for
fitted and projected rates.

NOTE

This program can be edited to produce credible intervals based on centiles of the MCMC iterations.


FUNCTION: st.res

USAGE: st.res(fitted, observed)

ARGUMENTS

fitted vector of fitted rates
observed vector of observed rates

VALUE

The function returns a vector of standardised residuals.

NOTE

This function is also used in assessing the fit of linear
power models and classical APC models.


FUNCTION: ssr

USAGE: ssr(dataset)

ARGUMENTS

dataset full dataset (i.e. periods 1-6) with columns
'cases', 'pyrs' and 'fitted' (from 'ratemed'
above)

VALUE

The function returns the SSR (sum of standardised residuals) for the fitted rates (period = 1-4) and the empirical projection
(p=6).

### SPLUS FUNCTIONS FOR LINEAR POWER MODELS ###

FUNCTION: m1

USAGE: m1(par.vec, A)

ARGUMENTS

par.vec vector of parameters or starting values of length A+1
A number of age groups

VALUE

Vector of expected number of cases under model M1, length equal to that of the data set.


FUNCTION: m2

USAGE: m2(par.vec, A)

ARGUMENTS

par.vec vector of parameters or starting values of length
A+1
A number of age groups

VALUE

Vector of expected number of cases under model M2, length equal to that of the data set.


FUNCTION: llm1

USAGE: llm1(par.vec)

ARGUMENTS

par.vec vector of parameters or starting values of length
A+1 where A is the number of age groups

VALUE

Log likelihood for model M1. A minus sign is added because the SPLUS function 'nlmin' mimimises rather than maximises the likelihood.


FUNCTION: llm2

USAGE: llm2(par.vec)

ARGUMENTS

par.vec vector of parameters or starting values of length
A+1 where A is the number of age groups

VALUE

Log likelihood for model M2. A minus sign is added because the SPLUS function 'nlmin' mimimises rather than maximises the likelihood.

FUNCTION: hak

USAGE: hak(reduced, full, model, likelihood, start)

ARGUMENTS

reduced data set with rows for periods 5 and 6 deleted
full full data set (i.e. periods 1-6)
model m1 or m2
likelihood llm1 or mml2
start vector of starting values of length A+1 where A is the number of age groups

VALUE

This function returns 'T' if convergence was achieved, and 'F'
otherwise. In the case of false convergence try different
starting values, but note that convergence is not always
possible (see Section 5.1 of the paper).

The function then returns the SSR for the fitted rates
(period = 1-4) and the empirical projection (p=6).


### SPLUS FUNCTION FOR CLASSICAL AGE-PERIOD-COHORT MODEL ###

FUNCTION: os

USAGE: os(dataset)

ARGUMENTS

dataset full data set (i.e. periods 1-6)

VALUE

The function returns the SSR for the fitted rates (period = 1-4) and the empirical projection (p=6).

NOTE

This program is for a dataset with A=15, PP=6, and CC=20. For a dataset with 9 age groups, for example, the vectors 'age' and 'cohort' need to be edited.

Dataset (bray.zip 19kb)

Journals

SERIES A
Statistics in Society

SERIES B
Statistical Methodology

SERIES C
Applied Statistics

SERIES D
The Statistician