Readme file
SERIES A
Statistics in Society
The role of tobacco taxes in starting and quitting smoking:
duration analysis of British data
M. Forster and A. M. Jones
Journal of the Royal Statistical Society, Series A, Statistics
in Society, Volume 164 (2001), Part 3, pages 517-547
Most of the estimates reported in our paper are computed using standard
commands from Stata v.6.0. However, estimation of the split population
model of starting and the gamma model of quitting with heaping effects
requires custom programs:
1. SPLIT POPULATION DURATION MODEL WITH TIME-VARYING COVARIATES (TVCs)
Estimation of the split population model is done with Stata programs,
written for the maximum likelihood "d0" and "lf" routines. To deal with
TVCs, the dataset is "expanded" by the age of starting. The data are
stset using id(.) and, in the case of method d0, the subroutine mlsum
is used to allow for the repeated observations on each respondent. The
Stata program is contained in an accompanying ASCII file.
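
The expansion step and the need to sum over each respondent's repeated
observations can be illustrated with a minimal Python sketch. It is not
the accompanying Stata program. It assumes one record per year of age,
a hazard that is constant within each record, and a constant probability
p_ever of ever starting; the names expand_person, split_loglik, p_ever
and hazard are hypothetical.

```python
import math


def expand_person(person_id, exit_age, started):
    """Expand one respondent into one record per year at risk.

    exit_age is the age of starting (if started) or the age at which the
    respondent is censored. Yearly records are an assumption of this sketch.
    """
    rows = []
    for age in range(exit_age):
        rows.append({
            "id": person_id,
            "t0": age,          # start of the interval
            "t1": age + 1,      # end of the interval
            # only the final record of a starter carries the event
            "event": started and age == exit_age - 1,
        })
    return rows


def split_loglik(rows, p_ever, hazard):
    """Log-likelihood contribution of ONE respondent in a split population model.

    rows   : that respondent's expanded records, in time order
    p_ever : assumed probability of ever starting (hypothetical parameter)
    hazard : function mapping a record to a hazard rate, assumed constant
             within the record's interval (an assumption of this sketch)
    """
    # Cumulative hazard is accumulated over all records of the respondent.
    cum_hazard = sum(hazard(r) * (r["t1"] - r["t0"]) for r in rows)
    last = rows[-1]
    if last["event"]:
        # Starter: log[p * h(T) * S(T)]
        return math.log(p_ever) + math.log(hazard(last)) - cum_hazard
    # Censored: log[(1 - p) + p * S(T)], which is not a sum of
    # record-level terms, so records must be combined per respondent.
    return math.log(1.0 - p_ever + p_ever * math.exp(-cum_hazard))
```

Because the censored contribution involves the logarithm of a sum, it
cannot be written record by record, which is why the records have to be
grouped by respondent before the contribution is evaluated.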
2. GAMMA DURATION MODEL WITH A HEAPING EFFECT
Torelli and Trivellato (1993) propose a solution to the "heaping effect"
based on an explicit measurement model. This model is superimposed on
the underlying duration model leading to a reformulation of the log-likelihood
function. Torelli and Trivellato compare four methods of dealing with
heaping:
i. Re-formulating the likelihood to allow for the measurement model.
This requires specifying a parametric model of the measurement errors.
ii. The ad hoc approach of adding dummy variables for the heaped observations.
iii. Smoothing the data prior to estimation by using random draws from
a uniform distribution to spread the actual heaped observations. This
means that the results are contingent on the random numbers that are
generated.
iv. Ignoring the heaping and estimating the underlying duration model.
Method i: We have programmed maximum likelihood estimation of the gamma
model, using the "lf" routine in Stata. We assume that the heaped
observations are those where EXFAGAN is a multiple of 5 or 10. Because
heaping arises from EXFAGAN, the problem relates only to complete
spells, i.e. those of respondents who have quit smoking. For these
heaped observations, the usual contribution to the likelihood, f(t_i),
is replaced by
    F(ut_i) - F(lt_i)
where lt_i is the lower limit and ut_i the upper limit of an interval
of length 5 around t_i. The Stata program is contained in an
accompanying ASCII file.
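
The replacement of the density by an interval probability for heaped
spells can be sketched in Python as follows. This is an illustration
under stated assumptions, not the accompanying Stata program: it uses a
two-parameter gamma distribution with hypothetical parameters shape and
scale, takes the interval to be symmetric (t - 2.5 to t + 2.5, truncated
at zero), and covers complete spells only.

```python
import math


def gamma_pdf(t, shape, scale):
    """Density of a two-parameter gamma distribution at t > 0."""
    log_f = ((shape - 1.0) * math.log(t) - t / scale
             - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_f)


def gamma_cdf(t, shape, scale):
    """Gamma distribution function via the standard power series
    for the regularised lower incomplete gamma function."""
    if t <= 0.0:
        return 0.0
    z = t / scale
    term = 1.0 / shape
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (shape + n)
        total += term
        n += 1
    value = total * math.exp(shape * math.log(z) - z - math.lgamma(shape))
    return min(1.0, value)


def loglik_contribution(t, heaped, shape, scale, width=5.0):
    """Log-likelihood contribution of one complete spell of length t.

    Heaped spells contribute log[F(upper) - F(lower)] over an interval of
    the given width centred on t (symmetry is an assumption of this sketch);
    other complete spells contribute the usual log f(t).
    """
    if heaped:
        lower = max(t - width / 2.0, 0.0)
        upper = t + width / 2.0
        prob = gamma_cdf(upper, shape, scale) - gamma_cdf(lower, shape, scale)
        return math.log(prob)
    return math.log(gamma_pdf(t, shape, scale))
```

Censored spells are not handled here; they would contribute the log of
the survivor function in the usual way.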
Method iii: The problem of heaping relates to EXFAGAN, rather
than the other components of the dependent variable, so we apply the
smoothing method to this variable. For each of the potentially heaped
values (5, 10, ...) the actual observation is smoothed using
pseudo-random integers (the Stata command generates
EXSMOOTH = EXFAGAN - 3 + int(5*uniform()) with the seed set at
123456789). No adjustment is required for the censored observations,
whose durations do not depend on EXFAGAN.
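
The smoothing rule can be mimicked in Python as a minimal sketch. The
function name smooth_exfagan is hypothetical, and Python's random number
generator differs from Stata's uniform(), so the same seed will not
reproduce the draws used in the paper.

```python
import random


def smooth_exfagan(values, seed=123456789):
    """Apply EXSMOOTH = EXFAGAN - 3 + int(5*u) to potentially heaped values.

    A value is treated as potentially heaped when it is a positive
    multiple of 5. int(5*u) with u in [0, 1) takes the values 0..4,
    so the offset added to a heaped value ranges from -3 to +1.
    """
    rng = random.Random(seed)  # private generator, fixed seed for repeatability
    smoothed = []
    for v in values:
        if v > 0 and v % 5 == 0:
            smoothed.append(v - 3 + int(5 * rng.random()))
        else:
            smoothed.append(v)  # non-heaped values are left unchanged
    return smoothed
```

Rerunning the function with the same seed returns the same smoothed
values, while a different seed generally gives different ones, which
illustrates why results under this method depend on the random numbers
drawn.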
Torelli, N. and Trivellato, U. (1993) Modelling inaccuracies in job-search
duration data. Journal of Econometrics, 59:187-211.
Contact details:
Professor Andrew Jones
Department of Economics and Related Studies
University of York
York
YO10 5DD
United Kingdom
Fax: +44 1904 433759
E-mail: amj1@york.ac.uk
Web: http://www.york.ac.uk/res/herc
Dataset
(A580R.txt 11kb)