Readme file

SERIES B
Statistical Methodology

On fast computation of the non-parametric maximum likelihood estimate of a mixing distribution,
Y. Wang
volume 69, (2007), 185–198

1. General

The constrained Newton method (CNM) for computing the non-parametric maximum likelihood estimate (NPMLE) of a mixing distribution is implemented in R (http://www.r-project.org/), for mixtures with normal or Poisson components. The implementation uses the public-domain Fortran subroutine NNLS, which can be downloaded from:

http://www.netlib.org/lawson-hanson

The program provided here will eventually be made into an R package and submitted to CRAN. Check the author's website for the latest developments:

http://www.stat.auckland.ac.nz/~yongwang

2. Normal mixtures

For normal mixtures, the function for CNM is cnm.normmix(). The argument list is:

x             Observations
verb          Verbosity level (0-4)
prec          Precision level to combine components
maxit         Maximum number of iterations
check.points  Number of grid points for computing the derivatives of
              the gradient function
tol           Tolerance level for sup gradient
mix           Initial estimate of the NPMLE
method        Either cnm (for CNM) or cn1 (for CN1)
lsm           Line search method, either halving (for step-halving) or
              optim (for optimal line search)
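
As a rough illustration of what a step-halving line search does, the sketch below halves the step towards a proposed update until the objective improves. It is a generic textbook version, not taken from the program's code; the function name halving_search and its arguments are hypothetical and chosen only for this example.

```r
# Generic step-halving line search (illustrative only; not the author's code).
# f:   objective to be maximised (e.g. a log-likelihood)
# p0:  current parameter vector
# p1:  proposed parameter vector (e.g. from a Newton-type step)
# Tries p0 + alpha * (p1 - p0) for alpha = 1, 1/2, 1/4, ... and returns the
# first point whose objective value exceeds f(p0); returns p0 if none does.
halving_search <- function(f, p0, p1, max.halvings = 20) {
  f0 <- f(p0)
  alpha <- 1
  for (k in seq_len(max.halvings + 1)) {
    p <- p0 + alpha * (p1 - p0)
    fp <- f(p)
    if (is.finite(fp) && fp > f0)
      return(list(par = p, value = fp, alpha = alpha))
    alpha <- alpha / 2
  }
  list(par = p0, value = f0, alpha = 0)
}

# Toy usage: maximise f(p) = -(p - 1)^2 from p0 = 0,
# with a proposal p1 = 4 that overshoots the maximiser.
f <- function(p) -(p - 1)^2
res <- halving_search(f, p0 = 0, p1 = 4)
```

In the toy problem the full step and the half step do not improve on the starting value, so the search settles on a quarter step. An optimal line search would instead maximise the objective over alpha directly.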

The following illustration of the function's usage is based on the mixture
obtained by Efron (2004), as in Example 2 of the paper. Inside the R
environment:

> mix.efron = normmix(mu=c(-10.9, -7.0, -4.9, -1.8, -1.1, 0.0, 2.4, 6.1),
+                     pi=c(1.5, 1.3, 5.6, 12.3, 13.6, 60.8, 2.7, 2.2))
> set.seed(1)
> x = rnormmix(n=1000, mix=mix.efron)
> cnm.normmix(x, tol=1e-5)
$mix
              mu          pi va
[1,] -11.0211470 0.013787881  1
[2,]  -7.5956141 0.007920682  1
[3,]  -5.2393841 0.057358669  1
[4,]  -1.4091182 0.308104456  1
[5,]   0.0958183 0.576236418  1
[6,]   2.9801987 0.014313713  1
[7,]   5.9731478 0.019657020  1
[8,]   7.0544410 0.002621161  1

$ll
[1] -2051.403

$num.iterations
[1] 21

$max.gradient
[1] 8.02974e-06

$convergence
[1] 0

The output includes:

mix             The computed NPMLE of the mixing distribution
ll              Log-likelihood value at the computed NPMLE
num.iterations  Number of iterations needed
max.gradient    Sup gradient value
convergence     = 0, converged successfully;
                = 1, reached the maximum number of iterations
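
To make the quantities in this example concrete, the base-R sketch below draws a sample from a unit-variance normal mixture with the support points and weights used above, and evaluates a mixture log-likelihood. The helpers sample_normmix and loglik_normmix are hypothetical stand-ins written for this illustration; they are not the program's rnormmix() or its internal likelihood code, and the sample they produce need not match the one generated above.

```r
# Support points and weights from the example above (weights in percent).
mu <- c(-10.9, -7.0, -4.9, -1.8, -1.1, 0.0, 2.4, 6.1)
pr <- c(1.5, 1.3, 5.6, 12.3, 13.6, 60.8, 2.7, 2.2)
pr <- pr / sum(pr)                       # normalise to proportions

# Hypothetical helper: draw n observations from a unit-variance normal mixture.
sample_normmix <- function(n, mu, pr) {
  k <- sample(length(mu), n, replace = TRUE, prob = pr)  # component labels
  rnorm(n, mean = mu[k], sd = 1)
}

# Hypothetical helper: log-likelihood of a unit-variance normal mixture.
loglik_normmix <- function(x, mu, pr) {
  # n x k matrix of component densities, one column per support point
  dens <- outer(x, mu, function(x, m) dnorm(x, mean = m, sd = 1))
  sum(log(dens %*% pr))
}

set.seed(1)
y <- sample_normmix(1000, mu, pr)
ll <- loglik_normmix(y, mu, pr)          # log-likelihood at the true mixture
```

The value ll here is evaluated at the generating mixture rather than at a fitted NPMLE, so it serves only as a reference point for a sample drawn this way.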

3. Poisson mixtures

For Poisson mixtures, the function for CNM is cnm.poismix(). The argument list is:

x             Observations
w             Frequencies (weights) of the observations
verb          Verbosity level (0-4)
prec          Precision level to combine components
maxit         Maximum number of iterations
check.points  Number of grid points for computing the derivatives of
              the gradient function
tol           Tolerance level for sup gradient
mix           Initial estimate of the NPMLE
method        Either CNM or CN1
lsm           Line search method, either halving (for step-halving) or
              optim (for optimal line search)

The following illustration of the function's usage is based on Bohning's
(2000) Thailand data set, as in Example 1 of the paper. Inside the R
environment:

> thai = data.frame( x=c(0:21,23,24),
+   freq=c(120,64,69,72,54,35,36,25,25,19,18,18,13,4,3,6,6,5,1,3,1,2,1,2) )
> cnm.poismix(x=thai$x, w=thai$freq)
$mix
         lambda        pi
[1,]  0.1433948 0.1969314
[2,]  2.8172966 0.4799752
[3,]  8.1641894 0.2692575
[4,] 16.1558529 0.0538359

$ll
[1] -1553.810

$num.iterations
[1] 21

$max.gradient
[1] 3.858321e-07

$convergence
[1] 0

The output includes:

mix             The computed NPMLE of the mixing distribution
ll              Log-likelihood value at the computed NPMLE
num.iterations  Number of iterations needed
max.gradient    Sup gradient value
convergence     = 0, converged successfully;
                = 1, reached the pre-specified maximum number of iterations
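
The reported ll and max.gradient can be checked independently of the program with a few lines of base R. The sketch below recomputes the weighted Poisson mixture log-likelihood at the printed estimate and evaluates the gradient function on a grid. It assumes that "gradient" refers to the usual directional-derivative form d(theta; G) = sum_i w_i f(x_i; theta) / f(x_i; G) - n; the grid range and spacing are arbitrary choices made for this example.

```r
# Data and reported estimate, copied from the example above.
x <- c(0:21, 23, 24)
w <- c(120,64,69,72,54,35,36,25,25,19,18,18,13,4,3,6,6,5,1,3,1,2,1,2)
lambda <- c(0.1433948, 2.8172966, 8.1641894, 16.1558529)
pr <- c(0.1969314, 0.4799752, 0.2692575, 0.0538359)

# Mixture density f(x; G) at each distinct observed value.
fG <- drop(outer(x, lambda, dpois) %*% pr)

# Weighted log-likelihood at the reported estimate.
ll <- sum(w * log(fG))

# Gradient function on a grid of candidate support points theta
# (assumed form: sum_i w_i f(x_i; theta) / f(x_i; G) - n).
grid <- seq(0, 25, by = 0.01)
grad <- drop(t(outer(x, grid, dpois)) %*% (w / fG)) - sum(w)
max.grad <- max(grad)
```

If the printed mixture is indeed a maximiser, ll should land close to the reported log-likelihood and max.grad should be close to zero, up to the rounding of the printed estimates and the coarseness of the grid.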

4. Others

For more details about the program, please refer to the R code. Most functions are commented inside the code.

Yong Wang
Department of Statistics
University of Auckland
Private Bag 92019
Auckland
New Zealand

E-mail: yongwang@stat.auckland.ac.nz
