Readme file

SERIES C  
Applied Statistics

A nonparametric Bayesian model for inference in related longitudinal studies, P. Mueller, G. Rosner, M. De Iorio and S. MacEachern
Journal of the Royal Statistical Society, Series C, Applied Statistics, Volume 54 (2005) part 3, 611 - 626

DESCRIPTION OF DATA SET:

In the paper, we analyze white blood cell count data from three studies carried out by the Cancer and Leukemia Group B (CALGB), a co-operative group of university hospitals funded by the U.S. National Cancer Institute. See the paper, Section 2 ("Data") for a short description of the three studies. See the original papers listed below for more details.

CALGB 8541
As indicated in the paper, we only include data from the high-dose group. The paper did not make use of the cyclophosphamide dose in study 8541, which was 600 mg per square meter of body-surface area. The treatment arm also includes two other drugs, doxorubicin and 5-fluorouracil, as indicated in the paper. See the references listed below for more details.

In the accompanying data file we provide the data used in the paper as two rectangular ASCII files. The data are stripped of patient identifiers.

File 'data.txt'
The first 2 lines are headers. The remaining lines record the WBC responses as a (3096 x 3) matrix with columns
1. patient number (1 through 608)
2. log white blood cell count, in log thousands, log(WBC/1000)
3. day within 1st cycle (negative numbers are before day 1 of the protocol)

File 'pat.txt'
The first 2 lines are headers. The remaining lines record patient characteristics as a (608 x 5) matrix with columns
1. patient number (1 through 608)
2. study identifier (1 = CALGB 8881, 2=CALGB 9160, 3=CALGB 8541)
3. dose of cyclophosphamide (for 8881 and 9160). For 8541 we record -1 (see comments above)
4. dose of GM-CSF (for 8881 and 9160). For 8541 we record -1 (see comments above)
5. Number of observations for this patient
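
The two file layouts above can be read with a few lines of standard-library Python. The following is a minimal sketch, not code from the paper. It assumes the columns are separated by whitespace (the delimiter is not stated in this file), and the names Observation, Patient, read_observations and read_patients are hypothetical, introduced only for illustration.

```python
from typing import Iterable, Iterator, List, NamedTuple


class Observation(NamedTuple):
    patient: int    # patient number, 1 through 608
    log_wbc: float  # log(WBC/1000)
    day: float      # day within 1st cycle; negative = before day 1


class Patient(NamedTuple):
    patient: int  # patient number, 1 through 608
    study: int    # 1 = CALGB 8881, 2 = CALGB 9160, 3 = CALGB 8541
    ctx: float    # cyclophosphamide dose; -1 for study 8541
    gm: float     # GM-CSF dose; -1 for study 8541
    n_obs: int    # number of observations for this patient


def _data_rows(lines: Iterable[str], n_header: int = 2) -> Iterator[List[str]]:
    """Skip the header lines and any blank lines; split the rest on whitespace.

    Whitespace-separated columns are an assumption about the file format.
    """
    for i, line in enumerate(lines):
        if i >= n_header and line.strip():
            yield line.split()


def read_observations(lines: Iterable[str]) -> List[Observation]:
    """Parse the lines of 'data.txt' into Observation records."""
    # int(float(...)) tolerates integer codes written as "1" or "1.0".
    return [Observation(int(float(p)), float(w), float(d))
            for p, w, d in _data_rows(lines)]


def read_patients(lines: Iterable[str]) -> List[Patient]:
    """Parse the lines of 'pat.txt' into Patient records."""
    return [Patient(int(float(p)), int(float(s)), float(c), float(g), int(float(n)))
            for p, s, c, g, n in _data_rows(lines)]
```

Both functions accept any iterable of lines, so an open file object can be passed directly, for example `with open('data.txt') as f: obs = read_observations(f)`.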

CONSISTENCY CHECKS
As a simple check that you read the data correctly,
-- Verify that the sum of ni ('pat.txt', column 5) equals the total number of data lines in 'data.txt' (excluding the 2 header lines). It should be n=3096.
-- The total number of patients should be N=608.
-- Following are the min, max and mean of log WBC ('data.txt', column 2), CTX ('pat.txt', column 3) and GM-CSF ('pat.txt', column 4):

          min       max       mean
   WBC   -2.996     4.355     1.15644
   CTX   -1.000     6.000    -0.32566
   GM    -1.000    10.000     0.08882
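
The checks listed above can be automated along the following lines. This is a sketch under stated assumptions: each row is a sequence of numbers in the column order described for 'data.txt' and 'pat.txt', the reference values are copied from this file, and the names README_SUMMARY, summarize and check_consistency, as well as the tolerance of 0.001, are hypothetical choices made for illustration.

```python
from statistics import fmean

# Reference values copied from this README, as (min, max, mean).
README_SUMMARY = {
    "WBC": (-2.996, 4.355, 1.15644),
    "CTX": (-1.000, 6.000, -0.32566),
    "GM": (-1.000, 10.000, 0.08882),
}


def summarize(values):
    """Return (min, max, mean) of a non-empty collection of numbers."""
    values = list(values)
    return (min(values), max(values), fmean(values))


def check_consistency(data_rows, pat_rows, n_total=3096, n_patients=608,
                      expected=README_SUMMARY, tol=1e-3):
    """Return a list of problems found; an empty list means all checks passed.

    data_rows: rows of (patient, log WBC, day), as in 'data.txt'.
    pat_rows:  rows of (patient, study, CTX, GM-CSF, ni), as in 'pat.txt'.
    tol is an assumed tolerance for the rounded summary values.
    """
    problems = []
    if len(data_rows) != n_total:
        problems.append(f"expected n={n_total} observations, got {len(data_rows)}")
    if len(pat_rows) != n_patients:
        problems.append(f"expected N={n_patients} patients, got {len(pat_rows)}")
    ni_sum = sum(int(row[4]) for row in pat_rows)
    if ni_sum != len(data_rows):
        problems.append(f"sum of ni is {ni_sum}, but there are {len(data_rows)} observations")
    observed = {
        "WBC": summarize(row[1] for row in data_rows),
        "CTX": summarize(row[2] for row in pat_rows),
        "GM": summarize(row[3] for row in pat_rows),
    }
    for name, want in expected.items():
        got = observed[name]
        if any(abs(g - w) > tol for g, w in zip(got, want)):
            problems.append(f"{name}: expected (min, max, mean) {want}, got {got}")
    return problems
```

Returning a list of messages rather than raising on the first failure lets all discrepancies be reported in one pass.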

CALGB has kindly agreed to make these data available for interested readers, subject to the following conditions:-
-- Any paper using these data should acknowledge CALGB for the use of the data.
-- The paper should reference the original papers describing the studies:

CALGB 8541
Wood, W., Budman, D., Korzun, A., Cooper, M., Younger, J., Hart, R., Moore, A., Ellerton, J., Norton, L., Ferree, C., Ballow, A., Frei, E., III and Henderson, I. (1994) "Dose and dose intensity of adjuvant chemotherapy for stage II, node-positive breast cancer." New England Journal of Medicine, 330, 1253--1259.

CALGB 8881
Lichtman, S. M., Ratain, M. J., Van Echo, D. A., Rosner, G., Egorin, M. J., Budman, D. R., Vogelzang, N. J., Norton, L. and Schilsky, R. L. (1993) "Phase I trial of granulocyte-macrophage colony-stimulating factor plus high-dose cyclophosphamide given every 2 weeks: a Cancer and Leukemia Group B study." Journal of the National Cancer Institute, 85, 1319--1326.

CALGB 9160
Budman, D., Rosner, G., Lichtman, S., Miller, A., Ratain, M. and Schilsky, R. (1998) "A randomized trial of WR-2721 (amifostine) as a chemoprotective agent in combination with high-dose cyclophosphamide and molgramostim (GM-CSF)." Cancer Therapeutics, 1, 164--167.

Peter Mueller
Department of Biostatistics
University of Texas M. D. Anderson Cancer Center
1515 Holcombe Boulevard
Box 447
Houston
TX 77030-4009
USA

E-mail: pm@odin.mdacc.tmc.edu
