Readme file

SERIES C  
Applied Statistics

Bayesian disclosure risk assessment: predicting small frequencies in contingency tables, by J. J. Forster and E. L. Webb, pages 551-570

The folder Data contains 13 files, one for each of the 13 regions of residence used in the SAR data. The files are named sampind**, where ** ranges from 01 to 13, corresponding to the region codes below. Each file contains a string of space-separated 0s and 1s indicating, for each member of the SAR sample in that region, whether the individual was included in (1) or excluded from (0) the 3% subsample used for the analysis reported in the paper. Most of the analysis in the paper was based on the data for South West England and the corresponding file, sampind08.

Code  Region                    SAR sample size
01    North East                 78378
02    North West                210040
03    Yorkshire and the Humber  154927
04    East Midlands             130358
05    West Midlands             164430
06    East of England           168536
07    South East                250699
08    South West                154295
09    Inner London               86255
10    Outer London              137693
11    Scotland                  164307
12    Wales                      90713
13    Northern Ireland           52894
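
As an illustration only, the indicator files could be read and applied along the following lines. This is a minimal sketch, not the code used for the paper. The function names are hypothetical, and it assumes that the indicators appear in the same order as the records for that region in the individual SAR file; the sample sizes are copied from the table above and serve as a length check.

```python
from typing import Dict, List, Sequence, TypeVar

T = TypeVar("T")

# SAR sample sizes by region code, copied from the table in this readme.
SAR_SAMPLE_SIZES: Dict[str, int] = {
    "01": 78378, "02": 210040, "03": 154927, "04": 130358,
    "05": 164430, "06": 168536, "07": 250699, "08": 154295,
    "09": 86255, "10": 137693, "11": 164307, "12": 90713,
    "13": 52894,
}


def parse_indicators(text: str) -> List[int]:
    """Parse a whitespace-separated string of 0s and 1s into a list of ints."""
    tokens = text.split()
    bad = [t for t in tokens if t not in ("0", "1")]
    if bad:
        raise ValueError(f"unexpected token(s) in indicator data: {bad[:5]}")
    return [int(t) for t in tokens]


def has_expected_length(indicators: Sequence[int], region_code: str) -> bool:
    """Check that there is one indicator per member of the region's SAR sample."""
    return len(indicators) == SAR_SAMPLE_SIZES[region_code]


def select_subsample(records: Sequence[T], indicators: Sequence[int]) -> List[T]:
    """Keep the records whose indicator is 1.

    Assumes records and indicators are in the same order (an assumption of
    this sketch, not something stated in the readme).
    """
    if len(records) != len(indicators):
        raise ValueError("records and indicators differ in length")
    return [rec for rec, keep in zip(records, indicators) if keep == 1]


# Toy example; a real run would read the text from a file such as
# Data/sampind08 (the exact path is an assumption).
toy = parse_indicators("1 0 0 1 0")
print(select_subsample(["a", "b", "c", "d", "e"], toy))
```

For a real region, has_expected_length(indicators, "08") offers a quick check that the indicator file and the South West SAR extract line up before any records are selected.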

See http://www.ccsr.ac.uk/sars/ for details of how to obtain the 2001 individual SAR file.

The variables used in the analysis in the paper were:

acctype  Accommodation Type
age0     Age of Respondents-Grouped
cars0    Cars/Vans Owned or Available for Use
famtyp   Family Type
isco     International Standard Classification of Occupations
sex      Sex

(for details of variable codes, see http://www.ccsr.ac.uk/sars/2001/indiv/variables/)

For the variable age0, some further grouping was done for our analysis; the resulting groups were 0-4, 5-10, 11-15, 16-19, 20-29, 30-44, 45-59, 60-69, 70-79, 80-89 and 90+.
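
The grouping above can be sketched as a simple lookup. Note the assumption: this example takes an age in whole years as input, whereas age0 in the SAR file is already coded in bands, so an actual recode would map the SAR codes (see the variable documentation linked above) onto these groups. The names below are hypothetical.

```python
from bisect import bisect_right

# Lower bound of each age group listed in this readme, with matching labels.
AGE_GROUP_LOWER_BOUNDS = [0, 5, 11, 16, 20, 30, 45, 60, 70, 80, 90]
AGE_GROUP_LABELS = [
    "0-4", "5-10", "11-15", "16-19", "20-29", "30-44",
    "45-59", "60-69", "70-79", "80-89", "90+",
]


def age_group(age_years: int) -> str:
    """Return the analysis age group containing an age in whole years.

    Assumption of this sketch: the input is an age in years, not a SAR
    age0 code.
    """
    if age_years < 0:
        raise ValueError("age must be non-negative")
    # bisect_right counts the lower bounds that are <= age_years;
    # subtracting 1 gives the index of the group the age falls in.
    return AGE_GROUP_LABELS[bisect_right(AGE_GROUP_LOWER_BOUNDS, age_years) - 1]


print([age_group(a) for a in (3, 10, 16, 44, 95)])
```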

For any further queries, please contact:

Jonathan J. Forster
School of Mathematics
University of Southampton
Highfield
Southampton
SO17 1BJ
UK

Email: J.J.Forster@soton.ac.uk
