![]() |
|
Readme fileSERIES B
|
| Gabriel Huerta | Mike West |
| 2006 Sheridan Road | Box 90251-0251 |
| Department of Statistics | ISDS |
| Northwestern University | Duke University |
| Evanston, IL 60208, U.S.A. | Durham, NC 27708, U.S.A. |
| gabriel@bayes.stats.nwu.edu | mw@stat.duke.edu |
This file describes how to use the Fortran 77 programs and Splusfunctions to fit AR models with priors on structure components. The Fortranconsist of 4 files: "main.f","gibbs.f", "routines.f" and "ord.f". TheSplus functions are contained in a file named "analysis.s" which readsthe output created by the Fortran programs and produces picturessimilar to those presented in the paper. The first three Fortran filesare used to implement the MCMC scheme, the last one to order the reciprocals roots in terms of modulus and wavelength and to order thereal roots. The current version of the programs allow for a maximum number ofobservations equal to 2500, model order up to 100 with a maximum of 50complex pairs of reciprocal roots and 100 real roots. The prioradopted is the default "component reference prior".
In contrast, this version does not permit missing values for the timeseries, only computes the decomposition and forecasts atthe posterior mean of the AR coefficients and innovation variance. This aspects will be upgraded in further versions along with issues concerning spectral estimation.
Input Files
To run the Fortran programs, the user needs to prepare 3 ASCII files:"info.dat","series.dat" and "init.dat". "info.dat" should be a filethat has the information in columns an in the following order:
1.- The number of requested posterior samples.
2.- The maximum number of complex pairs of roots (could be equal to zero).
3.- The maximum number of real roots (could be equal to zero).
4.- The number of iterations to waste before collecting samples and
also known as number of burn-in iterations.
5.- The number of iterations to skip between collected samples.
6.- The total number of observations in the time series (maximum 2500).
7.- The upper bound on wavelengths that must always be greater than 2
and by default can be fixed to n/2, where n is the sample size.
An example for the contents of this file "info.dat" is:
5000
15
10
5000
1
540
270
"series.dat" should be a one column file that has the data points whose total must be at least the number of observations specified in item 6 ofthe file "info.dat". The last file "init.dat" must contain initial values for the AR reciprocal roots, including first the complex roots andthen the real roots. Complex roots have to be specified in terms of modulus and wavelength. Ordering the initial values is not important here, since the sampler is developed over the space of unidentified roots. Assuming that the user specified a maximum number of complex roots equal to 4 and a maximum number of real roots equal to 3 so the maximum model order is 11, the contents of "init.dat" could be:
.9
12
.95
7.5
.97
24
.8
5.5
-.3
-.25
.6
Unitary roots are valid for specification of initial values, since the model allows for such type of roots. Traditionally, initial values had been specified through sampling the prior without probability point masses. This version includes examples for input files and particularly "series.dat" contains the Southern Oscillation index, a time series which has form part of debates in climatological issues.
Fortran Programs
The files "main.f", "gibbs.f" and "routines.f" contain the code to obtain MCMC posterior samples. Assuming that the input files and Fortran programs are in the same subdirectory, all it is required is that the user compiles "main.f", "gibbs.f" and "routines.f" simultaneously to produce an executable. In Unix, the compilation takes places after
f77 -o exe1 main.f gibbs.f routines.f
Typing exe1 at the prompt will begin iterating the Gibbs sampler andexhibit the following message:
Gibbs sampling for AR model
iteration no: 1
iteration no: 2
iteration no: 3
iteration no: 4
iteration no: 5
iteration no: 6
iteration no: 7
iteration no: 8
iteration no: 9
iteration no: 10
............ ..
Error messages could be due to misspecification in one the files:"info.dat","series.dat" or "init.dat". Also an error can appear if one of these files is not in the same subdirectory as the executable "exe1". After obtaining the required simulations the program displays the message:
Gibbs sampling for AR model terminated normally
As output the program creates the files:
"phi.dat","root.dat", "var.dat","resid.dat" and "inval.dat".
"phi.dat" has the posterior mean for the AR coefficients up to the maximum value for model order. "root.dat" has the samples for the unidentified roots.
Each of these samples is arranged by first including the modulus-frequency of each complex pair of roots and then the real roots.
"var.dat" has the samples for the innovation variance and "inval.dat" the corresponding posterior mean for initial values. Before using the S functions is necessary to order the unidentified roots saved at "root.dat" using the program "ord.f". In Unix, this program is simply compiled by typing:
f77 -o exe2 ord.f
Where "exe2" contains the executable file corresponding to "ord.f". Run by typing "exe2" at the prompt which displays the message:
Ordering complex and real roots...
when finished the program presents the message:
AR roots are now ordered
Basically, "ord.f" reads in the file "root.dat" and creates new files: "modord.dat", "wavord.dat", "realord.dat" and "orders.dat". "modord.dat" and "wavord.dat" each has the posterior samples of roots in an arrangement of modulus-frequency ordered by modulus and wavelength respectively. "realord.dat" has the samples for real roots in an increasing order while "orders.dat" is a file that in each row reports the number of complex roots and the number of real roots equal to zero per sample. This last file is useful to produce an estimate of the posterior probabilities for model order, number of complex pairs of reciprocal roots and number of real roots.
Splus code
After running both "ex1" and "ex2" start Splus in the directory that has the output files and type: source("analysis.s") The following message will be displayed:
Please wait, reading MCMC output files....
and then several other messages appear like:
Reading posterior mean for AR coefficients...
Reading number of roots equal to zero...
Sourcing S functions and opening graphics window...
"analysis.s" reads in the output created by the Fortran programs and sources nine functions: "arorder", "histreal","histcplx1", "histcplx2", "diagnostic","forecast","picdec","decomp"and "forcar".
-"arorder" plots in one picture the posterior probabilities for model order, number of complex pairs of roots and number of real roots. The only argument of this function is a number between zero and one, that controls the scale on the y-axis of the plots. For example."arorder(.8)" will plot the posterior probabilities and the y-axiswill range from 0 to .8.
-histreal plots histograms for the samples of some of the ordered realroots without point masses. The function has one argument which can be any integer number from 1 to floor(nrea/2) where "nrea" is the maximal number of real roots allowed by the model specification. For example, "histreal(2)" plots the histograms corresponding tosamples of the two smallest and the two largest real roots. Posterior probabilities for point masses at -1,0,1 are reported on top of each histogram.
-histcplx1 plots histograms for samples of modulus and wavelength of the complex roots when these are ordered by modulus. The function hastwo arguments. The first defines the number of modulus-wavelengths to be explored, the second the number of rows in one graph. The scanfunction has been added after every pair of modulus-wavelength. Forexample, "histclpx1(6,2)" builds two graphs, each with 3 pairs ofmodulus-wavelengths. Posterior probabilities for point masses at 0and 1 in modulus are reported on top of its corresponding histogram.
-histcplx2 plots histograms for samples of modulus and wavelength of the complex roots when these are ordered by wavelengths. The function has two arguments. The first defines the number of modulus-wavelengths to be explored, the second the number of rows in one graph. The scan function has been added after every pair of modulus-wavelength. For example, "histclpx1(6,2)" builds two graphs, each with 3 pairs of modulus-wavelengths. Posterior probabilities for point masses at 0 and 1 in modulus are reported on top of its corresponding histogram.
-diagnostic is a function with no specific argument which produces a set of four graphs. A histogram for the samples of the innovation variance and for the posterior mean of the residuals, a time series plot, a histogram and a qqplot. Only type "diagnostic()".
-forecast is a function that plots traces of samples simulated from the predictive distribution conditional on the posterior mean for parameters and the data. The function has two arguments. The first defines the number of predictive samples to be displayed, the second the time units covered by the forecasting horizon. For example, "forecast(4,200)" displays in one graph, four frames with the data followed by 200 future observations.
-picdec plots real and complex elements of the decomposition computed at the posterior means for AR coefficients and initial values. The function has two arguments. The first defines the number of complex components to be displayed, the second the number of real components. For example "picdec(4,3)" produces two plots. The firsthas the data and the 3 main real components when the roots areordered in decreasing magnitude. The second, plots the data with the4 main complex components when these are ordered by wavelengths. Ineach frame the y-axis scale is relative to the y-axis scale for the data.
-decomp is a function used by picdec that computes the elements of thedecomposition at a particular set of AR model coefficients and initial values. The function returns a list of two objects: the components of the decomposition and the number of componentscorresponding to real roots.
-forcar is a function used by forecast and produces a sample of futurevalues for given a particular combination of AR coefficients, innovationvariance and the last "p" observations of the series, where "p" isthe length of the vector of model coefficients.
Note:
In principle, these programs could be used in different computingenvironments. As today, they have only been tested within Unix.
Date: February 13, 1999.
SERIES
A
Statistics
in Society
SERIES B
Statistical Methodology
SERIES C
Applied
Statistics
SERIES D
The
Statistician