AMS2006-4: Modeling disease incidence data with spatial and spatio-temporal Dirichlet process mixtures

Athanasios Kottas, Jason Duan and Alan Gelfand
12/31/2006 09:00 AM
Applied Mathematics & Statistics
Typically, disease incidence or mortality data are available as rates or counts for specified regions, collected over time.

We propose Bayesian nonparametric spatial modeling approaches to analyze such data. We develop a hierarchical specification using spatial random effects modeled with a Dirichlet process prior.

The Dirichlet process is centered around a multivariate normal distribution. This distribution arises from a log-Gaussian process model that provides a latent incidence rate surface, followed by block averaging to the areal units determined by the regions in the study. With regard to the resulting posterior predictive inference, the modeling approach is shown to be equivalent to an approach based on block averaging of a spatial Dirichlet process to obtain a prior probability model for the finite-dimensional distribution of the spatial random effects. We introduce a dynamic formulation for the spatial random effects to extend the model to spatio-temporal settings.
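
To make the idea of a Dirichlet process centered at a multivariate normal more concrete, the following is a minimal sketch rather than the implementation used in the report. It draws one realization from a truncated stick-breaking approximation to such a process and then samples vectors of regional random effects from it. Everything specific here is an assumption for illustration: the function names, the truncation level, the precision parameter, and in particular the exponential covariance evaluated at hypothetical region centroids, which stands in for the covariance that block averaging of a Gaussian process would induce.

```python
import numpy as np


def exponential_cov(coords, sigma2, phi):
    """Exponential covariance between points (a stand-in for a block-averaged covariance)."""
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return sigma2 * np.exp(-phi * dists)


def sample_truncated_dp(alpha, mean, cov, n_atoms, rng):
    """Draw weights and atoms from a truncated stick-breaking approximation
    to a Dirichlet process with precision alpha and base N(mean, cov)."""
    # Stick-breaking fractions; the last is fixed at 1 so the weights sum to 1.
    v = rng.beta(1.0, alpha, size=n_atoms)
    v[-1] = 1.0
    # w_k = v_k * prod_{j<k} (1 - v_j)
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    weights = v * remaining
    # Each atom is a full vector of regional effects drawn from the base distribution.
    atoms = rng.multivariate_normal(mean, cov, size=n_atoms)
    return weights, atoms


rng = np.random.default_rng(0)
n_regions = 5                                       # hypothetical number of regions
coords = rng.uniform(0.0, 1.0, size=(n_regions, 2))  # hypothetical region centroids
cov = exponential_cov(coords, sigma2=1.0, phi=2.0)

weights, atoms = sample_truncated_dp(
    alpha=1.0, mean=np.zeros(n_regions), cov=cov, n_atoms=50, rng=rng
)

# Sample 10 replicate vectors of spatial random effects from the discrete draw.
labels = rng.choice(len(weights), size=10, p=weights)
effects = atoms[labels]
```

Because a realization of the process is discrete, several replicates in `effects` will typically share the same atom, which is the clustering behavior that a Dirichlet process mixture exploits.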

Posterior inference is implemented with efficient Gibbs samplers that rely on strategically chosen latent variables. We illustrate the methodology with simulated data as well as with a data set on lung cancer incidence for all 88 counties in the state of Ohio over an observation period of 21 years.