AMS2006-14: Detecting selection in DNA sequences: Bayesian Modelling and Inference

Daniel Merl and Raquel Prado
12/31/2006 09:00 AM
Applied Mathematics & Statistics
Recent developments in Bayesian modelling of DNA sequence data for detecting natural selection at the amino acid level are presented.

This article summarizes and discusses empirical model-based approaches.

Key features of the modelling framework include the incorporation of biologically meaningful information via structured priors, posterior detection of sites under selection, and model validation via posterior predictive checks and/or estimation of gene and species trees. In addition, model selection is handled using a minimum posterior predictive loss criterion.
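
As an illustration of the model selection step, the following is a minimal sketch of a minimum posterior predictive loss calculation. It assumes the squared-error form of the criterion, D_k = (k / (k + 1)) G + P, where G measures the fit of the posterior predictive means to the observed data and P sums the posterior predictive variances. The abstract does not state which loss function or which data summaries were used, so the function name `posterior_predictive_loss`, its arguments, and the toy numbers are hypothetical.

```python
import math
from statistics import fmean, pvariance


def posterior_predictive_loss(y_obs, y_rep, k=math.inf):
    """Squared-error posterior predictive loss (illustrative sketch only).

    y_obs : observed values, one per data point (hypothetical summaries).
    y_rep : posterior predictive replicates, one row per posterior draw,
            each row the same length as y_obs.
    k     : weight on the goodness-of-fit term; k = inf gives D = G + P.

    Returns (D, G, P). The model with the smallest D is preferred.
    """
    # Transpose so that each column holds all replicates of one data point.
    columns = list(zip(*y_rep))
    if len(columns) != len(y_obs):
        raise ValueError("each replicate must have the same length as y_obs")

    pred_means = [fmean(col) for col in columns]
    pred_vars = [pvariance(col) for col in columns]

    # G: squared distance between predictive means and observations.
    G = sum((m - y) ** 2 for m, y in zip(pred_means, y_obs))
    # P: total predictive variance, which penalises overly diffuse models.
    P = sum(pred_vars)

    weight = 1.0 if math.isinf(k) else k / (k + 1)
    return weight * G + P, G, P


if __name__ == "__main__":
    # Toy numbers, not taken from the report.
    observed = [0, 0]
    replicates = [[0, 2], [2, 2]]
    D, G, P = posterior_predictive_loss(observed, replicates, k=1)
    print(f"D = {D}, G = {G}, P = {P}")
```

In use, the same function would be evaluated on replicates drawn from each candidate model, and the resulting D values compared across models.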

The models presented here can incorporate relevant covariates such as amino acid properties, thereby extending previous approaches. Applications include the analysis of two DNA sequence alignments that differ in the evolutionary divergence among their sequences: an abalone sperm lysin alignment with a strong underlying phylogenetic structure, and a low-divergence alignment encoding the Apical Membrane Antigen-1 (AMA-1) of the human malaria parasite P. falciparum.