Raquel Prado, Daniel Merl and Ananias Escalante
12/31/2006 09:00 AM
Applied Mathematics & Statistics
A novel statistical methodology for inferring natural positive selection at the molecular level is presented. The method is model-based, aiming to describe patterns of codon substitutions in an alignment of DNA sequences.
Bayesian generalized linear models with structured priors are used.
Inference is carried out via standard Markov chain Monte Carlo methods. Selection among models that support different hypotheses about site-specific substitutions and/or various degrees of evolutionary divergence is handled with a minimum posterior predictive loss approach.
After a particular model is chosen, posterior distributions of biologically meaningful parameters, such as the per-site probabilities of non-synonymous and synonymous substitutions and the transition-to-transversion substitution rate ratios, are studied. The proposed methodology was specifically designed to analyze several DNA sequences encoding malaria antigens. In particular, an analysis of multiple sequences encoding the Apical Membrane Antigen 1 (AMA-1) in the human malaria parasite P. falciparum is presented. The study of genetic variability in the AMA-1 sequences is key to determining whether this antigen is a viable target for a malaria vaccine construct.
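
As a rough illustration of the model selection step mentioned in the abstract, the sketch below computes a minimum posterior predictive loss score under squared-error loss, in the commonly used form D_k = P + (k / (k + 1)) G, where G measures how far the posterior predictive means are from the observed data and P sums the posterior predictive variances. This is only a sketch under stated assumptions: the abstract does not say which loss function or data summaries the authors use, and the function name `posterior_predictive_loss`, its arguments, and the weight `k` are hypothetical names chosen for illustration. The replicate datasets are assumed to come from MCMC output of a fitted model.

```python
from statistics import fmean, pvariance


def posterior_predictive_loss(y_obs, y_rep_draws, k=1.0):
    """Squared-error posterior predictive loss score (smaller is better).

    y_obs        -- observed values, one per data point
    y_rep_draws  -- list of replicate datasets drawn from the posterior
                    predictive distribution, each the same length as y_obs
    k            -- hypothetical weight on the goodness-of-fit term (assumption)
    """
    goodness = 0.0  # G: squared distance from predictive means to the data
    penalty = 0.0   # P: sum of predictive variances
    for i, y in enumerate(y_obs):
        draws_i = [rep[i] for rep in y_rep_draws]
        mu = fmean(draws_i)
        goodness += (mu - y) ** 2
        # Population variance of the draws is used here as a simple estimate.
        penalty += pvariance(draws_i, mu)
    return penalty + (k / (k + 1.0)) * goodness
```

With scores computed for each candidate model from its own posterior predictive draws, the model with the smallest score would be preferred; the penalty term discourages models whose predictions are overly diffuse.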