UCSC-SOE-12-11: Bayesian Factor Models in Characterizing Molecular Adaptation

Saheli Datta, Raquel Prado, Abel Rodriguez
08/08/2012 08:25 PM
Applied Mathematics & Statistics
Assessing the selective influence of amino acid properties is
important in understanding evolution at the molecular level.
A collection of methods and models has been developed in recent years
to determine whether amino acid sites in a given DNA sequence alignment display
substitutions that are altering or conserving a prespecified set of amino
acid properties. Residues showing an elevated number of substitutions that
favorably alter a physicochemical property are considered targets of
positive natural selection.
Such approaches usually perform independent analyses for each amino
acid property under consideration, without taking into account the fact that
some of the properties may be highly correlated.
We propose a Bayesian hierarchical regression model with
latent factor structure that allows us to determine which sites
display substitutions that conserve or
radically change a set of amino acid properties, while
accounting for the correlation structure that may be present across such
properties. We illustrate our approach by analyzing simulated data sets and
an alignment of lysin sperm DNA.