UCSC-SOE-09-02: Bayesian estimation of $\Theta=4N_e\mu$ under gamma-distributed mutation rate variation

Eric C. Anderson
01/02/2009 09:00 AM
Applied Mathematics & Statistics
A key parameter in population genetics is $\Theta=4N_e\mu$: four times the effective population size times the per-nucleotide neutral mutation rate. This parameter directly affects the genetic diversity expected in a population. In the last two decades, the estimation of $\Theta$ from sequence data has improved significantly through use of the coalescent process to derive a likelihood for the data. Using the coalescent for likelihood or Bayesian inference requires an intractable sum over all possible coalescent trees and a multidimensional integral over the branch lengths on each tree. Accordingly, Markov chain Monte Carlo (MCMC) has been successfully applied to approximate likelihoods and posterior probabilities in these situations. The program LAMARC is an actively developed implementation of MCMC for these sorts of calculations. LAMARC allows both likelihood and Bayesian inference of $\Theta$ and other parameters; however, the Bayesian implementations are not always complete. In particular, Bayesian estimation of $\Theta$ is not available in LAMARC under the assumption that mutation rates vary between genomic regions according to a gamma distribution. Estimating $\Theta$ from multiple nuclear intronic sequences typically requires such a model. Here I describe a simple method that implements the gamma-distributed mutation rates model in the Bayesian framework using the output from a LAMARC run in which $\Theta$ is considered fixed between genomic regions. The method is shown to provide reliable results within a few minutes of the initial LAMARC run. This procedure appears to be computationally more efficient than estimating $\Theta$ in the likelihood framework using LAMARC. The method is implemented in the software GUFBUL (Gamma-model Updating for Bayesians Using LAMARC).
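
To make the modeling assumption concrete, the following is a minimal sketch of what "gamma-distributed mutation rate variation" can mean for region-specific values of $\Theta$. It is not the GUFBUL algorithm and does not use LAMARC output. It assumes one common parameterization, in which each region's relative rate is drawn from a gamma distribution with mean 1; the function name `region_thetas` and the parameters `theta`, `shape`, and `n_regions` are hypothetical names chosen for illustration.

```python
import random


def region_thetas(theta, shape, n_regions, seed=0):
    """Draw one Theta value per genomic region under a gamma rate model.

    Assumption (not taken from the report): the relative mutation rate
    r_i of region i is Gamma(shape, scale=1/shape), which has mean 1,
    so `theta` is the mean of Theta_i = theta * r_i across regions.
    """
    rng = random.Random(seed)
    # random.gammavariate(alpha, beta) uses alpha = shape, beta = scale.
    return [theta * rng.gammavariate(shape, 1.0 / shape)
            for _ in range(n_regions)]


# Illustrative values only: mean Theta of 0.005 and gamma shape 2.
thetas = region_thetas(theta=0.005, shape=2.0, n_regions=10000)
mean_theta = sum(thetas) / len(thetas)
```

Under this parameterization a smaller `shape` gives more rate heterogeneity between regions, while a large `shape` approaches the fixed-$\Theta$ case.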

This report is not available for download at this time.