AMS2005-3: Comparison of Bayesian, maximum likelihood and parsimony methods for detecting positive selection

Daniel Merl, Ananias Escalante and Raquel Prado
12/31/2005 09:00 AM
Applied Mathematics & Statistics
From a variety of vantage points, ranging from epidemiological to statistical, the problem of identifying the effects of natural selection at the molecular level is a fascinating one. Recent years have seen an explosion of model-based methods for inferring such effects, with particular emphasis on the detection of positive selection. Among the most popular are the maximum-likelihood-based method of Yang implemented in PAML, the parsimony-based method of Suzuki and Gojobori implemented in ADAPTSITE, and the hierarchical Bayesian method of Huelsenbeck and Ronquist implemented in MRBAYES. Although each of these three methodologies has appeared in the literature in analyses of various sequence data, there have been no cross-comparison studies of their performance when applied to the same data, in terms of their ability to predict amino acid sites influenced by positive selection. To this end, we employed the three methods to detect positively selected sites in the following sequence data, with each data set chosen to represent a different level of phylogenetic uncertainty: a previously analyzed abalone sperm lysin alignment, three alignments of the Avian infectious bronchitis (AIB) virus S gene, and two alignments of the homologous S protein of the SARS coronavirus. The results shown here demonstrate important strengths and drawbacks of each method when dealing with data of different levels of phylogenetic uncertainty.