A general method for calculating likelihoods under the coalescent process (Journal Article)

Author(s): Lohse, Konrad; Harrison, Richard J; Barton, Nicholas H
Article Title: A general method for calculating likelihoods under the coalescent process
Affiliation: IST Austria
Abstract: Analysis of genomic data requires an efficient way to calculate likelihoods across very large numbers of loci. We describe a general method for finding the distribution of genealogies: we allow migration between demes, splitting of demes [as in the isolation-with-migration (IM) model], and recombination between linked loci. These processes are described by a set of linear recursions for the generating function of branch lengths. Under the infinite-sites model, the probability of any configuration of mutations can be found by differentiating this generating function. Such calculations are feasible for small numbers of sampled genomes: as an example, we show how the generating function can be derived explicitly for three genes under the two-deme IM model. This derivation is done automatically, using Mathematica. Given data from a large number of unlinked and nonrecombining blocks of sequence, these results can be used to find maximum-likelihood estimates of model parameters by tabulating the probabilities of all relevant mutational configurations and then multiplying across loci. The feasibility of the method is demonstrated by applying it to simulated data and to a data set previously analyzed by Wang and Hey (2010) consisting of 26,141 loci sampled from Drosophila simulans and D. melanogaster. Our results suggest that such likelihood calculations are scalable to genomic data as long as the numbers of sampled individuals and mutations per sequence block are small.
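The pipeline summarized in the abstract (obtain mutation-configuration probabilities from a generating function, tabulate them, then multiply across unlinked loci) can be illustrated with a minimal sketch. It is not the authors' Mathematica implementation and does not cover the IM model. It assumes the simplest possible case: two genes sampled from a single panmictic population, with coalescence time T exponentially distributed with rate 1 and mutations arising at total rate theta along the two branches. Under those assumptions the generating function is psi(w) = E[exp(-w T)] = 1/(1 + w), and differentiating k times gives P(k mutations) = ((-theta)^k / k!) * psi^(k)(theta) = theta^k / (1 + theta)^(k + 1). The function names and the per-locus counts below are hypothetical.

```python
import math
from collections import Counter


def prob_k_mutations(k, theta):
    """P(k mutations) for two genes in one panmictic population (assumed model).

    Closed form of ((-theta)^k / k!) * d^k psi / dw^k at w = theta,
    where psi(w) = 1 / (1 + w) is the generating function of the
    coalescence time.
    """
    return theta ** k / (1.0 + theta) ** (k + 1)


def log_likelihood(theta, config_counts):
    """Log-likelihood across unlinked, nonrecombining loci.

    Each distinct configuration (here just k) is evaluated once and
    weighted by the number of loci showing it, which is equivalent to
    multiplying the per-locus probabilities.
    """
    return sum(n * math.log(prob_k_mutations(k, theta))
               for k, n in config_counts.items())


def grid_mle(config_counts, grid):
    """Return the grid value of theta with the highest log-likelihood."""
    return max(grid, key=lambda th: log_likelihood(th, config_counts))


# Hypothetical data: pairwise differences observed at each of 10 loci.
mutations_per_locus = [0, 1, 0, 2, 1, 0, 3, 0, 1, 2]
config_counts = Counter(mutations_per_locus)

# Simple grid search over theta in (0, 5]; a real analysis would use a
# numerical optimizer and a multi-parameter model.
grid = [i / 100 for i in range(1, 501)]
theta_hat = grid_mle(config_counts, grid)
print(theta_hat)
```

In this toy model the distribution of k is geometric, so the maximum-likelihood estimate equals the mean number of mutations per locus; for the hypothetical counts above that mean is 1.0, and the grid search returns the same value. Tabulating by configuration rather than by locus is what keeps the calculation cheap when the number of loci is large but the number of distinct configurations is small.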
Keywords: Population Genetics; Coalescent; generating function; IM model
Journal Title: Genetics
Volume: 189
Issue: 3
ISSN: 0016-6731
Publisher: Genetics Society of America
Date Published: 2011-11-01
Start Page: 977
End Page: 987
Sponsor: This work was supported by a grant from the European Research Council (250152) (to N.B.) and a grant from the United Kingdom Natural Environment Research Council (NE/I020288/1) (to K.L.).
DOI: 10.1534/genetics.111.129569
Notes: We thank Yong Wang and Jody Hey for sharing the Drosophila data set. We also thank two anonymous reviewers and Jerome Kelleher for thoughtful comments on earlier versions of this manuscript.
Open access: yes (repository)