Estimating mutation parameters, population history and genealogy simultaneously from temporally spaced sequence data

Alexei J. Drummond*, Geoff K. Nicholls, Allen G. Rodrigo, Wireniu Solomon

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

832 Citations (Scopus)

Abstract

Molecular sequences obtained at different sampling times from populations of rapidly evolving pathogens and from ancient subfossil and fossil sources are increasingly available with modern sequencing technology. Here, we present a Bayesian statistical inference approach to the joint estimation of mutation rate and population size that incorporates the uncertainty in the genealogy of such temporally spaced sequences by using Markov chain Monte Carlo (MCMC) integration. The Kingman coalescent model is used to describe the time structure of the ancestral tree. We recover information about the unknown true ancestral coalescent tree, population size, and the overall mutation rate from temporally spaced data, that is, from nucleotide sequences gathered at different times, from different individuals, in an evolving haploid population. We briefly discuss the methodological implications and show what can be inferred, in various practically relevant states of prior knowledge. We develop extensions for exponentially growing population size and joint estimation of substitution model parameters. We illustrate some of the important features of this approach on a genealogy of HIV-1 envelope (env) partial sequences.

Original language: English
Pages (from-to): 1307-1320
Number of pages: 14
Journal: Genetics
Volume: 161
Issue number: 3
Publication status: Published - 2002
Externally published: Yes
