Abstract
Molecular clock models are commonly used to estimate evolutionary rates and timescales from nucleotide sequences. The goal of these models is to account for rate variation among lineages, and they are assumed to describe adequately the processes that generated the data. A common approach for selecting a clock model for a data set of interest is to examine a set of candidates and to select the model that provides the best statistical fit. However, this can lead to unreliable estimates if all the candidate models are actually inadequate. For this reason, a method of evaluating absolute model performance is critical. We describe a method that uses posterior predictive simulations to assess the adequacy of clock models. We test the power of this approach using simulated data and find that the method is sensitive to bias in the estimates of branch lengths, which tends to occur when using underparameterized clock models. We also evaluate the multinomial test statistic, originally developed to assess the adequacy of substitution models, but find that it has low power for assessing the adequacy of clock models. We illustrate the performance of our method using empirical data sets from coronaviruses, simian immunodeficiency virus, killer whales, and marine turtles. Our results indicate that methods of investigating model adequacy, including the one proposed here, should be routinely used in combination with traditional model selection in evolutionary studies. This will reveal whether a broader range of clock models needs to be considered in phylogenetic analysis.
Original language | English |
---|---|
Pages (from-to) | 2986-2995 |
Number of pages | 10 |
Journal | Molecular Biology and Evolution |
Volume | 32 |
Issue number | 11 |
Publication status | Published - Nov 2015 |
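
The abstract describes assessing clock-model adequacy with posterior predictive simulations and notes that the approach is sensitive to bias in branch-length estimates. As a rough illustration only, the sketch below shows one way such a check could be computed. It assumes that each branch has an empirical length estimate and a set of lengths estimated from data sets simulated under the fitted model. The function names, the per-branch comparison, and the summary index are hypothetical choices made for this example; they are not taken from the abstract and should not be read as the published procedure.

```python
# Minimal sketch of a posterior predictive check on branch lengths.
# All names and the summary index below are illustrative assumptions.

def posterior_predictive_p(observed, simulated):
    """Fraction of simulated values that are at least as large as the observed one."""
    return sum(s >= observed for s in simulated) / len(simulated)


def within_predictive_interval(observed, simulated, level=0.95):
    """True if the observed value lies inside the central `level` interval
    of the simulated (posterior predictive) values."""
    tail = (1.0 - level) / 2.0
    p = posterior_predictive_p(observed, simulated)
    return tail <= p <= 1.0 - tail


def adequacy_index(observed_lengths, simulated_lengths, level=0.95):
    """Proportion of branches whose empirical length falls inside the
    posterior predictive interval (assumed summary, higher is better)."""
    flags = [
        within_predictive_interval(obs, sims, level)
        for obs, sims in zip(observed_lengths, simulated_lengths)
    ]
    return sum(flags) / len(flags)


# Toy input: 100 simulated lengths per branch, evenly spaced from 0.01 to 1.00.
sims = [i / 100 for i in range(1, 101)]
observed = [0.5, 2.0]  # the second branch is far longer than any simulation
index = adequacy_index(observed, [sims, sims])
```

In this toy input the first branch sits near the middle of its simulated distribution, while the second lies outside it, so only one of the two branches is counted as adequately described. A low proportion would suggest that the clock model predicts branch lengths poorly, under the assumptions stated above.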