A bias in ML estimates of branch lengths in the presence of multiple signals

David Penny*, W. T. White, Mike D. Hendy, Matthew J. Phillips

*Corresponding author for this work

    Research output: Contribution to journal, Article, peer-review

    19 Citations (Scopus)

    Abstract

    Sequence data often have competing signals that are detected by network programs or Lento plots. Such data can be formed by generating sequences on more than one tree and combining the results (a mixture model). We report that with such mixture models, the estimates of edge (branch) lengths from maximum likelihood (ML) methods that assume a single tree are biased. Based on the observed number of competing signals in real data, such a bias of ML is expected to occur frequently. Because network methods can recover competing signals more accurately, there is a need for ML methods allowing a network. A fundamental problem is that mixture models can have more parameters than can be recovered from the data, so that some mixtures are not, in principle, identifiable. We recommend that network programs be incorporated into best-practice analysis, along with ML and Bayesian trees.
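
The following is a minimal sketch, not the authors' analysis: it uses a deliberately simplified two-taxon analogue to suggest why fitting a single model to mixed data can bias length estimates. It assumes the Jukes-Cantor (JC69) model, infinitely many sites, and two hypothetical path lengths `d1` and `d2` mixed with weight `w`; all function names and values are illustrative assumptions and do not come from the article.

```python
import math

def jc_p(d):
    """Expected proportion of differing sites at JC69 distance d (substitutions/site)."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))

def jc_distance(p):
    """JC69 distance for an observed proportion p of differing sites (valid for p < 0.75).

    For two sequences under JC69 this is also the ML estimate of the path length.
    """
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

def mixture_estimate(d1, d2, w=0.5):
    """Single-model distance estimate when a fraction w of sites evolved at
    path length d1 and the remainder at path length d2 (hypothetical mixture)."""
    p_mix = w * jc_p(d1) + (1.0 - w) * jc_p(d2)
    return jc_distance(p_mix)

# Hypothetical mixture: half the sites on a short path, half on a long one.
d1, d2, w = 0.1, 0.9, 0.5
true_mean = w * d1 + (1.0 - w) * d2
estimate = mixture_estimate(d1, d2, w)
print(f"mean path length in mixture: {true_mean:.3f}")
print(f"single-model estimate:       {estimate:.3f}")
```

In this toy setting the single-model estimate falls below the mean path length whenever `d1` and `d2` differ, because the JC69 correction is convex in the observed proportion of differences. The tree-based mixtures discussed in the abstract are more complex, so this should be read only as intuition for the direction such a bias can take.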

    Original language: English
    Pages (from-to): 239-242
    Number of pages: 4
    Journal: Molecular Biology and Evolution
    Volume: 25
    Issue number: 2
    Publication status: Published - Feb 2008
