QMaker: Fast and Accurate Method to Estimate Empirical Models of Protein Evolution

Bui Quang Minh, Cuong Cao Dang, Le Sy Vinh*, Robert Lanfear*

*Corresponding author for this work

    Research output: Contribution to journal > Article > peer-review

    41 Citations (Scopus)

    Abstract

    Amino acid substitution models play a crucial role in phylogenetic analyses. Maximum likelihood (ML) methods have been proposed to estimate amino acid substitution models; however, they are typically complicated and slow. In this article, we propose QMaker, a new ML method to estimate a general time-reversible $Q$ matrix from a large protein data set consisting of multiple sequence alignments. QMaker combines an efficient ML tree search algorithm, a model selection for handling the model heterogeneity among alignments, and the consideration of rate mixture models among sites. We provide QMaker as a user-friendly function in the IQ-TREE software package (http://www.iqtree.org) supporting the use of multiple CPU cores so that biologists can easily estimate amino acid substitution models from their own protein alignments. We used QMaker to estimate new empirical general amino acid substitution models from the current Pfam database as well as five clade-specific models for mammals, birds, insects, yeasts, and plants. Our results show that the new models considerably improve the fit between model and data and in some cases influence the inference of phylogenetic tree topologies. [Amino acid replacement matrices; amino acid substitution models; maximum likelihood estimation; phylogenetic inferences.]
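The abstract does not spell out how a general time-reversible $Q$ matrix is parameterized. The minimal Python sketch below assumes the standard construction, in which each off-diagonal rate is a symmetric exchangeability multiplied by the stationary frequency of the target amino acid, and the matrix is scaled to one expected substitution per unit time. It illustrates only the object QMaker estimates, not the estimation procedure; the function names and the random input values are hypothetical and are not taken from IQ-TREE.

```python
import random


def build_reversible_q(exchange, freqs):
    """Build a normalized time-reversible rate matrix.

    exchange: symmetric n x n list of non-negative exchangeabilities
              (diagonal entries are ignored).
    freqs:    n stationary frequencies that sum to 1.
    """
    n = len(freqs)
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                # Assumed standard form: q_ij = r_ij * pi_j
                q[i][j] = exchange[i][j] * freqs[j]
        # Diagonal makes each row sum to zero.
        q[i][i] = -sum(q[i])
    # Scale so that the expected substitution rate at stationarity is 1.
    rate = -sum(freqs[i] * q[i][i] for i in range(n))
    return [[value / rate for value in row] for row in q]


def random_example(n=20, seed=0):
    """Generate illustrative (not empirical) inputs for n states."""
    rng = random.Random(seed)
    exchange = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            exchange[i][j] = exchange[j][i] = rng.uniform(0.1, 2.0)
    raw = [rng.uniform(0.5, 1.5) for _ in range(n)]
    total = sum(raw)
    freqs = [x / total for x in raw]
    return exchange, freqs


exchange, freqs = random_example()
Q = build_reversible_q(exchange, freqs)
```

With 20 amino acids, this form has 190 exchangeabilities (189 free after normalization) and 19 free frequencies. Time reversibility shows up as detailed balance: `freqs[i] * Q[i][j]` equals `freqs[j] * Q[j][i]` for every pair of states.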

    Original language: English
    Pages (from-to): 1046-1060
    Number of pages: 15
    Journal: Systematic Biology
    Volume: 70
    Issue number: 5
    Publication status: Published - 1 Sept 2021
