matOptimize: A parallel tree optimization method enables online phylogenetics for SARS-CoV-2

Cheng Ye, Bryan Thornlow, Angie Hinrichs, Alexander Kramer, Cade Mirchandani, Devika Torvi, Robert Lanfear, Russell Corbett-Detig, Yatish Turakhia*

*Corresponding author for this work

    Research output: Contribution to journal › Article › peer-review

    14 Citations (Scopus)

    Abstract

    Motivation: Phylogenetic tree optimization is necessary for precise analysis of evolutionary and transmission dynamics, but existing tools cannot handle the scale and pace of data produced during the coronavirus disease 2019 (COVID-19) pandemic. One transformative approach, online phylogenetics, aims to incrementally add samples to an ever-growing phylogeny, but no existing approach can efficiently optimize such a vast phylogeny under the time constraints of the pandemic.

    Results: Here, we present matOptimize, a fast and memory-efficient parsimony-based phylogenetic tree optimization tool that can be parallelized across multiple CPU threads and nodes, and that provides orders-of-magnitude improvements in runtime and peak memory usage compared to existing state-of-the-art methods. We developed this method specifically to address the pressing need during the COVID-19 pandemic for daily maintenance and optimization of a comprehensive SARS-CoV-2 phylogeny. matOptimize is currently used daily to help refine what is possibly the largest-ever phylogenetic tree, containing millions of SARS-CoV-2 sequences.

    Original language: English
    Pages (from-to): 3734-3740
    Number of pages: 7
    Journal: Bioinformatics
    Volume: 38
    Issue number: 15
    Publication status: Published - 1 Aug 2022
