TY - JOUR
T1 - matOptimize
T2 - A parallel tree optimization method enables online phylogenetics for SARS-CoV-2
AU - Ye, Cheng
AU - Thornlow, Bryan
AU - Hinrichs, Angie
AU - Kramer, Alexander
AU - Mirchandani, Cade
AU - Torvi, Devika
AU - Lanfear, Robert
AU - Corbett-Detig, Russell
AU - Turakhia, Yatish
N1 - Publisher Copyright: © 2022 The Author(s). Published by Oxford University Press.
PY - 2022/8/1
Y1 - 2022/8/1
AB - Motivation: Phylogenetic tree optimization is necessary for precise analysis of evolutionary and transmission dynamics, but existing tools are inadequate for handling the scale and pace of data produced during the coronavirus disease 2019 (COVID-19) pandemic. One transformative approach, online phylogenetics, aims to incrementally add samples to an ever-growing phylogeny, but there are no previously existing approaches that can efficiently optimize this vast phylogeny under the time constraints of the pandemic. Results: Here, we present matOptimize, a fast and memory-efficient phylogenetic tree optimization tool based on parsimony that can be parallelized across multiple CPU threads and nodes, and provides orders of magnitude improvement in runtime and peak memory usage compared to existing state-of-the-art methods. We have developed this method particularly to address the pressing need during the COVID-19 pandemic for daily maintenance and optimization of a comprehensive SARS-CoV-2 phylogeny. matOptimize is currently helping refine on a daily basis possibly the largest-ever phylogenetic tree, containing millions of SARS-CoV-2 sequences.
UR - http://www.scopus.com/inward/record.url?scp=85135697497&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btac401
DO - 10.1093/bioinformatics/btac401
M3 - Article
SN - 1367-4803
VL - 38
SP - 3734
EP - 3740
JO - Bioinformatics
JF - Bioinformatics
IS - 15
ER -