TY - JOUR
T1 - Spectral cluster supertree
T2 - fast and statistically robust merging of rooted phylogenetic trees
AU - McArthur, Robert N.
AU - Zehmakan, Ahad N.
AU - Charleston, Michael A.
AU - Lin, Yu
AU - Huttley, Gavin
N1 - Publisher Copyright:
Copyright © 2024 McArthur, Zehmakan, Charleston, Lin and Huttley.
PY - 2024/10/30
Y1 - 2024/10/30
N2 - The algorithms for phylogenetic reconstruction are central to computational molecular evolution. The relentless pace of data acquisition has exposed their poor scalability and the conclusion that the conventional application of these methods is impractical and not justifiable from an energy usage perspective. Furthermore, the drive to improve the statistical performance of phylogenetic methods produces increasingly parameter-rich models of sequence evolution, which worsens the computational performance. Established theoretical and algorithmic results identify supertree methods as critical to divide-and-conquer strategies for improving scalability of phylogenetic reconstruction. Of particular importance is the ability to explicitly accommodate rooted topologies. These can arise from the more biologically plausible non-stationary models of sequence evolution. We make a contribution to addressing this challenge with Spectral Cluster Supertree, a novel supertree method for merging a set of overlapping rooted phylogenetic trees. It offers significant improvements over Min-Cut supertree and previous state-of-the-art methods in terms of both time complexity and overall topological accuracy, particularly for problems of large size. We perform comparisons against Min-Cut supertree and Bad Clade Deletion. Leveraging two tree topology distance metrics, we demonstrate that while Bad Clade Deletion generates more correct clades in its resulting supertree, Spectral Cluster Supertree’s generated tree is generally more topologically close to the true model tree. Over large datasets containing 10,000 taxa and approximately 500 source trees, where Bad Clade Deletion usually takes approximately 2 h to run, our method generates a supertree in on average 20 s. Spectral Cluster Supertree is released under an open source license and is available on the Python Package Index as sc-supertree.
AB - The algorithms for phylogenetic reconstruction are central to computational molecular evolution. The relentless pace of data acquisition has exposed their poor scalability and the conclusion that the conventional application of these methods is impractical and not justifiable from an energy usage perspective. Furthermore, the drive to improve the statistical performance of phylogenetic methods produces increasingly parameter-rich models of sequence evolution, which worsens the computational performance. Established theoretical and algorithmic results identify supertree methods as critical to divide-and-conquer strategies for improving scalability of phylogenetic reconstruction. Of particular importance is the ability to explicitly accommodate rooted topologies. These can arise from the more biologically plausible non-stationary models of sequence evolution. We make a contribution to addressing this challenge with Spectral Cluster Supertree, a novel supertree method for merging a set of overlapping rooted phylogenetic trees. It offers significant improvements over Min-Cut supertree and previous state-of-the-art methods in terms of both time complexity and overall topological accuracy, particularly for problems of large size. We perform comparisons against Min-Cut supertree and Bad Clade Deletion. Leveraging two tree topology distance metrics, we demonstrate that while Bad Clade Deletion generates more correct clades in its resulting supertree, Spectral Cluster Supertree’s generated tree is generally more topologically close to the true model tree. Over large datasets containing 10,000 taxa and approximately 500 source trees, where Bad Clade Deletion usually takes approximately 2 h to run, our method generates a supertree in on average 20 s. Spectral Cluster Supertree is released under an open source license and is available on the Python Package Index as sc-supertree.
KW - molecular evolution
KW - phylogenetics
KW - rooted phylogenetic trees
KW - spectral clustering
KW - supertree
UR - http://www.scopus.com/inward/record.url?scp=85208957239&partnerID=8YFLogxK
U2 - 10.3389/fmolb.2024.1432495
DO - 10.3389/fmolb.2024.1432495
M3 - Article
AN - SCOPUS:85208957239
SN - 2296-889X
VL - 11
JO - Frontiers in Molecular Biosciences
JF - Frontiers in Molecular Biosciences
M1 - 1432495
ER -