TY - JOUR
T1 - A median solver and phylogenetic inference based on double-cut-and-join sorting
AU - Xia, Ruofan
AU - Lin, Yu
AU - Zhou, Jun
AU - Feng, Bing
AU - Tang, Jijun
N1 - Publisher Copyright:
© 2018, Mary Ann Liebert, Inc.
PY - 2018/3
Y1 - 2018/3
N2 - Genome rearrangement is known as one of the main evolutionary mechanisms on the genomic level. Phylogenetic analysis based on rearrangement played a crucial role in biological research in the past decades, especially with the increasing availability of fully sequenced genomes. In general, phylogenetic analysis aims to solve two problems: small parsimony problem (SPP) and big parsimony problem (BPP). Maximum parsimony is a popular approach for SPP and BPP, which relies on iteratively solving an NP-hard problem, the median problem. As a result, current median solvers and phylogenetic inference methods based on the median problem all face serious problems on scalability and cannot be applied to data sets with large and distant genomes. In this article, we propose a new median solver for gene order data that combines double-cut-and-join sorting with the simulated annealing algorithm. Based on this median solver, we built a new phylogenetic inference method to solve both SPP and BPP problems. Our experimental results show that the new median solver achieves an excellent performance on simulated data sets, and the phylogenetic inference tool built based on the new median solver has a better performance than other existing methods.
AB - Genome rearrangement is known as one of the main evolutionary mechanisms on the genomic level. Phylogenetic analysis based on rearrangement played a crucial role in biological research in the past decades, especially with the increasing availability of fully sequenced genomes. In general, phylogenetic analysis aims to solve two problems: small parsimony problem (SPP) and big parsimony problem (BPP). Maximum parsimony is a popular approach for SPP and BPP, which relies on iteratively solving an NP-hard problem, the median problem. As a result, current median solvers and phylogenetic inference methods based on the median problem all face serious problems on scalability and cannot be applied to data sets with large and distant genomes. In this article, we propose a new median solver for gene order data that combines double-cut-and-join sorting with the simulated annealing algorithm. Based on this median solver, we built a new phylogenetic inference method to solve both SPP and BPP problems. Our experimental results show that the new median solver achieves an excellent performance on simulated data sets, and the phylogenetic inference tool built based on the new median solver has a better performance than other existing methods.
KW - big phylogeny problem
KW - median problem
KW - phylogenetic inference
KW - simulated annealing
KW - small phylogeny problem
UR - http://www.scopus.com/inward/record.url?scp=85045419498&partnerID=8YFLogxK
U2 - 10.1089/cmb.2017.0157
DO - 10.1089/cmb.2017.0157
M3 - Article
SN - 1066-5277
VL - 25
SP - 302
EP - 312
JO - Journal of Computational Biology
JF - Journal of Computational Biology
IS - 3
ER -