TY - JOUR
T1 - Query-by-Sketch
T2 - 2021 International Conference on Management of Data, SIGMOD 2021
AU - Wang, Ye
AU - Wang, Qing
AU - Koehler, Henning
AU - Lin, Yu
N1 - Publisher Copyright:
© 2021 ACM.
PY - 2021
Y1 - 2021
N2 - Computing shortest paths is a fundamental operation in processing graph data. In many real-world applications, discovering shortest paths between two vertices empowers us to make full use of the underlying structure to understand how vertices are related in a graph, e.g. the strength of social ties between individuals in a social network. In this paper, we study the shortest-path-graph problem that aims to efficiently compute a shortest path graph containing exactly all shortest paths between any arbitrary pair of vertices on complex networks. Our goal is to design an exact solution that can scale to graphs with millions or billions of vertices and edges. To achieve high scalability, we propose a novel method, Query-by-Sketch (QbS), which efficiently leverages offline labelling (i.e., precomputed labels) to guide online searching through a fast sketching process that summarizes the important structural aspects of shortest paths in answering shortest-path-graph queries. We theoretically prove the correctness of this method and analyze its computational complexity. To empirically verify the efficiency of QbS, we conduct experiments on 12 real-world datasets, among which the largest dataset has 1.7 billion vertices and 7.8 billion edges. The experimental results show that QbS can answer shortest-path-graph queries in microseconds for million-scale graphs and less than half a second for billion-scale graphs.
AB - Computing shortest paths is a fundamental operation in processing graph data. In many real-world applications, discovering shortest paths between two vertices empowers us to make full use of the underlying structure to understand how vertices are related in a graph, e.g. the strength of social ties between individuals in a social network. In this paper, we study the shortest-path-graph problem that aims to efficiently compute a shortest path graph containing exactly all shortest paths between any arbitrary pair of vertices on complex networks. Our goal is to design an exact solution that can scale to graphs with millions or billions of vertices and edges. To achieve high scalability, we propose a novel method, Query-by-Sketch (QbS), which efficiently leverages offline labelling (i.e., precomputed labels) to guide online searching through a fast sketching process that summarizes the important structural aspects of shortest paths in answering shortest-path-graph queries. We theoretically prove the correctness of this method and analyze its computational complexity. To empirically verify the efficiency of QbS, we conduct experiments on 12 real-world datasets, among which the largest dataset has 1.7 billion vertices and 7.8 billion edges. The experimental results show that QbS can answer shortest-path-graph queries in microseconds for million-scale graphs and less than half a second for billion-scale graphs.
KW - 2-hop cover
KW - algorithms
KW - breadth-first search
KW - distance labelling
KW - graph sketch
KW - graphs
KW - pruned landmark labelling
KW - shortest paths
UR - http://www.scopus.com/inward/record.url?scp=85108960732&partnerID=8YFLogxK
U2 - 10.1145/3448016.3452826
DO - 10.1145/3448016.3452826
M3 - Conference article
AN - SCOPUS:85108960732
SN - 0730-8078
SP - 1946
EP - 1958
JO - Proceedings of the ACM SIGMOD International Conference on Management of Data
JF - Proceedings of the ACM SIGMOD International Conference on Management of Data
Y2 - 20 June 2021 through 25 June 2021
ER -