Query-by-Sketch: Scaling Shortest Path Graph Queries on Very Large Networks

Ye Wang, Qing Wang, Henning Koehler, Yu Lin

    Research output: Contribution to journal › Conference article › peer-review

    18 Citations (Scopus)

    Abstract

    Computing shortest paths is a fundamental operation in processing graph data. In many real-world applications, discovering shortest paths between two vertices empowers us to make full use of the underlying structure to understand how vertices are related in a graph, e.g. the strength of social ties between individuals in a social network. In this paper, we study the shortest-path-graph problem that aims to efficiently compute a shortest path graph containing exactly all shortest paths between any arbitrary pair of vertices on complex networks. Our goal is to design an exact solution that can scale to graphs with millions or billions of vertices and edges. To achieve high scalability, we propose a novel method, Query-by-Sketch (QbS), which efficiently leverages offline labelling (i.e., precomputed labels) to guide online searching through a fast sketching process that summarizes the important structural aspects of shortest paths in answering shortest-path-graph queries. We theoretically prove the correctness of this method and analyze its computational complexity. To empirically verify the efficiency of QbS, we conduct experiments on 12 real-world datasets, among which the largest dataset has 1.7 billion vertices and 7.8 billion edges. The experimental results show that QbS can answer shortest-path-graph queries in microseconds for million-scale graphs and less than half a second for billion-scale graphs.
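To make the problem statement concrete, here is a minimal baseline sketch of what a shortest path graph is. It is not QbS and uses no labelling or sketching; it simply runs two breadth-first searches and keeps every edge that lies on some shortest path. It assumes an unweighted, undirected graph stored as a symmetric adjacency dictionary, and the names `bfs_distances` and `shortest_path_graph` are hypothetical, chosen only for illustration.

```python
from collections import deque


def bfs_distances(adj, source):
    """Return hop distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def shortest_path_graph(adj, s, t):
    """Return the set of edges (u, v), oriented from s towards t,
    that lie on at least one shortest path between s and t.

    Assumption: adj is a symmetric adjacency dict of an unweighted,
    undirected graph. This is a naive baseline, not the QbS method.
    """
    dist_s = bfs_distances(adj, s)
    if t not in dist_s:
        return set()  # s and t are disconnected
    dist_t = bfs_distances(adj, t)
    d = dist_s[t]
    edges = set()
    for u, neighbours in adj.items():
        if u not in dist_s:
            continue
        for v in neighbours:
            # (u, v) is on a shortest path iff going s -> u -> v -> t costs d
            if v in dist_t and dist_s[u] + 1 + dist_t[v] == d:
                edges.add((u, v))
    return edges


# Small example: two shortest paths from "a" to "d", plus a dangling vertex "e".
example = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
result = shortest_path_graph(example, "a", "d")
```

Both searches here touch the whole reachable component, so each query costs time linear in the graph size. That cost is what makes such a baseline impractical at the billion-edge scale the abstract targets.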

    Original language: English
    Pages (from-to): 1946-1958
    Number of pages: 13
    Journal: Proceedings of the ACM SIGMOD International Conference on Management of Data
    Publication status: Published - 2021
    Event: 2021 International Conference on Management of Data, SIGMOD 2021 - Virtual, Online, China
    Duration: 20 Jun 2021 to 25 Jun 2021
