TY - JOUR
T1 - Analysis and evaluation of the top-k most influential location selection query
AU - Chen, Jian
AU - Huang, Jin
AU - Wen, Zeyi
AU - He, Zhen
AU - Taylor, Kerry
AU - Zhang, Rui
N1 - Publisher Copyright:
© 2014, Springer-Verlag London.
PY - 2015/4
Y1 - 2015/4
N2 - In this paper, we propose a new type of queries to retrieve the top-k most influential locations from a candidate set (Foermula presented.) given sets of customers (Foermula presented.)and existing facilities (Foermula presented.). The influence models the popularity of a facility. Such queries have wide applications in decision support systems. A naive solution sequentially scans (SS) all data sets, which is expensive, and hence, we investigate two branch-and-bound algorithms for the query, namely Estimate Expanding Pruning (EEP) and Bounding Influence Pruning (BIP). Both algorithms follow the best first traverse. On determining the traversal order, while EEP leverages distance metrics between nodes, BIP relies on half plane pruning which avoids the repetitive estimations in EEP. As our experiments shown, BIP is much faster than SS which outperforms EEP, while the worst-case complexity of EEP and BIP is worse than that of SS. To improve the efficiency, we further propose a Nearest Facility Circle Join (NFCJ) algorithm. NFCJ builds an influence R-tree on the influence relationship between customers and existing facilities and joins the candidate R-tree with the influence R-tree to obtain the results. We compare all algorithms and conclude that NFCJ is the best solution, which outperforms SS, EEP, and BIP by orders of magnitude.
AB - In this paper, we propose a new type of queries to retrieve the top-k most influential locations from a candidate set (Foermula presented.) given sets of customers (Foermula presented.)and existing facilities (Foermula presented.). The influence models the popularity of a facility. Such queries have wide applications in decision support systems. A naive solution sequentially scans (SS) all data sets, which is expensive, and hence, we investigate two branch-and-bound algorithms for the query, namely Estimate Expanding Pruning (EEP) and Bounding Influence Pruning (BIP). Both algorithms follow the best first traverse. On determining the traversal order, while EEP leverages distance metrics between nodes, BIP relies on half plane pruning which avoids the repetitive estimations in EEP. As our experiments shown, BIP is much faster than SS which outperforms EEP, while the worst-case complexity of EEP and BIP is worse than that of SS. To improve the efficiency, we further propose a Nearest Facility Circle Join (NFCJ) algorithm. NFCJ builds an influence R-tree on the influence relationship between customers and existing facilities and joins the candidate R-tree with the influence R-tree to obtain the results. We compare all algorithms and conclude that NFCJ is the best solution, which outperforms SS, EEP, and BIP by orders of magnitude.
KW - Efficiency
KW - Location selection
KW - R-tree
KW - Reverse nearest neighbor
UR - http://www.scopus.com/inward/record.url?scp=84891899905&partnerID=8YFLogxK
U2 - 10.1007/s10115-013-0720-0
DO - 10.1007/s10115-013-0720-0
M3 - Article
SN - 0219-1377
VL - 43
SP - 181
EP - 217
JO - Knowledge and Information Systems
JF - Knowledge and Information Systems
IS - 1
ER -