An effective and efficient truth discovery framework over data streams

Tianyi Li, Yu Gu, Xiangmin Zhou, Qian Ma, Ge Yu*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

20 Citations (Scopus)

Abstract

Truth discovery, a validity assessment method for conflicting data from various sources, has been widely studied in the conventional database community. However, existing methods for static scenarios involve time-consuming iterative processes, while those for streams sacrifice considerable accuracy because they learn source weights incrementally. In this paper, we propose a novel framework for truth discovery over streams that incorporates various iterative methods to estimate the source weights effectively and adaptively decides how often the source weights are computed. Specifically, we first capture the characteristics of source weight evolution, on which the framework is modeled. Then, we define the conditions on source weight evolution under which the unit and cumulative errors are relatively small, and construct a probabilistic model that estimates the probability of meeting these conditions. Finally, we propose a novel scheme called adaptive source reliability assessment (ASRA), which converts an estimation problem into an optimization problem. We have conducted extensive experiments over real datasets to demonstrate the effectiveness and efficiency of our framework.
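
For readers unfamiliar with the iterative source-weight estimation that the abstract refers to, the following is a minimal sketch of a generic static truth discovery loop for numeric claims. It is not the ASRA scheme or the paper's framework. The function name `discover_truths`, the input layout, the weighted-average truth step, and the negative-log weight update are all assumptions chosen for illustration.

```python
import math


def discover_truths(claims, iterations=10, eps=1e-9):
    """Generic iterative truth discovery for numeric claims (illustrative only).

    claims: dict mapping source name -> dict mapping object name -> claimed value.
    Returns (truths, weights). This is an assumed, simplified scheme, not ASRA.
    """
    sources = list(claims)
    weights = {s: 1.0 for s in sources}  # start with equal trust in every source
    truths = {}
    for _ in range(iterations):
        # Truth step: each object's truth is the weighted mean of its claims.
        totals, norms = {}, {}
        for s in sources:
            for obj, val in claims[s].items():
                totals[obj] = totals.get(obj, 0.0) + weights[s] * val
                norms[obj] = norms.get(obj, 0.0) + weights[s]
        truths = {obj: totals[obj] / norms[obj] for obj in totals}

        # Weight step: sources with a smaller squared error get a larger weight.
        losses = {
            s: sum((val - truths[obj]) ** 2 for obj, val in claims[s].items()) + eps
            for s in sources
        }
        total_loss = sum(losses.values())
        weights = {s: -math.log(losses[s] / total_loss) + eps for s in sources}
    return truths, weights


if __name__ == "__main__":
    # Hypothetical data: sources "a" and "b" roughly agree, "c" is an outlier.
    example = {
        "a": {"x": 10.0, "y": 20.0},
        "b": {"x": 10.2, "y": 19.9},
        "c": {"x": 15.0, "y": 30.0},
    }
    truths, weights = discover_truths(example)
    print(truths)
    print(weights)
```

Every round of this loop revisits all claims, which illustrates the cost the abstract attributes to static methods. How often such a weight update should be rerun on a stream is the question the paper's framework addresses, and that is not shown here.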

Original language: English
Title of host publication: Advances in Database Technology - EDBT 2017
Subtitle of host publication: 20th International Conference on Extending Database Technology, Proceedings
Editors: Bernhard Mitschang, Volker Markl, Sebastian Bress, Periklis Andritsos, Kai-Uwe Sattler, Salvatore Orlando
Publisher: OpenProceedings.org
Pages: 180-191
Number of pages: 12
ISBN (Electronic): 9783893180738
DOIs
Publication status: Published - 2017
Externally published: Yes
Event: 20th International Conference on Extending Database Technology, EDBT 2017 - Venice, Italy
Duration: 21 Mar 2017 to 24 Mar 2017

Publication series

Name: Advances in Database Technology - EDBT
Volume: 2017-March
ISSN (Electronic): 2367-2005

Conference

Conference: 20th International Conference on Extending Database Technology, EDBT 2017
Country/Territory: Italy
City: Venice
Period: 21/03/17 to 24/03/17
