Data augmentation as a service for single view creation

Ullas Nambiar*, Tanveer A. Faruquie, K. Hima Prasad, L. Venkata Subramaniam, Mukesh K. Mohania

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

5 Citations (Scopus)

Abstract

Businesses are increasingly realizing the value of creating a single view of their customers and partners by integrating information residing in 'siloed' datasets within and outside the enterprise. However, augmenting data available within the enterprise with data purchased from third-party providers, or with data residing in the public domain such as the Web, often results in warehouses whose databases contain incomplete and/or inconsistent data. Hence, before the data can become useful, one must eliminate the inconsistency in the values appended to the enterprise data. In this paper, we present Data Augmentation as a Service (DAaS), which can help businesses create a consistent and usable single view of entities of interest. Specifically, our service enables business rule writers to quickly create data augmentation rules by using our rule generation scheme, which is driven by approximate functional dependencies. An accompanying challenge is managing a large number of rules and ensuring that new rules do not negate existing ones. To mitigate this problem, a rule-management and evaluation system that uses the Ripple Down Rules (RDR) framework is provided as part of our service. Using several large real-world datasets, we show that we can learn rules for imputing attribute values with the accuracy and scalability that enterprise users need, how conflicts can arise among rules, and that we can handle those conflicts effectively and with high accuracy.
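The two ideas named in the abstract can be illustrated with a small sketch. It is not the paper's implementation: the function and class names (`afd_rules`, `impute`, `RDRNode`), the confidence measure, the threshold, and the sample rows are all assumptions made for illustration. The first part estimates how well one attribute approximately determines another and turns the majority value for each key into an imputation rule. The second part shows a Ripple Down Rules style node in which an exception refines an existing rule instead of overwriting it.

```python
from collections import Counter, defaultdict

# Hypothetical sample data; "NYC" is an inconsistent value for zip 10001.
ROWS = [
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "NYC"},
    {"zip": "94105", "city": "San Francisco"},
    {"zip": "94105", "city": "San Francisco"},
]


def afd_rules(rows, lhs, rhs, min_confidence=0.75):
    """Return (confidence, rules) for the approximate dependency lhs -> rhs.

    Confidence is the fraction of rows that agree with the most frequent
    rhs value for their lhs value (one common definition, assumed here).
    """
    groups = defaultdict(Counter)
    for row in rows:
        if row.get(lhs) is not None and row.get(rhs) is not None:
            groups[row[lhs]][row[rhs]] += 1
    total = sum(sum(counts.values()) for counts in groups.values())
    if total == 0:
        return 0.0, {}
    agreeing = sum(counts.most_common(1)[0][1] for counts in groups.values())
    confidence = agreeing / total
    if confidence < min_confidence:
        return confidence, {}  # dependency too weak to generate rules
    rules = {key: counts.most_common(1)[0][0] for key, counts in groups.items()}
    return confidence, rules


def impute(row, lhs, rhs, rules):
    """Fill a missing rhs value when a rule exists for the row's lhs value."""
    if row.get(rhs) is None and row.get(lhs) in rules:
        return {**row, rhs: rules[row[lhs]]}
    return row


class RDRNode:
    """One rule in a Ripple Down Rules tree (simplified sketch)."""

    def __init__(self, condition, conclusion):
        self.condition = condition
        self.conclusion = conclusion
        self.except_branch = None  # consulted when this rule fires
        self.else_branch = None    # consulted when this rule does not fire

    def evaluate(self, row):
        if self.condition(row):
            if self.except_branch is not None:
                refined = self.except_branch.evaluate(row)
                if refined is not None:
                    return refined  # an exception overrides the parent rule
            return self.conclusion
        if self.else_branch is not None:
            return self.else_branch.evaluate(row)
        return None


confidence, rules = afd_rules(ROWS, "zip", "city")
filled = impute({"zip": "94105", "city": None}, "zip", "city", rules)

# A later, conflicting rule is attached as an exception rather than replacing
# the original rule, so the original still applies to all other rows.
root = RDRNode(lambda r: r.get("zip") == "10001", "New York")
root.except_branch = RDRNode(lambda r: r.get("borough") == "Brooklyn", "Brooklyn")
```

In this sketch, a conflicting rule narrows the scope of an earlier rule without negating it, which is the property the abstract attributes to the RDR-based rule management.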

Original language: English
Title of host publication: Proceedings - 2011 IEEE International Conference on Services Computing, SCC 2011
Pages: 40-47
Number of pages: 8
Publication status: Published - 2011
Externally published: Yes
Event: 2011 IEEE International Conference on Services Computing, SCC 2011 - Washington, DC, United States
Duration: 4 Jul 2011 to 9 Jul 2011
