Identifying candidate datasets for data interlinking

Luiz André P. Paes Leme, Giseli Rabello Lopes, Bernardo Pereira Nunes, Marco Antonio Casanova, Stefan Dietze

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

31 Citations (Scopus)

Abstract

One of the design principles that can stimulate the growth and increase the usefulness of the Web of data is URI linkage. However, the related URIs are typically in different datasets managed by different publishers. Hence, the designer of a new dataset must be aware of the existing datasets and inspect their content to define sameAs links. This paper proposes a technique based on probabilistic classifiers that, given a dataset S to be published and a set T of known published datasets, ranks each T_i ∈ T according to the probability that links between S and T_i can be found by inspecting the most relevant datasets. Results from our technique show that the search space can be reduced by up to 85%, thereby greatly decreasing the computational effort.
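
As an illustration only, the ranking idea in the abstract can be sketched with a small naive Bayes scorer. This is not the authors' implementation: the choice of naive Bayes with Laplace smoothing, the representation of each dataset as a set of feature strings, the function name `rank_candidates`, and the toy data are all assumptions made for this example.

```python
import math
from collections import Counter


def rank_candidates(known_links, source_features):
    """Rank target datasets by a naive Bayes score, highest first.

    known_links: dict mapping a target dataset name to a list of feature
        sets, one per dataset already known to link to that target
        (a hypothetical training format, assumed for this sketch).
    source_features: set of features describing the new dataset S.
    """
    total = sum(len(examples) for examples in known_links.values())
    scores = {}
    for target, examples in known_links.items():
        n = len(examples)
        if n == 0:
            continue  # no evidence for this target, so it cannot be scored
        # Number of linking datasets that exhibit each feature.
        counts = Counter(f for features in examples for f in features)
        # Log prior: share of all known links that point at this target.
        score = math.log(n / total)
        for f in sorted(source_features):
            # Laplace-smoothed estimate of P(feature | target).
            score += math.log((counts[f] + 1) / (n + 2))
        scores[target] = score
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data, invented for illustration: T1..T3 are placeholder dataset names.
known_links = {
    "T1": [{"foaf", "geo"}, {"foaf", "dc"}],
    "T2": [{"geo", "wgs84"}],
    "T3": [{"chem"}],
}
ranking = rank_candidates(known_links, {"geo", "wgs84"})
for target, score in ranking:
    print(f"{target}: {score:.3f}")
```

Under these assumptions, a designer would inspect only the top-ranked targets for sameAs links rather than every dataset in T, which is where a reduction of the search space would come from.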

Original language: English
Title of host publication: Web Engineering - 13th International Conference, ICWE 2013, Proceedings
Pages: 354-366
Number of pages: 13
Publication status: Published - 2013
Externally published: Yes
Event: 13th International Conference on Web Engineering, ICWE 2013 - Aalborg, Denmark
Duration: 8 Jul 2013 → 12 Jul 2013

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 7977 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 13th International Conference on Web Engineering, ICWE 2013
Country/Territory: Denmark
City: Aalborg
Period: 8/07/13 → 12/07/13
