Empirical Analysis of Ranking Models for an Adaptable Dataset Search

Angelo B. Neves*, Rodrigo G.G. de Oliveira, Luiz André P. Paes Leme, Giseli Rabello Lopes, Bernardo P. Nunes, Marco A. Casanova

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-reviewed

3 Citations (Scopus)

Abstract

Currently available datasets still have a large unexplored potential for interlinking. Ranking techniques contribute to this task by scoring datasets according to the likelihood of finding entities related to those of a target dataset. Ranked datasets can be either manually selected for standalone linking discovery tasks or automatically inspected by programs that go through the ranking looking for entity links. This work presents empirical comparisons between different ranking models and argues that different algorithms could be used depending on whether the ranking is handled manually or automatically and on the available metadata of the datasets. Experiments indicate that the ranking algorithms that perform best with respect to nDCG do not always have the best recall at position k at high recall levels. To reach 90% recall, the best ranking model for the manual use case (with respect to nDCG) may need 13 percentage points more of the ranking: almost 47% of the datasets at the top of the ranking, instead of the 34% slice that suffices for the best model for the automatic use case (with respect to recall@k).
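The contrast the abstract draws between nDCG and recall at position k can be illustrated with a minimal sketch. It assumes binary relevance (a dataset either contains linkable entities or it does not) and the common log2(rank + 1) discount; the abstract does not state the exact gain or discount the authors used, and the function names and toy rankings below are hypothetical, not taken from the paper.

```python
import math
from typing import Sequence


def dcg(gains: Sequence[float]) -> float:
    """Discounted cumulative gain with a log2(rank + 1) discount (assumed)."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))


def ndcg(gains: Sequence[float]) -> float:
    """DCG of the ranking divided by the DCG of the ideal (sorted) ranking."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0


def recall_at_k(relevant: Sequence[int], k: int) -> float:
    """Share of all relevant datasets that appear in the top k positions."""
    total = sum(relevant)
    return sum(relevant[:k]) / total if total > 0 else 0.0


def fraction_needed(relevant: Sequence[int], target: float) -> float:
    """Smallest fraction of the ranking to inspect to reach the target recall."""
    total = sum(relevant)
    if total == 0:
        return 0.0
    found = 0
    for position, rel in enumerate(relevant, start=1):
        found += rel
        if found / total >= target:
            return position / len(relevant)
    return 1.0


# Two hypothetical rankings of the same 10 datasets, 3 of them relevant (1).
ranking_a = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]  # strong top, one relevant item last
ranking_b = [0, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # weaker top, all relevant by rank 4

for name, ranking in (("A", ranking_a), ("B", ranking_b)):
    print(name, round(ndcg(ranking), 3), fraction_needed(ranking, 0.9))
```

In this toy case ranking A scores higher on nDCG because it rewards relevant items near the top, which suits a person picking from the first few results, while ranking B reaches 90% recall after a smaller slice of the list, which suits a program that scans the ranking until it has found most links.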

Original language: English
Title of host publication: The Semantic Web - 15th International Conference, ESWC 2018, Proceedings
Editors: Aldo Gangemi, Raphaël Troncy, Roberto Navigli, Laura Hollink, Maria-Esther Vidal, Pascal Hitzler, Anna Tordai, Mehwish Alam
Publisher: Springer Verlag
Pages: 50-64
Number of pages: 15
ISBN (Print): 9783319934167
Publication status: Published - 2018
Externally published: Yes
Event: 15th International Conference on Extended Semantic Web Conference, ESWC 2018 - Heraklion, Greece
Duration: 3 Jun 2018 to 7 Jun 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10843 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 15th International Conference on Extended Semantic Web Conference, ESWC 2018
Country/Territory: Greece
City: Heraklion
Period: 3/06/18 to 7/06/18
